Auto Question Paper Generator
Automatic question paper generator from Wikipedia and PDF files.
As the saying goes, “time is money”, and time in a teacher’s or a student’s life is no less important. Teachers spend many of their working hours merely generating class tests and assessments, while many self-learning students are looking for ways to assess themselves. We examined this problem and arrived at the statement that “teachers and self-learners need an aid in generating assessment tests”. After research in the domain, visits to different teaching institutes, and interviews with several privately learning students, we gathered the building blocks for our project. This research led us to build E-Learning and Earning Online (ELEO).

ELEO uses Natural Language Processing (NLP), a subfield of Artificial Intelligence, to generate Multiple Choice Questions automatically from a provided source. The source can either be a keyword against which the user wishes to generate a question paper, or simply a PDF that the user uploads to the system. In the case of a keyword, the system scrapes the text from Wikipedia and converts it into multiple choice questions: ELEO summarizes the raw Wikipedia text, parses the provided content, and generates Multiple Choice Questions (MCQs) from it. The system finds the named entities and POS (part-of-speech) tags in the content to create relevant questions.

ELEO presents the user with the generated Multiple Choice Questions test at runtime, along with the option to download the test in Word format. This way users can export the generated test for further use and, if required, edit it to customize it. Through ELEO, teachers will save hours of their daily time and will be able to focus more on students and their learning, while self-learning students will be able to assess themselves thoroughly.
THIS PROJECT IS BUILT USING
Beyond the Stanford and NLTK toolkits, there is much in Natural Language Processing left to enhance, as the system described in [14] shows. The methodology behind the Multiple-Choice Question (MCQ) generation used in this proposed system includes:

A. Text Parsing
The user either provides text in the form of PDF files or simply gives a topic of interest, and the system must parse the text, especially when it comes from Wikipedia. ELEO uses the Beautiful Soup Python library to scrape the text for a given keyword from its URL. Text parsing includes the removal of reference markers [n], special characters, and links from the text scraped from Wikipedia. This text refining is the basic pre-processing of the text. An example Python script to scrape text from a URL:

    from urllib.request import urlopen
    from bs4 import BeautifulSoup

    url = "https://en.wikipedia.org/wiki/NLP"
    html = urlopen(url)
    bs = BeautifulSoup(html, "lxml")
    rawText = bs.find_all("p")  # find all <p> (paragraph) tags in the HTML file

B. Text Summarization
In this step, regular expressions (REGEX) play an important role in refining the text. The refined text is then summarized using the SUMY Python library, which includes the LUHN, LexRank, and LSA summarization algorithms; the system selects an average number of sentences from which to generate question sentences. The importance of each sentence is analysed using the Keyword Frequency Method.

    import re
    from sumy.parsers.plaintext import PlaintextParser
    from sumy.nlp.tokenizers import Tokenizer
    from sumy.summarizers.lex_rank import LexRankSummarizer
    from sumy.summarizers.luhn import LuhnSummarizer

    # Join the scraped paragraphs into one raw text
    article_text = " ".join(p.get_text() for p in rawText)

    # Remove references such as [12], then extra spaces and special characters
    article_text = re.sub(r'\[[0-9]*\]', ' ', article_text)
    article_text = re.sub(r'\s+', ' ', article_text)

    # Summarization: keep the 50 most salient sentences
    parser = PlaintextParser.from_string(article_text, Tokenizer("english"))
    summarizer = LexRankSummarizer()
    summary = summarizer(parser.document, 50)
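The Keyword Frequency Method mentioned above can be sketched as follows. This is only an illustrative implementation, not the system’s actual code; the function name score_sentences and the small stop-word list are assumptions:

```python
import re
from collections import Counter

STOP_WORDS = frozenset({"the", "a", "an", "is", "of", "to", "and", "in"})

def score_sentences(text):
    """Rank sentences by the summed frequency of their non-stop words:
    sentences containing many frequently occurring keywords score highest."""
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(w for w in words if w not in STOP_WORDS)
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    scored = []
    for s in sentences:
        tokens = re.findall(r"[a-z']+", s.lower())
        score = sum(freq[t] for t in tokens if t not in STOP_WORDS)
        scored.append((score, s))
    # Highest-scoring sentences first
    return [s for _, s in sorted(scored, key=lambda p: -p[0])]
```

Sentences returned first by this ranking would be the ones selected as question sentences.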
C. Question Formation
After sentence selection or text summarization, the system focuses on question formation based on NER and SRL. Semantic Role Labelling (SRL), one of the approaches of Natural Language Processing, is used to find the patterns of the words in each sentence. Named Entity Recognition (NER) follows the same methodology to find entities such as PERSON, LOCATION, DATE, and ORGANIZATION. The system creates clusters of these entities and then generates questions accordingly.

D. Wh- Question Approach
In this approach, the system evaluates the sentences according to the named entities (NEs) they contain. ELEO classifies the sentences by entity type, such as PERSON, ORG (Organization), and LOC (Location), and generates questions accordingly:

1) WHO/WHOM functions for NE-Person.
2) WHERE functions for NE-Location.
3) WHEN functions for NE-Date.

The defined WH- functions then create complete question sentences accordingly.
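The WH- mapping above can be sketched in a few lines. This is a minimal illustration, not the system’s actual implementation: it assumes a prior NER step has already extracted (entity, label) pairs, and the function name make_wh_question and the plain string replacement are assumptions:

```python
# Map named-entity labels to WH-words, following the rules above.
WH_MAP = {"PERSON": "Who", "LOCATION": "Where", "DATE": "When",
          "ORGANIZATION": "Which organization"}

def make_wh_question(sentence, entity, label):
    """Turn a declarative sentence into a WH-question by replacing the
    named entity with the WH-word matching its type. The removed entity
    becomes the correct answer of the generated MCQ."""
    wh = WH_MAP.get(label)
    if wh is None or entity not in sentence:
        return None  # no rule applies to this sentence
    return sentence.replace(entity, wh, 1).rstrip(".") + "?"

# Usage: the entity and its label are assumed to come from an earlier NER pass.
question = make_wh_question("Alan Turing founded computer science.",
                            "Alan Turing", "PERSON")
# question == "Who founded computer science?"
```

The removed entity then serves as the key, with distractors drawn from the entity clusters of the same type built in step C.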
PROJECT STATUS
Average # of unique visitors per month
not provided
Average $ revenue per month
not provided
ITEMS BEING SOLD
THIS PROJECT IS LISTED IN
Artificial Intelligence
Data Mining
Machine Learning
Natural Language Processing