- Keyword Extraction: to extract the keywords that best define the document
- Summarization: to generate paragraph-wise summaries of the document
- Creating an NLP model that automatically generates subjective answers using Information Retrieval and Summarization techniques
- Step 1: Raw Text Data to Organized DataFrame
- Step 2: Paragraph-wise Keyword Extraction
- Step 3: Vectorizing Keywords to form Representative Vectors for paragraphs
- Step 4: Summarizing Paragraphs to generate fixed length Answers
- Step 5: Query Question to Vector
- Step 6: Scoring Function to calculate Paragraph Scores
- Step 7: Selecting Best Answer based on Final Scores
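
Steps 3, 5, 6 and 7 can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the tiny `TOY_VECTORS` table stands in for real GloVe vectors, the helper names (`mean_vector`, `cosine`, `best_paragraph`) are hypothetical, and cosine similarity is assumed as the scoring function, which may differ from the scoring used in the notebooks.

```python
import math

# Toy 3-dimensional word vectors standing in for GloVe embeddings.
# The words and values are made up for illustration only.
TOY_VECTORS = {
    "cell": [1.0, 0.0, 0.0],
    "membrane": [0.9, 0.1, 0.0],
    "nucleus": [0.8, 0.0, 0.2],
    "river": [0.0, 1.0, 0.0],
    "water": [0.1, 0.9, 0.0],
    "flow": [0.0, 0.8, 0.2],
}


def mean_vector(words, vectors=TOY_VECTORS):
    """Steps 3 and 5: average the vectors of all known words.

    Unknown words are skipped; returns None if no word is known.
    """
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Step 6 (assumed scoring): cosine similarity, 0.0 if undefined."""
    if a is None or b is None:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_paragraph(question_words, paragraph_keywords):
    """Step 7: score every paragraph and return (best_id, all_scores).

    paragraph_keywords maps a paragraph id to its list of keywords
    and is assumed to be non-empty.
    """
    question_vec = mean_vector(question_words)
    scores = {
        pid: cosine(question_vec, mean_vector(keywords))
        for pid, keywords in paragraph_keywords.items()
    }
    return max(scores, key=scores.get), scores


paragraphs = {
    "cells": ["cell", "membrane", "nucleus"],
    "rivers": ["river", "water", "flow"],
}
best_id, scores = best_paragraph(["water", "flow"], paragraphs)
print(best_id, scores)
```

In a real run, the keyword lists would come from Step 2, and the summary of the selected paragraph from Step 4 would be returned as the answer.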
- This file has Analysis and Visualizations of the text document we are working with.
- This file contains the Paragraph-wise Keyword Extraction using 6 different methods.
- This file contains the Paragraph-wise Summarization using 5 different methods
- This file is the final implementation of Subjective Answer Finder. The last cell contains a small GUI-like interface.
- Please install the required dependencies.
- Download the GloVe embeddings from here and place them in the same directory.
- The text format of my document is as follows:
- Chapter name on the first line, followed by a blank line.
- Paragraph title, followed by the paragraph description.
- An empty line after the end of each paragraph.
- Two empty lines at the end of the chapter, before the Question/Answer section.
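
A file in this layout could be parsed along the following lines. This is a hedged sketch: the function name `parse_chapter`, the sample text, and the output field names are hypothetical, and it assumes that each paragraph title sits on its own line directly above its description.

```python
def parse_chapter(text):
    """Split a chapter file into paragraph rows and the raw Q/A section.

    Assumes the layout described above: chapter name, blank line,
    then title + description blocks separated by single blank lines,
    and at least two blank lines before the Question/Answer section.
    """
    # The first run of two empty lines (three newlines) ends the chapter body.
    body, _, qa_section = text.partition("\n\n\n")

    # Blocks inside the body are separated by one empty line.
    blocks = [block for block in body.split("\n\n") if block.strip()]
    chapter = blocks[0].strip()

    rows = []
    for block in blocks[1:]:
        lines = block.strip().split("\n")
        rows.append({
            "chapter": chapter,
            "title": lines[0].strip(),
            # Join any wrapped description lines into one string.
            "paragraph": " ".join(line.strip() for line in lines[1:]),
        })
    return rows, qa_section.strip()


sample = (
    "Cell Biology\n"
    "\n"
    "Cell Membrane\n"
    "The membrane surrounds the cell.\n"
    "\n"
    "Nucleus\n"
    "The nucleus stores DNA.\n"
    "\n"
    "\n"
    "Q1. What surrounds the cell?\n"
)
rows, qa = parse_chapter(sample)
print(rows)
print(qa)
```

The resulting list of dictionaries can be passed to `pandas.DataFrame(rows)` to obtain the organized DataFrame that Step 1 refers to.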
- This method can be used effectively for Information Retrieval, to obtain relevant information from large text documents.
- This Auto-Answering Model can also be used to find subjective answers to given questions from textbooks.
- Accuracy: ~75% (spaCy + Model1)
- Several NLP tasks, such as Keyword Extraction, Vectorization and Summarization, are performed, each of which has many individual applications.
