
NLP Synopsis

The document discusses using text classification and summarization tools to automatically categorize and summarize large amounts of text data. It can identify useful information from datasets and give condensed summaries. The tools perform classification, summarization and sentiment analysis on input documents.

Uploaded by

Vibhakar Sharma
Copyright
© All Rights Reserved

Auto Text Summarization with Classification

and Sentiment Analysis


Under the supervision

of

Dr. BHAWNA SURI


Associate Professor
Department of Computer Science & Engineering

by

Soumya Aggarwal, 07420802716


Vibhakar Raj Sharma, 08120802716

Department of Computer Science & Engineering


Bhagwan Parshuram Institute of Technology
PSP-4, Sec-17, Rohini, Delhi-89
ABSTRACT

In today’s world the volume of information is dramatically increasing, and the value of that
information is growing fast. Modern organizations deal with terabytes of text, such as email, that
often plays a significant role in their day-to-day operations. Even small and medium-sized
organizations are dealing with growing volumes of text that require rapid access and meaningful
analysis on a daily basis.

Identifying useful information in the available datasets is difficult and requires a dedicated
mechanism. One possible solution is a text classification and summarization tool. A text
categorizer automatically arranges a set of documents into predefined concepts (or categories),
and a summarizer gives a condensed and meaningful depiction of the input data such that the output
includes the most significant extracts of the source.
TABLE OF CONTENTS

Abstract
Table of Contents
1.0 Introduction
2.0 Problem Statement & Feasibility Study
3.0 Hardware and Software Requirements
    3.1 Hardware Requirements
    3.2 Software Requirements
4.0 Workload Matrix
5.0 Quality Parameters
Reference
CHAPTER 1- INTRODUCTION

With the massive growth of information on the Internet, retrieving relevant and significant
information with conventional techniques has become both challenging and time consuming. A simple
keyword-based search on the Internet returns thousands of lengthy documents, overwhelming the
user. It is therefore essential to develop tools that efficiently help users identify the desired
documents.

Text classification and summarization are performed on the input documents. After the summaries of
all the classified documents have been obtained, sentiment analysis is performed on each of them in
order to identify whether the sentiment of the summary is positive or negative.

Text classification has always been a vital application because it is used to order documents in
support of data retrieval tasks. The text classification task can be defined as assigning a
category to each document based on the knowledge gained from the Knowledge Base (KB).
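
The classification step described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: it assumes a multinomial Naive Bayes model, a technique the synopsis does not name, and the class `NaiveBayesClassifier`, the helper `tokenize`, and the toy `TRAINING_DATA` (standing in for the Knowledge Base) are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()

    def train(self, labelled_docs):
        """Learn from an iterable of (text, category) pairs."""
        for text, category in labelled_docs:
            self.doc_counts[category] += 1
            for word in tokenize(text):
                self.word_counts[category][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the category with the highest posterior log-probability."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_category, best_score = None, float("-inf")
        for category, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[category].values())
            for word in tokenize(text):
                count = self.word_counts[category][word]
                # Smoothed log likelihood of the word given the category.
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Hypothetical toy knowledge base, for illustration only.
TRAINING_DATA = [
    ("the team won the match with a late goal", "sports"),
    ("the striker scored twice in the final game", "sports"),
    ("the central bank raised interest rates again", "finance"),
    ("shares fell as the market reacted to inflation", "finance"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_DATA)
print(classifier.classify("the bank expects inflation to push rates higher"))
```

With this toy data the example prints `finance`. A real knowledge base would need far more labelled documents per category for the word statistics to be meaningful.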

Text summarization is the process of generating a short, fluent, and most importantly accurate
summary of a longer text document (Brownlee, 2017a). The main idea behind automatic text
summarization is to find a short subset of the most essential information in the entire text and
present it in a human-readable format.
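
One common way to select "a short subset of the most essential information" is extractive summarization by word frequency, sketched below. This is an assumption made for illustration, since the synopsis does not specify a summarization algorithm; the function names, the `STOP_WORDS` list, and the `max_sentences` parameter are hypothetical.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
    "in", "is", "it", "of", "on", "that", "the", "to", "was", "with",
}


def content_words(text):
    """Lowercased alphabetic tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    frequencies = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(frequencies[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)


document = (
    "Text summarization shortens long documents. "
    "Summarization tools pick the key sentences from documents. "
    "The weather was pleasant yesterday."
)
print(summarize(document, max_sentences=2))
```

Here the two sentences about summarization are kept and the unrelated weather sentence is dropped, because its words occur only once in the document. An extractive summary reuses sentences verbatim, so it is accurate to the source but not always fluent.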

Sentiment analysis helps to evaluate ideas, feelings and behaviour, and its results are used to
make decisions. The task is basically to categorize the polarity of a given text, that is, whether
the sentiment expressed in a document is positive or negative. It not only helps the general
public but also assists companies with a thorough evaluation of the behaviour and opinions of the
customers who use their products, thus helping them during the decision-making process.
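
Polarity classification can be illustrated with a simple lexicon-based scorer. This is a sketch under stated assumptions rather than the project's method: the word lists are tiny hand-made examples, the function `polarity` is hypothetical, and the extra `neutral` label for texts with no net sentiment is an addition not mentioned in the synopsis.

```python
import re

# Tiny hand-made lexicon, for illustration only (an assumption, not from the source).
POSITIVE_WORDS = {"good", "great", "excellent", "helpful", "fast", "reliable", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "slow", "broken", "useless", "hate"}
NEGATIONS = {"not", "no", "never"}


def polarity(text):
    """Label a text 'positive', 'negative', or 'neutral' from its net word score."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = 0
    for i, word in enumerate(tokens):
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        value = (word in POSITIVE_WORDS) - (word in NEGATIVE_WORDS)
        # A negation immediately before a sentiment word flips its sign.
        if value and i > 0 and tokens[i - 1] in NEGATIONS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("The service was fast and the staff were helpful."))
print(polarity("The product is not good and delivery was slow."))
```

The first call prints `positive` and the second prints `negative`. In the pipeline described earlier, such a function would be applied to each generated summary rather than to the full document.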
CHAPTER 2- PROBLEM STATEMENT & FEASIBILITY STUDY

Today, our world is flooded by the gathering and dissemination of huge amounts of data. With
such a large amount of data circulating in the digital space, there is a need to develop machine
learning algorithms that can automatically shorten verbose texts, classify them, and deliver
accurate summaries that fluently convey the intended information.

The aim is to create a coherent and fluent summary containing only the main points outlined in the
document. Natural Language Processing techniques have proved critical in quickly and
accurately summarizing and classifying voluminous texts, something that would be expensive and
time consuming if done without machines.

The project is operationally feasible, since small, medium and large companies as well as
general Internet users with basic knowledge of computers and the Internet can use it effectively.
The text summarizer and classifier tool is based on a client-server architecture, in which the
clients are the users and the server is the machine where the datasets are stored.
CHAPTER 3- HARDWARE AND SOFTWARE REQUIREMENTS

3.1 Hardware Requirements


Minimum:
• Intel 486 processor or better
• 16/24 MB RAM
• 10 MB hard disk space

3.2 Software Requirements


• Spyder for running the Python scripts
• Windows 95/98/2000, Windows NT 4.0/2000 Professional
CHAPTER 4- WORKLOAD MATRIX
CHAPTER 5- QUALITY PARAMETERS

PARAMETERS                                 GRADE (0-3)

Innovation                                 2
Real Time Problems                         2
Creativity                                 2
Thoroughness                               2
Knowledge Gained                           3
Accuracy of Conclusions                    2
Helpful for the Society                    2
Quality of Written and Oral Presentation   2
Easy to Use                                3
Scalable                                   3
REFERENCE

[1] Brownlee, J. (2017a, November 29). A Gentle Introduction to Text Summarization. Retrieved
March 02, 2018, from https://machinelearningmastery.com/gentle-introduction-text-summarization/
