Hackathon
DOC-HIVE
DocHive Mates
Mentor: Ritu Jain
Students: Subradeep Bal, Vidhant Jain, Tejas Singh
DocHive
previous interactions.
Creates a mind map from the given data, suited to the user's understanding.
Features of DocHive
Streamlit Sidebar and Interface:
• Allows users to upload PDFs and track the number of characters and chunks processed.
• Provides an input field for user queries and displays responses dynamically.
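The character and chunk counts mentioned above can be illustrated with a minimal sketch. It assumes fixed-size character chunks with a small overlap; the names `chunk_text` and `summarize` and the default sizes are hypothetical and are not taken from the DocHive code. PDF text extraction and the Streamlit widgets are left out so the example stays self-contained.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into character chunks of at most `chunk_size`,
    where consecutive chunks share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def summarize(text: str, chunk_size: int = 1000, overlap: int = 200) -> dict:
    """Return the two numbers a sidebar could display after an upload."""
    chunks = chunk_text(text, chunk_size, overlap)
    return {"characters": len(text), "chunks": len(chunks)}
```

In an app, the dictionary returned by `summarize` could be shown in the sidebar once the text of the uploaded PDF has been extracted.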
Features
Multi-lingual
• The chatbot uses vector stores to run similarity searches, which lets it return accurate,
contextually relevant information quickly.
• DocHive identifies the topics relevant to each user query, so users receive the most
pertinent information with minimal effort.
• These search capabilities reduce the time users spend finding information, which streamlines
learning and increases productivity.
• The chatbot also accesses external databases, giving users a broader knowledge base and
more in-depth information.
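The similarity search described in the first bullet can be sketched as follows. This is a toy illustration, not the DocHive implementation: it assumes word-count vectors in place of learned embeddings and a plain Python list in place of a real vector store, and the names `embed`, `cosine`, and `ToyVectorStore` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Keeps (chunk, vector) pairs in memory and ranks them against a query."""

    def __init__(self):
        self._items = []

    def add(self, chunk: str) -> None:
        self._items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2):
        """Return the k (score, chunk) pairs most similar to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), chunk) for chunk, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]
```

A production system would typically swap `embed` for a neural embedding model and the list scan for an approximate nearest-neighbour index, but the ranking idea is the same.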
Working of NLP
Natural language processing (NLP) uses computational techniques to analyze and understand
human language. Typical tasks include language understanding, language generation, and
language interaction.
• Data Collection: Gathering text data from various sources such as websites, books, social media,
or proprietary databases.
• Data Storage: Storing the collected text data in a structured format, such as a database or a
collection of documents.
• Deployment: Deploying the trained model and using it to make predictions or extract insights
from new text data.
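The data storage step listed above can be illustrated with a short sketch that keeps collected text in a SQLite table. The table name `documents`, its columns, and the helper names are assumptions made for illustration only; the slides do not say which database DocHive uses.

```python
import sqlite3


def create_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database (in memory by default) and create the documents table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, source TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def add_document(conn: sqlite3.Connection, source: str, body: str) -> int:
    """Store one collected text together with the source it came from."""
    cur = conn.execute(
        "INSERT INTO documents (source, body) VALUES (?, ?)", (source, body)
    )
    conn.commit()
    return cur.lastrowid


def documents_from(conn: sqlite3.Connection, source: str) -> list[str]:
    """Return all stored texts for one source, in insertion order."""
    rows = conn.execute(
        "SELECT body FROM documents WHERE source = ? ORDER BY id", (source,)
    )
    return [body for (body,) in rows]
```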
• Text Classification: Categorizing text into predefined classes (e.g., spam detection, sentiment
analysis).
• Named Entity Recognition (NER): Identifying and classifying entities in the text.
• Question Answering: Providing answers to questions based on the context provided by text
data.
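As a concrete instance of the text classification task above, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, applied to spam detection. It is a teaching example under stated assumptions: the class name `NaiveBayesClassifier`, the tokenizer, and the four training sentences are all hypothetical and do not come from DocHive.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.doc_counts[label] += 1
            for tok in tokenize(text):
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)

    def predict(self, text: str):
        """Return the label with the highest log-probability, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_words = sum(self.word_counts[label].values())
            for tok in tokenize(text):
                count = self.word_counts[label][tok]
                # Smoothed log likelihood of the token under this label.
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


examples = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
classifier = NaiveBayesClassifier()
classifier.train(examples)
```

With this toy training set, `classifier.predict("claim your free prize")` selects the `"spam"` label. Real systems would train on far more data and often use neural models instead.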
Diagrammatic representation
Future Enhancements