Synopsis
PROJECT-II SYNOPSIS
ON
Manish (821229)
[AP, CSE]
content using natural language. It integrates LangChain, OpenAI embeddings, and a Chroma vector database to perform
Key Features
- PDF Upload & Parsing: Users can upload PDFs, which are parsed and loaded using PyPDFLoader.
- Text Chunking: Documents are split into overlapping chunks using RecursiveCharacterTextSplitter for optimal
(text-embedding-3-large).
- Vector Store: Embeddings are stored in a persistent Chroma vector store for efficient retrieval.
- Semantic Search: User queries are converted to vectors, and relevant document chunks are retrieved using similarity
search (k=1).
- LLM Response: Retrieved chunks and user prompts are passed to ChatOpenAI (gpt-4o-mini) using a
- Live UI: Built with Streamlit, the app features real-time file uploads, prompt inputs, and streaming answers.
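The chunking and retrieval steps listed above can be illustrated with a minimal, standard-library-only sketch. It is not the project's LangChain code: split_with_overlap is a simplified fixed-size stand-in for RecursiveCharacterTextSplitter, embed is a toy bag-of-words substitute for the OpenAI embedding model, and similarity_search mimics a k=1 lookup over an in-memory list rather than a Chroma store. All function names here are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def split_with_overlap(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size character chunks sharing `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_search(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query (k=1 as in the synopsis)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    print(split_with_overlap("abcdefghij", chunk_size=4, overlap=2))
    docs = [
        "the cat sat on the mat",
        "vector databases store embeddings",
        "streamlit builds web apps",
    ]
    print(similarity_search("where are embeddings stored", docs, k=1))
```

With k=1 only the single best-matching chunk reaches the language model, which keeps prompts short but may miss answers that span several chunks; the overlap between neighbouring chunks partly compensates for that.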
Deployment Instructions
1. Install Dependencies:
export OPENAI_API_KEY=your_key_here
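Because the app depends on OPENAI_API_KEY being exported, a start-up check along the following lines could fail fast with a clear message instead of an error deep inside the first API call. This is only a sketch; the helper name require_api_key is hypothetical and not part of the project or of any library.

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return key
```

Calling require_api_key() once at the top of the Streamlit script would be enough; the env parameter exists only so the check can be exercised with a plain dictionary.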
This project showcases the seamless integration of document processing, vector search, and AI interaction. With minimal
user input, the app delivers accurate, context-aware answers from uploaded documents, demonstrating real-world potential
Future Scope
References