Notes - by Kishor
1. Document Processing:
o The application loads all .txt files from your documents directory
o Each file is split into smaller, overlapping chunks
o Each chunk is converted into a numerical vector using OpenAI's embedding model
2. Vector Storage:
o The chunk vectors are stored in a Chroma vector store so similar chunks can be found quickly
3. Question Answering:
o The chunks most relevant to your question are retrieved and passed to the LLM to generate an answer
Troubleshooting:
1. Import Errors:
o Check your .env file exists and contains the correct API key
```python
import os
import argparse

from dotenv import load_dotenv
from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI


class RAGApplication:
    """
    A simple Retrieval-Augmented Generation application.

    Parameters:
    - documents_dir: Directory containing the .txt files to index
    - openai_api_key: Optional API key (can also be set via .env file)
    """

    def __init__(self, documents_dir="documents", openai_api_key=None):
        load_dotenv()
        if openai_api_key:
            os.environ["OPENAI_API_KEY"] = openai_api_key
        self.documents_dir = documents_dir
        self.embeddings = OpenAIEmbeddings()
        self.vector_store = None
        self.qa_chain = None

    def load_documents(self):
        """Load every .txt file from the documents directory."""
        # Create a loader that will read all .txt files in the directory
        loader = DirectoryLoader(
            self.documents_dir,
            glob="**/*.txt",
            loader_cls=TextLoader
        )
        documents = loader.load()
        return documents

    def split_documents(self, documents):
        """
        Split documents into overlapping chunks for embedding.

        Parameters:
        - documents: The list of documents returned by load_documents()
        """
        text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200
        )
        # Split all documents into chunks
        splits = text_splitter.split_documents(documents)
        return splits

    def create_vector_store(self, splits):
        """
        This converts text chunks to vectors and stores them for similarity search.
        """
        self.vector_store = Chroma.from_documents(
            documents=splits,
            embedding=self.embeddings,
            persist_directory="chroma_db"
        )
        self.vector_store.persist()

    def setup_qa_chain(self):
        """
        Wire the retriever and the LLM together into a question-answering chain.
        """
        retriever = self.vector_store.as_retriever(
            search_kwargs={"k": 4}
        )
        self.qa_chain = RetrievalQA.from_chain_type(
            llm=OpenAI(),
            chain_type="stuff",
            retriever=retriever,
            return_source_documents=True
        )

    def query(self, question):
        """
        Answer a question using the indexed documents.

        Parameters:
        - question: The question to answer
        Returns:
        - A dict with the answer text and the source document contents
        """
        if not self.qa_chain:
            raise RuntimeError("Call setup_qa_chain() before querying")
        response = self.qa_chain({"query": question})
        return {
            "answer": response["result"],
            "sources": [doc.page_content for doc in response["source_documents"]]
        }


def main():
    parser = argparse.ArgumentParser(description="Simple RAG application")
    parser.add_argument('--docs_dir', type=str, default='documents',
                        help='Directory containing the .txt documents')
    parser.add_argument('--question', type=str,
                        help='Question to answer (omit for interactive mode)')
    parser.add_argument('--api_key', type=str,
                        help='OpenAI API key (overrides the .env file)')
    args = parser.parse_args()

    rag_app = RAGApplication(
        documents_dir=args.docs_dir,
        openai_api_key=args.api_key
    )

    print("Loading documents...")
    documents = rag_app.load_documents()
    print("Splitting documents...")
    splits = rag_app.split_documents(documents)
    rag_app.create_vector_store(splits)
    rag_app.setup_qa_chain()

    # Handle questions either from command line or interactive mode
    if args.question:
        response = rag_app.query(args.question)
        print("\nAnswer:", response['answer'])
        print("\nSource Documents:")
        for idx, doc in enumerate(response['sources'], 1):
            print(f"\nDocument {idx}:")
            print(doc[:200] + "...")
    else:
        # Interactive mode
        while True:
            question = input("\nAsk a question (or type 'quit' to exit): ")
            if question.lower() == 'quit':
                break
            response = rag_app.query(question)
            print("\nAnswer:", response['answer'])
            print("\nSource Documents:")
            for idx, doc in enumerate(response['sources'], 1):
                print(f"\nDocument {idx}:")
                print(doc[:200] + "...")


if __name__ == "__main__":
    main()
```
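To see what the splitting step does, here is a simplified, character-only sketch of chunking with overlap. It is not the real RecursiveCharacterTextSplitter (which also tries to break at paragraph and sentence boundaries); the chunk_size and chunk_overlap values are just the same illustrative defaults used above.

```python
def chunk_text(text, chunk_size=1000, chunk_overlap=200):
    """Cut text into fixed-size chunks where each chunk repeats the
    last `chunk_overlap` characters of the previous one."""
    chunks = []
    step = chunk_size - chunk_overlap  # how far the window advances
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # the last window already covered the end of the text
    return chunks

chunks = chunk_text("a" * 2500, chunk_size=1000, chunk_overlap=200)
print(len(chunks))  # prints 3
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which helps retrieval quality.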
Now comes the exciting part - actually running your RAG application! You have several ways to do this:
1. Basic Usage (interactive mode):
```bash
python rag_app.py
```
a) Document Processing:
- The system reads all your text files from the documents folder, splits them into chunks, and embeds each chunk
b) Vector Storage:
- The chunk embeddings are saved in a Chroma vector store for similarity search
c) Question-Answering:
- The system retrieves the chunks most relevant to your question, then shows you both the answer and the source documents it used
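Conceptually, the question-answering step works by embedding the question and comparing it against the stored chunk vectors. Here is a toy, pure-Python sketch of that similarity search; the three-dimensional vectors and chunk names are made up for illustration (real embeddings have hundreds or thousands of dimensions and come from OpenAI's embedding model):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend embeddings for three stored chunks (hand-made, illustrative only)
chunk_vectors = {
    "chunk about cats": [0.9, 0.1, 0.0],
    "chunk about dogs": [0.1, 0.9, 0.0],
    "chunk about cars": [0.0, 0.1, 0.9],
}
question_vector = [0.8, 0.2, 0.1]  # pretend embedding of "tell me about cats"

# Rank chunks by similarity to the question; the retriever returns the top k
ranked = sorted(chunk_vectors.items(),
                key=lambda kv: cosine_similarity(question_vector, kv[1]),
                reverse=True)
print(ranked[0][0])  # prints: chunk about cats
```

The vector store does exactly this ranking (with optimized data structures) and hands the top-k chunks to the LLM as context.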
2. Single-Question Mode:
```bash
# example question - replace with your own
python rag_app.py --question "What do my notes say about vector storage?"
```
3. Advanced Configuration:
```bash
# point at a different documents folder and pass the API key explicitly
python rag_app.py --docs_dir ./my_documents --api_key YOUR_OPENAI_API_KEY
```
If something goes wrong:
- Check your .env file exists and has the correct API key
Linux commands:
Intro 0:05
⏩ ssh 0:21
⏩ ls 0:30
⏩ pwd 0:35
⏩ cd 0:51
⏩ touch 1:23
⏩ echo 1:32
⏩ nano 1:42
⏩ vim 1:56
⏩ cat 2:02
⏩ shred 2:10
⏩ mkdir 2:15
⏩ cp 2:26
⏩ rm 2:28
⏩ rmdir 2:38
⏩ ln 2:45
⏩ clear 2:50
⏩ whoami 2:57
⏩ useradd 3:02
⏩ sudo 3:08
⏩ adduser 3:15
⏩ su 3:21
⏩ exit 3:29
⏩ passwd 3:50
⏩ apt 4:12
⏩ finger 4:20
⏩ man 4:33
⏩ whatis 4:55
⏩ curl 5:05
⏩ zip 5:13
⏩ unzip 5:20
⏩ less 5:29
⏩ head 5:32
⏩ tail 5:34
⏩ cmp 5:42
⏩ diff 5:50
⏩ sort 6:00
⏩ find 6:19
⏩ chmod 6:24
⏩ chown 6:34
⏩ ifconfig 6:40
⏩ ip address 6:47
⏩ grep 7:02
⏩ awk 7:26
⏩ resolvectl status 7:31
⏩ ping 7:57
⏩ netstat 8:08
⏩ ss 8:14
⏩ iptables 8:24
⏩ ufw 8:43
⏩ uname 8:52
⏩ neofetch 9:01
⏩ cal 9:14
⏩ free 9:21
⏩ df 9:28
⏩ ps 9:36
⏩ top 9:40
⏩ htop 9:44
⏩ kill 10:03
⏩ pkill 10:14
⏩ systemctl 10:29
⏩ history 10:35
⏩ reboot 10:37
⏩ shutdown
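A few of the file and text commands above can be strung together in a short practice session (directory and file names here are made up for the example):

```shell
mkdir -p demo_dir                          # make a directory
cd demo_dir                                # move into it
touch empty.txt                            # create an empty file
echo "hello from the notes" > notes.txt    # write a line of text to a file
cat notes.txt                              # print the file contents
grep hello notes.txt                       # search the file for a pattern
cd ..                                      # go back up
rm -r demo_dir                             # remove the directory and its contents
```

Running this leaves nothing behind, so it is safe to repeat while practicing.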