Interview Task 1

The document outlines the development of an Intelligent Document Processing and Query System that processes technical PDF documents, extracts key information, and stores it in a vector database for user query responses. It includes requirements for document processing, information extraction, vector database integration, query processing, response generation, system integration, and performance optimization. The deliverables consist of Python code, documentation, and a performance report.

Uploaded by

phalkeshubham19

Task: Intelligent Document Processing and Query System

Objective:
Develop a system that processes technical PDF documents, extracts key
information, stores it in a vector database, and provides relevant responses to
user queries using Retrieval-Augmented Generation (RAG).

Requirements:
1. Document Processing:
- Accept 10 PDF files as input.
- Extract text content from each PDF.
- Split each document into logical sections (e.g., paragraphs or pages).
2. Information Extraction and Tagging:
- For each section, extract and tag the following information:
a. Equipment name
b. Domain (e.g., electronics, mechanical, software)
c. Model numbers
d. Manufacturer
3. Vector Database Integration:
- Choose and implement a suitable vector database (e.g., Pinecone, Weaviate,
Milvus, or any other of your choice).
- Convert each tagged section into a vector representation.
- Store the vectors along with their associated metadata (tags) in the database.
4. Query Processing:
- Implement a user interface to accept natural language queries.
- Identify which equipment, model, or manufacturer the query refers to.
- Convert user queries into vector representations.
- Perform a cosine similarity search in the vector database to retrieve the most
relevant sections for the matching equipment, model, or manufacturer.
5. Response Generation:
- Utilize a Language Model (e.g., GPT-3, GPT-4) for response generation.
- Use this API key if you do not have your own (key - sk-proj-
3NAMKruBiPy16sQr1ixNT3BlbkFJmRPJIl1zNhn7qH2bD1dI). Make sure that activity
on this key is monitored so use it only for this task.
- Design an effective prompt that incorporates the retrieved relevant sections
and the user's query.
- Generate a coherent and informative response based on the retrieved
information.
6. System Integration:
- Develop a Python application that integrates all the above components.
- Ensure smooth data flow from document processing to query response.
7. Performance and Scalability:
- Optimize the system for quick response times.
- Design the system to handle potential scaling to more documents in the future.
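
The retrieval requirements above (split into sections, attach tags, vectorize, filter by tag, rank by cosine similarity) can be sketched in a few lines of standard-library Python. This is only an illustration under loud assumptions: `embed` is a toy word-count embedding standing in for a real embedding model, the in-memory `store` list stands in for a vector database, the tags are supplied by hand rather than extracted, and "Acme" and its drill are made-up sample data. All function and class names are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Section:
    text: str
    tags: dict  # metadata such as {"manufacturer": "XYZ Corp"}
    vector: Counter = field(default_factory=Counter)


def split_into_sections(text: str) -> list[str]:
    """Split extracted PDF text into paragraphs on blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase word counts, not a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(store: list[Section], query: str, filters: dict, top_k: int = 3) -> list[Section]:
    """Keep sections whose tags match every filter, then rank by cosine similarity."""
    q = embed(query)
    candidates = [s for s in store if all(s.tags.get(k) == v for k, v in filters.items())]
    return sorted(candidates, key=lambda s: cosine(q, s.vector), reverse=True)[:top_k]


# Made-up sample text; in the real system this comes from the PDF extractor.
raw = (
    "XYZ Corp ABC123 smartphone. Power consumption is 5W in standby.\n\n"
    "Acme D9 drill. Torque is 40 Nm."
)
# Hand-written tags (assumption); the task expects these to be extracted.
tags = [{"manufacturer": "XYZ Corp"}, {"manufacturer": "Acme"}]
store = [Section(t, tg, embed(t)) for t, tg in zip(split_into_sections(raw), tags)]
hits = search(store, "power consumption of the smartphone", {"manufacturer": "XYZ Corp"})
```

Filtering on tags before ranking keeps sections about other equipment out of the results even when their wording happens to resemble the query. Most vector databases offer a comparable metadata filter, though the exact syntax differs per product.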

Example Scenario:
Input: 10 PDF files containing technical specifications of various electronic
devices.
User Query: "What is the power consumption of the latest XYZ Corp
smartphone?"

Expected System Behaviour:

1. Process and tag all 10 PDFs, storing information in the vector database.
2. Convert the user query to a vector.
3. Retrieve the most relevant section(s) from the database.
4. Generate a response using the LLM, incorporating the retrieved information.
5. Present the answer to the user, e.g., "The latest XYZ Corp smartphone, model
ABC123, has a power consumption of 5W in standby mode and up to 15W during
peak usage, according to the technical specifications."
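
Step 4 depends on how the retrieved sections and the query are combined into a prompt. The sketch below shows one possible layout, not a prescribed one. The function name `build_prompt`, the `max_chars` budget, the dictionary shape of a section, and the instruction wording are all assumptions made for illustration. The call to the language model itself is left out.

```python
def build_prompt(query: str, sections: list[dict], max_chars: int = 4000) -> str:
    """Assemble a grounded prompt from retrieved sections and the user's query.

    Each section is assumed to look like {"text": str, "tags": dict}.
    """
    context_parts = []
    used = 0
    for i, sec in enumerate(sections, start=1):
        tag_str = ", ".join(f"{k}={v}" for k, v in sec["tags"].items())
        block = f"[{i}] ({tag_str})\n{sec['text']}"
        # Stop adding context once the rough character budget is exhausted.
        if used + len(block) > max_chars:
            break
        context_parts.append(block)
        used += len(block)
    context = "\n\n".join(context_parts) or "(no relevant sections found)"
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


# Sample data taken from the example scenario in this task.
retrieved = [
    {
        "text": "Model ABC123: power consumption 5W in standby mode, "
                "up to 15W during peak usage.",
        "tags": {"manufacturer": "XYZ Corp", "model": "ABC123"},
    }
]
prompt = build_prompt(
    "What is the power consumption of the latest XYZ Corp smartphone?", retrieved
)
```

Numbering the sections and showing their tags lets the model cite which section an answer came from. Telling it to admit when the context is insufficient is one common way to reduce unsupported answers.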

Deliverables:
1. Python code for the entire system.
2. Documentation explaining the architecture, chosen technologies, and how to
run the system.
3. A brief report on the system's performance, including response times and
accuracy.
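
For the performance report, response time and accuracy could be collected with a small harness like the one below. It is a sketch under assumptions: `evaluate` and `fake_answer` are hypothetical names, accuracy is approximated by checking whether an expected substring appears in the answer, and `fake_answer` is a stand-in for the real query pipeline.

```python
import statistics
import time


def evaluate(answer_fn, test_cases):
    """Run each (query, expected_substring) pair; record latency and correctness.

    Assumes test_cases is non-empty.
    """
    latencies = []
    correct = 0
    for query, expected in test_cases:
        start = time.perf_counter()
        answer = answer_fn(query)
        latencies.append(time.perf_counter() - start)
        # Crude correctness check (assumption): expected text appears in the answer.
        correct += expected.lower() in answer.lower()
    return {
        "queries": len(test_cases),
        "accuracy": correct / len(test_cases),
        "mean_latency_s": statistics.mean(latencies),
        "max_latency_s": max(latencies),
    }


def fake_answer(query: str) -> str:
    """Hypothetical stand-in for the real pipeline; always returns the same text."""
    return "The ABC123 draws 5W in standby mode."


report = evaluate(
    fake_answer,
    [("standby power of ABC123?", "5W"), ("peak power of ABC123?", "15W")],
)
```

Substring matching is a coarse measure of accuracy. A manual review of a sample of answers would give a more trustworthy figure for the report.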

Tip:
Feel free to use an LLM to generate code for you.
