
What is retrieval-augmented generation?

Explainer | 5 minute read | 22 Aug 2023
By Kim Martineau

Topics: AI, Explainable AI, Generative AI, Natural Language Processing, Trustworthy Generation

RAG is an AI framework for retrieving facts from an external knowledge base to ground large language models (LLMs) on the most accurate, up-to-date information and to give users insight into LLMs' generative process.

Large language models can be inconsistent. Sometimes they nail the answer to questions, other times they regurgitate random facts from their training data. If they occasionally sound like they have no idea what they’re saying, it’s because they don’t. LLMs know how words relate statistically, but not what they mean.

Retrieval-augmented generation (RAG) is an AI framework for improving the quality of LLM-generated responses by grounding the model on external sources of knowledge to supplement the LLM’s internal representation of information.
Implementing RAG in an LLM-based question answering system has two main benefits: it ensures that the model has access to the most current, reliable facts, and it gives users access to the model’s sources, so its claims can be checked for accuracy and ultimately trusted.

“You want to cross-reference a model’s answers with the original content so you can see what it is basing its answer on,” said Luis Lastras, director of language technologies at IBM Research.

RAG has additional benefits. By grounding an LLM on a set of external, verifiable facts, the model has fewer opportunities to pull information baked into its parameters. This reduces the chances that an LLM will leak sensitive data, or ‘hallucinate’ incorrect or misleading information.

RAG also reduces the need for users to continuously train the model on new data and update its parameters as circumstances evolve. In this way, RAG can lower the computational and financial costs of running LLM-powered chatbots in an enterprise setting. IBM unveiled its new AI and data platform, watsonx, which offers RAG, back in May.

An overview from IBM expert Marina Danilevsky


An ‘open book’ approach to answering tough questions

Underpinning all foundation models, including LLMs, is an AI architecture known as the transformer. It turns heaps of raw data into a compressed representation of its basic structure. Starting from this raw representation, a foundation model can be adapted to a variety of tasks with some additional fine-tuning on labeled, domain-specific knowledge.

But fine-tuning alone rarely gives the model the full breadth of knowledge it needs to answer highly specific questions in an ever-changing context. In a 2020 paper, Meta (then known as Facebook) came up with a framework called retrieval-augmented generation to give LLMs access to information beyond their training data. RAG allows LLMs to build on a specialized body of knowledge to answer questions in a more accurate way.

“It’s the difference between an open-book and a closed-book exam,” Lastras said. “In a RAG system, you are asking the model to respond to a question by browsing through the content in a book, as opposed to trying to remember facts from memory.”

As the name suggests, RAG has two phases: retrieval and content generation. In the retrieval phase, algorithms search for and retrieve snippets of information relevant to the user’s prompt or question. In an open-domain, consumer setting, those facts can come from indexed documents on the internet; in a closed-domain, enterprise setting, a narrower set of sources is typically used for added security and reliability.

This assortment of external knowledge is appended to the user’s prompt and passed to the language model. In the generative phase, the LLM draws from the augmented prompt and its internal representation of its training data to synthesize an engaging answer tailored to the user in that instant. The answer can then be passed to a chatbot with links to its sources.
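
To make the two phases concrete, here is a minimal sketch of the retrieve-then-generate loop in Python. The keyword-overlap retriever, the document snippets, and the generate() stub are hypothetical stand-ins, not IBM's implementation; a production system would use a neural retriever and a real LLM endpoint.

```python
# A minimal sketch of RAG's two phases: retrieval, then generation.
# The knowledge base, retriever, and generate() stub are hypothetical.

KNOWLEDGE_BASE = [
    "Employees accrue 1.5 vacation days per month of service.",
    "Vacation may be taken in half-day increments with manager approval.",
    "Parental leave policies vary by the employee's state or country.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieval phase: rank snippets by naive keyword overlap with the query."""
    query_words = set(query.lower().split())
    scored = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:k]

def generate(prompt: str) -> str:
    """Generation phase: stand-in for a call to an LLM."""
    return f"<LLM response conditioned on:\n{prompt}>"

def rag_answer(question: str) -> str:
    # Append the retrieved snippets to the user's question (prompt augmentation).
    snippets = retrieve(question)
    prompt = "Answer using only the context below.\n\n"
    prompt += "\n".join(f"- {s}" for s in snippets)
    prompt += f"\n\nQuestion: {question}"
    return generate(prompt)

print(rag_answer("Can I take vacation in half-day increments?"))
```

In practice, the keyword overlap above would be replaced by a learned retriever over a document index, but the shape of the loop is the same: fetch relevant snippets first, then condition the model's answer on them.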

Toward personalized and verifiable responses

Before LLMs, digital conversation agents followed a manual dialogue flow. They confirmed the customer’s intent, fetched the requested information, and delivered an answer in a one-size-fits-all script. For straightforward queries, this manual decision-tree method worked just fine.

But it had limitations. Anticipating and scripting answers to every question a customer might conceivably ask took time; if you missed a scenario, the chatbot had no ability to improvise. Updating the scripts as policies and circumstances evolved was either impractical or impossible.

Today, LLM-powered chatbots can give customers more personalized answers without humans having to write out new scripts. And RAG allows LLMs to go one step further by greatly reducing the need to feed and retrain the model on fresh examples. Simply upload the latest documents or policies, and the model retrieves the information in open-book mode to answer the question.

IBM is currently using RAG to ground its internal customer-care chatbots on content that can be verified and trusted. This real-world scenario shows how it works: An employee, Alice, has learned that her son’s school will have early dismissal on Wednesdays for the rest of the year. She wants to know if she can take vacation in half-day increments and if she has enough vacation to finish the year.

To craft its response, the LLM first pulls data from Alice’s HR files to find out how much vacation she gets as a longtime employee, and how many days she has left for the year. It also searches the company’s policies to verify that her vacation can be taken in half-days. These facts are injected into Alice’s initial query and passed to the LLM, which generates a concise, personalized answer. A chatbot delivers the response, with links to its sources.
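
A sketch of what that fact injection might look like follows. The field names, record values, and template wording are invented for illustration; the article does not publish IBM's actual prompt format.

```python
# Hypothetical illustration of fact injection for the Alice scenario.
# The HR record, policy snippet, and template are invented examples.

hr_record = {"employee": "Alice", "annual_vacation_days": 25, "days_remaining": 14}
policy_snippet = "Vacation may be taken in half-day increments with manager approval."
user_query = "Can I take vacation in half-day increments, and do I have enough left?"

augmented_prompt = (
    "You are an HR assistant. Answer using only the facts below, and cite them.\n\n"
    f"HR record: {hr_record}\n"
    f"Policy: {policy_snippet}\n\n"
    f"Employee question: {user_query}"
)
# The augmented prompt, not the bare question, is what gets sent to the LLM.
print(augmented_prompt)
```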

Teaching the model to recognize when it doesn’t know

Customer queries aren’t always this straightforward. They can be ambiguously worded, complex, or require knowledge the model either doesn’t have or can’t easily parse. These are the conditions in which LLMs are prone to making things up.

“Think of the model as an overeager junior employee that blurts out an answer before checking the facts,” said Lastras. “Experience teaches us to stop and say when we don’t know something. But LLMs need to be explicitly trained to recognize questions they can’t answer.”

In a more challenging scenario taken from real life, Alice wants to know how many days of maternity leave she gets. A chatbot that does not use RAG responds cheerfully (and incorrectly): “Take as long as you want.”

Maternity-leave policies are complex, in part, because they vary by the state or country of the employee’s home office. When the LLM failed to find a precise answer, it should have responded, “I’m sorry, I don’t know,” said Lastras, or asked additional questions until it could land on a question it could definitively answer. Instead, it pulled a phrase from a training set stocked with empathetic, customer-pleasing language.

With enough fine-tuning, an LLM can be trained to pause and say when it’s stuck. But it may need to see thousands of examples of questions that can and can’t be answered. Only then can the model learn to identify an unanswerable question, and probe for more detail until it hits on a question that it has the information to answer.
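
One way to picture that training data is as instruction-tuning pairs in which unanswerable questions map to an explicit refusal or a clarifying question. The sketch below is an assumption about the data format; the article does not specify how IBM structures these examples.

```python
# Hypothetical fine-tuning examples for teaching abstention.
# The JSONL format and wording are assumptions, not IBM's actual data.
import json

examples = [
    {"prompt": "How many vacation days do I have left?",
     "completion": "You have 14 vacation days remaining this year."},
    {"prompt": "How many days of maternity leave do I get?",
     "completion": "I'm sorry, I don't know. Leave policies vary by location; "
                   "which state or country is your home office in?"},
]

# Serialize to JSON Lines, a common format for instruction fine-tuning.
with open("abstention_examples.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```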

RAG is currently the best-known tool for grounding LLMs on the latest, verifiable information, and lowering the costs of having to constantly retrain and update them. RAG depends on the ability to enrich prompts with relevant information contained in vectors, which are mathematical representations of data. Vector databases can efficiently index, store and retrieve information for things like recommendation engines and chatbots. But RAG is imperfect, and many interesting challenges remain in getting RAG done right.
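
To illustrate the vector idea, here is a minimal dense-retrieval sketch that ranks documents by cosine similarity between embedding vectors. The embed() function is a toy hash-based stand-in for a real embedding model, and the in-memory list stands in for a vector database.

```python
# Toy dense retrieval: embed texts as vectors, rank by cosine similarity.
# embed() is a hash-based stand-in for a real embedding model, and the
# plain list stands in for a vector database index.
import hashlib
import math

DIM = 64

def embed(text: str) -> list[float]:
    """Map a text to a fixed-size vector by hashing its words (toy only)."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

docs = [
    "Vacation may be taken in half-day increments.",
    "Parental leave varies by state or country.",
]
index = [(doc, embed(doc)) for doc in docs]  # the "vector database"

# Retrieve the document whose vector is closest to the query's vector.
query_vec = embed("Can I take a half day of vacation?")
best = max(index, key=lambda item: cosine(query_vec, item[1]))
print(best[0])
```

A real embedding model places semantically similar texts near each other in vector space, so similarity search finds relevant passages even when they share no exact keywords; the hash trick above only captures word overlap.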

At IBM Research, we are focused on innovating at both ends of the process: retrieval, how to find and fetch the most relevant information possible to feed the LLM; and generation, how to best structure that information to get the richest responses from the LLM.

