Generative AI Applications
Customizing generative AI
applications for your business
using your own data
Maira Ladeira Tanke
(she/her)
Sr. Generative AI Data Scientist
AWS
© 2024, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Agenda
Why customize?
• E.g., healthcare: Understand medical terminology and provide accurate responses related to a patient’s health
• E.g., finance: Teach financial and accounting terms to provide good analysis for earnings reports
• E.g., customer service: Improve ability to understand and respond to a customer’s inquiries and complaints
• E.g., legal services: Better understand case facts and law to provide useful insights for attorneys
Customizing foundation models to understand
your use case
FOUNDATION
MODEL
Augment models without changing pretrained
model weights with knowledge bases
FOUNDATION
MODEL
FOUNDATION
MODEL
AGENTS DATABASES,
APIS
Adapt models for your use case with fine-tuning
FOUNDATION
MODEL
FINE-TUNING LABELED
DATA
Update your models through continued
pretraining
FOUNDATION
MODEL
CONTINUED UNLABELED
PRETRAINING DATA
Approaches to customizing models with your data
DATA NEED
• Contextual information based on user’s query
• Contextual information based on user’s query
• Small number of labeled examples
• Large number of unlabeled datasets
Customize
• External data sources or up-to-date info
• Consolidated or historical info
• Task information
Amazon Bedrock
Broad choice of models
• AI21 Labs: Jurassic-2 Ultra, Jurassic-2 Mid
• Amazon: Titan Text Embeddings, Titan Multimodal Embeddings, Titan Text Lite, Titan Text Express, Titan Image Generator
• Anthropic: Claude 2, Claude 2.1, Claude Instant
• Cohere: Command + Embed, Cohere Command Light, Cohere Embed English, Cohere Embed Multilingual
• Meta: Llama 2, Llama 2 13B, Llama 2 70B
• Stability AI: Stable Diffusion XL 1.0
Retrieval-augmented
generation (RAG)
What is retrieval-augmented generation?
• Retrieval: Fetches the relevant content from the external knowledge base or data sources based on a user query
• Augmentation: Adding the retrieved relevant context to the user prompt, which goes as an input to the foundation model
• Generation: Response from the foundation model based on the augmented prompt
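The three phases above can be sketched in a few lines of Python. This is a minimal illustration, not how Amazon Bedrock implements RAG: the in-memory knowledge base, the word-overlap scoring, and the `generate` stub (which stands in for a foundation model call) are all assumptions made for the example.

```python
import re

# Hypothetical in-memory knowledge base (an assumption for this sketch).
KNOWLEDGE_BASE = [
    "Employees accrue 1.5 vacation days per month.",
    "Expense reports are due by the fifth business day of each month.",
    "The office is closed on public holidays.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, top_k: int = 1) -> list[str]:
    """Retrieval: rank documents by word overlap with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def augment(query: str, context: list[str]) -> str:
    """Augmentation: add the retrieved context to the user prompt."""
    context_block = "\n".join(f"- {passage}" for passage in context)
    return f"Use only this context:\n{context_block}\n\nQuestion: {query}"


def generate(prompt: str) -> str:
    """Generation: placeholder for a foundation model call."""
    return f"[model response based on a {len(prompt)}-character prompt]"


query = "How many vacation days do employees accrue?"
context = retrieve(query)
prompt = augment(query, context)
answer = generate(prompt)
```

In a real application the retrieval step would typically use embeddings rather than word overlap, and `generate` would call a hosted model.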
RAG use cases
• E.g., helps in reducing hallucinations and connecting with recent knowledge, including enterprise data
• E.g., enhances chatbot capabilities by integrating with real-time data
• E.g., searches based on a user’s previous search history and persona
• E.g., retrieves and summarizes transactional data from databases or API calls
What are embeddings?
• Numerical representation of text (vectors) that captures semantics and relationships between words.
• Embedding models capture features and nuances of the text.
EMBEDDING MODEL
New York   0.027  -0.011  …  -0.023
Paris      0.025  -0.009  …  -0.025
Animal    -0.011   0.021  …   0.013
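One common way to compare two embeddings is cosine similarity. The sketch below uses made-up three-dimensional vectors (an assumption for illustration; real embedding models return hundreds or thousands of dimensions) to show that related terms score higher than unrelated ones.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors, not the output of any real embedding model.
embeddings = {
    "New York": [0.9, 0.1, 0.0],
    "Paris": [0.8, 0.2, 0.1],
    "Animal": [0.0, 0.1, 0.9],
}

city_similarity = cosine_similarity(embeddings["New York"], embeddings["Paris"])
animal_similarity = cosine_similarity(embeddings["New York"], embeddings["Animal"])
```

With these toy values the two cities are far more similar to each other than either is to "Animal", which mirrors the intuition on the slide.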
Why are embeddings important for RAG?
Titan text embeddings model
Translates text inputs (words, phrases) into numerical representations (embeddings). Comparing embeddings produces more relevant and contextual responses than word matching.
• Titan Text Embeddings offers fast, cost-effective, high-performance, accurate embeddings in 25 languages.
• Optimized for text retrieval tasks, semantic similarity, and clustering.
Cohere embeddings model
Embed English
Embed is Cohere's text representation, or embeddings, model. This version supports English only.
Supported use cases: Semantic search, retrieval-augmented generation (RAG), classification, clustering.
Embed Multilingual
Embed is Cohere's text representation, or embeddings, model. This version supports multiple languages.
Supported use cases: Semantic search, retrieval-augmented generation (RAG), classification, clustering.
How RAG works
[Diagram: text generation workflow with the labels User, User input, Embeddings model, Semantic search, Context, Prompt augmentation, Large language model, and Response; a data ingestion workflow feeds the semantic search]
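The workflow in the diagram can be sketched end to end with a toy vector store. Everything here is an assumption for illustration: `embed` is a word-count stand-in for a real embeddings model, and `VectorStore` is a hypothetical in-memory class, not a Bedrock or vector database API.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embeddings model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store holding (embedding, document) pairs."""

    def __init__(self) -> None:
        self.entries: list[tuple[Counter, str]] = []

    def ingest(self, documents: list[str]) -> None:
        """Data ingestion: embed each document and store the pair."""
        for doc in documents:
            self.entries.append((embed(doc), doc))

    def search(self, query: str, top_k: int = 2) -> list[str]:
        """Semantic search: return the documents closest to the query."""
        query_vector = embed(query)
        ranked = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[0]),
            reverse=True,
        )
        return [doc for _, doc in ranked[:top_k]]


store = VectorStore()
store.ingest([
    "Refunds are processed within five business days.",
    "Our support line is open on weekdays.",
    "Shipping is free for orders above fifty dollars.",
])

user_query = "When are refunds processed?"
context = store.search(user_query, top_k=1)
# Prompt augmentation: the retrieved context is placed ahead of the question.
augmented_prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {user_query}"
```

The augmented prompt would then be sent to a large language model to produce the response.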
However, when it comes to implementing RAG,
there are challenges…
Knowledge bases for Amazon Bedrock
NATIVE SUPPORT FOR RETRIEVAL-AUGMENTED GENERATION (RAG)
• Securely connect FMs to data sources for RAG to deliver more relevant responses
• Fully managed RAG workflow, including ingestion, retrieval, and augmentation
• Built-in session context management for multi-turn conversations
• Automatic citations with retrievals to improve transparency
[Diagram: numbered steps 1 to 6 through Knowledge Bases for Amazon Bedrock and Amazon Bedrock; models shown: AI21 Labs Jurassic-2, Amazon Titan Text, Anthropic Claude, Meta Llama 2, Cohere Command]
Data ingestion workflow
KNOWLEDGE BASES FOR AMAZON BEDROCK
[Diagram: fully managed data ingestion workflow with the labels New data, Data source, Document chunks, Embeddings model, and Vector store]
Choose your data source (Amazon S3)
• Support for incremental updates
• Multiple data file formats supported
Choose your chunking strategy
• Fixed chunks
• No chunking
• Default (200 tokens)
Choose your embedding model
• Amazon Titan
• Cohere Embed
Choose your vector store
• Amazon OpenSearch Serverless
• Amazon Aurora
• Pinecone
• Redis
• MongoDB (coming soon)
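To make the fixed-chunk strategy concrete, here is a minimal sketch. It approximates tokens with whitespace-separated words, which is an assumption; a managed service would count tokens with a model tokenizer. The `overlap` parameter is also an assumption added for illustration.

```python
def chunk_fixed(text: str, max_tokens: int = 200, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most max_tokens whitespace-separated words.

    Consecutive chunks share `overlap` words so that context spanning a
    boundary is not lost entirely.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in the range [0, max_tokens)")

    tokens = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        # Stop once the current chunk has reached the end of the text.
        if start + max_tokens >= len(tokens):
            break
    return chunks


# A synthetic 450-word document: w0 w1 ... w449.
document = " ".join(f"w{i}" for i in range(450))
chunks = chunk_fixed(document)                     # default: 200 words per chunk
overlapping = chunk_fixed(document, 200, overlap=50)
```

The "no chunking" option corresponds to treating each document as a single chunk, that is, `[document]`.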
Retrieve and generate
[Diagram: fully managed RAG text generation workflow with the labels User, User input, User query, Embeddings model, Embedding (0.89 -0.02 -0.53 0.95 0.17 -0.38), Semantic search, Context, Generated response, and Response]
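As a hedged sketch of what calling the fully managed workflow might look like from Python, the code below builds a request for the boto3 `bedrock-agent-runtime` client's `retrieve_and_generate` operation. The parameter shape reflects that API as best understood at the time of writing and should be checked against the current documentation; the knowledge base ID is a placeholder and the model ARN is only an example value.

```python
def build_retrieve_and_generate_request(
    question: str, knowledge_base_id: str, model_arn: str
) -> dict:
    """Assemble keyword arguments for retrieve_and_generate."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
            },
        },
    }


def ask_knowledge_base(question: str, knowledge_base_id: str, model_arn: str):
    """Call the managed RAG workflow; requires AWS credentials and boto3."""
    import boto3  # imported lazily so the request builder works without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_retrieve_and_generate_request(question, knowledge_base_id, model_arn)
    )
    # The generated answer, plus the citations that link it back to sources.
    return response["output"]["text"], response.get("citations", [])


request = build_retrieve_and_generate_request(
    "What is the vacation policy?",
    "KB_ID_PLACEHOLDER",  # placeholder, not a real knowledge base ID
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",  # example value
)
```

Only `build_retrieve_and_generate_request` runs here; `ask_knowledge_base` needs a real knowledge base and credentials.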
Customize RAG workflows using Retrieve API
KNOWLEDGE BASES FOR AMAZON BEDROCK
[Diagram: customized RAG workflow with the labels User, User input, User query, Retrieve API, Retrieved documents, and Context]
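A customized workflow retrieves documents first and then decides what to do with them. The sketch below assumes the boto3 `bedrock-agent-runtime` client's `retrieve` operation and its `retrievalResults` response shape; verify both against the current documentation. The score threshold, the prompt template, and the sample results are assumptions made for the example.

```python
def build_retrieve_request(
    query: str, knowledge_base_id: str, number_of_results: int = 3
) -> dict:
    """Assemble keyword arguments for the retrieve operation."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": number_of_results}
        },
    }


def build_prompt(query: str, retrieval_results: list[dict], min_score: float = 0.0) -> str:
    """Custom step: keep only sufficiently relevant passages, then build a prompt."""
    passages = [
        result["content"]["text"]
        for result in retrieval_results
        if result.get("score", 0.0) >= min_score
    ]
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


def retrieve_documents(query: str, knowledge_base_id: str) -> list[dict]:
    """Call the Retrieve API; requires AWS credentials and boto3."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve(**build_retrieve_request(query, knowledge_base_id))
    return response["retrievalResults"]


# Hand-written results shaped like the assumed API response, for illustration.
sample_results = [
    {"content": {"text": "Refunds take five business days."}, "score": 0.9},
    {"content": {"text": "Shipping is free over fifty dollars."}, "score": 0.4},
]
custom_prompt = build_prompt("How long do refunds take?", sample_results, min_score=0.5)
```

The resulting prompt can be sent to any model, which is the main reason to use retrieval on its own rather than the fully managed workflow.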
Vector databases supported by Amazon Bedrock
COMING SOON
Agents
Agents for Amazon Bedrock
ENABLE GENERATIVE AI APPLICATIONS TO EXECUTE MULTISTEP TASKS USING COMPANY SYSTEMS AND
DATA SOURCES
• Automates orchestration of multistep tasks
• Simplifies building and deploying AI assistants
• Provides secure access to enterprise data and APIs
• Lets you choose implementation languages
• Provides fully managed infrastructure
Agents build on existing enterprise resources
[Diagram: an HR time-off agent uses an HR knowledge base (HR policy docs), vacation actions (get-Vacation-Balance), and leave of absence actions; the existing resources are a vacation microservice and a vacation database]
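The idea of an agent invoking actions backed by existing systems can be sketched as a simple dispatcher. This is not the Agents for Amazon Bedrock API: the registry, the function names, and the in-memory "vacation database" are all hypothetical, and only the action name `get-Vacation-Balance` comes from the slide.

```python
from typing import Callable

# Hypothetical stand-in for the existing vacation database.
VACATION_DATABASE = {"emp-001": 12, "emp-002": 3}


def get_vacation_balance(employee_id: str) -> dict:
    """Stand-in for the existing vacation microservice."""
    if employee_id not in VACATION_DATABASE:
        return {"error": f"unknown employee {employee_id}"}
    return {"employeeId": employee_id, "vacationDays": VACATION_DATABASE[employee_id]}


# Actions the agent is allowed to call, keyed by action name.
ACTIONS: dict[str, Callable[..., dict]] = {
    "get-Vacation-Balance": get_vacation_balance,
}


def run_action(action_name: str, **parameters) -> dict:
    """Route an action chosen by the agent to the matching handler."""
    handler = ACTIONS.get(action_name)
    if handler is None:
        return {"error": f"unsupported action {action_name}"}
    return handler(**parameters)


result = run_action("get-Vacation-Balance", employee_id="emp-001")
```

Keeping the allowed actions in an explicit registry means the model can only trigger operations that were deliberately exposed.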
Knowledge bases integration with agents
[Diagram: the agent sends a query to the knowledge bases, which search and return a retrieval; the query plus the retrieval go to the large language model for response generation]
Fine-tuning and continued
pretraining
Amazon Bedrock custom models
CREATE CUSTOM MODELS USING THE CONSOLE OR APIS
Customizing Amazon Titan models
Fine-tune additional models
in Amazon Bedrock
COMING SOON
Fine-tuning and continued pretraining
Fine-tuning
• Instruction training dataset is available?
• Specific style, behavior required?
Continued pretraining
• Raw dataset (e.g., PDFs)
• Additional knowledge through domain adaptation
[Diagram: chart with the axes "Domain adaptation (e.g., extend knowledge)" and "Task specialization (e.g., behavior, style)"; regions labeled Fine-tuning, Continued pretraining, and Continued pretraining + Fine-tuning]
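The choice between the two techniques can be written down as a small decision helper. This is a simplified sketch: the function name and its two boolean inputs are assumptions, and real decisions also weigh cost, data volume, and evaluation results.

```python
def choose_customization(has_instruction_dataset: bool, has_raw_domain_data: bool) -> str:
    """Map the available data to a customization technique."""
    if has_instruction_dataset and has_raw_domain_data:
        # Extend domain knowledge first, then specialize for the task.
        return "continued pretraining + fine-tuning"
    if has_instruction_dataset:
        return "fine-tuning"
    if has_raw_domain_data:
        return "continued pretraining"
    return "no training data available"


recommendation = choose_customization(
    has_instruction_dataset=True, has_raw_domain_data=False
)
```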
Datasets for fine-tuning and continued pretraining
Fine-tuning: instruction dataset (e.g., question-answer)
{"prompt": "<prompt text>", "completion": "<expected generated text>"}
{"prompt": "<prompt text>", "completion": "<expected generated text>"}
{"prompt": "<prompt text>", "completion": "<expected generated text>"}
Continued pretraining: raw data (e.g., PDFs)
{"input": "<raw text>"}
{"input": "<raw text>"}
{"input": "<raw text>"}
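Both dataset layouts are one JSON object per line (JSON Lines). The sketch below writes records in each layout and checks their keys before serializing; the sample records and the `to_jsonl` helper are made up for illustration.

```python
import json

# Hypothetical sample records, one per future line of the dataset file.
fine_tuning_records = [
    {"prompt": "What is our refund window?", "completion": "Thirty days from delivery."},
    {"prompt": "Do we ship abroad?", "completion": "Yes, to most countries."},
]
pretraining_records = [
    {"input": "Refund policy: customers may return items within thirty days."},
]


def to_jsonl(records: list[dict], required_keys: set[str]) -> str:
    """Serialize records as JSON Lines, rejecting any with unexpected keys."""
    lines = []
    for index, record in enumerate(records):
        if set(record) != required_keys:
            raise ValueError(f"record {index} must have exactly the keys {sorted(required_keys)}")
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


fine_tuning_jsonl = to_jsonl(fine_tuning_records, {"prompt", "completion"})
pretraining_jsonl = to_jsonl(pretraining_records, {"input"})
```

The resulting strings can be written to `.jsonl` files and uploaded as training data.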
Components of a model customization job
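As an illustration of what a customization job is assembled from, the sketch below builds a request for the boto3 `bedrock` client's `create_model_customization_job` operation. The parameter names reflect that API as best understood at the time of writing and should be verified against the current documentation; the role ARN, bucket names, job names, and hyperparameter values are placeholders.

```python
ALLOWED_TYPES = {"FINE_TUNING", "CONTINUED_PRE_TRAINING"}


def build_customization_job_request(
    job_name: str,
    custom_model_name: str,
    role_arn: str,
    base_model_id: str,
    customization_type: str,
    training_s3_uri: str,
    output_s3_uri: str,
    hyperparameters: dict,
) -> dict:
    """Assemble keyword arguments for create_model_customization_job."""
    if customization_type not in ALLOWED_TYPES:
        raise ValueError(f"customization_type must be one of {sorted(ALLOWED_TYPES)}")
    return {
        "jobName": job_name,
        "customModelName": custom_model_name,
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        "customizationType": customization_type,
        "trainingDataConfig": {"s3Uri": training_s3_uri},
        "outputDataConfig": {"s3Uri": output_s3_uri},
        # Hyperparameter values are passed as strings.
        "hyperParameters": {key: str(value) for key, value in hyperparameters.items()},
    }


def submit_job(request: dict) -> str:
    """Submit the job; requires AWS credentials and boto3."""
    import boto3  # imported lazily so the request builder works without AWS access

    response = boto3.client("bedrock").create_model_customization_job(**request)
    return response["jobArn"]


job_request = build_customization_job_request(
    job_name="example-fine-tuning-job",                  # placeholder
    custom_model_name="example-custom-titan",            # placeholder
    role_arn="arn:aws:iam::123456789012:role/ExampleBedrockCustomizationRole",  # placeholder
    base_model_id="amazon.titan-text-express-v1",
    customization_type="FINE_TUNING",
    training_s3_uri="s3://example-bucket/train.jsonl",   # placeholder
    output_s3_uri="s3://example-bucket/output/",         # placeholder
    hyperparameters={"epochCount": 2, "batchSize": 1, "learningRate": 0.00001},  # example values
)
```

Only the request builder runs here; `submit_job` needs real resources and permissions.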
Customization architecture overview
[Diagram: three accounts.
Amazon Bedrock service account (AWS owned and operated): all incoming network traffic arrives via the console, SDKs, and API at the Amazon Bedrock API endpoint; runtime inference; training orchestration; base model S3 bucket.
Model deployment account: provisioned capacity; custom job compute; fine-tuned model S3 bucket.
Customer account: training data S3 bucket; AWS CloudTrail; Amazon CloudWatch; AWS IAM.]
Security and privacy
You are always in control of your data
✓ Data not used to improve models, and not shared with model providers
✓ Custom models encrypted and stored with service or customer managed keys
(CMK) – Only you have access to your models
Summary
Thank you!
Maira Ladeira Tanke
mttanke@amazon.com