This project uses Amazon S3 Vectors and Amazon Bedrock to build a Retrieval-Augmented Generation (RAG) system for Shakespearean plays. It answers questions about the plays using the play text and the embeddings stored in the vector index.
Here are the steps to deploy your Shakespeare RAG system:
- Manual S3 Vector Infrastructure Setup
Create Vector Bucket (AWS Console):
- Navigate to Amazon S3 in us-east-1 region
- Look for Vector Buckets in the sidebar (preview feature)
- Create a bucket named: shakespeare-rag-vector-bucket
- Note: Must be globally unique, so add a suffix if needed
Create Vector Index:
- Within your vector bucket, create index: hamlet-shakespeare-index
- Set dimensions to 1024 (Titan Embed v2 default)
- Choose cosine distance metric
- Exclude text metadata from filter targets (only title should be filterable)
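The console settings above can also be expressed as API parameters. The following is a minimal sketch, assuming the boto3 `s3vectors` client and its `create_index` / `query_vectors` parameter names as I understand them from the preview documentation; the helper function names are hypothetical and not part of this repository. The builders are plain functions so they can be inspected without AWS credentials.

```python
from typing import Any, Optional

VECTOR_BUCKET = "shakespeare-rag-vector-bucket"
INDEX_NAME = "hamlet-shakespeare-index"


def build_create_index_params(bucket: str = VECTOR_BUCKET,
                              index: str = INDEX_NAME) -> dict[str, Any]:
    """Parameters mirroring the console settings listed above."""
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "dataType": "float32",
        "dimension": 1024,           # Titan Embed v2 default
        "distanceMetric": "cosine",
        # `text` is stored with each vector but excluded from filtering;
        # `title` stays filterable because it is not listed here.
        "metadataConfiguration": {"nonFilterableMetadataKeys": ["text"]},
    }


def build_query_params(embedding: list[float],
                       title: Optional[str] = None,
                       top_k: int = 3,
                       bucket: str = VECTOR_BUCKET,
                       index: str = INDEX_NAME) -> dict[str, Any]:
    """Parameters for a similarity query, optionally filtered by title."""
    params: dict[str, Any] = {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": embedding},
        "topK": top_k,
        "returnDistance": True,
        "returnMetadata": True,
    }
    if title is not None:
        params["filter"] = {"title": title}  # equality filter on a filterable key
    return params


def create_index(region: str = "us-east-1") -> None:
    """Create the index (requires AWS credentials; never runs on import)."""
    import boto3  # imported lazily so the builders work without boto3 installed

    client = boto3.client("s3vectors", region_name=region)
    client.create_index(**build_create_index_params())
```

Keeping `text` non-filterable matches the intent of the last bullet: the chunk text is returned with each hit but is never used as a filter target.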
- Amazon Bedrock Model Access
Enable Required Models:
- Go to Amazon Bedrock console
- Request access to amazon.titan-embed-text-v2:0
- Request access to amazon.titan-text-premier-v1:0
- Wait for approval (usually immediate for Titan models)
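Once access is granted, the embedding model can be called through the `bedrock-runtime` client. This is a minimal sketch of what such a call might look like, not the code in `src/`; the function names are hypothetical, and the request fields (`inputText`, `dimensions`, `normalize`) follow the Titan Text Embeddings V2 request format as I understand it.

```python
import json
from typing import Union

EMBED_MODEL_ID = "amazon.titan-embed-text-v2:0"
ALLOWED_DIMENSIONS = (256, 512, 1024)  # output sizes Titan Embed v2 accepts


def build_embedding_request(text: str, dimensions: int = 1024) -> str:
    """Serialize the JSON request body for a Titan Embed v2 invocation."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if dimensions not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimensions must be one of {ALLOWED_DIMENSIONS}")
    return json.dumps({"inputText": text, "dimensions": dimensions, "normalize": True})


def parse_embedding_response(raw_body: Union[bytes, str]) -> list[float]:
    """Extract the embedding vector from the model's JSON response body."""
    return json.loads(raw_body)["embedding"]


def embed(text: str, region: str = "us-east-1") -> list[float]:
    """Call Bedrock (requires AWS credentials and model access)."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=EMBED_MODEL_ID,
        body=build_embedding_request(text),
        contentType="application/json",
        accept="application/json",
    )
    return parse_embedding_response(response["body"].read())
```

The 1024-dimension default here must match the dimension configured on the vector index, otherwise inserts and queries will be rejected.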
# Install SAM CLI via Homebrew
brew tap aws/tap
brew install aws-sam-cli
# Verify installation
sam --version
# Build the SAM application
sam build
# Deploy with guided setup (first time)
sam deploy --guided
# Use these parameters when prompted:
# Stack Name: shakespeare-rag-system
# AWS Region: us-east-1 (or your preferred region)
# VectorBucketName: shakespeare-rag-vector-bucket
# VectorIndexName: hamlet-shakespeare-index
# For subsequent deployments
sam deploy
# To destroy everything deployed by SAM, use:
sam delete
# Delete S3 Vectors resources
aws s3vectors delete-index --vector-bucket-name "shakespeare-rag-vector-bucket" --index-name "hamlet-shakespeare-index" --region us-east-1
aws s3vectors delete-vector-bucket --vector-bucket-name "shakespeare-rag-vector-bucket" --region us-east-1
You can now test the system locally with Shakespearean queries:
# Create searchable vector indexes
python3 -m src.create_index
# Test embedding generation
python3 -m src.query --test-embeddings
# Query with command line arguments
python3 -m src.query -q "Tell me about Hamlet's relationship with Ophelia"
# Interactive query mode
python3 -m src.query
# Query with command line arguments
# python3 -m src.query -q "Tell me about Hamlet's relationship with Ophelia"
Searching for: Tell me about Hamlet's relationship with Ophelia
Vector Bucket: shakespeare-rag-vector-bucket
Index: hamlet-shakespeare-index
--------------------------------------------------
Generating embedding...
Embedding generated (dimension: 1024)
Querying vectors...
Found 3 similar documents
==================================================
Result 1
Title: Hamlet
Distance: 0.4494
Key: b77cbc83-6760-49ff-bc60-b0f5be479e0c
Text Preview: ## Act II - The Prince's Feigned Madness
To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
Result 2
Title: Hamlet
Distance: 0.4494
Key: 4c93c8fd-a6d8-4171-898a-7a1b014822b2
Text Preview: ## Act II - The Prince's Feigned Madness
To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
Result 3
Title: Hamlet
Distance: 0.4494
Key: 6a2b5596-7779-440e-b6a9-48c8c4c3aa4a
Text Preview: ## Act II - The Prince's Feigned Madness
To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
Distance Statistics:
Best Match: 0.4494
Worst Match: 0.4494
Average: 0.4494
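The three results above share the same distance and preview text but have different keys, which suggests the same chunk may have been indexed more than once (for example, by running the indexing step several times). The sketch below shows one way the distance statistics could be computed and duplicate hits collapsed. The result dictionary shape (`key`, `title`, `distance`, `text`) is an assumption for illustration, not the repository's actual data structure.

```python
from statistics import mean
from typing import Any


def distance_stats(distances: list[float]) -> dict[str, float]:
    """Best (smallest), worst (largest), and average distance."""
    if not distances:
        raise ValueError("no distances to summarize")
    return {
        "best": min(distances),
        "worst": max(distances),
        "average": mean(distances),
    }


def dedupe_by_text(results: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep the first hit for each distinct chunk text, preserving order."""
    seen: set[str] = set()
    unique: list[dict[str, Any]] = []
    for hit in results:
        if hit["text"] not in seen:
            seen.add(hit["text"])
            unique.append(hit)
    return unique


# Hypothetical hits shaped like the output above (text shortened).
hits = [
    {"key": "b77cbc83", "title": "Hamlet", "distance": 0.4494, "text": "## Act II"},
    {"key": "4c93c8fd", "title": "Hamlet", "distance": 0.4494, "text": "## Act II"},
    {"key": "6a2b5596", "title": "Hamlet", "distance": 0.4494, "text": "## Act II"},
]
stats = distance_stats([h["distance"] for h in hits])
unique_hits = dedupe_by_text(hits)
```

With cosine distance, smaller values mean closer matches, so the "best" match is the minimum.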
Once deployed, test with Shakespearean queries:
# 1. Basic cURL example
curl -X POST \
https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
-H "Content-Type: application/json" \
-H "x-api-key: 8uNAa2dWzx3U5EasnC9HhfldaTjoXgLe" \
-d '{
"question": "Tell me about Hamlet'\''s relationship with Ophelia"
}'
# 2. Example with invalid API key (should return 403)
curl -X POST \
https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
-H "Content-Type: application/json" \
-H "x-api-key: invalid-key" \
-d '{
"question": "Tell me about Hamlet'\''s relationship with Ophelia"
}'
# 3. Example without API key (should return 403)
curl -X POST \
https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
-H "Content-Type: application/json" \
-d '{
"question": "Tell me about Hamlet'\''s relationship with Ophelia"
}'
{
"answer": "Hamlet's relationship with Ophelia is a complex one. Initially, Hamlet appears to be in love with Ophelia, but after his father's death, he becomes distant and cruel towards her. In Act II, Hamlet tells Ophelia to \"get thee to a nunnery,\" which is a harsh and hurtful thing to say. However, it's important to note that Hamlet is feigning madness at this point, and his harsh words may be a way of protecting Ophelia from the corruption of the court. Ophelia is deeply affected by Hamlet's behavior and eventually goes mad with grief after her father's death.",
"sources": [
{
"title": "Hamlet",
"distance": 0.44943636655807495,
"relevance_score": 0.551
},
{
"title": "Hamlet",
"distance": 0.44943636655807495,
"relevance_score": 0.551
},
{
"title": "Hamlet",
"distance": 0.44943636655807495,
"relevance_score": 0.551
}
],
"metadata": {
"question_length": 48,
"sources_found": 3,
"processing_successful": true,
"timestamp": "2025-07-21T22:13:42.347218+00:00",
"request_id": "4cd6cfe9-a0fa-4ca9-aeb5-9812222bc58f"
}
}
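The `relevance_score` values in the response above are consistent with `1 - distance` rounded to three decimals (1 - 0.4494 ≈ 0.551). The source does not state the formula, so the sketch below treats that relationship as an assumption and simply checks a response against it. The function names are hypothetical.

```python
import json


def relevance_from_distance(distance: float) -> float:
    """ASSUMPTION: relevance = 1 - cosine distance, rounded to 3 decimals."""
    return round(1.0 - distance, 3)


def check_sources(response_text: str) -> list[bool]:
    """For each source, report whether relevance_score fits the assumed formula."""
    payload = json.loads(response_text)
    return [
        abs(relevance_from_distance(src["distance"]) - src["relevance_score"]) < 1e-9
        for src in payload["sources"]
    ]


# Trimmed copy of one source entry from the sample response above.
SAMPLE = (
    '{"sources": [{"title": "Hamlet", '
    '"distance": 0.44943636655807495, "relevance_score": 0.551}]}'
)
matches = check_sources(SAMPLE)
```

If the deployed function uses a different scoring rule, `check_sources` would return `False` for the affected entries.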
Vector Database | Pricing Model | Monthly Cost (Example) | Key Features | Best For |
---|---|---|---|---|
Amazon S3 Vectors | Pay-as-you-use: • Data upload: $0.20/GB • Storage: $0.06/GB/month • Queries: Variable by request count | $1,216/month (400M vectors, 40 indexes, 10M queries) • Storage: $141 • Upload: $78 • Queries: $997 | • 90% cost reduction vs traditional DBs • Native S3 integration • Serverless, no infrastructure • Sub-second query performance | Long-term storage, infrequent access, cost-sensitive workloads |
Pinecone | Tiered plans: • Starter: Free (100K vectors) • Standard: $50/month minimum • Enterprise: $500/month minimum | $480/month (Multiple p1.x2 pods with replicas) Single pod: ~$160/month | • Fully managed • Sub-100ms latency • Auto-scaling • 50x cost reduction with serverless | Production apps, real-time search, high QPS requirements |
Qdrant | Usage-based: • Free: 1GB cluster forever • Cloud: ~$0.03494/hour • Hybrid: $0.014/hour | $25-50/month (Small to medium clusters) Scales with RAM, CPU, storage | • Open-source option • 4x higher RPS than competitors • Built in Rust for performance • Advanced filtering | Performance-critical apps, open-source preference, custom deployments |
Chroma | Open-source + Cloud: • Open-source: Free • Cloud Starter: $0/month + $5 credits • Cloud Team: $100 credits then usage-based | $0-150/month (Self-hosted on AWS m4.xlarge: ~$150) Cloud version: Usage-based | • Completely free open-source • Easy local development • Python/JS focused • Simple API | Prototyping, development, budget-conscious projects |
Weaviate | Dimension-based: • $0.05 per million dimensions • Serverless and managed options | $50-200/month ($1 for 20M dimensions) Enterprise: Custom pricing | • GraphQL + REST APIs • Built-in vectorization modules • Multi-modal support • Open-source available | Enterprise applications, multi-modal data, flexible deployment |
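The arithmetic behind the S3 Vectors and Weaviate rows can be sketched as a small estimator. The per-unit rates come from the table; the storage and upload volumes in the example (2,350 GB and 390 GB) are back-calculated here from the table's dollar amounts and are not figures stated in the source. Query cost is passed in directly because the table lists it only as variable by request count.

```python
S3V_STORAGE_USD_PER_GB_MONTH = 0.06   # from the table
S3V_UPLOAD_USD_PER_GB = 0.20          # from the table
WEAVIATE_USD_PER_MILLION_DIMS = 0.05  # from the table


def s3_vectors_monthly_cost(storage_gb: float,
                            upload_gb: float,
                            query_cost_usd: float) -> float:
    """Storage + upload + query charges for one month, in USD."""
    storage = storage_gb * S3V_STORAGE_USD_PER_GB_MONTH
    upload = upload_gb * S3V_UPLOAD_USD_PER_GB
    return storage + upload + query_cost_usd


def weaviate_dimension_cost(million_dimensions: float) -> float:
    """Dimension-based charge, in USD."""
    return million_dimensions * WEAVIATE_USD_PER_MILLION_DIMS


# ASSUMPTION: volumes implied by the table ($141 / 0.06 and $78 / 0.20).
example_total = s3_vectors_monthly_cost(storage_gb=2350, upload_gb=390,
                                        query_cost_usd=997)
weaviate_20m = weaviate_dimension_cost(20)

if __name__ == "__main__":
    print(f"S3 Vectors example: ${example_total:,.0f}/month")
    print(f"Weaviate, 20M dimensions: ${weaviate_20m:.2f}")
```

In this example the query charge dominates the S3 Vectors bill, which is why the table recommends it for infrequently accessed data.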
Amazon S3 Vectors:
- Up to 90% cost reduction compared to traditional vector databases
- Ideal for large-scale, infrequently accessed vector data
- No infrastructure management required
Qdrant:
- Up to 4x higher RPS than competitors
- Built in Rust for maximum performance
- Flexible deployment options
Pinecone:
- Mature platform with proven scalability
- 50x cost reduction with new serverless architecture
- Strong enterprise support and SLAs
Chroma:
- Completely free and open-source
- Perfect for prototyping and small projects
- Easy local development
Weaviate:
- Transparent per-dimension pricing starting at $0.05 per million dimensions
- Strong open-source ecosystem
- Multi-modal capabilities
Choose S3 Vectors if: You need cost-effective storage for large datasets with moderate query frequency and already use AWS infrastructure.
Choose Pinecone if: You need a fully managed solution with guaranteed performance SLAs and enterprise support.
Choose Qdrant if: Performance is critical and you want the flexibility of open-source with optional managed services.
Choose Chroma if: You're prototyping, learning, or building small-scale applications on a tight budget.
Choose Weaviate if: You need multi-modal support, prefer GraphQL APIs, or want enterprise features with open-source flexibility.
Pricing data current as of July 2025. Costs may vary based on specific usage patterns, regions, and enterprise agreements.
This project is licensed under the Modified MIT License.
@misc{rags3vectors,
author = {Oketunji, A.F.},
title = {RAG S3 Vectors},
year = 2025,
version = {0.0.3},
publisher = {Zenodo},
doi = {10.5281/zenodo.16291024},
url = {https://doi.org/10.5281/zenodo.16291024}
}
(c) 2025 Finbarrs Oketunji. All Rights Reserved.