Pfad - The Proxy pFad of © 2024 Garber Painting. All rights reserved.

RAG S3 Vectors

A Retrieval-Augmented Generation (RAG) system for Shakespearean plays, built on Amazon S3 Vectors and Amazon Bedrock. It answers questions about the plays using the text and embeddings stored in the vector index.

Prerequisites

Complete these steps before deploying your Shakespeare RAG system:

Manual S3 Vector Infrastructure Setup

Create Vector Bucket (AWS Console):

Navigate to Amazon S3 in the us-east-1 region
Look for Vector Buckets in the sidebar (preview feature)
Create a bucket named: shakespeare-rag-vector-bucket
Note: Bucket names must be unique, so add a suffix if this one is already taken

Create Vector Index:

Within your vector bucket, create an index named: hamlet-shakespeare-index
Set dimensions to 1024 (the Titan Text Embeddings V2 default)
Choose the cosine distance metric
Mark the text metadata key as non-filterable, so that only title remains filterable
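
The console steps above can also be scripted. The following is a minimal sketch, not the project's own setup code. It assumes a boto3 release recent enough to ship the `s3vectors` client and that the `create_vector_bucket` and `create_index` operations accept the parameter names shown, so check them against your SDK version. The helper names `build_index_request` and `create_vector_resources` are hypothetical.

```python
def build_index_request(bucket_name, index_name, dimension=1024):
    """Return keyword arguments for an S3 Vectors create_index call.

    Mirrors the console settings: 1024 dimensions, cosine distance,
    and the `text` metadata key excluded from filtering.
    """
    return {
        "vectorBucketName": bucket_name,
        "indexName": index_name,
        "dataType": "float32",
        "dimension": dimension,
        "distanceMetric": "cosine",
        "metadataConfiguration": {"nonFilterableMetadataKeys": ["text"]},
    }


def create_vector_resources(bucket_name, index_name, region="us-east-1"):
    """Create the vector bucket and index (assumed boto3 s3vectors API)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("s3vectors", region_name=region)
    client.create_vector_bucket(vectorBucketName=bucket_name)
    client.create_index(**build_index_request(bucket_name, index_name))
```

Calling `create_vector_resources("shakespeare-rag-vector-bucket", "hamlet-shakespeare-index")` would then replace the manual console steps, provided your credentials allow S3 Vectors operations.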

Amazon Bedrock Model Access

Enable Required Models:

Go to Amazon Bedrock console
Request access to amazon.titan-embed-text-v2:0
Request access to amazon.titan-text-premier-v1:0
Wait for approval (usually immediate for Titan models)
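
Once access is granted, a quick way to confirm the embedding model works is to request one embedding. This is a sketch under assumptions, not the code in `src`: it uses the Bedrock Runtime `invoke_model` call with the Titan Text Embeddings V2 request fields (`inputText`, `dimensions`, `normalize`), and the choice to normalize is an assumption about how the project configures it. The function names are hypothetical.

```python
import json

EMBED_MODEL_ID = "amazon.titan-embed-text-v2:0"


def build_embed_body(text, dimensions=1024):
    """Serialize a Titan Text Embeddings V2 request body."""
    return json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": True}
    )


def embed_text(text, region="us-east-1"):
    """Return the embedding vector for `text` via Amazon Bedrock."""
    import boto3  # imported lazily so the body builder works without boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=EMBED_MODEL_ID,
        body=build_embed_body(text),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing JSON with an "embedding" list.
    return json.loads(response["body"].read())["embedding"]
```

If the model is enabled, `len(embed_text("To be, or not to be"))` should match the 1024 dimensions configured on the index.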

Deployment Commands

# Install SAM CLI via Homebrew
brew tap aws/tap
brew install aws-sam-cli

# Verify installation
sam --version

# Build the SAM application
sam build

# Deploy with guided setup (first time)
sam deploy --guided

# Use these parameters when prompted:
# Stack Name: shakespeare-rag-system
# AWS Region: us-east-1 (or your preferred region)
# VectorBucketName: shakespeare-rag-vector-bucket
# VectorIndexName: hamlet-shakespeare-index

# For subsequent deployments
sam deploy

# To destroy everything deployed by SAM, use:
sam delete

# Delete S3 Vectors resources
aws s3vectors delete-index --vector-bucket-name "shakespeare-rag-vector-bucket" --index-name "hamlet-shakespeare-index" --region us-east-1
aws s3vectors delete-vector-bucket --vector-bucket-name "shakespeare-rag-vector-bucket" --region us-east-1

Testing Locally

You can now test the system locally with Shakespearean queries:

# Create searchable vector indexes
python3 -m src.create_index

# Test embedding generation
python3 -m src.query --test-embeddings

# Query with command line arguments
python3 -m src.query -q "Tell me about Hamlet's relationship with Ophelia"

# Interactive query mode
python3 -m src.query
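
For readers who want to see what a similarity query might look like underneath these commands, here is a hedged sketch. It is not the contents of `src.query`. It assumes the boto3 `s3vectors` client exposes `query_vectors` with the parameter names shown and returns matches under a `vectors` key. The helper names are hypothetical.

```python
def build_vector_query(bucket_name, index_name, embedding, top_k=3):
    """Return keyword arguments for an S3 Vectors query_vectors call."""
    return {
        "vectorBucketName": bucket_name,
        "indexName": index_name,
        "queryVector": {"float32": [float(x) for x in embedding]},
        "topK": top_k,
        "returnDistance": True,   # include the distance for each match
        "returnMetadata": True,   # include title/text metadata
    }


def query_similar(bucket_name, index_name, embedding, top_k=3, region="us-east-1"):
    """Return the nearest stored vectors (assumed boto3 s3vectors API)."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("s3vectors", region_name=region)
    response = client.query_vectors(
        **build_vector_query(bucket_name, index_name, embedding, top_k)
    )
    return response["vectors"]
```

The `embedding` argument would be the 1024-dimensional vector produced for the question by the embedding model.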

Successful Response

# Query with command line arguments
# python3 -m src.query -q "Tell me about Hamlet's relationship with Ophelia"

🔍 Searching for: Tell me about Hamlet's relationship with Ophelia
📦 Vector Bucket: shakespeare-rag-vector-bucket
📊 Index: hamlet-shakespeare-index
--------------------------------------------------
🧠 Generating embedding...
✅ Embedding generated (dimension: 1024)
🔎 Querying vectors...
✅ Found 3 similar documents
==================================================
📄 Result 1
   Title: Hamlet
   Distance: 0.4494
   Key: b77cbc83-6760-49ff-bc60-b0f5be479e0c
   Text Preview: ## Act II - The Prince's Feigned Madness
    
    To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
📄 Result 2
   Title: Hamlet
   Distance: 0.4494
   Key: 4c93c8fd-a6d8-4171-898a-7a1b014822b2
   Text Preview: ## Act II - The Prince's Feigned Madness
    
    To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
📄 Result 3
   Title: Hamlet
   Distance: 0.4494
   Key: 6a2b5596-7779-440e-b6a9-48c8c4c3aa4a
   Text Preview: ## Act II - The Prince's Feigned Madness
    
    To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
📊 Distance Statistics:
   Best Match: 0.4494
   Worst Match: 0.4494
   Average: 0.4494
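
The distance statistics at the end of this output can be reproduced with a few lines. This sketch assumes that, with the cosine distance metric, a smaller distance means a closer match, so the best match is the minimum. The function name `distance_stats` is hypothetical and not taken from the project.

```python
def distance_stats(distances):
    """Summarize query distances; lower distance is treated as a better match."""
    if not distances:
        raise ValueError("at least one distance is required")
    return {
        "best": min(distances),
        "worst": max(distances),
        "average": sum(distances) / len(distances),
    }


stats = distance_stats([0.4494, 0.4494, 0.4494])
print(f"Best Match: {stats['best']:.4f}")
print(f"Worst Match: {stats['worst']:.4f}")
print(f"Average: {stats['average']:.4f}")
```

Three identical distances, as in the sample above, usually indicate that the same chunk was indexed more than once under different keys.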

Testing API

Once deployed, test with Shakespearean queries:

# 1. Basic cURL example
curl -X POST \
  https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
  -H "Content-Type: application/json" \
  -H "x-api-key: <your-api-key>" \
  -d '{
    "question": "Tell me about Hamlet'\''s relationship with Ophelia"
  }'

# 2. Example with invalid API key (should return 403)
curl -X POST \
  https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
  -H "Content-Type: application/json" \
  -H "x-api-key: invalid-key" \
  -d '{
    "question": "Tell me about Hamlet'\''s relationship with Ophelia"
  }'

# 3. Example without API key (should return 403)
curl -X POST \
  https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Tell me about Hamlet'\''s relationship with Ophelia"
  }'
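
The same requests can be made from Python using only the standard library. This is a minimal sketch: the endpoint URL and API key are placeholders you supply, and the function names `build_api_request` and `ask` are hypothetical. It returns the HTTP status alongside the parsed body so the 403 cases above are easy to check.

```python
import json
import urllib.error
import urllib.request


def build_api_request(endpoint, question, api_key=None):
    """Build a POST request carrying the question as a JSON body."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["x-api-key"] = api_key
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        endpoint, data=payload, headers=headers, method="POST"
    )


def ask(endpoint, question, api_key=None, timeout=30):
    """Send the question; return (status_code, parsed_json_or_None)."""
    request = build_api_request(endpoint, question, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # Missing or invalid API keys surface here as HTTP errors such as 403.
        return err.code, None
```

For example, `ask(endpoint, "Tell me about Hamlet's relationship with Ophelia", api_key=key)` should return status 200 and a body with an `answer` field when the key is valid.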

Successful API Response

{
  "answer": "Hamlet's relationship with Ophelia is a complex one. Initially, Hamlet appears to be in love with Ophelia, but after his father's death, he becomes distant and cruel towards her. In Act II, Hamlet tells Ophelia to \"get thee to a nunnery,\" which is a harsh and hurtful thing to say. However, it's important to note that Hamlet is feigning madness at this point, and his harsh words may be a way of protecting Ophelia from the corruption of the court. Ophelia is deeply affected by Hamlet's behavior and eventually goes mad with grief after her father's death.",
  "sources": [
    {
      "title": "Hamlet",
      "distance": 0.44943636655807495,
      "relevance_score": 0.551
    },
    {
      "title": "Hamlet",
      "distance": 0.44943636655807495,
      "relevance_score": 0.551
    },
    {
      "title": "Hamlet",
      "distance": 0.44943636655807495,
      "relevance_score": 0.551
    }
  ],
  "metadata": {
    "question_length": 48,
    "sources_found": 3,
    "processing_successful": true,
    "timestamp": "2025-07-21T22:13:42.347218+00:00",
    "request_id": "4cd6cfe9-a0fa-4ca9-aeb5-9812222bc58f"
  }
}
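
The `relevance_score` values in this response are consistent with one minus the cosine distance, rounded to three decimal places. That relationship is inferred from the sample numbers only and is not confirmed by the source, so treat the sketch below as an assumption. The clamp to zero is an added safeguard, since cosine distance can exceed 1.

```python
def relevance_score(distance):
    """Convert a cosine distance to a relevance score (assumed: 1 - distance)."""
    return round(max(0.0, 1.0 - distance), 3)


print(relevance_score(0.44943636655807495))  # prints 0.551
```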

Vector Database Pricing Comparison 2025

| Vector Database | Pricing Model | Monthly Cost (Example) | Key Features | Best For |
|---|---|---|---|---|
| Amazon S3 Vectors | Pay-as-you-use. Data upload: $0.20/GB; Storage: $0.06/GB/month; Queries: variable by request count | $1,216/month (400M vectors, 40 indexes, 10M queries). Storage: $141; Upload: $78; Queries: $997 | 90% cost reduction vs traditional DBs; Native S3 integration; Serverless, no infrastructure; Sub-second query performance | Long-term storage, infrequent access, cost-sensitive workloads |
| Pinecone | Tiered plans. Starter: Free (100K vectors); Standard: $50/month minimum; Enterprise: $500/month minimum | $480/month (multiple p1.x2 pods with replicas). Single pod: ~$160/month | Fully managed; Sub-100ms latency; Auto-scaling; 50x cost reduction with serverless | Production apps, real-time search, high QPS requirements |
| Qdrant | Usage-based. Free: 1GB cluster forever; Cloud: ~$0.03494/hour; Hybrid: $0.014/hour | $25-50/month (small to medium clusters). Scales with RAM, CPU, storage | Open-source option; 4x higher RPS than competitors; Built in Rust for performance; Advanced filtering | Performance-critical apps, open-source preference, custom deployments |
| Chroma | Open-source + Cloud. Open-source: Free; Cloud Starter: $0/month + $5 credits; Cloud Team: $100 credits then usage-based | $0-150/month (self-hosted on AWS m4.xlarge: ~$150). Cloud version: usage-based | Completely free open-source; Easy local development; Python/JS focused; Simple API | Prototyping, development, budget-conscious projects |
| Weaviate | Dimension-based. $0.05 per million dimensions; Serverless and managed options | $50-200/month ($1 for 20M dimensions). Enterprise: custom pricing | GraphQL + REST APIs; Built-in vectorization modules; Multi-modal support; Open-source available | Enterprise applications, multi-modal data, flexible deployment |
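
The example figures in the table can be checked with simple arithmetic. This sketch uses only numbers that appear in the table: the three S3 Vectors cost components and the Weaviate per-dimension rate. The function names are hypothetical, and real bills depend on usage details the table does not capture.

```python
def s3_vectors_monthly_total(storage_usd, upload_usd, query_usd):
    """Sum the three S3 Vectors cost components from the table."""
    return storage_usd + upload_usd + query_usd


def weaviate_dimension_cost(dimensions, usd_per_million=0.05):
    """Cost at the quoted rate of $0.05 per million dimensions."""
    return dimensions / 1_000_000 * usd_per_million


print(s3_vectors_monthly_total(141, 78, 997))        # prints 1216
print(round(weaviate_dimension_cost(20_000_000), 2))  # prints 1.0
```

Both results match the table: $141 + $78 + $997 gives the $1,216 monthly example, and 20M dimensions at $0.05 per million comes to $1.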

Key Insights

🏆 Best Value: Amazon S3 Vectors

Up to 90% cost reduction compared to traditional vector databases
Ideal for large-scale, infrequently accessed vector data
No infrastructure management required

⚡ Best Performance: Qdrant

Up to 4x higher RPS than competitors
Built in Rust for maximum performance
Flexible deployment options

🎯 Best for Production: Pinecone

Mature platform with proven scalability
Up to 50x cost reduction with the serverless architecture (vendor claim)
Strong enterprise support and SLAs

💰 Best for Budgets: Chroma

Completely free and open-source
Perfect for prototyping and small projects
Easy local development

🔧 Best for Flexibility: Weaviate

Transparent per-dimension pricing starting at $0.05 per million dimensions
Strong open-source ecosystem
Multi-modal capabilities

Decision Framework

Choose S3 Vectors if: You need cost-effective storage for large datasets with moderate query frequency and already use AWS infrastructure.

Choose Pinecone if: You need a fully managed solution with guaranteed performance SLAs and enterprise support.

Choose Qdrant if: Performance is critical and you want the flexibility of open-source with optional managed services.

Choose Chroma if: You're prototyping, learning, or building small-scale applications on a tight budget.

Choose Weaviate if: You need multi-modal support, prefer GraphQL APIs, or want enterprise features with open-source flexibility.

Pricing data current as of July 2025. Costs may vary based on specific usage patterns, regions, and enterprise agreements.

License

This project is licensed under the Modified MIT License.

Citation

@misc{rags3vectors,
  author       = {Oketunji, A.F.},
  title        = {RAG S3 Vectors},
  year         = 2025,
  version      = {0.0.3},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.16291024},
  url          = {https://doi.org/10.5281/zenodo.16291024}
}

0xnu/rag-s3-vectors