0xnu/rag-s3-vectors


RAG S3 Vectors

A Retrieval-Augmented Generation (RAG) system for Shakespearean plays, built on Amazon S3 Vectors and Amazon Bedrock. It answers questions about the plays using the text and embeddings stored in S3 Vectors.

Prerequisites

Here are the steps to deploy your Shakespeare RAG system:

  1. Manual S3 Vector Infrastructure Setup

Create Vector Bucket (AWS Console):

  • Navigate to Amazon S3 in us-east-1 region
  • Look for Vector Buckets in the sidebar (preview feature)
  • Create bucket named: shakespeare-rag-vector-bucket
  • Note: Must be globally unique, so add suffix if needed

Create Vector Index:

  • Within your vector bucket, create index: hamlet-shakespeare-index
  • Set dimensions to 1024 (Titan Embed v2 default)
  • Choose cosine distance metric
  • Mark the text metadata key as non-filterable (only title should remain filterable)
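
The console steps above can also be scripted. The following is a minimal sketch, assuming a boto3 release that ships the `s3vectors` client. The helper names and the `text` metadata key are illustrative and are not taken from this repository.

```python
"""Sketch: create the vector bucket and index with boto3 instead of the console."""

BUCKET = "shakespeare-rag-vector-bucket"
INDEX = "hamlet-shakespeare-index"


def build_index_request(bucket: str, index: str, dimension: int = 1024) -> dict:
    """Return keyword arguments for the s3vectors create_index call."""
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "dataType": "float32",
        "dimension": dimension,  # matches the Titan Embed v2 default output size
        "distanceMetric": "cosine",
        # Assumption: the chunk text is stored under the metadata key "text".
        "metadataConfiguration": {"nonFilterableMetadataKeys": ["text"]},
    }


def create_resources(region: str = "us-east-1") -> None:
    """Create the bucket, then the index. Requires AWS credentials."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("s3vectors", region_name=region)
    client.create_vector_bucket(vectorBucketName=BUCKET)
    client.create_index(**build_index_request(BUCKET, INDEX))
```

Calling `create_resources()` once with valid credentials should leave the account in the same state as the manual steps. Keeping the request in a separate builder makes the index settings easy to review before anything is created.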

  2. AWS Bedrock Model Access

Enable Required Models:

  • Go to Amazon Bedrock console
  • Request access to amazon.titan-embed-text-v2:0
  • Request access to amazon.titan-text-premier-v1:0
  • Wait for approval (usually immediate for Titan models)
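
Once access is granted, one way to confirm that the embedding model responds is to invoke it directly. This is a hedged sketch, assuming boto3's `bedrock-runtime` client and the Titan Text Embeddings V2 request fields `inputText`, `dimensions`, and `normalize`. The function names are hypothetical and do not come from `src.query`.

```python
import json

EMBED_MODEL_ID = "amazon.titan-embed-text-v2:0"


def build_embedding_body(text: str, dimensions: int = 1024) -> str:
    """Serialize the request body for Titan Text Embeddings V2."""
    return json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": True}
    )


def parse_embedding(response_body) -> list:
    """Extract the embedding vector from the model's JSON response."""
    return json.loads(response_body)["embedding"]


def embed(text: str, region: str = "us-east-1") -> list:
    """Call Bedrock and return the embedding. Requires model access and credentials."""
    import boto3  # imported lazily so the helpers above work without boto3

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=EMBED_MODEL_ID, body=build_embedding_body(text)
    )
    return parse_embedding(response["body"].read())
```

If access has been granted, `len(embed("To be, or not to be"))` should equal the index dimension configured earlier.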

Deployment Commands

# Install SAM CLI via Homebrew
brew tap aws/tap
brew install aws-sam-cli

# Verify installation
sam --version

# Build the SAM application
sam build

# Deploy with guided setup (first time)
sam deploy --guided

# Use these parameters when prompted:
# Stack Name: shakespeare-rag-system
# AWS Region: us-east-1 (or your preferred region)
# VectorBucketName: shakespeare-rag-vector-bucket
# VectorIndexName: hamlet-shakespeare-index

# For subsequent deployments
sam deploy

# To destroy everything deployed by SAM, use:
sam delete

# Delete S3 Vectors resources
aws s3vectors delete-index --vector-bucket-name "shakespeare-rag-vector-bucket" --index-name "hamlet-shakespeare-index" --region us-east-1
aws s3vectors delete-vector-bucket --vector-bucket-name "shakespeare-rag-vector-bucket" --region us-east-1

Testing Locally

You can now test the system locally with Shakespearean queries:

# Create searchable vector indexes
python3 -m src.create_index

# Test embedding generation
python3 -m src.query --test-embeddings

# Query with command line arguments
python3 -m src.query -q "Tell me about Hamlet's relationship with Ophelia"

# Interactive query mode
python3 -m src.query
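
Each query embeds the question and then searches the index for the nearest vectors. The search step can be sketched as follows, assuming boto3's `s3vectors` client. `build_query_request` and `query_similar` are hypothetical names rather than the functions in `src.query`, and the optional `title` filter assumes `title` is the filterable metadata key described in the setup steps.

```python
BUCKET = "shakespeare-rag-vector-bucket"
INDEX = "hamlet-shakespeare-index"


def build_query_request(embedding, top_k: int = 3, title=None) -> dict:
    """Return keyword arguments for the s3vectors query_vectors call."""
    request = {
        "vectorBucketName": BUCKET,
        "indexName": INDEX,
        "queryVector": {"float32": list(embedding)},
        "topK": top_k,
        "returnDistance": True,   # include the distance for each match
        "returnMetadata": True,   # include title and text metadata
    }
    if title is not None:
        # Restrict matches to a single play, e.g. title="Hamlet".
        request["filter"] = {"title": title}
    return request


def query_similar(embedding, region: str = "us-east-1", **kwargs) -> list:
    """Run the query and return the list of matching vectors."""
    import boto3  # imported lazily so the request builder works without boto3

    client = boto3.client("s3vectors", region_name=region)
    response = client.query_vectors(**build_query_request(embedding, **kwargs))
    return response["vectors"]
```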

Successful Response

# Query with command line arguments
# python3 -m src.query -q "Tell me about Hamlet's relationship with Ophelia"

🔍 Searching for: Tell me about Hamlet's relationship with Ophelia
📦 Vector Bucket: shakespeare-rag-vector-bucket
📊 Index: hamlet-shakespeare-index
--------------------------------------------------
🧠 Generating embedding...
✅ Embedding generated (dimension: 1024)
🔎 Querying vectors...
✅ Found 3 similar documents
==================================================
📄 Result 1
   Title: Hamlet
   Distance: 0.4494
   Key: b77cbc83-6760-49ff-bc60-b0f5be479e0c
   Text Preview: ## Act II - The Prince's Feigned Madness
    
    To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
📄 Result 2
   Title: Hamlet
   Distance: 0.4494
   Key: 4c93c8fd-a6d8-4171-898a-7a1b014822b2
   Text Preview: ## Act II - The Prince's Feigned Madness
    
    To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
📄 Result 3
   Title: Hamlet
   Distance: 0.4494
   Key: 6a2b5596-7779-440e-b6a9-48c8c4c3aa4a
   Text Preview: ## Act II - The Prince's Feigned Madness
    
    To better observe the court and plan his revenge, Hamlet assumes an antic disposition, speaking in riddles and behaving as one touched by lunacy. His ...
--------------------------------------------------
📊 Distance Statistics:
   Best Match: 0.4494
   Worst Match: 0.4494
   Average: 0.4494
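
The distance statistics at the end of the output can be reproduced from the per-result distances. This is a minimal sketch, assuming that a lower cosine distance means a closer match. `distance_stats` is an illustrative helper, not the repository's implementation.

```python
from statistics import mean


def distance_stats(distances: list) -> dict:
    """Summarize result distances; the smallest distance is the best match."""
    if not distances:
        raise ValueError("at least one distance is required")
    return {
        "best": min(distances),
        "worst": max(distances),
        "average": mean(distances),
    }
```

When every result reports the same distance, as in the run above, best, worst, and average coincide. One possible cause is that the same text was indexed more than once under different keys, so it may be worth checking whether the indexing step was run several times.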

Testing API

Once deployed, test with Shakespearean queries:

# 1. Basic cURL example
curl -X POST \
  https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
  -H "Content-Type: application/json" \
  -H "x-api-key: 8uNAa2dWzx3U5EasnC9HhfldaTjoXgLe" \
  -d '{
    "question": "Tell me about Hamlet'\''s relationship with Ophelia"
  }'

# 2. Example with invalid API key (should return 403)
curl -X POST \
  https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
  -H "Content-Type: application/json" \
  -H "x-api-key: invalid-key" \
  -d '{
    "question": "Tell me about Hamlet'\''s relationship with Ophelia"
  }'

# 3. Example without API key (should return 403)
curl -X POST \
  https://6y3pc5k09e.execute-api.us-east-1.amazonaws.com/prod/query \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Tell me about Hamlet'\''s relationship with Ophelia"
  }'
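
The same request can be sent from Python using only the standard library. This is a sketch under assumptions: `build_api_request` and `ask` are hypothetical helpers, and the URL and key are supplied by the caller rather than hard-coded.

```python
import json
import urllib.request


def build_api_request(api_url: str, api_key: str, question: str) -> urllib.request.Request:
    """Build the POST request that the cURL examples send."""
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=payload,
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


def ask(api_url: str, api_key: str, question: str, timeout: float = 30.0) -> dict:
    """Send the question and return the decoded JSON response."""
    request = build_api_request(api_url, api_key, question)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())
```

With a missing or invalid key, `urlopen` raises `urllib.error.HTTPError` with status 403 instead of returning a body, which corresponds to the second and third cURL examples.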

Successful API Response

{
  "answer": "Hamlet's relationship with Ophelia is a complex one. Initially, Hamlet appears to be in love with Ophelia, but after his father's death, he becomes distant and cruel towards her. In Act II, Hamlet tells Ophelia to \"get thee to a nunnery,\" which is a harsh and hurtful thing to say. However, it's important to note that Hamlet is feigning madness at this point, and his harsh words may be a way of protecting Ophelia from the corruption of the court. Ophelia is deeply affected by Hamlet's behavior and eventually goes mad with grief after her father's death.",
  "sources": [
    {
      "title": "Hamlet",
      "distance": 0.44943636655807495,
      "relevance_score": 0.551
    },
    {
      "title": "Hamlet",
      "distance": 0.44943636655807495,
      "relevance_score": 0.551
    },
    {
      "title": "Hamlet",
      "distance": 0.44943636655807495,
      "relevance_score": 0.551
    }
  ],
  "metadata": {
    "question_length": 48,
    "sources_found": 3,
    "processing_successful": true,
    "timestamp": "2025-07-21T22:13:42.347218+00:00",
    "request_id": "4cd6cfe9-a0fa-4ca9-aeb5-9812222bc58f"
  }
}
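
The response does not state how `relevance_score` is derived. The values shown are consistent with one minus the cosine distance, rounded to three decimals. The sketch below encodes that reading as an assumption. It is not confirmed by the source.

```python
def relevance_score(distance: float) -> float:
    """Assumed mapping: relevance = 1 - cosine distance, rounded to 3 decimals."""
    return round(1.0 - distance, 3)
```

Under this assumption, a distance of 0.44943636655807495 maps to 0.551, which matches each entry in `sources`.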

Vector Database Pricing Comparison 2025

| Vector Database | Pricing Model | Monthly Cost (Example) | Key Features | Best For |
|---|---|---|---|---|
| Amazon S3 Vectors | Pay-as-you-use: Data upload: $0.20/GB; Storage: $0.06/GB/month; Queries: Variable by request count | $1,216/month (400M vectors, 40 indexes, 10M queries); Storage: $141; Upload: $78; Queries: $997 | 90% cost reduction vs traditional DBs; Native S3 integration; Serverless, no infrastructure; Sub-second query performance | Long-term storage, infrequent access, cost-sensitive workloads |
| Pinecone | Tiered plans: Starter: Free (100K vectors); Standard: $50/month minimum; Enterprise: $500/month minimum | $480/month (Multiple p1.x2 pods with replicas); Single pod: ~$160/month | Fully managed; Sub-100ms latency; Auto-scaling; 50x cost reduction with serverless | Production apps, real-time search, high QPS requirements |
| Qdrant | Usage-based: Free: 1GB cluster forever; Cloud: ~$0.03494/hour; Hybrid: $0.014/hour | $25-50/month (Small to medium clusters); Scales with RAM, CPU, storage | Open-source option; 4x higher RPS than competitors; Built in Rust for performance; Advanced filtering | Performance-critical apps, open-source preference, custom deployments |
| Chroma | Open-source + Cloud: Open-source: Free; Cloud Starter: $0/month + $5 credits; Cloud Team: $100 credits then usage-based | $0-150/month (Self-hosted on AWS m4.xlarge: ~$150); Cloud version: Usage-based | Completely free open-source; Easy local development; Python/JS focused; Simple API | Prototyping, development, budget-conscious projects |
| Weaviate | Dimension-based: $0.05 per million dimensions; Serverless and managed options | $50-200/month ($1 for 20M dimensions); Enterprise: Custom pricing | GraphQL + REST APIs; Built-in vectorization modules; Multi-modal support; Open-source available | Enterprise applications, multi-modal data, flexible deployment |
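
The S3 Vectors example can be checked with a few lines of arithmetic. The snippet below uses only the component figures quoted in the comparison. The function name is illustrative.

```python
# Monthly cost components (USD) from the S3 Vectors example:
# 400M vectors, 40 indexes, 10M queries.
S3_VECTORS_EXAMPLE = {"storage": 141, "upload": 78, "queries": 997}


def total_and_shares(costs: dict) -> tuple:
    """Return the total cost and each component's fraction of that total."""
    total = sum(costs.values())
    shares = {name: amount / total for name, amount in costs.items()}
    return total, shares


total, shares = total_and_shares(S3_VECTORS_EXAMPLE)
print(f"total: ${total}")
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

The components sum to the quoted $1,216, and queries account for roughly four fifths of it. In this example the bill is therefore driven by query volume far more than by storage.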

Key Insights

🏆 Best Value: Amazon S3 Vectors

  • Up to 90% cost reduction compared to traditional vector databases
  • Ideal for large-scale, infrequently accessed vector data
  • No infrastructure management required

⚡ Best Performance: Qdrant

  • Up to 4x higher RPS than competitors
  • Built in Rust for maximum performance
  • Flexible deployment options

🎯 Best for Production: Pinecone

  • Mature platform with proven scalability
  • 50x cost reduction with new serverless architecture
  • Strong enterprise support and SLAs

💰 Best for Budgets: Chroma

  • Completely free and open-source
  • Perfect for prototyping and small projects
  • Easy local development

🔧 Best for Flexibility: Weaviate

  • Transparent per-dimension pricing starting at $0.05 per million dimensions
  • Strong open-source ecosystem
  • Multi-modal capabilities

Decision Framework

Choose S3 Vectors if: You need cost-effective storage for large datasets with moderate query frequency and already use AWS infrastructure.

Choose Pinecone if: You need a fully managed solution with guaranteed performance SLAs and enterprise support.

Choose Qdrant if: Performance is critical and you want the flexibility of open-source with optional managed services.

Choose Chroma if: You're prototyping, learning, or building small-scale applications on a tight budget.

Choose Weaviate if: You need multi-modal support, prefer GraphQL APIs, or want enterprise features with open-source flexibility.


Pricing data current as of July 2025. Costs may vary based on specific usage patterns, regions, and enterprise agreements.

License

This project is licensed under the Modified MIT License.

Citation

@misc{rags3vectors,
  author       = {Oketunji, A.F.},
  title        = {RAG S3 Vectors},
  year         = 2025,
  version      = {0.0.3},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.16291024},
  url          = {https://doi.org/10.5281/zenodo.16291024}
}

Copyright

(c) 2025 Finbarrs Oketunji. All Rights Reserved.
