-# Table of contents
-- [Introduction](#introduction)
+---
+PostgresML is a complete ML/AI platform built inside PostgreSQL. Our operating principle is:
+
+Move models to the database, rather than constantly moving data to the models.
+
+Data for ML & AI systems is inherently larger and more dynamic than the models. It's more efficient, manageable and reliable to move models to the database, rather than continuously moving data to the models.
+
+
+ Table of contents
- [Installation](#installation)
- [Getting started](#getting-started)
- [Natural Language Processing](#nlp-tasks)
@@ -52,9 +43,6 @@
-# Introduction
-PostgresML is a machine learning extension for PostgreSQL that enables you to perform training and inference on text and tabular data using SQL queries. With PostgresML, you can seamlessly integrate machine learning models into your PostgreSQL database and harness the power of cutting-edge algorithms to process data efficiently.
-
## Text Data
- Perform natural language processing (NLP) tasks like sentiment analysis, question and answering, translation, summarization and text generation
- Access 1000s of state-of-the-art language models like GPT-2, GPT-J, GPT-Neo from :hugs: HuggingFace model hub
diff --git a/pgml-apps/pgml-chat/pgml_chat/main.py b/pgml-apps/pgml-chat/pgml_chat/main.py
index e9ac079ea..3c447a419 100644
--- a/pgml-apps/pgml-chat/pgml_chat/main.py
+++ b/pgml-apps/pgml-chat/pgml_chat/main.py
@@ -123,7 +123,7 @@ def handler(signum, frame):
"--chat_completion_model",
dest="chat_completion_model",
type=str,
- default="meta-llama/Meta-Llama-3-8B-Instruct",
+ default="meta-llama/Meta-Llama-3.1-8B-Instruct",
)
parser.add_argument(
diff --git a/pgml-cms/blog/.gitbook/assets/owlllama2.jpeg b/pgml-cms/blog/.gitbook/assets/owlllama2.jpeg
new file mode 100644
index 000000000..920f324ab
Binary files /dev/null and b/pgml-cms/blog/.gitbook/assets/owlllama2.jpeg differ
diff --git a/pgml-cms/blog/README.md b/pgml-cms/blog/README.md
index 08ecb1ff9..8dc3b18d0 100644
--- a/pgml-cms/blog/README.md
+++ b/pgml-cms/blog/README.md
@@ -4,7 +4,8 @@ description: recent blog posts
# Home
-* [announcing-the-release-of-our-rust-sdk](announcing-the-release-of-our-rust-sdk.md)
+* [announcing-support-for-meta-llama-3.1](announcing-support-for-meta-llama-3.1.md "mention")
+* [announcing-the-release-of-our-rust-sdk](announcing-the-release-of-our-rust-sdk.md "mention")
* [meet-us-at-the-2024-ai-dev-summit-conference](meet-us-at-the-2024-ai-dev-summit-conference.md "mention")
* [introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md](introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md "mention")
* [speeding-up-vector-recall-5x-with-hnsw.md](speeding-up-vector-recall-5x-with-hnsw.md "mention")
diff --git a/pgml-cms/blog/SUMMARY.md b/pgml-cms/blog/SUMMARY.md
index 99f538d66..e684e39e9 100644
--- a/pgml-cms/blog/SUMMARY.md
+++ b/pgml-cms/blog/SUMMARY.md
@@ -4,6 +4,7 @@
* [Korvus The All-in-One RAG Pipeline for PostgresML](introducing-korvus-the-all-in-one-rag-pipeline-for-postgresml.md)
* [Semantic Search in Postgres in 15 Minutes](semantic-search-in-postgres-in-15-minutes.md)
* [Unified RAG](unified-rag.md)
+* [Announcing Support for Meta Llama 3.1](announcing-support-for-meta-llama-3.1.md)
* [Announcing the Release of our Rust SDK](announcing-the-release-of-our-rust-sdk.md)
* [Serverless LLMs are dead; Long live Serverless LLMs](serverless-llms-are-dead-long-live-serverless-llms.md)
* [Speeding up vector recall 5x with HNSW](speeding-up-vector-recall-5x-with-hnsw.md)
diff --git a/pgml-cms/blog/announcing-support-for-meta-llama-3.1.md b/pgml-cms/blog/announcing-support-for-meta-llama-3.1.md
new file mode 100644
index 000000000..493c23fc7
--- /dev/null
+++ b/pgml-cms/blog/announcing-support-for-meta-llama-3.1.md
@@ -0,0 +1,37 @@
+---
+description: >-
+ Today we’re taking the next steps towards open source AI becoming the industry standard. We’re adding support for Llama 3.1 405B, the first frontier-level open source AI model, as well as new and improved Llama 3.1 70B and 8B models.
+featured: false
+tags: [engineering]
+image: ".gitbook/assets/owlllama2.jpeg"
+---
+
+# Announcing Support for Meta Llama 3.1
+
+
+
+
+
+
+
+Montana Low
+
+July 23, 2024
+
+We're pleased to offer Meta Llama 3.1 running in our serverless cloud today. Mark Zuckerberg explained [his company's reasons for championing open source AI](https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/), and it's great to see a strong ecosystem forming. These models are now available in our serverless cloud with optimized kernels for maximum throughput.
+
+- meta-llama/Meta-Llama-3.1-8B-Instruct
+- meta-llama/Meta-Llama-3.1-70B-Instruct
+- meta-llama/Meta-Llama-3.1-405B-Instruct
+
+## Is open-source AI right for you?
+
+We think so. Open-source models have made remarkable strides, not only catching up to proprietary counterparts but also surpassing them across multiple domains. The advantages are clear:
+
+* **Performance & reliability:** Open-source models are increasingly comparable or superior across a wide range of tasks and performance metrics. Mistral and Llama-based models, for example, are easily faster than GPT-4. Reliability is another concern you may not want to leave in the hands of OpenAI. OpenAI’s API has suffered from several recent outages, and their rate limits can interrupt your app if there is a surge in usage. Open-source models enable greater control over your model’s latency, scalability and availability. Ultimately, that greater control lets your organization build a more dependable integration and a highly reliable production application.
+* **Safety & privacy:** Open-source models are the clear winner when it comes to security-sensitive AI applications. There are [enormous risks](https://www.infosecurity-magazine.com/news-features/chatgpts-datascraping-scrutiny/) associated with transmitting private data to external entities such as OpenAI. By contrast, open-source models retain sensitive information within an organization's own cloud environments. The data never has to leave your premises, so the risk is bypassed altogether – it’s enterprise security by default. At PostgresML, we offer such private hosting of LLMs in your own cloud.
+* **Model censorship:** A growing number of experts inside and outside of leading AI companies argue that model restrictions have gone too far. The Atlantic recently published an [article on AI’s “Spicy-Mayo Problem”](https://www.theatlantic.com/ideas/archive/2023/11/ai-safety-regulations-uncensored-models/676076/) which delves into the issues surrounding AI censorship. The titular example describes a chatbot refusing a request for a “dangerously spicy” mayo recipe. Censorship can affect baseline performance, and in the case of apps for creative work such as Sudowrite, unrestricted open-source models can actually be a key differentiating value for users.
+* **Flexibility & customization:** Closed-source models like GPT-3.5 Turbo are fine for generalized tasks, but leave little room for customization. Fine-tuning is highly restricted. Additionally, the headwinds at OpenAI have exposed the [dangerous reality of AI vendor lock-in](https://techcrunch.com/2023/11/21/openai-dangers-vendor-lock-in/). Open-source models such as MPT-7B, Llama 2 and Mistral 7B are designed with extensive flexibility for fine-tuning, so organizations can create custom specifications and optimize model performance for their unique needs. This level of customization and flexibility opens the door for advanced techniques like DPO, PPO, LoRA and more.
+
+For a full list of models available in our cloud, check out our [plans and pricing](/pricing).
+
diff --git a/pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md b/pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md
index 664569814..d834dce72 100644
--- a/pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md
+++ b/pgml-cms/blog/generating-llm-embeddings-with-open-source-models-in-postgresml.md
@@ -1,6 +1,6 @@
---
image: .gitbook/assets/blog_image_generating_llm_embeddings.png
-features: true
+featured: true
description: >-
How to use the pgml.embed(...) function to generate embeddings with free and
open source models in your own database.
@@ -120,7 +120,7 @@ LIMIT 5;
## Generating embeddings from natural language text
-PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
+PostgresML provides a simple interface to generate embeddings from text in your database. You can use the [`pgml.embed`](https://postgresml.org/docs/open-source/pgml/guides/transformers/embeddings) function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached on your connection process for reuse. You can see a list of potential good candidate models to generate embeddings on the [Massive Text Embedding Benchmark leaderboard](https://huggingface.co/spaces/mteb/leaderboard).
Since our corpus of documents (movie reviews) are all relatively short and similar in style, we don't need a large model. [`Alibaba-NLP/gte-base-en-v1.5`](https://huggingface.co/Alibaba-NLP/gte-base-en-v1.5) will be a good first attempt. The great thing about PostgresML is you can always regenerate your embeddings later to experiment with different embedding models.
diff --git a/pgml-cms/blog/introducing-korvus-the-all-in-one-rag-pipeline-for-postgresml.md b/pgml-cms/blog/introducing-korvus-the-all-in-one-rag-pipeline-for-postgresml.md
index fa1bfdf76..259d84173 100644
--- a/pgml-cms/blog/introducing-korvus-the-all-in-one-rag-pipeline-for-postgresml.md
+++ b/pgml-cms/blog/introducing-korvus-the-all-in-one-rag-pipeline-for-postgresml.md
@@ -100,7 +100,7 @@ async def main():
"aggregate": {"join": "\n"},
},
"chat": {
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"messages": [
{
"role": "system",
diff --git a/pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md b/pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md
index 8384b6fc8..a1d9609fa 100644
--- a/pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md
+++ b/pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md
@@ -44,7 +44,7 @@ The Switch Kit is an open-source AI SDK that provides a drop in replacement for
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const results = client.chat_completions_create(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -65,7 +65,7 @@ console.log(results);
import korvus
client = korvus.OpenSourceAI()
results = client.chat_completions_create(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
@@ -96,7 +96,7 @@ print(results)
],
"created": 1701291672,
"id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion",
"system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
"usage": {
@@ -113,7 +113,7 @@ We don't charge per token, so OpenAI “usage” metrics are not particularly re
!!!
-The above is an example using our open-source AI SDK with Meta-Llama-3-8B-Instruct, an incredibly popular and highly efficient 8 billion parameter model.
+The above is an example using our open-source AI SDK with Meta-Llama-3.1-8B-Instruct, an incredibly popular and highly efficient 8 billion parameter model.
Notice there is near one to one relation between the parameters and return type of OpenAI’s `chat.completions.create` and our `chat_completion_create`.
@@ -125,7 +125,7 @@ Here is an example of streaming:
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const it = client.chat_completions_create_stream(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -150,7 +150,7 @@ while (!result.done) {
import korvus
client = korvus.OpenSourceAI()
results = client.chat_completions_create_stream(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
@@ -182,7 +182,7 @@ for c in results:
],
"created": 1701296792,
"id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion.chunk",
"system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
@@ -198,7 +198,7 @@ for c in results:
],
"created": 1701296792,
"id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion.chunk",
"system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
@@ -210,7 +210,7 @@ We have truncated the output to two items
!!!
-We also have asynchronous versions of the create and `create_stream` functions relatively named `create_async` and `create_stream_async`. Checkout [our documentation](https://postgresml.org/docs/guides/opensourceai) for a complete guide of the open-source AI SDK including guides on how to specify custom models.
+We also have asynchronous versions of the create and `create_stream` functions, respectively named `create_async` and `create_stream_async`. Check out [our documentation](https://postgresml.org/docs/open-source/pgml/guides/opensourceai) for a complete guide to the open-source AI SDK, including guides on how to specify custom models.
PostgresML is free and open source. To run the above examples yourself [create an account](https://postgresml.org/signup), install korvus, and get running!
diff --git a/pgml-cms/blog/semantic-search-in-postgres-in-15-minutes.md b/pgml-cms/blog/semantic-search-in-postgres-in-15-minutes.md
index e638e4b47..34cc0ae1b 100644
--- a/pgml-cms/blog/semantic-search-in-postgres-in-15-minutes.md
+++ b/pgml-cms/blog/semantic-search-in-postgres-in-15-minutes.md
@@ -152,7 +152,7 @@ SELECT '[1,2,3]'::vector <=> '[2,3,4]'::vector;
!!!
-Other distance functions have similar formulas and provide convenient operators to use as well. It may be worth testing other operators and to see which performs better for your use case. For more information on the other distance functions, take a look at our [Embeddings guide](https://postgresml.org/docs/guides/embeddings/vector-similarity).
+Other distance functions have similar formulas and provide convenient operators to use as well. It may be worth testing other operators and to see which performs better for your use case. For more information on the other distance functions, take a look at our [Embeddings guide](https://postgresml.org/docs/open-source/pgml/guides/embeddings/vector-similarity).
Going back to our search example, we can compute the cosine distance between our query embedding and our documents:
diff --git a/pgml-cms/blog/unified-rag.md b/pgml-cms/blog/unified-rag.md
index 49461068d..8028fa981 100644
--- a/pgml-cms/blog/unified-rag.md
+++ b/pgml-cms/blog/unified-rag.md
@@ -51,7 +51,7 @@ Here is an example of the pgml.transform function
SELECT pgml.transform(
task => ''{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}''::JSONB,
inputs => ARRAY[''AI is going to''],
args => ''{
@@ -64,7 +64,7 @@ Here is another example of the pgml.transform function
SELECT pgml.transform(
task => ''{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-70B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"
}''::JSONB,
inputs => ARRAY[''AI is going to''],
args => ''{
@@ -145,9 +145,9 @@ SELECT * FROM chunks limit 10;
| id | chunk | chunk_index | document_id |
| ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- | ------------- |
| 1 | Here is an example of the pgml.transform function | 1 | 1 |
-| 2 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 2 | 1 |
+| 2 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 2 | 1 |
| 3 | Here is another example of the pgml.transform function | 3 | 1 |
-| 4 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 4 | 1 |
+| 4 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 4 | 1 |
| 5 | Here is a third example of the pgml.transform function | 5 | 1 |
| 6 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 6 | 1 |
| 7 | ae94d3413ae82367c3d0592a67302b25 | 1 | 2 |
@@ -253,8 +253,8 @@ LIMIT 6;
| 1 | 0.09044166306461232 | Here is an example of the pgml.transform function |
| 3 | 0.10787954026965096 | Here is another example of the pgml.transform function |
| 5 | 0.11683694289239333 | Here is a third example of the pgml.transform function |
-| 2 | 0.17699128851412282 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
-| 4 | 0.17844729798760672 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 2 | 0.17699128851412282 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 4 | 0.17844729798760672 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
| 6 | 0.17520464423854842 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
!!!
@@ -330,8 +330,8 @@ FROM (
| cosine_distance | rank_score | chunk |
| -------------------- | -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| 0.2124727254737595 | 0.3427378833293915 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
-| 0.2109014406365579 | 0.342184841632843 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 0.2124727254737595 | 0.3427378833293915 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 0.2109014406365579 | 0.342184841632843 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
| 0.21259646694819168 | 0.3332781493663788 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
| 0.19483324929456136 | 0.03163915500044823 | Here is an example of the pgml.transform function |
| 0.1685870257610742 | 0.031176624819636345 | Here is a third example of the pgml.transform function |
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg
new file mode 100644
index 000000000..382cab6e3
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg
@@ -0,0 +1,281 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg
new file mode 100644
index 000000000..8f9d7f7fd
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg
@@ -0,0 +1,78 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg
new file mode 100644
index 000000000..c96b30ec4
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg
@@ -0,0 +1,275 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg
new file mode 100644
index 000000000..0b7c0915a
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg
@@ -0,0 +1,238 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Getting-Started_FDW-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Getting-Started_FDW-Diagram.svg
new file mode 100644
index 000000000..14c9f2f4e
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Getting-Started_FDW-Diagram.svg
@@ -0,0 +1,47 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Getting-Started_Logical-Replication-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Getting-Started_Logical-Replication-Diagram.svg
new file mode 100644
index 000000000..8a5f88f18
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Getting-Started_Logical-Replication-Diagram.svg
@@ -0,0 +1,47 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/PGML_Korvus-Applications_Diagram.svg b/pgml-cms/docs/.gitbook/assets/PGML_Korvus-Applications_Diagram.svg
new file mode 100644
index 000000000..e4a95a4ac
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/PGML_Korvus-Applications_Diagram.svg
@@ -0,0 +1,184 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/PgCat_High-Availability-Diagram.svg b/pgml-cms/docs/.gitbook/assets/PgCat_High-Availability-Diagram.svg
new file mode 100644
index 000000000..47a740f43
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/PgCat_High-Availability-Diagram.svg
@@ -0,0 +1,63 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/PgCat_Load-Balancing-Diagram.svg b/pgml-cms/docs/.gitbook/assets/PgCat_Load-Balancing-Diagram.svg
new file mode 100644
index 000000000..e6f3e184f
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/PgCat_Load-Balancing-Diagram.svg
@@ -0,0 +1,63 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/PgCat_Read-Write-Diagram.svg b/pgml-cms/docs/.gitbook/assets/PgCat_Read-Write-Diagram.svg
new file mode 100644
index 000000000..b143f2cab
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/PgCat_Read-Write-Diagram.svg
@@ -0,0 +1,77 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/PgCat_Scale-Diagram.svg b/pgml-cms/docs/.gitbook/assets/PgCat_Scale-Diagram.svg
new file mode 100644
index 000000000..cf1be1b29
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/PgCat_Scale-Diagram.svg
@@ -0,0 +1,168 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/PgCat_Sharding-Diagram.svg b/pgml-cms/docs/.gitbook/assets/PgCat_Sharding-Diagram.svg
new file mode 100644
index 000000000..e9236aaca
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/PgCat_Sharding-Diagram.svg
@@ -0,0 +1,110 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/architecture.png b/pgml-cms/docs/.gitbook/assets/architecture.png
index de7da35c2..b66435dab 100644
Binary files a/pgml-cms/docs/.gitbook/assets/architecture.png and b/pgml-cms/docs/.gitbook/assets/architecture.png differ
diff --git a/pgml-cms/docs/.gitbook/assets/chatbot_flow.png b/pgml-cms/docs/.gitbook/assets/chatbot_flow.png
deleted file mode 100644
index f9107d99f..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/chatbot_flow.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/embedding_king.png b/pgml-cms/docs/.gitbook/assets/embedding_king.png
deleted file mode 100644
index 03deebbe8..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/embedding_king.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/embeddings_tokens.png b/pgml-cms/docs/.gitbook/assets/embeddings_tokens.png
deleted file mode 100644
index 6f7a13221..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/embeddings_tokens.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/fdw_1.png b/pgml-cms/docs/.gitbook/assets/fdw_1.png
deleted file mode 100644
index c19ed86f6..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/fdw_1.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/logical_replication_1.png b/pgml-cms/docs/.gitbook/assets/logical_replication_1.png
deleted file mode 100644
index 171959b62..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/logical_replication_1.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_3.png b/pgml-cms/docs/.gitbook/assets/pgcat_3.png
deleted file mode 100644
index 5b3e36bb8..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/pgcat_3.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_4.png b/pgml-cms/docs/.gitbook/assets/pgcat_4.png
deleted file mode 100644
index 54fef38a3..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/pgcat_4.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_5.png b/pgml-cms/docs/.gitbook/assets/pgcat_5.png
deleted file mode 100644
index c8f17eb2b..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/pgcat_5.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_6.png b/pgml-cms/docs/.gitbook/assets/pgcat_6.png
deleted file mode 100644
index 201184d9d..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/pgcat_6.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/pgcat_7.png b/pgml-cms/docs/.gitbook/assets/pgcat_7.png
deleted file mode 100644
index 58ad2a818..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/pgcat_7.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/vpc.png b/pgml-cms/docs/.gitbook/assets/vpc.png
deleted file mode 100644
index de19a6e8b..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/vpc.png and /dev/null differ
diff --git a/pgml-cms/docs/SUMMARY.md b/pgml-cms/docs/SUMMARY.md
index b29645395..59687e3e7 100644
--- a/pgml-cms/docs/SUMMARY.md
+++ b/pgml-cms/docs/SUMMARY.md
@@ -6,11 +6,16 @@
* [Getting started](introduction/getting-started/README.md)
* [Create your database](introduction/getting-started/create-your-database.md)
* [Connect your app](introduction/getting-started/connect-your-app.md)
-* [Import your data](introduction/getting-started/import-your-data/README.md)
- * [Logical replication](introduction/getting-started/import-your-data/logical-replication/README.md)
- * [Foreign Data Wrappers](introduction/getting-started/import-your-data/foreign-data-wrappers.md)
- * [Move data with COPY](introduction/getting-started/import-your-data/copy.md)
- * [Migrate with pg_dump](introduction/getting-started/import-your-data/pg-dump.md)
+* [Import your data](introduction/import-your-data/README.md)
+ * [Logical replication](introduction/import-your-data/logical-replication/README.md)
+ * [Foreign Data Wrappers](introduction/import-your-data/foreign-data-wrappers.md)
+ * [Move data with COPY](introduction/import-your-data/copy.md)
+ * [Migrate with pg_dump](introduction/import-your-data/pg-dump.md)
+ * [Storage & Retrieval](introduction/import-your-data/storage-and-retrieval/README.md)
+ * [Documents](introduction/import-your-data/storage-and-retrieval/documents.md)
+ * [Partitioning](introduction/import-your-data/storage-and-retrieval/partitioning.md)
+ * [LLM based pipelines with PostgresML and dbt (data build tool)](introduction/import-your-data/storage-and-retrieval/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md)
+* [FAQ](introduction/faq.md)
## Open Source
@@ -44,6 +49,61 @@
* [Hyperparameter Search](open-source/pgml/api/pgml.train/hyperparameter-search.md)
* [Joint Optimization](open-source/pgml/api/pgml.train/joint-optimization.md)
* [pgml.tune()](open-source/pgml/api/pgml.tune.md)
+ * [Guides](open-source/pgml/guides/README.md)
+ * [Embeddings](open-source/pgml/guides/embeddings/README.md)
+ * [In-database Generation](open-source/pgml/guides/embeddings/in-database-generation.md)
+ * [Dimensionality Reduction](open-source/pgml/guides/embeddings/dimensionality-reduction.md)
+ * [Aggregation](open-source/pgml/guides/embeddings/vector-aggregation.md)
+ * [Similarity](open-source/pgml/guides/embeddings/vector-similarity.md)
+ * [Normalization](open-source/pgml/guides/embeddings/vector-normalization.md)
+ * [Search](open-source/pgml/guides/improve-search-results-with-machine-learning.md)
+ * [Chatbots](open-source/pgml/guides/chatbots/README.md)
+ * [Supervised Learning](open-source/pgml/guides/supervised-learning.md)
+ * [Unified RAG](open-source/pgml/guides/unified-rag.md)
+ * [Natural Language Processing](open-source/pgml/guides/natural-language-processing.md)
+ * [Vector database](open-source/pgml/guides/vector-database.md)
+
+ * [Developers](open-source/pgml/developers/README.md)
+ * [Local Docker Development](open-source/pgml/developers/quick-start-with-docker.md)
+ * [Installation](open-source/pgml/developers/installation.md)
+ * [Contributing](open-source/pgml/developers/contributing.md)
+ * [Distributed Training](open-source/pgml/developers/distributed-training.md)
+ * [GPU Support](open-source/pgml/developers/gpu-support.md)
+ * [Self-hosting](open-source/pgml/developers/self-hosting/README.md)
+ * [Pooler](open-source/pgml/developers/self-hosting/pooler.md)
+ * [Building from source](open-source/pgml/developers/self-hosting/building-from-source.md)
+ * [Replication](open-source/pgml/developers/self-hosting/replication.md)
+ * [Backups](open-source/pgml/developers/self-hosting/backups.md)
+ * [Running on EC2](open-source/pgml/developers/self-hosting/running-on-ec2.md)
* [Korvus](open-source/korvus/README.md)
* [API](open-source/korvus/api/README.md)
* [Collections](open-source/korvus/api/collections.md)
@@ -53,6 +113,7 @@
* [RAG](open-source/korvus/guides/rag.md)
* [Vector Search](open-source/korvus/guides/vector-search.md)
* [Document Search](open-source/korvus/guides/document-search.md)
+ * [OpenSourceAI](open-source/korvus/guides/opensourceai.md)
* [Example Apps](open-source/korvus/example-apps/README.md)
* [Semantic Search](open-source/korvus/example-apps/semantic-search.md)
* [RAG with OpenAI](open-source/korvus/example-apps/rag-with-openai.md)
@@ -69,48 +130,24 @@
* [Enterprise](cloud/enterprise/README.md)
* [Teams](cloud/enterprise/teams.md)
* [VPC](cloud/enterprise/vpc.md)
+* [Privacy Policy](cloud/privacy-policy.md)
+* [Terms of Service](cloud/terms-of-service.md)
-## Guides
-
-* [Embeddings](guides/embeddings/README.md)
- * [In-database Generation](guides/embeddings/in-database-generation.md)
- * [Dimensionality Reduction](guides/embeddings/dimensionality-reduction.md)
- * [Aggregation](guides/embeddings/vector-aggregation.md)
- * [Similarity](guides/embeddings/vector-similarity.md)
- * [Normalization](guides/embeddings/vector-normalization.md)
-* [Search](guides/improve-search-results-with-machine-learning.md)
-* [Chatbots](guides/chatbots/README.md)
- * [Example Application](use-cases/chatbots.md)
-* [Supervised Learning](guides/supervised-learning.md)
-* [Unified RAG](guides/unified-rag.md)
-* [OpenSourceAI](guides/opensourceai.md)
-* [Natural Language Processing](guides/natural-language-processing.md)
-* [Vector database](guides/vector-database.md)
-
-## Resources
+
diff --git a/pgml-cms/docs/resources/architecture/README.md b/pgml-cms/docs/TODO/architecture/README.md
similarity index 100%
rename from pgml-cms/docs/resources/architecture/README.md
rename to pgml-cms/docs/TODO/architecture/README.md
diff --git a/pgml-cms/docs/resources/architecture/why-postgresml.md b/pgml-cms/docs/TODO/architecture/why-postgresml.md
similarity index 100%
rename from pgml-cms/docs/resources/architecture/why-postgresml.md
rename to pgml-cms/docs/TODO/architecture/why-postgresml.md
diff --git a/pgml-cms/docs/use-cases/chatbots.md b/pgml-cms/docs/TODO/chatbots.md
similarity index 100%
rename from pgml-cms/docs/use-cases/chatbots.md
rename to pgml-cms/docs/TODO/chatbots.md
diff --git a/pgml-cms/docs/resources/benchmarks/ggml-quantized-llm-support-for-huggingface-transformers.md b/pgml-cms/docs/TODO/ggml-quantized-llm-support-for-huggingface-transformers.md
similarity index 100%
rename from pgml-cms/docs/resources/benchmarks/ggml-quantized-llm-support-for-huggingface-transformers.md
rename to pgml-cms/docs/TODO/ggml-quantized-llm-support-for-huggingface-transformers.md
diff --git a/pgml-cms/docs/cloud/enterprise/vpc.md b/pgml-cms/docs/cloud/enterprise/vpc.md
index f7c0e9c1d..1400a1dfd 100644
--- a/pgml-cms/docs/cloud/enterprise/vpc.md
+++ b/pgml-cms/docs/cloud/enterprise/vpc.md
@@ -2,7 +2,7 @@
PostgresML can be launched in your Virtual Private Cloud (VPC) account on AWS, Azure or GCP.
-
Deploy in your cloud
+
Deploy in your cloud
The PostgresML control plane provides a complete management solution to control the resources in your cloud account:
- Responsible for PostgresML instance launches, backups, monitoring and failover operations. This requires permission to create and destroy AWS EC2, EBS and AMI resources inside the designated VPC.
diff --git a/pgml-cms/docs/cloud/privacy-policy.md b/pgml-cms/docs/cloud/privacy-policy.md
new file mode 100644
index 000000000..82e718522
--- /dev/null
+++ b/pgml-cms/docs/cloud/privacy-policy.md
@@ -0,0 +1,132 @@
+# Privacy Policy
+
+Effective Date: 7/16/2024
+
+This privacy policy (“Policy”) describes how Hyperparam Inc. (“Company”, “PostgresML”, “we”, “us”) collects, uses, and shares personal information of consumer users of this website, https://postgresml.org (the “Site”), as well as associated products and services (together, the “Services”), and applies to personal information that we collect through the Site and our Services as well as personal information you provide to us directly. This Policy also applies to any of our other websites that post this Policy. Please note that by using the Site or the Services, you accept the practices and policies described in this Policy and you consent that we will collect, use, and share your personal information as described below. If you do not agree to this Policy, please do not use the Site or the Services.
+
+## Personal Information We Collect
+
+We collect personal information about you in a number of different ways:
+**Personal Information Collected From You.** When you use the Site or our Services, we collect personal information that you provide to us, which may include the following categories of personal information depending on how you use the Site or our Services and communicate with us:
+- **General identifiers**, such as your full name, home or work address, zip code, telephone number, email address, job title and organizational affiliation.
+- **Online identifiers**, such as your username and passwords for any of our Sites, or information we automatically collect through cookies and similar technologies used on our websites.
+- **Commercial information**, such as your billing and payment history, and any records of personal property that we collect in connection with providing our Services to you. We also collect information about your preferences regarding marketing communications.
+- **Protected classification characteristics**, such as any information that you choose to provide to us or that we collect in connection with providing our Services to you, including age, race, color, ancestry, national origin, citizenship, religion or creed, marital status, medical condition, physical or mental disability, sex, sexual orientation, veteran or military status or genetic information.
+- **Audio, electronic, and visual information** that we collect in connection with providing our Services to you, such as video or audio recordings of conversations made with your consent.
+- **Professional or employment-related information** that we collect in connection with providing our Services to you, such as your job title, employer information and work history.
+- **Other information you provide to us**.
+
+**Personal Information We Get From Others.** We may collect personal information about you from other sources. We may add this to information we collect from the Site and through our Services.
+
+**Information We Collect Automatically.** We automatically log information about you and your computer, phone, tablet, or other devices you use to access the Site and the Services. For example, when visiting our Site or using the Services, we may log your computer or device identification, operating system type, browser type, screen resolution, browser language, internet protocol (IP) address, unique identifier, general location such as city, state or geographic area, the website you visited before browsing to our Site, pages you viewed, how long you spent on a page, access times and information about your use of and actions on our Site or Services. How much of this information we collect depends on the type and settings of the device you use to access the Site and Services.
+
+**Cookies.** We may log information using “cookies.” Cookies are small data files stored on your hard drive by a website. We may use both session Cookies (which expire once you close your web browser) and persistent Cookies (which stay on your computer until you delete them) to provide you with a more personal and interactive experience on our Site. Other similar tools we may use to collect information by automated means include web server logs, web beacons and pixel tags. This type of information is collected to make the Site and Services more useful to you and to tailor the experience with us to meet your interests and needs.
+
+**Google Analytics.** We may use Google Analytics to help analyze how users use the Site. Google Analytics uses Cookies to collect information such as how often users visit the Site, what pages they visit, and what other sites they used prior to coming to the Site. We use the information we get from Google Analytics only to improve our Site and the Services. Although Google Analytics plants a persistent Cookie on your web browser to identify you as a unique user the next time you visit the Site, the Cookie cannot be used by anyone but Google. Google’s ability to use and share information collected by Google Analytics about your visits to the Site is restricted by the Google Analytics Terms of Use and the Google Privacy Policy.
+
+**Session Replay Technology.** We use session replay technology, such as that provided by Hotjar, Inc., to collect information regarding visitor behavior on the Site and the Services. Hotjar is a full-session replay product that helps us see clearly what actions our Site visitors take and where they might get stuck or confused. Hotjar’s service allows us to record and replay an individual’s interaction with the Site and the Services. This helps us to understand our customers’ experience and how we can improve the Site and the Services. You can review Hotjar’s privacy policy by visiting https://www.hotjar.com/legal/policies/privacy/.
+
+**Additional Information.** If you choose to interact on the Site or through the Services (such as by registering; using our Services; entering into agreements with us; or requesting information from us), we will collect the personal information that you provide. We may collect personal information about you that you provide through telephone, email, or other communications. If you provide us with personal information regarding another individual, please do not do so unless you have that person’s consent to give us their personal information.
+
+## How We Use Your Personal Information
+
+Generally, we may use your personal information in the following ways and as otherwise described in this Privacy Policy or to you at the time we collect the personal information from you:
+
+**To Provide the Services and Personalize Your Experience.** We use personal information about you to provide the Services to you, including:
+
+- To help establish and verify your identity;
+- For the purposes for which you specifically provided it to us, including, without limitation, to enable us to process and fulfill your requests or provide the Services to you;
+- To provide you with effective customer service;
+- To provide you with a personalized experience when you use the Site or the Services or by delivering relevant Site or Services content;
+- To send you information about your relationship or transactions with us;
+- To otherwise contact you with information that we believe will be of interest to you, including marketing and promotional communications; and
+- To enhance or develop features, products or services.
+
+**Research and development.** We may use your personal information for research and development purposes, including to analyze and improve the Services, our Sites and our business. As part of these activities, we may create aggregated, de-identified or other anonymous data from personal information we collect. We make personal information into anonymous data by removing information that makes the data personally identifiable to you. We may use this anonymous data and share it with third parties for our lawful business purposes.
+
+**Marketing.** We may use your personal information in connection with sending you marketing communications as permitted by law, including by mail and email. You may opt out of marketing communications by following the unsubscribe instructions at the bottom of our marketing communications or by emailing us at contact@postgresml.org.
+
+**Compliance and protection.** We may use any of the categories of personal information described above to:
+
+- Comply with applicable laws, lawful requests, and legal process, such as to respond to subpoenas or requests from government authorities.
+- Protect our, your and others’ rights, privacy, safety and property (including by making and defending legal claims).
+- Audit our internal processes for compliance with legal and contractual requirements and internal policies.
+- Enforce the terms and conditions that govern the Site and our Services.
+- Prevent, identify, investigate and deter fraudulent, harmful, unauthorized, unethical or illegal activity, including cyberattacks and identity theft.
+
+We may also use your personal information for other purposes consistent with this Privacy Policy or that are explained to you at the time of collection of your personal information.
+
+## How We Share Your Personal Information
+
+We may disclose all categories of personal information described above with the following categories of third parties:
+
+**Affiliates.** We may share your personal information with our affiliates, including those that operate shared infrastructure, systems and technology, for purposes consistent with this Policy.
+
+**Third Party Service Providers.** We may provide your personal information to third party service providers that help us provide you with the Services that we offer through the Site or otherwise, and to operate our business.
+
+**Professional Advisors.** We may provide your personal information to our lawyers, accountants, bankers and other outside professional advisors in the course of the services they provide to us.
+
+**Corporate Restructuring.** We may share some or all of your personal information in connection with or during negotiation of any merger, financing, acquisition or dissolution, transaction or proceeding involving the sale, transfer, divestiture, or disclosure of all or a portion of our business or assets. In the event of an insolvency, bankruptcy, or receivership, personal information may also be transferred as a business asset. If another company acquires PostgresML, our business, or assets, that company will possess the personal information collected by us and will assume the rights and obligations regarding your personal information described in this Privacy Policy.
+
+**Other Disclosures.** PostgresML may disclose your personal information if it believes in good faith that such disclosure is necessary for any of the following:
+
+- In connection with a legal investigation;
+- To comply with relevant laws or to respond to subpoenas or warrants served on PostgresML;
+- To protect or defend the rights or property of PostgresML or users of the Site or Services; and/or
+- To investigate or assist in preventing any violation or potential violation of the law, this Privacy Policy, or our terms of service/terms of use.
+
+We may also share personal information with other categories of third parties with your consent or as described to you at the time of collection of your personal information.
+
+**Third Party Websites.** Our Site or the Services may contain links to third party websites or services. When you click on a link to any other website or location, you will leave our Site or the Services and go to another site and another entity may collect your personal information from you. We have no control over, do not review, and cannot be responsible for these outside websites or their content, or any collection of your personal information after you click on links to such outside websites. The links to third party websites or locations are for your convenience and do not signify our endorsement of such third parties or their products, content, websites or privacy practices.
+
+## Your Choices Regarding Your Personal Information
+
+You have several choices regarding the use of your personal information on the Site and our Services:
+
+**Email Communications.** We may periodically send you free newsletters and e-mails that directly promote the use of our Site or the Services. When you receive newsletters or promotional communications from us, you may indicate a preference to stop receiving further communications from us and you will have the opportunity to “opt-out” by following the unsubscribe instructions provided in the e-mail you receive or by contacting us directly (please see contact information below). Despite your indicated e-mail preferences, we may send you Services-related communications, including notices of any updates to our Privacy Policy or terms of service/terms of use.
+
+**Cookies.** If you decide at any time that you no longer wish to accept cookies from our Site for any of the purposes described above, then you can instruct your browser, by changing its settings, to stop accepting cookies or to prompt you before accepting a cookie from the websites you visit. Consult your browser’s technical information. If you do not accept cookies, however, you may not be able to use all portions of the Site or all functionality of the Services. If you have any questions about how to disable or modify cookies, visit https://www.allaboutcookies.org/.
+
+**Session Replay Technology.** If you decide that you do not wish to participate in Hotjar’s session replay technology, you can opt out of Hotjar’s collection and processing of data generated by your use of the Site and the Services by visiting https://www.hotjar.com/policies/do-not-track/.
+
+## Security Of Your Personal Information
+
+PostgresML is committed to protecting the security of your personal information. We use a variety of security technologies and procedures to help protect your personal information from unauthorized access, use, or disclosure. No method of transmission over the internet, or method of electronic storage, is 100% secure, however. Therefore, while PostgresML uses reasonable efforts to protect your personal information, we cannot guarantee its absolute security.
+
+## International Users
+
+Please note that our Site and the Services are hosted in the United States. If you use our Site or our Services from outside the United States, please be aware that your personal information may be transferred to, stored, and processed in the United States or other countries where our servers are located and our central database is operated. The data protection and privacy laws of the United States may differ from the laws in your country. By using our Site or our Services, you consent to the transfer of your personal information to the United States or other countries as described in this Privacy Policy.
+
+## Children
+
+Our Site and the Services are not intended for children under 18 years of age, and you must be at least 18 years old to have our permission to use the Site or the Services. We do not knowingly collect, use, or disclose personally identifiable information from children under 13. If you believe that we have collected, used, or disclosed personally identifiable information of a child under the age of 13, please contact us using the contact information below so that we can take appropriate action.
+
+## Do Not Track
+
+We currently do not support the Do Not Track browser setting or respond to Do Not Track signals. Do Not Track (or DNT) is a preference you can set in your browser to let the websites you visit know that you do not want them collecting certain information about you. For more details about Do Not Track, including how to enable or disable this preference, visit http://www.allaboutdnt.com.
+
+## Updates To This Privacy Policy
+
+We reserve the right to change this Privacy Policy at any time. If we make any material changes to this Privacy Policy, we will post the revised version to our website and update the “Effective Date” at the top of this Privacy Policy. Except as otherwise indicated, any changes will become effective when we post the revised Privacy Policy on our website.
+
+## California Consumer Privacy Act (CCPA)
+
+If you are a California resident, you have the right to request that we disclose certain information about our collection and use of your personal information over the past 12 months. You also have the right to request that we delete any personal information that we have collected from you, subject to certain exceptions. To make such requests, please contact us using the contact information provided below.
+
+We will not discriminate against you for exercising any of your CCPA rights, such as by denying you goods or services, charging you a different price, or providing you with a different level or quality of goods or services. For purposes of compliance with the CCPA, in the preceding 12 months, we have not sold any personal information. We do not sell personal information without affirmative authorization.
+
+## General Data Protection Regulation (GDPR)
+
+If you are a resident of the European Economic Area (EEA), you have certain rights under the General Data Protection Regulation (GDPR) regarding the collection, use, and retention of your personal data (which, as defined in the GDPR, means any information related to an identified or identifiable natural person).
+
+You have the right to access, correct, update, or delete any personal data we hold about you. You may also have the right to restrict or object to our processing of your personal data or to request that we provide a copy of your personal data to you or another controller. To exercise any of these rights, please contact us using the contact information provided below. You also have the right to lodge a complaint with a supervisory authority if you believe that our processing of your personal data violates applicable law.
+
+We may collect, use, and retain your personal data for the purposes of providing the Services to you and for other legitimate business purposes. Your personal data may be transferred to and stored in the United States or other countries outside the EEA. When we transfer your personal data outside the EEA, we will take appropriate steps to ensure that your personal data receives the same level of protection as it would in the EEA, including by entering into appropriate data transfer agreements.
+
+Our legal basis for collecting and processing your personal data is typically based on your consent or our legitimate business interests. In certain cases, we may also have a legal obligation to collect and process your personal data or may need to do so to perform services for you.
+
+If you have any questions or concerns about our privacy practices, please contact us using the contact information provided below.
+
+## Contact Us
+
+Our contact information is as follows: contact@postgresml.org
diff --git a/pgml-cms/docs/cloud/serverless.md b/pgml-cms/docs/cloud/serverless.md
index 1ddb73741..32412d96f 100644
--- a/pgml-cms/docs/cloud/serverless.md
+++ b/pgml-cms/docs/cloud/serverless.md
@@ -1,15 +1,15 @@
-# Serverless databases
+# Serverless
-A Serverless PostgresML database can be created in less than 5 seconds and provides immediate access to modern GPU acceleration, a predefined set of state-of-the-art large language models that should satisfy most use cases, and dozens of supervised learning algorithms like XGBoost, LightGBM, Catboost, and everything from Scikit-learn.
-With a Serverless database, storage and compute resources dynamically adapt to your application's needs, ensuring it can scale down or handle peak loads without overprovisioning.
+A Serverless PostgresML database can be created in less than 5 seconds and provides immediate access to modern GPU acceleration, a predefined set of state-of-the-art large language models that should satisfy most use cases, and dozens of supervised learning algorithms like XGBoost, LightGBM, CatBoost, and everything from Scikit-learn. We call this combination of tools an AI engine.
+With a Serverless engine, storage and compute resources dynamically adapt to your application's needs, ensuring it can scale down or handle peak loads without overprovisioning.
-Serverless databases are billed on a pay-per-use basis and we offer $100 in free credits to get you started!
+Serverless engines are billed on a pay-per-use basis and we offer $100 in free credits to get you started!
-### Create a Serverless database
+### Create a Serverless engine
-To create a Serverless database, make sure you have an account on postgresml.org. If you don't, you can create one now.
+To create a Serverless engine, make sure you have an account on postgresml.org. If you don't, you can create one now.
-Once logged in, select "New Database" from the left menu and choose the Serverless Plan.
+Once logged in, select "New Engine" from the left menu and choose the Serverless Plan.
Create new database
diff --git a/pgml-cms/docs/cloud/terms-of-service.md b/pgml-cms/docs/cloud/terms-of-service.md
new file mode 100644
index 000000000..93a83d750
--- /dev/null
+++ b/pgml-cms/docs/cloud/terms-of-service.md
@@ -0,0 +1,160 @@
+# Terms of Service
+
+Last Updated: 7/16/2024
+
+## Introduction
+
+Welcome to PostgresML! Your use of PostgresML’s services, including the services PostgresML makes available through this website and applications which link to these terms of service (the “Site”) and to all software or services offered by PostgresML in connection with any of those (the “Services”), is governed by these terms of service (the “Terms”), so please carefully read them before using the Services. For the purposes of these Terms, “we,” “our,” “us,” and “PostgresML” refer to Hyperparam Inc., the providers and operators of the Services.
+
+In order to use the Services, you must first agree to these Terms. If you are registering for or using the Services on behalf of an organization, you are agreeing to these Terms for that organization and promising that you have the authority to bind that organization to these Terms. In that case, “you” and “Customer” will also refer to that organization, wherever possible.
+
+You agree your purchases and/or use of the Services are not contingent on the delivery of any future functionality or features or dependent on any oral or written public comments made by PostgresML or any of its affiliates regarding future functionality or features.
+
+If you have entered into a separate written agreement with PostgresML for use of the Services, the terms and conditions of such other agreement shall prevail over any conflicting terms or conditions in these Terms with respect to the Services specified in such agreement.
+
+Arbitration notice: except for certain types of disputes described in the arbitration clause below, you agree that disputes between you and PostgresML will be resolved by mandatory binding arbitration and you waive any right to participate in a class-action lawsuit or class-wide arbitration.
+
+By using, downloading, installing, or otherwise accessing the Services or any materials included in or with the Services, you hereby agree to be bound by these Terms. If you do not accept these Terms, then you may not use, download, install, or otherwise access the Services.
+
+Certain features of the Services or Site may be subject to additional guidelines, terms, or rules, which will be posted on the Services or Site in connection with such features. To the extent such additional terms, guidelines, and rules conflict with these Terms, the additional terms shall govern solely with respect to such features. In all other situations, these Terms shall govern.
+
+## Your Account
+
+In the course of registering for or using the Services, you may be required to provide PostgresML with certain information, including your name, contact information, username and password (“Credentials”). PostgresML handles such information with the utmost attention, care and security. Nonetheless, you, not PostgresML, shall be responsible for maintaining and protecting your Credentials in connection with the Services. If your contact information or other information relating to your account changes, you must notify PostgresML promptly and keep such information current. You are solely responsible for any activity using your Credentials, whether or not you authorized that activity. You should immediately notify PostgresML of any unauthorized use of your Credentials or if your email or password has been hacked or stolen. If you discover that someone is using your Credentials without your consent, or you discover any other breach of security, you agree to notify PostgresML immediately.
+
+## Content
+
+A variety of information, reviews, recommendations, messages, comments, posts, text, graphics, software, photographs, videos, data, and other materials (“Content”) may be made available through the Services by PostgresML or its suppliers (“PostgresML-Supplied Content”). While PostgresML strives to keep the Content that it provides through the Services accurate, complete, and up-to-date, PostgresML cannot guarantee, and is not responsible for the accuracy, completeness, or timeliness of any PostgresML-Supplied Content.
+
+You acknowledge that you may also be able to create, transmit, publish or display information (such as data files, written text, computer software, music, audio files or other sounds, photographs, videos or other images) through use of the Services. All such information is referred to below as “User Content.”
+
+You agree that you are solely responsible for (and that PostgresML has no responsibility to you or to any third party for) any User Content, and for the consequences of your actions (including any loss or damage which PostgresML may suffer) in connection with such User Content. If you are registering for these Services on behalf of an organization, you also agree that you are responsible for the actions of associated Users and for any User Content that such associated Users might upload, record, publish, post, link to, or otherwise transmit or distribute through use of the Services. Furthermore, you acknowledge that PostgresML does not control or actively monitor Content uploaded by users and, as such, does not guarantee the accuracy, integrity or quality of such Content. You acknowledge that by using the Services, you may be exposed to materials that are offensive, indecent or objectionable. Under no circumstances will PostgresML be liable in any way for any such Content.
+
+PostgresML may refuse to store, provide, or otherwise maintain your User Content for any or no reason. PostgresML may remove your User Content from the Services at any time if you violate these Terms or if the Services are canceled or suspended. If User Content is stored using the Services with an expiration date, PostgresML may also delete the User Content as of that date. User Content that is deleted may be irretrievable. You agree that PostgresML has no responsibility or liability for the deletion or failure to store any User Content or other communications maintained or transmitted through use of the Services.
+
+PostgresML reserves the right (but shall have no obligation) to monitor and remove User Content from the Services, in its discretion. You agree to immediately take down any Content that violates these Terms, including pursuant to a takedown request from PostgresML. PostgresML also reserves the right to directly take down such Content.
+
+By submitting, posting or otherwise uploading User Content on or through the Services you give PostgresML a worldwide, nonexclusive, perpetual, fully sub-licensable, royalty-free right and license as set forth below:
+
+with respect to User Content that you submit, post or otherwise make publicly or generally available via the Services (e.g. public forum posts), the license to use, reproduce, modify, adapt, publish, translate, create derivative works from, distribute, publicly perform, and publicly display such User Content (in whole or part) worldwide via the Services or otherwise, and/or to incorporate it in other works in any form, media, or technology now known or later developed for any legal business purpose; and
+
+with respect to User Content that you submit, post or otherwise transmit privately via the Services, the license to use, reproduce, modify, adapt, publish, translate, create derivative works from, distribute, publicly perform and publicly display such User Content for the purpose of enabling PostgresML to provide you with the Services, and for the limited purposes stated in our Privacy Policy.
+
+Notwithstanding anything to the contrary in these Terms, PostgresML may monitor Customer's use of the Services and collect and compile Aggregated Data. As between PostgresML and you, all right, title, and interest in Aggregated Data, and all intellectual property rights therein, belong to and are retained solely by PostgresML. You acknowledge that PostgresML may compile Aggregated Data based on User Content input into the Services. Customer agrees that PostgresML may (i) make Aggregated Data available to third parties including its other customers in compliance with applicable law, and (ii) use Aggregated Data to the extent and in the manner permitted under applicable law. As used herein, “Aggregated Data” means data and information related to or derived from User Content or your use of the Services that is used by PostgresML in an aggregate and anonymized manner, including to compile statistical and performance information related to the Services.
+
+## Proprietary Rights
+
+You acknowledge and agree that PostgresML (and/or PostgresML’s licensors) own all legal right, title and interest in and to the Services and PostgresML-Supplied Content and that the Services and PostgresML-Supplied Content are protected by copyrights, trademarks, patents, or other proprietary rights and laws (whether those rights happen to be registered or not, and wherever in the world those rights may exist).
+
+Except as provided in Section 3, PostgresML acknowledges and agrees that it obtains no right, title or interest from you (or your licensors) under these Terms in or to any Content that you create, upload, submit, post, transmit, share or display on, or through, the Services, including any intellectual property rights which subsist in that Content (whether those rights happen to be registered or not, and wherever in the world those rights may exist). Unless you have agreed otherwise in writing with PostgresML, you agree that you are responsible for protecting and enforcing those rights and that PostgresML has no obligation to do so on your behalf.
+
+
+## License from PostgresML and Restrictions on Use
+
+PostgresML gives you a personal, worldwide, royalty-free, non-assignable and non-exclusive license to use the Site and Services for the sole purpose of allowing you to access the Services for your non-commercial or internal business purposes, in the manner permitted by these Terms.
+
+You may not (and you may not permit anyone else to): (i) copy, modify, create a derivative work of, reverse engineer, decompile or otherwise attempt to extract the source code of the Services or any part thereof, unless this is expressly permitted or required by law, or unless you have been specifically told that you may do so by PostgresML, in writing (e.g., through an open source software license); or (ii) attempt to disable or circumvent any security mechanisms used by the Services or any applications running on the Services.
+
+You may not engage in any activity that interferes with or disrupts the Services (or the servers and networks which are connected to the Services).
+
+You may not rent, lease, provide access to or sublicense any elements of the Services to a third party or use the Services on behalf of or to provide services to third parties.
+
+You may not access the Services in a manner intended to avoid incurring fees or exceeding usage limits or quotas.
+
+You may not access the Services for the purpose of bringing an intellectual property infringement claim against PostgresML or for the purpose of creating a product or service competitive with the Services. You may not use any robot, spider, site search/retrieval application or other manual or automatic program or device to retrieve, index, “scrape,” “data mine” or in any way gather Content from the Services.
+
+You agree that you will not upload, record, publish, post, link to, transmit or distribute User Content, or otherwise utilize the Services in a manner that: (i) advocates, promotes, incites, instructs, informs, assists or otherwise encourages violence or any illegal activities; (ii) infringes or violates the copyright, patent, trademark, service mark, trade name, trade secret, or other intellectual property rights of any third party or PostgresML, or any rights of publicity or privacy of any party; (iii) attempts to mislead others about your identity or the origin of a message or other communication, or impersonates or otherwise misrepresents your affiliation with any other person or entity, or is otherwise materially false, misleading, or inaccurate; (iv) promotes, solicits or comprises inappropriate, harassing, abusive, profane, hateful, defamatory, libelous, threatening, obscene, indecent, vulgar, pornographic or otherwise objectionable or unlawful content or activity; (v) is harmful to minors; (vi) utilizes or contains any viruses, Trojan horses, worms, time bombs, or any other similar software, data, or programs that may damage, detrimentally interfere with, surreptitiously intercept, or expropriate any system, data, personal information, or property of another; or (vii) violates any law, statute, ordinance, or regulation (including without limitation the laws and regulations governing export control, unfair competition, anti-discrimination, or false advertising).
+
+You may not use the Services if you are a person barred from receiving the Services under the laws of the United States or other countries, including the country in which you are resident or from which you use the Services. You affirm that you are over the age of 13, as the Services are not intended for children under 13.
+
+Customer is responsible and liable for all uses of the Services and Documentation resulting from access provided by Customer, directly or indirectly, whether such access or use is permitted by or in violation of these Terms. Without limiting the generality of the foregoing, Customer is responsible for all acts and omissions of authorized users, and any act or omission by an authorized user that would constitute a breach of these Terms if taken by Customer will be deemed a breach of these Terms by Customer. Customer shall use reasonable efforts to make all authorized users aware of the provisions of these Terms as applicable to such authorized users’ use of the Services and shall cause authorized users to comply with such provisions.
+
+PostgresML may from time to time make third-party products available to Customer or PostgresML may allow for certain third-party products to be integrated with the Services to allow for the transmission of User Content from such third-party products into the Services. For purposes of these Terms, such third-party products are subject to their own terms and conditions. If Customer does not agree to abide by the applicable terms for any such third-party products, then Customer should not install or use such third-party products. By authorizing PostgresML to transmit User Content from third-party products into the Services, Customer represents and warrants to PostgresML that it has all right, power, and authority to provide such authorization.
+
+Customer has and will retain sole responsibility for: (i) all User Content, including its content and use; (ii) all information, instructions, and materials provided by or on behalf of Customer or any authorized user in connection with the Services; (iii) Customer's information technology infrastructure, including computers, software, databases, electronic systems (including database management systems), and networks, whether operated directly by Customer or through the use of third-party services ("Customer Systems"); (iv) the security and use of Customer's and its authorized users' access credentials; and (v) all access to and use of the Services directly or indirectly by or through the Customer Systems or its or its authorized users' access credentials, with or without Customer's knowledge or consent, including all results obtained from, and all conclusions, decisions, and actions based on, such access or use.
+
+## Pricing Terms
+
+Subject to the Terms, the Services are provided to you without charge up to certain usage limits, and usage in excess of these limits may require purchase of additional resources and the payment of fees. Please see the [pricing](/pricing) terms for details regarding pricing for the Services.
+
+## Privacy Policies
+
+These Services are provided in accordance with our [Privacy Policy](/docs/cloud/privacy-policy). You agree to the use of your User Content and personal information in accordance with these Terms and PostgresML’s Privacy Policy.
+
+You agree to protect the privacy and legal rights of your End Users. If your End Users provide you with user names, passwords, or other login information or personal information, you agree to make such End Users aware that such information may be made available to PostgresML and to refer such End Users to our Privacy Policy linked above.
+
+Notwithstanding anything to the contrary, in the event you use the Services as an organization, you agree to permit PostgresML to identify you as a customer and to use your name and/or logo in PostgresML’s website and marketing materials.
+
+## Modification and Termination of Services
+
+PostgresML is constantly innovating in order to provide the best possible experience for its users. You acknowledge and agree that the form and nature of the Services which PostgresML provides may change from time to time without prior notice to you, subject to the terms in its Privacy Policy. Changes to the form and nature of the Services will be effective with respect to all versions of the Services; examples of changes to the form and nature of the Services include without limitation changes to fee and payment policies, security patches, added functionality, automatic updates, and other enhancements. Any new features that may be added to the website or the Services from time to time will be subject to these Terms, unless stated otherwise.
+
+You may terminate these Terms at any time by canceling your account on the Services, subject to any terms and conditions in connection with termination contained in the separate written agreement between you and PostgresML.
+
+You agree that PostgresML, in its sole discretion and for any or no reason, may terminate your account or any part thereof. You agree that any termination of your access to the Services may be without prior notice, and you agree that PostgresML will not be liable to you or any third party for such termination.
+
+You are solely responsible for exporting your User Content from the Services prior to termination of your account for any reason, provided that if we terminate your account for our convenience, we will endeavor to provide you a reasonable opportunity to retrieve your User Content.
+
+Upon any termination of the Services or your account these Terms will also terminate, but all provisions of these Terms which, by their nature, should survive termination, shall survive termination, including, without limitation, ownership provisions, warranty disclaimers, and limitations of liability.
+
+## Changes to the Terms
+
+These Terms may be amended or updated from time to time without notice and may have changed since your last visit to the website or use of the Services. It is your responsibility to review these Terms for any changes. By continuing to access or use the Services after revisions become effective, you agree to be bound by the revised Terms. If you do not agree to the new Terms, please stop using the Services. Please visit this page regularly to review these Terms for any changes.
+
+## Disclaimer of Warranty
+
+YOU EXPRESSLY UNDERSTAND AND AGREE THAT YOUR USE OF THE SERVICES IS AT YOUR SOLE RISK AND THAT THE SERVICES ARE PROVIDED “AS IS” AND “AS AVAILABLE.”
+
+POSTGRESML, ITS SUBSIDIARIES AND AFFILIATES, AND ITS LICENSORS MAKE NO EXPRESS WARRANTIES AND DISCLAIM ALL IMPLIED WARRANTIES REGARDING THE SERVICES, INCLUDING IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. WITHOUT LIMITING THE GENERALITY OF THE FOREGOING, POSTGRESML, ITS SUBSIDIARIES AND AFFILIATES, AND ITS LICENSORS DO NOT REPRESENT OR WARRANT TO YOU THAT: (A) YOUR USE OF THE SERVICES WILL MEET YOUR REQUIREMENTS, (B) YOUR USE OF THE SERVICES WILL BE UNINTERRUPTED, TIMELY, SECURE OR FREE FROM ERROR, AND (C) USAGE DATA PROVIDED THROUGH THE SERVICES WILL BE ACCURATE.
+
+NOTHING IN THESE TERMS, INCLUDING SECTIONS 10 AND 11, SHALL EXCLUDE OR LIMIT POSTGRESML’S WARRANTY OR LIABILITY FOR LOSSES WHICH MAY NOT BE LAWFULLY EXCLUDED OR LIMITED BY APPLICABLE LAW.
+
+## Limitation of Liability
+
+SUBJECT TO SECTION 10 ABOVE, YOU EXPRESSLY UNDERSTAND AND AGREE THAT POSTGRESML, ITS SUBSIDIARIES AND AFFILIATES, AND ITS LICENSORS SHALL NOT BE LIABLE TO YOU FOR ANY INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL, OR EXEMPLARY DAMAGES WHICH MAY BE INCURRED BY YOU, HOWEVER CAUSED AND UNDER ANY THEORY OF LIABILITY. THIS SHALL INCLUDE, BUT NOT BE LIMITED TO, ANY LOSS OF PROFIT (WHETHER INCURRED DIRECTLY OR INDIRECTLY), ANY LOSS OF GOODWILL OR BUSINESS REPUTATION, ANY LOSS OF DATA SUFFERED, COST OF PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, OR OTHER INTANGIBLE LOSS. THESE LIMITATIONS SHALL APPLY NOTWITHSTANDING THE FAILURE OF ESSENTIAL PURPOSE OF ANY LIMITED REMEDY.
+
+THE LIMITATIONS ON POSTGRESML’S LIABILITY TO YOU IN THIS SECTION SHALL APPLY WHETHER OR NOT POSTGRESML HAS BEEN ADVISED OF OR SHOULD HAVE BEEN AWARE OF THE POSSIBILITY OF ANY SUCH LOSSES ARISING.
+
+SOME STATES AND JURISDICTIONS MAY NOT ALLOW THE LIMITATION OR EXCLUSION OF LIABILITY FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES, SO THE ABOVE LIMITATION OR EXCLUSION MAY NOT APPLY TO YOU. IN NO EVENT SHALL POSTGRESML’S TOTAL LIABILITY TO YOU FOR ALL DAMAGES, LOSSES, AND CAUSES OF ACTION (WHETHER IN CONTRACT, TORT (INCLUDING NEGLIGENCE), OR OTHERWISE) EXCEED THE AMOUNT THAT YOU HAVE ACTUALLY PAID FOR THE SERVICES IN THE PAST TWELVE MONTHS, OR ONE HUNDRED DOLLARS ($100.00), WHICHEVER IS GREATER.
+
+## Indemnification
+
+You agree to hold harmless and indemnify PostgresML, and its subsidiaries, affiliates, officers, agents, employees, advertisers, licensors, suppliers or partners (collectively “PostgresML and Partners”) from and against any third party claim arising from or in any way related to (a) your breach of the Terms, (b) your use of the Services, (c) your violation of applicable laws, rules or regulations in connection with the Services, or (d) your User Content, including any liability or expense arising from all claims, losses, damages (actual and consequential), suits, judgments, litigation costs and attorneys’ fees, of every kind and nature.
+
+## Third-Party Content and Materials
+
+You may be able to access or use third party websites, resources, content, communications or information (“Third Party Materials”) via the Services. You acknowledge sole responsibility for and assume all risk arising from your access to, reliance upon or use of any such Third Party Materials and PostgresML disclaims any liability that you may incur arising from access to, reliance upon or use of such Third Party Materials via the Services.
+
+You acknowledge and agree that PostgresML: (a) is not responsible for the availability or accuracy of such Third Party Materials; (b) has no liability to you or any third party for any harm, injuries or losses suffered as a result of your access to, reliance upon or use of such Third Party Materials; and (c) does not make any promises to remove Third Party Materials from being accessed through the Services.
+
+## Third Party Software
+
+The Services may incorporate certain third party software (“Third Party Software”), which is licensed subject to the terms and conditions of the third party licensing such Third Party Software. Nothing in these Terms limits your rights under, or grants you rights that supersede, the terms and conditions of any applicable license for such Third Party Software.
+
+## Feedback
+
+You may choose to or we may invite you to submit comments or ideas about the Services, including without limitation about how to improve the Services or our products. By submitting any feedback, you agree that your disclosure is gratuitous, unsolicited and without restriction and will not place PostgresML under any fiduciary or other obligation, and that we are free to use such feedback without any additional compensation to you, and/or to disclose such feedback on a non-confidential basis or otherwise to anyone. Further, you warrant that your feedback is not subject to any license terms that would purport to require us to comply with any additional obligations with respect to any products or services that incorporate any of your feedback.
+
+## Disputes
+
+**Please read the following section carefully because it requires you to arbitrate certain disputes and claims with PostgresML and limits the manner in which you can seek relief from us.**
+
+These Terms and any action related thereto will be governed by the laws of the State of California without regard to its conflict of laws provisions. Except for small claims disputes in which you or PostgresML seek to bring an individual action in small claims court located in the county of your billing address or claims for injunctive relief by either party, any dispute or controversy arising out of, in relation to, or in connection with these Terms or your use of the Services shall be finally settled by binding arbitration in San Francisco County, California under the Federal Arbitration Act (9 U.S.C. §§ 1-307) and the then current rules of JAMS (formerly known as Judicial Arbitration & Mediation Services) by one (1) arbitrator appointed in accordance with such rules. Where arbitration is not required by these Terms, the exclusive jurisdiction and venue of any action with respect to the subject matter of these Terms will be the state and federal courts located in San Francisco County, California, and each of the parties hereto waives any objection to jurisdiction and venue in such courts. ANY DISPUTE RESOLUTION PROCEEDING ARISING OUT OF OR RELATED TO THESE TERMS OR THE SALES TRANSACTIONS BETWEEN YOU AND POSTGRESML, WHETHER IN ARBITRATION OR OTHERWISE, SHALL BE CONDUCTED ONLY ON AN INDIVIDUAL BASIS AND NOT IN A CLASS, CONSOLIDATED OR REPRESENTATIVE ACTION, AND YOU EXPRESSLY AGREE THAT CLASS ACTION AND REPRESENTATIVE ACTION PROCEDURES SHALL NOT BE ASSERTED IN NOR APPLY TO ANY ARBITRATION PURSUANT TO THESE TERMS AND CONDITIONS. YOU ALSO AGREE NOT TO BRING ANY LEGAL ACTION, BASED UPON ANY LEGAL THEORY INCLUDING CONTRACT, TORT, EQUITY OR OTHERWISE, AGAINST POSTGRESML THAT IS MORE THAN ONE YEAR AFTER THE DATE OF THE APPLICABLE ORDER.
+
+You have the right to opt out of binding arbitration within 30 days of the date you first accepted the terms of this Section by emailing us at contact@postgresml.org. In order to be effective, the opt out notice must include your full name and clearly indicate your intent to opt out of binding arbitration.
+
+## Miscellaneous
+
+These Terms, together with our Privacy Policy, constitute the entire agreement between the parties relating to the Services and all related activities. These Terms shall not be modified except in writing signed by both parties or by a new posting of these Terms issued by us. If any part of these Terms is held to be unlawful, void, or unenforceable, that part shall be deemed severed and shall not affect the validity and enforceability of the remaining provisions. The failure of PostgresML to exercise or enforce any right or provision under these Terms shall not constitute a waiver of such right or provision. Any waiver of any right or provision by PostgresML must be in writing and shall only apply to the specific instance identified in such writing. You may not assign these Terms, or any rights or licenses granted hereunder, whether voluntarily, by operation of law, or otherwise without our prior written consent.
+
+You must be over 13 years of age to use the Services, and children under the age of 13 cannot use or register for the Services. If you are over 13 years of age but are not yet of legal age to form a binding contract (in many jurisdictions, this age is 18), then you must get your parent or guardian to read these Terms and agree to them for you before you use the Services. If you are a parent or guardian and you provide your consent to your child's registration with the Services, you agree to be bound by these Terms with respect to your child’s use of the Services.
+
+
+## Contact Us
+
+If you have any questions about these Terms or if you wish to make any complaint or claim with respect to the Services, please contact us at: contact@postgresml.org.
+
+When submitting a complaint, please provide a brief description of the nature of your complaint and the specific services to which your complaint relates.
+
+
+
diff --git a/pgml-cms/docs/resources/faqs.md b/pgml-cms/docs/introduction/faq.md
similarity index 70%
rename from pgml-cms/docs/resources/faqs.md
rename to pgml-cms/docs/introduction/faq.md
index 2d8ede8c6..4166b14cc 100644
--- a/pgml-cms/docs/resources/faqs.md
+++ b/pgml-cms/docs/introduction/faq.md
@@ -2,11 +2,11 @@
description: PostgresML Frequently Asked Questions
---
-# FAQs
+# FAQ
-## What is PostgresML?
+## What is PGML?
-PostgresML is an open-source database extension that turns Postgres into an end-to-end machine learning platform. It allows you to build, train, and deploy ML models directly within your Postgres database without moving data between systems.
+PGML is an open-source database extension that turns Postgres into an end-to-end machine learning platform. It allows you to build, train, and deploy ML models directly within your Postgres database without moving data between systems.
## What is a DB extension?
@@ -24,11 +24,11 @@ Benefits include faster development cycles, reduced latency, tighter integration
PostgresML requires using Postgres as the database. If your data currently resides in a different database, there would be some upfront effort required to migrate the data into Postgres in order to utilize PostgresML's capabilities.
-## What is hosted PostgresML?
+## What is PostgresML Cloud?
-Hosted PostgresML is a fully managed cloud service that provides all the capabilities of open source PostgresML without the need to run your own database infrastructure.
+PostgresML Cloud is a fully managed cloud service that provides all the capabilities of open source PGML without the need to run your own database infrastructure.
-With hosted PostgresML, you get:
+With PostgresML Cloud, you get:
* Flexible compute resources - Choose CPU, RAM or GPU machines tailored to your workload
* Horizontally scalable inference with read-only replicas
@@ -37,4 +37,4 @@ With hosted PostgresML, you get:
* Automated backups and point-in-time restore
* Monitoring dashboard with metrics and logs
-In summary, hosted PostgresML removes the operational burden so you can focus on developing machine learning applications, while still getting the benefits of the unified PostgresML architecture.
+In summary, PostgresML Cloud removes the operational burden so you can focus on developing machine learning applications, while still getting the benefits of the unified PostgresML architecture.
diff --git a/pgml-cms/docs/introduction/getting-started/connect-your-app.md b/pgml-cms/docs/introduction/getting-started/connect-your-app.md
index f561fb081..100fcb638 100644
--- a/pgml-cms/docs/introduction/getting-started/connect-your-app.md
+++ b/pgml-cms/docs/introduction/getting-started/connect-your-app.md
@@ -42,7 +42,7 @@ const pgml = require("pgml");
const main = () => {
const client = pgml.newOpenSourceAI();
const results = client.chat_completions_create(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -66,7 +66,7 @@ import pgml
async def main():
client = pgml.OpenSourceAI()
results = client.chat_completions_create(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
diff --git a/pgml-cms/docs/introduction/getting-started/import-your-data/README.md b/pgml-cms/docs/introduction/import-your-data/README.md
similarity index 85%
rename from pgml-cms/docs/introduction/getting-started/import-your-data/README.md
rename to pgml-cms/docs/introduction/import-your-data/README.md
index 0ab10669c..c73d25ae6 100644
--- a/pgml-cms/docs/introduction/getting-started/import-your-data/README.md
+++ b/pgml-cms/docs/introduction/import-your-data/README.md
@@ -12,11 +12,11 @@ Just like any PostgreSQL database, PostgresML can be configured as the primary a
If your intention is to use PostgresML as your primary database, your job here is done. You can use the connection credentials provided and start building your application on top of in-database AI right away.
-## [Logical replica](logical-replication/)
+## [Logical replication](logical-replication/)
If your primary database is hosted elsewhere, for example AWS RDS, or Azure Postgres, you can get your data replicated to PostgresML in real time using logical replication.
-
+
Having access to your data immediately is very useful to
accelerate your machine learning use cases and removes the need for moving data multiple times between microservices. Latency-sensitive applications should consider using this approach.
@@ -25,7 +25,7 @@ accelerate your machine learning use cases and removes the need for moving data
Foreign data wrappers are a set of PostgreSQL extensions that allow making direct connections from inside the database directly to other databases, even if they aren't running on Postgres. For example, Postgres has foreign data wrappers for MySQL, S3, Snowflake and many others.
-
+
FDWs are useful when data access is infrequent and not latency-sensitive. For many use cases, like offline batch workloads and not very busy websites, this approach is suitable and easy to get started with.
diff --git a/pgml-cms/docs/introduction/getting-started/import-your-data/copy.md b/pgml-cms/docs/introduction/import-your-data/copy.md
similarity index 100%
rename from pgml-cms/docs/introduction/getting-started/import-your-data/copy.md
rename to pgml-cms/docs/introduction/import-your-data/copy.md
diff --git a/pgml-cms/docs/introduction/getting-started/import-your-data/foreign-data-wrappers.md b/pgml-cms/docs/introduction/import-your-data/foreign-data-wrappers.md
similarity index 97%
rename from pgml-cms/docs/introduction/getting-started/import-your-data/foreign-data-wrappers.md
rename to pgml-cms/docs/introduction/import-your-data/foreign-data-wrappers.md
index 0e3b12333..298634ed8 100644
--- a/pgml-cms/docs/introduction/getting-started/import-your-data/foreign-data-wrappers.md
+++ b/pgml-cms/docs/introduction/import-your-data/foreign-data-wrappers.md
@@ -6,7 +6,7 @@ description: Connect your production database to PostgresML using Foreign Data W
Foreign data wrappers are a set of Postgres extensions that allow making direct connections to other databases from inside your PostgresML database. Other databases can be your production Postgres database on RDS or Azure, or another database engine like MySQL, Snowflake, or even an S3 bucket.
-
+
## Getting started
diff --git a/pgml-cms/docs/introduction/getting-started/import-your-data/logical-replication/README.md b/pgml-cms/docs/introduction/import-your-data/logical-replication/README.md
similarity index 94%
rename from pgml-cms/docs/introduction/getting-started/import-your-data/logical-replication/README.md
rename to pgml-cms/docs/introduction/import-your-data/logical-replication/README.md
index d5371b391..b92daac8e 100644
--- a/pgml-cms/docs/introduction/getting-started/import-your-data/logical-replication/README.md
+++ b/pgml-cms/docs/introduction/import-your-data/logical-replication/README.md
@@ -6,7 +6,7 @@ description: Stream data from your primary database to PostgresML in real time u
Logical replication allows your PostgresML database to copy data from your primary database to PostgresML in real time. As soon as your customers make changes to their data on your website, those changes will become available in PostgresML.
-
+
## Getting started
@@ -21,7 +21,7 @@ First things first, make sure your primary database is configured to support log
| `wal_level` | `logical` |
| `wal_senders` | Greater than 0 |
| `max_replication_slots` | Greater than 0 |
-| `rds.logical_replicationion` (only on AWS RDS) | `1` |
+| `rds.logical_replication` (only on AWS RDS) | `1` |
Make sure to **restart your database** after changing any of these settings.
diff --git a/pgml-cms/docs/introduction/getting-started/import-your-data/logical-replication/inside-a-vpc.md b/pgml-cms/docs/introduction/import-your-data/logical-replication/inside-a-vpc.md
similarity index 82%
rename from pgml-cms/docs/introduction/getting-started/import-your-data/logical-replication/inside-a-vpc.md
rename to pgml-cms/docs/introduction/import-your-data/logical-replication/inside-a-vpc.md
index 55da8bafb..278d8e865 100644
--- a/pgml-cms/docs/introduction/getting-started/import-your-data/logical-replication/inside-a-vpc.md
+++ b/pgml-cms/docs/introduction/import-your-data/logical-replication/inside-a-vpc.md
@@ -3,7 +3,7 @@
If your database doesn't have Internet access, PostgresML will need a service to proxy connections to your database. Any TCP proxy will do,
and we also provide an nginx-based Docker image than can be used without any additional configuration.
-
+
## PostgresML IPs by region
diff --git a/pgml-cms/docs/introduction/getting-started/import-your-data/pg-dump.md b/pgml-cms/docs/introduction/import-your-data/pg-dump.md
similarity index 100%
rename from pgml-cms/docs/introduction/getting-started/import-your-data/pg-dump.md
rename to pgml-cms/docs/introduction/import-your-data/pg-dump.md
diff --git a/pgml-cms/docs/resources/data-storage-and-retrieval/README.md b/pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/README.md
similarity index 100%
rename from pgml-cms/docs/resources/data-storage-and-retrieval/README.md
rename to pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/README.md
diff --git a/pgml-cms/docs/resources/data-storage-and-retrieval/documents.md b/pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/documents.md
similarity index 100%
rename from pgml-cms/docs/resources/data-storage-and-retrieval/documents.md
rename to pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/documents.md
diff --git a/pgml-cms/docs/resources/data-storage-and-retrieval/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md b/pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md
similarity index 100%
rename from pgml-cms/docs/resources/data-storage-and-retrieval/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md
rename to pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md
diff --git a/pgml-cms/docs/resources/data-storage-and-retrieval/partitioning.md b/pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/partitioning.md
similarity index 100%
rename from pgml-cms/docs/resources/data-storage-and-retrieval/partitioning.md
rename to pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/partitioning.md
diff --git a/pgml-cms/docs/resources/data-storage-and-retrieval/tabular-data.md b/pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/tabular-data.md
similarity index 100%
rename from pgml-cms/docs/resources/data-storage-and-retrieval/tabular-data.md
rename to pgml-cms/docs/introduction/import-your-data/storage-and-retrieval/tabular-data.md
diff --git a/pgml-cms/docs/open-source/korvus/example-apps/rag-with-openai.md b/pgml-cms/docs/open-source/korvus/example-apps/rag-with-openai.md
index 738777f7d..64cc2af4a 100644
--- a/pgml-cms/docs/open-source/korvus/example-apps/rag-with-openai.md
+++ b/pgml-cms/docs/open-source/korvus/example-apps/rag-with-openai.md
@@ -1,3 +1,7 @@
+---
+description: An example application performing RAG with Korvus and OpenAI.
+---
+
# Rag with OpenAI
This example shows how to use third-party LLM providers like OpenAI to perform RAG with Korvus.
diff --git a/pgml-cms/docs/open-source/korvus/example-apps/semantic-search.md b/pgml-cms/docs/open-source/korvus/example-apps/semantic-search.md
index 88cf149cd..d48158b81 100644
--- a/pgml-cms/docs/open-source/korvus/example-apps/semantic-search.md
+++ b/pgml-cms/docs/open-source/korvus/example-apps/semantic-search.md
@@ -1,3 +1,8 @@
+---
+description: >-
+ An example application built with Korvus to perform Semantic Search.
+---
+
# Semantic Search
This example demonstrates using the `korvus` SDK to create a collection, add documents, build a pipeline for vector search and make a sample query.
@@ -47,7 +52,7 @@ const main = async () => {
// Perform vector_search
// We are querying for the string "Is Korvus fast?"
- // Notice that the `mixedbread-ai/mxbai-embed-large-v1` embedding model takes a prompt paramter when embedding for search
+ // Notice that the `mixedbread-ai/mxbai-embed-large-v1` embedding model takes a prompt parameter when embedding for search
// We specify that we only want to return the `id` of documents. If the `document` key was blank it would return the entire document with every result
// Limit the results to 5. In our case we only have two documents in our Collection so we will only get two results
const results = await collection.vector_search(
@@ -122,7 +127,7 @@ async def main():
# Perform vector_search
# We are querying for the string "Is Korvus fast?"
- # Notice that the `mixedbread-ai/mxbai-embed-large-v1` embedding model takes a prompt paramter when embedding for search
+ # Notice that the `mixedbread-ai/mxbai-embed-large-v1` embedding model takes a prompt parameter when embedding for search
# We specify that we only want to return the `id` of documents. If the `document` key was blank it would return the entire document with every result
# Limit the results to 5. In our case we only have two documents in our Collection so we will only get two results
results = await collection.vector_search(
diff --git a/pgml-cms/docs/open-source/korvus/guides/README.md b/pgml-cms/docs/open-source/korvus/guides/README.md
index 7a79c66f6..733c2b855 100644
--- a/pgml-cms/docs/open-source/korvus/guides/README.md
+++ b/pgml-cms/docs/open-source/korvus/guides/README.md
@@ -11,3 +11,5 @@ For example apps checkout our [Example apps section](../example-apps/).
- [Constructing Pipelines](constructing-pipelines)
- [RAG](rag)
- [Vector Search](vector-search)
+- [Document Search](document-search)
+- [OpenSourceAI](opensourceai)
diff --git a/pgml-cms/docs/guides/opensourceai.md b/pgml-cms/docs/open-source/korvus/guides/opensourceai.md
similarity index 92%
rename from pgml-cms/docs/guides/opensourceai.md
rename to pgml-cms/docs/open-source/korvus/guides/opensourceai.md
index e10386da5..2bd5f627b 100644
--- a/pgml-cms/docs/guides/opensourceai.md
+++ b/pgml-cms/docs/open-source/korvus/guides/opensourceai.md
@@ -62,7 +62,7 @@ Here is a simple example using zephyr-7b-beta, one of the best 7 billion paramet
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const results = client.chat_completions_create(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -83,7 +83,7 @@ console.log(results);
import korvus
client = korvus.OpenSourceAI()
results = client.chat_completions_create(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
@@ -114,7 +114,7 @@ print(results)
],
"created": 1701291672,
"id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion",
"system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
"usage": {
@@ -133,7 +133,7 @@ Notice there is near one to one relation between the parameters and return type
The best part of using open-source AI is the flexibility with models. Unlike OpenAI, we are not restricted to using a few censored models, but have access to almost any model out there.
-Here is an example of streaming with the popular `meta-llama/Meta-Llama-3-8B-Instruct` model.
+Here is an example of streaming with the popular `meta-llama/Meta-Llama-3.1-8B-Instruct` model.
{% tabs %}
{% tab title="JavaScript" %}
@@ -141,7 +141,7 @@ Here is an example of streaming with the popular `meta-llama/Meta-Llama-3-8B-Ins
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const it = client.chat_completions_create_stream(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -166,7 +166,7 @@ while (!result.done) {
import korvus
client = korvus.OpenSourceAI()
results = client.chat_completions_create_stream(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
@@ -196,7 +196,7 @@ for c in results:
],
"created": 1701296792,
"id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion.chunk",
"system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
@@ -212,7 +212,7 @@ for c in results:
],
"created": 1701296792,
"id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion.chunk",
"system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
@@ -234,7 +234,7 @@ We also have asynchronous versions of the `chat_completions_create` and `chat_co
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const results = await client.chat_completions_create_async(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -255,7 +255,7 @@ console.log(results);
import korvus
client = korvus.OpenSourceAI()
results = await client.chat_completions_create_async(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
@@ -284,7 +284,7 @@ results = await client.chat_completions_create_async(
],
"created": 1701291672,
"id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion",
"system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
"usage": {
@@ -303,7 +303,7 @@ Notice the return types for the sync and async variations are the same.
const korvus = require("korvus");
const client = korvus.newOpenSourceAI();
const it = await client.chat_completions_create_stream_async(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
role: "system",
@@ -328,7 +328,7 @@ while (!result.done) {
import korvus
client = korvus.OpenSourceAI()
results = await client.chat_completions_create_stream_async(
- "meta-llama/Meta-Llama-3-8B-Instruct",
+ "meta-llama/Meta-Llama-3.1-8B-Instruct",
[
{
"role": "system",
@@ -359,7 +359,7 @@ async for c in results:
],
"created": 1701296792,
"id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion.chunk",
"system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
@@ -375,7 +375,7 @@ async for c in results:
],
"created": 1701296792,
"id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"object": "chat.completion.chunk",
"system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
}
@@ -389,8 +389,8 @@ We have truncated the output to two items
We have tested the following models and verified they work with the OpenSourceAI:
-* meta-llama/Meta-Llama-3-8B-Instruct
-* meta-llama/Meta-Llama-3-70B-Instruct
+* meta-llama/Meta-Llama-3.1-8B-Instruct
+* meta-llama/Meta-Llama-3.1-70B-Instruct
* microsoft/Phi-3-mini-128k-instruct
* mistralai/Mixtral-8x7B-Instruct-v0.1
* mistralai/Mistral-7B-Instruct-v0.2
diff --git a/pgml-cms/docs/open-source/korvus/guides/rag.md b/pgml-cms/docs/open-source/korvus/guides/rag.md
index 4fe76f380..d9a2e23e1 100644
--- a/pgml-cms/docs/open-source/korvus/guides/rag.md
+++ b/pgml-cms/docs/open-source/korvus/guides/rag.md
@@ -114,7 +114,7 @@ const results = await collection.rag(
aggregate: { "join": "\n" },
},
chat: {
- model: "meta-llama/Meta-Llama-3-8B-Instruct",
+ model: "meta-llama/Meta-Llama-3.1-8B-Instruct",
messages: [
{
role: "system",
@@ -155,7 +155,7 @@ results = await collection.rag(
"aggregate": {"join": "\n"},
},
"chat": {
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"messages": [
{
"role": "system",
@@ -196,7 +196,7 @@ let results = collection.rag(serde_json::json!(
"aggregate": {"join": "\n"},
},
"chat": {
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"messages": [
{
"role": "system",
@@ -236,7 +236,7 @@ char * results = korvus_collectionc_rag(collection,
\"aggregate\": {\"join\": \"\\n\"}\
},\
\"chat\": {\
- \"model\": \"meta-llama/Meta-Llama-3-8B-Instruct\",\
+ \"model\": \"meta-llama/Meta-Llama-3.1-8B-Instruct\",\
\"messages\": [\
{\
\"role\": \"system\",\
@@ -314,7 +314,7 @@ const results = await collection.rag(
aggregate: { "join": "\n" },
},
chat: {
- model: "meta-llama/Meta-Llama-3-8B-Instruct",
+ model: "meta-llama/Meta-Llama-3.1-8B-Instruct",
messages: [
{
role: "system",
@@ -356,7 +356,7 @@ results = await collection.rag(
"aggregate": {"join": "\n"},
},
"chat": {
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"messages": [
{
"role": "system",
@@ -398,7 +398,7 @@ let results = collection.rag(serde_json::json!(
"aggregate": {"join": "\n"},
},
"chat": {
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"messages": [
{
"role": "system",
diff --git a/pgml-cms/docs/open-source/pgcat/README.md b/pgml-cms/docs/open-source/pgcat/README.md
index 805422e97..a5fd27649 100644
--- a/pgml-cms/docs/open-source/pgcat/README.md
+++ b/pgml-cms/docs/open-source/pgcat/README.md
@@ -29,7 +29,7 @@ PgCat, like PostgresML, is free and open source, distributed under the MIT licen
PgCat implements the PostgreSQL wire protocol and can understand and optimally route queries & transactions based on their characteristics. For example, if your database deployment consists of a primary and replica, PgCat can send all `SELECT` queries to the replica, and all other queries to the primary, creating a read/write traffic separation.
-
+ PgCat deployment at scale
diff --git a/pgml-cms/docs/open-source/pgcat/features.md b/pgml-cms/docs/open-source/pgcat/features.md
index f00ff7fb4..e8154dbac 100644
--- a/pgml-cms/docs/open-source/pgcat/features.md
+++ b/pgml-cms/docs/open-source/pgcat/features.md
@@ -11,7 +11,7 @@ PgCat has many features currently in various stages of readiness and development
-
+
@@ -32,7 +32,7 @@ Least active connections assumes queries have different costs and replicas have
-
+
@@ -49,7 +49,7 @@ High availability is important for production deployments because database error
-
+
@@ -66,7 +66,7 @@ Removing read traffic from the primary can help scale it beyond its normal capac
-
+
diff --git a/pgml-cms/docs/open-source/pgml/api/pgml.transform/README.md b/pgml-cms/docs/open-source/pgml/api/pgml.transform/README.md
index 722d49d57..b9d6de949 100644
--- a/pgml-cms/docs/open-source/pgml/api/pgml.transform/README.md
+++ b/pgml-cms/docs/open-source/pgml/api/pgml.transform/README.md
@@ -123,7 +123,7 @@ pgml.transform(
SELECT pgml.transform(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"model_type": "mistral",
"revision": "main",
"device_map": "auto"
@@ -148,7 +148,7 @@ def transform(task, call, inputs):
transform(
{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
"model_type": "mistral",
"revision": "main",
},
diff --git a/pgml-cms/docs/open-source/pgml/api/pgml.transform/text-generation.md b/pgml-cms/docs/open-source/pgml/api/pgml.transform/text-generation.md
index 707f5ab84..7439f3c5f 100644
--- a/pgml-cms/docs/open-source/pgml/api/pgml.transform/text-generation.md
+++ b/pgml-cms/docs/open-source/pgml/api/pgml.transform/text-generation.md
@@ -14,7 +14,7 @@ Use this for conversational AI applications or when you need to provide instruct
SELECT pgml.transform(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
inputs => ARRAY[
'{"role": "system", "content": "You are a friendly and helpful chatbot"}'::JSONB,
@@ -53,7 +53,7 @@ An example with some common parameters:
SELECT pgml.transform(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
inputs => ARRAY[
'{"role": "system", "content": "You are a friendly and helpful chatbot"}'::JSONB,
@@ -80,7 +80,7 @@ Use this for simpler text-generation tasks like completing sentences or generati
SELECT pgml.transform(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
inputs => ARRAY[
'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone'
@@ -118,7 +118,7 @@ An example with some common parameters:
SELECT pgml.transform(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
inputs => ARRAY[
'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone'
diff --git a/pgml-cms/docs/open-source/pgml/api/pgml.transform_stream.md b/pgml-cms/docs/open-source/pgml/api/pgml.transform_stream.md
index 7d259a742..c4fcf3c6e 100644
--- a/pgml-cms/docs/open-source/pgml/api/pgml.transform_stream.md
+++ b/pgml-cms/docs/open-source/pgml/api/pgml.transform_stream.md
@@ -30,13 +30,13 @@ pgml.transform_stream(
| inputs | The input chat messages. |
| args | The additional arguments for the model. |
-A simple example using `meta-llama/Meta-Llama-3-8B-Instruct`:
+A simple example using `meta-llama/Meta-Llama-3.1-8B-Instruct`:
```postgresql
SELECT pgml.transform_stream(
task => '{
"task": "conversational",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
inputs => ARRAY[
'{"role": "system", "content": "You are a friendly and helpful chatbot"}'::JSONB,
@@ -85,7 +85,7 @@ An example with some common parameters:
SELECT pgml.transform_stream(
task => '{
"task": "conversational",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
inputs => ARRAY[
'{"role": "system", "content": "You are a friendly and helpful chatbot"}'::JSONB,
@@ -132,13 +132,13 @@ pgml.transform_stream(
| input | The text to complete. |
| args | The additional arguments for the model. |
-A simple example using `meta-llama/Meta-Llama-3-8B-Instruct`:
+A simple example using `meta-llama/Meta-Llama-3.1-8B-Instruct`:
```postgresql
SELECT pgml.transform_stream(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
input => 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone'
) AS answer;
@@ -189,7 +189,7 @@ An example with some common parameters:
SELECT pgml.transform_stream(
task => '{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::JSONB,
input => 'Three Rings for the Elven-kings under the sky, Seven for the Dwarf-lords in their halls of stone',
args => '{
diff --git a/pgml-cms/docs/open-source/pgml/developers/README.md b/pgml-cms/docs/open-source/pgml/developers/README.md
new file mode 100644
index 000000000..eb352d266
--- /dev/null
+++ b/pgml-cms/docs/open-source/pgml/developers/README.md
@@ -0,0 +1,3 @@
+# Developers
+
+Documentation relevant to self-hosting, compiling or contributing to PostgresML.
diff --git a/pgml-cms/docs/resources/developer-docs/contributing.md b/pgml-cms/docs/open-source/pgml/developers/contributing.md
similarity index 99%
rename from pgml-cms/docs/resources/developer-docs/contributing.md
rename to pgml-cms/docs/open-source/pgml/developers/contributing.md
index 4a6cacc73..146a0077b 100644
--- a/pgml-cms/docs/resources/developer-docs/contributing.md
+++ b/pgml-cms/docs/open-source/pgml/developers/contributing.md
@@ -127,7 +127,7 @@ SELECT pgml.version();
postgres=# select pgml.version();
version
-------------------
- 2.9.2
+ 2.9.3
(1 row)
```
{% endtab %}
diff --git a/pgml-cms/docs/resources/developer-docs/distributed-training.md b/pgml-cms/docs/open-source/pgml/developers/distributed-training.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/distributed-training.md
rename to pgml-cms/docs/open-source/pgml/developers/distributed-training.md
diff --git a/pgml-cms/docs/resources/developer-docs/gpu-support.md b/pgml-cms/docs/open-source/pgml/developers/gpu-support.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/gpu-support.md
rename to pgml-cms/docs/open-source/pgml/developers/gpu-support.md
diff --git a/pgml-cms/docs/resources/developer-docs/installation.md b/pgml-cms/docs/open-source/pgml/developers/installation.md
similarity index 99%
rename from pgml-cms/docs/resources/developer-docs/installation.md
rename to pgml-cms/docs/open-source/pgml/developers/installation.md
index f3db4a7a6..a0343f80e 100644
--- a/pgml-cms/docs/resources/developer-docs/installation.md
+++ b/pgml-cms/docs/open-source/pgml/developers/installation.md
@@ -132,7 +132,7 @@ CREATE EXTENSION
pgml_test=# SELECT pgml.version();
version
---------
- 2.9.2
+ 2.9.3
(1 row)
```
diff --git a/pgml-cms/docs/resources/developer-docs/quick-start-with-docker.md b/pgml-cms/docs/open-source/pgml/developers/quick-start-with-docker.md
similarity index 99%
rename from pgml-cms/docs/resources/developer-docs/quick-start-with-docker.md
rename to pgml-cms/docs/open-source/pgml/developers/quick-start-with-docker.md
index c8d95fc83..5d946f84e 100644
--- a/pgml-cms/docs/resources/developer-docs/quick-start-with-docker.md
+++ b/pgml-cms/docs/open-source/pgml/developers/quick-start-with-docker.md
@@ -18,7 +18,7 @@ docker run \
-v postgresml_data:/var/lib/postgresql \
-p 5433:5432 \
-p 8000:8000 \
- ghcr.io/postgresml/postgresml:2.7.13 \
+ ghcr.io/postgresml/postgresml:2.9.3 \
sudo -u postgresml psql -d postgresml
```
{% endtab %}
@@ -43,7 +43,7 @@ docker run \
--gpus all \
-p 5433:5432 \
-p 8000:8000 \
- ghcr.io/postgresml/postgresml:2.7.3 \
+ ghcr.io/postgresml/postgresml:2.9.3 \
sudo -u postgresml psql -d postgresml
```
@@ -80,7 +80,7 @@ Time: 41.520 ms
postgresml=# SELECT pgml.version();
version
---------
- 2.9.2
+ 2.9.3
(1 row)
```
diff --git a/pgml-cms/docs/resources/developer-docs/self-hosting/README.md b/pgml-cms/docs/open-source/pgml/developers/self-hosting/README.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/self-hosting/README.md
rename to pgml-cms/docs/open-source/pgml/developers/self-hosting/README.md
diff --git a/pgml-cms/docs/resources/developer-docs/self-hosting/backups.md b/pgml-cms/docs/open-source/pgml/developers/self-hosting/backups.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/self-hosting/backups.md
rename to pgml-cms/docs/open-source/pgml/developers/self-hosting/backups.md
diff --git a/pgml-cms/docs/resources/developer-docs/self-hosting/building-from-source.md b/pgml-cms/docs/open-source/pgml/developers/self-hosting/building-from-source.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/self-hosting/building-from-source.md
rename to pgml-cms/docs/open-source/pgml/developers/self-hosting/building-from-source.md
diff --git a/pgml-cms/docs/resources/developer-docs/self-hosting/pooler.md b/pgml-cms/docs/open-source/pgml/developers/self-hosting/pooler.md
similarity index 99%
rename from pgml-cms/docs/resources/developer-docs/self-hosting/pooler.md
rename to pgml-cms/docs/open-source/pgml/developers/self-hosting/pooler.md
index b34441afd..5809012fc 100644
--- a/pgml-cms/docs/resources/developer-docs/self-hosting/pooler.md
+++ b/pgml-cms/docs/open-source/pgml/developers/self-hosting/pooler.md
@@ -115,6 +115,6 @@ Type "help" for help.
postgresml=> SELECT pgml.version();
version
---------
- 2.9.2
+ 2.9.3
(1 row)
```
diff --git a/pgml-cms/docs/resources/developer-docs/self-hosting/replication.md b/pgml-cms/docs/open-source/pgml/developers/self-hosting/replication.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/self-hosting/replication.md
rename to pgml-cms/docs/open-source/pgml/developers/self-hosting/replication.md
diff --git a/pgml-cms/docs/resources/developer-docs/self-hosting/running-on-ec2.md b/pgml-cms/docs/open-source/pgml/developers/self-hosting/running-on-ec2.md
similarity index 100%
rename from pgml-cms/docs/resources/developer-docs/self-hosting/running-on-ec2.md
rename to pgml-cms/docs/open-source/pgml/developers/self-hosting/running-on-ec2.md
diff --git a/pgml-cms/docs/open-source/pgml/guides/README.md b/pgml-cms/docs/open-source/pgml/guides/README.md
new file mode 100644
index 000000000..f6221b691
--- /dev/null
+++ b/pgml-cms/docs/open-source/pgml/guides/README.md
@@ -0,0 +1,14 @@
+# Guides
+
+* [Embeddings](embeddings/)
+ * [In-database Generation](embeddings/in-database-generation.md)
+ * [Dimensionality Reduction](embeddings/dimensionality-reduction.md)
+ * [Aggregation](embeddings/vector-aggregation.md)
+ * [Similarity](embeddings/vector-similarity.md)
+ * [Normalization](embeddings/vector-normalization.md)
+* [Search](improve-search-results-with-machine-learning.md)
+* [Chatbots](chatbots/)
+* [Supervised Learning](supervised-learning.md)
+* [Unified RAG](unified-rag.md)
+* [Natural Language Processing](natural-language-processing.md)
+* [Vector Database](vector-database.md)
diff --git a/pgml-cms/docs/guides/chatbots/README.md b/pgml-cms/docs/open-source/pgml/guides/chatbots/README.md
similarity index 95%
rename from pgml-cms/docs/guides/chatbots/README.md
rename to pgml-cms/docs/open-source/pgml/guides/chatbots/README.md
index cd65d9125..74ba0718a 100644
--- a/pgml-cms/docs/guides/chatbots/README.md
+++ b/pgml-cms/docs/open-source/pgml/guides/chatbots/README.md
@@ -30,7 +30,7 @@ Here is an example flowing from:
text -> tokens -> LLM -> probability distribution -> predicted token -> text
-
The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"
+
The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"
{% hint style="info" %}
We have simplified the tokenization process. Words do not always map directly to tokens. For instance, the word "Baldur's" may actually map to multiple tokens. For more information on tokenization checkout [HuggingFace's summary](https://huggingface.co/docs/transformers/tokenizer\_summary).
@@ -108,11 +108,11 @@ What does an `embedding` look like? `Embeddings` are just vectors (for our use c
embedding_1 = embed("King") # embed returns something like [0.11, -0.32, 0.46, ...]
```
-
The flow of word -> token -> embedding
+
The flow of word -> token -> embedding
`Embeddings` aren't limited to words, we have models that can embed entire sentences.
-
The flow of sentence -> tokens -> embedding
+
The flow of sentence -> tokens -> embedding
Why do we care about `embeddings`? `Embeddings` have a very interesting property. Words and sentences that have close [semantic similarity](https://en.wikipedia.org/wiki/Semantic\_similarity) sit closer to one another in vector space than words and sentences that do not have close semantic similarity.
@@ -157,7 +157,7 @@ print(context)
There is a lot going on with this, let's check out this diagram and step through it.
-
The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a users query
+
The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based off of a users query
Step 1: We take the document and split it into chunks. Chunks are typically a paragraph or two in size. There are many ways to split documents into chunks, for more information check out [this guide](https://www.pinecone.io/learn/chunking-strategies/).
@@ -202,7 +202,7 @@ Let's take this hypothetical example and make it a reality. For the rest of this
* The chatbot remembers our past conversation
* The chatbot can answer questions correctly about Baldur's Gate 3
-In reality we haven't created a SOTA LLM, but fortunately other people have and we will be using the incredibly popular `meta-llama/Meta-Llama-3-8B-Instruct`. We will be using pgml our own Python library for the remainder of this tutorial. If you want to follow along and have not installed it yet:
+In reality we haven't created a SOTA LLM, but fortunately other people have, and we will be using the incredibly popular `meta-llama/Meta-Llama-3.1-8B-Instruct`. We will be using `pgml`, our own Python library, for the remainder of this tutorial. If you want to follow along and have not installed it yet:
```
pip install pgml
@@ -220,7 +220,7 @@ Let's setup a basic chat loop with our model:
from pgml import TransformerPipeline
import asyncio
-model = TransformerPipeline("text-generation", "meta-llama/Meta-Llama-3-8B-Instruct")
+model = TransformerPipeline("text-generation", "meta-llama/Meta-Llama-3.1-8B-Instruct")
async def main():
@@ -266,7 +266,7 @@ Remember LLM's are just function approximators that are designed to predict the
We need to understand that LLMs have a special format for the inputs specifically for conversations. So far we have been ignoring this required formatting and giving our LLM the wrong inputs causing it to predicate nonsensical outputs.
-What do the right inputs look like? That actually depends on the model. Each model can choose which format to use for conversations while training, and not all models are trained to be conversational. `meta-llama/Meta-Llama-3-8B-Instruct` has been trained to be conversational and expects us to format text meant for conversations like so:
+What do the right inputs look like? That actually depends on the model. Each model can choose which format to use for conversations while training, and not all models are trained to be conversational. `meta-llama/Meta-Llama-3.1-8B-Instruct` has been trained to be conversational and expects us to format text meant for conversations like so:
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
@@ -284,7 +284,7 @@ This is the style of input our LLM has been trained on. Let's do a simple test w
from pgml import TransformerPipeline
import asyncio
-model = TransformerPipeline("text-generation", "meta-llama/Meta-Llama-3-8B-Instruct")
+model = TransformerPipeline("text-generation", "meta-llama/Meta-Llama-3.1-8B-Instruct")
user_input = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
@@ -315,7 +315,7 @@ That was perfect! We got the exact response we wanted for the first question, bu
from pgml import TransformerPipeline
import asyncio
-model = TransformerPipeline("text-generation", "meta-llama/Meta-Llama-3-8B-Instruct")
+model = TransformerPipeline("text-generation", "meta-llama/Meta-Llama-3.1-8B-Instruct")
user_input = """
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
@@ -346,7 +346,7 @@ By chaining these special tags we can build a conversation that Llama has been t
This example highlights that modern LLM's are stateless function approximators. Notice we have included the first question we asked and the models response in our input. Every time we ask it a new question in our conversation, we will have to supply the entire conversation history if we want it to know what we already discussed. LLMs have no built in way to remember past questions and conversations.
{% endhint %}
-Doing this by hand seems very tedious, how do we actually accomplish this in the real world? We use [Jinja](https://jinja.palletsprojects.com/en/3.1.x/) templates. Conversational models on HuggingFace typical come with a Jinja template which can be found in the `tokenizer_config.json`. [Checkout `meta-llama/Meta-Llama-3-8B-Instruct`'s Jinja template in the `tokenizer_config.json`](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/blob/main/tokenizer_config.json). For more information on Jinja templating check out [HuggingFace's introduction](https://huggingface.co/docs/transformers/main/chat_templating).
+Doing this by hand seems very tedious. How do we actually accomplish this in the real world? We use [Jinja](https://jinja.palletsprojects.com/en/3.1.x/) templates. Conversational models on HuggingFace typically come with a Jinja template which can be found in the `tokenizer_config.json`. [Check out `meta-llama/Meta-Llama-3.1-8B-Instruct`'s Jinja template in the `tokenizer_config.json`](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/blob/main/tokenizer_config.json). For more information on Jinja templating check out [HuggingFace's introduction](https://huggingface.co/docs/transformers/main/chat_templating).
Luckily for everyone reading this, our `pgml` library automatically handles templating and formatting inputs correctly so we can skip a bunch of boring code. We do want to change up our program a little bit to take advantage of this automatic templating:
diff --git a/pgml-cms/docs/guides/embeddings/README.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/README.md
similarity index 93%
rename from pgml-cms/docs/guides/embeddings/README.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/README.md
index 39557d79f..a50e0b673 100644
--- a/pgml-cms/docs/guides/embeddings/README.md
+++ b/pgml-cms/docs/open-source/pgml/guides/embeddings/README.md
@@ -39,7 +39,7 @@ Vectors can be stored in the native Postgres [`ARRAY[]`](https://www.postgresql.
!!! warning
-Other cloud providers claim to offer embeddings "inside the database", but [benchmarks](../../resources/benchmarks/mindsdb-vs-postgresml.md) show that they are orders of magnitude slower than PostgresML. The reason is they don't actually run inside the database with hardware acceleration. They are thin wrapper functions that make network calls to remote service providers. PostgresML is the only cloud that puts GPU hardware in the database for full acceleration, and it shows.
+Other cloud providers claim to offer embeddings "inside the database", but [benchmarks](/blog/mindsdb-vs-postgresml.md) show that they are orders of magnitude slower than PostgresML. The reason is they don't actually run inside the database with hardware acceleration. They are thin wrapper functions that make network calls to remote service providers. PostgresML is the only cloud that puts GPU hardware in the database for full acceleration, and it shows.
!!!
diff --git a/pgml-cms/docs/guides/embeddings/dimensionality-reduction.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/dimensionality-reduction.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/dimensionality-reduction.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/dimensionality-reduction.md
diff --git a/pgml-cms/docs/guides/embeddings/in-database-generation.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/in-database-generation.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/in-database-generation.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/in-database-generation.md
diff --git a/pgml-cms/docs/guides/embeddings/indexing-w-pgvector.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/indexing-w-pgvector.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/indexing-w-pgvector.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/indexing-w-pgvector.md
diff --git a/pgml-cms/docs/use-cases/embeddings/personalize-embedding-results-with-application-data-in-your-database.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/personalization.md
similarity index 100%
rename from pgml-cms/docs/use-cases/embeddings/personalize-embedding-results-with-application-data-in-your-database.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/personalization.md
diff --git a/pgml-cms/docs/guides/embeddings/proprietary-models.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/proprietary-models.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/proprietary-models.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/proprietary-models.md
diff --git a/pgml-cms/docs/guides/embeddings/re-ranking-nearest-neighbors.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/re-ranking-nearest-neighbors.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/re-ranking-nearest-neighbors.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/re-ranking-nearest-neighbors.md
diff --git a/pgml-cms/docs/guides/embeddings/vector-aggregation.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/vector-aggregation.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/vector-aggregation.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/vector-aggregation.md
diff --git a/pgml-cms/docs/guides/embeddings/vector-normalization.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/vector-normalization.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/vector-normalization.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/vector-normalization.md
diff --git a/pgml-cms/docs/guides/embeddings/vector-similarity.md b/pgml-cms/docs/open-source/pgml/guides/embeddings/vector-similarity.md
similarity index 100%
rename from pgml-cms/docs/guides/embeddings/vector-similarity.md
rename to pgml-cms/docs/open-source/pgml/guides/embeddings/vector-similarity.md
diff --git a/pgml-cms/docs/guides/improve-search-results-with-machine-learning.md b/pgml-cms/docs/open-source/pgml/guides/improve-search-results-with-machine-learning.md
similarity index 100%
rename from pgml-cms/docs/guides/improve-search-results-with-machine-learning.md
rename to pgml-cms/docs/open-source/pgml/guides/improve-search-results-with-machine-learning.md
diff --git a/pgml-cms/docs/guides/natural-language-processing.md b/pgml-cms/docs/open-source/pgml/guides/natural-language-processing.md
similarity index 100%
rename from pgml-cms/docs/guides/natural-language-processing.md
rename to pgml-cms/docs/open-source/pgml/guides/natural-language-processing.md
diff --git a/pgml-cms/docs/guides/supervised-learning.md b/pgml-cms/docs/open-source/pgml/guides/supervised-learning.md
similarity index 100%
rename from pgml-cms/docs/guides/supervised-learning.md
rename to pgml-cms/docs/open-source/pgml/guides/supervised-learning.md
diff --git a/pgml-cms/docs/guides/unified-rag.md b/pgml-cms/docs/open-source/pgml/guides/unified-rag.md
similarity index 94%
rename from pgml-cms/docs/guides/unified-rag.md
rename to pgml-cms/docs/open-source/pgml/guides/unified-rag.md
index ee7e38941..32ce81bb2 100644
--- a/pgml-cms/docs/guides/unified-rag.md
+++ b/pgml-cms/docs/open-source/pgml/guides/unified-rag.md
@@ -18,7 +18,7 @@ RAG has grown rapidly in popularity. It is not an esoteric practice run only by
As quick reminder, the typical modern RAG workflow looks like this:
-
Steps one through three prepare our RAG system, and steps four through eight are RAG itself.
+
Steps one through three prepare our RAG system, and steps four through eight are RAG itself.
## Unified RAG
@@ -48,7 +48,7 @@ Here is an example of the pgml.transform function
SELECT pgml.transform(
task => ''{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}''::JSONB,
inputs => ARRAY[''AI is going to''],
args => ''{
@@ -61,7 +61,7 @@ Here is another example of the pgml.transform function
SELECT pgml.transform(
task => ''{
"task": "text-generation",
- "model": "meta-llama/Meta-Llama-3-70B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"
}''::JSONB,
inputs => ARRAY[''AI is going to''],
args => ''{
@@ -142,9 +142,9 @@ SELECT * FROM chunks limit 10;
| id | chunk | chunk_index | document_id |
| ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------- | ------------- |
| 1 | Here is an example of the pgml.transform function | 1 | 1 |
-| 2 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 2 | 1 |
+| 2 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 2 | 1 |
| 3 | Here is another example of the pgml.transform function | 3 | 1 |
-| 4 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 4 | 1 |
+| 4 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 4 | 1 |
| 5 | Here is a third example of the pgml.transform function | 5 | 1 |
| 6 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 6 | 1 |
| 7 | ae94d3413ae82367c3d0592a67302b25 | 1 | 2 |
@@ -250,8 +250,8 @@ LIMIT 6;
| 1 | 0.09044166306461232 | Here is an example of the pgml.transform function |
| 3 | 0.10787954026965096 | Here is another example of the pgml.transform function |
| 5 | 0.11683694289239333 | Here is a third example of the pgml.transform function |
-| 2 | 0.17699128851412282 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
-| 4 | 0.17844729798760672 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 2 | 0.17699128851412282 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 4 | 0.17844729798760672 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
| 6 | 0.17520464423854842 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
!!!
@@ -327,8 +327,8 @@ FROM (
| cosine_distance | rank_score | chunk |
| -------------------- | -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| 0.2124727254737595 | 0.3427378833293915 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
-| 0.2109014406365579 | 0.342184841632843 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 0.2124727254737595 | 0.3427378833293915 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 0.2109014406365579 | 0.342184841632843 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
| 0.21259646694819168 | 0.3332781493663788 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
| 0.19483324929456136 | 0.03163915500044823 | Here is an example of the pgml.transform function |
| 0.1685870257610742 | 0.031176624819636345 | Here is a third example of the pgml.transform function |
@@ -402,7 +402,7 @@ SELECT
pgml.transform (
task => '{
"task": "conversational",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::jsonb,
inputs => ARRAY['{"role": "system", "content": "You are a friendly and helpful chatbot."}'::jsonb, jsonb_build_object('role', 'user', 'content', replace('Given the context answer the following question: How do I write a select statement with pgml.transform? Context:\n\n{CONTEXT}', '{CONTEXT}', chunk))],
args => '{
@@ -417,7 +417,7 @@ FROM
!!! results
```text
-["To write a SELECT statement with pgml.transform, you can use the following syntax:\n\n```sql\nSELECT pgml.transform(\n task => '{\n \"task\": \"text-generation\",\n \"model\": \"meta-llama/Meta-Llama-3-70B-Instruct\"\n }'::JSONB,\n inputs => ARRAY['AI is going to'],\n args => '{\n \"max_new_tokens\": 100\n }'::JSONB\n"]
+["To write a SELECT statement with pgml.transform, you can use the following syntax:\n\n```sql\nSELECT pgml.transform(\n task => '{\n \"task\": \"text-generation\",\n \"model\": \"meta-llama/Meta-Llama-3.1-70B-Instruct\"\n }'::JSONB,\n inputs => ARRAY['AI is going to'],\n args => '{\n \"max_new_tokens\": 100\n }'::JSONB\n"]
```
!!!
@@ -426,7 +426,7 @@ FROM
We have now combined the embedding api call, the semantic search api call, the rerank api call and the text generation api call from our RAG flow into one sql query.
-We are using `meta-llama/Meta-Llama-3-8B-Instruct` to perform text generation. We have a number of different models available for text generation, but for our use case `meta-llama/Meta-Llama-3-8B-Instruct` is a fantastic mix between speed and capability. For this simple example we are only passing the top search result as context to the LLM. In real world use cases, you will want to pass more results.
+We are using `meta-llama/Meta-Llama-3.1-8B-Instruct` to perform text generation. We have a number of different models available for text generation, but for our use case `meta-llama/Meta-Llama-3.1-8B-Instruct` is a fantastic mix between speed and capability. For this simple example we are only passing the top search result as context to the LLM. In real-world use cases, you will want to pass more results.
We can stream from the database by using the `pgml.transform_stream` function and cursors. Here is a query measuring time to first token.
@@ -486,7 +486,7 @@ SELECT
pgml.transform_stream(
task => '{
"task": "conversational",
- "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+ "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
}'::jsonb,
inputs => ARRAY['{"role": "system", "content": "You are a friendly and helpful chatbot."}'::jsonb, jsonb_build_object('role', 'user', 'content', replace('Given the context answer the following question: How do I write a select statement with pgml.transform? Context:\n\n{CONTEXT}', '{CONTEXT}', chunk))],
args => '{
diff --git a/pgml-cms/docs/guides/vector-database.md b/pgml-cms/docs/open-source/pgml/guides/vector-database.md
similarity index 97%
rename from pgml-cms/docs/guides/vector-database.md
rename to pgml-cms/docs/open-source/pgml/guides/vector-database.md
index a28d88218..bdc12a456 100644
--- a/pgml-cms/docs/guides/vector-database.md
+++ b/pgml-cms/docs/open-source/pgml/guides/vector-database.md
@@ -18,7 +18,7 @@ Vectors can be stored in columns, just like any other data type. To add a vector
#### Adding a vector column
-Using the example from [Tabular data](../resources/data-storage-and-retrieval/README.md), let's add a vector column to our USA House Prices table:
+Using the example from [Tabular data](../../../introduction/import-your-data/storage-and-retrieval/README.md), let's add a vector column to our USA House Prices table:
{% tabs %}
{% tab title="SQL" %}
@@ -288,4 +288,4 @@ CREATE INDEX
#### Maintaining an HNSW index
-HNSW requires little to no maintenance. When new vectors are added, they are automatically inserted at the optimal place in the graph. However, as the graph gets bigger, rebalancing it becomes more expensive, and inserting new rows becomes slower. We address this trade-off and how to solve this problem in [Partitioning](../resources/data-storage-and-retrieval/partitioning.md).
+HNSW requires little to no maintenance. When new vectors are added, they are automatically inserted at the optimal place in the graph. However, as the graph gets bigger, rebalancing it becomes more expensive, and inserting new rows becomes slower. We address this trade-off and how to solve this problem in [Partitioning](../../../introduction/import-your-data/storage-and-retrieval/partitioning.md).
diff --git a/pgml-cms/docs/resources/benchmarks/README.md b/pgml-cms/docs/resources/benchmarks/README.md
deleted file mode 100644
index ce4a798b7..000000000
--- a/pgml-cms/docs/resources/benchmarks/README.md
+++ /dev/null
@@ -1,2 +0,0 @@
-# Benchmarks
-
diff --git a/pgml-cms/docs/resources/benchmarks/making-postgres-30-percent-faster-in-production.md b/pgml-cms/docs/resources/benchmarks/making-postgres-30-percent-faster-in-production.md
deleted file mode 100644
index 030a84398..000000000
--- a/pgml-cms/docs/resources/benchmarks/making-postgres-30-percent-faster-in-production.md
+++ /dev/null
@@ -1,49 +0,0 @@
----
-description: >-
- Anyone who runs Postgres at scale knows that performance comes with trade
- offs.
----
-
-# Making Postgres 30 Percent Faster in Production
-
-Anyone who runs Postgres at scale knows that performance comes with trade offs. The typical playbook is to place a pooler like PgBouncer in front of your database and turn on transaction mode. This makes multiple clients reuse the same server connection, which allows thousands of clients to connect to your database without causing a fork bomb.
-
-Unfortunately, this comes with a trade off. Since multiple clients use the same server, they couldn't take advantage of prepared statements. Prepared statements are a way for Postgres to cache a query plan and execute it multiple times with different parameters. If you have never tried this before, you can run `pgbench` against your local DB and you'll see that `--protocol prepared` outperforms `simple` and `extended` by at least 30 percent. Giving up this feature has been a given for production deployments for as long as I can remember, but not anymore.
-
-## PgCat Prepared Statements
-
-Since [#474](https://github.com/postgresml/pgcat/pull/474), PgCat supports prepared statements in session and transaction mode. Our initial benchmarks show 30% increase over extended protocol (`--protocol extended`) and 15% against simple protocol (`--simple`). Most (all?) web frameworks use at least the extended protocol, so we are looking at a **30% performance increase across the board for everyone** who writes web apps and uses Postgres in production, by just switching to named prepared statements.
-
-In Rails apps, it's as simple as setting `prepared_statements: true`.
-
-This is not only a performance benefit, but also a usability improvement for client libraries that have to use prepared statements, like the popular Rust crate [SQLx](https://github.com/launchbadge/sqlx). Until now, the typical recommendation was to just not use a pooler.
-
-## Benchmark
-
-The benchmark was conducted using `pgbench` with 1, 10, 100 and 1000 clients sending millions of queries to PgCat, which itself was running on a different EC2 machine alongside the database. This is a simple setup often used in production. Another configuration sees a pooler use its own machine, which of course increases latency but improves on availability. The clients were on another EC2 machine to simulate the latency experienced in typical web apps deployed in Kubernetes, ECS, EC2 and others.
-
-Benchmark ran in transaction mode. Session mode is faster with fewer clients, but does not scale in production with more than a few hundred clients. Only `SELECT` statements (`-S` option) were used, since the typical `pgbench` benchmark uses a similar number of writes to reads, which is an atypical production workload. Most apps read 90% of the time, and write 10% of the time. Reads are where prepared statements truly shine.
-
-## Implementation
-
-PgCat implements an internal cache & mapping between clients' prepared statements and servers that may or may not have them. If a server has the prepared statement, PgCat just forwards the `Bind (F)`, `Execute (F)` and `Describe (F)` messages. If the server doesn't have the prepared statement, PgCat fetches it from the client cache & prepares it using the `Parse (F)` message. You can refer to [Postgres docs](https://www.postgresql.org/docs/current/protocol-flow.html) for a more detailed explanation of how the extended protocol works.
-
-An important feature of PgCat's implementation is that all prepared statements are renamed and assigned globally unique names. This means that clients that don't randomize their prepared statement names and expect it to be gone after they disconnect from the "Postgres server", work as expected (I put "Postgres server" in quotes because they are actually talking to a proxy that pretends to be a Postgres database). Typical error when using such clients with PgBouncer is `prepared statement "sqlx_s_2" already exists`, which is pretty confusing when you see it for the first time.
-
-## Metrics
-
-We've added two new metrics to the admin database: `prepare_cache_hit` and `prepare_cache_miss`. Prepare cache hits indicate that the prepared statement requested by the client already exists on the server. That's good because PgCat can just rewrite the messages and send them to the server immediately. Prepare cache misses indicate that PgCat had to issue a prepared statement call to the server, which requires additional time and decreases throughput. In the ideal scenario, the cache hits outnumber the cache misses by an order of magnitude. If they are the same or worse, the prepared statements are not being used correctly by the clients.
-
-Our benchmark had a 99.99% cache hit ratio, which is really good, but in production this number is likely to be lower. You can monitor your cache hit/miss ratios through the admin database by querying it with `SHOW SERVERS`.
-
-## Roadmap
-
-Our implementation is pretty simple and we are already seeing massive improvements, but we can still do better. A `Parse (F)` made prepared statement works, but if one prepares their statements using `PREPARE` explicitly, PgCat will ignore it and that query isn't likely to work outside of session mode.
-
-Another issue is explicit `DEALLOCATE` and `DISCARD` calls. PgCat doesn't detect them currently, and a client can potentially bust the server prepared statement cache without PgCat knowing about it. It's an easy enough fix to intercept and act on that query accordingly, but we haven't built that yet.
-
-Testing with `pgbench` is an artificial benchmark, which is good and bad. It's good because, other things being equal, we can demonstrate that one implementation & configuration of the database/pooler cluster is superior to another. It's bad because in the real world, the results can differ. We are looking for users who would be willing to test our implementation against their production traffic and tell us how we did. This feature is optional and can be enabled & disabled dynamically, without restarting PgCat, with `prepared_statements = true` in `pgcat.toml`.
diff --git a/pgml-cms/docs/resources/benchmarks/million-requests-per-second.md b/pgml-cms/docs/resources/benchmarks/million-requests-per-second.md
deleted file mode 100644
index 716b91eba..000000000
--- a/pgml-cms/docs/resources/benchmarks/million-requests-per-second.md
+++ /dev/null
@@ -1,234 +0,0 @@
----
-description: >-
- The question "Does it Scale?" has become somewhat of a meme in software
- engineering.
----
-
-# Scaling to 1 Million Requests per Second
-
-The question "Does it Scale?" has become somewhat of a meme in software engineering. There is a good reason for it though, because most businesses plan for success. If your app, online store, or SaaS becomes popular, you want to be sure that the system powering it can serve all your new customers.
-
-At PostgresML, we are very concerned with scale. Our engineering background took us through scaling PostgreSQL to 100 TB+, so we're certain that it scales, but could we scale machine learning alongside it?
-
-In this post, we'll discuss how we horizontally scale PostgresML to achieve more than **1 million XGBoost predictions per second** on commodity hardware.
-
-If you missed our previous post and are wondering why someone would combine machine learning and Postgres, take a look at our PostgresML vs. Python benchmark.
-
-## Architecture Overview
-
-If you're familiar with how one runs PostgreSQL at scale, you can skip straight to the [results](../../benchmarks/broken-reference/).
-
-Part of our thesis, and the reason why we chose Postgres as our host for machine learning, is that scaling machine learning inference is very similar to scaling read queries in a typical database cluster.
-
-Inference speed varies based on the model complexity (e.g. `n_estimators` for XGBoost) and the size of the dataset (how many features the model uses), which is analogous to query complexity and table size in the database world and, as we'll demonstrate further on, scaling the latter is mostly a solved problem.
-
-_System Architecture_
-
-| Component | Description |
-| --------- | --------------------------------------------------------------------------------------------------------- |
-| Clients | Regular Postgres clients |
-| ELB | [Elastic Network Load Balancer](https://aws.amazon.com/elasticloadbalancing/) |
-| PgCat | A Postgres [pooler](https://github.com/levkk/pgcat/) with built-in load balancing, failover, and sharding |
-| Replica | Regular Postgres [replicas](https://www.postgresql.org/docs/current/high-availability.html) |
-| Primary | Regular Postgres primary |
-
-Our architecture has four components that may need to scale up or down based on load:
-
-1. Clients
-2. Load balancer
-3. [PgCat](https://github.com/levkk/pgcat/) pooler
-4. Postgres replicas
-
-We intentionally don't discuss scaling the primary in this post, because sharding, which is the most effective way to do so, is a fascinating subject that deserves its own series of posts. Spoiler alert: we sharded Postgres without any problems.
-
-### Clients
-
-Clients are regular Postgres connections coming from web apps, job queues, or pretty much anywhere that needs data. They can be long-living or ephemeral and they typically grow in number as the application scales.
-
-Most modern deployments use containers which are added as load on the app increases, and removed as the load decreases. This is called dynamic horizontal scaling, and it's an effective way to adapt to changing traffic patterns experienced by most businesses.
-
-### Load Balancer
-
-The load balancer is a way to spread traffic across horizontally scalable components, by routing new connections to targets in a round robin (or random) fashion. It's typically a very large box (or a fast router), but even those need to be scaled if traffic suddenly increases. Since we're running our system on AWS, this is already taken care of, for a reasonably small fee, by using an Elastic Load Balancer.
-
-### PgCat
-
-If you've used Postgres in the past, you know that it can't handle many concurrent connections. For large deployments, it's necessary to run something we call a pooler. A pooler routes thousands of clients to only a few dozen server connections by time-sharing when a client can use a server. Because most queries are very quick, this is a very effective way to run Postgres at scale.
-
-There are many poolers available presently, the most notable being PgBouncer, which has been around for a very long time, and is trusted by many large organizations. Unfortunately, it hasn't evolved much with the growing needs of highly available Postgres deployments, so we wrote [our own](https://github.com/levkk/pgcat/) which added important functionality we needed:
-
-* Load balancing of read queries
-* Failover in case a read replica is broken
-* Sharding (this feature is still being developed)
-
-In this benchmark, we used its load balancing feature to evenly distribute XGBoost predictions across our Postgres replicas.
-
-### Postgres Replicas
-
-Scaling Postgres reads is pretty straight forward. If more read queries are coming in, we add a replica to serve the increased load. If the load is decreasing, we remove a replica to save money. The data is replicated from the primary, so all replicas are identical, and all of them can serve any query, or in our case, an XGBoost prediction. PgCat can dynamically add and remove replicas from its config without disconnecting clients, so we can add and remove replicas as needed, without downtime.
-
-#### Parallelizing XGBoost
-
-Scaling XGBoost predictions is a little bit more interesting. XGBoost cannot serve predictions concurrently because of internal data structure locks. This is common to many other machine learning algorithms as well, because making predictions can temporarily modify internal components of the model.
-
-PostgresML bypasses that limitation because of how Postgres itself handles concurrency:
-
-_PostgresML concurrency_
-
-PostgreSQL uses the fork/multiprocessing architecture to serve multiple clients concurrently: each new client connection becomes an independent OS process. During connection startup, PostgresML loads all models inside the process' memory space. This means that each connection has its own copy of the XGBoost model and PostgresML ends up serving multiple XGBoost predictions at the same time without any lock contention.
-
-## Results
-
-We ran over a 100 different benchmarks, by changing the number of clients, poolers, replicas, and XGBoost predictions we requested. The benchmarks were meant to test the limits of each configuration, and what remediations were needed in each scenario. Our raw data is available below.
-
-One of the tests we ran used 1,000 clients, which were connected to 1, 2, and 5 replicas. The results were exactly what we expected.
-
-### Linear Scaling
-
-
-_Latency_
-
-_Throughput_
-
-Both latency and throughput, the standard measurements of system performance, scale mostly linearly with the number of replicas. Linear scaling is the north star of all horizontally scalable systems, and most are not able to achieve it because of increasing complexity that comes with synchronization.
-
-Our architecture shares nothing and requires no synchronization. The replicas don't talk to each other and the poolers don't either. Every component has the knowledge it needs (through configuration) to do its job, and they do it well.
-
-The most impressive result is serving close to a million predictions with an average latency of less than 1ms. You might notice though that `950160.7` isn't quite one million, and that's true. We couldn't reach one million with 1000 clients, so we increased to 2000 and got our magic number: **1,021,692.7 req/sec**, with an average latency of **1.7ms**.
-
-### Batching Predictions
-
-Batching is a proven method to optimize performance. If you need to get several data points, batch the requests into one query, and it will run faster than making individual requests.
-
-We should precede this result by stating that PostgresML does not yet have a batch prediction API as such. Our `pgml.predict()` function can predict multiple points, but we haven't implemented a query pattern to pass multiple rows to that function at the same time. Once we do, based on our tests, we should see a substantial increase in batch prediction performance.
-
-Regardless of that limitation, we still managed to get better results by batching queries together since Postgres needed to do less query parsing and searching, and we saved on network round trip time as well.
-
-If batching did not work at all, we would see a linear increase in latency and a linear decrease in throughput. That did not happen; instead, we got a 1.5x improvement by batching 5 predictions together, and a 1.2x improvement by batching 20. A modest success, but a success nonetheless.
-
-### Graceful Degradation and Queuing
-
-All systems, at some point in their lifetime, will come under more load than they were designed for; what happens then is an important feature (or bug) of their design. Horizontal scaling is never immediate: it takes a bit of time to spin up additional hardware to handle the load. It can take a second, or a minute, depending on availability, but in both cases, existing resources need to serve traffic the best way they can.
-
-We were hoping to test PostgresML to its breaking point, but we couldn't quite get there. As the load (number of clients) increased beyond provisioned capacity, the only thing we saw was a gradual increase in latency. Throughput remained roughly the same. This gradual latency increase was caused by simple queuing: the replicas couldn't serve requests concurrently, so the requests had to patiently wait in the poolers.
-
-_"What's taking so long over there!?"_
-
-Among many others, this is a very important feature of any proxy: it's a FIFO queue (first in, first out). If the system is underutilized, queue size is 0 and all requests are served as quickly as physically possible. If the system is overutilized, the queue size increases, holds as the number of requests stabilizes, and decreases back to 0 as the system is scaled up to accommodate new traffic.
-
-Queueing overall is not desirable, but it's a feature, not a bug. While autoscaling spins up an additional replica, the app continues to work, although a few milliseconds slower, which is a good trade off for not overspending on hardware.
-
-As the demand on PostgresML increases, the system gracefully handles the load. If the number of replicas stays the same, latency slowly increases, all the while remaining well below acceptable ranges. Throughput holds as well, as increasing number of clients evenly split available resources.
-
-If we increase the number of replicas, latency decreases and throughput increases, as the number of clients increases in parallel. We get the best result with 5 replicas, but this number is variable and can be changed as needs for latency compete with cost.
-
-## What's Next
-
-Horizontal scaling and high availability are fascinating topics in software engineering. Needing to serve 1 million predictions per second is rare, but having the ability to do that, and more if desired, is an important aspect for any new system.
-
-The next challenge for us is to scale writes horizontally. In the database world, this means sharding the database into multiple separate machines using a hashing function, and automatically routing both reads and writes to the right shards. There are many possible solutions on the market for this already, e.g. Citus and Foreign Data Wrappers, but none are as horizontally scalable as we'd like, although we will incorporate them into our architecture until we build the one we really want.
-
-For that purpose, we're building our own open source [Postgres proxy](https://github.com/levkk/pgcat/) which we discussed earlier in the article. As we progress further in our journey, we'll be adding more features and performance improvements.
-
-By combining PgCat with PostgresML, we are aiming to build the next generation of machine learning infrastructure that can power anything from tiny startups to unicorns and massive enterprises, without the data ever leaving our favorite database.
-
-## Methodology
-
-### ML
-
-This time, we used an XGBoost model with 100 trees:
-
-```postgresql
-SELECT * FROM pgml.train(
- 'flights',
- task => 'regression',
- relation_name => 'flights_mat_3',
- y_column_name => 'depdelayminutes',
- algorithm => 'xgboost',
- hyperparams => '{"n_estimators": 100 }',
- runtime => 'rust'
-);
-```
-
-and fetched our predictions the usual way:
-
-```postgresql
-SELECT pgml.predict(
- 'flights',
- ARRAY[
- year,
- quarter,
- month,
- distance,
- dayofweek,
- dayofmonth,
- flight_number_operating_airline,
- originairportid,
- destairportid,
- flight_number_marketing_airline,
- departure
- ]
-) AS prediction
-FROM flights_mat_3 LIMIT :limit;
-```
-
-where `:limit` is the batch size of 1, 5, and 20.
-
-#### Model
-
-The model is roughly the same as the one we used in our previous post, with just one extra feature added, which improved R2 a little bit.
-
-### Hardware
-
-#### Client
-
-The client was a `c5n.4xlarge` box on EC2. We chose the `c5n` class to have the 100 GBit NIC, since we wanted it to saturate our network as much as possible. Thousands of clients were simulated using [`pgbench`](https://www.postgresql.org/docs/current/pgbench.html).
-
-#### PgCat Pooler
-
-PgCat, written in asynchronous Rust, was running on `c5.xlarge` machines (4 vCPUs, 8GB RAM) with 4 Tokio workers. We used between 1 and 35 machines, and scaled them in increments of 5-20 at a time.
-
-The pooler did a decent amount of work around parsing queries, making sure they are read-only `SELECT`s, and routing them, at random, to replicas. If any replica was down for any reason, it would route around it to remaining machines.
-
-#### Postgres Replicas
-
-Postgres replicas were running on `c5.9xlarge` machines with 36 vCPUs and 72 GB of RAM. The hot dataset fits entirely in memory. The servers were intentionally saturated to maximum capacity before scaling up to test queuing and graceful degradation of performance.
-
-#### Raw Results
-
-Raw latency data is available [here](https://static.postgresml.org/benchmarks/reads-latency.csv) and raw throughput data is available [here](https://static.postgresml.org/benchmarks/reads-throughput.csv).
-
-## Call to Early Adopters
-
-[PostgresML](https://github.com/postgresml/postgresml/) and [PgCat](https://github.com/levkk/pgcat/) are free and open source. If your organization can benefit from simplified and fast machine learning, get in touch! We can help deploy PostgresML internally, and collaborate on new and existing features. Join our [Discord](https://discord.gg/DmyJP3qJ7U) or [email](mailto:team@postgresml.org) us!
-
-Many thanks and ❤️ to all those who are supporting this endeavor. We’d love to hear feedback from the broader ML and Engineering community about applications and other real world scenarios to help prioritize our work. You can show your support by starring us on our [Github](https://github.com/postgresml/postgresml/).
diff --git a/pgml-cms/docs/resources/benchmarks/mindsdb-vs-postgresml.md b/pgml-cms/docs/resources/benchmarks/mindsdb-vs-postgresml.md
deleted file mode 100644
index c82d4eea1..000000000
--- a/pgml-cms/docs/resources/benchmarks/mindsdb-vs-postgresml.md
+++ /dev/null
@@ -1,293 +0,0 @@
----
-description: "Compare two projects that both aim to provide an SQL interface to ML algorithms and the data they require."
----
-
-# MindsDB vs PostgresML
-
-## Introduction
-
-There are many ways to do machine learning with data in a SQL database. In this article, we'll compare 2 projects that both aim to provide a SQL interface to machine learning algorithms and the data they require: **MindsDB** and **PostgresML**. We'll look at how they work, what they can do, and how they compare to each other. The **TLDR** is that PostgresML is more opinionated, more scalable, more capable and several times faster than MindsDB. On the other hand, MindsDB is 5 times more mature than PostgresML according to age and GitHub Stars. What are the important factors?
-
-_We're occasionally asked what the difference is between PostgresML and MindsDB. We'd like to answer that question at length, and let you decide if the reasoning is fair._
-
-### At a glance
-
-Both projects are Open Source, although PostgresML allows for more permissive use with the MIT license, compared to the GPL-3.0 license used by MindsDB. PostgresML is also a significantly newer project, with the first commit in 2022, compared to MindsDB which has been around since 2017, but one of the first hints at the real differences between the two projects is the choice of programming languages. MindsDB is implemented in Python, while PostgresML is implemented with Rust. I say _in_ Python, because it's a language with a runtime, and _with_ Rust, because it's a language with a compiler that does not require a Runtime. We'll see how this difference in implementation languages leads to different outcomes.
-
-| | MindsDB | PostgresML |
-| -------- | ------- | ---------- |
-| Age | 5 years | 1 year |
-| License | GPL-3.0 | MIT |
-| Language | Python | Rust |
-
-### Algorithms
-
-Both projects integrate several dozen machine learning algorithms, including the latest LLMs from Hugging Face.
-
-| | MindsDB | PostgresML |
-| ----------------- | ------- | ---------- |
-| Classification | ✅ | ✅ |
-| Regression | ✅ | ✅ |
-| Time Series | ✅ | ✅ |
-| LLM Support | ✅ | ✅ |
-| Embeddings | - | ✅ |
-| Vector Support | - | ✅ |
-| Full Text Search | - | ✅ |
-| Geospatial Search | - | ✅ |
-
-Both MindsDB and PostgresML support many classical machine learning algorithms to do classification and regression. They are both able to load ~~the latest LLMs~~ some models from Hugging Face, supported by underlying implementations in libtorch. I had to cross that out after exploring all the caveats in the MindsDB implementations. PostgresML supports the models released immediately as long as underlying dependencies are met. MindsDB has to release an update to support any new models, and their current model support is extremely limited. New algorithms, tasks, and models are constantly released, so it's worth checking the documentation for the latest list.
-
-Another difference is that PostgresML also supports embedding models, and closely integrates them with vector search inside the database, which is well beyond the scope of MindsDB, since it's not a database at all. PostgresML has direct access to all the functionality provided by other Postgres extensions, like vector indexes from [pgvector](https://github.com/pgvector/pgvector) to perform efficient KNN & ANN vector recall, or [PostGIS](http://postgis.net/) for geospatial information as well as built in full text search. Multiple algorithms and extensions can be combined in compound queries to build state-of-the-art systems, like search and recommendations or fraud detection that generate an end to end result with a single query, something that might take a dozen different machine learning models and microservices in a more traditional architecture.
-
-### Architecture
-
-The architectural implementations of these projects are significantly different. PostgresML takes a data centric approach with Postgres as the provider for both storage _and_ compute. To provide horizontal scalability for inference, the PostgresML team has also created [PgCat](https://github.com/postgresml/pgcat) to distribute workloads across many Postgres databases. On the other hand, MindsDB takes a service oriented approach that connects to various databases over the network.
-
-
-
-| | MindsDB | PostgresML |
-| ------------- | ------------- | ---------- |
-| Data Access | Over the wire | In process |
-| Multi Process | ✅ | ✅ |
-| Database | - | ✅ |
-| Replication | - | ✅ |
-| Sharding | - | ✅ |
-| Cloud Hosting | ✅ | ✅ |
-| On Premise | ✅ | ✅ |
-| Web UI | ✅ | ✅ |
-
-The difference in architecture leads to different tradeoffs and challenges. There are already hundreds of ways to get data into and out of a Postgres database, from just about every other service, language and platform, which makes PostgresML highly compatible with other application workflows. On the other hand, the MindsDB Python service accepts connections from specifically supported clients like `psql` and provides a pseudo-SQL interface to the functionality. The service will parse incoming MindsDB commands that look similar to SQL (but are not), for tasks like configuring database connections, or doing actual machine learning. These commands typically have what looks like a sub-select, which will actually fetch data over the wire from configured databases for machine learning training and inference.
-
-MindsDB is actually a pretty standard Python microservice based architecture that separates data from compute over the wire, just with an SQL like API, instead of gRPC or REST. MindsDB isn't actually a DB at all, but rather an ML service with adapters for just about every database that Python can connect to.
-
-On the other hand, PostgresML runs ML algorithms inside the database itself. It shares memory with the database, and can access data directly, using pointers to avoid the serialization and networking overhead that frequently dominates data hungry machine learning applications. Rust is an important language choice for PostgresML because its memory safety simplifies the effort required to achieve stability along with performance in a large and complex memory space. The "tradeoff" is that it requires a Postgres database to actually host the data it operates on.
-
-In addition to the extension, PostgresML relies on PgCat to scale Postgres clusters horizontally using both sharding and replication strategies to provide both scalable compute and storage. Scaling a low latency and high availability feature store is often the most difficult operational challenge for Machine Learning applications. That's the primary driver of PostgresML's architectural choices. MindsDB leaves those issues as an exercise for the adopter, while also introducing a new single service bottleneck for ML compute implemented in Python.
-
-## Benchmarks
-
-If you missed our previous article benchmarking PostgresML vs Python Microservices, spoiler alert, PostgresML is 8-40x faster than Python microservice architectures that do the same thing, even if they use "specialized" in-memory databases like Redis. The network transit cost as well as data serialization is a major cost for data hungry machine learning algorithms. Since MindsDB doesn't actually provide a DB, we'll create a synthetic benchmark that doesn't use stored data in a database (even though that's the whole point of SQL ML, right?). This will negate the network serialization and transit costs a MindsDB service would typically incur, and highlight the performance differences between Python and Rust implementations.
-
-#### PostgresML
-
-We'll connect to our Postgres server running locally:
-
-```commandline
-psql postgres://postgres:password@127.0.0.1:5432
-```
-
-For both implementations, we can just pass in our data as part of the query for an apples to apples performance comparison. PostgresML adds the `pgml.transform` function, that takes an array of inputs to transform, given a task and model, without any setup beyond installing the extension. Let's see how long it takes to run a sentiment analysis model on a single sentence:
-
-!!! generic
-
-!!! code\_block time="4769.337 ms"
-
-```postgresql
-SELECT pgml.transform(
- inputs => ARRAY[
- 'I am so excited to benchmark deep learning models in SQL. I can not wait to see the results!'
- ],
- task => '{
- "task": "text-classification",
- "model": "cardiffnlp/twitter-roberta-base-sentiment"
- }'::JSONB
-);
-```
-
-!!!
-
-!!! results
-
-| positivity |
-| ---------------------------------------------------- |
-| \[{"label": "LABEL\_2", "score": 0.990081250667572}] |
-
-!!!
-
-!!!
-
-The first time `transform` is run with a particular model name, it will download that pretrained transformer from HuggingFace, and load it into RAM, or VRAM if a GPU is available. In this case, that took about 5 seconds, but let's see how fast it is now that the model is cached.
-
-!!! generic
-
-!!! code\_block time="45.094 ms"
-
-```postgresql
-SELECT pgml.transform(
- inputs => ARRAY[
- 'I don''t really know if 5 seconds is fast or slow for deep learning. How much time is spent downloading vs running the model?'
- ],
- task => '{
- "task": "text-classification",
- "model": "cardiffnlp/twitter-roberta-base-sentiment"
- }'::JSONB
-);
-```
-
-!!!
-
-!!! results
-
-| transform |
-| ------------------------------------------------------ |
-| \[{"label": "LABEL\_1", "score": 0.49658918380737305}] |
-
-!!!
-
-!!!
-
-45ms is below the level of human perception, so we could use a deep learning model like this to build an interactive application that feels instantaneous to our users. It's worth noting that PostgresML will automatically use a GPU if it's available. This benchmark machine includes an NVIDIA RTX 3090. We can also check the speed on CPU only, by setting the `device` argument to `cpu`:
-
-!!! generic
-
-!!! code\_block time="165.036 ms"
-
-```postgresql
-SELECT pgml.transform(
- inputs => ARRAY[
- 'Are GPUs really worth it? Sometimes they are more expensive than the rest of the computer combined.'
- ],
- task => '{
- "task": "text-classification",
- "model": "cardiffnlp/twitter-roberta-base-sentiment",
- "device": "cpu"
- }'::JSONB
-);
-```
-
-!!!
-
-!!! results
-
-| transform |
-| ----------------------------------------------------- |
-| \[{"label": "LABEL\_0", "score": 0.7333963513374329}] |
-
-!!!
-
-!!!
-
-The GPU is able to run this model about 4x faster than the i9-13900K with 24 cores.
-
-#### Model Outputs
-
-You might have noticed that the `inputs` the model was analyzing got less positive over time, and the model moved from `LABEL_2` to `LABEL_1` to `LABEL_0`. Some models use more descriptive outputs, but in this case I had to look at the [README](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment/blob/main/README.md) to see what the labels represent.
-
-Labels:
-
-* 0 -> Negative
-* 1 -> Neutral
-* 2 -> Positive
-
-It looks like this model did correctly pick up on the decreasing enthusiasm in the text, so not only is it relatively fast on a GPU, it's usefully accurate. Another thing to consider when it comes to model quality is that this model was trained on tweets, and these inputs were chosen to be about as long and complex as a tweet. It's not always clear how well a model will generalize to novel looking inputs, so it's always important to do a little reading about a model when you're looking for ways to test and improve the quality of its output.
-
-#### MindsDB
-
-MindsDB requires a bit more setup than just the database, but I'm running it on the same machine with the latest version. I'll also use the same model, so we can compare apples to apples.
-
-```commandline
-python -m mindsdb --api postgres
-```
-
-Then we can connect to this Python service with our Postgres client:
-
-```
-psql postgres://mindsdb:123@127.0.0.1:55432
-```
-
-And turn timing on to see how long it takes to run the same query:
-
-```postgresql
-\timing on
-```
-
-And now we can issue some MindsDB pseudo sql:
-
-!!! code\_block time="277.722 ms"
-
-```
-CREATE MODEL mindsdb.sentiment_classifier
-PREDICT sentiment
-USING
- engine = 'huggingface',
- task = 'text-classification',
- model_name = 'cardiffnlp/twitter-roberta-base-sentiment',
- input_column = 'text',
- labels = ['negativ', 'neutral', 'positive'];
-```
-
-!!!
-
-This kicked off a background job in the Python service to download the model and set it up, which took about 4 seconds judging from the logs, but I don't have an exact time for exactly when the model became "status: complete" and was ready to handle queries.
-
-Now we can write a query that will make a prediction similar to PostgresML, using the same Huggingface model.
-
-!!! generic
-
-!!! code\_block time="741.650 ms"
-
-```
-SELECT *
-FROM mindsdb.sentiment_classifier
-WHERE text = 'I am so excited to benchmark deep learning models in SQL. I can not wait to see the results!'
-```
-
-!!!
-
-!!! results
-
-| sentiment | sentiment\_explain | text |
-| --------- | -------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |
-| positive | {"positive": 0.990081250667572, "neutral": 0.008058485575020313, "negativ": 0.0018602772615849972} | I am so excited to benchmark deep learning models in SQL. I can not wait to see the results! |
-
-!!!
-
-!!!
-
-Since we've provided the MindsDB model with more human-readable labels, they're reusing those (including the negativ typo), and returning all three scores along with the input by default. However, this seems to be a bit slower than anything we've seen so far. Let's try to speed it up by only returning the label without the full sentiment\_explain.
-
-!!! generic
-
-!!! code\_block time="841.936 ms"
-
-```
-SELECT sentiment
-FROM mindsdb.sentiment_classifier
-WHERE text = 'I am so excited to benchmark deep learning models in SQL. I can not wait to see the results!'
-```
-
-!!!
-
-!!! results
-
-| sentiment |
-| --------- |
-| positive |
-
-!!!
-
-!!!
-
-It's not the sentiment\_explain that's slowing it down. I spent several hours debugging, and learned a lot more about the internal Python service architecture. I've confirmed that even though inside the Python service, `torch.cuda.is_available()` returns `True` when the service starts, I never see a Python process use the GPU with `nvidia-smi`. MindsDB also claims to run on GPU, but I haven't been able to find any documentation, or indication in the code, of why it doesn't "just work". I'm stumped on this front, but I think it's fair to assume this is a pure CPU benchmark.
-
-The other thing I learned trying to get this working is that MindsDB isn't just a single Python process. Python famously has a GIL that will impair parallelism, so the MindsDB team has cleverly built a service that can run multiple Python processes in parallel. This is great for scaling out, but it means that our query is serialized to JSON and sent to a worker, and then the worker actually runs the model and sends the results back to the parent, again as JSON, which as far as I can tell is where the 5x slow-down is happening.
-
-## Results
-
-PostgresML is the clear winner in terms of performance. It seems to me that it currently also supports more models, with a looser function API than the pseudo SQL required to create a MindsDB model. You'll notice the output structure for models on HuggingFace can vary widely. I tried several not listed in the MindsDB documentation, but received errors on creation. PostgresML just returns the model's output without restructuring, so it's able to handle more discrepancies, although that does leave it up to the end user to sort out how to use models.
-
-| task | model | MindsDB | PostgresML CPU | PostgresML GPU |
-| ----------------------- | ----------------------------------------- | ------- | -------------- | -------------- |
-| text-classification | cardiffnlp/twitter-roberta-base-sentiment | 741 | 165 | 45 |
-| translation\_en\_to\_es | t5-base | 1573 | 1148 | 294 |
-| summarization | sshleifer/distilbart-cnn-12-6 | 4289 | 3450 | 479 |
-
-There is a general trend: the larger and slower the model is, the more work is spent inside libtorch, and the less the performance of the rest matters. For interactive models and use cases, though, there is a significant difference. We've tried to cover the most generous use case we could between these two. If we were to compare XGBoost or other classical algorithms, which can have sub-millisecond prediction times in PostgresML, MindsDB's 20ms Python service overhead just to parse the incoming query would make it hundreds of times slower.
-
-## Clouds
-
-Setting these services up is a bit of work, even for someone heavily involved in the day-to-day machine learning mayhem. Managing machine learning services and databases at scale requires a significant investment over time. Both services are available in the cloud, so let's see how they compare on that front as well.
-
-MindsDB is available on the AWS marketplace on top of your own hardware instances. You can scale it out and configure your data sources through their Web UI, very similar to the local installation, but you'll also need to figure out your data sources and how to scale them for machine learning workloads. Good luck!
-
-PostgresML is available as a fully managed database service, that includes the storage, backups, metrics, and scalability through PgCat that large ML deployments need. End-to-end machine learning is rarely just about running the models, and often more about scaling the data pipelines and managing the data infrastructure around them, so in this case PostgresML also provides a large service advantage, whereas with MindsDB, you'll still need to figure out your cloud data storage solution independently.
diff --git a/pgml-cms/docs/resources/benchmarks/postgresml-is-8-40x-faster-than-python-http-microservices.md b/pgml-cms/docs/resources/benchmarks/postgresml-is-8-40x-faster-than-python-http-microservices.md
deleted file mode 100644
index c5812fd56..000000000
--- a/pgml-cms/docs/resources/benchmarks/postgresml-is-8-40x-faster-than-python-http-microservices.md
+++ /dev/null
@@ -1,177 +0,0 @@
----
-description: PostgresML is a simpler alternative to that ever-growing complexity.
----
-
-# PostgresML is 8-40x faster than Python HTTP microservices
-
-Machine learning architectures can be some of the most complex, expensive and _difficult_ arenas in modern systems. The number of technologies and the amount of required hardware compete for tightening headcount, hosting, and latency budgets. Unfortunately, the trend in the industry is only getting worse along these lines, with increased usage of state-of-the-art architectures that center around data warehouses, microservices and NoSQL databases.
-
-PostgresML is a simpler alternative to that ever-growing complexity. In this post, we explore some additional performance benefits of a more elegant architecture and discover that PostgresML outperforms traditional Python microservices by a **factor of 8** in local tests and by a **factor of 40** on AWS EC2.
-
-## Candidate architectures
-
-To consider Python microservices with every possible advantage, our first benchmark is run with Python and Redis located on the same machine. Our goal is to avoid any additional network latency, which puts it on a more even footing with PostgresML. Our second test takes place on AWS EC2, with Redis and Gunicorn separated by a network; this benchmark proves to be relatively devastating.
-
-The full source code for both benchmarks is available on [Github](https://github.com/postgresml/postgresml/tree/master/pgml-cms/docs/blog/benchmarks/python\_microservices\_vs\_postgresml).
-
-### PostgresML
-
-PostgresML architecture is composed of:
-
-1. A PostgreSQL server with PostgresML v2.0
-2. [pgbench](https://www.postgresql.org/docs/current/pgbench.html) SQL client
-
-### Python
-
-Python architecture is composed of:
-
-1. A Flask/Gunicorn server accepting and returning JSON
-2. CSV file with the training data
-3. Redis feature store with the inference dataset, serialized with JSON
-4. [ab](https://httpd.apache.org/docs/2.4/programs/ab.html) HTTP client
-
-### ML
-
-Both architectures host the same XGBoost model, running predictions against the same dataset. See [Methodology](../../benchmarks/broken-reference/) for more details.
-
-## Results
-
-### Throughput
-
-
-
-Throughput is defined as the number of XGBoost predictions the architecture can serve per second. In this benchmark, PostgresML outperformed Python and Redis, running on the same machine, by a **factor of 8**.
-
-In Python, most of the bottleneck comes from having to fetch and deserialize Redis data. Since the features are externally stored, they need to be passed through Python and into XGBoost. XGBoost itself is written in C++, and its Python library only provides a convenient interface. The prediction coming out of XGBoost has to go through Python again, serialized as JSON, and sent via HTTP to the client.
-
-This is pretty much the bare minimum amount of work you can do for an inference microservice.
-
-PostgresML, on the other hand, collocates data and compute. It fetches data from a Postgres table, which already comes in a standard floating point format, and the Rust inference layer forwards it to XGBoost via a pointer.
-
-An interesting thing happened when the benchmark hit 20 clients: PostgresML throughput started to decrease quickly. This may be surprising to some, but to Postgres enthusiasts it's a known issue: Postgres isn't very good at handling more concurrent active connections than CPU threads. To mitigate this, we introduced PgBouncer (a Postgres proxy and pooler) in front of the database, and the throughput increased back up, and continued to hold as we went to 100 clients.
-
-It's worth noting that the benchmarking machine had only 16 available CPU threads (8 cores). If more cores were available, the bottleneck would only occur with more clients. The general recommendation for Postgres servers is to open around 2 connections per available CPU core, although newer versions of PostgreSQL have been incrementally chipping away at this limitation.
-
-#### Why throughput is important
-
-Throughput allows you to do more with less. If you're able to serve 30,000 queries per second using a single machine, but only using 1,000 today, you're unlikely to need an upgrade anytime soon. On the other hand, if the system can only serve 5,000 requests, an expensive and possibly stressful upgrade is in your near future.
-
-### Latency
-
-
-
-Latency is defined as the time it takes to return a single XGBoost prediction. Since most systems have limited resources, throughput directly impacts latency (and vice versa). If there are many active requests, clients waiting in the queue take longer to be serviced, and overall system latency increases.
-
-In this benchmark, PostgresML outperformed Python by a **factor of 8** as well. You'll note the same issue happens at 20 clients, and the same mitigation using PgBouncer reduces its impact. Meanwhile, Python's latency continues to increase substantially.
-
-Latency is a good metric to use when describing the performance of an architecture. In other words, if I were to use this service, I would get a prediction back in at most this long, irrespective of how many other clients are using it.
-
-#### Why latency is important
-
-Latency is important in machine learning services because they are often running as an addition to the main application, and sometimes have to be accessed multiple times during the same HTTP request.
-
-Let's take the example of an e-commerce website. A typical storefront wants to show many personalization models concurrently. Examples of such models could include "buy it again" recommendations for recurring purchases (binary classification), or "popular items in your area" (geographic clustering of purchase histories) or "customers like you bought this item" (nearest neighbour model).
-
-All of these models are important because they have been proven, over time, to be very successful at driving purchases. If inference latency is high, the models start to compete for very expensive real estate, front page and checkout, and the business has to drop some of them or, more likely, suffer from slow page loads. Nobody likes a slow app when they are trying to order groceries or dinner.
-
-### Memory utilization
-
-
-
-Python is known for using more memory than more optimized languages and, in this case, it uses **7 times** more than PostgresML.
-
-PostgresML is a Postgres extension, and it shares RAM with the database server. Postgres is very efficient at fetching and allocating only the memory it needs: it reuses `shared_buffers` and OS page cache to store rows for inference, and requires very little to no memory allocation to serve queries.
-
-Meanwhile, Python must allocate memory for each feature it receives from Redis and for each HTTP response it returns. This benchmark did not measure Redis memory utilization, which is an additional and often substantial cost of running traditional machine learning microservices.
-
-#### Training
-
-
-
-Since Python often uses Pandas to load and preprocess data, it is notably more memory hungry. Before even passing the data into XGBoost, we were already at 8GB RSS (resident set size); during actual fitting, memory utilization went to almost 12GB. This test is another best case scenario for Python, since the data has already been preprocessed, and was merely passed on to the algorithm.
-
-Meanwhile, PostgresML enjoys sharing RAM with the Postgres server and only allocates the memory needed by XGBoost. The dataset size was significant, but we managed to train the same model using only 5GB of RAM. PostgresML therefore allows training models on datasets at least twice as large as Python, all the while using identical hardware.
-
-#### Why memory utilization is important
-
-This is another example of doing more with less. Most machine learning algorithms, outside of FAANG and research universities, require the dataset to fit into the memory of a single machine. Distributed training is not where we want it to be, and there is still so much value to be extracted from simple linear regressions.
-
-Using less RAM allows you to train larger and better models on larger and more complete datasets. If you happen to suffer from large machine learning compute bills, using less RAM can be a pleasant surprise at the end of your fiscal year.
-
-## What about UltraJSON/MessagePack/Serializer X?
-
-We spent a lot of time talking about serialization, so it makes sense to look at prior work in that field.
-
-JSON is the most user-friendly format, but it's certainly not the fastest. MessagePack and UltraJSON, for example, are sometimes faster and more efficient at reading and storing binary information. So, would using them in this benchmark be better, instead of Python's built-in `json` module?
-
-The answer is: not really.
-
-
-
-
-
-Time to (de)serialize is important, but ultimately needing (de)serialization in the first place is the bottleneck. Taking data out of a remote system (e.g. a feature store like Redis), sending it over a network socket, parsing it into a Python object (which requires memory allocation), only to convert it again to a binary type for XGBoost, is causing unnecessary delays in the system.
-
-PostgresML does **one in-memory copy** of features from Postgres. No network, no (de)serialization, no unnecessary latency.
-
-## What about the real world?
-
-Testing over localhost is convenient, but it's not the most realistic benchmark. In production deployments, the client and the server are on different machines, and in the case of the Python + Redis architecture, the feature store is yet another network hop away.
-
-To demonstrate this, we spun up 3 EC2 instances and ran the benchmark again. This time, PostgresML outperformed Python and Redis **by a factor of 40**.
-
-
-
-
-
-The network gap between Redis and Gunicorn made things worse, a lot worse. Fetching data from a remote feature store added milliseconds to each request that the Python architecture could not spare. The additional latency compounded and, in a system with finite resources, caused contention. Most Gunicorn threads were simply waiting on the network, and thousands of requests were stuck in the queue.
-
-PostgresML didn't have this issue, because the features and the Rust inference layer live on the same system. This architectural choice removes network latency and (de)serialization from the equation.
-
-You'll note the concurrency issue we discussed earlier hit Postgres at 20 connections, and we used PgBouncer again to save the day.
-
-Scaling Postgres, once you know how to do it, isn't as difficult as it sounds.
-
-## Methodology
-
-### Hardware
-
-Both the client and the server in the first benchmark were located on the same machine. Redis was local as well. The machine is an 8-core, 16-thread AMD Ryzen 7 5800X with 32GB RAM and a 1TB NVMe SSD, running Ubuntu 22.04.
-
-AWS EC2 benchmarks were done with one `c5.4xlarge` instance hosting Gunicorn and PostgresML, and two `c5.large` instances hosting the client and Redis, respectively. They were located in the same VPC.
-
-### Configuration
-
-Gunicorn was running with 5 workers and 2 threads per worker. Postgres was using 1, 5 and 20 connections for 1, 5 and 20 clients, respectively. PgBouncer was given a `default_pool_size` of 10, so a maximum of 10 Postgres connections were used for 20 and 100 clients.
-
-XGBoost was allowed to use 2 threads during inference, and all available CPU cores (16 threads) during training.
-
-Both `ab` and `pgbench` use all available resources, but are very lightweight; the requests were a single JSON object and a single query respectively. Both of the clients use persistent connections, `ab` by using HTTP Keep-Alives, and `pgbench` by keeping the Postgres connection open for the duration of the benchmark.
-
-## ML
-
-### Data
-
-We used the [Flight Status Prediction](https://www.kaggle.com/datasets/robikscube/flight-delay-dataset-20182022) dataset from Kaggle. After some post-processing, it ended up being about 2 GB of floating point features. We didn't use all columns because some of them are redundant, e.g. airport name and airport identifier, which refer to the same thing.
-
-### Model
-
-Our XGBoost model was trained with default hyperparameters and 25 estimators (also known as boosting rounds).
-
-Data used for training and inference is available [here](https://static.postgresml.org/benchmarks/flights.csv). Data stored in the Redis feature store is available [here](https://static.postgresml.org/benchmarks/flights\_sub.csv). It's only a subset because it was taking hours to load the entire dataset into Redis with a single Python process (28 million rows). Meanwhile, Postgres `COPY` only took about a minute.
-
-PostgresML model is trained with:
-
-```postgresql
-SELECT * FROM pgml.train(
- project_name => 'r2',
- algorithm => 'xgboost',
- hyperparams => '{ "n_estimators": 25 }'
-);
-```
-
-It had terrible accuracy (as did the Python version), probably because we were missing any kind of weather information, and weather is the most likely cause of delays at airports.
-
-### Source code
-
-Benchmark source code can be found on [Github](https://github.com/postgresml/postgresml/tree/master/pgml-cms/docs/blog/benchmarks/python\_microservices\_vs\_postgresml/).
diff --git a/pgml-cms/docs/resources/developer-docs/README.md b/pgml-cms/docs/resources/developer-docs/README.md
deleted file mode 100644
index b9194723c..000000000
--- a/pgml-cms/docs/resources/developer-docs/README.md
+++ /dev/null
@@ -1,2 +0,0 @@
-# Developer Docs
-
diff --git a/pgml-cms/docs/summary_draft.md b/pgml-cms/docs/summary_draft.md
deleted file mode 100644
index e207aa1be..000000000
--- a/pgml-cms/docs/summary_draft.md
+++ /dev/null
@@ -1,154 +0,0 @@
-# Table of contents
-
-## Introduction
-
-* [Overview](README.md)
-* [Getting started](introduction/getting-started/README.md)
- * [Create your database](introduction/getting-started/create-your-database.md)
- * [Connect your app](introduction/getting-started/connect-your-app.md)
-* [Import your data](introduction/getting-started/import-your-data/README.md)
- * [Logical replication](introduction/getting-started/import-your-data/logical-replication/README.md)
- * [Foreign Data Wrappers](introduction/getting-started/import-your-data/foreign-data-wrappers.md)
- * [Move data with COPY](introduction/getting-started/import-your-data/copy.md)
- * [Migrate with pg_dump](introduction/getting-started/import-your-data/pg-dump.md)
-
-## API
-
-* [Overview](api/overview.md)
-* [SQL extension](api/sql-extension/README.md)
- * [pgml.embed()](api/sql-extension/pgml.embed.md)
- * [pgml.transform()](api/sql-extension/pgml.transform/README.md)
- * [Fill-Mask](api/sql-extension/pgml.transform/fill-mask.md)
- * [Question answering](api/sql-extension/pgml.transform/question-answering.md)
- * [Summarization](api/sql-extension/pgml.transform/summarization.md)
- * [Text classification](api/sql-extension/pgml.transform/text-classification.md)
- * [Text Generation](api/sql-extension/pgml.transform/text-generation.md)
- * [Text-to-Text Generation](api/sql-extension/pgml.transform/text-to-text-generation.md)
- * [Token Classification](api/sql-extension/pgml.transform/token-classification.md)
- * [Translation](api/sql-extension/pgml.transform/translation.md)
- * [Zero-shot Classification](api/sql-extension/pgml.transform/zero-shot-classification.md)
- * [pgml.deploy()](api/sql-extension/pgml.deploy.md)
- * [pgml.decompose()](api/sql-extension/pgml.decompose.md)
- * [pgml.chunk()](api/sql-extension/pgml.chunk.md)
- * [pgml.generate()](api/sql-extension/pgml.generate.md)
- * [pgml.predict()](api/sql-extension/pgml.predict/README.md)
- * [Batch Predictions](api/sql-extension/pgml.predict/batch-predictions.md)
- * [pgml.train()](api/sql-extension/pgml.train/README.md)
- * [Regression](api/sql-extension/pgml.train/regression.md)
- * [Classification](api/sql-extension/pgml.train/classification.md)
- * [Clustering](api/sql-extension/pgml.train/clustering.md)
- * [Decomposition](api/sql-extension/pgml.train/decomposition.md)
- * [Data Pre-processing](api/sql-extension/pgml.train/data-pre-processing.md)
- * [Hyperparameter Search](api/sql-extension/pgml.train/hyperparameter-search.md)
- * [Joint Optimization](api/sql-extension/pgml.train/joint-optimization.md)
- * [pgml.tune()](api/sql-extension/pgml.tune.md)
-* [Client SDK](api/client-sdk/README.md)
- * [Collections](api/client-sdk/collections.md)
- * [Pipelines](api/client-sdk/pipelines.md)
- * [Vector Search](api/client-sdk/search.md)
- * [Document Search](api/client-sdk/document-search.md)
- * [Tutorials](api/client-sdk/tutorials/README.md)
- * [Semantic Search](api/client-sdk/tutorials/semantic-search.md)
- * [Semantic Search Using Instructor Model](api/client-sdk/tutorials/semantic-search-1.md)
-
-## Guides
-
-* [Embeddings](guides/embeddings/README.md)
- * [In-database Generation](guides/embeddings/in-database-generation.md)
- * [Dimensionality Reduction](guides/embeddings/dimensionality-reduction.md)
- * [Aggregation](guides/embeddings/vector-aggregation.md)
- * [Similarity](guides/embeddings/vector-similarity.md)
- * [Normalization](guides/embeddings/vector-normalization.md)
-
-
-
-* [Search](guides/improve-search-results-with-machine-learning.md)
-* [Chatbots](guides/chatbots/README.md)
- * [Example Application](use-cases/chatbots.md)
-* [Supervised Learning](guides/supervised-learning.md)
-* [OpenSourceAI](guides/opensourceai.md)
-* [Natural Language Processing](guides/natural-language-processing.md)
-
-
-
-## Product
-
-* [Cloud database](product/cloud-database/README.md)
- * [Serverless](product/cloud-database/serverless.md)
- * [Dedicated](product/cloud-database/dedicated.md)
- * [Enterprise](product/cloud-database/plans.md)
-* [Vector database](product/vector-database.md)
-* [PgCat pooler](product/pgcat/README.md)
- * [Features](product/pgcat/features.md)
- * [Installation](product/pgcat/installation.md)
- * [Configuration](product/pgcat/configuration.md)
-
-
-## Resources
-
-* [Architecture](resources/architecture/README.md)
- * [Why PostgresML?](resources/architecture/why-postgresml.md)
-* [FAQs](resources/faqs.md)
-* [Data Storage & Retrieval](resources/data-storage-and-retrieval/README.md)
- * [Documents](resources/data-storage-and-retrieval/documents.md)
- * [Partitioning](resources/data-storage-and-retrieval/partitioning.md)
- * [LLM based pipelines with PostgresML and dbt (data build tool)](resources/data-storage-and-retrieval/llm-based-pipelines-with-postgresml-and-dbt-data-build-tool.md)
-* [Benchmarks](resources/benchmarks/postgresml-is-8-40x-faster-than-python-http-microservices.md)
- * [PostgresML is 8-40x faster than Python HTTP microservices](resources/benchmarks/postgresml-is-8-40x-faster-than-python-http-microservices.md)
- * [Scaling to 1 Million Requests per Second](resources/benchmarks/million-requests-per-second.md)
- * [MindsDB vs PostgresML](resources/benchmarks/mindsdb-vs-postgresml.md)
- * [GGML Quantized LLM support for Huggingface Transformers](resources/benchmarks/ggml-quantized-llm-support-for-huggingface-transformers.md)
- * [Making Postgres 30 Percent Faster in Production](resources/benchmarks/making-postgres-30-percent-faster-in-production.md)
-* [Developer Docs](resources/developer-docs/README.md)
- * [Local Docker Development](resources/developer-docs/quick-start-with-docker.md)
- * [Installation](resources/developer-docs/installation.md)
- * [Contributing](resources/developer-docs/contributing.md)
- * [Distributed Training](resources/developer-docs/distributed-training.md)
- * [GPU Support](resources/developer-docs/gpu-support.md)
- * [Self-hosting](resources/developer-docs/self-hosting/README.md)
- * [Pooler](resources/developer-docs/self-hosting/pooler.md)
- * [Building from source](resources/developer-docs/self-hosting/building-from-source.md)
- * [Replication](resources/developer-docs/self-hosting/replication.md)
- * [Backups](resources/developer-docs/self-hosting/backups.md)
- * [Running on EC2](resources/developer-docs/self-hosting/running-on-ec2.md)
diff --git a/pgml-cms/docs/use-cases/README.md b/pgml-cms/docs/use-cases/README.md
deleted file mode 100644
index 9b163e6e0..000000000
--- a/pgml-cms/docs/use-cases/README.md
+++ /dev/null
@@ -1 +0,0 @@
-use-cases section is deprecated, and is being refactored into guides, or a new section under product
\ No newline at end of file
diff --git a/pgml-cms/docs/use-cases/embeddings/README.md b/pgml-cms/docs/use-cases/embeddings/README.md
deleted file mode 100644
index 1906c7873..000000000
--- a/pgml-cms/docs/use-cases/embeddings/README.md
+++ /dev/null
@@ -1,87 +0,0 @@
-# Embeddings
-
-## Embeddings
-
-Embeddings are a numeric representation of text. They are used to represent words and sentences as vectors, an array of numbers. Embeddings can be used to find similar pieces of text, by comparing the similarity of the numeric vectors using a distance measure, or they can be used as input features for other machine learning models, since most algorithms can't use text directly.
-
-Many pretrained LLMs can be used to generate embeddings from text within PostgresML. You can browse all the [models](https://huggingface.co/models?library=sentence-transformers) available to find the best solution on Hugging Face.
-
-PostgresML provides a simple interface to generate embeddings from text in your database. You can use the `pgml.embed` function to generate embeddings for a column of text. The function takes a transformer name and a text value. The transformer will automatically be downloaded and cached for reuse.
-
-### Long Form Examples
-
-For a deeper dive, check out the following articles we've written illustrating the use of embeddings:
-
-* [Generating LLM embeddings in the database with open source models](https://postgresml.org/blog/generating-llm-embeddings-with-open-source-models-in-postgresml)
-* [Tuning vector recall while generating query embeddings on the fly](https://postgresml.org/blog/tuning-vector-recall-while-generating-query-embeddings-in-the-database)
-* [Personalize embedding results with application data in your database](https://postgresml.org/blog/personalize-embedding-results-with-application-data-in-your-database)
-
-### API
-
-```postgresql
-pgml.embed(
- transformer TEXT, -- huggingface sentence-transformer name
- text TEXT, -- input to embed
- kwargs JSON -- optional arguments (see below)
-)
-```
-
-### Example
-
-Let's use the `pgml.embed` function to generate embeddings for tweets, so we can find similar ones. We will use the `distilbert-base-uncased` model. This model is a small version of the `bert-base-uncased` model. It is a good choice for short texts like tweets. To start, we'll load a dataset that provides tweets classified into different topics.
-
-```postgresql
-SELECT pgml.load_dataset('tweet_eval', 'sentiment');
-```
-
-View some tweets and their topics.
-
-```postgresql
-SELECT *
-FROM pgml.tweet_eval
-LIMIT 10;
-```
-
-Get a preview of the embeddings for the first 10 tweets. This will also download the model and cache it for reuse, since it's the first time we've used it.
-
-```postgresql
-SELECT text, pgml.embed('distilbert-base-uncased', text)
-FROM pgml.tweet_eval
-LIMIT 10;
-```
-
-It will take a few minutes to generate the embeddings for the entire dataset. We'll save the results to a new table.
-
-```postgresql
-CREATE TABLE tweet_embeddings AS
-SELECT text, pgml.embed('distilbert-base-uncased', text) AS embedding
-FROM pgml.tweet_eval;
-```
-
-Now we can use the embeddings to find similar tweets. We'll use the `pgml.cosine_similarity` function to find the tweets that are most similar to a given tweet (or any other text input).
-
-```postgresql
-WITH query AS (
- SELECT pgml.embed('distilbert-base-uncased', 'Star Wars christmas special is on Disney') AS embedding
-)
-SELECT text, pgml.cosine_similarity(tweet_embeddings.embedding, query.embedding) AS similarity
-FROM tweet_embeddings, query
-ORDER BY similarity DESC
-LIMIT 50;
-```
-
-On small datasets (<100k rows), a linear search that compares every row to the query will give sub-second results, which may be fast enough for your use case. For larger datasets, you may want to consider various indexing strategies offered by additional extensions.
-
-* [Cube](https://www.postgresql.org/docs/current/cube.html) is a built-in extension that provides a fast indexing strategy for finding similar vectors. By default it has an arbitrary limit of 100 dimensions, unless Postgres is compiled with a larger size.
-* [PgVector](https://github.com/pgvector/pgvector) supports embeddings up to 2000 dimensions out of the box, and provides a fast indexing strategy for finding similar vectors.
-
-```postgresql
-CREATE EXTENSION vector;
-CREATE TABLE items (text TEXT, embedding VECTOR(768));
-INSERT INTO items SELECT text, embedding FROM tweet_embeddings;
-CREATE INDEX ON items USING ivfflat (embedding vector_cosine_ops);
-WITH query AS (
- SELECT pgml.embed('distilbert-base-uncased', 'Star Wars christmas special is on Disney')::vector AS embedding
-)
-SELECT * FROM items, query ORDER BY items.embedding <=> query.embedding LIMIT 10;
-```
diff --git a/pgml-cms/docs/use-cases/embeddings/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md b/pgml-cms/docs/use-cases/embeddings/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md
deleted file mode 100644
index 96c99a15d..000000000
--- a/pgml-cms/docs/use-cases/embeddings/tuning-vector-recall-while-generating-query-embeddings-in-the-database.md
+++ /dev/null
@@ -1,502 +0,0 @@
-# Tuning vector recall while generating query embeddings in the database
-
-
-PostgresML makes it easy to generate embeddings using open source models and perform complex queries with vector indexes unlike any other database. The full expressive power of SQL as a query language is available to seamlessly combine semantic, geospatial, and full text search, along with filtering, boosting, aggregation, and ML reranking in low latency use cases. You can do all of this faster, simpler and with higher quality compared to applications built on disjoint APIs like OpenAI + Pinecone. Prove the results in this series to your own satisfaction, for free, by signing up for a GPU accelerated database.
-
-## Introduction
-
-This article is the second in a multipart series that will show you how to build a post-modern semantic search and recommendation engine, including personalization, using open source models.
-
-1. Generating LLM Embeddings with HuggingFace models
-2. Tuning vector recall with pgvector
-3. Personalizing embedding results with application data
-4. Optimizing semantic results with an XGBoost ranking model - coming soon!
-
-The previous article discussed how to generate embeddings that perform better than OpenAI's `text-embedding-ada-002` and save them in a table with a vector index. In this article, we'll show you how to query those embeddings effectively.
-
-
-_Embeddings show us the relationships between rows in the database, using natural language._
-
-Our example data is based on 5 million DVD reviews from Amazon customers submitted over a decade. For reference, that's more data than fits in a Pinecone Pod at the time of writing. Webscale: check. Let's start with a quick refresher on the data in our `pgml.amazon_us_reviews` table:
-
-!!! generic
-
-!!! code\_block time="107.207ms"
-
-```postgresql
-SELECT *
-FROM pgml.amazon_us_reviews
-LIMIT 5;
-```
-
-!!!
-
-!!! results
-
-| marketplace | customer\_id | review\_id | product\_id | product\_parent | product\_title | product\_category | star\_rating | helpful\_votes | total\_votes | vine | verified\_purchase | review\_headline | review\_body | review\_date | id | review\_embedding\_e5\_large |
-| ----------- | ------------ | -------------- | ----------- | --------------- | ------------- | ----------------- | ------------ | -------------- | ------------ | ---- | ------------------ | ---------------- | ------------ | ------------ | -- | ---------------------------- |
-| US | 16164990 | RZKBT035JA0UQ | B00X797LUS | 883589001 | Revenge: Season 4 | Video DVD | 5 | 1 | 2 | 0 | 1 | It's a hit with me | I don't usually watch soap operas, but Revenge grabbed me from the first episode. Now I have all four seasons and can watch them over again. If you like suspense and who done it's, then you will like Revenge. The ending was terrific, not to spoil it for those who haven't seen the show, but it's more fun to start with season one. | 2015-08-31 | 11 | \[-0.44635132,-1.4744929,0.29134354,0.060305085,-0.41350508,0.5875407,-0.061205346,0.3317157,0.3318643,-0.31223094,0.4632605,1.1153598,0.8087972,0.24135485,-0.09573943,-0.6522662,0.3471857,0.06589421,-0.49588993,-0.10770899,-0.12906694,-0.6840891,-0.0079286955,0.6722917,-1.1333038,0.9841143,-0.05413917,-0.63103,0.4891317,0.49941555,0.36425045,-1.1122142,0.39679757,-0.16903037,2.0291917,-0.4769759,0.069017395,-0.13972181,0.26427677,0.05579555,0.7277221,-0.09724414,-0.4079459,0.8500204,-1.4091835,0.020688279,-0.68782306,-0.024399774,1.159901,-0.7870475,0.8028308,-0.48158854,0.7254225,0.31266358,-0.8171888,0.0016202603,0.18997599,1.1948254,-0.027479807,-0.46444815,-0.16508491,0.7332363,0.53439474,0.17962055,-0.5157759,0.6162931,-0.2308871,-1.2384704,0.9215715,0.093228154,-1.0873187,0.44506252,0.6780382,1.4210767,-0.035378184,-0.37101075,0.36248568,-0.20481548,1.7752264,0.96295184,0.25421357,0.32428253,0.15021282,1.2010641,1.3598334,-0.09641862,1.9206793,-0.6621351,-0.19654606,0.9614237,0.8942871,0.06781684,0.6154728,0.5322664,-0.47281718,-0.10806668,0.19615875,1.1427128,1.1363747,-0.7448851,-0.6235285,-0.4178455,0.2823742,0.2022872,0.4639155,-0.82450366,-1.0911003,0.29300234,0.09920952,0.35992235,-0.89154017,0.6345019,-0.3539376,0.13820754,-0.08596075,-0.016720073,-0.86973023,0.60496914,1.0057746,1.4023327,1.3364636,0.41459054,0.8762501,-0.9326738,-0.62262,0.8540947,0.46354002,-0.5997743,0.14315224,1.276051,0.22685385,-0.27431846,-0.35084888,0.124737024,1.3882787,1.27789,-2.0416644,-1.2735635,0
.45739195,-0.5252866,-0.049650192,-1.2893498,-0.13299808,-0.37871423,1.3282262,0.40052852,0.7439125,0.4438182,-0.11048192,0.28375423,-0.641405,-0.393038,-0.5177149,-0.9469533,-1.1396636,-1.2370745,0.36096996,0.02870304,0.5063284,-0.07706672,0.94798875,-0.27705917,-0.29239914,0.31463885,-1.0989273,-0.656829,2.8949435,-0.17305379,0.3815719,0.42526448,0.3081009,0.5685343,0.33076203,0.72707826,0.50143975,0.5845048,0.84975934,0.42427582,0.30121675,0.5989959,-0.7319157,-0.549556,0.63867736,0.012300444,-0.45165,0.6612118,-0.512683,-0.5376379,0.47559577,-0.8463519,-1.1943918,-0.76171356,0.7841424,0.5601279,-0.82258976,-1.0125699,-0.38812968,0.4420742,-0.6571599,-0.06353831,-0.59025985,0.61750174,1.126035,-1.280225,0.04327058,1.0567118,0.5743241,-1.1305283,0.45828968,-0.74915165,-1.0058457,0.44758803,-0.41461354,0.09315924,0.33658516,-0.0040031066,-0.06580057,0.5101937,-0.45152435,0.009831754,-0.86611366,0.71392256,1.3910902,1.0870686,0.7477381,0.96166354,0.27147853,0.044556435,0.6843247,-0.82584035,0.55440176,0.07432493,-0.0876536,0.89933145,-0.20821023,1.0045182,1.3212318,0.0023916673,0.30949935,-0.49783787,-0.0894654,0.42442265,0.16125606,-0.31338125,-0.18276067,0.8512234,0.29042283,1.1811026,0.17194802,0.104081966,-0.17348862,0.3214033,0.05323091,0.452102,0.44595376,-0.54339683,1.2369651,-0.90202415,-0.14463677,-0.40089816,0.4221295,-0.27183273,-0.46332398,0.03636483,-0.4491677,0.11768485,0.25375235,-0.5391649,1.6532613,-0.44395766,0.52174264,0.46777102,-0.6175785,-0.8521162,0.4074876,0.8601743,0.16133149,1.2534949,0.17186514,-1.4400607,0.12929483,0.19184573,-0.10323317,0.17845587,-0.9316995,-0.29608884,-0.15901098,0.13879488,0.7077851,0.7130752,-0.33218113,0.65922844,-0.16829759,-0.85618913,-0.50507075,0.04030782,0.28823212,0.63344556,-0.64391583,0.82986885,0.36421177,-0.31541574,0.15703243,-0.6918284,0.07207678,0.10856655,0.1837874,0.20774966,0.5002916,0.36118835,0.15846755,-0.59214884,-0.2806985,-1.4209367,-0.8781769,0.59149474,0.09860907,0.7798751,0.08356752,-0.38160
34,0.62692493,1.0605069,0.009612969,-1.1639553,0.0387234,-0.62128127,-0.65425646,0.026634911,0.13652368,-0.31386188,0.5132959,-0.2279612,1.5733948,0.9453454,-0.47791338,-0.86752695,0.2590365,0.010133599,0.0731045,-0.08996825,1.5178722,0.2790404,0.42920277,0.16204502,0.51732993,0.7824352,-0.53204685,0.6322838,0.027865775,0.1909194,0.75459373,0.5329097,-0.25675827,-0.6438361,-0.6730749,0.0419199,1.647542,-0.79603523,-0.039030924,0.57257867,0.97090834,-0.18933444,0.061723463,0.054686982,0.057177402,0.24391848,-0.45859554,0.36363262,-0.028061919,0.5537379,0.23430054,0.06542831,-0.8465644,-0.61477613,-1.8602425,-0.5563627,0.5518607,1.1379824,0.05827968,0.6034838,0.10843904,0.66301763,-0.68257576,0.49940518,-1.0600849,0.3026614,0.20583217,0.45980504,-0.54227024,0.83065176,-0.12527004,0.94367605,-0.22141562,0.2656482,-1.0248334,-0.64097667,0.9686471,-0.2892358,-0.7154707,0.33837032,0.25886488,1.754326,0.040067837,-0.0130331945,1.014779,0.6381671,-0.14163442,-0.6668947,-0.52272713,0.44740087,1.0573436,0.7079764,-0.4765707,-0.45119467,0.33266848,-0.3335042,0.6264001,0.096436426,0.4861287,-0.64570946,-0.55701566,-0.8017526,-0.3268717,0.6509844,0.51674,0.5527258,0.06715509,0.13850002,-0.16415404,0.5339686,0.7038742,-0.23962326,-0.40861428,-0.80195314,-0.2562518,-0.31416067,-0.6004696,0.17173254,-0.08187528,-0.10650221,-0.8317999,0.21745056,0.5430748,-0.95596164,0.47898734,-0.6119156,0.41032174,-0.55160147,0.23355038,0.51838225,0.6097409,0.54803956,-0.64297825,-1.095854,-1.7266736,0.46846822,0.24315582,0.93500775,-1.2847418,-0.09460731,-0.9284272,-0.58228695,0.35412273,-1.338897,0.09689145,-0.9634888,-0.105158746,-0.24354713,-1.8149018,-0.81706595,0.5610544,0.2604056,-0.15690021,-0.34233433,0.21085337,0.095561,0.3357639,-0.4168723,-0.16001065,0.019738067,-0.25119543,0.21538053,0.9338039,-1.3079301,-0.5274139,0.0042342604,-0.26708132,-1.1157236,0.41096166,-1.0650482,-0.92784685,0.1649683,-0.076478265,-0.89887,-0.49810255,-0.9988228,0.398151,-0.1489247,0.18536144,0.47142923,0.718
8731,-0.19373408,-0.43892148,-0.007021479,0.27125278,-0.0755358,-0.21995014,-0.09820049,-1.1432658,-0.6438058,0.45684898,-0.16717891,-0.06339566,-0.54050285,-0.21786614,-0.009872514,0.95797646,-0.6364886,0.06476644,0.15031907,-0.114178315,-0.6920534,0.33618665,-0.20828676,-1.218436,1.0650855,0.92841274,0.15988845,1.5152671,-0.27995184,0.43647304,0.123278655,-1.320316,-0.25041837,0.24997042,0.87653285,0.12610753,-0.8309733,0.5842415,-0.840945,-0.46114716,0.51617026,-0.6507864,1.5720816,0.43062973,-0.7194931,-1.400388,-0.9877925,-0.87884194,0.46331164,-0.51055473,0.24852753,0.30240974,0.12866661,-0.84918654,-0.3372634,0.46535993,0.22479752,0.7400517,0.4833228,1.3157144,1.270739,0.93192166,0.9926317,0.7777536,-0.8000388,-0.22760339,-0.7243004,-0.90151507,-0.73649806,-0.18375495,-0.9876769,-0.22154166,0.15750378,-0.051066816,1.218425,0.58040893,-0.32723624,0.08092578,-0.41428035,-0.8565249,-1.3621647,0.42233124,0.49325675,1.4729465,0.957077,-0.40788552,-0.7064396,0.67477965,0.74812657,0.17461313,1.2278605,0.42229348,0.00287759,1.6320366,0.045381133,0.8773843,-0.23280792,0.025544237,0.75055337,0.8755495,-0.21244618,-0.6180616,-0.019127166,0.55689186,1.2838972,-0.8412692,0.8461143,0.39903468,0.1857164,-0.025012616,-0.8494315,-0.2573743,-1.1831325,-0.5007239,0.5891477,-1.2416826,0.38735542,0.41872358,1.0267426,0.2482442,-0.060767986,0.7538531,-0.24033615,0.9042795,-0.24176258,-0.44520715,0.7715707,-0.6773665,0.9288903,-0.3960447,-0.041194934,0.29724947,0.8664729,0.07247823,-1.7166628,-1.1924342,-1.1135329,0.4729775,0.5345159,0.57545316,0.14463085,-0.34623942,1.2155776,0.24223511,1.3281958,-1.0329959,-1.3902934,0.09121965,0.18269718,-1.3109862,1.4591801,0.58750343,-0.8072534,0.23610781,-1.4992374,0.71078837,0.25371152,0.85618514,0.807575,1.2301548,-0.27820417,-0.29354396,0.28911537,1.2117325,4.4740834,1.3543533,0.214103,-1.3109514,-0.013579576,-0.53262085,-0.22086248,0.24246897,-0.26330945,0.30646166,-0.21399511,1.5816526,0.64849514,0.31172174,0.57089436,1.0467637,-0.421250
05,-0.2877409,0.6157391,-0.6682809,-0.44719923,-0.251028,-1.0622188,-1.5241078,1.3073357,-0.21030799,0.75480264,-1.0422926,0.23265716,0.20796475,0.73489463,0.5507254,-0.04313501,1.30877,0.19338085,0.27448726,0.04000665,-0.7004063,-1.0822202,0.6009482,0.2412081,0.33919787,0.020680452,0.7649121,-0.69652104,-0.5461974,-0.60095215,-0.9746675,0.7837197,1.2018669,-0.23473008,-0.44692823,0.12413922,-1.3088125,-1.4267013,0.82524955,0.8647329,0.16150166,-1.4038807,-0.8987668,0.61025685,-0.8479041,0.59218127,0.65450156,-0.022710972,0.19090322,-0.55995494,0.12569806,0.019536465,-0.5719187,-1.1703067,0.13916619,-1.2546546,0.3547577,-0.6583496,1.4738533,0.15210527,0.045928936,-1.7701638,-1.1357217,0.0656034,0.34817895,-0.9715934,-0.036333986,-0.54871166,-0.28730902,-0.4544463,0.0044411435,-0.091176935,0.5609336,0.8184279,1.7430352,0.14487076,-0.54478693,0.13478011,-0.78083384,-0.5450215,-0.39379802,-0.52507687,0.8898843,-0.46146545,-0.6123672,-0.20210318,0.72413814,-1.3112601,0.20672223,0.73001564,-1.4695473,-0.3112792,-0.048050843,-0.25363198,-1.0228323,-0.071546085,-0.3245472,0.12762389,-0.064207725,-0.46297944,-0.61758167,1.1423731,-1.2279893,1.4896537,-0.61985505,-0.39032778,-1.1789387,-0.05861108,0.33709309,-0.11082967,0.35026795,0.011960861,-0.73383653,-0.5427297,-0.48166794,-1.1341039,-0.07019004,-0.6253811,-0.55956876,-0.87954766,0.0038243965,-1.1747614,-0.2742908,1.3408217,-0.8604027,-0.4190716,1.0705358,-0.17213087,0.2715014,0.8245274,0.06066578,0.82805973,0.47945866,-0.37825295,0.014340248,0.9461009,0.256653,-0.19689955,1.1786914,0.18505198,0.710402,-0.59817654,0.12953508,0.48922333,0.8255816,0.4042885,-0.75975555,0.20467097,0.018755354,-0.69151515,-0.23537838,0.26312333,0.82981825,-0.10950847,-0.25987357,0.33299834,-0.31744313,-0.4765103,-0.8831548,0.056800444,0.07922315,0.5476093,-0.817339,0.22928628,0.5257919,-1.1328216,0.66853505,0.42755872,-0.18290512,-0.49680132,0.7065077,-0.2543334,0.3081367,0.5692426,0.31948256,0.668704,0.72916716,-0.3097971,0.04443544,0.56268
36,1.5217534,-0.51814324,-1.2701787,0.6485761,-0.8157134,-0.74196255,0.7771558,-1.3504819,0.2796807,0.44736814,0.6552933,0.13390358,0.5573986,0.099469736,-0.48586744,-0.16189729,0.40172148,-0.18505138,0.3092212,-0.30285,-0.45625964,0.8346098,-0.14941978,-0.44034964,-0.13228996,-0.45626387,-0.5833162,-0.56918347,-0.10052125,0.011119543,-0.423692,-0.36374965,-1.0971813,0.88712555,0.38785303,-0.22129343,0.19810538,0.75521517,-0.34437984,-0.9454472,-0.006488466,-0.42379746,-0.67618704,-0.25211233,0.2702919,-0.6131363,0.896094,-0.4232919,-0.25754875,-0.39714852,1.4831372,0.064787336,-0.770308,0.036396563,0.2313668,0.5655817,-0.6738516,0.857144,0.77432656,0.1454645,-1.3901217,-0.46331334,0.109622695,0.45570934,0.92387015,-0.011060692,0.30186698,-0.35252112,0.1457121,-0.2570497,0.7082791,-0.30265188,-0.23325084,-0.026542446,-0.17957532,1.1194676,0.59331983,-0.34250805,0.39761257,-0.97051114,0.6302743,-1.0416062,-0.14316575,-0.17302139,0.25761867,-0.62417996,0.427799,-0.26894867,0.4448027,-0.6683409,-1.0712901,-0.49355477,0.46255362,-0.26607195,-0.1882482,-1.0833352,-1.2174416,-0.22160827,-0.63442576,-0.20239262,0.08509241,0.27062747,0.3231089,0.75656915,-0.59737813,0.64800847,-0.3792087,0.06189245,-1.0148673,-0.64977705,0.23959091,0.5693892,0.2220355,0.050067283,-1.1472284,-0.05411025,-0.51574,0.9436675,0.08399284,-0.1538182,-0.087096035,0.22088972,-0.74958104,-0.45439938,-0.9840612,0.18691222,-0.27567235,1.4122254,-0.5019997,0.59119046,-0.3159759,0.18572812,-0.8638007,-0.20484222,-0.22735544,0.009947425,0.08660857,-0.43803024,-0.87153643,0.06910624,1.3576175,-0.5727235,0.001615673,-0.5057925,0.93217665,-1.0369575,-0.8864083,-0.76695895,-0.6097337,0.046172515,0.4706499,-0.43419397,-0.7006992,-1.2508268,-0.5113818,0.96917367,-0.65436345,-0.83149797,-0.9900211,0.38023964,0.16216993,-0.11047968] |
-| US | 33386989 | R253N5W74SM7N3 | B00C6MXB42 | 734735137 | YOUNG INDIANA JONES CHRONICLES Volumes 1, 2 and 3 DVD Sets (Complete Collections All 3 Volumes DVD Sets Together) | Video DVD | 4 | 1 | 1 | 0 | 1 | great stuff. I thought excellent for the kids | great stuff. I thought excellent for the kids. The extras are a must after the movie. | 2015-08-31 | 12 | \[0.30739722,-1.2976353,0.44150844,0.28229898,0.8129836,0.19451006,-0.16999333,-0.07356771,0.5831099,-0.5702598,0.5513152,0.9893058,0.8913247,1.2790804,-0.21743622,-0.13258074,0.5267081,-1.1273692,0.08361904,-0.32674226,-0.7284242,-0.3742802,-0.315159,-0.06914908,-0.9370208,0.5965896,-0.46391407,-0.30802932,0.34784046,0.35328323,-0.06566019,-0.83673024,1.2235038,-0.5311309,1.7232236,0.100425154,-0.42236832,-0.4189702,0.65639615,-0.19411941,0.2861547,-0.011099293,0.6224927,0.2937978,-0.57707405,0.1723467,-1.1128687,-0.23458324,0.85969496,-0.5544667,0.69622403,0.20537117,0.5376313,0.18094051,-0.5935286,0.58459294,0.2588672,1.2592428,0.40739542,-0.3853751,0.5736207,-0.27588457,0.44027475,0.06457652,-0.40556684,-0.25630975,-0.0024269535,-0.63066584,1.435617,-0.41023165,-0.39362282,0.9855966,1.1903448,0.8181575,-0.13602419,-1.1992644,0.057811044,0.17973477,1.3552206,0.38971838,-0.021610033,0.19899082,-0.10303763,1.0268506,0.6143311,-0.21900427,2.4331384,-0.7311581,-0.07520742,0.25789547,0.78391874,-0.48391873,1.4095061,0.3000153,-1.1587081,-0.470519,0.63760203,1.212848,-0.13230722,0.1575143,0.5233601,-0.26733217,0.88544065,1.0455207,0.3242259,-0.08548101,-1.1858246,-0.34827423,0.10947221,0.7657727,-1.1886615,0.5846556,-0.06701131,-0.18275288,0.9688948,-0.44766253,-0.24283795,0.84013104,1.1865685,1.0322199,1.1621728,0.2904784,0.45513308,-0.046442263,-1.5924592,1.1268036,1.2244802,-0.12986387,-0.652806,1.3956618,0.09316843,0.0074809124,-0.40963998,0.11233859,0.23004606,1.0019808,-1.1334686,-1.6484728,0.17822856,-0.52497756,-0.97292185,-1.3860162,-0.10179921,0.41441512,0.94668996,0.6478229,-0.1378847,0.2240062,0.12373
086,0.37892383,-1.0213026,-0.002514686,-0.6206891,-1.2263044,-0.81023514,-2.1251488,-0.05212076,0.5007569,-0.10503322,-0.15165941,0.80570364,-0.67640734,-0.38113695,-0.7051068,-0.7457319,-1.1459444,1.2534835,-0.48408872,0.20323983,0.49218604,-0.01939073,0.42854333,0.871685,0.3215819,-0.016663345,0.492181,0.93779576,0.59563607,1.2095222,-0.1319952,-0.74563706,-0.7584777,-0.06784309,1.0673252,-0.18296064,1.180183,-0.01517544,-0.996551,1.4614015,-0.9834482,-0.8929142,-1.1343371,1.2919606,0.67674285,-1.264175,-0.78025484,-0.91170585,0.6446593,-0.44662225,-0.02165111,-0.34166083,0.23982073,-0.0695019,-0.55098635,0.061257105,0.14019178,0.58004445,-0.22117937,0.20757008,-0.47917584,-0.23402964,0.07655301,-0.28613323,-0.24914591,-0.40391505,-0.53980047,1.0352598,0.08218856,-0.21157777,0.5807184,-1.4730825,0.3812591,0.83882,0.5867736,0.74007905,1.0515761,-0.15946862,1.1032714,0.58210975,-1.3155121,-0.74103445,-0.65089387,0.8670826,0.43553326,-0.6407162,0.47036576,1.5228021,-0.45694724,0.7269809,0.5492361,-1.1711032,0.23924577,0.34736052,-0.12079343,-0.09562126,0.74119747,-0.6178057,1.3842496,-0.24629863,0.16725276,0.543255,0.28207174,0.58856744,0.87834567,0.50831103,-1.2316333,1.2317014,-1.0706112,-0.16112426,0.6000713,0.5483024,-0.13964792,-0.75518215,-0.98008883,0.6262824,-0.056649026,-0.14632829,-0.6952095,1.1196847,0.16559249,0.8219887,0.27358034,-0.37535465,-0.45660818,0.47437778,0.54943615,0.6596993,1.3418778,0.088481836,-1.0798514,-0.20523094,-0.043823265,-0.03007651,0.6147437,-1.2054923,0.21634094,0.5619677,-0.38945594,1.1649859,0.67147845,-0.67930675,0.25937733,-0.41399506,0.14421114,0.8055827,0.11315601,-0.25499323,0.5075335,-0.96640706,0.86042404,0.27332047,-0.262736,0.1961017,-0.85305786,-0.32757896,0.008568222,-0.46760023,-0.5723287,0.353183,0.20126922,-0.022152433,0.39879513,-0.57369196,-1.1627877,-0.948688,0.54274577,0.52627236,0.7573314,-0.72570753,0.22652717,0.5562541,0.8202502,-1.0198171,-1.3022298,-0.2893229,-0.0275145,-0.46199337,0.119201764,0.73928577,0.
05394686,0.5549575,0.5820973,0.5786865,0.4721187,-0.75830203,-1.2166464,-0.83674186,-0.3327995,-0.41074058,0.12167103,0.5753096,-0.39288408,0.101028144,-0.076566614,0.28128016,0.30121502,-0.45290747,0.3249064,0.29726675,0.060289554,1.012353,0.5653782,0.50774586,-1.1048855,-0.89840156,0.04853676,-0.0005516126,-0.43757257,0.52133596,0.90517247,1.2548338,0.032170154,-0.45365888,-0.32101494,0.52082396,0.06505445,-0.016106995,-0.15512307,0.4979914,0.019423941,-0.4410003,0.13686578,-0.55569375,-0.22618975,-1.3745868,0.14976598,0.31227916,0.22514923,-0.09152527,0.9595029,-0.24047574,0.9036276,0.06045522,0.4275914,-1.6211287,0.23627052,-0.123569466,1.0207809,-0.20820981,0.2928954,-0.37402752,-0.39281377,-0.9055283,0.42601687,-0.64971703,-0.83537567,-0.7551133,-0.3613483,-1.2591509,0.38164553,0.23480861,0.67463505,0.4188478,0.30875853,-0.23840418,-0.10466987,-0.45718357,-0.47870898,-0.7566724,-0.124758095,0.8912765,0.37436476,0.123713054,-0.9435858,-0.19343798,-0.7673082,0.45333877,-0.1314696,-0.046679523,-1.0924501,-0.36073965,-0.55994475,-0.25058964,0.6564909,-0.44103456,0.2519441,0.791008,0.7515483,-0.27565363,0.7055519,1.195922,0.37065807,-0.8460473,-0.070156336,0.46037647,-0.42738107,-0.40138105,0.13542275,-0.16810405,-0.17116192,-1.0791,0.094485305,0.499162,-1.3476236,0.21234894,-0.45902762,0.30559424,-0.75315285,-0.18889536,-0.18098111,0.6468135,-0.027758462,-0.4563393,-1.8142252,-1.1079813,0.15492673,0.67000175,1.7885993,-1.163623,-0.19585003,-1.265403,-0.65268534,0.8609888,-0.12089075,0.16340052,-0.40799433,0.1796395,-0.6490773,-1.1581244,-0.69040763,0.9861761,-0.94788885,-0.23661669,-0.26939982,-0.10966676,-0.2558066,0.11404798,0.2280753,1.1175905,1.2406538,-0.8405682,-0.0042185634,0.08700524,-1.490236,-0.83169794,0.80318516,-0.2759455,-1.2379494,1.2254013,-0.574187,-0.589692,-0.30691916,-0.23825237,-0.26592287,-0.34925,-1.1334181,0.18125409,-0.15863669,0.5677274,0.15621394,0.69536006,-0.7235879,-0.4440141,0.72681504,-0.071697086,-0.28574806,0.1978488,-0.29763848,-
1.3379228,-1.7364287,0.4866264,-0.4246215,0.39696288,-0.39847228,-0.43619227,0.74066365,1.3941747,-0.980746,0.28616947,-0.41534734,-0.37235045,-0.3020338,-0.078414746,0.5320422,-0.8390588,0.39802805,0.9956247,0.48060423,1.0830654,-0.3462163,0.1495632,-0.70074755,-1.4337711,-0.47201052,-0.20542778,1.4469681,-0.28534025,-0.8658506,0.43706423,-0.031963903,-1.1208986,0.24726066,-0.15195882,1.6915563,0.48345947,0.36665258,-0.84477395,-0.67024755,-1.3117748,0.5186414,-0.111863896,-0.24438074,0.4496351,-0.16038479,-0.6309886,0.30835655,0.5210999,-0.08546635,0.8993058,0.79404515,0.6026624,1.415141,0.99138695,0.32465398,0.40468198,1.0601974,-0.18599145,-0.13816476,-0.6396179,-0.3233479,0.03862472,-0.17224589,0.09181578,-0.07982533,-0.5043218,1.0261234,0.18545899,-0.49497896,-0.54437244,-0.7879132,0.5358195,-1.6340284,0.25045714,-0.8396354,0.83989215,0.3047345,-0.49021208,0.05403753,1.0338433,0.6628198,-0.3480594,1.3061327,0.54290605,-0.9569749,1.8446399,-0.030642787,0.87419564,-1.2377026,0.026958525,0.50364405,1.1583173,0.38988844,-0.101992935,-0.23575047,-0.3413202,0.7004839,-0.94112486,0.46198457,-0.35058874,-0.039545525,0.23826565,-0.7062571,-0.4111793,0.25476676,-0.6673185,1.0281954,-0.9923886,0.35417762,0.42138654,1.6712382,0.408056,-0.11521088,-0.13972034,-0.14252779,-0.30223042,-0.33124694,-0.811924,0.28540173,-0.7444932,0.45001662,0.24809383,-0.35693368,0.9220196,0.28611687,-0.48261562,-0.41284987,-0.9931806,-0.8012102,-0.06244095,0.27006462,0.12398263,-0.9655248,-0.5692315,0.61817557,0.2861948,1.370767,-0.28261876,-1.6861429,-0.28172758,-0.25411567,-0.61593235,0.9216087,-0.09091336,-0.5353816,0.8020888,-0.508142,0.3009135,1.110475,0.03977944,0.8507262,1.5284235,0.10842794,-0.20826894,0.65857565,0.36973011,4.5352683,0.5847559,-0.11878182,-1.5029415,0.28518912,-1.6161069,0.024860675,-0.044661783,-0.28830758,-0.3638917,0.10329107,1.0316309,1.9032342,0.7131887,0.5412085,0.624381,-0.058650784,-0.99251175,0.61980045,-0.28385028,-0.79383695,-0.70285636,-1.2722979,-0.915412
55,0.68193483,0.2765532,0.34829107,-0.4023206,0.25704393,0.5214571,0.13212398,0.28562054,0.20593974,1.0513201,0.9532814,0.095775016,-0.03877548,-0.33986154,-0.4798648,0.3228808,0.6315719,-0.10437137,0.14374955,0.48003596,-1.2454797,-0.40197062,-0.6159714,-0.6270214,0.25393748,0.72447217,-0.56466436,-0.958443,-0.096530266,-1.5505805,-1.6704174,0.8296298,0.05975852,-0.21028696,-0.5795715,-0.36282688,-0.24036546,-0.41609624,0.43595442,-0.14127952,0.6236689,-0.18053003,-0.38712737,0.70119154,-0.21448976,-0.9455639,-0.48454222,0.8712007,-0.94259155,1.1402144,-1.8355223,0.99784017,-0.10760504,0.01682847,-1.6035974,-1.2844374,0.01041493,0.258503,-0.46182942,-0.55694705,-0.36024556,-0.60274285,-0.7641168,-0.22333422,0.23358914,0.32214895,-0.2880609,2.0434432,0.021884317,-0.026297037,0.6764826,0.0018281384,-1.4232233,0.06965969,-0.6603106,1.7217827,-0.55071676,-0.5765741,0.41212377,0.47296098,-0.74749064,0.8318265,1.0190908,-0.30624846,0.1550751,-0.107695036,0.318128,-0.91269255,-0.084052026,-0.071086854,0.58557767,-0.059559256,-0.25214714,-0.37190074,0.1845709,-1.011793,1.6667081,-0.59240544,0.62364835,-0.87666374,0.5493202,0.15618894,-0.55065084,-1.1594291,0.013051172,-0.58089346,-0.69672656,-0.084555894,-1.002506,-0.12453595,-1.3197669,-0.6465615,0.18977834,0.70997524,-0.1717262,-0.06295184,0.7844014,-0.34741658,-0.79253453,0.50359297,0.12176384,0.43127277,0.51099414,-0.4762928,0.6427185,0.5405122,-0.50845987,-0.9031403,1.4412987,-0.14767419,0.2546413,0.1589461,-0.27697682,-0.2348109,-0.36988798,0.48541197,0.055055868,0.6457861,0.1634515,-0.4656323,0.09907467,-0.14479966,-0.7043871,0.36758122,0.37735868,1.0355871,-0.9822478,-0.19883083,-0.028797302,0.06903542,-0.72867984,-0.83410156,-0.44142655,-0.023862194,0.7508692,-1.2131448,0.73933,0.82066983,-0.9567533,0.8022456,-0.46039414,-0.122145995,-0.57758415,1.6009285,-0.38629133,-0.719489,-0.26290792,0.2784449,0.4006592,0.7685309,0.021456026,-0.46657726,-0.045093264,0.27306503,0.11820289,-0.010290818,1.4277694,0.37877312,-0.6
586902,0.6534258,-0.4882668,-0.013708393,0.5874833,0.67575705,0.0448849,0.79752296,-0.48222196,-0.27727848,0.1908209,-0.37270054,0.2255683,0.49677694,-0.8097378,-0.041833293,1.0997742,0.24664953,-0.13645545,0.60577506,-0.36643773,-0.38665995,-0.30393195,0.8074676,0.71181476,-1.1759185,-0.43375242,-0.54943913,0.60299504,-0.29033506,0.35640588,0.2535554,0.23497777,-0.6322611,-1.0659716,-0.5208576,-0.20098525,-0.70759755,-0.20329496,0.06746797,0.4192544,0.9459473,0.3056658,-0.41945052,-0.6862448,0.92653894,-0.28863263,0.1017883,-0.16960514,0.43107504,0.6719024,-0.19271156,0.84156036,1.4232695,0.23043889,-0.36577883,0.1706496,0.4989679,1.0149425,1.6899607,-0.017684896,0.14658369,-0.5460582,0.25970757,0.21367438,-0.23919336,0.00311709,0.24278529,-0.054968767,-0.1936215,1.0572686,1.1302485,-0.14131032,0.70154583,-0.6389119,0.56687975,-0.7653478,0.73563385,0.34357715,0.54296106,-0.289852,0.8999764,-0.51342,0.42874512,-0.15059376,-0.38104424,-1.255755,0.8929743,0.035588194,-0.032178655,-1.0616962,-1.2204084,-0.23632799,-1.692825,-0.23117402,0.57683736,0.50997025,-0.374657,1.6718119,0.41329297,1.0922033,-0.032909054,0.52968246,-0.15998183,-0.8479956,-0.08485309,1.350768,0.4181131,0.2278139,-0.4233213,0.77379596,0.020778842,1.4049225,0.6989054,0.38101918,-0.14007418,-0.020670284,-0.65089977,-0.9920829,-0.373814,0.31086117,-0.43933883,1.1054604,-0.30419546,0.3853193,-1.0691531,-0.010626761,-1.2146289,-0.41391885,-0.5968098,0.70136315,0.17279832,0.030435344,-0.8829543,-0.27144116,0.045436643,-1.4135028,0.70108044,-0.73424995,1.0382471,0.89125097,-0.6630885,-0.22839329,-0.631642,0.2600539,1.0844377,-0.24859901,-1.2038339,-1.1615102,0.013521354,2.0688252,-1.1227499,0.40164688,-0.57415617,0.18793584,0.39685404,0.27067253] |
-| US | 45486371 | R2D5IFTFPHD3RN | B000EZ9084 | 821764517 | Survival Island | Video DVD | 4 | 1 | 1 | 0 | 1 | Four Stars | very good | 2015-08-31 | 13 | \[-0.04560827,-1.0738801,0.6053605,0.2644575,0.046181858,0.92946494,-0.14833489,0.12940715,0.45553935,-0.7009164,0.8873173,0.8739785,0.93965644,0.99645066,-0.3013455,0.009464348,0.49103707,-0.31142452,-0.698856,-0.68302655,0.09756764,0.08612168,-0.10133423,0.74844116,-1.1546779,-0.478543,-0.33127898,0.2641717,-0.16090837,0.77208316,-0.20998663,-1.0271599,-0.21180272,-0.441733,1.3920364,-0.29355,-0.14628173,-0.1670586,0.38985613,0.7232808,-0.1478917,-1.2944599,0.079248585,0.804303,-0.22106579,0.17671943,-0.16625091,-0.2116828,1.3004253,-1.0479127,0.7193388,-0.26320568,1.4964588,-0.10538341,-0.3048142,0.35343128,0.2383181,1.8991082,-0.18256101,-0.58556455,0.3282545,-0.5290774,1.0674107,0.5099032,-0.6321608,-0.19459783,-0.33794925,-1.2250574,0.30687732,0.10018553,-0.38825148,0.5468978,0.6464592,0.63404274,0.4275827,-0.4252685,0.20222056,0.37558758,0.67473555,0.43457538,-0.5480667,-0.5751551,-0.5282744,0.6499875,0.74931085,-0.41133487,2.1029837,-0.6469921,-0.36067986,0.87258714,0.9366592,-0.5068644,1.288624,0.42634118,-0.88624424,0.023693975,0.82858825,0.53235066,-0.21634954,-0.79934657,0.37243468,-0.43083912,0.6150686,0.9484009,-0.18876135,-0.24328673,-0.2675956,-0.6934638,-0.016312882,0.9681279,-0.93228894,0.49323967,0.08511063,-0.058108483,-0.10482833,-0.49948782,-0.50077546,0.16938816,0.6500032,1.2108738,0.98961586,0.47821587,0.88961387,-0.5261087,-0.97606266,1.334534,0.4484072,-0.15161656,-0.6182878,1.3505218,0.07164596,0.41611874,-0.19641197,0.055405065,0.7972649,0.10020526,-1.0767709,-0.90705204,0.48867372,-0.46962035,-0.7453811,-1.4456259,0.02953603,1.0104666,1.1868577,1.1099546,0.40447012,-0.042927116,-0.37483892,-0.09478704,-1.223529,-0.8275733,-0.2067015,-1.0913882,-0.3732751,-1.5847363,0.41378438,-0.29002684,-0.2014314,-0.016470056,0.32161012,-0.5640414,-0.14769524,-0.43124712,-1.4276416,-0.10542446,1.57813
38,-0.2290403,0.45508677,0.080797836,0.16426548,0.63305223,1.0155399,0.28184965,0.25335202,-0.6090523,1.181813,-0.5924076,1.4182706,-0.3111642,0.12979284,-0.5306278,-0.592878,0.67098105,-0.3403599,0.8093008,-0.425102,-0.20143461,0.88729143,-1.3048863,-0.8509538,-0.64478755,0.72528464,0.27115706,-0.91018283,-0.37501037,-0.25344363,-0.28149638,-0.65170574,0.058373883,-0.279707,0.3435093,0.15421666,-0.08175891,0.37342703,1.1068349,0.370284,-1.1112201,0.791234,-0.33149278,-0.906468,0.77429736,-0.16918264,0.07161721,-0.020805538,-0.19074778,0.9714475,0.4217115,-0.99798465,0.23597187,-1.1951764,0.72325313,1.371934,-0.2528682,0.17550357,1.0121015,-0.28758067,0.52312744,0.08538565,-0.9472321,-0.7915376,-0.41640997,0.83389455,0.6387671,0.18294477,0.1850706,1.3700297,-0.43967843,0.9739228,0.25433502,-0.7903001,0.29034948,0.4432687,0.23781417,0.64576876,0.89437866,-0.92056245,0.8566781,0.2436927,-0.06929546,0.35795254,0.7436991,0.21376142,0.23869698,0.14639515,-0.87127894,0.8130877,-1.0923429,-0.3279097,0.09232058,-0.19745012,0.31907612,-1.0878816,-0.04473375,0.4249065,0.34453565,0.45376292,-0.5525641,1.6031032,-0.017522424,-0.04903584,-0.2470398,-0.06611821,-0.33618444,0.04579974,0.28910857,0.5733638,1.1579076,-0.123608775,-1.1244149,-0.32105175,-0.0028353594,0.6315558,0.20455408,-1.0754945,0.2644,0.24109934,0.042885803,1.597761,0.20982133,-1.1588631,0.47945598,-0.59829426,-0.45671254,0.15635385,-0.25241938,0.2880083,0.17821103,-0.16359845,0.35200477,1.0819628,-0.4892587,0.24970399,-0.43380582,-0.5588407,0.31640014,-0.10481888,0.10812894,0.13438466,1.0478258,0.5863666,0.035384405,-0.30704767,-1.6373035,-1.2590733,0.9295908,0.1164237,0.68977344,-0.36746788,-0.40554866,0.64503556,0.42557728,-0.6643828,-1.2095946,0.5771222,-0.6911773,-0.96415323,0.07771304,0.8753759,-0.60232115,0.5423659,0.037202258,0.9478343,0.8238534,-0.04875912,-1.5575435,-0.023152929,-0.16479905,-1.123967,0.00679872,1.4028634,-0.9268266,-0.17736283,0.17429933,0.08551961,1.1467109,-0.09408428,0.32461596,0.573
9471,0.41277337,0.4900577,0.6426135,-0.28586757,-0.7086031,-1.2137725,0.45787215,0.16102555,0.27866384,0.5178121,0.7158286,1.0705677,0.07049831,-0.85161424,-0.3042984,0.42947394,0.060441002,-0.06413476,-0.25434074,0.020860653,0.18758196,-0.3637798,0.48589218,-0.38999668,-0.23843117,-1.7653351,-0.040434383,0.5825778,0.30748087,0.06381909,0.81247973,-0.39792076,0.7121066,0.2782456,0.59765404,-1.3232024,0.34060842,0.19809672,0.41175848,0.24246249,0.25381815,-0.44391263,-0.07614571,-0.87287176,0.33984363,-0.21994372,-1.4966714,0.10044764,-0.061777685,-0.71176904,-0.4737114,-0.057971925,1.3261204,0.49915332,0.3063325,-0.0374391,0.013750633,-0.19973677,-0.089847654,0.121245734,0.11679503,0.61989266,0.023939274,0.51651406,-0.7324229,0.19555955,-0.9648657,1.249217,-0.055881638,0.40515238,0.3683988,-0.42780614,-0.24780461,-0.032880165,0.6969112,0.66245943,0.54872966,0.67410636,0.35999185,-1.1955742,0.38909116,0.9214033,-0.5265669,-0.16324537,-0.49275506,-0.27807295,0.33720574,-0.6482551,0.6556906,0.09675206,0.035689153,-1.4017167,-0.42488196,0.53470165,-0.9318509,0.06659188,-0.9330244,-0.6317253,-0.5170034,-0.090258315,0.067027874,0.47430456,0.34263068,-0.034816273,-1.8725855,-2.0368457,0.43204042,0.3529114,1.3256972,-0.57799745,0.025022656,-1.2134962,-0.6376366,1.2210813,-0.8623049,0.47356188,-0.48248583,-0.30049723,-0.7189453,-0.6286008,-0.7182035,0.337718,-0.11861088,-0.67316926,0.03807467,-0.4894712,0.0021176785,0.6980891,0.24103045,0.54633296,0.58161646,-0.44642344,-0.16555169,0.7964468,-1.2131425,-0.67829454,0.4893405,-0.38461393,-1.1225401,0.44452366,-0.30833852,-0.6711606,0.051745616,-0.775163,-0.2677435,-0.39321816,-0.74936676,0.16192177,-0.059772447,0.68762016,0.53828514,0.6541142,-0.5421721,-0.26251954,-0.023202112,0.3014187,0.008828241,0.79605895,-0.3317026,-0.7724727,-1.2411877,0.31939238,-0.096119456,0.47874188,-0.7791832,-0.22323853,-0.08456612,1.0795188,-0.7827005,-0.28929207,0.46884036,-0.42510015,0.16214833,0.3501767,0.36617047,-1.119466,0.19195387,0.858515
86,0.18922725,0.94338834,-0.32304144,0.4827557,-0.81715256,-1.4261038,0.49614763,0.062142983,1.249345,0.2014524,-0.6995533,-0.15864229,0.38652128,-0.659232,0.11766203,-0.2557698,1.4296027,0.9037317,-0.011628535,-1.1893693,-0.956275,-0.18136917,0.3941797,0.39998764,0.018311564,0.27029866,0.14892557,-0.48989707,0.05881763,0.49618796,-0.11214719,0.71434236,0.35651416,0.8689908,1.0284718,0.9596098,-0.009955626,0.40186208,0.4057858,-0.28830874,-0.72128904,-0.5276375,-0.44327998,-0.025095768,-0.7058158,-0.16796891,0.12855923,-0.34389406,0.4430077,0.16097692,-0.58964425,-0.80346566,0.32405907,0.06305365,-1.5064402,0.2241937,-0.6216805,0.1358616,0.3714332,-0.99806577,-0.22238642,0.33287752,0.14240637,-0.29236397,1.1396701,0.23270036,0.5262793,1.0991998,0.2879055,0.22905749,-0.95235413,0.52312446,0.10592761,0.30011278,-0.7657238,0.16400222,-0.5638396,-0.57501423,1.121968,-0.7843481,0.09353633,-0.18324867,0.21604645,-0.8815248,-0.07529478,-0.8126517,-0.011605805,-0.50744057,1.3081754,-0.852715,0.39023215,0.7651248,1.68998,0.5819176,-0.02141522,0.5877081,0.2024052,0.09264247,-0.13779058,-1.5314059,1.2719066,-1.0927896,0.48220706,0.05559338,-0.20929311,-0.4278733,0.28444275,-0.0008470379,-0.09534583,-0.6519637,-1.4282455,0.18477388,0.9507184,-0.6751443,-0.18364592,-0.37007314,1.0216024,0.6869564,1.1653348,-0.7538794,-1.3345296,0.6104916,0.08152369,-0.8394207,0.87403923,0.5290044,-0.56332856,0.37691587,-0.45009997,-0.17864561,0.5992149,-0.25145024,1.0287454,1.4305328,-0.011586349,0.3485581,0.66344,0.18219411,4.940573,1.0454609,-0.23867694,-0.8316158,0.4034564,-0.49062842,0.016044907,-0.22793365,-0.38472247,0.2440083,0.41246706,1.1865108,1.2949868,0.4173234,0.5325333,0.5680148,-0.07169041,-1.005387,0.965118,-0.340425,-0.4471613,-0.40878603,-1.1905128,-1.1868874,1.2017782,0.53103817,0.3596472,-0.9262005,0.31224424,0.72889113,0.63557464,-0.07019187,-0.68807346,0.69582283,0.45101142,0.014984587,0.577816,-0.1980364,-1.0826674,0.69556504,0.88146895,-0.2119645,0.6493935,0.9528447,-0.44
620317,-0.9011973,-0.50394785,-1.0315249,-0.4472283,0.7796344,-0.15637895,-0.16639937,-0.20352335,-0.68020046,-0.98728025,0.64242256,0.31667972,-0.71397847,-1.1293691,-0.9860645,0.39156264,-0.69573534,0.30602834,-0.1618791,0.23074874,-0.3379239,-0.12191323,1.6582693,0.2339738,-0.6107068,-0.26497284,0.17334077,-0.5923304,0.10445539,-0.7599427,0.5096536,-0.20216745,0.049196683,-1.1881349,-0.9009607,-0.83798426,0.44164553,-0.48808926,-0.04667333,-0.66054153,-0.66128224,-1.7136352,-0.7366011,-0.31853634,0.30232653,-0.10852443,1.9946622,0.13590258,-0.76326686,-0.25446486,0.32006142,-1.046221,0.30643058,0.52830505,1.7721215,0.71685624,0.35536727,0.02379851,0.7471644,-1.3178513,0.26788896,1.0505391,-0.8308426,-0.44220716,-0.2996315,0.2289448,-0.8129853,-0.32032526,-0.67732286,0.49977696,-0.58026063,-0.4267268,-1.165912,0.5383717,-0.2600939,0.4909254,-0.7529048,0.5186025,-0.68272185,0.37688586,-0.16525345,0.68933797,-0.43853116,0.2531767,-0.7273167,0.0042542545,0.2527112,-0.64449465,-0.07678814,-0.57123,-0.0017966144,-0.068321034,0.6406287,-0.81944615,-0.5292494,0.67187285,-0.45312735,-0.19861545,0.5808865,0.24339013,0.19081701,-0.3795915,-1.1802675,0.5864333,0.5542488,-0.026795216,-0.27652445,0.5329341,0.29494807,0.5427568,0.84580654,-0.39151683,-0.2985327,-1.0449492,0.69868237,0.39184457,0.9617548,0.8102169,0.07298472,-0.5491848,-1.012611,-0.76594234,-0.1864931,0.5790788,0.32611984,-0.7400497,0.23077846,-0.15595563,-0.06170243,-0.26768005,-0.7510913,-0.81110775,0.044999585,1.3336306,-1.774329,0.8607937,0.8938075,-0.9528547,0.43048507,-0.49937993,-0.61716783,-0.58577335,0.6208,-0.56602585,0.6925776,-0.50487256,0.80735886,0.36914152,0.6803319,0.000295409,-0.28081727,-0.65416694,0.9890088,0.5936174,-0.38552138,0.92602617,-0.46841428,-0.07666884,0.6774499,-1.1728637,0.23638526,0.35253218,0.5990712,0.47170952,1.1473405,-0.6329502,0.07515354,-0.6493073,-0.7312147,0.003280595,0.53415585,-0.84027874,0.21279827,0.73492074,-0.08271271,-0.6393985,0.21382183,-0.5933761,0.26885328,0.3
1527188,-0.17841923,0.8519613,-0.87693113,0.14174065,-0.3014772,0.21034332,0.7176752,0.045435462,0.43554127,0.7759069,-0.2540516,-0.21126957,-0.1182913,0.504212,0.07782592,-0.06410891,-0.016180445,0.16819397,0.7418499,-0.028192373,-0.21616131,-0.46842667,0.8750199,0.16664875,0.4422129,-0.24636972,0.011146031,0.5407099,-0.1995775,0.9732007,0.79718286,-0.3531048,-0.17953855,-0.30455542,-0.011377579,-0.21079576,1.3742573,-0.4004308,-0.30791727,-1.06878,0.53180254,0.3412094,-0.06790889,0.08864223,-0.6960799,-0.12536404,0.24884924,0.9308994,0.46485603,0.12150945,0.8934372,-1.6594642,0.27694207,-1.1839775,-0.54069275,0.2967536,0.94271827,-0.21412376,1.5007582,-0.75979245,0.4711972,-0.005775435,-0.13180988,-0.9351274,0.5930414,0.23131478,-0.4255422,-1.1771399,-0.49364802,-0.32276222,-1.6043308,-0.27617428,0.76369554,-0.19217926,0.12788418,1.9225345,0.35335732,1.6825448,0.12466301,0.1598846,-0.43834555,-0.086372584,0.47859296,0.79709494,0.049911886,-0.52836734,-0.6721834,0.21632576,-0.36516222,1.6216894,0.8214337,0.6054308,-0.41862285,0.027636342,-0.1940268,-0.43570083,-0.14520688,0.4045223,-0.35977545,1.8254343,-0.31089872,0.19665615,-1.1023157,0.4019758,-0.4453815,-1.0864284,-0.1992614,0.11380532,0.16687272,-0.29629833,-0.728387,-0.5445154,0.23433375,-1.5238215,0.71899056,-0.8600819,1.0411007,-0.05895088,-0.8002717,-0.72914296,-0.59206986,-0.28384188,0.4074883,0.56018656,-1.068546,-1.021818,-0.050443307,1.116262,-1.3534596,0.6736171,-0.55024904,-0.31289905,0.36604482,0.004892461] |
-| US | 14006420 | R1CECK3H1URK1G | B000CEXFZG | 115883890 | Teen Titans - The Complete First Season (DC Comics Kids Collection) | Video DVD | 5 | 0 | 0 | 0 | 1 | Five Stars | Kids love the DVD. It came quickly also. | 2015-08-31 | 14 | \[-0.6312561,-1.7367789,1.2021036,-0.048960943,0.20266847,-0.53402656,0.22530322,0.58472973,0.7067528,-0.4026424,0.48143443,1.320443,1.390252,0.8614183,-0.27450773,-0.5175409,0.35882184,0.029378487,-0.7798119,-0.9161627,0.21374469,-0.5097005,0.08925354,-0.03162415,-0.777172,0.26952067,0.21780597,-0.25940415,-0.43257955,0.5047774,-0.62753534,-0.18389052,0.3908125,-0.8562782,1.197537,-0.072108865,-0.26840302,0.1337818,0.5329664,-0.02881749,0.18806009,0.15675639,-0.46279088,0.33493695,-0.5976519,0.17071217,-0.79716325,0.1967204,1.1276897,-0.20772636,0.93440086,0.34529057,0.19401568,-0.41807452,-0.86519367,0.47235286,0.33779994,1.5397296,-0.18204026,-0.016024688,0.24120326,-0.17716222,0.3138746,-0.20993066,-0.09079028,0.25766942,-0.07014277,-0.8694822,0.64777964,-0.057605933,-0.28278375,0.8075776,1.8393523,0.81496745,-0.004307902,-0.84534615,-0.03156269,0.010678162,1.8573742,0.20478101,-0.1694233,0.3143575,-0.598893,0.80677253,0.6163861,-0.46703136,2.229697,-0.53163594,-0.32738847,-0.024545679,0.729927,-0.3483534,1.2920879,0.25684443,0.34726465,0.2070297,0.47215447,1.5762097,0.5379836,-0.011129107,0.83513135,0.18692249,0.2752282,0.6455876,0.129197,-0.5211538,-1.3686453,-0.44263896,-1.0396893,0.32529148,-1.4775138,0.16855894,-0.22110634,0.5737801,1.1978029,-0.3934193,-0.2697715,0.62218326,1.4344715,0.82834864,0.766156,0.3510282,0.59684426,-0.1322549,-0.9330995,1.8485514,0.6753625,-0.33342996,-0.23867355,0.8621254,-0.4277517,-0.26068765,-0.67580503,0.13551037,0.44111,1.0628351,-1.1878395,-1.2636286,0.55473286,0.18764772,-0.06866432,-2.0283139,0.46497917,0.5886715,0.30433393,0.3501315,0.23519383,0.5980003,0.36994958,0.30603382,-0.8369203,-0.25988623,-0.93126506,-0.873884,-0.5146805,-1.8220243,-0.28068694,0.39212993,0.20002748,-0.47740325,-0
.251296,-0.85625666,-1.1412939,-0.73454237,-0.7070889,-0.8038149,1.5993606,-0.42553523,0.29790545,0.75804514,-0.14183688,1.28933,0.60941213,0.89150697,0.10587394,0.74460125,0.61516047,1.3431324,0.8083828,-0.11270667,-0.5399225,-0.609704,-0.07033227,0.37664047,-0.17491077,1.3854522,-0.41539654,-0.4362298,1.1235062,-1.8496975,-2.0035222,-0.49260524,1.3446016,-0.031373296,-1.3091855,-0.19887531,-0.49534202,0.4523722,-0.16276014,-0.08273346,-0.5079003,-0.124883376,0.099591255,-0.8943932,-0.1293136,0.9836214,0.548599,-0.78369313,0.19080715,-0.088178605,-0.6870386,0.58293986,-0.39954463,-0.19963749,-0.37985775,-0.24642159,0.5121634,0.6653276,-0.4190921,1.0305376,-1.4589696,0.28977314,1.3795608,0.5321369,1.1054996,0.5312297,-0.028157832,0.4668366,1.0069275,-1.2730085,-0.11376997,-0.7962425,0.49372005,0.28656003,-0.30227122,0.24839808,1.923211,-0.37085673,0.3625795,0.16379173,-0.43515328,0.4553001,0.08762408,0.105411,-0.964348,0.66819906,-0.6617094,1.5985628,-0.23792887,0.32831386,0.38515973,-0.293926,0.5914876,-0.12198629,0.45570955,-0.703119,1.2077283,-0.82626694,-0.28149354,0.7069072,0.31349573,0.4899691,-0.4599767,-0.8091348,0.30254528,0.08147084,0.3877693,-0.79083973,1.3907013,-0.25077394,0.9531004,0.3682364,-0.8173011,-0.09942776,0.2869549,-0.045799185,0.5354464,0.6409063,-0.20659842,-0.9725278,-0.26192304,0.086217284,0.3165221,0.44227958,-0.7680571,0.5399834,0.6985113,-0.52230656,0.6970132,0.373832,-0.70743656,0.20157939,-0.6858654,-0.50790364,0.2795364,0.29279485,-0.012475173,0.076419905,-0.40851966,0.82844526,-0.48934165,-0.5245244,-0.20289789,-0.8136387,-0.5363099,0.48981985,-0.76652956,-0.1211052,-0.056907576,0.4420836,0.066036455,0.41965017,-0.6063774,-0.8071671,-1.0445249,0.66432387,0.5274697,1.0376729,-0.7697964,-0.37606835,0.3890853,0.6605356,-0.14112039,-1.5217428,-0.15197764,-0.3213161,-1.1519533,0.60909057,0.9403774,-0.27944884,0.7312047,-0.3696203,0.74681044,1.2170473,-0.69628173,-1.6213799,-0.5346468,-0.6516008,-0.33496094,-0.43141463,1.2713503,-0.889774
6,-0.087588705,-0.46260807,0.5793111,0.09900403,-0.17237963,0.62258226,0.21377154,-0.010726848,0.6530878,-0.2783685,0.00858428,-1.1332816,-0.6482847,0.7085231,0.36013532,-0.92266655,0.22018129,0.9001391,0.92635745,-0.008031485,-0.5917975,-0.568456,-0.06777777,0.8137389,-0.09866476,-0.22243339,0.64311814,-0.18830536,-0.39094377,0.19102454,-0.16511707,0.025081763,-1.8210138,-0.2697892,0.6846239,0.2854376,0.18948092,1.413507,-0.32061276,1.068837,-0.43719074,0.26041105,-1.3256634,-0.3310394,-0.727746,0.5768826,0.12309951,0.64337856,-0.35449612,0.5904533,-0.93767214,0.056747835,-0.96975976,-0.50144833,-0.68525606,0.08461835,-0.956482,0.39153412,-0.47589955,1.1512613,-0.15391372,0.22249506,0.34223804,-0.30088118,-0.12304757,-0.887302,-0.41605315,-0.4448053,0.11436053,0.36566892,0.051920563,-1.0589696,-0.21019076,-0.5414011,0.57006586,0.25899884,0.27656814,-1.2040092,-1.0228744,-0.9569173,-0.40212157,0.24625045,0.0363089,0.67136663,1.2104007,0.5976004,0.3837572,1.1889356,0.8584326,-0.19918711,-0.694845,-0.114167996,-0.108385384,-0.40644845,-0.8660314,0.7782318,0.1538889,-0.33543634,-1.2151926,0.15467443,0.68193775,-1.2943494,0.5995984,-0.954463,0.08679533,-0.70457053,-0.13386653,-0.49978074,0.75912595,0.6441198,-0.24760693,-1.6255957,-1.1165076,0.06757002,0.424513,0.8805125,-1.3958868,0.20875917,-1.9329861,-0.23697405,0.55918163,-0.23028342,0.7898856,-0.31575334,-0.10341185,-0.59226173,-0.6364673,-0.70446855,0.8730485,-0.3070955,-0.62998897,-0.25874397,-0.36943534,-0.006459128,0.19268708,0.25422436,0.7851406,0.5298526,-0.7919893,0.2925912,0.2669904,-1.3556485,-0.3184692,0.6531485,-0.43356547,-0.7023434,0.70575243,-0.64844227,-0.90868706,-0.37580702,-0.46109352,-0.06858048,-0.5020828,-1.0959914,0.19850428,-0.3697118,0.5327658,-0.24482745,-0.0050697043,-0.48321095,-0.8755402,0.33493343,0.0400091,-0.9211368,0.50489336,0.20374565,-0.49659476,-1.7711049,0.9425723,0.413107,-0.15736774,-0.3663932,-0.110296495,0.32382917,1.4628458,-0.9015841,1.0747851,0.20627196,-0.33258128,-0.683
92354,0.45976254,0.7596731,-1.1001155,0.9608397,0.68715054,0.835493,1.0332432,-0.1770479,-0.47063908,-0.4371135,-1.5693063,-0.09170902,-0.14182071,0.9199287,0.089211576,-1.330432,0.74252445,-0.12902485,-1.1330069,0.37604442,-0.08594573,1.1911551,0.514451,-0.820967,-0.7663223,-0.8453414,-1.6072954,-0.006961733,0.10301163,-0.9520235,0.09837824,-0.11854994,-0.676488,0.31623104,0.9415478,0.5674442,0.5121303,0.46830702,0.5967715,1.1180271,1.109548,0.57702965,0.33545986,0.88252956,-0.23821445,0.1681848,0.13121948,-0.21055935,0.14183077,-0.12930463,-0.66376144,-0.34428838,-0.6456075,0.7975275,0.7979727,-0.07281647,-0.786334,-0.9695745,0.7647379,-1.2006234,0.2262308,-0.5081758,0.035541046,0.0056368224,-0.30493388,0.4218361,1.5293287,0.33595875,-0.4748238,1.1775192,-0.33924198,-0.6341838,1.534413,-0.19799161,1.0994059,-0.51108354,0.35798654,0.17381774,1.0035061,0.35685256,0.15786275,-0.10758176,0.039194133,0.6899009,-0.65326214,0.91365,-0.15350929,-0.1537966,-0.010726042,-0.13360718,-0.6982152,-0.52826196,-0.011109476,0.65476435,-0.9023214,0.64104265,0.5995644,1.4986526,0.57909846,0.30374798,0.39150548,-0.3463178,0.34487796,0.052982118,-0.5143066,0.9766171,-0.74480146,1.2273649,-0.029264934,-0.21231978,0.5529358,-0.15056185,-0.021292707,-0.6332784,-0.9690395,-1.5970473,0.6537644,0.7459297,0.12835206,-0.13237919,-0.6256427,0.5145036,0.94801706,1.9347028,-0.69850945,-1.1467483,-0.14642377,0.58050627,-0.44958553,1.5241412,0.12447801,-0.5492241,0.61864674,-0.7053797,0.3704767,1.3781306,0.16836958,1.0158046,2.339806,0.25807586,-0.38426653,0.31904867,-0.18488075,4.3820143,0.3402816,0.075437106,-1.7444987,0.14969935,-1.032585,0.105298005,-0.48405352,-0.043107588,0.41331384,0.23115341,1.4535589,1.4320177,1.2625074,0.6917493,0.57606643,0.18086748,-0.56871295,0.50524384,-0.3616062,-0.030594595,0.031995427,-1.2015928,-1.0093418,0.8197662,-0.39160928,0.35074282,-1.0193396,0.536061,0.047622234,-0.24839634,0.6208857,0.59378546,1.1138327,1.1455421,0.28545633,-0.33827814,-0.10528313,-0.3800
622,0.38597932,0.48995104,0.20974272,0.05999745,0.61636347,-1.0790776,0.40463042,-1.144643,-1.1443852,0.24288934,0.7188756,-0.43240666,-0.45432237,-0.026534924,-1.4719657,-0.6369496,1.2381822,-0.2820557,-0.40019664,-0.42836204,0.009404399,-0.21320148,-0.68762875,0.79391354,0.13644795,0.2921131,0.5521372,-0.39167717,0.43077433,-0.1978993,-0.5903825,-0.5364767,1.2527494,-0.6508138,1.006776,-0.80243343,0.8591213,-0.5838775,0.51986057,-2.0343292,-1.1657227,-0.19022554,0.4203408,-0.85203123,0.27117053,-0.7466831,-0.54998875,-0.78761035,-0.23125184,-0.4558538,0.27839115,-0.8282628,1.9886168,-0.081262186,-0.7112829,0.9389117,-0.4538624,-1.4541539,-0.40657237,-0.3986729,2.1551015,-0.15287222,-0.49151388,-0.0558472,-0.08496425,-0.42135897,0.9383027,0.52064234,0.15240821,-0.083340704,0.18793257,-0.27070358,-0.7748509,-0.44401792,-0.84802055,0.38330504,-0.16992734,-0.04359399,-0.5745709,0.737314,-0.68381006,1.973286,-0.48940006,0.31930843,-0.033326432,0.26788878,-0.12552531,0.48650578,-0.37769738,0.28189135,-0.61763984,-0.7224581,-0.5546388,-1.0413891,0.38789925,-0.3598852,-0.032914143,-0.26091114,0.7435369,-0.55370283,-0.28856206,0.99145585,-0.65208393,-1.2676566,0.4271154,-0.109385125,0.07578249,0.36406067,-0.24682517,0.75629663,0.7614913,-1.0769705,-0.97570497,1.9109854,-0.33307776,0.0739104,1.1380597,-0.3641174,0.22451513,-0.33712614,0.19201177,0.4894991,0.10351006,0.6902971,-1.0849994,-0.26750708,0.3598063,-0.5578461,0.50199044,0.7905739,0.6338177,-0.5717301,-0.54366827,-0.10897577,-0.33433878,-0.6747299,-0.6021895,-0.19320905,-0.5550029,0.72644496,-1.1670401,0.024564115,1.0110236,-1.599555,0.68184775,-0.7405006,-0.42144236,-1.0563204,0.89424497,-0.48237786,-0.07939503,0.5832966,0.011636782,0.26296118,0.97361255,-0.61712617,0.023346817,0.13983403,0.47923192,0.015965229,-0.70331126,0.43716618,-0.16208862,-0.3113084,0.34937248,-0.9447899,-0.67551583,0.6474735,0.54826015,0.32212958,0.32812944,-0.25576934,-0.7014241,0.47824702,0.1297568,0.14742444,0.2605472,-1.0799223,-0.4960
915,1.1971446,0.5583594,0.0546587,0.9143655,-0.27093348,-0.08269074,0.29264918,0.07787958,0.6288142,-0.96116096,-0.20745337,-1.2486024,0.44887972,-0.73063356,0.080278285,0.24266525,0.75150806,-0.87237483,-0.30616572,-0.9860237,-0.009145497,-0.008834001,-0.4702344,-0.4934195,-0.13811351,1.2453324,0.25669295,-0.38921633,-0.73387384,0.80260897,0.4079765,0.11871702,-0.236781,0.38567695,0.24849908,0.07333609,0.96814114,1.071782,0.5340243,-0.58761954,0.6691571,0.059928205,1.1879109,1.6365756,0.5595157,0.27928302,-0.26380432,0.75958675,-0.19349675,-0.37584463,0.1626631,-0.11273714,0.081596196,0.64045995,0.76134443,0.7323921,-0.75440234,0.49163356,-0.36328706,0.3499968,-0.7155915,-0.12234358,0.31324995,0.3552525,-0.07196079,0.5915569,-0.48357463,0.042654503,-0.6132918,-0.539919,-1.3009099,0.83370167,-0.035098318,0.2308337,-1.3226038,-1.5454197,-0.40349385,-2.0024583,-0.011536424,-0.05012955,-0.054146707,0.07704314,1.1840333,0.007676903,1.3632768,0.1696332,0.39087996,-0.5171457,-0.42958948,0.0700221,1.8722692,0.08307789,-0.10879701,-0.0138636725,-0.02509088,-0.08575117,1.2478887,0.5698622,0.86583894,0.22210665,-0.5863262,-0.6379792,-0.2500705,-0.7450812,0.50900066,-0.8095482,1.7303423,-0.5499353,0.26281437,-1.161274,0.4653201,-1.0534812,-0.12422981,-0.1350228,0.23891108,-0.40800253,0.30440316,-0.43603706,-0.7405148,0.2974373,-0.4674921,-0.0037770707,-0.51527864,1.2588171,0.75661725,-0.42883956,-0.13898624,-0.45078608,0.14367218,0.2798476,-0.73272926,-1.0425364,-1.1782882,0.18875533,2.1849613,-0.7969517,-0.083258845,-0.21416587,0.021902844,0.861686,0.20170754] |
-| US | 23411619 | R11MHQRE45204T | B00KXEM6XM | 651533797 | Fargo: Season 1 | Video DVD | 5 | 0 | 0 | 0 | 1 | A wonderful cover of the movie and so much more! | Great news Fargo Fans....there is another one in the works! We loved this series. Great characters....great story line and we loved the twists and turns. Cohen Bros. you are "done proud"! It was great to have the time to really explore the story and the characters. | 2015-08-31 | 15 | \[-0.19611593,-0.69027615,0.78467464,0.3645557,0.34207717,0.41759247,-0.23958844,0.11605658,0.92974365,-0.5541752,0.76759464,1.1066549,1.2487572,0.3000814,0.12316142,0.0537864,0.46125686,-0.7134164,-0.6902733,-0.030810203,-0.2626231,-0.17225128,0.29405335,0.4245395,-1.1013782,0.72367406,-0.32295582,-0.42930996,0.14767756,0.3164477,-0.2439065,-1.1365703,0.6799936,-0.21695563,1.9845483,0.29386163,-0.2292162,-0.5616508,-0.2090607,0.2147022,-0.36172745,-0.6168721,-0.7897761,1.1507696,-1.0567898,-0.5793794,-1.0577669,0.11405863,0.5670167,-0.67856425,0.41588035,-0.39696974,1.148421,-0.0018125019,-0.9563887,0.05888491,0.47841984,1.3950354,0.058197483,-0.7937125,-0.039544407,-0.02428613,0.37479407,0.40881336,-0.9731192,0.6479315,-0.5398291,-0.53990036,0.5293877,-0.60560757,-0.88233495,0.05452904,0.8653024,0.55807567,0.7858541,-0.9958526,0.33570826,-0.0056177955,0.9546163,1.0308326,-0.1942335,0.21661046,0.42235866,0.56544167,1.4272121,-0.74875134,2.0610666,0.09774256,-0.6197288,1.4207827,0.7629225,-0.053203158,1.6839175,-0.059772894,-0.978858,-0.23643266,-0.22536495,0.9444282,0.509495,-0.47264612,0.21497262,-0.60796165,0.47013962,0.8952143,-0.008930805,-0.17680325,-0.704242,-1.1091275,-0.6867162,0.5404577,-1.0234057,0.71886224,-0.769501,0.923611,-0.7606229,-0.19196886,-0.86931545,0.95357025,0.8420425,1.6821389,1.1922816,0.64718795,0.67438436,-0.83948326,-1.0336314,1.135635,0.9907036,0.14935225,-0.62381935,1.7775474,-0.054657657,0.78640664,-0.7279978,-0.45434985,1.1893182,1.2544643,-2.15092,-1.7235436,1.047173,-0.1170733,-0.051908553,-1
.098293,0.17285198,-0.085874915,1.4612851,0.24653414,-0.14835985,0.3946811,-0.33008638,-0.17601183,-0.79181874,-0.001846984,-0.5688003,-0.32315254,-1.5091114,-1.3093823,0.35818374,-0.020578597,0.13254775,0.08677244,0.25909093,-0.46612057,0.02809602,-0.87092584,-1.1213324,-1.503037,1.8704559,-0.10248221,0.21668856,0.2714984,0.031719234,0.8509111,0.87941355,0.32090616,0.70586735,-0.2160697,1.2130814,0.81380475,0.8308766,0.69376045,0.20059735,-0.62706333,0.06513833,-0.25983867,-0.26937178,1.1370893,0.12345111,0.4245841,0.8032184,-0.85147107,-0.7817614,-1.1791542,0.054727774,0.33709362,-0.7165752,-0.6065557,-0.6793303,-0.10181883,-0.80588853,-0.60589695,0.04176558,0.9381139,0.86121285,-0.483753,0.27040368,0.7229057,0.3529946,-0.86491895,-0.0883965,-0.45674118,-0.57884586,0.4881854,-0.2732384,0.2983724,0.3962273,-0.12534264,0.8856427,1.3331532,-0.26294935,-0.14494254,-1.4339849,0.48596704,1.0052125,0.5438694,0.78611183,0.86212146,0.17376512,0.113286816,0.39630392,-0.9429737,-0.5384651,-0.31277686,0.98931545,0.35072982,-0.50156367,0.2987925,1.2240223,-0.3444314,-0.06413657,-0.4139552,-1.3548497,0.3713058,0.5338464,0.047096968,0.17121102,0.4908476,0.33481652,1.0725886,0.068777196,-0.18275931,-0.018743126,0.35847363,0.61257994,-0.01896591,0.53872716,-1.0410246,1.2810577,-0.65638995,-0.4950475,-0.14177354,-0.38749444,-0.12146497,-0.69324815,-0.8031308,-0.11394101,0.4511331,-0.36235264,-1.0423448,1.3434777,-0.61404437,0.103578284,-0.42243803,0.13448912,-0.0061332933,0.19688538,0.111303836,0.14047435,2.3025432,-0.20064694,-1.0677278,0.6088145,-0.038092047,0.26895407,0.11633718,-1.5688779,-0.09998454,0.10787329,-0.30374414,0.9052384,0.4006251,-0.7892597,0.7623954,-0.34756395,-0.54056764,0.3252798,0.33199653,0.62842965,0.37663814,-0.030949261,1.0469799,0.03405783,-0.62260365,-0.34344113,-0.39576128,0.24071567,-0.0143306,-0.36152077,-0.21019648,0.15403631,0.54536396,0.070417285,-1.1143794,-0.6841382,-1.4072497,-1.2050889,0.36286953,-0.48767778,1.0853148,-0.62063366,-0.22110772,0.
30935922,0.657101,-1.0029979,-1.4981637,-0.05903004,-0.85891956,-0.8045846,0.05591573,0.86750376,0.5158197,0.42628267,0.45796645,1.8688178,0.84444594,-0.8722601,-1.099219,0.1675867,0.59336346,-0.12265335,-0.41956308,0.93164825,-0.12881526,0.28344584,0.21308619,-0.039647672,0.8919175,-0.8751169,0.1825347,-0.023952499,0.55597776,1.0254196,0.3826872,-0.08271052,-1.1974314,-0.8977747,0.55039763,1.5131414,-0.451007,0.14583892,0.24330004,1.0137768,-0.48189703,-0.48874113,-0.1470369,0.49510378,0.38879463,-0.7000347,-0.061767917,0.29879406,0.050993137,0.4503994,0.44063208,-0.844459,-0.10434887,-1.3999974,0.2449593,0.2624704,0.9094605,-0.15879464,0.7038591,0.30076742,0.7341888,-0.5257968,0.34079516,-1.7379513,0.13891199,0.0982849,1.2222294,0.11706773,0.05191148,0.12235231,0.34845573,0.62851644,0.3305461,-0.52740043,-0.9233819,0.4350543,-0.31442615,-0.84617394,1.1801229,-0.0564243,2.2154071,-0.114281625,0.809236,1.0508876,0.93325424,-0.14246169,-0.70618397,0.22045197,0.043732524,0.89360833,0.17979233,0.7782733,-0.16246022,-0.21719909,0.024336463,0.48491704,0.40749896,0.8901898,-0.57082295,-0.4949802,-0.5102787,-0.21259686,0.417162,0.37601888,1.0007366,0.7449076,0.6223696,-0.49961302,0.8396295,1.117957,0.008836402,-0.49906662,-0.03272103,0.13135666,0.25935343,-1.3398852,0.18256736,-0.011611674,-0.27749947,-0.84756446,0.11329307,-0.25090477,-1.1771594,0.67494935,-0.5614711,-0.09085327,-0.3132199,0.7154967,-0.3607141,0.5187279,0.16049784,-0.73461974,-1.7925078,-1.9164195,0.7991559,0.99091554,0.7067987,-0.57791114,-0.4848671,-1.100601,-0.59190345,0.30508074,-1.0731133,0.35330638,-1.1267302,-0.011746664,-0.6839462,-1.2538619,-0.94186044,0.44130656,-0.38140884,-0.37565815,-0.44280535,-0.053642027,0.6066312,0.12132282,0.035870302,0.5325165,-0.038058326,-0.70161515,0.005607947,1.0081267,-1.2909276,-0.92740905,0.5405458,0.53192127,-0.9372405,0.7400459,-0.5593214,-0.80438167,0.9196061,0.088677965,-0.5795356,-0.62158984,-1.4840353,0.48311192,0.76646256,-0.009653425,0.664507,1.0588721,-0
.55877256,-0.55249715,-0.4854527,0.43072438,-0.29720852,0.31044763,0.41128498,-0.74395776,-1.1164409,0.6381095,-0.45213065,-0.41928747,-0.7472354,-0.17209144,0.307881,0.43353182,-1.2533877,0.10122644,0.28987703,-0.43614298,-0.15241891,0.26940024,0.16055605,-1.4585212,0.52161473,0.9048135,-0.20131661,0.7265157,-0.00018197215,-0.2497379,-0.38577276,-1.3037856,0.5999186,0.4910673,0.76949763,-0.061471477,-0.4325986,0.6368372,0.16506073,-0.37456205,-0.3420613,-0.54678524,1.8179338,0.09873521,-0.15852624,-1.2694672,-0.3394376,-0.7944524,0.42282122,0.20561744,-0.7579017,-0.02898455,0.3193843,-0.880837,0.21365796,0.121797614,1.0254698,0.6885746,0.3068437,0.53845966,0.7072179,1.1950152,0.2619351,0.5534848,0.36036322,-0.635574,0.19842437,-0.8263201,-0.34289825,0.10286513,-0.8120933,-0.47783035,0.5496924,0.052244812,1.3440897,0.9016641,-0.76071066,-0.3754273,-0.57156265,-0.3039743,-0.72466373,0.6158706,0.09669343,0.86211246,0.45682988,-0.56253654,-0.3554615,0.8981484,0.16338861,0.61401916,1.6700366,0.7903558,-0.11995987,1.6473453,0.21475694,0.94213593,-1.279444,0.40164223,0.77865,1.0799583,-0.5661335,-0.43656045,0.37110725,-0.23973094,0.6663116,-1.5518241,0.60228294,-0.8730299,-0.4106444,-0.46960723,-0.47547948,-0.918826,-0.079336844,-0.51174027,1.3490533,-0.927986,0.42585903,0.73130196,1.2575479,0.98948413,-0.314556,0.62689084,0.5758436,-0.11093489,0.039149974,-0.8506448,1.1751219,-0.96297604,0.5589994,-0.75090784,-0.33629242,0.7918035,0.75811136,-0.0606605,-0.7733524,-1.5680165,-0.6446142,0.7613113,0.721117,0.054847892,-0.4485187,-0.26608872,1.2188075,0.08169317,0.5978582,-0.64777404,-1.9049765,0.5166473,-0.7455406,-1.1504349,1.3784496,-0.24568361,-0.35371232,-0.013054923,-0.57237804,0.59931237,0.46333218,0.054302905,0.6114685,1.5471761,-0.19890086,0.84167045,0.33959422,-0.074407116,3.9876409,1.3817698,0.5491156,-1.5438982,0.07177756,-1.0054835,0.14944264,0.042414695,-0.3515721,0.049677286,0.4029755,0.9665063,1.0081058,0.40573725,0.86347926,0.74739635,-0.6202449,-0.78576154,
0.8640424,-0.75356483,-0.0030959393,-0.7309192,-0.67107457,-1.1870506,0.9610583,0.14838722,0.55623454,-1.0180675,1.3138177,0.9418509,0.9516112,0.2749008,0.3799174,0.6875819,0.3593635,0.02494887,-0.042821404,-0.02257093,-0.20181343,0.24203236,0.3782816,0.16458313,-0.10500721,0.6841971,-0.85342956,-0.4882129,-1.1310949,-0.69270194,-0.16886552,0.82593036,-0.0031709322,-0.55615395,-0.31646764,-0.846376,-1.2038568,0.41713443,0.091425575,-0.050411556,-1.5898843,-0.65858334,1.0211359,-0.29832518,1.0239898,0.31851336,-0.12463779,0.06075947,-0.38864592,1.1107218,-0.6335154,-0.22827888,-0.9442285,0.93495697,-0.7868781,0.071433865,-0.9309406,0.4193446,-0.08388461,-0.530641,-1.116366,-1.057797,0.31456125,0.9027106,-0.06956576,0.18859546,-0.44057858,0.15511869,-0.70706356,0.3468956,-0.23489438,-0.21894005,0.1365304,1.2342967,0.24870403,-0.6072671,-0.56563044,-0.19893534,-1.6501249,-1.0609756,-0.14706758,1.8078117,-0.73515546,-0.42395878,0.40629613,0.5345876,-0.8564257,0.33988473,0.87946063,-0.70647347,-0.82399774,-0.28400525,-0.11244382,-1.1803491,-0.6051204,-0.48171222,0.6352527,0.9955332,0.060266595,-1.0434257,0.18751803,-0.8791377,1.5527687,-0.34049803,0.12179581,-0.65977687,-0.44843185,-0.5378742,0.41946766,0.46824372,0.24347036,-0.42384493,0.24210829,0.43362963,-0.17259134,0.47868198,-0.47093317,-0.33765036,0.15519959,-0.13469115,-0.9832437,-0.2315401,0.89967567,-0.2196765,-0.3911332,0.72678024,0.001113255,-0.03846649,-0.4437102,-0.105207585,0.9146223,0.2806104,-0.073881194,-0.08956877,0.6022565,0.34536007,0.1275348,0.5149897,-0.32749107,0.3006347,-0.10103988,0.21793392,0.9912135,0.86214256,0.30883485,-0.94117,0.98778534,0.015687397,-0.8764767,0.037501317,-0.12847403,0.0981208,-0.31701544,-0.32385334,0.43092263,-0.4069169,-0.8972079,-1.2575746,-0.47084373,-0.14999634,0.014707203,-0.37149346,0.3610224,0.2650979,-1.4389727,0.9148726,0.3496221,-0.07386527,-1.1408309,0.6867602,-0.704264,0.40382487,0.10580344,0.646804,0.9841216,0.5507306,-0.51492304,-0.34729987,0.22495836,0.4272
4502,-0.19653529,-1.1309057,0.5641935,-0.8154129,-0.84296966,0.29565218,-0.68338835,-0.28773895,0.21857412,0.9875624,0.80842453,0.60770905,-0.08765514,-0.512558,-0.45153108,0.022758177,-0.019249387,0.75011975,-0.5247193,-0.075737394,0.6226087,-0.42776236,0.27325255,-0.005929854,-1.0736796,0.100745015,-0.6502218,0.62724555,0.56331265,-1.1612102,0.47081968,-1.1985526,0.34841013,0.058391914,-0.51457083,0.53776836,0.66995555,-0.034272604,-0.783307,0.04816275,-0.6867638,-0.7655091,-0.29570612,-0.24291794,0.12727965,1.1767148,-0.082389325,-0.52111506,-0.6173243,1.2472475,-0.32435313,-0.1451121,-0.15679994,0.7391408,0.49221176,-0.35564727,0.5744523,1.6231831,0.15846235,-1.2422205,-0.4208412,-0.2163598,0.38068682,1.6744317,-0.36821502,0.6042655,-0.5680786,1.0682867,0.019634644,-0.22854692,0.012767732,0.12615916,-0.2708234,0.08950687,1.3470159,0.33660004,-0.5529485,0.2527212,-0.4973868,0.2797395,-0.8398461,-0.45434773,-0.2114668,0.5345738,-0.95777416,1.04314,-0.5885558,0.4784298,-0.40601963,-0.27700382,-0.9475248,1.3175657,-0.22060044,-0.4138579,-0.5917306,-1.1157118,-0.19392541,-1.1205745,-0.45245594,0.6583289,-0.5018245,0.80024433,1.4671688,0.62446856,1.134583,-0.10825716,-0.58736664,-1.1071991,-1.7562832,0.080109626,0.7975777,0.19911054,0.69512564,-0.14862823,0.2053994,-0.4011153,1.2195913,1.0608866,0.45159817,-0.6997635,0.5517133,-0.40297875,-0.8871956,-0.5386776,0.4603326,-0.029690862,2.0928583,-0.5171186,0.9697673,-0.6123527,-0.07635037,-0.92834306,0.0715186,-0.34455565,0.4734149,0.3211016,-0.19668017,-0.79836154,-0.077905566,0.6725751,-0.73293614,-0.026289426,-0.9199058,0.66183317,-0.27440917,-0.8313121,-1.2987471,-0.73153865,-0.3919303,0.73370796,0.008246649,-1.048442,-1.7406054,-0.23710802,1.2845341,-0.8552668,0.11181834,-1.1165439,0.32813492,-0.08691622,0.21660605] |
-
-!!!
-
-!!!
-
-!!! note
-
-You may notice it took more than 100ms to retrieve those 5 rows with their embeddings. Scroll the results over to see how much numeric data there is. _Fetching an embedding over the wire takes about as long as generating it from scratch with a state-of-the-art model._ 🤯
-
-Many benchmarks completely ignore the costs of data transfer and (de)serialization, but in practice these steps happen multiple times and become the dominant cost in typical complex systems.
-
-!!!
-
-Sorry, that was supposed to be a refresher, but it set me off. At PostgresML we're concerned about microseconds. 107.207 milliseconds better be spent doing something _really_ useful, not just fetching 5 rows. Bear with me while I belabor this point, because it reveals the source of most latency in machine learning microservice architectures that separate the database from the model, or worse, put the model behind an HTTP API in a different datacenter.
-
-It's especially harmful because, in a mature organization, the models are often owned by one team and the database by another. Both teams (let's assume the best) may be using efficient implementations and purpose-built tech, but the latency problem lies in the gap between them while communicating over a wire, and it's impossible to solve due to Conway's Law. Eliminating this gap, with its cost and organizational misalignment, is central to the design of PostgresML.
-
-> _One query. One system. One team. Simple, fast, and efficient._
-
-Rather than shipping the entire vector back to an application, as a typical vector database would, PostgresML includes all the algorithms needed to compute results internally. For example, we can ask PostgresML to compute the l2 norm for each embedding, a relevant computation that has the same cost as the cosine similarity function we're going to use for similarity search:
-
-!!! generic
-
-!!! code\_block time="2.268 ms"
-
-```postgresql
-SELECT pgml.norm_l2(review_embedding_e5_large)
-FROM pgml.amazon_us_reviews
-LIMIT 5;
-```
-
-!!!
-
-!!! results
-
-| norm\_l2 |
-| --------- |
-| 22.485546 |
-| 22.474796 |
-| 21.914106 |
-| 22.668892 |
-| 22.680748 |
-
-!!!
-
-!!!
-
-Most people would assume that "complex ML functions" with _`O(n * m)`_ runtime will increase load on the database compared to a "simple" `SELECT *`, but in fact, _moving the function to the database reduced the latency roughly 50 times over_ (107.207 ms down to 2.268 ms), and now our application doesn't need to do the "ML function" at all. This isn't just a problem with Postgres or databases in general; it's a problem with all programs that have to ship vectors over a wire, aka microservice architectures full of "feature stores" and "vector databases".
-
-> _Shuffling the data between programs is often more expensive than the actual computations the programs perform._
-
-This is what should convince you that PostgresML's approach of bringing the algorithms to the data is the right one, rather than shipping data all over the place. We're not the only ones who think so. Initiatives like Apache Arrow prove the ML community is aware of this issue, but Arrow and Google's Protobuf are not a solution to this problem; they're excellently crafted band-aids spanning the festering wounds in complex ML systems.
-
-> _For legacy ML systems, it's time for surgery to cut out the necrotic tissue and stitch the wounds closed._
-
-Some systems start simple enough, or deal with little enough data, that these inefficiencies don't matter. Over time, however, they will increase financial costs by orders of magnitude. If you're building new systems, rather than dealing with legacy data pipelines, you can avoid learning these painful lessons yourself, and build on top of 40 years of solid database engineering instead.
-
-## Similarity Search
-
-I hope my rant convinced you it's worth wrapping your head around some advanced SQL to handle this task more efficiently. If you're still skeptical, there are more benchmarks to come. Let's go back to our 5 million movie reviews.
-
-We'll start with semantic search. Given a user query, e.g. "Best 1980's scifi movie", we'll use an LLM to create an embedding on the fly. Then we can use our vector similarity index to quickly find the most similar embeddings we've indexed in our table of movie reviews. We'll use the `cosine distance` operator `<=>` to compare the request embedding to the review embedding, then sort by the closest match and take the top 5. Cosine similarity is defined as `1 - cosine distance`. The two are complements of each other, so sorting by ascending distance is the same as sorting by descending similarity, but it's more natural to interpret results on the similarity scale of `[-1, 1]`, where -1 is opposite, 0 is neutral, and 1 is identical.
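-
-To make the relationship between distance and similarity concrete, here is a minimal sketch on toy 2-dimensional vectors. It assumes only that the `vector` type and the `<=>` operator are available; the labels and values are made up for illustration and are not part of the reviews dataset.
-
-```postgresql
-/* Toy vectors, not real embeddings. */
-SELECT
-  label,
-  a <=> b       AS cosine_distance,
-  1 - (a <=> b) AS cosine_similarity
-FROM (
-  VALUES
-    ('identical',  '[1,0]'::vector, '[1,0]'::vector),
-    ('orthogonal', '[1,0]'::vector, '[0,1]'::vector),
-    ('opposite',   '[1,0]'::vector, '[-1,0]'::vector)
-) AS pairs(label, a, b);
-```
-
-By definition, the identical pair has a distance of 0 (similarity 1), the orthogonal pair a distance of 1 (similarity 0), and the opposite pair a distance of 2 (similarity -1).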
-
-!!! generic
-
-!!! code\_block time="152.037 ms"
-
-```postgresql
-WITH request AS (
-  SELECT pgml.embed(
-    'Alibaba-NLP/gte-base-en-v1.5',
-    'query: Best 1980''s scifi movie'
-  )::vector(1024) AS embedding
-)
-
-SELECT
-  review_body,
-  product_title,
-  star_rating,
-  total_votes,
-  1 - (
-    review_embedding_e5_large <=> (
-      SELECT embedding FROM request
-    )
-  ) AS cosine_similarity
-FROM pgml.amazon_us_reviews
-ORDER BY review_embedding_e5_large <=> (SELECT embedding FROM request) -- smallest distance first
-LIMIT 5;
-```
-
-!!!
-
-!!! results
-
-| review\_body | product\_title | star\_rating | total\_votes | cosine\_similarity |
-| --------------------------------------------------- | ------------------------------------------------------------- | ------------ | ------------ | ------------------ |
-| best 80s SciFi movie ever | The Adventures of Buckaroo Banzai Across the Eighth Dimension | 5 | 1 | 0.956207707312679 |
-| One of the best 80's sci-fi movies, beyond a doubt! | Close Encounters of the Third Kind \[Blu-ray] | 5 | 1 | 0.9298004258989776 |
-| One of the Better 80's Sci-Fi, | Krull (Special Edition) | 3 | 5 | 0.9126601222760491 |
-| the best of 80s sci fi horror! | The Blob | 5 | 2 | 0.9095577631102708 |
-| Three of the best sci-fi movies of the seventies | Sci-Fi: Triple Feature (BD) \[Blu-ray] | 5 | 0 | 0.9024044582495285 |
-
-!!!
-
-!!!
-
-!!! tip
-
-Common Table Expressions (CTEs) that begin `WITH name AS (...)` can be a nice way to organize complex queries into more modular sections. They also make it easier for Postgres to create a query plan, by introducing an optimization gate and separating the conditions in the CTE from the rest of the query.
-
-Generating a query plan more quickly and only computing the values once may make your query faster overall, as long as the plan is good. It might also make your query slower if the gate prevents the planner from finding a more sophisticated optimization across it. It's often worth checking the query plan with and without the CTE to see if it makes a difference. We'll cover query plans and tuning in more detail later.
-
-!!!
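-
-One way to run that comparison is sketched below. It assumes PostgreSQL 12 or newer, where a CTE can be marked `MATERIALIZED` (keep the gate) or `NOT MATERIALIZED` (let the planner inline it). Run it once with each keyword and compare the plans and timings. Which one wins depends on your data and indexes, so treat this as a starting point rather than a recommendation.
-
-```postgresql
-/* Swap MATERIALIZED for NOT MATERIALIZED and compare the two plans. */
-EXPLAIN (ANALYZE, BUFFERS)
-WITH request AS MATERIALIZED (
-  SELECT pgml.embed(
-    'Alibaba-NLP/gte-base-en-v1.5',
-    'query: Best 1980''s scifi movie'
-  )::vector(1024) AS embedding
-)
-SELECT review_body
-FROM pgml.amazon_us_reviews
-ORDER BY review_embedding_e5_large <=> (SELECT embedding FROM request)
-LIMIT 5;
-```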
-
-There's some good stuff happening in those query results, so let's break it down:
-
-* **It's fast** - We're able to generate a request embedding on the fly with a state-of-the-art model, and search 5M reviews in 152ms, including fetching the results back to the client 😍. You can't even generate an embedding from OpenAI's API in that time, much less search 5M reviews in some other database with it.
-* **It's good** - The `review_body` results are very similar to the "Best 1980's scifi movie" request text. We're using the `Alibaba-NLP/gte-base-en-v1.5` open source embedding model, which outperforms OpenAI's `text-embedding-ada-002` in most [quality benchmarks](https://huggingface.co/spaces/mteb/leaderboard).
-  * Qualitatively: the embeddings treat our request for `scifi` as equivalent to `Sci-Fi`, `sci-fi`, `SciFi`, and `sci fi`, match `1980's` to `80s` and `80's`, and place it close to `seventies` (last place). We didn't have to configure any of this. The most enthusiastic review for "best" is at the top and the least enthusiastic is at the bottom, so the model has appropriately captured "sentiment".
-  * Quantitatively: the `cosine_similarity` of all results is high and tight, 0.90-0.95 on a scale from -1 to 1. We can be confident we recalled very similar results from our 5M candidates, even though it would take 485 times as long to check all of them directly.
-* **It's reliable** - The model is stored in the database, so we don't need to worry about managing a separate service. If you repeat this query over and over, the timings will be extremely consistent, because we don't have to deal with things like random network congestion.
-* **It's SQL** - `SELECT`, `ORDER BY`, `LIMIT`, and `WITH` are all standard SQL, so you can use them on any data in your database, and further compose queries with standard SQL.
-
-This seems to actually just work out of the box... but, there is some room for improvement.
-
-_Yeah, well, that's just like, your opinion, man_
-
-1. **It's a single person's opinion** - We're searching individual reviews, not all reviews for a movie. The correct answer to this request is undisputedly "Episode V: The Empire Strikes Back". Ok, maybe "Blade Runner", but I really did like "Back to the Future"... Oh no, someone on the internet is wrong, and we need to fix it!
-2. **It's approximate** - There are more than four 80's Sci-Fi movie reviews in this dataset of 5M. It really shouldn't be including results from the 70's. More relevant reviews are not being returned, which is a pretty sneaky optimization for a database to pull, but the disclaimer was in the name.
-3. **It's narrow** - We're only searching the review text, not the product title, or incorporating other data like the star rating and total votes. Not to mention this is an intentionally crafted semantic search, rather than a keyword search of people looking for a specific title.
-
-We can fix all of these issues with the tools in PostgresML. First, to address The Dude's point, we'll need to aggregate reviews about movies and then search them.
-
-## Aggregating reviews about movies
-
-We'd really like a search for movies, not reviews, so let's create a new movies table out of our reviews table. We can use SQL aggregates over the reviews to generate some simple stats for each movie, like the number of reviews and average star rating. PostgresML provides aggregate functions for vectors.
-
-A neat thing about embeddings is if you sum a bunch of related vectors up, the common components of the vectors will increase, and the components where there isn't good agreement will cancel out. The `sum` of all the movie review embeddings will give us a representative embedding for the movie, in terms of what people have said about it. Aggregating embeddings around related tables is a super powerful technique. In the next post, we'll show how to generate a related embedding for each reviewer, and then we can use that to personalize our search results, but one step at a time.
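
The agreement-versus-cancellation effect described above can be checked on toy data. The following is a minimal Rust sketch, not the implementation behind `pgml.sum`; the function name `sum_embeddings` and the three-dimensional vectors are hypothetical and chosen only for illustration.

```rust
/// Element-wise sum of equal-length embedding vectors.
/// Hypothetical helper for illustration; it is not part of PostgresML.
fn sum_embeddings(vectors: &[Vec<f32>]) -> Vec<f32> {
    // Use the first vector to decide the dimension; an empty input sums to an empty vector.
    let dims = vectors.first().map_or(0, |v| v.len());
    let mut total = vec![0.0_f32; dims];
    for v in vectors {
        assert_eq!(v.len(), dims, "all embeddings must share one dimension");
        for (t, x) in total.iter_mut().zip(v) {
            *t += x;
        }
    }
    total
}
```

If four toy "reviews" all agree on the first component but disagree in sign on the other two, the sum keeps the first component and the rest cancel. Because cosine similarity ignores vector length, ranking by the sum gives the same order as ranking by the mean.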
-
-!!! generic
-
-!!! code\_block time="3128724.177 ms (52:08.724)"
-
-```postgresql
-CREATE TABLE movies AS
-SELECT
- product_id AS id,
- product_title AS title,
- product_parent AS parent,
- product_category AS category,
- count(*) AS total_reviews,
- avg(star_rating) AS star_rating_avg,
- pgml.sum(review_embedding_e5_large)::vector(1024) AS review_embedding_e5_large
-FROM pgml.amazon_us_reviews
-GROUP BY product_id, product_title, product_parent, product_category;
-```
-
-!!!
-
-!!! results
-
-| CREATE TABLE |
-| ------------- |
-| SELECT 298481 |
-
-!!!
-
-!!!
-
-We've just aggregated our original 5M reviews (including their embeddings) into \~300k unique movies. I like to include the model name used to generate the embeddings in the column name, so that as new models come out, we can just add new columns with new embeddings to compare side by side. Now we can create a new vector index for our movies `WITH (lists = 300)`, in addition to the one we already have on our reviews. `lists` is one of the key parameters for tuning the vector index; we're using a rule of thumb of about 1 list per thousand vectors.
-
-!!! generic
-
-!!! code\_block time="53236.884 ms (00:53.237)"
-
-```postgresql
-CREATE INDEX CONCURRENTLY
- index_movies_on_review_embedding_e5_large
-ON movies
-USING ivfflat (review_embedding_e5_large vector_cosine_ops)
-WITH (lists = 300);
-```
-
-!!!
-
-!!! results
-
-!!!
-
-!!!
-
-Now we can quickly search for movies by what people have said about them:
-
-!!! generic
-
-!!! code\_block time="122.000 ms"
-
-```postgresql
-WITH request AS (
- SELECT pgml.embed(
- 'Alibaba-NLP/gte-base-en-v1.5',
- 'Best 1980''s scifi movie'
- )::vector(1024) AS embedding
-)
-SELECT
- title,
- 1 - (
- review_embedding_e5_large <=> (SELECT embedding FROM request)
- ) AS cosine_similarity
-FROM movies
-ORDER BY review_embedding_e5_large <=> (SELECT embedding FROM request)
-LIMIT 10;
-```
-
-!!!
-
-!!! results
-
-| title | cosine\_similarity |
-| ------------------------------------------------------------------ | ------------------ |
-| THX 1138 (The George Lucas Director's Cut Special Edition/ 2-Disc) | 0.8652007733744973 |
-| 2010: The Year We Make Contact | 0.8621574666546908 |
-| Forbidden Planet | 0.861032948199611 |
-| Alien | 0.8596578185151328 |
-| Andromeda Strain | 0.8592793014849687 |
-| Forbidden Planet | 0.8587316047371392 |
-| Alien (The Director's Cut) | 0.8583879679255717 |
-| Forbidden Planet (Two-Disc 50th Anniversary Edition) | 0.8577616472530644 |
-| Strange New World | 0.8576321103975245 |
-| It Came from Outer Space | 0.8575860003514065 |
-
-!!!
-
-!!!
-
-It's somewhat expected that the movie vectors will have been diluted compared to review vectors during aggregation, but we still have results with pretty high cosine similarity of \~0.85 (compared to \~0.95 for reviews).
-
-It's important to remember that we're doing _Approximate_ Nearest Neighbor (ANN) search, so we're not guaranteed to get the exact best results. When we were searching 5M reviews, it was more likely we'd find 5 good matches just because there were more candidates, but now that we have fewer movie candidates, we may want to dig deeper into the dataset to find more high quality matches.
-
-## Tuning vector indexes for recall vs speed
-
-Inverted File Indexes (IVF) are built by clustering all the vectors into `lists` using cosine similarity. Once the `lists` are created, their center is computed by summing all the vectors in the list. It's the same thing we did when clustering the reviews around their movies, except these clusters are just some arbitrary number of similar vectors.
-
-When we perform a vector search, we will compare to the center of all `lists` to find the closest ones. The default number of `probes` in a query is 1. In that case, only the closest `list` will be exhaustively searched. This reduces the number of vectors that need to be compared from 300,000 to (300 + 1000) = 1300. That saves a lot of work, but sometimes the best results were just on the edges of the `lists` we skipped.
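
The comparison count above is simple arithmetic, sketched here in Rust. The function name `ivf_comparisons` is hypothetical, and the estimate assumes vectors are spread evenly across `lists`, which a real index only approximates.

```rust
/// Rough number of distance comparisons for one IVF search:
/// one per list center, plus every vector in each probed list.
/// Assumes an even spread of vectors across lists (an approximation).
fn ivf_comparisons(total_vectors: u64, lists: u64, probes: u64) -> u64 {
    assert!(lists > 0, "an IVF index needs at least one list");
    // Probing more lists than exist cannot add work.
    let probes = probes.min(lists);
    let vectors_per_list = total_vectors / lists;
    lists + probes * vectors_per_list
}
```

With 300,000 vectors in 300 lists, one probe gives 300 + 1,000 = 1,300 comparisons, while probing all 300 lists gives 300 + 300,000 = 300,300, slightly more than a plain exhaustive scan because the centers are compared as well.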
-
-Most applications have an acceptable latency limit. If we have some latency budget to spare, it may be worth increasing the number of `probes` to check more `lists` for better recall. If we up the number of `probes` to 300, we can exhaustively search all lists and get the best possible results:
-
-```postgresql
-SET ivfflat.probes = 300;
-```
-
-!!! generic
-
-!!! code\_block time="2337.031 ms (00:02.337)"
-
-```postgresql
-WITH request AS (
- SELECT pgml.embed(
- 'Alibaba-NLP/gte-base-en-v1.5',
- 'Best 1980''s scifi movie'
- )::vector(1024) AS embedding
-)
-SELECT
- title,
- 1 - (
- review_embedding_e5_large <=> (SELECT embedding FROM request)
- ) AS cosine_similarity
-FROM movies
-ORDER BY review_embedding_e5_large <=> (SELECT embedding FROM request)
-LIMIT 10;
-```
-
-!!!
-
-!!! results
-
-| title | cosine\_similarity |
-| ------------------------------------------------------------------ | ------------------ |
-| THX 1138 (The George Lucas Director's Cut Special Edition/ 2-Disc) | 0.8652007733744973 |
-| Big Trouble in Little China \[UMD for PSP] | 0.8649691870870362 |
-| 2010: The Year We Make Contact | 0.8621574666546908 |
-| Forbidden Planet | 0.861032948199611 |
-| Alien | 0.8596578185151328 |
-| Andromeda Strain | 0.8592793014849687 |
-| Forbidden Planet | 0.8587316047371392 |
-| Alien (The Director's Cut) | 0.8583879679255717 |
-| Forbidden Planet (Two-Disc 50th Anniversary Edition) | 0.8577616472530644 |
-| Strange New World | 0.8576321103975245 |
-
-!!!
-
-!!!
-
-There's a big difference in the time it takes to search 300,000 vectors vs 1,300 vectors, almost 20 times as long, although it does find one more vector that was not in the original list:
-
-| title | cosine\_similarity |
-| ------------------------------------------ | ------------------ |
-| Big Trouble in Little China \[UMD for PSP] | 0.8649691870870362 |
-
-This is a weird result. It's not Sci-Fi like all the others and it wasn't clustered with them in the closest list, which makes sense. So why did it rank so highly? Let's dig into the individual reviews to see if we can tell what's going on.
-
-## Digging deeper into recall quality
-
-SQL makes it easy to investigate these sorts of data issues. Let's look at the reviews for `Big Trouble in Little China [UMD for PSP]`, noting it only has 1 review.
-
-!!! generic
-
-!!! code\_block
-
-```postgresql
-SELECT review_body
-FROM pgml.amazon_us_reviews
-WHERE product_title = 'Big Trouble in Little China [UMD for PSP]';
-```
-
-!!!
-
-!!! results
-
-| review\_body |
-| ----------------------- |
-| Awesome 80's cult flick |
-
-!!!
-
-!!!
-
-This confirms our model has picked up on lingo like "flick" = "movie", and it seems it must have strongly associated "cult" flicks with the "scifi" genre. But, with only 1 review, there hasn't been any generalization in the movie embedding. It's a relatively strong match for a movie, even if it's not the best for a single review match (0.86 vs 0.95).
-
-Overall, our movie results look better to me than the titles pulled just from single reviews, but we haven't completely addressed The Dude's point, as evidenced by this movie having a single review and being out of the requested genre. Embeddings often have fuzzy boundaries that we may need to firm up.
-
-## Adding a filter to the request
-
-To prevent noise in the data from leaking into our results, we can add a filter to the request to only consider movies with a minimum number of reviews. We can also add a filter to only consider movies with a minimum average review score with a `WHERE` clause.
-
-```postgresql
-SET ivfflat.probes = 1;
-```
-
-!!! generic
-
-!!! code\_block time="107.359 ms"
-
-```postgresql
-WITH request AS (
- SELECT pgml.embed(
- 'Alibaba-NLP/gte-base-en-v1.5',
- 'query: Best 1980''s scifi movie'
- )::vector(1024) AS embedding
-)
-
-SELECT
- title,
- total_reviews,
- 1 - (
- review_embedding_e5_large <=> (SELECT embedding FROM request)
- ) AS cosine_similarity
-FROM movies
-WHERE total_reviews > 10
-ORDER BY review_embedding_e5_large <=> (SELECT embedding FROM request)
-LIMIT 10;
-```
-
-!!!
-
-!!! results
-
-| title | total\_reviews | cosine\_similarity |
-| ---------------------------------------------------- | -------------- | ------------------ |
-| 2010: The Year We Make Contact | 29 | 0.8621574666546908 |
-| Forbidden Planet | 202 | 0.861032948199611 |
-| Alien | 250 | 0.8596578185151328 |
-| Andromeda Strain | 30 | 0.8592793014849687 |
-| Forbidden Planet | 19 | 0.8587316047371392 |
-| Alien (The Director's Cut) | 193 | 0.8583879679255717 |
-| Forbidden Planet (Two-Disc 50th Anniversary Edition) | 255 | 0.8577616472530644 |
-| Strange New World | 27 | 0.8576321103975245 |
-| It Came from Outer Space | 155 | 0.8575860003514065 |
-| The Quatermass Xperiment (The Creeping Unknown) | 46 | 0.8572098277579617 |
-
-!!!
-
-!!!
-
-There we go. We've filtered out the noise, and now we're getting a list of movies that are all Sci-Fi. As we play with this dataset a bit, I'm getting the feeling that some of these are legit (Alien), but most of these are a bit too out on the fringe for my interests. I'd like to see more popular movies as well. Let's influence these rankings to take an additional popularity score into account.
-
-## Boosting and Reranking
-
-There are a few simple examples where NoSQL vector databases facilitate a killer app, like recalling text chunks to build a prompt to feed an LLM chatbot, but in most cases, it requires more context to create good search results from a user's perspective.
-
-As the Product Manager for this blog post search engine, I have an expectation that results should favor the movies that have more `total_reviews`, so that we can rely on an established consensus. Movies with higher `star_rating_avg` should also be boosted, because people very explicitly like those results. We can add boosts directly to our query to achieve this.
-
-SQL is a very expressive language that can handle a lot of complexity. To keep things clean, we'll move our current query into a second CTE that will provide a first-pass ranking for our initial semantic search candidates. Then, we'll re-score and rerank those first round candidates to refine the final result with a boost to the `ORDER BY` clause for movies with a higher `star_rating_avg`:
-
-!!! generic
-
-!!! code\_block time="124.119 ms"
-
-```postgresql
--- create a request embedding on the fly
-WITH request AS (
- SELECT pgml.embed(
- 'Alibaba-NLP/gte-base-en-v1.5',
- 'query: Best 1980''s scifi movie'
- )::vector(1024) AS embedding
-),
-
--- vector similarity search for movies
-first_pass AS (
- SELECT
- title,
- total_reviews,
- star_rating_avg,
- 1 - (
- review_embedding_e5_large <=> (SELECT embedding FROM request)
- ) AS cosine_similarity,
- star_rating_avg / 5 AS star_rating_score
- FROM movies
- WHERE total_reviews > 10
- ORDER BY review_embedding_e5_large <=> (SELECT embedding FROM request)
- LIMIT 1000
-)
-
--- grab the top 10 results, re-ranked with a boost for the avg star rating
-SELECT
- title,
- total_reviews,
- round(star_rating_avg, 2) as star_rating_avg,
- star_rating_score,
- cosine_similarity,
- cosine_similarity + star_rating_score AS final_score
-FROM first_pass
-ORDER BY final_score DESC
-LIMIT 10;
-```
-
-!!!
-
-!!! results
-
-| title | total\_reviews | star\_rating\_avg | final\_score | star\_rating\_score | cosine\_similarity |
-| ---------------------------------------------------- | -------------: | ----------------: | -----------------: | ---------------------: | -----------------: |
-| Forbidden Planet (Two-Disc 50th Anniversary Edition) | 255 | 4.82 | 1.8216832158805154 | 0.96392156862745098000 | 0.8577616472530644 |
-| Back to the Future | 31 | 4.94 | 1.82090702765472 | 0.98709677419354838000 | 0.8338102534611714 |
-| Warning Sign | 17 | 4.82 | 1.8136734057737756 | 0.96470588235294118000 | 0.8489675234208343 |
-| Plan 9 From Outer Space/Robot Monster | 13 | 4.92 | 1.8126103400815046 | 0.98461538461538462000 | 0.8279949554661198 |
-| Blade Runner: The Final Cut (BD) \[Blu-ray] | 11 | 4.82 | 1.8120690455673043 | 0.96363636363636364000 | 0.8484326819309408 |
-| The Day the Earth Stood Still | 589 | 4.76 | 1.8076752363401547 | 0.95212224108658744000 | 0.8555529952535671 |
-| Forbidden Planet \[Blu-ray] | 223 | 4.79 | 1.8067426345035993 | 0.95874439461883408000 | 0.8479982398847651 |
-| Aliens (Special Edition) | 25 | 4.76 | 1.803194119705901 | 0.95200000000000000000 | 0.851194119705901 |
-| Night of the Comet | 22 | 4.82 | 1.802469182369724 | 0.96363636363636364000 | 0.8388328187333605 |
-| Forbidden Planet | 19 | 4.68 | 1.795573710000297 | 0.93684210526315790000 | 0.8587316047371392 |
-
-!!!
-
-!!!
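
The two-stage pattern in the query above (recall a wide candidate set by similarity, then re-score and keep the top few) is not specific to SQL. Here is a minimal Rust sketch of the second stage, using the same `cosine_similarity + star_rating_avg / 5` score; the `Candidate` struct and the `final_score` and `rerank` functions are hypothetical names for illustration only.

```rust
/// One first-pass search result. Field names mirror the SQL columns above.
#[derive(Debug, Clone, PartialEq)]
struct Candidate {
    title: String,
    star_rating_avg: f64,
    cosine_similarity: f64,
}

/// Same boost as the query: similarity plus the star rating scaled to 0..=1.
fn final_score(c: &Candidate) -> f64 {
    c.cosine_similarity + c.star_rating_avg / 5.0
}

/// Sort candidates by final score, highest first, and keep at most `limit`.
fn rerank(mut first_pass: Vec<Candidate>, limit: usize) -> Vec<Candidate> {
    first_pass.sort_by(|a, b| final_score(b).total_cmp(&final_score(a)));
    first_pass.truncate(limit);
    first_pass
}
```

Because both terms fall roughly in the 0 to 1 range, the star rating carries about as much weight as the similarity here. Multiplying either term by a weight is one way to tune how far popularity can override relevance.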
-
-This is starting to look pretty good! True confessions: I'm really surprised "Empire Strikes Back" is not on this list. What is wrong with people these days?! I'm glad I called "Blade Runner" and "Back to the Future" though. Now that I've got a list that caters to my own sensibilities, I need to stop writing code and blog posts and watch some of these! In the next article, we'll look at incorporating more of ~~my preferences~~ a customer's preferences into the search results for effective personalization.
-
-P.S. I'm a little disappointed I didn't recall Aliens, because yeah, it's perfect 80's Sci-Fi, but that series has gone on so long I had associated it all with "vague timeframe". No one is perfect... right? I should probably watch "Plan 9 From Outer Space" & "Forbidden Planet", even though they are both 3 decades too early. I'm sure they are great!
diff --git a/pgml-dashboard/src/api/cms.rs b/pgml-dashboard/src/api/cms.rs
index 8c8dd278a..4fd1690bd 100644
--- a/pgml-dashboard/src/api/cms.rs
+++ b/pgml-dashboard/src/api/cms.rs
@@ -3,6 +3,7 @@ use std::{
path::{Path, PathBuf},
};
+use rocket::response::Redirect;
use std::str::FromStr;
use comrak::{format_html_with_plugins, parse_document, Arena, ComrakPlugins};
@@ -62,7 +63,10 @@ lazy_static! {
("transformers/fine_tuning/", "api/sql-extension/pgml.tune"),
("guides/predictions/overview", "api/sql-extension/pgml.predict/"),
("machine-learning/supervised-learning/data-pre-processing", "api/sql-extension/pgml.train/data-pre-processing"),
- ("api/client-sdk/getting-started", "api/client-sdk/"),
+ ("introduction/getting-started/import-your-data/", "introduction/import-your-data/"),
+ ("introduction/getting-started/import-your-data/foreign-data-wrapper", "introduction/import-your-data/foreign-data-wrappers"),
+ ("use-cases/embeddings/generating-llm-embeddings-with-open-source-models-in-postgresml", "open-source/pgml/guides/embeddings/in-database-generation"),
+ ("use-cases/natural-language-processing", "open-source/pgml/guides/natural-language-processing"),
])
);
}
@@ -563,19 +567,19 @@ impl Collection {
.href(&url.to_string_lossy());
links.push(parent);
}
- _ => error!("unhandled link child: {node:?}"),
+ _ => warn!("unhandled link child: {node:?}"),
}
}
}
- _ => error!("unhandled paragraph child: {node:?}"),
+ _ => warn!("unhandled paragraph child: {node:?}"),
}
}
}
- _ => error!("unhandled list_item child: {node:?}"),
+ _ => warn!("unhandled list_item child: {node:?}"),
}
}
}
- _ => error!("unhandled list child: {node:?}"),
+ _ => warn!("unhandled list child: {node:?}"),
}
}
Ok(links)
@@ -857,6 +861,50 @@ pub async fn careers_apply(title: PathBuf, cluster: &Cluster) -> Result")]
+pub async fn api_redirect(path: PathBuf) -> Redirect {
+ match path.to_str().unwrap() {
+ "apis" => Redirect::permanent("/docs/open-source/korvus/"),
+ "client-sdk/search" => {
+ Redirect::permanent("/docs/open-source/korvus/guides/document-search")
+ }
+ "client-sdk/getting-started" => Redirect::permanent("/docs/open-source/korvus/"),
+ "sql-extensions/pgml.predict/" => Redirect::permanent("/docs/open-source/pgml/api/pgml.predict/"),
+ "sql-extensions/pgml.deploy" => Redirect::permanent("/docs/open-source/pgml/api/pgml.deploy"),
+ _ => Redirect::permanent("/docs/open-source/".to_owned() + path.to_str().unwrap()),
+ }
+}
+
+/// Redirect our old sql-extension path.
+#[get("/docs/open-source/sql-extension/")]
+pub async fn sql_extension_redirect(path: PathBuf) -> Redirect {
+ Redirect::permanent("/docs/open-source/pgml/api/".to_owned() + path.to_str().unwrap())
+}
+
+/// Redirect our old pgcat path.
+#[get("/docs/product/pgcat/")]
+pub async fn pgcat_redirect(path: PathBuf) -> Redirect {
+ Redirect::permanent("/docs/open-source/pgcat/".to_owned() + path.to_str().unwrap())
+}
+
+/// Redirect our old cloud-database path.
+#[get("/docs/product/cloud-database/")]
+pub async fn cloud_database_redirect(path: PathBuf) -> Redirect {
+ let path = path.to_str().unwrap();
+ if path.is_empty() {
+ Redirect::permanent("/docs/cloud/overview")
+ } else {
+ Redirect::permanent("/docs/cloud/".to_owned() + path)
+ }
+}
+
+/// Redirect our old pgml docs.
+#[get("/docs/open-source/client-sdk/")]
+pub async fn pgml_redirect(path: PathBuf) -> Redirect {
+ Redirect::permanent("/docs/open-source/korvus/api/".to_owned() + path.to_str().unwrap())
+}
+
#[get("/docs/", rank = 5)]
async fn get_docs(
path: PathBuf,
@@ -936,6 +984,7 @@ async fn docs_landing_page(cluster: &Cluster) -> Result", rank = 5)]
async fn get_user_guides(path: PathBuf) -> Result {
Ok(Response::redirect(format!("/docs/{}", path.display().to_string())))
@@ -1003,6 +1052,11 @@ pub fn routes() -> Vec {
search,
search_blog,
demo,
+ sql_extension_redirect,
+ api_redirect,
+ pgcat_redirect,
+ pgml_redirect,
+ cloud_database_redirect
]
}
diff --git a/pgml-dashboard/src/components/cms/index_link/index_link.scss b/pgml-dashboard/src/components/cms/index_link/index_link.scss
index aad00b859..c3f6a3dc6 100644
--- a/pgml-dashboard/src/components/cms/index_link/index_link.scss
+++ b/pgml-dashboard/src/components/cms/index_link/index_link.scss
@@ -13,4 +13,8 @@ div[data-controller="cms-index-link"] {
text-decoration: underline;
text-underline-offset: 2px;
}
+
+ .material-symbols-outlined {
+ user-select: none;
+ }
}
diff --git a/pgml-dashboard/src/components/code_editor/editor/mod.rs b/pgml-dashboard/src/components/code_editor/editor/mod.rs
index 5a4083493..2f8b72b80 100644
--- a/pgml-dashboard/src/components/code_editor/editor/mod.rs
+++ b/pgml-dashboard/src/components/code_editor/editor/mod.rs
@@ -23,7 +23,7 @@ impl Editor {
show_task: false,
show_question_input: false,
task: "text-generation".to_string(),
- model: "meta-llama/Meta-Llama-3-8B-Instruct".to_string(),
+ model: "meta-llama/Meta-Llama-3.1-8B-Instruct".to_string(),
btn_location: "text-area".to_string(),
btn_style: "party".to_string(),
is_editable: true,
diff --git a/pgml-dashboard/src/components/code_editor/editor/template.html b/pgml-dashboard/src/components/code_editor/editor/template.html
index 5eb6631f9..2bf0541ee 100644
--- a/pgml-dashboard/src/components/code_editor/editor/template.html
+++ b/pgml-dashboard/src/components/code_editor/editor/template.html
@@ -78,8 +78,8 @@
// The number is the average time it takes to run in seconds
// text-generation
- "meta-llama/Meta-Llama-3-8B-Instruct", // G
- "meta-llama/Meta-Llama-3-70B-Instruct", // G
+ "meta-llama/Meta-Llama-3.1-8B-Instruct", // G
+ "meta-llama/Meta-Llama-3.1-70B-Instruct", // G
"mistralai/Mixtral-8x7B-Instruct-v0.1", // G
"mistralai/Mistral-7B-Instruct-v0.2", // G
diff --git a/pgml-dashboard/src/components/dropdown/mod.rs b/pgml-dashboard/src/components/dropdown/mod.rs
index 847719ca4..ddb8fa49d 100644
--- a/pgml-dashboard/src/components/dropdown/mod.rs
+++ b/pgml-dashboard/src/components/dropdown/mod.rs
@@ -72,7 +72,7 @@ pub struct Dropdown {
/// Position of the dropdown menu.
offset: String,
- /// Whether or not the dropdown is collapsable.
+ /// Whether or not the dropdown responds to horizontal collapse, i.e. in product left nav.
collapsable: bool,
offset_collapsed: String,
diff --git a/pgml-dashboard/src/components/inputs/text/search/search/search_controller.js b/pgml-dashboard/src/components/inputs/text/search/search/search_controller.js
index 70e7c2e32..005e1a2c0 100644
--- a/pgml-dashboard/src/components/inputs/text/search/search/search_controller.js
+++ b/pgml-dashboard/src/components/inputs/text/search/search/search_controller.js
@@ -30,4 +30,11 @@ export default class extends Controller {
search(id, url) {
this.element.querySelector(`turbo-frame[id=${id}]`).src = url;
}
+
+ // Hide the dropdown if the user clicks outside of it.
+ hideDropdown(e) {
+ if (!this.element.contains(e.target)) {
+ this.endSearch();
+ }
+ }
}
diff --git a/pgml-dashboard/src/components/inputs/text/search/search/template.html b/pgml-dashboard/src/components/inputs/text/search/search/template.html
index 50aa7e40a..419cc103e 100644
--- a/pgml-dashboard/src/components/inputs/text/search/search/template.html
+++ b/pgml-dashboard/src/components/inputs/text/search/search/template.html
@@ -1,14 +1,15 @@
<%
use crate::components::Dropdown;
+
%>