diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg
new file mode 100644
index 000000000..382cab6e3
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_Flow-Diagram.svg
@@ -0,0 +1,281 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg
new file mode 100644
index 000000000..8f9d7f7fd
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_King-Diagram.svg
@@ -0,0 +1,78 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg
new file mode 100644
index 000000000..c96b30ec4
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_Limitations-Diagram.svg
@@ -0,0 +1,275 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg b/pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg
new file mode 100644
index 000000000..0b7c0915a
--- /dev/null
+++ b/pgml-cms/docs/.gitbook/assets/Chatbots_Tokens-Diagram.svg
@@ -0,0 +1,238 @@
+
diff --git a/pgml-cms/docs/.gitbook/assets/chatbot_flow.png b/pgml-cms/docs/.gitbook/assets/chatbot_flow.png
deleted file mode 100644
index f9107d99f..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/chatbot_flow.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/embedding_king.png b/pgml-cms/docs/.gitbook/assets/embedding_king.png
deleted file mode 100644
index 03deebbe8..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/embedding_king.png and /dev/null differ
diff --git a/pgml-cms/docs/.gitbook/assets/embeddings_tokens.png b/pgml-cms/docs/.gitbook/assets/embeddings_tokens.png
deleted file mode 100644
index 6f7a13221..000000000
Binary files a/pgml-cms/docs/.gitbook/assets/embeddings_tokens.png and /dev/null differ
diff --git a/pgml-cms/docs/guides/chatbots/README.md b/pgml-cms/docs/guides/chatbots/README.md
index cd65d9125..9237f5c38 100644
--- a/pgml-cms/docs/guides/chatbots/README.md
+++ b/pgml-cms/docs/guides/chatbots/README.md
@@ -30,7 +30,7 @@ Here is an example flowing from:
text -> tokens -> LLM -> probability distribution -> predicted token -> text
-[figure: The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"]
+[figure: The flow of inputs through an LLM. In this case the inputs are "What is Baldur's Gate 3?" and the output token "14" maps to the word "I"]
{% hint style="info" %}
We have simplified the tokenization process. Words do not always map directly to tokens. For instance, the word "Baldur's" may actually map to multiple tokens. For more information on tokenization, check out [HuggingFace's summary](https://huggingface.co/docs/transformers/tokenizer\_summary).
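To make the text -> tokens step concrete, here is a minimal sketch using Hugging Face's `transformers` library. The `gpt2` tokenizer is an assumption chosen purely for illustration; it is not the tokenizer or the token ids shown in the diagram above.

```python
from transformers import AutoTokenizer

# Load a tokenizer; "gpt2" is chosen only for illustration.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# text -> tokens: the prompt becomes a list of integer token ids.
token_ids = tokenizer("What is Baldur's Gate 3?")["input_ids"]
print(token_ids)  # note that "Baldur's" may span several tokens

# tokens -> text: decoding the ids recovers the original string.
print(tokenizer.decode(token_ids))
```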
@@ -108,11 +108,11 @@ What does an `embedding` look like? `Embeddings` are just vectors (for our use c
embedding_1 = embed("King") # embed returns something like [0.11, -0.32, 0.46, ...]
```
-[figure: The flow of word -> token -> embedding]
+[figure: The flow of word -> token -> embedding]
`Embeddings` aren't limited to words; there are models that can embed entire sentences.
-[figure: The flow of sentence -> tokens -> embedding]
+[figure: The flow of sentence -> tokens -> embedding]
Why do we care about `embeddings`? `Embeddings` have a very useful property: words and sentences with close [semantic similarity](https://en.wikipedia.org/wiki/Semantic\_similarity) sit closer to one another in vector space than words and sentences that do not.
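To illustrate that property, here is a minimal sketch using the `sentence-transformers` library and cosine similarity. The `all-MiniLM-L6-v2` model and the example sentences are assumptions made for this sketch, not the model used elsewhere in this guide.

```python
from sentence_transformers import SentenceTransformer
import numpy as np

# A small sentence-embedding model, chosen only for illustration.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "What is Baldur's Gate 3?",
    "Baldur's Gate 3 is a role-playing video game.",
    "The recipe calls for two cups of flour.",
]
embeddings = model.encode(sentences)

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Semantically similar sentences score higher than unrelated ones.
print(cosine_similarity(embeddings[0], embeddings[1]))  # relatively high
print(cosine_similarity(embeddings[0], embeddings[2]))  # relatively low
```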
@@ -157,7 +157,7 @@ print(context)
There is a lot going on with this; let's check out this diagram and step through it.
-[figure: The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based on a user's query]
+[figure: The flow of taking a document, splitting it into chunks, embedding those chunks, and then retrieving a chunk based on a user's query]
Step 1: We take the document and split it into chunks. Chunks are typically a paragraph or two in size. There are many ways to split documents into chunks; for more information, check out [this guide](https://www.pinecone.io/learn/chunking-strategies/). A minimal splitting sketch follows below.
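As a minimal illustration of Step 1, here is a naive paragraph-based splitter in plain Python. The splitting rule and chunk size are assumptions made for the sketch, not the strategy this guide recommends.

```python
def chunk_document(text: str, max_chars: int = 1000) -> list[str]:
    """Naively split a document on blank lines into chunks of roughly a paragraph or two."""
    chunks = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Start a new chunk once adding this paragraph would exceed the budget.
        if current and len(current) + len(paragraph) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += paragraph + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

document = "First paragraph...\n\nSecond paragraph...\n\nThird paragraph..."
print(chunk_document(document, max_chars=40))
```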