Commit 651b3d8

llama 3.1 release (#1579)
1 parent 3b56d27 commit 651b3d8

29 files changed (+165 -102 lines)


pgml-apps/pgml-chat/pgml_chat/main.py

Lines changed: 1 addition & 1 deletion
@@ -123,7 +123,7 @@ def handler(signum, frame):
     "--chat_completion_model",
     dest="chat_completion_model",
     type=str,
-    default="meta-llama/Meta-Llama-3-8B-Instruct",
+    default="meta-llama/Meta-Llama-3.1-8B-Instruct",
 )

 parser.add_argument(
Lines changed: 37 additions & 0 deletions

@@ -0,0 +1,37 @@
---
description: >-
  Today we’re taking the next steps towards open source AI becoming the industry standard. We’re releasing Llama 3.1 405B, the first frontier-level open source AI model, as well as new and improved Llama 3.1 70B and 8B models, all with significantly better cost/performance relative to closed models.
featured: false
tags: [engineering]
image: ".gitbook/assets/image (2) (2).png"
---

# Announcing Support for Meta Llama 3.1

<div align="left">

<figure><img src=".gitbook/assets/montana.jpg" alt="Author" width="125"><figcaption></figcaption></figure>

</div>

Montana Low

July 23, 2024

We're pleased to offer Meta Llama 3.1 in our serverless cloud today. Mark Zuckerberg explained [his company's reasons for championing open source AI](https://about.fb.com/news/2024/07/open-source-ai-is-the-path-forward/), and it's great to see a strong ecosystem forming. The following models are now available with optimized kernels for maximum throughput:

- meta-llama/Meta-Llama-3.1-8B-Instruct
- meta-llama/Meta-Llama-3.1-70B-Instruct
- meta-llama/Meta-Llama-3.1-405B-Instruct

## Is open-source AI right for you?

We think so. Open-source models have made remarkable strides, not only catching up to proprietary counterparts but also surpassing them across multiple domains. The advantages are clear:

* **Performance & reliability:** Open-source models are increasingly comparable or superior to their proprietary counterparts across a wide range of tasks and performance metrics. Mistral- and Llama-based models, for example, are easily faster than GPT-4. Reliability is another thing you may not want to leave in OpenAI's hands: their API has suffered several recent outages, and their rate limits can interrupt your app if there is a surge in usage. Open-source models give you greater control over your model's latency, scalability and availability, and that control ultimately lets your organization ship a more dependable integration and a highly reliable production application.
* **Safety & privacy:** Open-source models are the clear winner for security-sensitive AI applications. There are [enormous risks](https://www.infosecurity-magazine.com/news-features/chatgpts-datascraping-scrutiny/) associated with transmitting private data to external entities such as OpenAI. By contrast, open-source models keep sensitive information within your organization's own cloud environment. The data never has to leave your premises, so the risk is bypassed altogether: it's enterprise security by default. At PostgresML, we offer private hosting of LLMs in your own cloud.
* **Model censorship:** A growing number of experts inside and outside of leading AI companies argue that model restrictions have gone too far. The Atlantic recently published an [article on AI's "Spicy-Mayo Problem"](https://www.theatlantic.com/ideas/archive/2023/11/ai-safety-regulations-uncensored-models/676076/), which delves into the issues surrounding AI censorship. The titular example describes a chatbot refusing a request for a "dangerously spicy" mayo recipe. Censorship can degrade baseline performance, and in apps for creative work such as Sudowrite, unrestricted open-source models can be a key differentiating value for users.
* **Flexibility & customization:** Closed-source models like GPT-3.5 Turbo are fine for generalized tasks but leave little room for customization, and fine-tuning is highly restricted. Additionally, the headwinds at OpenAI have exposed the [dangerous reality of AI vendor lock-in](https://techcrunch.com/2023/11/21/openai-dangers-vendor-lock-in/). Open-source models such as MPT-7B, Llama 2 and Mistral 7B are designed with extensive flexibility for fine-tuning, so organizations can create custom specifications and optimize model performance for their unique needs. This level of customization and flexibility opens the door to advanced techniques like DPO, PPO, LoRA and more.

For a full list of models available in our cloud, check out our [plans and pricing](/pricing).
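
For readers skimming this commit: the post above lists the models but not an invocation. Here is a minimal sketch of calling one of them through the korvus OpenSourceAI client, whose call shape appears in the diffs below; the system/user messages are illustrative placeholders, not from the post.

import korvus

# Assumes a PostgresML connection string is configured in the environment.
client = korvus.OpenSourceAI()

# Same call shape as the Switch Kit and connect-your-app examples
# updated below, pointed at the new Llama 3.1 model name.
results = client.chat_completions_create(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is PostgresML?"},
    ],
)
print(results)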

pgml-cms/blog/introducing-korvus-the-all-in-one-rag-pipeline-for-postgresml.md

Lines changed: 1 addition & 1 deletion

@@ -100,7 +100,7 @@ async def main():
         "aggregate": {"join": "\n"},
     },
     "chat": {
-        "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+        "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
         "messages": [
             {
                 "role": "system",

pgml-cms/blog/introducing-the-openai-switch-kit-move-from-closed-to-open-source-ai-in-minutes.md

Lines changed: 8 additions & 8 deletions

@@ -44,7 +44,7 @@ The Switch Kit is an open-source AI SDK that provides a drop in replacement for
 const korvus = require("korvus");
 const client = korvus.newOpenSourceAI();
 const results = client.chat_completions_create(
-  "meta-llama/Meta-Llama-3-8B-Instruct",
+  "meta-llama/Meta-Llama-3.1-8B-Instruct",
   [
     {
       role: "system",
@@ -65,7 +65,7 @@ console.log(results);
 import korvus
 client = korvus.OpenSourceAI()
 results = client.chat_completions_create(
-    "meta-llama/Meta-Llama-3-8B-Instruct",
+    "meta-llama/Meta-Llama-3.1-8B-Instruct",
     [
         {
             "role": "system",
@@ -96,7 +96,7 @@ print(results)
   ],
   "created": 1701291672,
   "id": "abf042d2-9159-49cb-9fd3-eef16feb246c",
-  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
   "object": "chat.completion",
   "system_fingerprint": "eecec9d4-c28b-5a27-f90b-66c3fb6cee46",
   "usage": {
@@ -113,7 +113,7 @@ We don't charge per token, so OpenAI “usage” metrics are not particularly re

 !!!

-The above is an example using our open-source AI SDK with Meta-Llama-3-8B-Instruct, an incredibly popular and highly efficient 8 billion parameter model.
+The above is an example using our open-source AI SDK with Meta-Llama-3.1-8B-Instruct, an incredibly popular and highly efficient 8 billion parameter model.

 Notice there is near one to one relation between the parameters and return type of OpenAI’s `chat.completions.create` and our `chat_completion_create`.

@@ -125,7 +125,7 @@ Here is an example of streaming:
 const korvus = require("korvus");
 const client = korvus.newOpenSourceAI();
 const it = client.chat_completions_create_stream(
-  "meta-llama/Meta-Llama-3-8B-Instruct",
+  "meta-llama/Meta-Llama-3.1-8B-Instruct",
   [
     {
       role: "system",
@@ -150,7 +150,7 @@ while (!result.done) {
 import korvus
 client = korvus.OpenSourceAI()
 results = client.chat_completions_create_stream(
-    "meta-llama/Meta-Llama-3-8B-Instruct",
+    "meta-llama/Meta-Llama-3.1-8B-Instruct",
     [
         {
             "role": "system",
@@ -182,7 +182,7 @@ for c in results:
   ],
   "created": 1701296792,
   "id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
-  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
   "object": "chat.completion.chunk",
   "system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
 }
@@ -198,7 +198,7 @@ for c in results:
   ],
   "created": 1701296792,
   "id": "62a817f5-549b-43e0-8f0c-a7cb204ab897",
-  "model": "meta-llama/Meta-Llama-3-8B-Instruct",
+  "model": "meta-llama/Meta-Llama-3.1-8B-Instruct",
   "object": "chat.completion.chunk",
   "system_fingerprint": "f366d657-75f9-9c33-8e57-1e6be2cf62f3"
 }
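
The streaming hunks above show only the head of each call. For completeness, a hedged sketch of a full streaming invocation after this change, assembled from the Python fragments in this diff; the messages are illustrative placeholders.

import korvus

# Assumes a PostgresML connection string is configured in the environment.
client = korvus.OpenSourceAI()

results = client.chat_completions_create_stream(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is PostgresML?"},
    ],
)

# Iterate as the diff context does ("for c in results:"); each c is a
# chat.completion.chunk-style dict like the ones shown above.
for c in results:
    print(c)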

pgml-cms/blog/unified-rag.md

Lines changed: 8 additions & 8 deletions

@@ -51,7 +51,7 @@ Here is an example of the pgml.transform function
 SELECT pgml.transform(
   task => ''{
     "task": "text-generation",
-    "model": "meta-llama/Meta-Llama-3-8B-Instruct"
+    "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"
   }''::JSONB,
   inputs => ARRAY[''AI is going to''],
   args => ''{
@@ -64,7 +64,7 @@ Here is another example of the pgml.transform function
 SELECT pgml.transform(
   task => ''{
     "task": "text-generation",
-    "model": "meta-llama/Meta-Llama-3-70B-Instruct"
+    "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"
   }''::JSONB,
   inputs => ARRAY[''AI is going to''],
   args => ''{
@@ -145,9 +145,9 @@ SELECT * FROM chunks limit 10;
 | id | chunk | chunk_index | document_id |
 | --- | --- | --- | --- |
 | 1 | Here is an example of the pgml.transform function | 1 | 1 |
-| 2 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 2 | 1 |
+| 2 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 2 | 1 |
 | 3 | Here is another example of the pgml.transform function | 3 | 1 |
-| 4 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 4 | 1 |
+| 4 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 4 | 1 |
 | 5 | Here is a third example of the pgml.transform function | 5 | 1 |
 | 6 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); | 6 | 1 |
 | 7 | ae94d3413ae82367c3d0592a67302b25 | 1 | 2 |
@@ -253,8 +253,8 @@ LIMIT 6;
 | 1 | 0.09044166306461232 | Here is an example of the pgml.transform function |
 | 3 | 0.10787954026965096 | Here is another example of the pgml.transform function |
 | 5 | 0.11683694289239333 | Here is a third example of the pgml.transform function |
-| 2 | 0.17699128851412282 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
-| 4 | 0.17844729798760672 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 2 | 0.17699128851412282 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 4 | 0.17844729798760672 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
 | 6 | 0.17520464423854842 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |

 !!!

@@ -330,8 +330,8 @@ FROM (
 | cosine_distance | rank_score | chunk |
 | --- | --- | --- |
-| 0.2124727254737595 | 0.3427378833293915 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
-| 0.2109014406365579 | 0.342184841632843 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 0.2124727254737595 | 0.3427378833293915 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-70B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
+| 0.2109014406365579 | 0.342184841632843 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "meta-llama/Meta-Llama-3.1-8B-Instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
 | 0.21259646694819168 | 0.3332781493663788 | SELECT pgml.transform(\n task => ''{\n "task": "text-generation",\n "model": "microsoft/Phi-3-mini-128k-instruct"\n }''::JSONB,\n inputs => ARRAY[''AI is going to''],\n args => ''{\n "max_new_tokens": 100\n }''::JSONB\n ); |
 | 0.19483324929456136 | 0.03163915500044823 | Here is an example of the pgml.transform function |
 | 0.1685870257610742 | 0.031176624819636345 | Here is a third example of the pgml.transform function |

pgml-cms/docs/SUMMARY.md

Lines changed: 1 addition & 1 deletion

@@ -146,7 +146,7 @@
 * [Explain plans]()
 * [Composition]()
 * [LLMs]()
-* [LLama]()
+* [Llama]()
 * [GPT]()
 * [Facon]()
 * [Glossary]()

pgml-cms/docs/introduction/getting-started/connect-your-app.md

Lines changed: 2 additions & 2 deletions

@@ -42,7 +42,7 @@ const pgml = require("pgml");
 const main = () => {
   const client = pgml.newOpenSourceAI();
   const results = client.chat_completions_create(
-    "meta-llama/Meta-Llama-3-8B-Instruct",
+    "meta-llama/Meta-Llama-3.1-8B-Instruct",
     [
       {
         role: "system",
@@ -66,7 +66,7 @@ import pgml
 async def main():
     client = pgml.OpenSourceAI()
     results = client.chat_completions_create(
-        "meta-llama/Meta-Llama-3-8B-Instruct",
+        "meta-llama/Meta-Llama-3.1-8B-Instruct",
         [
             {
                 "role": "system",
