Montana/nl #1382

Merged: 2 commits, Mar 22, 2024
6 changes: 0 additions & 6 deletions pgml-cms/blog/announcing-support-for-aws-us-east-1-region.md
@@ -27,14 +27,8 @@ To demonstrate the impact of moving the data closer to your application, we've c

<figure><img src=".gitbook/assets/image (8).png" alt=""><figcaption></figcaption></figure>

\


<figure><img src=".gitbook/assets/image (9).png" alt=""><figcaption></figcaption></figure>

\


## Using the New Region

To take advantage of latency savings, you can [deploy a dedicated PostgresML database](https://postgresml.org/signup) in `us-east-1` today. We make it as simple as filling out a very short form and clicking "Create database".
@@ -18,7 +18,7 @@ Montana Low

April 21, 2023

PostgresML makes it easy to generate embeddings from text in your database using a large selection of state-of-the-art models with one simple call to **`pgml.embed`**`(model_name, text)`. Prove the results in this series to your own satisfaction, for free, by signing up for a GPU accelerated database.
PostgresML makes it easy to generate embeddings from text in your database using a large selection of state-of-the-art models with one simple call to `pgml.embed(model_name, text)`. Prove the results in this series to your own satisfaction, for free, by signing up for a GPU accelerated database.
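As a hedged sketch of the call described above (the model name, table, and columns here are illustrative, assuming a PostgresML deployment with the `pgml` extension installed):

```sql
-- Generate an embedding for a single string.
-- 'intfloat/e5-small' is an example model name; any supported
-- Hugging Face embedding model could be substituted.
SELECT pgml.embed('intfloat/e5-small', 'PostgresML makes embeddings easy');

-- Embed an existing text column in bulk.
-- The documents table and its id/body columns are hypothetical.
SELECT id, pgml.embed('intfloat/e5-small', body) AS embedding
FROM documents;
```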

This article is the first in a multipart series that will show you how to build a post-modern semantic search and recommendation engine, including personalization, using open source models.

@@ -216,9 +216,6 @@ For comparison, it would cost about $299 to use OpenAI's cheapest embedding mode
| GPU | 17ms | $72 | 6 hours |
| OpenAI | 300ms | $299 | millennia |

\


You can also find embedding models that outperform OpenAI's `text-embedding-ada-002` model across many different tests on the [leaderboard](https://huggingface.co/spaces/mteb/leaderboard). It's always best to do your own benchmarking with your data, models, and hardware to find the best fit for your use case.

> _HTTP requests to a different datacenter cost more time and money for lower reliability than co-located compute and storage._
3 changes: 0 additions & 3 deletions pgml-cms/blog/meet-us-at-the-2024-postgres-conference.md
@@ -22,7 +22,6 @@ Why should you care? It's not every day you get to dive headfirst into the world
Save 25% on your ticket with our discount code: 2024\_POSTGRESML\_25
{% endhint %}

\
PostgresML CEO and founder, Montana Low, will kick off the event on April 17th with a keynote about navigating the confluence of hardware evolution and machine learning technology.&#x20;

We’ll also be hosting a masterclass in retrieval augmented generation (RAG) on April 18th. Our own Silas Marvin will give hands-on guidance to equip you with the ability to implement RAG directly within your database.&#x20;
@@ -36,5 +35,3 @@ If you’d like some 1:1 time with our team at PgConf [contact us here](https://
So, why sit on the sidelines when you could be right in the thick of it, soaking up knowledge, making connections, and maybe even stumbling upon your next big breakthrough? Clear your schedule, grab your ticket, and get ready to geek out with us in San Jose.

See you there!

\
6 changes: 0 additions & 6 deletions pgml-cms/blog/mindsdb-vs-postgresml.md
@@ -47,9 +47,6 @@ Both Projects integrate several dozen machine learning algorithms, including the
| Full Text Search | - | ✅ |
| Geospatial Search | - | ✅ |

\


Both MindsDB and PostgresML support many classical machine learning algorithms to do classification and regression. They are both able to load ~~the latest LLMs~~ some models from Hugging Face, supported by underlying implementations in libtorch. I had to cross that out after exploring all the caveats in the MindsDB implementations. PostgresML supports the models released immediately as long as underlying dependencies are met. MindsDB has to release an update to support any new models, and their current model support is extremely limited. New algorithms, tasks, and models are constantly released, so it's worth checking the documentation for the latest list.

Another difference is that PostgresML also supports embedding models, and closely integrates them with vector search inside the database, which is well beyond the scope of MindsDB, since it's not a database at all. PostgresML has direct access to all the functionality provided by other Postgres extensions, like vector indexes from [pgvector](https://github.com/pgvector/pgvector) to perform efficient KNN & ANN vector recall, or [PostGIS](http://postgis.net/) for geospatial information as well as built in full text search. Multiple algorithms and extensions can be combined in compound queries to build state-of-the-art systems, like search and recommendations or fraud detection that generate an end to end result with a single query, something that might take a dozen different machine learning models and microservices in a more traditional architecture.
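A minimal sketch of such a compound query, assuming a hypothetical `documents` table with a precomputed pgvector `embedding` column and an illustrative model name (not taken from the original):

```sql
-- Semantic search in a single query: embed the query text in-database,
-- then rank stored rows by cosine distance with the pgvector <=> operator.
-- Table, column, and model names are hypothetical.
SELECT id, title
FROM documents
ORDER BY embedding <=> pgml.embed(
    'intfloat/e5-small',
    'how do I deploy a database?'
)::vector
LIMIT 10;
```

With a pgvector index on the `embedding` column, the ordering becomes an ANN scan rather than a full table sort, which is what makes the single-query architecture practical at scale.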
@@ -300,9 +297,6 @@ PostgresML is the clear winner in terms of performance. It seems to me that it c
| translation\_en\_to\_es | t5-base | 1573 | 1148 | 294 |
| summarization | sshleifer/distilbart-cnn-12-6 | 4289 | 3450 | 479 |

\


There is a general trend: the larger and slower the model, the more work is spent inside libtorch and the less the performance of the rest of the stack matters; but for interactive models and use cases there is a significant difference. We've tried to cover the most generous use case we could between these two. If we were to compare XGBoost or other classical algorithms, which can have sub-millisecond prediction times in PostgresML, the 20ms Python service overhead MindsDB incurs just to parse the incoming query would be hundreds of times slower.
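The classical train-then-predict workflow behind that comparison can be sketched as follows; this is a hedged example, with hypothetical project, table, and column names, using the `pgml.train` and `pgml.predict` calls discussed in this comparison:

```sql
-- Train an XGBoost regression model on a table; PostgresML snapshots
-- the data and stores the resulting model inside the database.
-- 'price_model', 'listings', and the column names are hypothetical.
SELECT * FROM pgml.train(
    project_name  => 'price_model',
    task          => 'regression',
    relation_name => 'listings',
    y_column_name => 'price',
    algorithm     => 'xgboost'
);

-- Predictions run in-process with the data, with no network hop,
-- which is where the sub-millisecond latencies come from.
SELECT pgml.predict('price_model', ARRAY[bedrooms, bathrooms, sqft])
FROM listings
LIMIT 5;
```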

## Clouds
2 changes: 1 addition & 1 deletion pgml-cms/docs/api/apis.md
@@ -11,7 +11,7 @@ We also provide Client SDKs that implement the best practices on top of the SQL
## SQL Extension

PostgreSQL is designed to be _**extensible**_. This has created a rich open-source ecosystem of additional functionality built around the core project. Some [extensions](https://www.postgresql.org/docs/current/contrib.html) are included in the base Postgres distribution, but others are also available via the [PostgreSQL Extension Network](https://pgxn.org/).\
\

There are 2 foundational extensions included in a PostgresML deployment that provide functionality inside the database through SQL APIs.

* **pgml** - provides Machine Learning and Artificial Intelligence APIs with access to more than 50 ML algorithms to train classification, clustering and regression models on your own data, or you can perform dozens of tasks with thousands of models downloaded from HuggingFace.
@@ -35,7 +35,6 @@ Both Projects integrate several dozen machine learning algorithms, including the
| Full Text Search | - | ✅ |
| Geospatial Search | - | ✅ |

\
Both MindsDB and PostgresML support many classical machine learning algorithms to do classification and regression. They are both able to load ~~the latest LLMs~~ some models from Hugging Face, supported by underlying implementations in libtorch. I had to cross that out after exploring all the caveats in the MindsDB implementations. PostgresML supports the models released immediately as long as underlying dependencies are met. MindsDB has to release an update to support any new models, and their current model support is extremely limited. New algorithms, tasks, and models are constantly released, so it's worth checking the documentation for the latest list.

Another difference is that PostgresML also supports embedding models, and closely integrates them with vector search inside the database, which is well beyond the scope of MindsDB, since it's not a database at all. PostgresML has direct access to all the functionality provided by other Postgres extensions, like vector indexes from [pgvector](https://github.com/pgvector/pgvector) to perform efficient KNN & ANN vector recall, or [PostGIS](http://postgis.net/) for geospatial information as well as built in full text search. Multiple algorithms and extensions can be combined in compound queries to build state-of-the-art systems, like search and recommendations or fraud detection that generate an end to end result with a single query, something that might take a dozen different machine learning models and microservices in a more traditional architecture.