Support instructor embeddings #596

Merged
merged 3 commits into from
Apr 24, 2023
16 changes: 8 additions & 8 deletions .github/workflows/ci.yml
@@ -34,7 +34,7 @@ jobs:
path: |
~/.cargo
pgml-extension/target
~/.pgx
~/.pgrx
key: ${{ runner.os }}-rust-${{ hashFiles('pgml-extension/Cargo.lock') }}
- name: Submodules
run: |
@@ -43,20 +43,20 @@ jobs:
run: |
curl https://sh.rustup.rs -sSf | sh -s -- -y
source ~/.cargo/env
cargo install cargo-pgx --version "0.7.1"
cargo install cargo-pgrx --version "0.7.1"

if [[ ! -d ~/.pgx ]]; then
cargo pgx init
if [[ ! -d ~/.pgrx ]]; then
cargo pgrx init
fi

cargo pgx test
cargo pgrx test

cargo pgx stop
cargo pgx start
cargo pgrx stop
cargo pgrx start

# psql -p 28813 -h 127.0.0.1 -d pgml -P pager -f tests/test.sql

cargo pgx stop
cargo pgrx stop



18 changes: 9 additions & 9 deletions .github/workflows/package-extension.yml
@@ -79,47 +79,47 @@ jobs:
curl -sLO https://github.com/deb-s3/deb-s3/releases/download/0.11.4/deb-s3-0.11.4.gem
sudo gem install deb-s3-0.11.4.gem
dpkg-deb --version
- name: Install pgx
- name: Install pgrx
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: install
args: cargo-pgx --version "0.7.1"
- name: pgx init
args: cargo-pgrx --version "0.7.1"
- name: pgrx init
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: pgx
command: pgrx
args: init --pg11=/usr/lib/postgresql/11/bin/pg_config --pg12=/usr/lib/postgresql/12/bin/pg_config --pg13=/usr/lib/postgresql/13/bin/pg_config --pg14=/usr/lib/postgresql/14/bin/pg_config --pg15=/usr/lib/postgresql/15/bin/pg_config
- name: Build Postgres 11
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: pgx
command: pgrx
args: package --pg-config /usr/lib/postgresql/11/bin/pg_config
- name: Build Postgres 12
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: pgx
command: pgrx
args: package --pg-config /usr/lib/postgresql/12/bin/pg_config
- name: Build Postgres 13
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: pgx
command: pgrx
args: package --pg-config /usr/lib/postgresql/13/bin/pg_config
- name: Build Postgres 14
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: pgx
command: pgrx
args: package --pg-config /usr/lib/postgresql/14/bin/pg_config
- name: Build Postgres 15
uses: postgresml/gh-actions-cargo@master
with:
working-directory: pgml-extension
command: pgx
command: pgrx
args: package --pg-config /usr/lib/postgresql/15/bin/pg_config
- name: Build debs
env:
@@ -3,7 +3,7 @@
## PostgresML

```
cargo pgx run --release
cargo pgrx run --release
```

### Schema
@@ -170,7 +170,7 @@ Spoiler alert: idiomatic Rust is about 10x faster than native SQL, embedded PL/p
LIMIT 1;
```

We're building with the Rust [pgx](https://github.com/tcdi/pgx/tree/master/pgx) crate that makes our development cycle even nicer than the one we use to manage Python. It really streamlines creating an extension in Rust, so all we have to worry about is writing our functions. It took about an hour to port all of our vector operations to Rust with BLAS support, and another week to port all the "business logic" for maintaining model training and deployment. We've even gained some new capabilities for caching models across connections (independent processes), now that we have access to Postgres shared memory, without having to worry about Python's GIL and GC. This is the dream of Apache's Arrow project, realized for our applications, without having to change the world, just our implementations. 🤩 Single-copy end-to-end machine learning, with parallel processing and shared data access.
We're building with the Rust [pgrx](https://github.com/tcdi/pgrx/tree/master/pgrx) crate that makes our development cycle even nicer than the one we use to manage Python. It really streamlines creating an extension in Rust, so all we have to worry about is writing our functions. It took about an hour to port all of our vector operations to Rust with BLAS support, and another week to port all the "business logic" for maintaining model training and deployment. We've even gained some new capabilities for caching models across connections (independent processes), now that we have access to Postgres shared memory, without having to worry about Python's GIL and GC. This is the dream of Apache's Arrow project, realized for our applications, without having to change the world, just our implementations. 🤩 Single-copy end-to-end machine learning, with parallel processing and shared data access.

## What about XGBoost and friends?
ML isn't just about basic math and a little bit of business logic. It's about all those complicated algorithms beyond linear regression for gradient boosting and deep learning. The good news is that most of these libraries are implemented in C/C++, and just have Python bindings. There are also bindings for Rust ([lightgbm](https://github.com/vaaaaanquish/lightgbm-rs), [xgboost](https://github.com/davechallis/rust-xgboost), [tensorflow](https://github.com/tensorflow/rust), [torch](https://github.com/LaurentMazare/tch-rs)).
22 changes: 11 additions & 11 deletions pgml-docs/docs/developer_guide/overview.md
@@ -62,7 +62,7 @@ The development environment for each differs slightly, but overall we use Python

## Postgres extension

PostgresML is a Rust extension written with `tcdi/pgx` crate. Local development therefore requires the [latest Rust compiler](https://www.rust-lang.org/learn/get-started) and PostgreSQL development headers and libraries.
PostgresML is a Rust extension written with `tcdi/pgrx` crate. Local development therefore requires the [latest Rust compiler](https://www.rust-lang.org/learn/get-started) and PostgreSQL development headers and libraries.

The extension code is located in:

@@ -72,17 +72,17 @@ cd pgml-extension/

You'll need to install basic dependencies

Once there, you can initialize `pgx` and get going:
Once there, you can initialize `pgrx` and get going:

#### Pgx command line and environments
```commandline
cargo install cargo-pgx --version "0.7.1" && \
cargo pgx init # This will take a few minutes
cargo install cargo-pgrx --version "0.7.4" && \
cargo pgrx init # This will take a few minutes
```

#### Update postgresql.conf

`pgx` uses Postgres 15 by default. Since `pgml` is using shared memory, you need to add it to `shared_preload_libraries` in `postgresql.conf` which, for `pgx`, is located in `~/.pgx/data-15/postgresql.conf`.
`pgrx` uses Postgres 15 by default. Since `pgml` is using shared memory, you need to add it to `shared_preload_libraries` in `postgresql.conf` which, for `pgrx`, is located in `~/.pgrx/data-15/postgresql.conf`.
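
The setting can also be applied from the command line. The following is a minimal sketch, not part of `pgrx` or PostgresML: `ensure_preload` is a hypothetical helper name, and it assumes a fresh data directory in which no other library is preloaded, because the appended line overrides any earlier `shared_preload_libraries` entry in the file.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption for illustration, not shipped with
# pgrx or PostgresML): make sure a library is named in
# shared_preload_libraries inside the given postgresql.conf.
ensure_preload() {
  local conf="$1" lib="$2"
  # If an active (uncommented) setting already names the library, do nothing.
  if grep -Eq "^[[:space:]]*shared_preload_libraries[[:space:]]*=.*${lib}" "$conf"; then
    return 0
  fi
  # Otherwise append a new setting. When a parameter appears more than once
  # in postgresql.conf, the last occurrence is the one Postgres uses.
  printf "shared_preload_libraries = '%s'\n" "$lib" >> "$conf"
}
```

With the path given above, the call would be `ensure_preload ~/.pgrx/data-15/postgresql.conf pgml`. The database has to be restarted afterwards for the change to take effect. The line the helper writes is the one shown next.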

```
shared_preload_libraries = 'pgml' # (change requires restart)
@@ -91,19 +91,19 @@ shared_preload_libraries = 'pgml' # (change requires restart)
Run the unit tests

```commandline
cargo pgx test
cargo pgrx test
```

Run the integration tests:
```commandline
cargo pgx run --release
cargo pgrx run --release
psql -h localhost -p 28813 -d pgml -f tests/test.sql -P pager
```

Run an interactive psql session

```commandline
cargo pgx run
cargo pgrx run
```

Create the extension in your database:
@@ -147,10 +147,10 @@ By default, the extension is built without CUDA support for XGBoost and LightGBM


```commandline
CUDACXX=/usr/local/cuda/bin/nvcc cargo pgx run --release --features pg15,python,cuda
CUDACXX=/usr/local/cuda/bin/nvcc cargo pgrx run --release --features pg15,python,cuda
```

If you ever want to reset the environment, simply spin up the database with `cargo pgx run` and drop the extension and metadata tables:
If you ever want to reset the environment, simply spin up the database with `cargo pgrx run` and drop the extension and metadata tables:

```postgresql
DROP EXTENSION IF EXISTS pgml CASCADE;
@@ -190,7 +190,7 @@ Basic installation can be achieved with:
cd postgresml/pgml-dashboard
```

2. Set the `DATABASE_URL` environment variable, for example to a running interactive `cargo pgx run` session started previously:
2. Set the `DATABASE_URL` environment variable, for example to a running interactive `cargo pgrx run` session started previously:
```commandline
export DATABASE_URL=postgres://localhost:28815/pgml
```
22 changes: 11 additions & 11 deletions pgml-docs/docs/user_guides/setup/v2/installation.md
@@ -92,25 +92,25 @@ sudo apt-get install postgresql
cd pgml-extension
```

5. Install [`pgx`](https://github.com/tcdi/pgx) and build the extension (this will take a few minutes):
5. Install [`pgrx`](https://github.com/tcdi/pgrx) and build the extension (this will take a few minutes):

**With Python support:**

```bash
export POSTGRES_VERSION=15
cargo install cargo-pgx --version "0.7.1" && \
cargo pgx init --pg${POSTGRES_VERSION} /usr/bin/pg_config && \
cargo pgx package
cargo install cargo-pgrx --version "0.7.4" && \
cargo pgrx init --pg${POSTGRES_VERSION} /usr/bin/pg_config && \
cargo pgrx package
```

**Without Python support:**

```bash
export POSTGRES_VERSION=15
cp docker/Cargo.toml.no-python Cargo.toml && \
cargo install cargo-pgx --version "0.7.1" && \
cargo pgx init --pg${POSTGRES_VERSION} /usr/bin/pg_config && \
cargo pgx package
cargo install cargo-pgrx --version "0.7.4" && \
cargo pgrx init --pg${POSTGRES_VERSION} /usr/bin/pg_config && \
cargo pgrx package
```

6. Copy the extension binaries into Postgres system folders:
@@ -152,12 +152,12 @@ sudo apt-get install postgresql
For example, `openssl` requires some environment variables set in `~/.zsh` for
the compiler to find the library.

4. Install [`pgx`](https://github.com/tcdi/pgx) and build the extension (this will take a few minutes):
4. Install [`pgrx`](https://github.com/tcdi/pgrx) and build the extension (this will take a few minutes):

```
cargo install cargo-pgx && \
cargo pgx init --pg15 /usr/bin/pg_config && \
cargo pgx install
cargo install cargo-pgrx && \
cargo pgrx init --pg15 /usr/bin/pg_config && \
cargo pgrx install
```

