add a new example by montanalow · Pull Request #14 · postgresml/postgresml · GitHub
add a new example #14

Merged
montanalow merged 2 commits into master from montana/api
Apr 17, 2022
Conversation

montanalow
Contributor

No description provided.

@montanalow montanalow requested a review from levkk April 17, 2022 04:32

from pgml.exceptions import PgMLException
from pgml.sql import q

def flatten(S):

Contributor

Are you sure this won't blow the stack on a large dataset?

Contributor Author

It’s called per row, and I haven’t seen datasets with more than 4D arrays.
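
To make the stack-depth question concrete, here is a minimal sketch of two ways to flatten a nested row. This is not the `flatten` from the PR (its body is not shown above); the names `flatten_recursive` and `flatten_iterative` are hypothetical. In the recursive form the call depth follows the nesting depth of the row, which matches the "no more than 4D arrays" argument; the iterative form avoids Python's recursion limit entirely.

```python
from typing import Any, List


def flatten_recursive(values: List[Any]) -> List[Any]:
    """Flatten nested lists; recursion depth equals the nesting depth."""
    flat: List[Any] = []
    for value in values:
        if isinstance(value, (list, tuple)):
            flat.extend(flatten_recursive(value))
        else:
            flat.append(value)
    return flat


def flatten_iterative(values: List[Any]) -> List[Any]:
    """Flatten nested lists with an explicit stack of iterators (no recursion)."""
    flat: List[Any] = []
    stack = [iter(values)]
    while stack:
        try:
            value = next(stack[-1])
        except StopIteration:
            stack.pop()  # finished this nesting level
            continue
        if isinstance(value, (list, tuple)):
            stack.append(iter(value))  # descend one level
        else:
            flat.append(value)
    return flat
```

Either version preserves element order. The iterative one is only worth the extra code if rows can be nested arbitrarily deep.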

SELECT models.*
FROM pgml.models
WHERE project_id = {q(project.id)}
ORDER by models.metrics->>{q(metric)} DESC

Contributor

@levkk levkk Apr 17, 2022

Why not flatten (normalize) the structure into the table?

Contributor Author

Relevant metrics are different depending on the objective. We could have another join table to hold just metrics per model, but that seems like overkill for now.

Contributor

@levkk levkk Apr 17, 2022

I was just thinking of making them nullable and only filling in the relevant columns for the model being trained.
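
One practical difference between the two designs discussed here: the `->>` operator returns its result as text, so an `ORDER BY metrics->>'…'` without a cast compares strings, whereas a typed numeric column would compare numbers. The sketch below imitates that in Python. The rows, the metric name `score`, and its values are made up for illustration and do not come from `pgml.models`.

```python
# Hypothetical rows standing in for pgml.models; values are invented.
rows = [
    {"id": 1, "metrics": {"score": 9.5}},
    {"id": 2, "metrics": {"score": 10.2}},
    {"id": 3, "metrics": {"score": 0.7}},
]


def metric_as_text(row, metric):
    """Mimic `metrics->>metric`: the extracted value is text, not a number."""
    return str(row["metrics"][metric])


def rank_by_text(rows, metric):
    """Descending order on the text value (string comparison)."""
    ordered = sorted(rows, key=lambda r: metric_as_text(r, metric), reverse=True)
    return [r["id"] for r in ordered]


def rank_by_number(rows, metric):
    """Descending order after casting the text back to a number."""
    ordered = sorted(rows, key=lambda r: float(metric_as_text(r, metric)), reverse=True)
    return [r["id"] for r in ordered]
```

With these rows, `rank_by_text(rows, "score")` puts the model scoring 9.5 ahead of the one scoring 10.2, while `rank_by_number` does not. Exact text ordering in PostgreSQL also depends on collation, so treat this as an approximation of the behavior rather than a reproduction of it.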

---
--- Predict
---
CREATE OR REPLACE FUNCTION pgml.predict(project_name TEXT, VARIADIC features DOUBLE PRECISION[])
CREATE OR REPLACE FUNCTION pgml.predict(project_name TEXT, features NUMERIC[])

Contributor

@levkk levkk Apr 17, 2022

I'm wondering about this because variadic allows us to pass columns as arguments, e.g.:

SELECT pgml.predict('Red Wine Quality', quality_wine_red.acidity, quality_wine_red.color, ...)
FROM quality_wine_red
WHERE ...

I see this to be the more likely use case than passing in some raw numbers.

Contributor Author

That’s true, but you can always put those columns in an array just like this, and having the features as a single param will allow us to extend the API with additional params in the future if we need to.
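
As a rough illustration of the two call shapes being compared, the helper below renders the `pgml.predict(...)` expression for the same columns in both forms. It is a sketch only: `quote_literal` and `predict_call` are hypothetical names, not part of `pgml`, and column names are interpolated as trusted identifiers. Depending on the column types, the array form may also need an explicit cast to `NUMERIC[]`.

```python
from typing import Sequence


def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def predict_call(project_name: str, columns: Sequence[str], variadic: bool = False) -> str:
    """Render a pgml.predict(...) expression.

    variadic=True  -> columns passed as separate arguments (old signature)
    variadic=False -> columns wrapped in one ARRAY[...] argument (new signature)
    Column names are assumed to be trusted identifiers.
    """
    args = ", ".join(columns)
    if not variadic:
        args = f"ARRAY[{args}]"
    return f"pgml.predict({quote_literal(project_name)}, {args})"


columns = ["quality_wine_red.acidity", "quality_wine_red.color"]
print("SELECT " + predict_call("Red Wine Quality", columns, variadic=True))
print("SELECT " + predict_call("Red Wine Quality", columns))
```

The design point in the reply is that in PostgreSQL a `VARIADIC` parameter has to come last among the input parameters, so the single-array form leaves room to append further parameters later.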

Contributor

levkk commented Apr 17, 2022

Bump the version in pgml/__init__.py so this gets released to PyPI.

@montanalow montanalow merged commit 53c7e43 into master Apr 17, 2022
@montanalow montanalow deleted the montana/api branch May 4, 2022 17:34
Sasasu added a commit to Sasasu/postgresml that referenced this pull request Aug 18, 2023
pgml is not compatible with plpython: if both pgml and plpython are used in the
same session, PostgreSQL will crash.

minimal reproducible example:

```sql
SELECT pgml.embed('intfloat/e5-small', 'hi mom');

create or replace function pyudf()
returns int as
$$
return 0
$$ language 'plpython3u';
```

the call stack:

```
 Stack trace of thread 161970:
 #0  0x00007efc1429edb8 PyImport_Import (libpython3.9.so.1.0 + 0x9edb8)
 #1  0x00007efc1429f125 PyImport_ImportModule (libpython3.9.so.1.0 + 0x9f125)
 #2  0x00007efb04b0f496 n/a (plpython3.so + 0x10496)
 #3  0x00007efb04b1039d plpython3_validator (plpython3.so + 0x1139d)
 #4  0x0000559d0cdbc5c2 OidFunctionCall1Coll (postgres + 0x6465c2)
 #5  0x0000559d0c9d68bb ProcedureCreate (postgres + 0x2608bb)
 #6  0x0000559d0ca5030c CreateFunction (postgres + 0x2da30c)
 #7  0x0000559d0ce1c730 n/a (postgres + 0x6a6730)
 #8  0x0000559d0cc5a030 standard_ProcessUtility (postgres + 0x4e4030)
 #9  0x0000559d0cc545ed n/a (postgres + 0x4de5ed)
 #10 0x0000559d0cc546e7 n/a (postgres + 0x4de6e7)
 #11 0x0000559d0cc54beb PortalRun (postgres + 0x4debeb)
 #12 0x0000559d0cc55249 n/a (postgres + 0x4df249)
 #13 0x0000559d0cc576f0 PostgresMain (postgres + 0x4e16f0)
 #14 0x0000559d0cbc3e9c n/a (postgres + 0x44de9c)
 #15 0x0000559d0cbc50aa PostmasterMain (postgres + 0x44f0aa)
 #16 0x0000559d0c8ce7d2 main (postgres + 0x1587d2)
 #17 0x00007efc18427cd0 n/a (libc.so.6 + 0x27cd0)
 #18 0x00007efc18427d8a __libc_start_main (libc.so.6 + 0x27d8a)
 #19 0x0000559d0c8cee15 _start (postgres + 0x158e15)
```

this is because PostgreSQL is using dlopen(RTLD_GLOBAL). as a result, some of
the symbols are resolved into the previously opened .so file, while the others
use a relative offset in pgml.so, which causes a null-pointer crash.

this commit hides all symbols except the UDF symbols (ending with `_wrapper`)
and the magic symbols (`_PG_init`, `Pg_magic_func`), so dlopen(RTLD_GLOBAL)
will resolve the symbols to the correct position.
SilasMarvin pushed a commit that referenced this pull request Oct 5, 2023
Fetched URL: https://github.com/postgresml/postgresml/pull/14
