Commit 67d076d

Document inference w/ preprocessing (#520)
Co-authored-by: Montana Low <montana.low@gmail.com>
1 parent 8220b53 commit 67d076d

2 files changed: 10 additions, 1 deletion

pgml-docs/docs/user_guides/training/preprocessing.md

Lines changed: 9 additions & 1 deletion
    @@ -29,7 +29,7 @@ There are 3 steps to preprocessing data:
     These preprocessing steps may be specified on a per-column basis to the [train()](/user_guides/training/overview/) function. By default, PostgresML does minimal preprocessing on training data, and will raise an error during analysis if NULL values are encountered without a preprocessor. All types other than `TEXT` are treated as quantitative variables and cast to floating point representations before passing them to the underlying algorithm implementations.

     ```postgresql title="pgml.train()"
    -select pgml.train(
    +SELECT pgml.train(
         project_name => 'preprocessed_model',
         task => 'classification',
         relation_name => 'weather_data',

    @@ -52,6 +52,14 @@ In some cases, it may make sense to use multiple steps for a single column. For
     !!! note
         TEXT is used in this document to also refer to VARCHAR and CHAR(N) types.

    +## Predicting with Preprocessors
    +
    +A model that has been trained with preprocessors should use a Postgres tuple for prediction, rather than a `FLOAT4[]`. Tuples may contain multiple different types (like `TEXT` and `BIGINT`), while an ARRAY may only contain a single type. You can use parenthesis around values to create a Postgres tuple.
    +
    +```postgresql title="pgml.predict()"
    +SELECT pgml.predict('preprocessed_model', ('jan', 'nimbus', 0.5, 7));
    +```
    +
     ## Categorical encodings
     Encoding categorical variables is an O(N log(M)) where N is the number of rows, and M is the number of distinct categories.
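
As a rough illustration of the tuple-versus-array distinction described in the new "Predicting with Preprocessors" section, the sketch below uses plain PostgreSQL row constructors. The column names `month`, `cloud_type`, `humidity`, and `temperature` are hypothetical, since the diff does not show the schema of `weather_data`, and the last query assumes the `preprocessed_model` project from the diff has already been trained.

```postgresql
-- Two or more comma-separated values in parentheses form a row constructor,
-- so one value can mix text and numeric fields.
SELECT ('jan', 'nimbus', 0.5, 7) AS features;

-- The ROW keyword is optional when there are two or more fields;
-- this builds the same tuple as the query above.
SELECT ROW('jan', 'nimbus', 0.5, 7) AS features;

-- An array has a single element type, so mixing text and numbers
-- is expected to fail (left commented out so the script keeps running):
-- SELECT ARRAY['jan', 'nimbus', 0.5, 7];

-- Assumption: hypothetical column names. Builds one tuple per table row
-- and passes it to the predict call shown in the diff.
SELECT pgml.predict(
    'preprocessed_model',
    (month, cloud_type, humidity, temperature)
) AS prediction
FROM weather_data;
```

Note that a single value in parentheses is just a parenthesized expression, so a one-column tuple would need the explicit `ROW(...)` form.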
5765

pgml-docs/mkdocs.yml

Lines changed: 1 addition & 0 deletions
    @@ -127,6 +127,7 @@ nav:
     - Training:
       - Training Overview: user_guides/training/overview.md
       - Algorithm Selection: user_guides/training/algorithm_selection.md
    +  - Preprocessing Data: user_guides/training/preprocessing.md
       - Hyperparameter Search: user_guides/training/hyperparameter_search.md
       - Joint Optimization: user_guides/training/joint_optimization.md
     - Predictions:
