Commit 7cbee43

README updates
1 parent 3db857c commit 7cbee43

1 file changed: +24 -7 lines changed

README.md

Lines changed: 24 additions & 7 deletions
@@ -919,26 +919,42 @@ To analyze the distribution of labels in the shuffled dataset, you can use the f
 
 ```sql
 -- Count the occurrences of each label in the shuffled dataset
-SELECT
-label,
+pgml=# SELECT
+class,
 COUNT(*) AS label_count
 FROM pgml.imdb_shuffled_view
-GROUP BY label
-ORDER BY label;
+GROUP BY class
+ORDER BY class;
 
+  class   | label_count
+----------+-------------
+ negative |       25000
+ positive |       25000
+(2 rows)
+```
 
 This query provides insights into the distribution of labels, helping you understand the balance or imbalance of classes in your dataset.
 
 #### 3.2 Sample Records
 To get a glimpse of the data, you can retrieve a sample of records from the shuffled dataset:
 
 ```sql
-Copy code
 -- Retrieve a sample of records from the shuffled dataset
-SELECT *
+pgml=# SELECT LEFT(text,100) AS text, class
 FROM pgml.imdb_shuffled_view
-LIMIT 10; -- Adjust the limit based on the desired number of records
+LIMIT 5;
+                                                 text                                                 | class
+------------------------------------------------------------------------------------------------------+----------
+ This is a VERY entertaining movie. A few of the reviews that I have read on this forum have been wri | positive
+ This is one of those movies where I wish I had just stayed in the bar.<br /><br />The film is quite  | negative
+ Barbershop 2: Back in Business wasn't as good as it's original but was just as funny. The movie itse | negative
+ Umberto Lenzi hits new lows with this recycled trash. Janet Agren plays a lady who is looking for he | negative
+ I saw this movie last night at the Phila. Film festival. It was an interesting and funny movie that  | positive
+(5 rows)
+
+Time: 101.985 ms
 ```
+
 This query allows you to inspect a few records to understand the structure and content of the shuffled data.
 
 #### 3.3 Additional Exploratory Analysis
@@ -1112,6 +1128,7 @@ During training, model is periodically uploaded to Hugging Face Hub. You will fi
 Now, that we have fine-tuned model on Hugging Face Hub, we can use [`pgml.transform`](https://postgresml.org/docs/introduction/apis/sql-extensions/pgml.transform/text-classification) to perform real-time predictions as well as batch predictions.
 
 **Real-time predictions**
+
 Here is an example pgml.transform call for real-time predictions on the newly minted LLM fine-tuned on IMDB review dataset.
 ```sql
 SELECT pgml.transform(
