|
19 | 19 | </h2>
|
20 | 20 |
|
21 | 21 | <p align="center">
|
22 |
| - Simple machine learning with |
| 22 | + Generative AI with |
23 | 23 | <a href="https://www.postgresql.org/" target="_blank">PostgreSQL</a>
|
24 | 24 | </p>
|
25 | 25 |
|
@@ -408,9 +408,61 @@ SELECT pgml.transform(
|
408 | 408 | ]
|
409 | 409 | ```
|
410 | 410 | ## Token Classification
|
411 |
| -## Table Question Answering |
412 |
| -## Question Answering |
| 411 | +Token classification is a natural language understanding task in which labels are assigned to some or all of the tokens in a text. Popular subtasks include Named Entity Recognition (NER) and Part-of-Speech (PoS) tagging. NER models can be trained to identify specific entities in a text, such as individuals, places, and dates. PoS tagging, on the other hand, identifies the part of speech of each token, such as noun, verb, or punctuation mark. |
| 412 | + |
| 413 | + |
413 | 414 |
|
| 415 | +### Named Entity Recognition |
| 416 | +Named Entity Recognition (NER) is a task that involves identifying named entities in a text, such as the names of people, locations, or organizations. The model labels each token with the class of the named entity it belongs to, and with a class named "O" (outside) for tokens that are not part of any entity. In this task, the input is text, and the output is the text annotated with named entities. |
| 417 | + |
| 418 | +```sql |
| 419 | +SELECT pgml.transform( |
| 420 | + inputs => ARRAY[ |
| 421 | + 'I am Omar and I live in New York City.' |
| 422 | + ], |
| 423 | + task => 'token-classification' |
| 424 | +) as ner; |
| 425 | +``` |
| 426 | +*Result* |
| 427 | +```sql |
| 428 | + ner |
| 429 | +------------------------------------------------------ |
| 430 | +[[ |
| 431 | + {"end": 9, "word": "Omar", "index": 3, "score": 0.997110, "start": 5, "entity": "I-PER"}, |
| 432 | + {"end": 27, "word": "New", "index": 8, "score": 0.999372, "start": 24, "entity": "I-LOC"}, |
| 433 | + {"end": 32, "word": "York", "index": 9, "score": 0.999355, "start": 28, "entity": "I-LOC"}, |
| 434 | + {"end": 37, "word": "City", "index": 10, "score": 0.999431, "start": 33, "entity": "I-LOC"} |
| 435 | +]] |
| 436 | +``` |
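The result above lists one object per token, so a multi-word entity such as "New York City" arrives as three separate rows. The query below is a minimal sketch of how adjacent tokens with the same label could be merged into one span using only stock PostgreSQL JSONB functions. It works on a literal copy of the result shown above rather than calling the model, and the names `result`, `tokens`, `islands`, `entity_text`, `entity_type`, `start_offset`, and `end_offset` are illustrative assumptions, not part of the extension.

```sql
-- Literal copy of the NER result above, so the sketch runs without a model.
WITH result(ner) AS (
    SELECT '[[
      {"end": 9,  "word": "Omar", "index": 3,  "score": 0.997110, "start": 5,  "entity": "I-PER"},
      {"end": 27, "word": "New",  "index": 8,  "score": 0.999372, "start": 24, "entity": "I-LOC"},
      {"end": 32, "word": "York", "index": 9,  "score": 0.999355, "start": 28, "entity": "I-LOC"},
      {"end": 37, "word": "City", "index": 10, "score": 0.999431, "start": 33, "entity": "I-LOC"}
    ]]'::jsonb
),
tokens AS (
    -- The outer array has one element per input text; take the first one.
    SELECT t.*
    FROM result,
         jsonb_to_recordset(result.ner -> 0)
             AS t("index" int, word text, entity text, "start" int, "end" int)
),
islands AS (
    -- Tokens with consecutive "index" values and the same label
    -- share a constant (index - row_number) value.
    SELECT *,
           "index" - row_number() OVER (PARTITION BY entity ORDER BY "index") AS grp
    FROM tokens
)
SELECT string_agg(word, ' ' ORDER BY "index") AS entity_text,
       split_part(min(entity), '-', 2)        AS entity_type,
       min("start")                           AS start_offset,
       max("end")                             AS end_offset
FROM islands
GROUP BY entity, grp
ORDER BY start_offset;
```

For this input the query yields two rows: `Omar` (`PER`, offsets 5 to 9) and `New York City` (`LOC`, offsets 24 to 37). The sketch assumes whole-word tokens and treats any run of adjacent tokens with the same label as one entity; models that emit subword pieces or distinguish `B-` from `I-` tags would need extra handling.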
| 437 | + |
| 438 | +### Part-of-Speech (PoS) Tagging |
| 439 | +PoS tagging is a task that involves identifying the parts of speech, such as nouns, pronouns, adjectives, or verbs, in a given text. In this task, the model labels each word with a specific part of speech. |
| 440 | + |
| 441 | +To use a PoS tagging model, look for token classification models with `pos` in their name on the :hugs: Hugging Face model hub. |
| 442 | +```sql |
| 443 | +select pgml.transform( |
| 444 | + inputs => array [ |
| 445 | + 'I live in Amsterdam.' |
| 446 | + ], |
| 447 | + task => '{"task": "token-classification", |
| 448 | + "model": "vblagoje/bert-english-uncased-finetuned-pos" |
| 449 | + }'::JSONB |
| 450 | +) as pos; |
| 451 | +``` |
| 452 | +*Result* |
| 453 | +```sql |
| 454 | + pos |
| 455 | +------------------------------------------------------ |
| 456 | +[[ |
| 457 | + {"end": 1, "word": "i", "index": 1, "score": 0.999, "start": 0, "entity": "PRON"}, |
| 458 | + {"end": 6, "word": "live", "index": 2, "score": 0.998, "start": 2, "entity": "VERB"}, |
| 459 | + {"end": 9, "word": "in", "index": 3, "score": 0.999, "start": 7, "entity": "ADP"}, |
| 460 | + {"end": 19, "word": "amsterdam", "index": 4, "score": 0.998, "start": 10, "entity": "PROPN"}, |
| 461 | + {"end": 20, "word": ".", "index": 5, "score": 0.999, "start": 19, "entity": "PUNCT"} |
| 462 | +]] |
| 463 | +``` |
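Because the result is JSONB, it can be unpacked into ordinary rows for joining or filtering. The following is a small sketch using the standard `jsonb_to_recordset` function on a literal copy of the PoS result shown above; the CTE name `result` and the column alias `tag` are illustrative assumptions.

```sql
-- Literal copy of the PoS result above, so the sketch runs without a model.
WITH result(pos) AS (
    SELECT '[[
      {"end": 1,  "word": "i",         "index": 1, "score": 0.999, "start": 0,  "entity": "PRON"},
      {"end": 6,  "word": "live",      "index": 2, "score": 0.998, "start": 2,  "entity": "VERB"},
      {"end": 9,  "word": "in",        "index": 3, "score": 0.999, "start": 7,  "entity": "ADP"},
      {"end": 19, "word": "amsterdam", "index": 4, "score": 0.998, "start": 10, "entity": "PROPN"},
      {"end": 20, "word": ".",         "index": 5, "score": 0.999, "start": 19, "entity": "PUNCT"}
    ]]'::jsonb
)
-- One row per token; drop punctuation to keep only words.
SELECT t."index", t.word, t.entity AS tag, t.score
FROM result,
     jsonb_to_recordset(result.pos -> 0)
         AS t("index" int, word text, entity text, score float8)
WHERE t.entity <> 'PUNCT'
ORDER BY t."index";
```

In practice the literal would be replaced by the `pgml.transform(...)` call from the example above, assuming it returns JSONB in the shape shown.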
| 464 | +## Question Answering |
| 465 | +## Table Question Answering |
414 | 466 | ## Translation
|
415 | 467 | ## Summarization
|
416 | 468 | ## Conversational
|
|