Reorganized the SDK directory #954

Merged: 16 commits merged on Aug 28, 2023
Changes from 1 commit
Updated README and added requirements.txt
SilasMarvin committed Aug 28, 2023
commit 59aa1e9163307eb922ac981ffcb178901dbe916a
26 changes: 17 additions & 9 deletions pgml-sdks/pgml/python/examples/README.md
@@ -1,20 +1,28 @@
## Examples
# Examples

### [Semantic Search](./semantic_search.py)
## Prerequisites
Before running any examples, first install the dependencies and set the `DATABASE_URL` environment variable:
```
pip install -r requirements.txt
export DATABASE_URL={YOUR DATABASE URL}
```

Optionally, configure a .env file containing a DATABASE_URL variable.
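As a quick illustration (not taken from the examples themselves), a `.env` file can be loaded with the `python-dotenv` package listed in `requirements.txt`; the exact loading code in each example may differ:

```python
# Example .env contents (placeholder URL, replace with your own):
# DATABASE_URL=postgres://user:password@localhost:5432/pgml

import os
from dotenv import load_dotenv  # provided by python-dotenv in requirements.txt

load_dotenv()  # reads .env from the current working directory
database_url = os.environ["DATABASE_URL"]  # the pgml SDK connects using this URL
```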

## [Semantic Search](./semantic_search.py)
This is a basic example to perform semantic search on a collection of documents. It loads the Quora dataset, creates a collection in a PostgreSQL database, upserts documents, generates chunks and embeddings, and then performs a vector search on a query. Embeddings are created using the `intfloat/e5-small` model. The results are documents semantically similar to the query. Finally, the collection is archived.
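As a rough sketch of that flow (not the literal contents of `semantic_search.py`, and assuming the 0.9-era SDK surface of `Collection`, `Model`, `Splitter`, and `Pipeline`), the example roughly follows this shape:

```python
import asyncio
from pgml import Collection, Model, Splitter, Pipeline

async def main():
    # Assumed API shapes; see semantic_search.py for the authoritative version.
    collection = Collection("quora_collection")
    model = Model("intfloat/e5-small")      # embedding model named in the README
    splitter = Splitter()                   # default chunking
    pipeline = Pipeline("quora_pipeline", model, splitter)
    await collection.add_pipeline(pipeline)

    # Upsert a few documents; the real example loads the Quora dataset.
    await collection.upsert_documents([
        {"id": "1", "text": "PostgresML brings machine learning to your database."},
        {"id": "2", "text": "Semantic search ranks documents by meaning."},
    ])

    # Vector search over the generated embeddings.
    results = await (
        collection.query()
        .vector_recall("What is semantic search?", pipeline)
        .limit(3)
        .fetch_all()
    )
    print(results)

    # Clean up by archiving the collection, as the example does.
    await collection.archive()

asyncio.run(main())
```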

### [Question Answering](./question_answering.py)
## [Question Answering](./question_answering.py)
This is an example to find documents relevant to a question from the collection of documents. It loads the Stanford Question Answering Dataset (SQuAD) into the database and generates chunks and embeddings. The query is passed to vector search to retrieve documents that closely match in the embedding space. A score is returned with each search result.

### [Question Answering using Instructor Model](./question_answering_instructor.py)
## [Question Answering using Instructor Model](./question_answering_instructor.py)
In this example, we will use the `hkunlp/instructor-base` model to build text embeddings instead of the default `intfloat/e5-small` model.
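A minimal, hedged sketch of the intended change, assuming the same `Model`/`Pipeline` shapes as above (the real example may also pass an instruction prompt through model parameters):

```python
from pgml import Model, Splitter, Pipeline

# Assumed constructor shape: only the embedding model name changes
# relative to the default setup.
model = Model("hkunlp/instructor-base")  # instead of the default intfloat/e5-small
pipeline = Pipeline("squad_instructor", model, Splitter())
```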

### [Extractive Question Answering](./extractive_question_answering.py)
## [Extractive Question Answering](./extractive_question_answering.py)
In this example, we will show how to use the `vector_recall` result as a `context` for a Hugging Face question answering model. We will use `Builtins.transform()` to run the model on the database.
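A hedged sketch of that pattern, assuming `Builtins().transform()` accepts a task name and a list of JSON-encoded question/context pairs (the actual example may assemble the context differently):

```python
import json
from pgml import Builtins

async def extract_answer(question: str, context: str):
    builtins = Builtins()
    # Assumed call shape: run a Hugging Face question-answering model in the
    # database, using text returned by vector_recall as the context.
    return await builtins.transform(
        "question-answering",
        [json.dumps({"question": question, "context": context})],
    )
```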

### [Table Question Answering](./table_question_answering.py)
In this example, we will use [Open Table-and-Text Question Answering (OTT-QA)
](https://github.com/wenhuchen/OTT-QA) dataset to run queries on tables. We will use `deepset/all-mpnet-base-v2-table` model that is trained for embedding tabular data for retrieval tasks.
## [Table Question Answering](./table_question_answering.py)
In this example, we will use the [Open Table-and-Text Question Answering (OTT-QA)](https://github.com/wenhuchen/OTT-QA) dataset to run queries on tables. We will use the `deepset/all-mpnet-base-v2-table` model, which is trained for embedding tabular data for retrieval tasks.

### [Summarizing Question Answering](./summarizing_question_answering.py)
## [Summarizing Question Answering](./summarizing_question_answering.py)
This example finds documents relevant to a question from the collection of documents and then summarizes those documents.
36 changes: 36 additions & 0 deletions pgml-sdks/pgml/python/examples/requirements.txt
@@ -0,0 +1,36 @@
aiohttp==3.8.5
aiosignal==1.3.1
async-timeout==4.0.3
attrs==23.1.0
certifi==2023.7.22
charset-normalizer==3.2.0
datasets==2.14.4
dill==0.3.7
filelock==3.12.3
frozenlist==1.4.0
fsspec==2023.6.0
huggingface-hub==0.16.4
idna==3.4
markdown-it-py==3.0.0
mdurl==0.1.2
multidict==6.0.4
multiprocess==0.70.15
numpy==1.25.2
packaging==23.1
pandas==2.0.3
pgml==0.9.0
pyarrow==13.0.0
Pygments==2.16.1
python-dateutil==2.8.2
python-dotenv==1.0.0
pytz==2023.3
PyYAML==6.0.1
requests==2.31.0
rich==13.5.2
six==1.16.0
tqdm==4.66.1
typing_extensions==4.7.1
tzdata==2023.3
urllib3==2.0.4
xxhash==3.3.0
yarl==1.9.2