
Build 2.2 Bug Fixes - Large PR #441

Draft
wants to merge 40 commits into
base: main
40 commits
c5acf71
Fix beataml Drug issue
jjacobson95 Jul 30, 2025
c4df787
liverpdo fixes
jjacobson95 Jul 31, 2025
a96c111
added novartis to build_all.py. update for liverpdo drugs
jjacobson95 Jul 31, 2025
17377b2
another liverpdo drug update
jjacobson95 Jul 31, 2025
1981a3b
testing pubchem update
jjacobson95 Jul 31, 2025
ebc79b5
working on pubchem
jjacobson95 Jul 31, 2025
bc5b859
working on pubchem2
jjacobson95 Jul 31, 2025
799c636
updated pubchem call in build/bladderpdo/02_createBladderPDODrugsFile.py
jjacobson95 Jul 31, 2025
7470735
Large drug generation overhaul
jjacobson95 Jul 31, 2025
7cf7dd1
reduced drugs in broad_sanger for debugging
jjacobson95 Aug 1, 2025
a40eca2
bug fix
jjacobson95 Aug 1, 2025
7f57630
changed to random 10 instead fo first test for debugging
jjacobson95 Aug 1, 2025
44ab62c
Speed up Docker build (and debug process) through optimizing dockerfi…
jjacobson95 Aug 1, 2025
2fafd15
Make sure helper script is actually added to the dockerfile
jjacobson95 Aug 1, 2025
e3670b0
bug fix in join
jjacobson95 Aug 1, 2025
54a9254
bug fix on join
jjacobson95 Aug 1, 2025
aee1a1d
Sorted after joining
jjacobson95 Aug 1, 2025
7f39128
ensure that first drug in first file starts at SMI_1 instead of SMI_2
jjacobson95 Aug 1, 2025
88083fe
Turning off test steps. Made a change to HCMI that should speed up I …
jjacobson95 Aug 1, 2025
533f66b
SarcPDO issues fixed for mutations and experiments
jjacobson95 Aug 2, 2025
797f37c
fixes liverpdo experiments
jjacobson95 Aug 2, 2025
09fb9e5
Updated mapping scripts with all datasets and removed cptac by defaul…
jjacobson95 Aug 4, 2025
4d74714
Added robust methods to download files for broad_sanger omics
jjacobson95 Aug 4, 2025
509b170
Dockerfile optimization. Attempting to fix broad_sanger. Hide warning…
jjacobson95 Aug 5, 2025
c349fc6
tiny changes. 05b_separate_datasets.py working now
jjacobson95 Aug 5, 2025
8018f8b
pinning polars-lts-cpu to the original version as polars pin. This is…
jjacobson95 Aug 6, 2025
c7171f2
Added 3 x retry to build_all.py for each step that fails. Attempting …
jjacobson95 Aug 6, 2025
d628dd3
Remove incorrectly-cased Dockerfile.crcPDO and add Dockerfile.crcpdo
jjacobson95 Aug 6, 2025
e3f4df3
Merge remote-tracking branch 'origin/main' into build_2.2_bug_fixes
jjacobson95 Aug 6, 2025
b347513
HCMI data streaming finally seems like it mightttt be working
jjacobson95 Aug 7, 2025
4630da8
Handle 503 Gateway errors better. Pubchem ignore_chems updated to onl…
jjacobson95 Aug 7, 2025
9881108
Renamed build to coderbuild. Hopefully I got all references, there we…
jjacobson95 Aug 7, 2025
5cf3dac
Renamed All PDO/PDX Datasets. Modified all files that reference the d…
jjacobson95 Aug 7, 2025
e98d88d
Adding missed references to build/coderbuild
jjacobson95 Aug 7, 2025
3e0aea2
Adding more missed name changes
jjacobson95 Aug 7, 2025
afdb5f7
Adding just a couple more references
jjacobson95 Aug 7, 2025
f2535f8
Patch fix for a weird bug
jjacobson95 Aug 7, 2025
57b9fe0
apparently pl.scan_csv can't handle gzipped files. fixed hcmi stream
jjacobson95 Aug 8, 2025
70b50ff
Removed previous mapping files because vast dataset changes and renaming
jjacobson95 Aug 9, 2025
759cef3
Final hcmi fix I hope. I just used bash and subprocess to dedup inste…
jjacobson95 Aug 10, 2025
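
Commit c7171f2 above mentions adding a 3x retry to `build_all.py` for each step that fails. The PR page does not show that code, so the following is only a minimal sketch of the general pattern. The function name `run_with_retries`, its parameters, and the `flaky_step` demo are hypothetical and are not taken from the repository.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(step: Callable[[], T], attempts: int = 3, delay_s: float = 0.0) -> T:
    """Call `step` up to `attempts` times; re-raise the last error if every try fails.

    Hypothetical helper: not the implementation used in build_all.py.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception as err:  # broad on purpose: any failure of the step triggers a retry
            last_error = err
            if attempt < attempts:
                time.sleep(delay_s)  # optional pause before the next try
    assert last_error is not None
    raise last_error


# Demo: a step that fails twice with a transient error, then succeeds.
calls = {"n": 0}


def flaky_step() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "done"


result = run_with_retries(flaky_step, attempts=3)
```

A real build step would more likely wrap a subprocess call and retry only on specific failures (for example the 503 responses mentioned in commit 4630da8); the broad `except` here just keeps the sketch short.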
20 changes: 10 additions & 10 deletions README.md
@@ -36,22 +36,22 @@ please see the [schema description](schema/README.md).

## Building a local version

-The build process can be found in our [build
-directory](build/README.md). Here you can follow the instructions to
+The build process can be found in our [coderbuild
+directory](coderbuild/README.md). Here you can follow the instructions to
build your own local copy of the data on your machine.

## Adding a new dataset

-We have standardized the build process so an additional dataset can be
+We have standardized the build (coderbuild) process so an additional dataset can be
built locally or as part of the next version of coder. Here are the
steps to follow:

-1. First visit the [build
-directory](build/README.md) and ensure you can build a local copy of
+1. First visit the [coderbuild
+directory](coderbuild/README.md) and ensure you can build a local copy of
CoderData.

2. Checkout this repository and create a subdirectory of the
-[build directory](build) with your own build files.
+[coderbuild directory](coderbuild) with your own build files.

3. Develop your scripts to build the data files according to our
[LinkML Schema](schema/coderdata.yaml]). This will require collecting
@@ -66,10 +66,10 @@ validator](https://linkml.io/linkml/data/validating-data) together
with our schema file.

You can use the following scripts as part of your build process:
-- [build/utils/fit_curve.py](build/utils/fit_curve.py): This script
+- [coderbuild/utils/fit_curve.py](coderbuild/utils/fit_curve.py): This script
takes dose-response data and generates the dose-response statistics
required by CoderData/
-- [build/utils/pubchem_retrieval.py](build/utils/pubchem_retreival.py):
+- [coderbuild/utils/pubchem_retrieval.py](coderbuild/utils/pubchem_retreival.py):
This script retreives structure and drug synonym information
required to populate the `Drug` table.

@@ -78,13 +78,13 @@ and arguments:

| shell script | arguments | description |
|------------------|--------------------------|---------------------|
-| `build_samples.sh` | [latest_samples] | Latest version of samples generated by coderdata build |
+| `build_samples.sh` | [latest_samples] | Latest version of samples generated by coderbuild |
| `build_omics.sh` | [gene file] [samplefile] | This includes the `genes.csv` that was generated in the original build as well as the sample file generated above. |
| `build_drugs.sh` | [drugfile1,drugfile2,...] | This includes a comma-delimited list of all drugs files generated from previous build |
| `build_exp.sh`| [samplfile ] [drugfile] | sample file and drug file generated by previous scripts |

5. Put the Docker container file inside the [Docker
-directory](./build/docker) with the name
+directory](./coderbuild/docker) with the name
`Dockerfile.[datasetname]`.

6. Run `build_all.py` from the root directory, which should now add in
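
The README table in the hunk above lists four shell scripts and the arguments each one takes. The sketch below shows one way those argument lists could be assembled in the order the table gives, under stated assumptions. The helper `build_commands`, the `sh` invocation, and every file name in the demo are hypothetical; only the script names and the argument order come from the table. Nothing is executed here; the function only builds the argument vectors.

```python
from typing import List, Sequence


def build_commands(
    latest_samples: str,
    gene_file: str,
    sample_file: str,
    previous_drug_files: Sequence[str],
    drug_file: str,
) -> List[List[str]]:
    """Return one argv list per build script, in the order given by the README table.

    Hypothetical helper: invoking the scripts through `sh` is an assumption.
    """
    return [
        # build_samples.sh [latest_samples]
        ["sh", "build_samples.sh", latest_samples],
        # build_omics.sh [gene file] [samplefile]
        ["sh", "build_omics.sh", gene_file, sample_file],
        # build_drugs.sh [drugfile1,drugfile2,...] takes a comma-delimited list
        ["sh", "build_drugs.sh", ",".join(previous_drug_files)],
        # build_exp.sh [samplefile] [drugfile]
        ["sh", "build_exp.sh", sample_file, drug_file],
    ]


# Demo with made-up file names.
commands = build_commands(
    latest_samples="previous_samples.csv",
    gene_file="genes.csv",
    sample_file="mydataset_samples.csv",
    previous_drug_files=["a_drugs.tsv", "b_drugs.tsv"],
    drug_file="mydataset_drugs.tsv",
)
for argv in commands:
    print(" ".join(argv))
```

Each list could then be passed to `subprocess.run(argv, check=True)` from the dataset's directory, if that matches how the scripts are meant to be run locally.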
54 changes: 0 additions & 54 deletions build/bladderpdo/02_createBladderPDODrugsFile.py

This file was deleted.

11 changes: 0 additions & 11 deletions build/bladderpdo/build_exp.sh

This file was deleted.

12 changes: 0 additions & 12 deletions build/bladderpdo/build_omics.sh

This file was deleted.

7 changes: 0 additions & 7 deletions build/bladderpdo/build_samples.sh

This file was deleted.

98 changes: 0 additions & 98 deletions build/broad_sanger/03-createDrugFile.R

This file was deleted.

15 changes: 0 additions & 15 deletions build/broad_sanger/build_drugs.sh

This file was deleted.
