1 change: 1 addition & 0 deletions .vscode/settings.json
@@ -39,6 +39,7 @@
"colocalisation",
"contig",
"diffpval",
+"Ensembl",
"eqtl",
"finngen",
"GCST",
5 changes: 5 additions & 0 deletions docs/python_api/datasets/molecular_complex.md
@@ -0,0 +1,5 @@
---
title: Molecular Complex
---

::: gentropy.dataset.molecular_complex.MolecularComplex
8 changes: 6 additions & 2 deletions docs/python_api/datasources/_datasources.md
@@ -14,6 +14,11 @@ This section contains information about the data source harmonisation tools avai

1. [GTEx (eQTL catalogue)](eqtl_catalogue/_eqtl_catalogue.md)
2. [UKB PPP (EUR)](ukb_ppp_eur/_ukb_ppp_eur.md)
+3. [deCODE proteomics](deCODE/_decode.md)
+
+## Protein complexes
+
+1. [Complex Portal](complex_portal/_complex_portal.md)

## Interaction / Interval-based Experiments

@@ -39,5 +44,4 @@ This section contains information about the data source harmonisation tools avai

## Biological samples

-1. [Uberon](biosample_ontologies/_uberon.md)
-2. [Cell Ontology](biosample_ontologies/_cell_ontology.md)
+1. [Uberon and Cell Ontology](biosample_ontologies/_biosample_ontologies.md)
14 changes: 14 additions & 0 deletions docs/python_api/datasources/complex_portal/_complex_portal.md
@@ -0,0 +1,14 @@
---
title: Complex Portal
---

[Complex Portal](https://www.ebi.ac.uk/complexportal/) is a manually curated resource of macromolecular complexes maintained by EMBL-EBI. It provides two complementary datasets:

- **Experimental** – complexes with direct experimental evidence.
- **Predicted** – computationally predicted complexes.

Both files are distributed in the **ComplexTAB** flat-file format and are filtered to human complexes (NCBI taxonomy ID 9606) during ingestion.

The resulting `MolecularComplex` dataset is used downstream in the deCODE proteomics pipeline to annotate multi-protein SomaScan aptamers with a `molecularComplexId`.

::: gentropy.datasource.complex_portal.ComplexTab
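
The ingestion described above (ComplexTAB rows restricted to NCBI taxonomy ID 9606) can be sketched in plain Python. This is a minimal illustration, not the gentropy implementation: the column names, the `identifier(stoichiometry)` cell layout, the helper names and the two toy rows are all assumptions made for the example.

```python
import csv
import io

HUMAN_TAXON_ID = "9606"  # NCBI taxonomy ID for human, as stated in the docs

# Assumed column names; check them against the real ComplexTAB header.
COL_ID = "#Complex ac"
COL_TAXON = "Taxonomy identifier"
COL_MEMBERS = "Identifiers (and stoichiometry) of molecules in complex"

# Made-up rows for illustration only (not real Complex Portal records).
TOY_COMPLEXTAB = (
    f"{COL_ID}\t{COL_TAXON}\t{COL_MEMBERS}\n"
    "CPX-0001\t9606\tP00001(2)|P00002(1)\n"
    "CPX-0002\t10090\tQ00001(1)\n"
)


def parse_components(cell):
    """Split an assumed 'ID(stoichiometry)|ID(stoichiometry)' cell into dicts."""
    components = []
    for item in cell.split("|"):
        identifier, _, rest = item.partition("(")
        components.append({"id": identifier, "stoichiometry": rest.rstrip(")")})
    return components


def human_complexes(text):
    """Keep only rows whose taxonomy column equals the human taxon ID."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [
        {"id": row[COL_ID], "components": parse_components(row[COL_MEMBERS])}
        for row in reader
        if row[COL_TAXON] == HUMAN_TAXON_ID
    ]


complexes = human_complexes(TOY_COMPLEXTAB)
print(complexes)
```

Only the human toy row survives the filter; the mouse row (taxon 10090) is dropped before any component parsing happens.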
6 changes: 6 additions & 0 deletions docs/python_api/datasources/deCODE/.pages
@@ -0,0 +1,6 @@
nav:
- _decode.md
- manifest.md
- aptamer_metadata.md
- study_index.md
- summary_stats.md
15 changes: 15 additions & 0 deletions docs/python_api/datasources/deCODE/_decode.md
@@ -0,0 +1,15 @@
---
title: deCODE proteomics
---

[deCODE proteomics](https://www.nature.com/articles/s41586-023-06563-x) is a large-scale proteomics dataset generated by deCODE genetics, a biopharmaceutical company based in Iceland. The dataset contains protein-level measurements in blood samples from up to ~36,000 Icelandic participants, obtained with the aptamer-based SomaScan platform.

Two sub-datasets are provided:

- **RAW** (`deCODE-proteomics-raw`): non-SMP-normalised SomaScan measurements.
- **SMP** (`deCODE-proteomics-smp`): SMP-normalised SomaScan measurements.

For a full description of the dataset and methods, refer to [Eldjarn et al., 2023](https://www.nature.com/articles/s41586-023-06563-x).

::: gentropy.datasource.decode.deCODEDataSource
::: gentropy.datasource.decode.deCODEPublicationMetadata
5 changes: 5 additions & 0 deletions docs/python_api/datasources/deCODE/aptamer_metadata.md
@@ -0,0 +1,5 @@
---
title: deCODE Aptamer Metadata
---

::: gentropy.datasource.decode.aptamer_metadata.AptamerMetadata
5 changes: 5 additions & 0 deletions docs/python_api/datasources/deCODE/manifest.md
@@ -0,0 +1,5 @@
---
title: deCODE manifest
---

::: gentropy.datasource.decode.manifest.deCODEManifest
6 changes: 6 additions & 0 deletions docs/python_api/datasources/deCODE/study_index.md
@@ -0,0 +1,6 @@
---
title: deCODE Study Index
---

::: gentropy.datasource.decode.study_index.deCODEStudyIdParts
::: gentropy.datasource.decode.study_index.deCODEStudyIndex
6 changes: 6 additions & 0 deletions docs/python_api/datasources/deCODE/summary_stats.md
@@ -0,0 +1,6 @@
---
title: deCODE Summary Statistics
---

::: gentropy.datasource.decode.summary_statistics.deCODEHarmonisationConfig
::: gentropy.datasource.decode.summary_statistics.deCODESummaryStatistics
2 changes: 1 addition & 1 deletion docs/python_api/steps/biosample_index_step.md
@@ -1,5 +1,5 @@
---
-title: biosample_index
+title: Biosample Index Generation
---

::: gentropy.biosample_index.BiosampleIndexStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/colocalisation.md
@@ -1,5 +1,5 @@
---
-title: colocalisation
+title: Colocalisation
---

::: gentropy.colocalisation.ColocalisationStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/credible_set_qc_step.md
@@ -1,5 +1,5 @@
---
-title: credible_set_qc
+title: Credible Set Quality Control
---

::: gentropy.credible_set_qc.CredibleSetQCStep
5 changes: 5 additions & 0 deletions docs/python_api/steps/decode_ingestion.md
@@ -0,0 +1,5 @@
---
title: deCODE ingestion
---

::: gentropy.decode_ingestion
2 changes: 1 addition & 1 deletion docs/python_api/steps/eqtl_catalogue.md
@@ -1,5 +1,5 @@
---
-title: eQTL Catalogue
+title: eQTL Catalogue ingestion
---

::: gentropy.eqtl_catalogue.EqtlCatalogueStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/finngen_studies.md
@@ -1,5 +1,5 @@
---
-title: finngen_studies
+title: FinnGen Study Index generation
---

::: gentropy.finngen_studies.FinnGenStudiesStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/finngen_sumstat_preprocess.md
@@ -1,5 +1,5 @@
---
-title: finngen_sumstat_preprocess
+title: Finngen Summary Statistics Ingestion
---

::: gentropy.finngen_sumstat_preprocess.FinnGenSumstatPreprocessStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/finngen_ukbb_mvp_meta_step.md
@@ -1,5 +1,5 @@
---
-title: FinnGen UKBB MVP Meta Analysis Step
+title: FinnGen UKBB MVP Meta Analysis ingestion
---

::: gentropy.finngen_ukb_mvp_meta.FinngenUkbMvpMetaSummaryStatisticsIngestionStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/gwas_catalog_curation.md
@@ -1,5 +1,5 @@
---
-title: gwas_catalog_study_curation
+title: GWAS Catalog Study Curation
---

::: gentropy.gwas_catalog_study_curation.GWASCatalogStudyCurationStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/gwas_catalog_study_index.md
@@ -1,5 +1,5 @@
---
-title: gwas_catalog_study_inclusion
+title: GWAS Catalog Study Inclusion
---

::: gentropy.gwas_catalog_study_index.GWASCatalogStudyIndexGenerationStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/gwas_catalog_sumstat_preprocess.md
@@ -1,5 +1,5 @@
---
-title: gwas_catalog_sumstat_preprocess
+title: GWAS Catalog Summary Statistics ingestion
---

::: gentropy.gwas_catalog_sumstat_preprocess.GWASCatalogSumstatsPreprocessStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/gwas_catalog_top_hits.md
@@ -1,5 +1,5 @@
---
-title: GWAS Catalog Top Hits Ingestion Step
+title: GWAS Catalog Top Hits ingestion
---

::: gentropy.gwas_catalog_top_hits.GWASCatalogTopHitIngestionStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/ld_clump.md
@@ -1,5 +1,5 @@
---
-title: ld_based_clumping
+title: Linkage Disequilibrium Based Clumping
---

::: gentropy.ld_based_clumping.LDBasedClumpingStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/ld_index.md
@@ -1,5 +1,5 @@
---
-title: GnomAD Linkage data ingestion
+title: GnomAD Linkage Disequilibrium Index generation
---

::: gentropy.gnomad_ingestion.LDIndexStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/locus_breaker_clumping.md
@@ -1,5 +1,5 @@
---
-title: locus_breaker_clumping
+title: Locus Breaker Clumping
---

::: gentropy.locus_breaker_clumping.LocusBreakerClumpingStep
5 changes: 5 additions & 0 deletions docs/python_api/steps/molecular_complex.md
@@ -0,0 +1,5 @@
---
title: Molecular complex ingestion
---

::: gentropy.molecular_complex
2 changes: 1 addition & 1 deletion docs/python_api/steps/pics.md
@@ -1,5 +1,5 @@
---
-title: pics
+title: PICS
---

::: gentropy.pics.PICSStep
5 changes: 5 additions & 0 deletions docs/python_api/steps/pqtl_study.md
@@ -0,0 +1,5 @@
---
title: pQTL study index transformation
---

::: gentropy.pqtl_study
2 changes: 1 addition & 1 deletion docs/python_api/steps/summary_statistics_qc.md
@@ -1,5 +1,5 @@
---
-title: summary_statistics_qc
+title: Summary Statistics QC
---

::: gentropy.sumstat_qc_step.SummaryStatisticsQCStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/ukb_ppp_eur_sumstat_preprocess.md
@@ -1,5 +1,5 @@
---
-title: ukb_ppp_eur_sumstat_preprocess
+title: UKB PPP EUR Summary Statistics ingestion
---

::: gentropy.ukb_ppp_eur_sumstat_preprocess.UkbPppEurStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/variant_annotation_step.md
@@ -1,5 +1,5 @@
---
-title: GnomAD variant data ingestion
+title: GnomAD Variant Index generation
---

::: gentropy.gnomad_ingestion.GnomadVariantIndexStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/variant_index_step.md
@@ -1,5 +1,5 @@
---
-title: variant_index
+title: Variant Index generation
---

::: gentropy.variant_index.VariantIndexStep
2 changes: 1 addition & 1 deletion docs/python_api/steps/window_based_clumping.md
@@ -1,5 +1,5 @@
---
-title: window_based_clumping
+title: Window based clumping
---

::: gentropy.window_based_clumping.WindowBasedClumpingStep
2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -14,7 +14,7 @@ plugins:
handlers:
python:
options:
-filters: ["!^_", "!__new__", "__init__"]
+filters: ["!^_", "!__new__", "!__init__"]
show_signature_annotations: true
show_root_heading: true
heading_level: 3
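
To see what the `filters` change above does to `__init__`, the following sketch models the filter semantics in plain Python. It is a simplified model under two stated assumptions (a rule prefixed with `!` excludes matching names, and the last matching rule wins), not the documentation plugin's own code; `keeps` and the member name `from_source` are hypothetical.

```python
import re


def keeps(name, filters):
    """Return True if `name` stays in the docs under the assumed semantics."""
    keep = True  # assumption: a name that matches no rule is kept
    for rule in filters:
        exclude = rule.startswith("!")
        pattern = rule[1:] if exclude else rule
        if re.search(pattern, name):
            keep = not exclude  # assumption: the last matching rule wins
    return keep


OLD_FILTERS = ["!^_", "!__new__", "__init__"]
NEW_FILTERS = ["!^_", "!__new__", "!__init__"]

# `from_source` is a hypothetical public member name used for illustration.
for member in ("__init__", "__new__", "_private", "from_source"):
    print(member, keeps(member, OLD_FILTERS), keeps(member, NEW_FILTERS))
```

Under this model the only member whose visibility changes is `__init__`: the old list re-included it after the `!^_` exclusion, while the new list excludes it explicitly.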
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -15,7 +15,7 @@ dependencies = [
"pyliftover (>=0.4.1, <0.5.0)",
"numpy>=2.3.0",
"omegaconf (>=2.3.0, <2.4.0)",
-"scikit-learn (>=1.6.1, <1.8.0)",
+"scikit-learn (>=1.6.1, <1.9.0)",
"pandas[gcp,parquet] (>=2.2.3, <2.4.0)",
"skops (>=0.13.0, <0.14.0)",
"shap>=0.50.0",
@@ -27,6 +27,7 @@ dependencies = [
"xgboost>=3.0.4 ; platform_machine == 'x86_64' and sys_platform == 'darwin'",
"huggingface-hub>=0.27.1",
"wandb (>=0.19.4, <0.26.0)",
+"pydantic>=2.12.4",
]
classifiers = [
"Programming Language :: Python :: 3.11",
3 changes: 2 additions & 1 deletion src/gentropy/assets/data/gwas_population_2_LD_panel_map.json
@@ -18,5 +18,6 @@
"NR": "nfe",
"Finnish": "fin",
"African": "afr",
-"Admixed American": "amr"
+"Admixed American": "amr",
+"Icelandic": "nfe"
}
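
The effect of the new `"Icelandic"` entry can be illustrated with a small lookup. This is a sketch only: it embeds just the entries visible in the hunk above (the real file has more), and `ld_panel_for` is a hypothetical helper, not a gentropy function.

```python
import json

# Only the entries visible in the hunk above; the real asset contains more.
POPULATION_TO_LD_PANEL = json.loads("""
{
  "NR": "nfe",
  "Finnish": "fin",
  "African": "afr",
  "Admixed American": "amr",
  "Icelandic": "nfe"
}
""")


def ld_panel_for(ancestry, mapping=POPULATION_TO_LD_PANEL):
    """Return the LD panel label for an ancestry, or None when it is unmapped."""
    return mapping.get(ancestry)


print(ld_panel_for("Icelandic"))
```

With the added entry, an `"Icelandic"` ancestry label resolves to the same `nfe` panel as `"NR"`; before the change the same lookup would have returned no panel.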
104 changes: 104 additions & 0 deletions src/gentropy/assets/schemas/molecular_complex.json
@@ -0,0 +1,104 @@
{
"fields": [
{ "metadata": {}, "name": "id", "nullable": true, "type": "string" },
{
"metadata": {},
"name": "description",
"nullable": true,
"type": "string"
},
{
"metadata": {},
"name": "properties",
"nullable": true,
"type": "string"
},
{ "metadata": {}, "name": "assembly", "nullable": true, "type": "string" },
{
"metadata": {},
"name": "components",
"nullable": true,
"type": {
"containsNull": false,
"elementType": {
"fields": [
{
"metadata": {},
"name": "id",
"nullable": false,
"type": "string"
},
{
"metadata": {},
"name": "stoichiometry",
"nullable": false,
"type": "string"
},
{
"metadata": {},
"name": "source",
"nullable": false,
"type": "string"
}
],
"type": "struct"
},
"type": "array"
}
},
{
"metadata": {},
"name": "evidenceCodes",
"nullable": true,
"type": {
"containsNull": false,
"elementType": "string",
"type": "array"
}
},
{
"metadata": {},
"name": "crossReferences",
"nullable": true,
"type": {
"containsNull": false,
"elementType": {
"fields": [
{
"metadata": {},
"name": "source",
"nullable": false,
"type": "string"
},
{
"metadata": {},
"name": "id",
"nullable": false,
"type": "string"
}
],
"type": "struct"
},
"type": "array"
}
},
{
"metadata": {},
"name": "source",
"nullable": false,
"type": {
"fields": [
{ "metadata": {}, "name": "id", "nullable": true, "type": "string" },
{
"metadata": {},
"name": "source",
"nullable": true,
"type": "string"
}
],
"type": "struct"
}
}
],
"type": "struct"
}
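
A quick way to read the nullability constraints in a schema like the one above is to walk the JSON and list every field declared with `"nullable": false`. The sketch below uses only the standard library and embeds a trimmed fragment of the schema (three of the eight top-level fields); `required_paths` is a hypothetical helper, and the `[]` suffix for array elements is a notation chosen for this example.

```python
import json

# Trimmed fragment of the schema above: `id`, `components` and `source` only.
SCHEMA_FRAGMENT = json.loads("""
{
  "type": "struct",
  "fields": [
    {"metadata": {}, "name": "id", "nullable": true, "type": "string"},
    {"metadata": {}, "name": "components", "nullable": true, "type": {
      "type": "array", "containsNull": false, "elementType": {
        "type": "struct",
        "fields": [
          {"metadata": {}, "name": "id", "nullable": false, "type": "string"},
          {"metadata": {}, "name": "stoichiometry", "nullable": false, "type": "string"},
          {"metadata": {}, "name": "source", "nullable": false, "type": "string"}
        ]}}},
    {"metadata": {}, "name": "source", "nullable": false, "type": {
      "type": "struct",
      "fields": [
        {"metadata": {}, "name": "id", "nullable": true, "type": "string"},
        {"metadata": {}, "name": "source", "nullable": true, "type": "string"}
      ]}}
  ]
}
""")


def required_paths(dtype, prefix=""):
    """Yield dotted paths of all fields declared with nullable=false."""
    if isinstance(dtype, str):
        return  # primitive type such as "string": nothing nested to inspect
    if dtype["type"] == "array":
        # Descend into the element type, marking the array level with "[]".
        yield from required_paths(dtype["elementType"], prefix + "[]")
    elif dtype["type"] == "struct":
        for field in dtype["fields"]:
            path = f"{prefix}.{field['name']}" if prefix else field["name"]
            if not field["nullable"]:
                yield path
            yield from required_paths(field["type"], path)


print(list(required_paths(SCHEMA_FRAGMENT)))
```

For this fragment the walk reports the three members of each `components` element plus the top-level `source` struct; the top-level `id` is nullable and therefore not listed.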