Commit 5c89536

Merge pull request #199 from isamplesorg/develop
Merge develop updates into main
2 parents 841cd14 + e730cd0 commit 5c89536

23 files changed

Lines changed: 6577 additions & 242 deletions

.github/workflows/build-docs.yml

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ jobs:
5353
run: ls -l ${{ github.workspace }}/build/docs/
5454

5555
- name: Publish documentation
56-
uses: peaceiris/actions-gh-pages@v3
56+
uses: peaceiris/actions-gh-pages@v4
5757
with:
5858
github_token: ${{ secrets.GITHUB_TOKEN }}
5959
publish_dir: ./docs

README.md

Lines changed: 82 additions & 180 deletions
@@ -1,4 +1,4 @@
1-
[![NSF-2004562](https://img.shields.io/badge/NSF-ID=2004562-blue.svg)](https://nsf.gov/awardsearch/showAward?AWD_ID=2004562)
1+
[![NSF-2004562](https://img.shields.io/badge/NSF-ID=2004562-blue.svg)](https://nsf.gov/awardsearch/showAward?AWD_ID=2004562)
22
[![NSF-2004815](https://img.shields.io/badge/NSF-ID=2004815-blue.svg)](https://nsf.gov/awardsearch/showAward?AWD_ID=2004815)
33
[![NSF-2004839](https://img.shields.io/badge/NSF-ID=2004839-blue.svg)](https://nsf.gov/awardsearch/showAward?AWD_ID=2004839)
44
[![NSF-2004642](https://img.shields.io/badge/NSF-ID=2004642-blue.svg)](https://nsf.gov/awardsearch/showAward?AWD_ID=2004642)
@@ -7,252 +7,154 @@
77

88
Defines the core metadata model for iSamples.
99

10-
`src/schemas/isamples_core.yaml` defines the iSamples core model in linkml. It references vocabularies contained in `[isamplesorg/vocabularies//vocabulary](https://github.com/isamplesorg/vocabularies/tree/develop/vocabulary)` which define terms for the Material Type, Sampled Feature, and Material Sample Object Type vocabularies.
10+
`src/schemas/isamples_core.yaml` defines the iSamples core model in LinkML. It references vocabularies contained in [`isamplesorg/vocabularies/vocabulary`](https://github.com/isamplesorg/vocabularies/tree/develop/vocabulary) which define terms for the Material Type, Sampled Feature, and Material Sample Object Type vocabularies.
1111

12-
The following artifacts are generated from the linkml and vocabulary sources:
12+
Documentation is available at https://isamplesorg.github.io/metadata/
1313

14-
* Documentation in HTML, available at https://isamplesorg.github.io/metadata/
14+
## Repository Structure
1515

16+
```
17+
metadata/
18+
├── src/
19+
│ └── schemas/ # LinkML schema definitions
20+
│ └── isamples_core.yaml
21+
├── background/ # Diagrams and information about existing models
22+
│ ├── DataCite/
23+
│ ├── ESS-DIVE/
24+
│ ├── GEOME-TDWG/
25+
│ ├── GeoScience/
26+
│ ├── ODM-CUAHSI/
27+
│ └── OpenContext-Archae-anthro/
28+
├── examples/ # Example metadata documents from different systems
29+
│ ├── APItesting/
30+
│ ├── GEOME/
31+
│ ├── geoJSON/
32+
│ ├── iSamples/
33+
│ ├── OpenContext/
34+
│ ├── script/
35+
│ ├── SESAR/
36+
│ └── smithonsonian/
37+
├── vocabulary/ # Vocabulary-related files
38+
├── tools/ # Modified docgen tool and templates for Quarto
39+
├── quarto/ # Quarto configuration files
40+
├── build/ # Build output (intermediate docs)
41+
│ └── docs/ # Generated markdown documentation
42+
├── tests/ # Test files
43+
└── notes/ # Development notes
44+
```
1645

1746
## Development
1847

19-
Linkml and associated tools require a python environment, version 3.9 or newer, and uses [poetry](https://python-poetry.org/) for dependency management. Poetry can be installed with `pip install poetry`.
48+
LinkML and associated tools require a Python environment (version 3.9 or newer) and use [Poetry](https://python-poetry.org/) for dependency management. Poetry can be installed with `pip install poetry`.
2049

2150
To work on project contents and run artifact generators, first grab the source and switch to the develop branch:
2251

23-
```
52+
```bash
2453
git clone https://github.com/isamplesorg/metadata.git
2554
cd metadata
26-
checkout develop
27-
pull
55+
git checkout develop
56+
git pull
2857
```
2958

30-
Setup a virtual environment (e.g. using poetry or mkvirtualenv):
59+
Set up a virtual environment using Poetry:
3160

32-
```
61+
```bash
3362
poetry shell
3463
poetry install
3564
```
3665

37-
3866
(To exit poetry shell, use `exit`).
3967

40-
Artifacts in the `generated/` folder are produced by running `make` or `make all`.
41-
42-
Documentation is rendered with [Quarto]() rather than the defaults `mkdocs` or `Sphinx` (Quarto offers many additional features for including computed examples which are planned). To generate the documentation, install a version of [Quarto >= 1.2](), then run `make`, `make all` or `make gen-docs`.
68+
Artifacts are produced by running `make` or `make all`.
4369

44-
This will generate markdown intermediate files in the `build/docs` folder then invoke `quarto render` to generate the HTML docs in the `docs/` folder.
70+
### Documentation Generation
4571

46-
Note that this project uses a version of the `linkml` `docgen` tool and templates modified to render markdown for `quarto`. The modified `docgen` and templates is located in the `tools/` folder.
72+
Documentation is rendered with [Quarto](https://quarto.org/) rather than the default `mkdocs` or `Sphinx` (Quarto offers many additional features for including computed examples). To generate the documentation:
4773

74+
1. Install [Quarto >= 1.2](https://quarto.org/docs/get-started/)
75+
2. Run `make`, `make all` or `make gen-docs`
4876

49-
## Older notes below
77+
This will generate markdown intermediate files in the `build/docs` folder, then invoke `quarto render` to generate HTML documentation.
5078

51-
Collation of metadata examples and notes for the project
79+
Note that this project uses a modified version of the LinkML `docgen` tool and templates to render markdown for Quarto. The modified `docgen` and templates are located in the `tools/` folder.
5280

53-
- background: contains diagrams and information about some existing models that include metadata for samples; files are organized broadly by domain.
54-
- examples: example metadata documents from different systems. Subfolders are
55-
- raw: metadata from the originating system
56-
- test: corresponding records generated manually using the iSamples basic template
57-
- transform: corresponding records generated by automated ETL process from raw records
58-
- vocabulary: vocabularies related to sample metadata from various systems
81+
## LinkML Schema Operations
5982

60-
# linkML (Current version 1.1.15)
61-
This branch implments how to use linkML to generate various output and operations for iSamples.
83+
### Convert YAML schema to JSON schema
6284

63-
## Current workflow (01/01/2022)
64-
![workflow](https://github.com/isamplesorg/metadata/blob/docker/linkmlExperiment/linkML%201-1-2022%20workflow.png)
85+
```bash
86+
gen-json-schema -t PhysicalSampleRecord --not-closed src/schemas/isamples_core.yaml > isamples_core.schema.json
87+
```
6588

89+
The `-t PhysicalSampleRecord` option makes the "PhysicalSampleRecord" class the top-level class in the JSON schema.
6690

67-
## iSamples YAML schema to JSON schema
68-
We could use the following command to convert iSamples YAML schema to JSON schema.
91+
### Generate JSON-LD context
6992

93+
```bash
94+
gen-jsonld-context src/schemas/isamples_core.yaml > isamples_core.jsonld
7095
```
71-
gen-json-schema -t PhysicalSampleRecord --not-closed iSamplesSchemaBasic0.3.yaml > iSamplesSchemaBasic0.3.schema.json
72-
```
73-
In this command, `-t PhysicalSampleRecord` means to make "physicalSampleRecord" class become the top level class. And the prepoerties of the class become the top level properties in the JSON-schema. The converted JSON scheme file is "iSamplesSchemaBasic0.3.schema.json".
7496

75-
## Generating JSON-LD context
76-
```
77-
gen-jsonld-context iSamplesSchemaBasic0.3.yaml > iSampleSchemaBasic0.3.jsonld
78-
```
79-
The command will save the result in the jsonld file. After we have the converted JSON-LD context. The enumeration part of JSON-context should be modified by us manually.
80-
<details>
81-
<summary>Modified JSON-LD context example</summary>
82-
<pre>
83-
"@context": {
84-
"dct": "http://purl.org/dc/terms/",
85-
"isam": "http://resource.isamples.org/schema/",
86-
"mat": "http://resource.isamples.org/vocabulary/material/",
87-
"pur": "http://resource.isamples.org/vocabulary/samplepurpose/",
88-
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
89-
"sf": "http://resource.isamples.org/vocabulary/sampledFeature/",
90-
"skos": "http://www.w3.org/2004/02/skos/core#",
91-
"spt": "http://resource.isamples.org/vocabulary/sampleobjecttype/",
92-
"w3cpos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
93-
"xsd": "http://www.w3.org/2001/XMLSchema#",
94-
"@vocab": "http://resource.isamples.org/schema/",
95-
"curation": {
96-
"@type": "@id"
97-
},
98-
"hasContextCategory": {
99-
"@type":"contextcategory"
100-
},
101-
"hasMaterialCategory": {
102-
"@type":"materialtype"
103-
},
104-
"has_sample_object_type": {
105-
"@type":"specimencategory"
106-
},
107-
"id": "@id",
108-
"latitude": {
109-
"@type": "xsd:decimal"
110-
},
111-
"location": {
112-
"@type": "@id"
113-
},
114-
"longitude": {
115-
"@type": "xsd:decimal"
116-
},
117-
"producedBy": {
118-
"@type": "@id"
119-
},
120-
"relatedResource": {
121-
"@type": "@id"
122-
},
123-
"resultTime": {
124-
"@type": "xsd:date"
125-
},
126-
"samplingSite": {
127-
"@type": "@id"
128-
}
129-
}
130-
</pre>
131-
</details>
132-
This is an example of modified JSON-LD context. For each enumeartion, we use `@type` to declare enumeration type.
97+
After generating the JSON-LD context, the enumeration part may need manual modification. For each enumeration, use `@type` to declare the enumeration type.
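
The manual step described here could also be scripted. The following is a minimal Python sketch, not part of this commit, that sets `@type` on each enumeration slot of a context document. The slot-to-type mapping is copied from the example context in this README; the function name `add_enum_types` and the `generated` input are hypothetical, and the real output of `gen-jsonld-context` may be shaped differently.

```python
import json

# Enumeration slots and type names, taken from the README's example context.
ENUM_TYPES = {
    "hasContextCategory": "contextcategory",
    "hasMaterialCategory": "materialtype",
    "has_sample_object_type": "specimencategory",
}


def add_enum_types(context_doc, enum_types=ENUM_TYPES):
    """Return a copy of a JSON-LD document with @type set on enumeration slots."""
    patched = json.loads(json.dumps(context_doc))  # deep copy via JSON round trip
    context = patched.setdefault("@context", {})
    for slot, type_name in enum_types.items():
        entry = context.get(slot)
        if isinstance(entry, str):
            # A plain string maps the term to an IRI; keep it as @id.
            entry = {"@id": entry}
        elif not isinstance(entry, dict):
            # Slot missing from the context: start a fresh term definition.
            entry = {}
        entry["@type"] = type_name
        context[slot] = entry
    return patched


# Hypothetical stand-in for generated output; not real tool output.
generated = {
    "@context": {
        "id": "@id",
        "hasMaterialCategory": {"@id": "isam:hasMaterialCategory"},
    }
}
patched = add_enum_types(generated)
print(json.dumps(patched, indent=2))
```

The function works on a copy so the generated file can be kept unchanged and the patch re-applied after each regeneration.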
13398

134-
## Validating schema and instance file
135-
Before we valideting all instance files, we need to add modified JSON-LD context to the front of instances properties.
13699
<details>
137-
<summary>Full instance example</summary>
138-
<pre>
100+
<summary>Example modified JSON-LD context</summary>
101+
102+
```json
139103
{
140104
"@context": {
141105
"dct": "http://purl.org/dc/terms/",
142106
"isam": "http://resource.isamples.org/schema/",
143107
"mat": "http://resource.isamples.org/vocabulary/material/",
144-
"pur": "http://resource.isamples.org/vocabulary/samplepurpose/",
145-
"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
146108
"sf": "http://resource.isamples.org/vocabulary/sampledFeature/",
147109
"skos": "http://www.w3.org/2004/02/skos/core#",
148110
"spt": "http://resource.isamples.org/vocabulary/sampleobjecttype/",
149-
"w3cpos": "http://www.w3.org/2003/01/geo/wgs84_pos#",
150111
"xsd": "http://www.w3.org/2001/XMLSchema#",
151112
"@vocab": "http://resource.isamples.org/schema/",
152-
"curation": {
153-
"@type": "@id"
154-
},
155113
"hasContextCategory": {
156-
"@type":"contextcategory"
114+
"@type": "contextcategory"
157115
},
158116
"hasMaterialCategory": {
159-
"@type":"materialtype"
117+
"@type": "materialtype"
160118
},
161119
"has_sample_object_type": {
162-
"@type":"specimencategory"
120+
"@type": "specimencategory"
163121
},
164122
"id": "@id",
165123
"latitude": {
166124
"@type": "xsd:decimal"
167125
},
168-
"location": {
169-
"@type": "@id"
170-
},
171126
"longitude": {
172127
"@type": "xsd:decimal"
173128
},
174-
"producedBy": {
175-
"@type": "@id"
176-
},
177-
"relatedResource": {
178-
"@type": "@id"
179-
},
180129
"resultTime": {
181130
"@type": "xsd:date"
182-
},
183-
"samplingSite": {
184-
"@type": "@id"
185131
}
186-
},
187-
188-
189-
"@schema": "../../iSamplesSchemaBasic0.2.json",
190-
"@id": "metadata/21547/Car2PIRE_0334",
191-
"label": "PIRE_0334",
192-
"sampleidentifier": "ark:/21547/Car2PIRE_0334",
193-
"description": "",
194-
"hasContextCategory": ["Marine Biome"],
195-
"hasMaterialCategory": ["Organic Material"],
196-
"has_sample_object_type": ["Whole Organism"],
197-
"informalClassification": ["Gastropoda"],
198-
"keywords": ["Aceh", "Sumatra","Indonesia","Asia", "Mollusca"],
199-
"producedBy": {
200-
"@id":"ark:/21547/Cas2INDO_2016_SEU_1B",
201-
"label": "INDO_2016_SEU_1B",
202-
"description": "expeditionCode: INDO_PIRE | samplingProtocol: ARMS | taxonomy team: MINV | projectId: 80",
203-
"hasFeatureOfInterest": "coral reef",
204-
"responsibility": ["Aji Wahyu Anggoro","Andrianus Sembiring"],
205-
"resultTime": "2016-08-09",
206-
"samplingSite": {
207-
"description": "Shallow, coastal reef. Apparent exposure to current, Porites dominated. Less impacted bleaching site, high recruitment, 12 m.",
208-
"label": "",
209-
"location": {
210-
"elevation": "maximumDepthInMeters: 12",
211-
"latitude": 5.89430,
212-
"longitude": 95.25293
213-
},
214-
"placeName": ["Pulau Seulako"]
215-
}
216-
},
217-
"registrant": "Chris Meyer",
218-
"samplingPurpose": "genomic analysis",
219-
"curation": {
220-
"accessConstraints": "",
221-
"curationLocation": "",
222-
"responsibility": ""
223-
},
224-
"relatedResource": {
225-
"label":"subsample tissue",
226-
"description":"",
227-
"target":"ark:/21547/Cat2INDO106431.1",
228-
"relationship":"subsample"
229-
}
132+
}
230133
}
231-
</pre>
134+
```
232135
</details>
233136

234-
We need to use the following command to validate our instance files with schema.
235-
```
236-
linkml-validate -s iSamplesSchemaBasic0.3.yaml instance.json
237-
jsonschema -i instance.json iSamplesSchemaBasic0.3.schema.json
137+
### Validate instance files
138+
139+
```bash
140+
linkml-validate -s src/schemas/isamples_core.yaml instance.json
141+
jsonschema -i instance.json isamples_core.schema.json
238142
```
239-
The first command is to validate instance file with yaml schema. The second command is to validate instance file with json schema.
240143

241-
## Run tools in a Docker container
242-
The iSamples Metadata Docker container is based on the Docker container from the LinkML project [https://hub.docker.com/r/monarchinitiative/linkml/tags]
144+
The first command validates an instance file against the YAML schema. The second command validates against the JSON schema.
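
For running both checks together, a small wrapper along these lines may be convenient. This is a sketch only: the helper names `validation_commands` and `run_validations` are hypothetical, the default paths are the ones used in the commands above, and it assumes both tools are installed on the `PATH` (a missing tool is skipped rather than treated as a failure).

```python
import shutil
import subprocess


def validation_commands(instance,
                        yaml_schema="src/schemas/isamples_core.yaml",
                        json_schema="isamples_core.schema.json"):
    """Build the two validation command lines shown in the README."""
    return [
        ["linkml-validate", "-s", yaml_schema, instance],
        ["jsonschema", "-i", instance, json_schema],
    ]


def run_validations(instance):
    """Run each installed validator; map tool name to exit code (None if skipped)."""
    results = {}
    for cmd in validation_commands(instance):
        if shutil.which(cmd[0]) is None:
            results[cmd[0]] = None  # tool not found on PATH, skipped
            continue
        results[cmd[0]] = subprocess.run(cmd).returncode
    return results


# Show the commands that would be run for a given instance file.
for cmd in validation_commands("instance.json"):
    print(" ".join(cmd))
```

Calling `run_validations("instance.json")` returns a dictionary keyed by tool name, so a CI step could fail when any value is neither `0` nor `None`.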
243145

244-
First you'll build the image:
245-
`docker build -t isamples_linkml .`
146+
## Docker
246147

247-
Then, running it will open a bash shell opened to `/work`, which is the Docker container volume representing the iSamples metadata repository:
248-
``docker run -a stdin -a stdout -i -t -v `pwd`:/work isamples_linkml``
148+
The iSamples Metadata Docker container is based on the Docker container from the LinkML project ([https://hub.docker.com/r/monarchinitiative/linkml/tags](https://hub.docker.com/r/monarchinitiative/linkml/tags)).
249149

250-
Then use the following commands to generate LinkML:
251-
* Command 1
252-
* Command 2
253-
* Command 3
150+
Build the image:
254151

255-
## To do
256-
- We still focus on implementing the iSamples schema under linkML requirements.
257-
- There are some bugs or unimplemented parts in the linkML.
258-
- The different pc platform will have different results or errors. We prefer to use [docker](https://www.docker.com/products/docker-desktop) to run linkML. Please follow the [linkML tutorial](https://linkml.io/linkml/intro/install.html)
152+
```bash
153+
docker build -t isamples_linkml .
154+
```
155+
156+
Run the container (opens a bash shell with the repository mounted at `/work`):
157+
158+
```bash
159+
docker run -a stdin -a stdout -i -t -v `pwd`:/work isamples_linkml
160+
```

UMLModel.qea

4 KB
Binary file not shown.

examples/APItesting/APIThing-identifier.json renamed to examples/APItesting/APIThing-identifier-SMR-Samsung.json

Lines changed: 9 additions & 1 deletion
@@ -7,21 +7,27 @@
77
"has_context_category": [
88
{
99
"label": "Any sampled feature",
10+
1011
"identifier": "https://w3id.org/isample/vocabulary/sampledfeature/anysampledfeature"
12+
1113
}
1214
],
1315
"has_context_category_confidence": [1.0],
1416
"has_material_category": [
1517
{
1618
"label": "Mineral",
19+
1720
"identifier": "https://w3id.org/isample/vocabulary/material/mineral"
21+
1822
}
1923
],
2024
"has_material_category_confidence": "None",
2125
"has_specimen_category": [
2226
{
2327
"label": "Physical specimen",
28+
2429
"identifier": "https://w3id.org/isample/vocabulary/specimentype/physicalspecimen"
30+
2531
}
2632
],
2733
"has_specimen_category_confidence": [1.0],
@@ -74,5 +80,7 @@
7480
"producedBy_samplingSite_location_h3_11": "8bacb32e6408fff",
7581
"producedBy_samplingSite_location_h3_12": "8cacb32e64083ff",
7682
"producedBy_samplingSite_location_h3_13": "8dacb32e64082bf",
83+
7784
"producedBy_samplingSite_location_h3_14": "8eacb32e640828f"
78-
}
85+
}
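
The `producedBy_samplingSite_location_h3_N` fields above appear to hold H3 cell indexes at resolution `N`. As a quick consistency check, the resolution can be read directly from the index: in the H3 index layout it occupies bits 52 to 55 of the 64-bit value. The sketch below is an illustration using only the four values visible in this hunk; the helper name `h3_resolution` is hypothetical and the code does not depend on the `h3` library.

```python
def h3_resolution(index_hex):
    """Extract the resolution (0-15) from an H3 index given as a hex string."""
    return (int(index_hex, 16) >> 52) & 0xF


# Values copied from the example record in this diff.
fields = {
    "producedBy_samplingSite_location_h3_11": "8bacb32e6408fff",
    "producedBy_samplingSite_location_h3_12": "8cacb32e64083ff",
    "producedBy_samplingSite_location_h3_13": "8dacb32e64082bf",
    "producedBy_samplingSite_location_h3_14": "8eacb32e640828f",
}

for name, index_hex in fields.items():
    expected = int(name.rsplit("_", 1)[1])  # resolution named in the field
    actual = h3_resolution(index_hex)
    status = "ok" if actual == expected else "MISMATCH"
    print(f"{name}: resolution {actual} ({status})")
```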
86+
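
One detail in this example record may deserve a second look: `has_material_category_confidence` is the string `"None"`, while the other two confidence fields are lists of numbers. Whether a list of numbers is the intended type is an assumption drawn from the neighbouring fields, since the schema for this record is not part of the diff. A sketch of a check that would flag such values (the function name `check_confidence_fields` is hypothetical):

```python
def check_confidence_fields(record):
    """Return messages for *_confidence fields that are not lists of numbers."""
    problems = []
    for key, value in record.items():
        if not key.endswith("_confidence"):
            continue
        is_number_list = isinstance(value, list) and all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in value
        )
        if not is_number_list:
            problems.append(f"{key}: expected a list of numbers, got {value!r}")
    return problems


# Confidence fields as they appear in the example record.
record = {
    "has_context_category_confidence": [1.0],
    "has_material_category_confidence": "None",
    "has_specimen_category_confidence": [1.0],
}

for message in check_confidence_fields(record):
    print(message)
```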
