Commit bd636f1

Added the ability to link visits by their IDs instead of only by their timestamps, which should be more robust when dates in MIMIC-III are wrong or noisy (#547)
Co-authored-by: John Wu <johnwu3@sunlab-work-01.cs.illinois.edu>
Honestly, this is really bad practice, but we need to push this forward, because I need this merge for another PR lol
1 parent 8b2b769 commit bd636f1
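
The motivation in the commit message can be illustrated with a small sketch. This is not PyHealth's implementation: the helper names `link_by_timestamp` and `link_by_id`, the `admittime` field, and all row values are assumptions made up for illustration. It shows how an event whose timestamp falls outside its admission window is lost by timestamp-based linking but kept when linking on `hadm_id`.

```python
from collections import defaultdict
from datetime import datetime

# Made-up admissions and events; only the field name hadm_id comes from the diff.
admissions = [
    {"hadm_id": 100, "admittime": datetime(2150, 1, 1), "dischtime": datetime(2150, 1, 5)},
    {"hadm_id": 101, "admittime": datetime(2150, 3, 1), "dischtime": datetime(2150, 3, 9)},
]
events = [
    {"hadm_id": 100, "timestamp": datetime(2150, 1, 2), "code": "4019"},
    # Noisy date: recorded well after admission 100 was discharged.
    {"hadm_id": 100, "timestamp": datetime(2150, 1, 20), "code": "25000"},
    {"hadm_id": 101, "timestamp": datetime(2150, 3, 2), "code": "41401"},
]


def link_by_timestamp(events, admissions):
    """Hypothetical: assign each event to the admission whose time window contains it."""
    visits = defaultdict(list)
    for event in events:
        for adm in admissions:
            if adm["admittime"] <= event["timestamp"] <= adm["dischtime"]:
                visits[adm["hadm_id"]].append(event["code"])
    return dict(visits)


def link_by_id(events):
    """Hypothetical: assign each event to a visit using its hadm_id directly."""
    visits = defaultdict(list)
    for event in events:
        visits[event["hadm_id"]].append(event["code"])
    return dict(visits)


by_time = link_by_timestamp(events, admissions)
by_id = link_by_id(events)
```

With these toy rows, the event dated 2150-01-20 matches no admission window, so it is missing from `by_time` but present under visit 100 in `by_id`.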

3 files changed: +424 -278 lines changed

pyhealth/datasets/configs/mimic3.yaml

Lines changed: 16 additions & 0 deletions
@@ -54,14 +54,22 @@ tables:
           - "dischtime"
     timestamp: "dischtime"
     attributes:
+      - "hadm_id"
       - "icd9_code"
       - "seq_num"
 
   prescriptions:
     file_path: "PRESCRIPTIONS.csv.gz"
     patient_id: "subject_id"
     timestamp: "startdate"
+    join:
+      - file_path: "ADMISSIONS.csv.gz"
+        "on": "hadm_id"
+        how: "inner"
+        columns:
+          - "dischtime"
     attributes:
+      - "hadm_id"
       - "drug"
       - "drug_type"
       - "drug_name_poe"
@@ -88,6 +96,7 @@ tables:
           - "dischtime"
     timestamp: "dischtime"
     attributes:
+      - "hadm_id"
       - "icd9_code"
       - "seq_num"
 
@@ -116,9 +125,16 @@ tables:
   noteevents:
     file_path: "NOTEEVENTS.csv.gz"
     patient_id: "subject_id"
+    join:
+      - file_path: "ADMISSIONS.csv.gz"
+        "on": "hadm_id"
+        how: "inner"
+        columns:
+          - "dischtime"
     timestamp:
       - "charttime"
     attributes:
+      - "hadm_id"
       - "text"
       - "category"
       - "description"
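
The `join` entries added above appear to describe an inner join of each event table with `ADMISSIONS.csv.gz` on `hadm_id`, pulling in only `dischtime`. A minimal pure-Python sketch of that semantics follows, under stated assumptions: `inner_join` is a hypothetical helper, not the loader PyHealth uses, and the rows are made up.

```python
# Toy rows standing in for PRESCRIPTIONS.csv.gz and ADMISSIONS.csv.gz (values are made up).
prescriptions = [
    {"subject_id": 1, "hadm_id": 100, "startdate": "2150-01-02", "drug": "aspirin"},
    {"subject_id": 1, "hadm_id": 101, "startdate": "2150-03-05", "drug": "heparin"},
    {"subject_id": 2, "hadm_id": 999, "startdate": "2151-07-01", "drug": "insulin"},
]
admissions = [
    {"subject_id": 1, "hadm_id": 100, "dischtime": "2150-01-05 12:00:00"},
    {"subject_id": 1, "hadm_id": 101, "dischtime": "2150-03-09 09:30:00"},
]


def inner_join(events, other, on, columns):
    """Hypothetical helper: inner-join events to other on key `on`, copying only `columns`."""
    # Assumes `on` is unique in `other`, as hadm_id is in this toy admissions list.
    lookup = {row[on]: row for row in other}
    joined = []
    for event in events:
        match = lookup.get(event[on])
        if match is None:
            continue  # how: "inner" -> events without a matching admission are dropped
        joined.append({**event, **{col: match[col] for col in columns}})
    return joined


# Mirrors the config keys: "on": "hadm_id", how: "inner", columns: ["dischtime"].
joined = inner_join(prescriptions, admissions, on="hadm_id", columns=["dischtime"])
```

One consequence worth noting in this sketch: because the join is inner, the row with `hadm_id` 999 disappears, so events that lack a matching admission would not survive a join of this kind.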
