Commit cf27acf
committed
dataset processing jupiter
1 parent d662d18 commit cf27acf

1 file changed
Lines changed: 381 additions & 0 deletions
@@ -0,0 +1,381 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "# Overview\n",
    "\n",
    "This tutorial gives you an overview of how to get from a DICOM dump to a processed dataset with segmentations.\n",
    "\n",
    "Abbreviations:\n",
    "POI: point of interest\n",
    "\n",
    "Steps:\n",
    "\n",
    "(1) DICOM export to BIDS dataset\n",
    "\n",
    "(2) ~~Inter-scan image registration.~~\n",
    "\n",
    "(2.1) ~~Rigid movement correction with automatic spine POIs~~\n",
    "\n",
    "(2.2) ~~Deformable movement~~\n",
    "\n",
    "(3) Stitching\n",
    "\n",
    "(3.1) ~~Stitching with rigid movement compensation (from 2.1)~~\n",
    "\n",
    "(3.2) ~~Stitching with deformable movement compensation (from 2.2)~~\n",
    "\n",
    "(4) Segmentation: TotalVibeSegmentator, SPINEPS, ...\n",
    "\n",
    "(5) ~~MR deformable registration (from 2.1, 2.2)~~\n",
    "\n",
    "(6) ~~Water-fat swap detection in VIBE and MEVIBE~~\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1 DICOM export to BIDS dataset\n",
    "\n",
    "Short overview:\n",
    "\n",
    "A BIDS dataset is a file-naming convention.\n",
    "\n",
    "The following rules should be known and are weakly enforced:\n",
    "\n",
    "- A dataset folder should start with 'dataset-{YOUR-NAME}'\n",
    "- The next-level folders are:\n",
    "  - rawdata: for all imaging data.\n",
    "  - derivative: for all generated data, like segmentations.\n",
    "\n",
    "A file name should look like:\n",
    "\n",
    "sub-{Subject name}_ses-{Session}_{key}-{value}*_{format}.{filetype}\n",
    "- Subject name: unique identifier\n",
    "- Session: session id. Optional if there is only one session.\n",
    "- Any number of key-value pairs. Keys are unique. The defined keys are listed here: https://bids-specification.readthedocs.io/en/stable/appendices/entities.html . Our tool enforces a certain order; see tutorial_BIDS_files.ipynb.\n",
    "- format: type of acquisition, like ct, T2w, VIBE, MPRage\n",
    "\n",
    "Do not use '_' in any keys or values.\n",
    "\n",
    "See https://bids-specification.readthedocs.io/en/stable/ for a detailed description of what BIDS is."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from TPTBox.core.bids_files import formats, entities_keys\n",
    "\n",
    "print('Known formats:\\n', '\\n'.join(formats))\n",
    "print()\n",
    "print(\"Order of keys we enforce:\\n\", '\\n'.join(entities_keys.keys()))"
   ]
  },
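  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the naming convention, we can parse an example file name (the path below is made up; we assume `BIDS_FILE` exposes the parsed key-value pairs via `.info` and the format via `.bids_format`, as used later in this tutorial):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from TPTBox import BIDS_FILE\n",
    "\n",
    "# Hypothetical file path following the naming scheme above\n",
    "example = BIDS_FILE(\"dataset-example/rawdata/sub-01/T2w/sub-01_ses-20240101_sequ-301_acq-sag_T2w.nii.gz\", \"dataset-example\")\n",
    "print(example.info)  # the parsed key-value pairs\n",
    "print(example.bids_format)  # the format, here T2w"
   ]
  },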
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "This function extracts a DICOM folder to a BIDS-like NIfTI folder.\n",
    "\n",
    "The names are created like this (DICOM:Key is the given DICOM key):\n",
    "\n",
    "'dataset-{NAME}/rawdata/sub-{DICOM:PatientID}/ses-{DICOM:StudyDate}/{format}/sub-{DICOM:PatientID}_ses-{DICOM:StudyDate}_sequ-{DICOM:SeriesNumber}_acq-{sag|ax|cor|iso}_{format}.nii.gz'\n",
    "\n",
    "and a .json, where the DICOM keys are saved.\n",
    "\n",
    "To get {format} we use string matching on the DICOM \"SeriesDescription\" key. As this is free text, this will not always work; then we default to \"mr\" and you have to rename the files manually.\n",
    "\n",
    "For very large datasets you can use make_subject_chunks = n [int]. Then we put an additional folder with the first n letters between rawdata and the sub- folder."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from TPTBox.core.dicom.dicom_extract import extract_dicom_folder\n",
    "\n",
    "# Set these to your own data:\n",
    "# path_to_dicom_dataset = \"TODO\"\n",
    "# dataset_name = 'example-name'\n",
    "path_to_dicom_dataset = \"/media/data/robert/datasets/dicom_example/VR-DICOM/\"\n",
    "dataset_name = 'VR-DICOM2'\n",
    "target_folder = Path(path_to_dicom_dataset).parent\n",
    "dataset = target_folder / f\"dataset-{dataset_name}\"\n",
    "extract_dicom_folder(Path(path_to_dicom_dataset), dataset, use_session=True, n_cpu=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Re-define the paths, so the following cells can be run without re-running the extraction.\n",
    "from pathlib import Path\n",
    "\n",
    "path_to_dicom_dataset = \"/media/data/robert/datasets/dicom_example/VR-DICOM/\"\n",
    "dataset_name = 'VR-DICOM2'\n",
    "target_folder = Path(path_to_dicom_dataset).parent\n",
    "dataset = target_folder / f\"dataset-{dataset_name}\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have a tool that automatically scans BIDS folders and creates a grouped dictionary, from which you can pick the relevant files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from TPTBox import BIDS_Global_info, BIDS_FILE\n",
    "from TPTBox.core.bids_constants import sequence_splitting_keys\n",
    "\n",
    "print(\"If one of the values of these keys is different, then it is considered a different sequence:\", sequence_splitting_keys)\n",
    "print(\"sub will always split\")\n",
    "\n",
    "print(\"Let's search for stitching candidates. For this we have to remove the sequ key from sequence_splitting_keys.\")\n",
    "my_splitting_keys = sequence_splitting_keys.copy()\n",
    "my_splitting_keys.remove(\"sequ\")\n",
    "my_splitting_keys.append(\"part\")\n",
    "\n",
    "bgi = BIDS_Global_info(dataset, [\"rawdata\", \"derivative\"], sequence_splitting_keys=my_splitting_keys)\n",
    "stitching_candidate: list[list[BIDS_FILE]] = []\n",
    "epsilon = 0.2  # tolerance for zoom differences\n",
    "for name, subj in bgi.iter_subjects():\n",
    "    print('Subject identifier', name)\n",
    "    q = subj.new_query()\n",
    "    # Filter by some rules\n",
    "    q.flatten()\n",
    "    q.filter_filetype('nii.gz')\n",
    "    q.unflatten()\n",
    "    for fam in q.loop_dict():\n",
    "        print(fam)\n",
    "        for key, file_list in fam.items():\n",
    "            if key == \"mr\":\n",
    "                continue\n",
    "            if len(file_list) == 1:\n",
    "                continue\n",
    "            # This code is only an example, where we group images with the same orientation and zoom, so we know which images are potential stitching targets.\n",
    "            # We use the format key as the initial split, so T1w and T2w will not be stitched.\n",
    "            for i in range(len(file_list)):\n",
    "                f1 = file_list[i]\n",
    "                if f1 is None:\n",
    "                    continue\n",
    "                grid1 = f1.get_grid_info()\n",
    "                if grid1 is None:\n",
    "                    continue\n",
    "                current_group = [f1]  # Start a new group with the current file\n",
    "                for j in range(i + 1, len(file_list)):\n",
    "                    f2 = file_list[j]\n",
    "                    if f2 is None:\n",
    "                        continue\n",
    "                    grid2 = f2.get_grid_info()\n",
    "                    if grid2 is None:\n",
    "                        continue\n",
    "                    # Check if the orientation matches\n",
    "                    if grid1.orientation == grid2.orientation:\n",
    "                        # Check if the zoom is within the tolerance\n",
    "                        zoom_diff = [abs(z1 - z2) for z1, z2 in zip(grid1.zoom, grid2.zoom, strict=False)]\n",
    "                        if all(diff <= epsilon for diff in zoom_diff):\n",
    "                            current_group.append(f2)\n",
    "                            file_list[j] = None  # type: ignore\n",
    "                # Add the group if it has more than one file\n",
    "                if len(current_group) > 1:\n",
    "                    stitching_candidate.append(current_group)\n",
    "for files in stitching_candidate:\n",
    "    print(files)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 3 Stitching\n",
    "Thorax/full-body images are often acquired in chunks. We can stitch them together with the stitching function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from TPTBox.stitching import stitching\n",
    "from TPTBox import to_nii\n",
    "from concurrent.futures import ProcessPoolExecutor\n",
    "\n",
    "derivative_folder = \"derivative_stiched\"\n",
    "\n",
    "def process_files(files):\n",
    "    files = sorted(files)  # noqa: PLW2901\n",
    "    sequ: str = (files[0].get(\"sequ\", \"\") + \"-\" if \"sequ\" in files[0].info else \"\") + \"stiched\"  # type: ignore\n",
    "    out_name = files[0].get_changed_path(\"nii.gz\", info={\"sequ\": sequ}, parent=derivative_folder)\n",
    "    if not out_name.exists():\n",
    "        stitching(*files, out=out_name, is_seg=False, is_ct=files[0].bids_format == \"ct\", dtype=to_nii(files[0]).dtype)\n",
    "        nii = to_nii(out_name)\n",
    "        nii.apply_crop_(nii.compute_crop())\n",
    "        nii.save(out_name)\n",
    "\n",
    "# Test on a single group first\n",
    "process_files(stitching_candidate[0])\n",
    "# Execute the loop in parallel using a ProcessPoolExecutor\n",
    "with ProcessPoolExecutor() as executor:\n",
    "    executor.map(process_files, stitching_candidate)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 4 Segmentation\n",
    "\n",
    "Note: by default we do not install the deep-learning dependencies.\n",
    "\n",
    "Install:\n",
    "\n",
    "```pip install SPINEPS ruamel.yaml configargparse```\n",
    "\n",
    "Troubleshooting: try pinning nnunetv2==2.4.2\n"
   ]
  },
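  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A quick sanity check that the optional deep-learning dependencies are installed (a sketch; it only tries the imports used by the cells below):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Try importing the optional deep-learning dependencies\n",
    "try:\n",
    "    import spineps  # noqa: F401\n",
    "    import nnunetv2  # noqa: F401\n",
    "    print('Deep-learning dependencies found.')\n",
    "except ImportError as e:\n",
    "    print('Missing dependency, see the install command above:', e)"
   ]
  },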
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### TotalVibeSegmentator\n",
    "\n",
    "https://arxiv.org/abs/2406.00125\n",
    "\n",
    "https://github.com/robert-graf/TotalVibeSegmentator\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from TPTBox.segmentation import run_totalvibeseg\n",
    "from TPTBox import BIDS_FILE\n",
    "\n",
    "# You can also use a string/Path if you want to set the output path yourself.\n",
    "dataset = \"/media/data/robert/datasets/dicom_example/dataset-VR-DICOM2/\"\n",
    "in_file = BIDS_FILE(f\"{dataset}/derivative_stiched/sub-111168222/T2w/sub-111168222_sequ-301-stiched_acq-ax_part-water_T2w.nii.gz\", dataset)\n",
    "out_file = in_file.get_changed_path(\"nii.gz\", \"msk\", parent=\"derivative\", info={\"seg\": \"TotalVibeSegmentator\", \"mod\": in_file.bids_format})\n",
    "run_totalvibeseg(in_file, out_file)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SPINEPS\n",
    "\n",
    "SPINEPS can segment spine images into an instance mask and a semantic mask. Running it automatically over a dataset is very opinionated about what to segment.\n",
    "TODO: make a way to manually define output paths\n",
    "\n",
    "https://github.com/Hendrik-code/spineps/tree/main"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# If your dataset is BIDS compliant, you can auto-run SPINEPS over all of it:\n",
    "from TPTBox.segmentation import run_spineps_all\n",
    "\n",
    "# run_spineps_all(dataset)\n"
   ]
  },
305+
{
306+
"cell_type": "code",
307+
"execution_count": null,
308+
"metadata": {},
309+
"outputs": [],
310+
"source": [
311+
"# Pick a fitting model:\n",
312+
"from spineps.models import modelid2folder_semantic,modelid2folder_instance\n",
313+
"print('Available Semantic Models',modelid2folder_semantic())\n",
314+
"print('Available Instance Models',modelid2folder_instance())\n",
315+
"\n",
316+
"print(modelid2folder_semantic().keys())\n",
317+
"print(modelid2folder_instance().keys())\n",
318+
"dataset = \"/media/data/robert/datasets/dicom_example/dataset-VR-DICOM2\"\n",
319+
"file_path = f\"{dataset}/derivative_stiched/sub-111168223/T2w/sub-111168223_sequ-401-stiched_acq-sag_part-inphase_T2w.nii.gz\"\n",
320+
"#file_path = f\"{dataset}/derivative_stiched/sub-111168223/T2w/sub-111168223_sequ-201-stiched_acq-ax_part-inphase_T2w.nii.gz\"\n",
321+
"\n",
322+
"model_semantic = \"t2w\"\n",
323+
"model_instance = \"instance\"\n",
324+
"derivative_name = \"derivative\"\n"
325+
]
326+
},
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from TPTBox.segmentation.spineps import run_spineps_single\n",
    "\n",
    "# With 'ignore_compatibility_issues = True' you can force it to run anyway.\n",
    "out_paths = run_spineps_single(\n",
    "    file_path,\n",
    "    dataset=dataset,\n",
    "    model_semantic=model_semantic,\n",
    "    model_instance=model_instance,\n",
    "    derivative_name=derivative_name,\n",
    "    ignore_compatibility_issues=False,\n",
    ")\n",
    "print(out_paths)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "py3.11",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
