nickersonff/breast_cancer_classification

Pre-processing Pipeline and Imaging Equipment Impact the Performance of CNN for Breast Cancer Detection in Mammograms: Evidence from CBIS-DDSM and VinDr-Mammo


1. Files

code/pt/learners

  • local_mammo_learner.py - Main script that manages the model's lifecycle, including functions for training and for loading and saving weights. It also serves as the entry point for running preprocessing tests.

code/pt/utils

  • birads_categories.json - Configuration file that defines the mapping of BI-RADS categories, used for model class reduction.
  • img_utils.py - Module with utility functions for image manipulation and transformation, applied during the preprocessing stage.
  • preprocess_dicom - Script responsible for processing DICOM files, including loading the file, extracting image data, and applying preprocessing routines.
  • preprocess_json.py - Module containing utility functions for reading and parsing the dataset's JSON file.
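As an illustration of what class reduction through a category mapping can look like, here is a minimal sketch. The JSON contents, the label names, and the `reduce_birads` helper are all hypothetical; the actual structure of birads_categories.json is not shown in this README and may differ.

```python
import json

# Hypothetical file contents: the real birads_categories.json may use
# different keys, labels, and groupings. This grouping is arbitrary.
EXAMPLE_JSON = (
    '{"1": "negative", "2": "negative", "3": "negative", '
    '"4": "positive", "5": "positive"}'
)


def reduce_birads(category, mapping):
    """Map a BI-RADS category (int or str) to a reduced class label."""
    key = str(category)
    if key not in mapping:
        raise KeyError(f"BI-RADS category {category!r} is not in the mapping")
    return mapping[key]


mapping = json.loads(EXAMPLE_JSON)
labels = [reduce_birads(c, mapping) for c in (1, 4, 5)]
```

Keeping the mapping in a JSON file rather than in code means the class grouping can be changed without editing the learner.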

data/datasets This path stores the JSON manifest files for all datasets. Each file contains the metadata and image paths for its dataset and corresponds to a specific test scenario.

data/preprocessed Output directory for the final images generated by the pre-processing pipeline.

2. How to Configure and Run Tests

The main functions for configuring test scenarios are: pipelines and preprocessing.

  • pipelines : This function builds the test scenario by generating 25 random pipelines for validation on a given dataset.

  • preprocessing : This function executes a predefined set of preprocessing steps to test the following stages: normalization/standardization, image resizing, and filter application.
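To make the idea of randomly generated pipelines concrete, the sketch below samples 25 configurations, one option per stage. It is only an assumption about how such sampling could work: the `STAGE_OPTIONS` values, the `sample_pipelines` helper, and the choice to draw one option per stage are hypothetical and are not taken from the repository's pipelines function.

```python
import random

# Hypothetical options per stage; the repository's actual choices
# are not documented in this README.
STAGE_OPTIONS = {
    "intensity": ["normalization", "standardization"],
    "resize": [(224, 224), (512, 512), (1024, 1024)],
    "filter": [None, "clahe", "median", "gaussian"],
}


def sample_pipelines(n=25, seed=0):
    """Return n pipeline configurations, each picking one option per stage."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return [
        {stage: rng.choice(options) for stage, options in STAGE_OPTIONS.items()}
        for _ in range(n)
    ]


candidate_pipelines = sample_pipelines()
```

Seeding the generator is a design choice for reproducibility: the same seed yields the same 25 configurations, so results across datasets can be compared on identical pipelines.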

2.1 Run tests by passing arguments through the command line.

python3 ./code/pt/learners/local_mammo_learner.py '/home/nfferreira/data/dataset_site-1_DDSM.json' './data/preprocess/'

pipelines(debug_datalist=argumentos[1], debug_dataset_root=argumentos[2])

  • debug_datalist: path to the dataset JSON file.
  • debug_dataset_root: output folder for image processing.

preprocessing(debug_datalist=argumentos[1], debug_dataset_root=argumentos[2])

  • debug_datalist: path to the dataset JSON file.
  • debug_dataset_root: output folder for image processing.
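The calls above index into a list named argumentos, which presumably holds the command-line arguments (for example sys.argv); that is an assumption, since the README does not show how it is built. The following sketch shows how the two positional arguments could be validated and forwarded. The `pipelines` stub and the `main` helper are placeholders for illustration, not the repository's code.

```python
def pipelines(debug_datalist, debug_dataset_root):
    # Placeholder standing in for the repository's function of the same name.
    return ("pipelines", debug_datalist, debug_dataset_root)


def main(argumentos):
    """Forward the two positional arguments to pipelines().

    argumentos is assumed to have the shape of sys.argv:
    [script_name, dataset_json_path, output_dir].
    """
    if len(argumentos) < 3:
        raise ValueError(
            "usage: local_mammo_learner.py <dataset.json> <output_dir>"
        )
    return pipelines(
        debug_datalist=argumentos[1],
        debug_dataset_root=argumentos[2],
    )


# Example call with an explicit argument list instead of sys.argv.
result = main(["local_mammo_learner.py", "dataset.json", "./data/preprocess/"])
```

In a script, the same helper would be invoked as main(sys.argv) after importing sys. Checking the argument count first gives a clear usage message instead of an IndexError.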

2.2 Running Tests on Multiple Datasets.

lista = ['/home/nfferreira/data/dataset_site-1_VINDR_DDSM.json', '/home/nfferreira/data/dataset_site-1_DDSM.json', '/home/nfferreira/data/dataset_site-1_VINDR_ALLMAN.json']
for i in lista:
    preprocessing(debug_datalist=i, debug_dataset_root=argumentos[2])
