A command-line tool for analyzing and visualizing BioImage Archive (BIA) study statistics.
Clone this repository and install dependencies:
git clone https://github.com/bioimage-archive/bia-study-stats.git
cd bia-study-stats
pip install -e .

- print_accessions: Display a table of accession IDs and their sizes
  bia-study-stats print_accessions stats.json
- summarize: Show summary statistics, including total accessions and storage usage
  bia-study-stats summarize stats.json
- merge_df_sizes: Merge size information from a df command output file
  bia-study-stats merge_df_sizes stats.json df_output.txt
- merge_s3_cache: Update sizes using an S3 cache file
  bia-study-stats merge_s3_cache stats.json s3_cache.json
- update_from_fire: Fetch sizes directly from S3/FIRE storage for studies with zero size
  bia-study-stats update_from_fire stats.json --failed-log errors.log
- data_added_after: Calculate the total data volume added after a specific date
  bia-study-stats data_added_after stats.json 2023-01-01
- plot_cumulative_size: Generate a bar chart showing cumulative data size by quarter
  bia-study-stats plot_cumulative_size stats.json
- plot_cumulative_entries: Create a bar chart of cumulative study count by quarter
  bia-study-stats plot_cumulative_entries stats.json
- print_ebi_stats: Output monthly cumulative size statistics in EBI format
  bia-study-stats print_ebi_stats stats.json
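
To illustrate the kind of aggregation behind data_added_after and plot_cumulative_size, here is a minimal sketch in plain Python. It is not the tool's actual implementation: the layout of stats.json is not documented in this README, so the record fields (accession, release_date, size_bytes), the sample accessions, and the function names are all hypothetical. Treating "after" as strictly later than the cutoff date is also an assumption.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records; the real stats.json schema may differ.
studies = [
    {"accession": "S-EXAMPLE1", "release_date": "2022-11-15", "size_bytes": 1_000},
    {"accession": "S-EXAMPLE2", "release_date": "2023-02-01", "size_bytes": 2_500},
    {"accession": "S-EXAMPLE3", "release_date": "2023-02-20", "size_bytes": 500},
    {"accession": "S-EXAMPLE4", "release_date": "2023-07-04", "size_bytes": 4_000},
]


def quarter_label(d: date) -> str:
    """Return a sortable label such as '2023-Q1' for a date."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def cumulative_size_by_quarter(records):
    """Sum sizes per quarter, then accumulate them in chronological order."""
    per_quarter = defaultdict(int)
    for record in records:
        released = date.fromisoformat(record["release_date"])
        per_quarter[quarter_label(released)] += record["size_bytes"]

    running_total = 0
    result = []
    # Labels sort chronologically because the four-digit year comes first.
    for label in sorted(per_quarter):
        running_total += per_quarter[label]
        result.append((label, running_total))
    return result


def data_added_after(records, cutoff: str) -> int:
    """Total size of records released strictly after the cutoff (an assumption)."""
    cutoff_date = date.fromisoformat(cutoff)
    return sum(
        record["size_bytes"]
        for record in records
        if date.fromisoformat(record["release_date"]) > cutoff_date
    )


if __name__ == "__main__":
    for label, total in cumulative_size_by_quarter(studies):
        print(label, total)
    print("added after 2023-01-01:", data_added_after(studies, "2023-01-01"))
```

Quarters with no releases (2023-Q2 in the sample data) do not appear in the result; a chart that needs a bar for every quarter would have to fill those gaps with the previous running total.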
For commands that interact with S3/FIRE storage, create a .env file with:
S3_BUCKET=your-bucket-name
S3_ENDPOINT=https://your-endpoint.com # Optional
AWS_PROFILE=your-profile # Optional
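
The sketch below shows one way settings like these could be read, including the inline "# Optional" comments. How the tool itself loads the .env file is not stated here, so the functions parse_env and s3_settings are hypothetical helpers written with the standard library only.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, ignoring blank lines and comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop an inline comment introduced by a space followed by '#'.
        value = value.split(" #", 1)[0].strip()
        settings[key.strip()] = value
    return settings


def s3_settings(env: dict) -> dict:
    """Collect the settings, treating everything except the bucket as optional."""
    bucket = env.get("S3_BUCKET")
    if not bucket:
        raise ValueError("S3_BUCKET is required for S3/FIRE commands")
    return {
        "bucket": bucket,
        # None means "not set"; the key names here are illustrative.
        "endpoint_url": env.get("S3_ENDPOINT") or None,
        "profile_name": env.get("AWS_PROFILE") or None,
    }


example = """\
S3_BUCKET=your-bucket-name
S3_ENDPOINT=https://your-endpoint.com # Optional
AWS_PROFILE=your-profile # Optional
"""

config = s3_settings(parse_env(example))

if __name__ == "__main__":
    print(config)
```

Only S3_BUCKET is treated as mandatory in this sketch, matching the "Optional" markers on the other two variables.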
- quarterly_cumulative_size.png: Generated by plot_cumulative_size
- quarterly_cumulative_entries.png: Generated by plot_cumulative_entries