Skip to content

Comments

Expanded unit tests#620

Merged
kclem merged 1 commit intopinellolab:masterfrom
edilytics:master
Feb 20, 2026
Merged

Expanded unit tests#620
kclem merged 1 commit intopinellolab:masterfrom
edilytics:master

Conversation

@Colelyman
Copy link
Contributor

This PR expands the unit tests across CRISPResso, ensuring correct functionality.

* Bug Fix - 367 (#35)

* - Fixed references to ref_names_for_pe

* removed extra tabs

* trying to match empty line, no tabs

* - changed references to ref_names[0]

* Mckay/pd warnings (#45)

* refactor errors='ignore' to try except

* refactored integer slice to iloc[]

* moved to_numeric try except to function

* Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

This change is slightly cleaner because it addresses the root issue that some
columns are strings (and can therefore not be converted to numeric types). Now
if an error does occur when converting the dfs to numeric types it won't be
swallowed up.

* Add documentation to to_numeric_ignore_columns

---------

Co-authored-by: Cole Lyman <[email protected]>

---------

Co-authored-by: Cole Lyman <[email protected]>

* GitHub actions integration tests (#48)

* GitHub actions clean (#40)

* Create pytest.yml

* Create pylint.yml

* Create .pylintrc

* Create test_env.yml

* Full path

* Remove conda install

* Replace path

* Pytest tests

* pip -e

* Create integration_tests.yml

* Simplify name

* CRISPRESSO2_DIR environment variable

* Up one dir

* ls workspace

* Install CRISPResso and ydiff

* Clone repo instead of checkout

* submodule

* ls

* CRISPResso2_copy

* ls

* Update env

* Simplify

* Pull from githubactions branch

* Pull githubactions repo

* Checkout githubactions

* Mckay/pd warnings (#45)

* refactor errors='ignore' to try except

* refactored integer slice to iloc[]

* moved to_numeric try except to function

* Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

This change is slightly cleaner because it addresses the root issue that some
columns are strings (and can therefore not be converted to numeric types). Now
if an error does occur when converting the dfs to numeric types it won't be
swallowed up.

* Add documentation to to_numeric_ignore_columns

---------

Co-authored-by: Cole Lyman <[email protected]>

* Run tests individually

* Pin plotly version

* Run all tests even if one fails

* Test on another branch

* Switch branch with token

* Update integration_tests.yml

* Introduce pandas sorting in CRISPRessoCompare (#47)

* New makefile commands

* Fix interleaved fastq input in CRISPRessoPooled and suppress CRISPRessoWGS params (#42)

* Extract out split_interleaved_fastq function to CRISPRessoShared

* Implement splitting interleaved fastq files in CRISPRessoPooled

* Suppress split_interleaved_input from CRISPRessoWGS parameters

* Suppress other parameters in CRISPRessoWGS

* Move where interleaved fastq files are split to be trimmed properly

* Bug Fix - 367 (#35)

* - Fixed references to ref_names_for_pe

* removed extra tabs

* trying to match empty line, no tabs

* - changed references to ref_names[0]

* Mckay/pd warnings (#45)

* refactor errors='ignore' to try except

* refactored integer slice to iloc[]

* moved to_numeric try except to function

* Refactor to_numeric_ignore_errors to to_numeric_ignore_columns

This change is slightly cleaner because it addresses the root issue that some
columns are strings (and can therefore not be converted to numeric types). Now
if an error does occur when converting the dfs to numeric types it won't be
swallowed up.

* Add documentation to to_numeric_ignore_columns

---------

Co-authored-by: Cole Lyman <[email protected]>

---------

Co-authored-by: Cole Lyman <[email protected]>

* On push no branches

* On push no branches

* All in one file

* Fix yml errors

* Rename jobs

* Remove old workflow files

* Remove paths

* Run jobs in parallel

---------

Co-authored-by: mbowcut2 <[email protected]>
Co-authored-by: Cole Lyman <[email protected]>

* VCF Output (#128)

* vcf file writing initial concept

* splitting vcf into testable functions, adding tests

* adding unit tests, adding nullcontext to vcf writing file

* fixed first insertion bug

* Mckay/base edit plot (#119)

* Cole/plot fixes (#121)

* fix setting of 99%ile in negative direction (deletions)

* need another break

* When reading CRISPRessoPooled amplicon file, only skip lines if they start with the comment character(#)

* Implement new overwrite_crispresso_options in CRISPRessoPooled to only add non-default commands when propagating

* In CRISPRessoPooled implement multiplexing for sub-runs and create a run_name that is filename-safe and separate from a display name.

* Add arg for display_name

* Change display name name to Display Name

* CRISPRessoWGS fix unpickleable partial error

* Move complete message to after crispresso cup to allow for json parsing of status.json

* Update CRISPRessoWGS to allow for run names

* Messages to users about counting reads in input (sometimes this takes a while for large samples)

* Cast guardrail values as ints because division casts them as floats

* Remove invalid escape of _ when writing to JSON status file

* Fix documentatio for `CRISPRessoShared.check_if_failed` and remove extraneous whitespace

* Point to updated test branch

* Fix Cython bug

* Read version from toml file

* Fix import error

* update version

* Update integration_tests.yml

Remove Native Merge test temporarily

* Pin fastp version

* Run native merge test and point tests to master

---------

Co-authored-by: kclem <[email protected]>
Co-authored-by: Cole Lyman <[email protected]>

* Additional plot fixes (#122)

* fix setting of 99%ile in negative direction (deletions)

* need another break

* When reading CRISPRessoPooled amplicon file, only skip lines if they start with the comment character(#)

* Implement new overwrite_crispresso_options in CRISPRessoPooled to only add non-default commands when propagating

* In CRISPRessoPooled implement multiplexing for sub-runs and create a run_name that is filename-safe and separate from a display name.

* Add arg for display_name

* Change display name name to Display Name

* CRISPRessoWGS fix unpickleable partial error

* Move complete message to after crispresso cup to allow for json parsing of status.json

* Update CRISPRessoWGS to allow for run names

* Messages to users about counting reads in input (sometimes this takes a while for large samples)

* Cast guardrail values as ints because division casts them as floats

* Remove invalid escape of _ when writing to JSON status file

* Fix documentatio for `CRISPRessoShared.check_if_failed` and remove extraneous whitespace

* Point to updated test branch

* Fix Cython bug

* Read version from toml file

* Move printing header info to after console log level setting in main function

* Add import for importlib.metadata to CRISPRessoShared.py

* Standardize intermediate file names for CRISPRessoPooled info and fastq files

* CRISPRessoPooled avoid gzipping nonexistant files

* Remove warning if no config file

* Allow none for custom_colors in CRISPRessoPlot

* read version from toml file

* Make fig_filename_root default to None, in which case the figure is shown interactively - e.g. in a jupyter notebook

* Print tool description after logging level has been set

* update testRelease.sh script

* Don't rerun if the 'verbosity' value has changed.

* CRISPResso core will not rerun if there are changes in the 'debug', 'n_processes', and 'verbosity' arguments

* Verbosity levels <=1 are set to 1, >= 4 are set to 4.

* update for future pd.read_json deprecation

* Fix import error

* update version

* Update integration_tests.yml

Remove Native Merge test temporarily

* Pin fastp version

* Run native merge test and point tests to master

* Point tests to cole/plot_fixes branch

* Point tests back to master

---------

Co-authored-by: kclem <[email protected]>
Co-authored-by: Sam <[email protected]>

* - added Kendall's code
- added args
- added fig 10i to report
- added tests for data prep

* added upsetplot to Dockerfile

* added save_png to plot args

* removed unused args

* remove unused variable

* cole's comments

* removed unused function

* comments

* comments

* updated test_be_df with UNMODIFIED row

* remove comment

* removed extra args
fixed name

* Update help string for `--base_editor_consider_changes_outside_qw`

* Point tests back to master

---------

Co-authored-by: Samuel Nichols <[email protected]>
Co-authored-by: kclem <[email protected]>
Co-authored-by: Cole Lyman <[email protected]>

* further vcf tweaks/work

* vcf dynamic alt map and vcf line generation with unit testing

* updating tests

* Add test case for second element deletion

* Remove unnecessary imports and add whitspace

* Add mini integration tests for alt_map and vcf_line

* Update unit tests to account for bug in find_indels_substitutions

* Add failing test case for insertion and deletion occurring at the same position

The alt map incorrectly separate the deletion and insertion to different
positions.

* Cast dict_keys to be a list

* Fix test case to reflect correct position of deletion

* Add checks to the VCF file

* Add upsetplot as a dependency in setup.py (#130)

* Fix position of deletion in test case

* Fix off by one error for deletions, update tests and fix deletion at start

It turns out that when there is a deletion at the start of the sequence, the
correct way to handle it is to provide the last base after the deletion. Source:
https://bioinformatics.stackexchange.com/questions/2476/how-to-represent-a-deletion-at-position-1-in-a-vcf-file

* Update tests to convince myself that insertion and deletion handling is working

* Remove duplicate writing of allele frequency table

* Refactor `get_allele_row` and make vcf_output not dependent on write_detailed_allele_table

* Extract out construction of df_alleles in unit tests

* Write test to illustrate bug when you have an insertion then a deletion

* Refactor unit tests to use create_df_alleles

* Fix test_build_alt_map_mixed test

* Refactor create_df_alleles to support number of reads

This fixes test_build_alt_map_substitutions

* Fix test_build_alt_map_insertions test

* Allow amplicon name to be passed to create_df_alleles

* Fix test_upsert_edit_del_and_ins

* Fix deletion at start of amplicon

* Fix bug when there is a deletion starting at the second position

This bug only happens when a deletion starts are the second position, before the
fix, it would report that the deletion started at the first position. It is
fixed now, so deletions at the second position are reported correctly.

* Fix test to reflect full deletion

* Add tests for find_indels_substitutions for deletions at the end

* Fix 1bp deletions at the end, and off by one error

This ensures that when a deletion occurs at the end of a read, the entire
deletion is accounted for.

* Fix for representing deletions at the end of a sequence

* Fix bug where deletions that extend to the end of a sequence fail

* Remove qwc_indexes

* Properly implement handling of deletions that start at the beginning of a sequence

* Properly account for multiple delete_start events

* Update test to reflect proper ref_positions and other attributes

* Fix QWC inference across amplicons (#137)

* Mckay/be plot improvements (#136)

* trying to get the figure to fit nicely, increased
element size to 100

* custom figsize to display without cutting off

increased figsize in report template

* Allow messages to be served in CLI reports (#134) (pinellolab#583)

* Fix deletion at second position (#131)

* Fix bug when there is a deletion starting at the second position

This bug only happens when a deletion starts are the second position, before the
fix, it would report that the deletion started at the first position. It is
fixed now, so deletions at the second position are reported correctly.

* Update CRISPRessoCOREResources.c due to change in .pyx

* Add tests for find_indels_substitutions for deletions at the end

* Fix 1bp deletions at the end, and off by one error

This ensures that when a deletion occurs at the end of a read, the entire
deletion is accounted for.

* Update CRISPRessoCOREResources.c to reflect fixes for deletions at the end of alignments

* Add extra asserts to deletion checks

* Point to new test branch

* Reafctor deletion_coordinates to go past the end of the string for deletions at
the end of the sequence

* Point tests to master

* Allow messages to be served in CLI reports

* Point to cole/messages test branch

* Point tests back to master

* point to tests branch

* typo

* testing github actions

* remove test

* point tests to master

---------

Co-authored-by: Cole Lyman <[email protected]>

* Update inferred QWC tests to reflect correct intended behavior

* Fix inferring QWC to match intended behavior

* Add more test cases and fix bug discovered in single bp QWC

* Add even more test cases testing indels outside the QWC

* Point tests to cole/fix-qwc-deletion

* update plotly.js (#138)

* Change order of amplicon inference alignment so that 1st amplicon is the reference

This makes a difference because it changes the values of `s1inds`, and therefore
the value of the inferred quantification window coordinates.

* Point integration tests back to master

* Update CHANGELOG.md

---------

Co-authored-by: mbowcut2 <[email protected]>

* Fix a bug for `--bam_output` when there are unaligned reads (#144)

* Fix BAM output when there are unaligned reads

* Point tests to cole/unaligned-reads

* Update CHANGELOG.md

* Fix typo in tests branch name

* Point tests back to master

* Add VCF testing and verification design document

Documents the strategy for integration testing the --vcf_output feature
using golden file comparison with syn-gen synthetic data.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Change name of output VCF file

* Point tests to vcf-parameters

* Add writers module refactor design document

Documents plan to move VCF output code from CRISPRessoUtilities.py
to a new writers/ module structure.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Refactor VCF code into writers/ module

Move CRISPRessoUtilities.py to writers/vcf.py to establish a pattern
for organizing output writers. Update imports in CRISPRessoCORE.py
and relocate tests to test_writers/test_vcf/.

Co-Authored-By: Claude Opus 4.5 <[email protected]>

* Remove the FORMAT column from VCF and add the contig length to VCF headers

* Initial VCF simplication writing each edit on its own line

* Further simplification of VCF output code

* Left align deletions for VCF output

* Left-normalize insertions for VCF output

* Update CLAUDE.md with design_docs

* Update Github Actions with Pooled Prime Editing and VCF basic and VCF Prime Edit Basic

* Update integration tests to run in parallel

* Add VCF path to crispresso2_info

* Point the tests back to cole/vcf-parameters

* Always save the conda env cache in Github Actions

* Debug: add set +e and per-target error reporting in integration tests

setup-miniconda@v3 injects 'set -eo pipefail' into ~/.profile, which
is sourced by bash -l. This causes make commands to fail silently.
Adding set +e at script start and explicit per-target error annotations.

* Attempt to fix conda cache in GitHub Actions

* Update CHANGELOG.md and parameter descriptions

* Remove unused nullcontext import

* Remove extra blank lines after amplicon_coordinates validation

* Document that --vcf_output implies --write_detailed_allele_table

* Use max amplicon length for VCF contig headers and warn on collision

When multiple amplicons map to the same chromosome, the contig header
now uses the max amplicon length instead of silently overwriting with
the last one seen. A warning is logged when this occurs.

* Fix VCF POS off-by-one for start-of-amplicon deletions when pos > 1

For deletions starting at position 0 of the amplicon, the VCF POS was
computed as pos-1 instead of pos. The max(1,...) clamp accidentally
masked this when pos=1 (all existing tests). With pos>1, VCF POS
pointed one base before the amplicon.

Fix: when start==0, set left_index=pos directly instead of
pos+start-1.

* Fix multi-insertion bug: use alignment index to extract inserted bases

_edits_from_insertions treated the second element of insertion_coordinates
as an alignment index, but it is actually a reference position. These
coincide for single-insertion reads but diverge when prior insertions
shift alignment indices.

Fix: convert right_anchor_ref_pos to an alignment index via
ref_positions.index(), then extract the size characters before it.

Also:
- Rename misleading variables (right_anchor_ref_pos was actually left
  anchor, aligned_start was actually right anchor ref pos)
- Add comment on defensive ref_len branch explaining it is currently
  unreachable
- Update LEFT_NORMALIZATION.md to correctly describe the
  insertion_coordinates format

* Add clarifying example to help text of --amplicon_coordinates

* Remove planning documents

* Point tests back to master

* Update PR number in CHANGELOG.md

* Cache the conda environment for pytest as well

---------

Co-authored-by: Trevor Martin <[email protected]>
Co-authored-by: mbowcut2 <[email protected]>
Co-authored-by: Samuel Nichols <[email protected]>
Co-authored-by: kclem <[email protected]>
Co-authored-by: Claude Opus 4.5 <[email protected]>

* Tests file

* Rebased tests

* Rebase of new unit tests, improve coverage

* Fix version assertion

* Cole comment fixes

* Fix tests

* Add multiprocess tests

* Add discovery of Python packages (#151)

* Fix Docker entrypoint

* Attempt to fix Circle CI pip install

* Run each CRISPResso command in the conda environment

* Update `--base_editor_output` parameter name for Circle CI

* Update tests Makefile and batch expected output

* Fix QWC inference across amplicons (#137)

* Mckay/be plot improvements (#136)

* trying to get the figure to fit nicely, increased
element size to 100

* custom figsize to display without cutting off

increased figsize in report template

* Allow messages to be served in CLI reports (#134) (pinellolab#583)

* Fix deletion at second position (#131)

* Fix bug when there is a deletion starting at the second position

This bug only happens when a deletion starts are the second position, before the
fix, it would report that the deletion started at the first position. It is
fixed now, so deletions at the second position are reported correctly.

* Update CRISPRessoCOREResources.c due to change in .pyx

* Add tests for find_indels_substitutions for deletions at the end

* Fix 1bp deletions at the end, and off by one error

This ensures that when a deletion occurs at the end of a read, the entire
deletion is accounted for.

* Update CRISPRessoCOREResources.c to reflect fixes for deletions at the end of alignments

* Add extra asserts to deletion checks

* Point to new test branch

* Reafctor deletion_coordinates to go past the end of the string for deletions at
the end of the sequence

* Point tests to master

* Allow messages to be served in CLI reports

* Point to cole/messages test branch

* Point tests back to master

* point to tests branch

* typo

* testing github actions

* remove test

* point tests to master

---------

Co-authored-by: Cole Lyman <[email protected]>

* Update inferred QWC tests to reflect correct intended behavior

* Fix inferring QWC to match intended behavior

* Add more test cases and fix bug discovered in single bp QWC

* Add even more test cases testing indels outside the QWC

* Point tests to cole/fix-qwc-deletion

* update plotly.js (#138)

* Change order of amplicon inference alignment so that 1st amplicon is the reference

This makes a difference because it changes the values of `s1inds`, and therefore
the value of the inferred quantification window coordinates.

* Point integration tests back to master

* Update CHANGELOG.md

---------

Co-authored-by: mbowcut2 <[email protected]>

* Update setup.py and pyproject.toml to find all modules and packages

* Convert pyproject.toml to Unix file endings

---------

Co-authored-by: mbowcut2 <[email protected]>

* Update tests to not use length

* Fix

* Pixi implementation (#148)

* Initial implementation of pixi

* Add .pixi to .dockerignore

* Fix Dockerfile

* Update Dockerfile to be multistage

* Strip down setup.py to reduce redundant metadata

* Unpin numpy, matplotlib, and pandas. Remove tbb and pyparsing

* fix: add locked: false to setup-pixi since pixi.lock is not committed

* fix: disable pixi cache since pixi.lock is not committed

* feat: add pixi environment caching using actions/cache keyed on pixi.toml hash

* Point the integration tests to master

* fix: checkout CRISPResso2_tests to separate path to preserve pixi workspace

* fix: checkout CRISPResso2_tests under workspace, set CRISPRESSO2_DIR

* fix: pin python <3.13 to avoid compatibility issues with newer python

* remove accidentally committed test output

* Add discovery of Python packages (#151)

* Fix Docker entrypoint

* Attempt to fix Circle CI pip install

* Run each CRISPResso command in the conda environment

* Update `--base_editor_output` parameter name for Circle CI

* Update tests Makefile and batch expected output

* Fix QWC inference across amplicons (#137)

* Mckay/be plot improvements (#136)

* trying to get the figure to fit nicely, increased
element size to 100

* custom figsize to display without cutting off

increased figsize in report template

* Allow messages to be served in CLI reports (#134) (pinellolab#583)

* Fix deletion at second position (#131)

* Fix bug when there is a deletion starting at the second position

This bug only happens when a deletion starts are the second position, before the
fix, it would report that the deletion started at the first position. It is
fixed now, so deletions at the second position are reported correctly.

* Update CRISPRessoCOREResources.c due to change in .pyx

* Add tests for find_indels_substitutions for deletions at the end

* Fix 1bp deletions at the end, and off by one error

This ensures that when a deletion occurs at the end of a read, the entire
deletion is accounted for.

* Update CRISPRessoCOREResources.c to reflect fixes for deletions at the end of alignments

* Add extra asserts to deletion checks

* Point to new test branch

* Reafctor deletion_coordinates to go past the end of the string for deletions at
the end of the sequence

* Point tests to master

* Allow messages to be served in CLI reports

* Point to cole/messages test branch

* Point tests back to master

* point to tests branch

* typo

* testing github actions

* remove test

* point tests to master

---------

Co-authored-by: Cole Lyman <[email protected]>

* Update inferred QWC tests to reflect correct intended behavior

* Fix inferring QWC to match intended behavior

* Add more test cases and fix bug discovered in single bp QWC

* Add even more test cases testing indels outside the QWC

* Point tests to cole/fix-qwc-deletion

* update plotly.js (#138)

* Change order of amplicon inference alignment so that 1st amplicon is the reference

This makes a difference because it changes the values of `s1inds`, and therefore
the value of the inferred quantification window coordinates.

* Point integration tests back to master

* Update CHANGELOG.md

---------

Co-authored-by: mbowcut2 <[email protected]>

* Update setup.py and pyproject.toml to find all modules and packages

* Convert pyproject.toml to Unix file endings

---------

Co-authored-by: mbowcut2 <[email protected]>

* Remove .pi TODO's

* pin matplotlib

* pin pandas

* pin matplotlib

* pin numpy

* Update CHANGELOG.md

---------

Co-authored-by: mbowcut2 <[email protected]>
Co-authored-by: mbowcut2 <[email protected]>

* Update `np.tostring()` to `np.tobytes()` which is compatible with numpy 1.x and 2.x

---------

Co-authored-by: mbowcut2 <[email protected]>
Co-authored-by: Cole Lyman <[email protected]>
Co-authored-by: Trevor Martin <[email protected]>
Co-authored-by: kclem <[email protected]>
Co-authored-by: Claude Opus 4.5 <[email protected]>
Co-authored-by: mbowcut2 <[email protected]>
@kclem kclem merged commit ad4e8d9 into pinellolab:master Feb 20, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants