Fix smck bug#2525
Open
jeromekelleher wants to merge 2 commits into
Open
Conversation
In msp_smc_k_common_ancestor_event, gsl_ran_flat occasionally draws random_mass == 0 (probability ~1/2^32 per CA event). With random_mass of zero, fenwick_find returns the first non-zero hull and remaining_mass = cum_sum - 0 lands at exactly x_hull->count. The walk loop with predicate `remaining_mass >= 0` then over-iterates by one match, runs off the head of the AVL tree, and trips the tsk_bug_assert at line 7694. Clamp remaining_mass strictly below x_hull->count so the loop terminates after exactly `count` matches. The fix is local; it does not change the RNG draw count or pattern, so seeded simulations that did not previously trip the bug produce identical trajectories. Add a regression test (test_smc_k_bug_repro) that replays the user-reported failure (50 diploid samples, seq_length 2e7, recomb 1e-8, demography change at t=30, seed 3940783591). Pre-fix the test aborts after ~74821 events; post-fix it completes in about a second.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2525 +/- ##
=======================================
Coverage 93.83% 93.83%
=======================================
Files 20 20
Lines 12104 12106 +2
Branches 2242 2243 +1
=======================================
+ Hits 11358 11360 +2
Misses 567 567
Partials 179 179
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
323edf3 to
d50953d
Compare
The prior test_smc_k_bug_repro replayed the user-reported failing
simulation (50 diploid samples, seq_length 2e7, ~75k events) to hit
the bug after about one second. Replace it with
test_smc_k_zero_random_mass, which uses a custom GSL rng_type whose
get_double callback returns successive doubles from a fixed array.
With the script {0.5, 0.0} and a 2-sample SMCk simulation, the first
common-ancestor event has gsl_ran_flat return 0 deterministically,
making remaining_mass == x_hull->count exactly — the same trigger the
clamp at lib/msprime.c handles. Pre-clamp the test aborts at line
7694; post-clamp it completes in about a millisecond.
The scripted RNG type is defined as static helpers in test_ancestry.c
next to the test that uses it.
Add chained-event SMCk clamp test
test_smc_k_zero_random_mass triggers the clamp at lib/msprime.c on
the very first CA event of a fresh 2-sample simulation, only walking
the loop once. Add a sibling test_smc_k_zero_random_mass_chained that
runs one CA event first (coalescing one pair of 4 lineages, freeing
and reusing hull slots, rearranging AVL trees) and then drives
gsl_ran_flat to 0 on the second CA event. Pre-fix the second event
aborts at line 7694; post-fix both complete in about a millisecond.
The test covers the clamp working from a non-trivial post-event
state. We initially hoped to also drive the second event to a hull
with count > 1 via the fenwick zero-skip path, but fenwick_find
always stops at the first non-zero slot and any non-zero predecessor
blocks the route to a higher-count hull. The honest scope (clamp in
chained events, count == 1) is documented in the test comment.
Update CHANGELOG
d50953d to
577cd77
Compare
Member
Author
|
The change here seems quite straightforward, so I think we can issue a bugfix release once this is merged. |
Member
|
That's quite surprising -- if the proper distribution is uniform on [0, x), why is 0 a problem? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #2523
This was a bit more subtle than the fix in #2524 implied (which would probably have led to biases). The issue is the rng producing a value of exactly zero.