names: collapse X-bridge docs to a one-line note#206
Merged
The X-bridge case in the cluster-rule fragility tracking (`run_cluster` mishandling a part bridged into two pre-existing clusters) is structurally unreachable under the current monotone DP: the cursor walk in `run_align` can't produce the non-monotone overlap pattern an X-bridge requires. Confirmed empirically: 0 bridge events fired across the 818-case cases.csv corpus.

Investigation was triggered by the cluster-DSU branch, which would have been a ~300-line rewrite to fix the case. The fix is correct algorithmically but the bug is unreachable from real input today, so the work is currently complexity without measurable gain. Reverting to the shared-vertex implementation on main and folding the analysis into a one-line code comment. The DSU rewrite stays as a follow-up if the connectivity-rule replacement lands (which would expand which edges qualify and could unlock the case).

Plan-doc cleanup:
- weighted-distance.md § Open spec knobs → Clustering rule fragility: two sub-bullets collapsed to one sentence.
- arch-name-distance.md § Pairing rule: drops the X-bridge limitation paragraph + redundant fragility bullets; keeps the threshold-fragility one-liner.
- compare.rs `run_cluster` docstring: drops the "X-bridge limitation" paragraph; replaces with a short note explaining the structural unreachability and what would need to change to unlock it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The X-bridge case in `run_cluster` (a part referenced from two clusters when a later edge bridges two pre-existing clusters with no shared vertex) is structurally unreachable under the current monotone DP. The cursor walk in `run_align` can't produce the non-monotone overlap pattern an X-bridge requires. Confirmed empirically: 0 bridge events fired across the 818-case cases.csv corpus.

The plan docs and the `run_cluster` docstring previously framed it as a real-but-rare invariant violation that needed a union-find rewrite. After investigation that's misleading: the bug is real algorithmically but unreachable from real input today. This PR collapses the discussion to a one-line note in each of the three places it was tracked.

Background
Investigation triggered by a `pudo/cluster-dsu` branch (kept locally as a reference) that would have been a ~300-line rewrite: Dsu struct + iterative path-compressed `find` + union-by-rank + emit-order tick tracking + 6 unit tests. The rewrite is correct algorithmically and bit-identical on the corpus (0 outcome flips, 0 score diffs over 818 cases). But the X-bridge regression test it introduces exercises code that the real DP can never trigger, because it builds synthetic `AlignmentData` directly.

The structural argument: `align.overlaps` is built by a single monotone DP walk through the SEP-joined strings. Both `qry_idx` and `res_idx` only advance, never backtrack. So if `(qp_a, rp_b)` accumulates an Equal step, no later step can be at `(qp_c, rp_d)` with `qp_c > qp_a` and `rp_d < rp_b`. But that's exactly what an X-bridge needs (two non-vertex-sharing edges, then a third that bridges them). Under sort-order processing on lex-sorted edges, the bridge edge always lands at a position where it's a chain-via-shared-vertex instead.

The DSU rewrite is the right shape if the connectivity-rule replacement lands (threshold = 0 expands which edges qualify and could unlock the case), or if the DP ever stops being monotone. Neither is imminent. Better to revert and pull the DSU back when there's a concrete reason.
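The structural argument above can be illustrated with a small standalone sketch. It is not the code in `compare.rs`: `Step`, `walk`, and `has_crossing` are hypothetical names, and the cursors are treated directly as part indices (a simplification of the real walk over SEP-joined strings). Under those assumptions, any walk in which both cursors only advance yields pairs with no crossing, whereas a hand-built pair list can contain one.

```rust
/// Hypothetical step kinds for a monotone two-cursor walk.
/// `Both` stands in for an Equal step that records an overlap pair.
#[derive(Clone, Copy)]
enum Step {
    Both, // record (qry, res), then advance both cursors
    Qry,  // advance only the query cursor
    Res,  // advance only the result cursor
}

/// Simulate the walk: cursors start at 0 and never move backwards.
fn walk(steps: &[Step]) -> Vec<(usize, usize)> {
    let (mut q, mut r) = (0usize, 0usize);
    let mut pairs = Vec::new();
    for &s in steps {
        match s {
            Step::Both => {
                pairs.push((q, r));
                q += 1;
                r += 1;
            }
            Step::Qry => q += 1,
            Step::Res => r += 1,
        }
    }
    pairs
}

/// True if some pair (c, d) has c > a and d < b for another pair (a, b),
/// i.e. the non-monotone pattern described in the argument above.
fn has_crossing(pairs: &[(usize, usize)]) -> bool {
    pairs
        .iter()
        .any(|&(a, b)| pairs.iter().any(|&(c, d)| c > a && d < b))
}
```

For example, `has_crossing(&walk(&[Step::Both, Step::Qry, Step::Both]))` is `false`, while the synthetic list `[(0, 1), (1, 0)]` does cross. That mirrors why a regression test has to construct its input by hand rather than through the DP.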
What this PR changes
- `rust/src/names/compare.rs` (`run_cluster` docstring)
- `plans/weighted-distance.md` § Open spec knobs → Clustering rule fragility
- `plans/arch-name-distance.md` § Pairing rule

Net diff: 17 insertions, 55 deletions.
No code behaviour change. No test changes.
Test plan
- `cargo test --release --features python`: clean
- `pytest tests/`: 470 pass
- `mypy --strict rigour`: clean
- `cargo fmt --check`, `cargo clippy --all-targets -- -D warnings` (with and without `--features python`): clean

🤖 Generated with Claude Code