-
Notifications
You must be signed in to change notification settings - Fork 150
raw: fix reverse_scan failing across multiple regions #527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Welcome @HueCodes! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes a bug where reverse_scan fails with a key_not_in_region error when the scan range spans multiple TiKV regions. The root cause was that the region selection used start_key (lower bound) for both forward and reverse scans, but TiKV reverse scans start from the upper bound.
Changes:
- Modified region selection to use
end_keyfor reverse scans instead ofstart_key - Updated scan iteration to track the moving boundary (upper for reverse, lower for forward)
- Added boundary handling logic to select the previous region when landing exactly on a region boundary
- Implemented safety checks to prevent infinite loops during reverse scan iteration
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/raw/client.rs
Outdated
| let mut current_key: Key = if reverse { | ||
| // Start from the upper bound for reverse scan | ||
| end_key.clone().unwrap_or_default() |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When end_key is None (indicating an unbounded upper range like start_key..), using unwrap_or_default() sets current_key to an empty key []. For a reverse scan, this is incorrect because the scan should start from the end of the keyspace, not from the beginning. An empty key represents the start of the keyspace. This will cause reverse scans with unbounded upper bounds (e.g., scan_reverse("k2".., limit)) to fail or return incorrect results. The test at line 611-617 in tests/integration_tests.rs appears to test this scenario.
src/raw/client.rs
Outdated
| // If no upper bound, fall back to start_key (though this case | ||
| // is documented as unsupported for reverse scan) | ||
| _ => &start_key, |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment indicates that reverse scans with no upper bound are documented as unsupported (line 520), but the code attempts to handle it by falling back to start_key. This fallback will lead to incorrect behavior. It would be clearer and safer to return an error for this unsupported case rather than silently proceeding with incorrect logic. Consider adding an explicit error check for reverse && end_key.is_none() before entering the scan loop.
f893fa6 to
7e6311d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // if we've reached/passed the upper bound of our scan range | ||
| if next_key.is_empty() || end_key.clone().is_some_and(|ek| ek <= next_key) { | ||
| break; | ||
| } |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a safety check similar to the reverse scan case to prevent infinite loops if next_key doesn't advance. The forward scan path should verify that next_key is strictly greater than current_key before continuing, otherwise break to avoid an infinite loop.
| } | |
| } | |
| // Safety: ensure progress is made; if next_key does not advance, | |
| // break to avoid an infinite loop | |
| if next_key <= current_key { | |
| break; | |
| } |
| loop { | ||
| let region = self.rpc.clone().region_for_key(&start_key).await?; | ||
| // For forward scan: select region containing start_key (lower bound) | ||
| // For reverse scan: select region containing end_key (upper bound) | ||
| // because TiKV reverse scan starts from the upper bound and goes backward. | ||
| // | ||
| // When reverse=true, new_raw_scan_request swaps the keys so that TiKV receives | ||
| // end_key as its start_key. We must select the region containing that key. | ||
| let region_lookup_key: &Key = if reverse { | ||
| match &end_key { | ||
| Some(ek) if !ek.is_empty() => ek, | ||
| // If no upper bound, fall back to start_key (though this case | ||
| // is documented as unsupported for reverse scan) | ||
| _ => &start_key, | ||
| } | ||
| } else { | ||
| &start_key | ||
| }; | ||
| let mut region = self.rpc.clone().region_for_key(region_lookup_key).await?; | ||
|
|
||
| // For reverse scan: if the lookup key equals the region's start_key exactly, | ||
| // we're at a boundary and need the previous region. This happens when iterating | ||
| // backward and current_key lands on a region boundary. | ||
| if reverse { | ||
| let region_start = region.start_key(); | ||
| if !region_start.is_empty() && *region_lookup_key == region_start { | ||
| // Find the previous region by computing the key immediately before | ||
| // this boundary. Decrement the last byte with borrow propagation. | ||
| let prev_key = { | ||
| let mut key: Vec<u8> = region_start.into(); | ||
| while let Some(last) = key.last_mut() { | ||
| if *last > 0 { | ||
| *last -= 1; | ||
| break; | ||
| } | ||
| key.pop(); // byte was 0, borrow from previous | ||
| } | ||
| key | ||
| }; | ||
| // Always look up - empty prev_key correctly returns the first region | ||
| region = self.rpc.clone().region_for_key(&prev_key.into()).await?; | ||
| } | ||
| } |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The fix for multi-region reverse scans lacks automated test coverage. While integration tests exist for scan_reverse, they don't validate the specific multi-region scenario that this PR fixes. Consider adding a unit test with a mocked multi-region setup to verify the region selection and boundary handling logic, or document how to manually test this scenario.
|
Thanks for the feedback. Addressing these issues, I will fix and post and update today. |
When reverse scanning a range spanning multiple regions, `retryable_scan` selected the region based on `start_key` (lower bound). However, TiKV reverse scan starts from the upper bound, causing `key_not_in_region` errors when bounds are in different regions. Fix by selecting the region containing `end_key` for reverse scans and adjusting continuation logic to move backward through regions. Also handle the edge case when iteration lands exactly on a region boundary. Fixes tikv#512 Signed-off-by: Hugh <[email protected]>
The previous boundary handling used pop() to get a "previous" key, which truncates instead of decrementing. This could select the wrong region when: 1. The region boundary key is a single byte (pop makes it empty) 2. The truncated key falls in a much earlier region Fix by properly decrementing the last byte with borrow propagation. Also remove the !prev_key.is_empty() guard that skipped the region lookup entirely for single-byte boundaries. Signed-off-by: Hugh <[email protected]>
- Add explicit error for reverse scan without upper bound - Add forward scan safety check to prevent infinite loops - Simplify region lookup now that end_key is validated Signed-off-by: Hugh <[email protected]>
879fb1d to
7e7fe5e
Compare
|
Let me know if theres any other changes you would like me to make. |
Summary
Fix
reverse_scanreturningkey_not_in_regionerror when the scan range spans multiple TiKV regions.Root cause:
retryable_scanselected the region based onstart_key(lower bound), but TiKV reverse scan starts from the upper bound and goes backward. When bounds are in different regions, TiKV returns an error because the request context specifies the wrong region.Fix:
end_key(upper bound) for reverse scansregion.start_key()as continuation point (moving toward lower keys)Fixes #512
Test plan
cargo fmt --checkpassescargo clippy --lib --testspasses with no warningscargo test --libpasses (53 tests)