Bug Fix: Reranker Chunk Overlap and CHUNK_SIZE Exceedance #645

Open
hoperun300339 wants to merge 2 commits into netease-youdao:qanything-v2 from hoperun300339:qanything-v2
@hoperun300339

Fixed an issue where the reranker would exceed the predefined CHUNK_SIZE, which also reduced reranking precision. The root cause was the separate_list function, which groups adjacent chunks by their proximity in the original document. When the document is small or the adjacent chunks are close together, the grouping logic produced overlapping chunk groups. The groups then interfered with each other, and the total length of a single group could exceed the intended CHUNK_SIZE.
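
To make the failure mode concrete, here is a minimal sketch of one way to group adjacent chunks so that no chunk appears in two groups and no group exceeds the budget. It is not the code from this PR: the helper name `group_adjacent_chunks`, the `chunk_lengths` mapping, and the `CHUNK_SIZE` value of 800 are all assumptions for illustration, and the real `separate_list` may take different inputs.

```python
from typing import Dict, List

CHUNK_SIZE = 800  # assumed budget; the real value comes from the project's config


def group_adjacent_chunks(
    chunk_ids: List[int],
    chunk_lengths: Dict[int, int],
    chunk_size: int = CHUNK_SIZE,
) -> List[List[int]]:
    """Group consecutive chunk ids without overlap and within a size budget.

    Hypothetical helper for illustration, not the PR's separate_list.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    current_len = 0
    # Deduplicate and sort so each chunk id lands in exactly one group.
    for cid in sorted(set(chunk_ids)):
        length = chunk_lengths[cid]
        is_adjacent = bool(current) and cid == current[-1] + 1
        fits = current_len + length <= chunk_size
        if is_adjacent and fits:
            # Extend the current group only while it stays within the budget.
            current.append(cid)
            current_len += length
        else:
            # Close the current group and start a new one with this chunk.
            if current:
                groups.append(current)
            current = [cid]
            current_len = length
    if current:
        groups.append(current)
    return groups


lengths = {3: 300, 4: 300, 5: 300, 9: 200}
print(group_adjacent_chunks([5, 3, 4, 9, 4], lengths))  # [[3, 4], [5], [9]]
```

In this sketch, chunk 5 is adjacent to chunk 4 but starts a new group because adding it would bring the total to 900, above the assumed budget of 800. A single chunk that is itself longer than the budget still forms its own group, so it would need separate handling.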

