GenAI: clarify what 'chunk' means in time-to-chunk metrics #3483

@lmolkova

This PR talks about "chunks". What is the definition of "chunk"? Is it only assistant content? Does it include reasoning, if the model distinguishes between reasoning and response text? Does it include any notification from the service, such as a function call request (or part of one)? Etc. My assumption is that it's any packet of data from the LLM: each update produced by a streaming implementation counts as a "chunk", regardless of what that update contains.

Originally posted by @stephentoub in #3377 (comment)
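
To make the ambiguity concrete, here is a minimal sketch of how the two readings could diverge on the same stream. Everything in it is hypothetical: `StreamUpdate`, its `kind` labels, `chunk_timings`, and the timestamps are made up for illustration and are not taken from the semantic conventions or from any SDK.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StreamUpdate:
    """Hypothetical streaming update; `kind` is an illustrative label only."""
    kind: str          # e.g. "reasoning", "tool_call", "text"
    arrived_at: float  # seconds elapsed since the request was sent


def chunk_timings(
    updates: List[StreamUpdate],
    is_chunk: Callable[[StreamUpdate], bool],
) -> Tuple[Optional[float], List[float]]:
    """Return (time to first chunk, gaps between consecutive chunks).

    `is_chunk` encodes the definition under discussion: which updates
    count as a "chunk" for the purpose of the metric.
    """
    times = [u.arrived_at for u in updates if is_chunk(u)]
    if not times:
        return None, []
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return times[0], gaps


def any_update(update: StreamUpdate) -> bool:
    # Reading 1 (the assumption in the comment above): every update counts.
    return True


def text_only(update: StreamUpdate) -> bool:
    # Reading 2: only assistant response text counts.
    return update.kind == "text"


# Made-up stream: reasoning first, then a tool call, then response text.
stream = [
    StreamUpdate("reasoning", 0.25),
    StreamUpdate("reasoning", 0.5),
    StreamUpdate("tool_call", 0.75),
    StreamUpdate("text", 1.5),
    StreamUpdate("text", 2.0),
]

first_any, gaps_any = chunk_timings(stream, any_update)
first_text, gaps_text = chunk_timings(stream, text_only)

print(f"any update: first={first_any}s gaps={gaps_any}")
print(f"text only:  first={first_text}s gaps={gaps_text}")
```

With these made-up timestamps, the first chunk arrives at 0.25 s under the "any update" reading and at 1.5 s under the "text only" reading, so the same request would report very different values depending on which definition the spec picks.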

Status: Need triage
