lmms-eval’s OpenAI chat adapter does not preserve lm-eval-harness text-task semantics #1317

@babyplutokurt

Description

Checklist

  • I have searched for similar issues before opening this one.
  • I am using the latest version of lmms-eval.

Bug Description

For example:

The OpenAI-compatible adapter has a hardcoded max-tokens cap of 4096

I don't get why the chat-completion request hard-caps the max tokens at 4096:

max_new_tokens = min(request_gen_kwargs.get("max_new_tokens", 1024), 4096)

Many thinking models require output lengths far beyond this.

And there is no comment explaining why the user's max_new_tokens parameter is overridden.
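
One possible fix would be to make the ceiling configurable rather than hardcoded. A minimal sketch, assuming a hypothetical cap argument and a resolve_max_new_tokens helper (neither exists in the current adapter):

DEFAULT_MAX_TOKENS_CAP = 4096  # the current hardcoded ceiling

def resolve_max_new_tokens(request_gen_kwargs: dict, cap: int | None = DEFAULT_MAX_TOKENS_CAP) -> int:
    # Clamp only when a cap is configured; cap=None honors the user's
    # value, e.g. for long chain-of-thought outputs from thinking models.
    requested = request_gen_kwargs.get("max_new_tokens", 1024)
    return requested if cap is None else min(requested, cap)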

Here are some other known issues:

  • The chat OpenAI adapter receives ctx but ignores it, silently dropping the task description/fewshot context for text tasks (see the sketch after this list).
  • Sample logs can show input that differs from the actual backend request, which is misleading for debugging.
  • --gen_kwargs is accepted globally, but the OpenAI adapter forwards only part of it.
  • The hard 4096 cap is inappropriate for OpenAI-compatible local backends unless clearly documented/configurable.
  • The regex filter crash is a straightforward robustness bug.
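
On the first point, a minimal sketch of what forwarding ctx could look like (a plain text-only request is assumed; build_messages is a hypothetical helper, not the adapter's current code):

def build_messages(ctx: str) -> list[dict]:
    # ctx is the prompt the harness already assembled (task description +
    # fewshot examples + question); forwarding it instead of discarding it
    # preserves lm-eval-harness text-task semantics.
    return [{"role": "user", "content": ctx}]

Usage would be along the lines of passing messages=build_messages(ctx) into the chat-completion call.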

Steps to Reproduce

NA

Error Message / Traceback

Environment

NA

Additional Context

No response
