🚨 Bug | Model returns malformed output unless the system message is first in the messages array #40

@kmshdev

Description

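When we build the prompt with the user message ahead of the system message 👇🏼, gpt-4.1 returns malformed output that parser_prediction_response() cannot parse, while gpt-4o appears to tolerate it: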

messages = [
    {"role": "user",   "content": multi_questions_prompt},
    {"role": "system", "content": SYSTEM_PROMPT},
]

Expected behaviour
• Both model versions return a well-formed XML-tagged block so that
parser_prediction_response() succeeds.

Possible cause

OpenAI’s docs state that “system messages should come first”.
gpt-4o appears tolerant of reversed order; gpt-4.1 is not.
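
If the ordering sensitivity is indeed the cause, a small defensive guard could normalise the list before it is sent. This is only a sketch: `system_first` is a hypothetical helper name that does not exist in the repo, and it assumes each message is a plain dict with a "role" key as in the snippets in this issue.

```python
from typing import Dict, List

Message = Dict[str, str]


def system_first(messages: List[Message]) -> List[Message]:
    """Return a new list with system messages moved to the front.

    Hypothetical helper (not part of the existing codebase). The relative
    order of all other messages is preserved, and the input list is not
    modified.
    """
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]
    return system + rest
```

Calling `system_first(messages)` just before the API request would make the call site indifferent to how the list was assembled, at the cost of hiding ordering mistakes rather than surfacing them.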

Work-around / Fix

Swap the message order:

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user",   "content": multi_questions_prompt},
]

…and add a unit test that asserts the message order so this doesn’t regress.
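
A regression test for the ordering might look roughly like the following. It is a sketch under assumptions: `build_messages` is a hypothetical stand-in for wherever the repo actually assembles the list, and the `SYSTEM_PROMPT` value here is a placeholder rather than the real prompt.

```python
from typing import Dict, List

# Placeholder value; the real SYSTEM_PROMPT lives in the project.
SYSTEM_PROMPT = "placeholder system prompt"


def build_messages(multi_questions_prompt: str) -> List[Dict[str, str]]:
    """Hypothetical builder: system message first, then the user message."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": multi_questions_prompt},
    ]


def test_system_message_is_first() -> None:
    """Fail if the system message is ever not the first entry."""
    messages = build_messages("example question")
    roles = [m["role"] for m in messages]
    assert roles[0] == "system"
    # No system message should appear after the first position.
    assert "system" not in roles[1:]
```

The test is written as a plain function with bare asserts, so it should be collected by pytest if that is the runner in use; it would need adapting to the real builder function.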

Labels

bug, enhancement