🚨 Bug | Model returns malformed output unless the system message is first in the messages array #40

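When we build the prompt like this 👇🏼 (user message first, system message second), gpt-4.1 returns malformed output, while gpt-4o appears to tolerate the order: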

```python
messages = [
    {"role": "user",   "content": multi_questions_prompt},
    {"role": "system", "content": SYSTEM_PROMPT},
]
```
https://github.com/valory-xyz/mech-predict/blob/a3b733212cd27617f7876555d9f31388d6abd736/packages/napthaai/customs/prediction_request_reasoning/prediction_request_reasoning.py#L733

Expected behaviour
• Both model versions (gpt-4o and gpt-4.1) return a well-formed XML-tagged block so that parser_prediction_response() succeeds.

Possible cause

OpenAI’s docs state that “system messages should come first”.
gpt-4o appears tolerant of reversed order; gpt-4.1 is not.

⸻

Work-around / Fix

Swap the message order:

```python
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user",   "content": multi_questions_prompt},
]
```
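If other call sites build the list in a different order, a small guard could normalise it before the request goes out. This is only a sketch: `system_first` is a hypothetical helper name, not something that exists in the repo, and it assumes each message is a plain dict with a `"role"` key.

```python
from typing import Dict, List

Message = Dict[str, str]


def system_first(messages: List[Message]) -> List[Message]:
    """Return a new list with all system messages moved to the front.

    Relative order inside each group is preserved, and the input list
    is left untouched. (Hypothetical helper, not part of the repo.)
    """
    system = [m for m in messages if m.get("role") == "system"]
    others = [m for m in messages if m.get("role") != "system"]
    return system + others


# Example: the reversed order from the bug report gets normalised.
reversed_order = [
    {"role": "user", "content": "example question"},
    {"role": "system", "content": "example system prompt"},
]
fixed = system_first(reversed_order)
```

Swapping the two lines at the call site is still the simpler fix; the guard only helps if the list is assembled in more than one place.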

…and update the unit tests to assert the message order so this doesn’t regress.
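A regression test for the ordering might look roughly like this. It assumes the message construction is pulled out into a function; `build_messages` is a hypothetical name used for illustration, and the prompt strings are placeholders rather than the real `SYSTEM_PROMPT` or `multi_questions_prompt`.

```python
from typing import Dict, List


def build_messages(system_prompt: str, user_prompt: str) -> List[Dict[str, str]]:
    """Hypothetical helper: build the chat messages with the system message first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def test_system_message_comes_first() -> None:
    """Fail if the system message is ever placed after the user message."""
    messages = build_messages("placeholder system prompt", "placeholder question")
    roles = [m["role"] for m in messages]
    assert roles[0] == "system"
    assert roles.index("system") < roles.index("user")
```

The test is written in the plain-function style that pytest collects, so it should drop into an existing test module once `build_messages` is replaced by whatever the tool actually uses to build the list.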
