When we build the prompt like this 👇🏼
```python
messages = [
    {"role": "user", "content": multi_questions_prompt},
    {"role": "system", "content": SYSTEM_PROMPT},
]
```
(see `mech-predict/packages/napthaai/customs/prediction_request_reasoning/prediction_request_reasoning.py`, line 733 at `a3b7332`)
**Expected behaviour**
• Both model versions return a well-formed XML-tagged block so that `parser_prediction_response()` succeeds.
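For context, the parser expects tagged output along these lines. A minimal illustrative sketch (the real `parser_prediction_response()` lives in the linked file; the tag name and function name below are assumptions, not the actual implementation):

```python
import re

def parse_tagged_block(text: str, tag: str) -> str:
    """Extract the content of an XML-style <tag>...</tag> block.

    Raises ValueError when the tags are missing -- the failure mode
    observed here when gpt-4.1 ignores the misplaced system prompt.
    """
    match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
    if match is None:
        raise ValueError(f"missing <{tag}> block in model output")
    return match.group(1).strip()
```

With a well-formed response, `parse_tagged_block("<answer>p=0.7</answer>", "answer")` returns the inner content; with untagged free text it raises, which is what surfaces as the parsing failure.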
**Possible cause**
OpenAI’s docs state that “system messages should come first”. `gpt-4o` appears tolerant of the reversed order; `gpt-4.1` is not.
---
**Work-around / Fix**
Swap the message order so the system message comes first:

```python
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": multi_questions_prompt},
]
```
…and extend the unit tests to assert the message order so this doesn’t regress again.
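Such a regression test could look like this. It assumes the message construction is factored into a small helper; `build_messages` is an illustrative name, not a function in the current code:

```python
# Hypothetical helper mirroring the fixed construction, so the order
# can be asserted in one place instead of at every call site.
def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def test_system_message_comes_first():
    messages = build_messages("You are a prediction agent.", "Will X happen?")
    # The ordering gpt-4.1 requires: system first, then user.
    assert messages[0]["role"] == "system"
    assert messages[1]["role"] == "user"
```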