[OPIK-6370] [BE] fix: tolerate plain-string UUIDs in Redis stream deserialization#6610
Open
thiagohora wants to merge 3 commits intomainfrom
Open
[OPIK-6370] [BE] fix: tolerate plain-string UUIDs in Redis stream deserialization#6610thiagohora wants to merge 3 commits intomainfrom
thiagohora wants to merge 3 commits intomainfrom
Conversation
…ation The Redis stream codec used by experiment_denormalization_stream and other Java-codec streams wraps a Jackson ObjectMapper with default typing enabled. Older opik-backend versions ran on a different default-typing configuration and left messages with UUID fields serialized as plain strings in PEL; the current version's deserializer expects Jackson's As.WRAPPER_ARRAY shape (["java.util.UUID","<uuid>"]). XAUTOCLAIM repeatedly tries to reclaim those stuck messages, fails with MismatchedInputException, and emits an ERROR log every pending-message-duration window. Add a LenientUUIDDeserializer that accepts both shapes and register it on the mapper handed to JsonJacksonCodec inside RedisStreamCodec. Scoped to the Redis stream codec only — the global JsonUtils.MAPPER is untouched, so HTTP API JSON behavior is unchanged. Includes a unit test covering plain-string, As.WRAPPER_ARRAY, and a round-trip case to confirm we don't break the happy path.
…ializer fires under WRAPPER_ARRAY default typing The original commit registered LenientUUIDDeserializer via addDeserializer on the codec mapper, but that only wires it as the value-level deserializer called AFTER the TypeDeserializer strips the wrapper. With Redisson's JsonJacksonCodec enabling default typing, every UUID property gets an AsArrayTypeDeserializer in front of it: it expects START_ARRAY for the type id and explicitly throws MismatchedInputException on a bare string before delegating to the value deserializer. So plain-string UUIDs in PEL were never reaching our lenient logic. Fix by overriding deserializeWithType: it runs BEFORE the type deserializer takes over. We peek at the current token and short-circuit when it's a VALUE_STRING (legacy bare-string format), letting the standard UUIDDeserializer parse it directly. For START_ARRAY we delegate to the type deserializer's normal path, so current WRAPPER_ARRAY-format messages still decode correctly. Also adds a production-shape test that wires LenientUUIDDeserializer onto a real JsonJacksonCodec (default typing on, snake_case naming, etc.) and verifies all three round-trips: encoded shape contains the wrapper as expected, plain-string payloads decode, WRAPPER_ARRAY payloads decode. This is the test that caught the gap left by the original commit.
…t malformed payloads The wrapper-array branch advanced past the type-id and value tokens but didn't assert the next token was END_ARRAY, so a payload like ["...","<uuid>","extra"] would silently parse the UUID and leave the parser positioned mid-array — corrupting the next field. Throw a MismatchedInputException when the closing token isn't END_ARRAY. Adds a regression test that builds a 3-element wrapper-array payload and asserts the deserializer rejects it instead of silently accepting it.
Contributor
Backend Tests - Integration Group 9316 tests 316 ✅ 4m 32s ⏱️ Results for commit 2b276da. ♻️ This comment has been updated with latest results. |
Contributor
Backend Tests - Integration Group 12240 tests 239 ✅ 5m 17s ⏱️ Results for commit 2b276da. ♻️ This comment has been updated with latest results. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Details
The Redis stream codec used by
experiment_denormalization_stream(and any other Java-codec stream) wraps a JacksonObjectMapperwith default typing enabled insideJsonJacksonCodec. Older opik-backend versions ran on a different default-typing configuration and left messages withUUIDfields serialized as plain strings in the consumer-group's pending entry list (PEL); the current version's deserializer expects Jackson'sAs.WRAPPER_ARRAYshape (["java.util.UUID","<uuid>"]).XAUTOCLAIMrepeatedly tries to reclaim those stuck messages on every tick (interval =pending-message-duration, default 10 minutes), the codec fails decoding withMismatchedInputException, and the subscriber emits anERROR-level log every 10 minutes — generating alert noise indefinitely. The bad message is never acked or removed.This PR adds a
LenientUUIDDeserializerthat accepts both shapes:"6caf374f-6568-4c6f-aad0-257e0c7296a4"As.WRAPPER_ARRAY:["java.util.UUID","6caf374f-6568-4c6f-aad0-257e0c7296a4"]It's registered on a copy of
JsonUtils.getMapper()(buildStreamMapper) which is then handed toJsonJacksonCodecinsideRedisStreamCodec. Scope is limited to the Redis stream codec — the globalJsonUtils.MAPPERis untouched, so HTTP API JSON behavior is unchanged.After this fix, stuck messages flow through the regular
processMessagepath. If they're still otherwise invalid (e.g., the message references a deleted experiment), the existing retry-and-NACK logic inBaseRedisSubscriberwill eventually ACK them aftermaxRetriesattempts.Change checklist
Issues
AI-WATERMARK
AI-WATERMARK: yes
RedisStreamCodecTest+ newLenientUUIDDeserializerTest)Testing
mvn -DskipTests compileandmvn spotless:checkboth pass.The new unit test covers:
As.WRAPPER_ARRAYUUID deserializationNot run: full backend test suite (no other code paths touched), end-to-end against a real Redis stream with stale PEL entries (would require synthetically populating the stream — out of scope; the unit test pins the deserialization invariant).
Documentation
N/A — internal infrastructure change, no API or schema impact.