[OPIK-5694] [BE] fix: eliminate FORMAT Values fast-path noise in batch inserts#6572
Conversation
…span/thread batch INSERTs
Per Liya Katz's investigation on OPIK-5694, the silent error counter
pollution in production ClickHouse (`system.errors` codes 26 / 27 / 43 /
70) traces back to datetime function expressions being substituted into
the per-row tuple in `FORMAT Values`:
parseDateTime64BestEffort(:start_time, 9),
if(:end_time IS NULL, NULL, parseDateTime64BestEffort(:end_time, 9)),
if(:last_updated_at IS NULL, now64(6), parseDateTime64BestEffort(...)),
...
Empirical isolation against ClickHouse 25.3.6.56 (test container) and
25.3.8 (production version) confirmed: every datetime cell expressed as
a function call adds +1 to code 27 per statement; NULL substituted into
a non-nullable column adds +1 to code 70. The Values fast-path parser
only recognises canonical text literals for DateTime64 cells; any
function or `::` cast falls back to the slow path and counts the
assertion in system.errors.
Fix: format datetime values to canonical ClickHouse text in Java
(`yyyy-MM-dd HH:mm:ss.SSSSSSSSS` for precision 9, `.SSSSSS` for
precision 6) and bind them as plain strings, dropping every parse /
if-wrap from the per-row tuple. The clickhouse-r2dbc driver substitutes
String values as `'<canonical>'` literals which the Values fast-path
recognises directly.
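A minimal sketch of that formatting approach, assuming only `java.time` (the class and method names mirror the util this PR introduces, but the body here is illustrative, not the PR's exact code):

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Sketch of the canonical-text formatting described above. DateTimeFormatter
// supports up to nine 'S' fraction-of-second digits, which maps directly onto
// the DateTime64 precisions used by the schema.
public final class CanonicalDateTime {

    private static final DateTimeFormatter NANOS = DateTimeFormatter
            .ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSSSSS")
            .withZone(ZoneOffset.UTC);

    private static final DateTimeFormatter MICROS = DateTimeFormatter
            .ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSS")
            .withZone(ZoneOffset.UTC);

    // For DateTime64(9) cells.
    public static String formatNanos(Instant instant) {
        return NANOS.format(instant);
    }

    // For DateTime64(6) cells.
    public static String formatMicros(Instant instant) {
        return MICROS.format(instant);
    }

    public static void main(String[] args) {
        // The driver then substitutes the bound String as '<canonical>',
        // which the Values fast-path parser accepts as a plain literal.
        System.out.println(formatNanos(Instant.parse("2024-05-01T12:34:56.123456789Z")));
        // prints 2024-05-01 12:34:56.123456789
        System.out.println(formatMicros(Instant.parse("2024-05-01T12:34:56.123456Z")));
        // prints 2024-05-01 12:34:56.123456
    }
}
```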
Same canonical-literal idea applied to:
- TraceDAO.BATCH_INSERT: start_time, end_time, last_updated_at,
visibility_mode (was an `if(... IS NULL, 'default', ...)` wrap;
now bound as 'default' literal when null)
- SpanDAO.BULK_INSERT: start_time, end_time, last_updated_at
(toDecimal128 / mapFromArrays for cost / usage left for a follow-up
per scope discussion)
- TraceThreadDAO.INSERT_THREADS_SQL: created_at, last_updated_at, and
sampling_per_rule (the existing mapFromArrays(:rule_ids, :sampling)
was replaced by binding a Java Map directly — the clickhouse-r2dbc
driver natively serialises Map<UUID, Boolean> as `{'uuid': true}` map
literal, no SQL function needed)
New util: ClickHouseDateTimeFormat with NANOS / MICROS formatters in UTC.
Tests: regression tests in TracesResourceTest$BatchInsert and
SpansResourceTest$BatchInsert assert codes 26 / 27 / 43 / 70 deltas
against system.errors before/after batchCreateTraces /
batchCreateSpans:
- Traces: all four codes assert isZero (covers BATCH_INSERT and the
side-effect trace_threads INSERT_THREADS_SQL that ingestion triggers)
- Spans: code 70 asserts isZero. Codes 26 / 27 / 43 are NOT asserted on
spans because BULK_INSERT still has toDecimal128 / mapFromArrays in
non-datetime cells; eliminating those is out of scope for this PR.
Behaviour for clients is unchanged: `lastUpdatedAt` still defaults to
"now" when omitted; `visibilityMode` still defaults to 'default'.
…omArrays from BULK_INSERT
Same FORMAT Values fast-path issue as the rest of OPIK-5694: function
expressions in tuple cells trip ClickHouse's Values fast-path parser
and bump system.errors codes 26 / 27 / 43.
In SpanDAO.BULK_INSERT the remaining offenders were:
- toDecimal128(:total_estimated_cost, 12) -- Decimal128 column
- mapFromArrays(:usage_keys, :usage_values) -- Map(String, Int32) column
The clickhouse-r2dbc driver natively serialises both types as canonical
literals: BigDecimal -> plain numeric (e.g. 0.000123), Map -> {key : value}
literal. So we just bind the values directly:
- bindCost: pass BigDecimal (not .toString()) so SQL gets `0.000123`
instead of `'0.000123'` (the quoted form also trips the fast-path,
empirically verified)
- usage: assemble a Map<String, Integer> in Java and bind it; the driver
emits `{'in' : 2, 'out' : 5}` map literal
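What "the driver emits a map literal" amounts to can be sketched with a hypothetical helper — the real serialisation lives inside clickhouse-r2dbc, this just shows the output shape:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration of the canonical map literal clickhouse-r2dbc
// produces for a bound Map<String, Integer>.
public final class MapLiteralSketch {

    static String toMapLiteral(Map<String, Integer> usage) {
        return usage.entrySet().stream()
                .map(e -> "'" + e.getKey() + "' : " + e.getValue())
                .collect(Collectors.joining(", ", "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, Integer> usage = new LinkedHashMap<>();
        usage.put("in", 2);
        usage.put("out", 5);
        // The Values fast-path parser accepts this as a plain map literal.
        System.out.println(toMapLiteral(usage)); // prints {'in' : 2, 'out' : 5}
    }
}
```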
Other SpanDAO templates (single-row INSERT, UPDATE_SPAN_SQL) still use
toDecimal128/mapFromArrays and are intentionally untouched -- those are
INSERT...SELECT statements, not FORMAT Values, so they don't hit the
Values fast-path.
Tests: SpansResourceTest now asserts all four codes (26/27/43/70) stay
at zero after batchCreateSpans, matching the traces test.
Backend Tests - Integration Group 9: 328 tests, 326 ✅, 4m 58s ⏱️. Results for commit 80e9687.
…h insert tests
Audit of the regression tests revealed a gap: the new batch-insert tests
only exercised some null/non-null permutations of the fields this PR
touches. Beef up both tests so every branch the SQL/binding cares about
is hit:
Traces test (3 rows in one batch):
- Row A: endTime=null, lastUpdatedAt=null, visibilityMode=null
- Row B: endTime!=null, lastUpdatedAt!=null, visibilityMode=DEFAULT
- Row C: endTime!=null, lastUpdatedAt!=null, visibilityMode=HIDDEN
Spans test (2 rows in one batch):
- Row A: endTime=null, lastUpdatedAt=null, usage=null, totalEstimatedCost=null
- Row B: endTime!=null, lastUpdatedAt!=null, usage={prompt:12, completion:7},
totalEstimatedCost=BigDecimal("0.000123456789")
Both tests still strictly assert codes 26 / 27 / 43 / 70 deltas == 0
against system.errors.
startTime is @NotNull at the API layer for both Trace and Span, so the
null branch is unreachable through the public batch endpoint and not
worth testing here.
Backend Tests - Integration Group 8: 35 files, 35 suites, 7m 25s ⏱️. Results for commit 80e9687.
CI surfaced a precision regression in SpansResourceTest$CreateSpan#createAndGetCost and $UpdateSpan#update__whenCostIsChanged after the previous commit changed bindCost to bind BigDecimal directly.
Root cause: bindCost is shared across SQL templates with different shapes.
- BULK_INSERT now writes the cell as a bare :placeholder (Decimal128(12) column). It needs a BigDecimal binding so the driver emits an unquoted numeric; CH parses it straight into Decimal128 and is lossless.
- The single-row INSERT and UPDATE templates still wrap the cell as toDecimal128(:cost, 12). Those need a String binding so the driver wraps the value in single quotes and CH parses it via the Decimal-aware string path. With a BigDecimal binding, the unquoted literal toDecimal128(123.45, 12) routes through Float64 and drops digits beyond ~17 significant figures (the test failure mode).
Fix:
- Restore bindCost to the original String binding (.toString()), keeping INSERT/UPDATE precision intact.
- Add bindCostForBulkInsert, which binds BigDecimal directly and is used only in the BULK_INSERT loop. Keeps that cell at +0 system.errors increments.
Also added a brief section to the ClickHouse skill documenting the rule: in FORMAT Values templates, every cell must be a plain placeholder bound to a value the driver serialises as a canonical literal; function calls in cells flood system.errors / pod stderr even when the insert itself succeeds.
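The Float64 digit loss is easy to reproduce in isolation; a small demonstration with plain JDK types and an illustrative value:

```java
import java.math.BigDecimal;

// Demonstrates why routing a cost literal through Float64 loses precision
// while the Decimal-aware String/BigDecimal paths are lossless.
public final class DecimalPrecisionDemo {

    public static void main(String[] args) {
        String cost = "123.456789012345678901"; // 21 significant figures

        // Float64 path (what an unquoted numeric inside toDecimal128(...) hits):
        // the value is rounded to the nearest double, ~17 significant figures.
        double viaFloat64 = Double.parseDouble(cost);
        System.out.println(BigDecimal.valueOf(viaFloat64).toPlainString());

        // Decimal-aware string path: round-trips every digit.
        System.out.println(new BigDecimal(cost).toPlainString());
    }
}
```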
- Inline the BigDecimal-direct cost bind at the BULK_INSERT call site (the only call site for that variant); drop the bindCostForBulkInsert helper and the docstring on the existing bindCost, both unnecessary for one call site.
- Tighten the ClickHouse skill addition to drop the redundant single-row INSERT/UPDATE caveat (the rule's scope is already in the heading) and dedupe overlapping language.
Backend Tests - Integration Group 11: 204 tests, 204 ✅, 4m 7s ⏱️. Results for commit 1857bc0.
CI surfaced ProjectsResourceTest$FindProject#getProjects__whenProjectsHasTracesBatch__thenReturnProjectWithLastUpdatedTraceAt failing with a sub-millisecond timestamp diff between getById of the last trace and the project's MAX(last_updated_at). Reproduces under CI load but not deterministically in isolation.
Root cause: the prior server-side wrap was if(:last_updated_at IS NULL, now64(6), parseDateTime64BestEffort(...)). ClickHouse evaluates now64() once per query, so every batch row with null lastUpdatedAt got the SAME timestamp, making `getById(any).lastUpdatedAt == MAX(...)` trivially true. After the canonical-literal refactor we were calling Instant.now() per row in the bind loop, so each null-lastUpdatedAt row got a distinct sequential timestamp. While `getById(traces.getLast())` should still equal MAX in the steady case, JVM scheduling jitter under CI load made the assertion fragile.
Fix: capture Instant.now() once per batch (outside the bind loop) and reuse it for every row whose client did not provide lastUpdatedAt. This restores the prior single-timestamp-per-batch invariant the test relies on. Same change in SpanDAO. TraceThreadDAO doesn't need it: that DAO never falls back to Instant.now(); it always uses item.lastUpdatedAt() / item.createdAt().
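The fix pattern, sketched with a hypothetical row type (the real code binds into r2dbc statements rather than returning a list):

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Sketch of the one-timestamp-per-batch fix; BatchRow is a hypothetical
// stand-in for the real trace/span items.
public final class BatchTimestampSketch {

    record BatchRow(Instant lastUpdatedAt) {}

    static List<Instant> resolveLastUpdatedAt(List<BatchRow> rows) {
        // Capture "now" ONCE, outside the bind loop, so every row whose
        // client omitted lastUpdatedAt shares the same timestamp -- the
        // invariant the previous server-side now64(6) wrap provided.
        Instant batchNow = Instant.now();
        List<Instant> resolved = new ArrayList<>();
        for (BatchRow row : rows) {
            resolved.add(row.lastUpdatedAt() != null ? row.lastUpdatedAt() : batchNow);
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<Instant> out = resolveLastUpdatedAt(
                List.of(new BatchRow(null), new BatchRow(null), new BatchRow(null)));
        // All null-lastUpdatedAt rows resolve to one identical instant.
        System.out.println(out.get(0).equals(out.get(1)) && out.get(1).equals(out.get(2)));
        // prints true
    }
}
```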
Review feedback on PR #6572: the scored_at cell in INSERT_THREADS_SQL is a real fast-path concern that slipped through the original audit.
INSERT_THREADS_SQL is a FORMAT Values template and binds scored_at as a bare :placeholder (no parseDateTime64BestEffort wrap). Until now we passed Instant.toString() (ISO-8601 with T/Z), which the new ClickHouseDateTimeFormat util explicitly warns against. When scoredAt is non-null (re-insert after scoring), CH would not recognise the literal and would trip codes 26/27/43 the same way the rest of the PR addresses. Fix: line 335 now uses ClickHouseDateTimeFormat.formatNanos(item.scoredAt()).
Also applied the same canonical-text format at the two other scored_at bind sites (UPDATE_THREAD_SQL line 516 and UPDATE_THREAD_SCORED_AT line 566). Those are INSERT...SELECT / UPDATE statements wrapped in parseDateTime64BestEffort, so they were not affected by the fast-path issue, but consistent canonical binding is zero-cost and avoids someone copy-pasting the ISO-8601 form into a future FORMAT Values template.
Aligned bindNull("scored_at", Instant.class) → String.class with the rest of the nullable-DateTime cells in this PR.
```diff
 if (item.scoredAt() != null) {
-    statement.bind("scored_at" + i, item.scoredAt().toString());
+    statement.bind("scored_at" + i, ClickHouseDateTimeFormat.formatNanos(item.scoredAt()));
 } else {
-    statement.bindNull("scored_at" + i, Instant.class);
+    statement.bindNull("scored_at" + i, String.class);
 }
```
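The difference between the two literals is easy to see side by side — the canonical pattern below mirrors what a nanos formatter produces and is illustrative:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Contrast Instant.toString() (ISO-8601 with 'T'/'Z', which the Values
// fast-path does not accept for DateTime64) with the canonical text form.
public final class ScoredAtLiterals {

    private static final DateTimeFormatter CANONICAL = DateTimeFormatter
            .ofPattern("yyyy-MM-dd HH:mm:ss.SSSSSSSSS")
            .withZone(ZoneOffset.UTC);

    public static void main(String[] args) {
        Instant scoredAt = Instant.parse("2024-05-01T12:34:56.123456789Z");
        System.out.println(scoredAt.toString());
        // prints 2024-05-01T12:34:56.123456789Z  (rejected by the fast path)
        System.out.println(CANONICAL.format(scoredAt));
        // prints 2024-05-01 12:34:56.123456789   (accepted as a literal)
    }
}
```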
mapAndInsert and updateThreadSampledValues both duplicate the same scoredAt bind/null block, so should we extract a shared helper like bindScoredAt(statement, prefix, scoredAt) to keep serialization in one place?
Finding type: Code Dedup and Conventions | Severity: 🟢 Low
Hi @ldaugusto — I saw that a similar PR was recently merged, but before we move forward with reviewing this one, have we explored the possibility of addressing these errors at the source (e.g., via configuration)?
From what I can see, the functions being used here are valid, tests are passing, and the queries/DAOs behave as expected. The entries in ClickHouse system.errors seem to be more of an unintended side effect rather than an indication of incorrect behavior.
While we can patch individual usages, this approach may not scale well—engineers could legitimately use similar (or new) functions in the future, and we might end up reintroducing the same kind of noise.
It might be worth investigating whether there’s a configuration-based solution to suppress or handle these cases more appropriately, rather than modifying code paths that are otherwise correct and functioning as intended.
@andrescrz during this task I've learned that when there's a function expression in a FORMAT Values cell, ClickHouse abandons the fast path and counts an error. I guess we could set a config to discard this logging, but then we would be dropping logs for new/actual problems too, so I don't believe it's a good idea. With this approach our queries run without triggering these logs, using only the fast path (stream parser). I've updated the ClickHouse skill to keep this problem from spreading.
andrescrz
left a comment
Only minor suggestions, but LGTM. Thanks for the investigation, the fix and the improvements!
```java
// BULK_INSERT writes the cost cell directly into Decimal128(12) (no toDecimal128
// wrap), so the driver must emit an unquoted numeric literal — bind the
// BigDecimal itself rather than its String form. See OPIK-5694.
BigDecimal cost = span.totalEstimatedCost() != null ? span.totalEstimatedCost() : calculateCost(span);
statement.bind("total_estimated_cost" + i, cost);
statement.bind("total_estimated_cost_version" + i,
        span.totalEstimatedCost() == null && cost.compareTo(BigDecimal.ZERO) > 0
                ? ESTIMATED_COST_VERSION
                : "");
```
Minor: DRY and encapsulate, aiming to reuse the previous bindCost method. Refactor that one if you need to.
```java
public static String formatNanos(Instant instant) {
    return NANOS.format(instant);
}

public static String formatMicros(Instant instant) {
```
Minor: if null input isn't handled (I believe it isn't), annotate the instant arg with Lombok's @NonNull.
| @DisplayName("when batch spans are inserted, no FORMAT Values fast-path errors are emitted (OPIK-5694)") | ||
| void batch__whenSpansAreInserted__thenNoFastPathErrorsEmitted( |
Minor: you created these tests in the previous fix. Honestly, feel free to completely clean them up. Reasons:
- This is more an implementation detail at ClickHouse level that we shouldn't test directly. It should be caught at the operational level, like it was. Your manual test is enough, I believe we don't need regression.
- It directly reads ClickHouse, which breaks the principle of this test class (and generally in the service) of being black-box tests.
Anyhow, keep them if you prefer.
```diff
 @Test
-@DisplayName("when batch contains a trace with null lastUpdatedAt, then no CANNOT_CONVERT_TYPE errors are emitted (OPIK-5694)")
-void batch__whenTraceHasNullLastUpdatedAt__thenNoCannotConvertTypeErrors(
+@DisplayName("when batch traces are inserted, no FORMAT Values fast-path errors are emitted (OPIK-5694)")
```
Minor: same about this other test. I recommend deleting it completely.
@ldaugusto discussed offline. Thanks for the explanation, that makes sense. The key learning is that it's not only noisy error logs: the fast parser isn't used, which affects memory usage and speed, so it's desirable to tackle. In addition, I'd rather have an architecture of cleaner queries without business logic in the form of SQL functions, so I like the improvement of making those more neutral and just passing plain values.
Details
This PR is the empirical realisation of that diagnosis. After the previous code-70 fix landed, production ClickHouse was still logging a sustained stream of `CANNOT_PARSE_INPUT_ASSERTION_FAILED` (code 27), `ILLEGAL_TYPE_OF_ARGUMENT` (code 43), and `CANNOT_PARSE_QUOTED_STRING` (code 26) errors from the trace/span/thread batch ingestion paths. The inserts succeed, but every batch INSERT statement counts an error in `system.errors` and writes to pod stderr.

We empirically isolated this against ClickHouse `25.3.6.56` (test container) and `25.3.8` (production version): every column whose value in the per-row tuple is a function call instead of a recognised literal trips the Values fast-path and counts an assertion in `system.errors`. The Values fast path only recognises canonical literals — quoted strings, plain numbers, `NULL`, array literals `[...]`, map literals `{key : value}`. Anything else (`parseDateTime64BestEffort(...)`, `now64(...)`, `if(... IS NULL, ...)`, `toDecimal128(...)`, `mapFromArrays(...)`, `'...'::Type` casts) falls back to the slow path and increments the counter once per offending cell per statement.

Fix
For every datetime/decimal/map cell in the three FORMAT Values templates affected, build the canonical literal in Java and bind it as a plain value, dropping every wrapping function from the SQL.
| Template | Column | Before | After |
| --- | --- | --- | --- |
| `TraceDAO.BATCH_INSERT` | `start_time` | `parseDateTime64BestEffort(:start_time, 9)` | `:start_time` (canonical text via `ClickHouseDateTimeFormat.formatNanos`) |
| `TraceDAO.BATCH_INSERT` | `end_time` | `if(:end_time IS NULL, NULL, parseDateTime64BestEffort(...))` | `:end_time` (canonical text, or `bindNull` → literal `NULL`) |
| `TraceDAO.BATCH_INSERT` | `last_updated_at` | `if(:last_updated_at IS NULL, now64(6), parseDateTime64BestEffort(...))` | `:last_updated_at` (canonical text; null → `Instant.now()` formatted) |
| `TraceDAO.BATCH_INSERT` | `visibility_mode` | `if(:visibility_mode IS NULL, 'default', :visibility_mode)` | `:visibility_mode` (binds `'default'` literal when null) |
| `SpanDAO.BULK_INSERT` | `start_time` / `end_time` / `last_updated_at` | same datetime wraps as above | plain placeholders bound to canonical text |
| `SpanDAO.BULK_INSERT` | `total_estimated_cost` | `toDecimal128(:total_estimated_cost, 12)` | `:total_estimated_cost` (bind `BigDecimal` directly → driver emits plain numeric) |
| `SpanDAO.BULK_INSERT` | `usage` | `mapFromArrays(:usage_keys, :usage_values)` | `:usage` (bind `Map<String, Integer>` directly → driver emits `{'k' : v}` literal) |
| `TraceThreadDAO.INSERT_THREADS_SQL` | `created_at` / `last_updated_at` | `parseDateTime64BestEffort(...)` | plain placeholders bound to canonical text |
| `TraceThreadDAO.INSERT_THREADS_SQL` | `sampling_per_rule` | `mapFromArrays(:rule_ids, :sampling)` | `:sampling_per_rule` (bind `Map<UUID, Boolean>` directly) |

Other templates in the same files (single-row `INSERT … SELECT`, `UPDATE_SPAN_SQL`, `UPDATE_THREAD_SQL`) were audited and left alone — they don't use `FORMAT Values`, so the fast-path issue doesn't apply.

New util
`ClickHouseDateTimeFormat` exposes `formatNanos` / `formatMicros` (UTC) for the two precision flavours used by the schema.

The ClickHouse skill (`.agents/skills/opik-backend/clickhouse.md`) gets a short section codifying the rule for future contributors: in `FORMAT Values` templates every cell must be a plain `:placeholder` bound to a value the driver serialises as a literal — function calls, casts, and `if(...)` wrappers in cells flood `system.errors` even when the insert succeeds.

Behaviour visible to API consumers is unchanged: `lastUpdatedAt` still defaults to "now" when omitted (the column DEFAULT was `now64(6)`; we now compute `Instant.now()` in Java and format it the same way), `visibilityMode` still defaults to `'default'`, and costs and usage are still recorded with the same precision and shape.

The previous code-70 fix on this ticket (`if(... IS NULL, now64(6), parseDateTime64BestEffort(...))`) is now superseded — the canonical-literal approach eliminates the same NULL→non-nullable conversion failure from a different angle, plus codes 26/27/43.

Change checklist
Issues
`system.errors` counters before/after the fix; targeted run of all 31 BatchInsert tests across `TracesResourceTest` and `SpansResourceTest`
Commands run:
Results:
- `TracesResourceTest$BatchInsert`: 19 tests pass.
- `SpansResourceTest$BatchInsert`: 12 tests pass.
- `FindTraceThreadsResourceTest` + `TraceThreadListenerTest` + `TraceThreadsClosingJobTest`: 407 tests pass (covers the modified `TraceThreadDAO.INSERT_THREADS_SQL` path).

Scenarios validated:
- `batch__whenTracesAreInserted__thenNoFastPathErrorsEmitted` and `batch__whenSpansAreInserted__thenNoFastPathErrorsEmitted` measure `system.errors` deltas for codes 26 / 27 / 43 / 70 across `batchCreateTraces` / `batchCreateSpans`. With the fix, all four counters' deltas are zero. Without the fix, the same tests would fail on multiple counters at once.
- `getById` for every inserted row, to confirm rows are still present and the materialised `duration` column still computes correctly.
- `curl` against ClickHouse `25.3.6.56-alpine` (test container) and `25.3.8-alpine` (production version), to confirm the result is not version-specific. Fixed-shape SQL produces zero increments on both.

Not run: full backend regression suite (out of scope for this fix).
Documentation
N/A — no behaviour change visible to API consumers.