
feat(builder-sidecar): log API calls to LOG_DIR for evaluation#150

Open
0xdkay wants to merge 2 commits into main from feat/sidecar-api-log

0xdkay (Collaborator) commented Mar 23, 2026

Summary

Add structured JSONL logging of every builder sidecar API call (run-pov, apply-patch-build, run-test) to OSS_CRS_LOG_DIR for CRS evaluation.

Closes #148

Implementation

  • Register /sidecar-logs via libCRS register-log-dir at startup, symlinked to host-mounted OSS_CRS_LOG_DIR
  • After each API handler completes, append one JSON line to api-calls.jsonl
  • Each entry includes: ts, api, job_id, exit_code, duration_ms, plus api-specific fields
  • Handlers return harness and build_id in their result dicts, so the logger needs no separate pre-read
  • The crash field is computed as exit_code > 0, so a handler exception (recorded with the sentinel exit_code -1) is not reported as a crash
  • Empty api-calls.jsonl created at startup to distinguish "zero calls" from "sidecar never started"
  • Logging is best-effort: it never blocks or fails the API call
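
The bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function names, the signature, and the assumption that timeout arrives in the handler's result dict are all hypothetical. Only the field names, the -1 sentinel, and the exit_code > 0 rule come from the description.

```python
import json
import time
from pathlib import Path

LOG_FILE_NAME = "api-calls.jsonl"
EXCEPTION_EXIT_CODE = -1  # sentinel used when a handler raised


def make_log_entry(api, job_id, started, result, now=None):
    """Build one log record from a handler's result dict (hypothetical shape)."""
    now = time.time() if now is None else now
    exit_code = result.get("exit_code", EXCEPTION_EXIT_CODE)
    entry = {
        "ts": round(now, 2),
        "api": api,
        "job_id": job_id,
        "duration_ms": round((now - started) * 1000),
        "exit_code": exit_code,
    }
    if api == "run-pov":
        entry["harness"] = result.get("harness")
        entry["build_id"] = result.get("build_id")
        # Strictly positive exit codes only: the -1 sentinel is not a crash.
        entry["crash"] = exit_code > 0
        # Assumption: the handler reports timeouts in its result dict.
        entry["timeout"] = bool(result.get("timeout", False))
    return entry


def append_log_entry(log_dir, entry):
    """Append one JSON line; swallow errors so the API call is unaffected."""
    try:
        path = Path(log_dir) / LOG_FILE_NAME
        with path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(entry, separators=(",", ":")) + "\n")
        return True
    except (OSError, TypeError, ValueError):
        return False
```

Returning a boolean instead of raising keeps the caller's control flow identical whether or not the log directory is writable, which is one way to honor the best-effort requirement.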

Log entry examples

Bug-finding (crs-bug-finding-claude-code on sanity-mock-c-delta-01):

{"ts":1774296986.79,"api":"run-pov","job_id":"2b9ec6bf26cf","duration_ms":58,"exit_code":1,"harness":"fuzz_parse_buffer_section","build_id":"base","crash":true,"timeout":false}
{"ts":1774297023.24,"api":"run-pov","job_id":"dbe6bf714002","duration_ms":55,"exit_code":1,"harness":"fuzz_parse_buffer_section","build_id":"base","crash":true,"timeout":false}
{"ts":1774297067.71,"api":"run-pov","job_id":"bcee79ed184a","duration_ms":65,"exit_code":77,"harness":"fuzz_parse_buffer_section","build_id":"base","crash":true,"timeout":false}

Bug-fixing (crs-claude-code on sanity-mock-c-delta-01 with POV input):

{"ts":1774300121.36,"api":"build","job_id":"52d872574bec","duration_ms":551,"exit_code":0,"build_success":true}
{"ts":1774300136.70,"api":"run-pov","job_id":"d9f4cce856b6","duration_ms":107,"exit_code":1,"harness":"fuzz_parse_buffer_section","build_id":"base","crash":true,"timeout":false}
{"ts":1774300148.91,"api":"run-pov","job_id":"21fc5831ed93","duration_ms":90,"exit_code":0,"harness":"fuzz_parse_buffer_section","build_id":"52d872574bec","crash":false,"timeout":false}
{"ts":1774300159.91,"api":"run-test","job_id":"bd45c2ec677e","duration_ms":176,"exit_code":0,"build_id":"52d872574bec","test_passed":true,"skipped":false}

Log file location

<workdir>/.../LOG_DIR/<harness>/sidecar-logs/api-calls.jsonl

Collected by oss-crs artifacts via the log_dir field, and by CRSBench adapter to trial_dir/output/logs/crs/<crs>/log_dir/sidecar-logs/api-calls.jsonl.
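
Because an empty api-calls.jsonl is created at startup, a collector can tell three situations apart. The following is a hedged sketch with a hypothetical function name and return labels; log_dir stands for whichever collected directory contains sidecar-logs.

```python
from pathlib import Path


def sidecar_log_state(log_dir):
    """Classify a collected log directory by the state of api-calls.jsonl."""
    path = Path(log_dir) / "sidecar-logs" / "api-calls.jsonl"
    if not path.is_file():
        # No file at all: the sidecar never reached its startup logging step.
        return "never-started"
    with path.open(encoding="utf-8") as fh:
        has_call = any(line.strip() for line in fh)
    # File exists but has no entries: the sidecar ran and received no calls.
    return "has-calls" if has_call else "zero-calls"
```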

Smoke test results

Tested against sanity-mock-c-delta-01 (fuzz_parse_buffer_section harness):

| CRS | Bug-finding | Bug-fixing (with POV) |
|---|---|---|
| crs-bug-finding-claude-code | 7 run-pov, all crashes | |
| crs-bug-finding-codex | 6 run-pov, all crashes | |
| crs-bug-finding-copilot-cli | 5 run-pov, all crashes | |
| crs-bug-finding-gemini-cli | 10 run-pov, all crashes | |
| crs-claude-code | | 4 calls: build→run-pov(crash)→run-pov(no crash)→run-test(pass) |
| crs-codex | | 4 calls: run-pov(crash)→build→run-pov(no crash)→run-test(pass) |
| crs-copilot-cli | | 3 calls: build→run-pov(no crash)→run-test(pass) |
| crs-gemini-cli | | 3 calls: build→run-pov(no crash)→run-test(pass) |
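
A call-sequence summary in the style used in these results can be derived from the log entries. The sketch below is one possible way to do it and is not necessarily how the results here were produced; the function names and the "fail" and "skipped" labels are assumptions.

```python
ARROW = "\u2192"  # the arrow character used between calls


def describe_call(entry):
    """Render one log entry as a short label such as run-pov(crash)."""
    api = entry.get("api", "?")
    if api == "run-pov":
        return "run-pov(crash)" if entry.get("crash") else "run-pov(no crash)"
    if api == "run-test":
        if entry.get("skipped"):
            return "run-test(skipped)"
        return "run-test(pass)" if entry.get("test_passed") else "run-test(fail)"
    return api


def summarize_calls(entries):
    """Join call labels in timestamp order."""
    ordered = sorted(entries, key=lambda e: e.get("ts", 0.0))
    return ARROW.join(describe_call(e) for e in ordered)
```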

Test plan

  • 22 unit tests for log entry construction and JSONL writing
  • Smoke tested crs-bug-finding-claude-code, crs-bug-finding-codex, crs-bug-finding-copilot-cli, crs-bug-finding-gemini-cli in bug-finding mode (real Docker runs)
  • Smoke tested crs-claude-code, crs-codex, crs-copilot-cli, crs-gemini-cli in bug-fixing mode with POV input (real Docker runs)
  • Verified api-calls.jsonl persists in LOG_DIR on host
  • Verified oss-crs artifacts reports log_dir containing the file

0xdkay force-pushed the feat/sidecar-api-log branch 3 times, most recently from f747fae to 231de63 on March 23, 2026 21:50
0xdkay added 2 commits March 23, 2026 21:51
Add structured JSONL logging of every builder sidecar API call
(run-pov, apply-patch-build, run-test) for CRS evaluation.

Closes #148

Signed-off-by: Dongkwan Kim <0xdkay@gmail.com>

22 tests covering _make_log_entry and JSONL file writing.

Signed-off-by: Dongkwan Kim <0xdkay@gmail.com>
0xdkay force-pushed the feat/sidecar-api-log branch from 231de63 to 486f866 on March 23, 2026 21:51

azchin (Collaborator) commented Mar 24, 2026

I'll reimplement this in the new builder sidecar workflow.


Development

Successfully merging this pull request may close these issues.

feat(builder-sidecar): log API calls for run-pov, apply-patch-build, run-test