[issue-5494] [P-SDK] fix: reuse original trace/span IDs in project import for idempotency by octo-patch · Pull Request #6471 · comet-ml/opik

octo-patch · 2026-04-24T02:43:52Z

Problem

When importing a project export, import_projects_from_directory generated a fresh random UUID for every trace and every span via id_helpers.generate_id(), even though the original IDs were available in the export file. This meant:

- Running the same import twice created duplicate traces (different IDs, identical content).
- Users had no way to make imports idempotent without the migration manifest, and even with the manifest a forced re-run (--force) still produced duplicates.

Solution

Pass the original id from the export file directly to client.trace() and client.span(), falling back to generate_id() only when the export contains no ID (for backward compatibility with older export formats that predate ID storage).

# Before
trace = client.trace(
    id=id_helpers.generate_id(timestamp=original_start_time),
    ...
)

# After
trace = client.trace(
    id=original_trace_id or id_helpers.generate_id(timestamp=original_start_time),
    ...
)

The same pattern is applied to client.span(). The server handles repeat writes of the same ID gracefully (upsert or 409 dedup), so re-running import no longer creates duplicates.
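To illustrate why reusing IDs makes a re-run harmless, here is a minimal sketch that assumes upsert semantics on the server side. `InMemoryStore`, `resolve_id`, and `import_traces` are hypothetical stand-ins written for this example, and `uuid.uuid4()` is only a placeholder for `id_helpers.generate_id()`; none of this is the actual Opik implementation.

```python
import uuid
from typing import Any, Dict, List, Optional


class InMemoryStore:
    """Hypothetical stand-in for the server: writes are upserts keyed by ID."""

    def __init__(self) -> None:
        self.traces: Dict[str, Dict[str, Any]] = {}

    def upsert_trace(self, trace_id: str, payload: Dict[str, Any]) -> None:
        # Writing the same ID again overwrites instead of adding a new row.
        self.traces[trace_id] = payload


def resolve_id(original_id: Optional[str]) -> str:
    # Reuse the exported ID when present; otherwise generate a fresh one.
    # uuid.uuid4() is a placeholder for id_helpers.generate_id().
    return original_id or str(uuid.uuid4())


def import_traces(store: InMemoryStore, exported: List[Dict[str, Any]]) -> List[str]:
    used_ids: List[str] = []
    for record in exported:
        trace_id = resolve_id(record.get("id"))
        store.upsert_trace(trace_id, {"name": record.get("name")})
        used_ids.append(trace_id)
    return used_ids


# Export that carries original IDs: two imports hit the same two keys.
export = [{"id": "trace-1", "name": "a"}, {"id": "trace-2", "name": "b"}]
store = InMemoryStore()
first_run_ids = import_traces(store, export)
second_run_ids = import_traces(store, export)

# Older export without IDs: each run falls back to a fresh ID.
legacy_store = InMemoryStore()
legacy_export = [{"name": "old"}]
import_traces(legacy_store, legacy_export)
import_traces(legacy_store, legacy_export)
```

Note that the fallback path is still non-idempotent in this sketch: exports without stored IDs receive a new ID on every run, which matches the behavior described above for older export formats.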

The trace_id_map and span_id_map bookkeeping is unchanged — when original IDs are preserved the maps become identity maps, which is still correct for experiment recreation lookups.
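The identity-map point can be shown with a small sketch. `build_trace_id_map` and `remap` are hypothetical helpers invented for illustration, and the `generated-N` IDs merely stand in for freshly generated ones; the real bookkeeping in the import code may be structured differently.

```python
from typing import Dict, List


def build_trace_id_map(original_ids: List[str], preserve_ids: bool) -> Dict[str, str]:
    """Map each exported trace ID to the ID actually used on import."""
    mapping: Dict[str, str] = {}
    for index, original_id in enumerate(original_ids):
        # "generated-N" is a placeholder for a freshly generated ID.
        new_id = original_id if preserve_ids else f"generated-{index}"
        mapping[original_id] = new_id
    return mapping


def remap(references: List[str], mapping: Dict[str, str]) -> List[str]:
    """Translate exported references (e.g. from experiment items) via the map."""
    return [mapping[ref] for ref in references]


original_ids = ["trace-1", "trace-2"]
old_behavior = build_trace_id_map(original_ids, preserve_ids=False)
new_behavior = build_trace_id_map(original_ids, preserve_ids=True)
```

Because lookups go through the map in both cases, downstream code does not need to know whether IDs were preserved or regenerated.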

Testing

Three new unit tests added to tests/unit/cli/test_import_project.py:
- test_trace_uses_original_id — asserts client.trace() is called with the original trace ID
- test_span_uses_original_id — asserts client.span() is called with the original span ID
- test_import_twice_produces_same_ids — asserts two consecutive imports use identical trace IDs
Full CLI unit test suite (201 tests) passes without regressions.
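For readers who want a feel for what such assertions look like, here is a rough sketch using `unittest.mock`. The `import_traces` function is a simplified, hypothetical import loop written for this example; the real tests exercise import_projects_from_directory and will differ in setup.

```python
from typing import Any, Dict, List
from unittest.mock import MagicMock


def import_traces(client: Any, exported: List[Dict[str, Any]]) -> None:
    # Hypothetical, simplified import loop; not the real implementation.
    for record in exported:
        client.trace(id=record["id"], name=record.get("name"))


def test_trace_uses_original_id() -> None:
    client = MagicMock()
    import_traces(client, [{"id": "trace-abc", "name": "demo"}])
    # The mock records the call, so the ID passed through can be checked.
    client.trace.assert_called_once_with(id="trace-abc", name="demo")


def test_import_twice_produces_same_ids() -> None:
    export = [{"id": "trace-1", "name": "a"}, {"id": "trace-2", "name": "b"}]
    first_client, second_client = MagicMock(), MagicMock()
    import_traces(first_client, export)
    import_traces(second_client, export)
    first_ids = [call.kwargs["id"] for call in first_client.trace.call_args_list]
    second_ids = [call.kwargs["id"] for call in second_client.trace.call_args_list]
    assert first_ids == second_ids == ["trace-1", "trace-2"]
```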

🤖 AI-assisted contribution — code reviewed and tests verified by human author.

…port for idempotency

When importing traces from an export file, each run generated fresh random IDs (via `id_helpers.generate_id()`), so importing the same data twice created duplicate traces and spans instead of overwriting or deduplicating.

Reuse the `id` from the export file for both `client.trace()` and `client.span()`, falling back to `generate_id()` only when the export contains no ID (older export format). The server handles repeat writes of the same ID gracefully (upsert / 409), so re-running import no longer creates duplicates.

Also update comments on the span_id_map to clarify it is now an identity map when original IDs are preserved.

Adds three unit tests that assert original IDs are passed through and that consecutive imports produce identical IDs.

Fixes comet-ml#5494

Co-Authored-By: Octopus <liyuan851277048@icloud.com>

baz-reviewer · 2026-04-24T02:47:46Z


                        trace = client.trace(
-                            id=id_helpers.generate_id(timestamp=original_start_time),
+                            id=original_trace_id or id_helpers.generate_id(timestamp=original_start_time),


sdks/python/AGENTS.md enforces 88-character Python lines, but the new `id=original_trace_id or id_helpers.generate_id(timestamp=original_start_time),` line in the client.trace() call exceeds that limit. Should we wrap the conditional id expression like the other keyword args?

Finding type: AI Coding Guidelines | Severity: 🟢 Low


Prompt for AI Agents:

Before applying, verify this suggestion against the current code. In sdks/python/src/opik/cli/imports/project.py around lines 205-228 inside import_projects_from_directory(), update the client.trace() call so the keyword argument `id=original_trace_id or id_helpers.generate_id(timestamp=original_start_time)` does not exceed the 88-character limit from AGENTS.md. Refactor that single long `id=` expression by wrapping it across multiple lines (using parentheses) and keep the formatting consistent with the other keyword arguments in the same call. Re-run lint/format checks to confirm the line length violation is resolved.
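One way the wrapped expression could look is sketched below. The `id_helpers` stub and `FakeClient` are hypothetical stand-ins so the snippet runs on its own; only the shape of the parenthesized `id=` argument is the point, and the exact layout a formatter produces may differ.

```python
import uuid
from types import SimpleNamespace
from typing import Any, Dict


# Hypothetical stubs so the snippet is self-contained; not the real modules.
id_helpers = SimpleNamespace(
    generate_id=lambda timestamp=None: str(uuid.uuid4()),
)


class FakeClient:
    def trace(self, **kwargs: Any) -> Dict[str, Any]:
        # Echo the keyword arguments so the resolved ID can be inspected.
        return kwargs


client = FakeClient()
original_trace_id = "trace-abc"
original_start_time = None

# Parentheses let the conditional span several short lines,
# keeping every line within the 88-character limit.
trace = client.trace(
    id=(
        original_trace_id
        or id_helpers.generate_id(timestamp=original_start_time)
    ),
    name="example",
)
```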

baz-reviewer · 2026-04-24T02:47:46Z

@@ -246,7 +246,10 @@ def import_projects_from_directory(
                            )


The span import loop duplicates the span reconstruction flow in sdks/python/src/opik/cli/imports/experiment.py. Should we refactor both into a shared helper (e.g. opik/cli/imports/utils.py)?

Finding type: Code Dedup and Conventions | Severity: 🟢 Low


dsblank · 2026-04-27T14:14:33Z

@octo-patch this makes a lot of sense; thank you for the PR!

Can you either address the baz-reviewer comments or resolve them?

Also, there are linting errors. Please fix those.

octo-patch requested a review from a team as a code owner April 24, 2026 02:43

github-actions Bot added python Pull requests that update Python code tests Including test files, or tests related like configuration. Python SDK labels Apr 24, 2026

baz-reviewer Bot reviewed Apr 24, 2026

octo-patch wants to merge 1 commit into comet-ml:main from octo-patch:ximi/issue-5494-idempotent-import-trace-ids
