
test: cover Iceberg REST catalog backend in CI#4276

Open
mengw15 wants to merge 2 commits into apache:main from
mengw15:Lakekeeper-CI-2


@mengw15
Contributor

@mengw15 mengw15 commented Mar 9, 2026

What changes were proposed in this PR?

Add two @IntegrationTest-tagged specs that round-trip table metadata
via the Iceberg REST catalog against a live Lakekeeper + MinIO stack
brought up by the existing amber-integration CI job:

  • IcebergRestCatalogIntegrationSpec (Scala)
  • test_iceberg_rest_catalog_integration.py (Python, marked
    pytest.mark.integration)

The amber-integration job now boots MinIO + Lakekeeper, initializes a
warehouse with an S3 storage profile, installs dev-requirements.txt
for pytest, and runs pytest -m integration after the existing sbt
step. The regular python job runs with -m "not integration" so the
new Python test is excluded there. The integration marker is
registered in amber/pyproject.toml as the Python equivalent of the
Scala @IntegrationTest Java tag.

Any related issues, documentation, discussions?

Closes #4994 (sub-issue of #4126).

How was this PR tested?

CI itself — the new specs run as part of amber-integration on every
push of this PR; Lakekeeper boot / warehouse init / REST-path
breakage all surface on that job's status check.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

@mengw15 mengw15 marked this pull request as draft March 9, 2026 11:34
@github-actions github-actions Bot added the engine, dependencies, ddl-change, python, ci, service, and common labels Mar 9, 2026
@mengw15 mengw15 self-assigned this Apr 6, 2026
@mengw15 mengw15 changed the title from "feat: introduce Result Service using Lakekeeper as REST catalog for Iceberg - CI" to "ci: wire Lakekeeper and MinIO into GitHub Actions" May 9, 2026
Comment thread amber/requirements.txt
Comment thread common/workflow-core/build.sbt Outdated
Comment thread amber/src/main/python/core/storage/iceberg/test_iceberg_document.py Outdated
@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch from 214c671 to a1a4d33 Compare May 9, 2026 01:13
@github-actions github-actions Bot removed the engine and ddl-change labels May 9, 2026
@codecov-commenter

codecov-commenter commented May 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 43.22%. Comparing base (c86dc15) to head (40a8e65).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4276      +/-   ##
============================================
+ Coverage     43.21%   43.22%   +0.01%     
- Complexity     2185     2186       +1     
============================================
  Files          1035     1035              
  Lines         38843    38843              
  Branches       4061     4061              
============================================
+ Hits          16785    16789       +4     
+ Misses        21006    21004       -2     
+ Partials       1052     1050       -2     
Flag                              Coverage Δ
access-control-service            39.53% <ø> (ø)
agent-service                     33.72% <ø> (ø)
amber                             43.22% <ø> (+0.01%) ⬆️
computing-unit-managing-service   0.00% <ø> (ø)
config-service                    0.00% <ø> (ø)
file-service                      32.18% <ø> (ø)
frontend                          34.80% <ø> (ø)
python                            88.90% <ø> (+0.05%) ⬆️
workflow-compiling-service        47.72% <ø> (ø)


@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch from 1ca58ce to 33fb0d8 Compare May 9, 2026 01:33
@mengw15 mengw15 changed the title from "ci: wire Lakekeeper and MinIO into GitHub Actions" to "test: cover Iceberg REST catalog backend in CI" May 9, 2026
@mengw15
Contributor Author

mengw15 commented May 9, 2026

@Yicong-Huang ready for review!

@mengw15 mengw15 marked this pull request as ready for review May 9, 2026 08:29
@Yicong-Huang
Contributor

Per offline discussion, I think this integration test should be moved into amber-integration; let's not start a new CI job just for the Iceberg REST catalog.

@mengw15 mengw15 marked this pull request as draft May 9, 2026 22:17
@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch from 0071899 to 6e1e8b9 Compare May 9, 2026 22:43
@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch from 6e1e8b9 to 53e6e73 Compare May 9, 2026 22:50
@github-actions github-actions Bot removed the dependencies and common labels May 9, 2026
@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch from 53e6e73 to dd3ac36 Compare May 9, 2026 23:00
@github-actions github-actions Bot added the python label May 9, 2026
@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch 10 times, most recently from a940bc3 to eff01b6 Compare May 10, 2026 00:53
@mengw15 mengw15 marked this pull request as ready for review May 10, 2026 01:10
Contributor

@Yicong-Huang Yicong-Huang left a comment


LGTM. I left several comments inline; please take care of them. Thanks!

Comment thread .github/workflows/build.yml Outdated
Comment thread .github/workflows/build.yml Outdated
Comment on lines +382 to +399
    curl -sf -X POST -H 'Content-Type: application/json' -d '{
      "warehouse-name": "texera",
      "project-id": "00000000-0000-0000-0000-000000000000",
      "storage-profile": {
        "type": "s3",
        "bucket": "texera-iceberg",
        "region": "us-west-2",
        "endpoint": "http://localhost:9000",
        "flavor": "s3-compat",
        "path-style-access": true,
        "sts-enabled": false
      },
      "storage-credential": {
        "type": "s3",
        "credential-type": "access-key",
        "aws-access-key-id": "texera_minio",
        "aws-secret-access-key": "password"
      }
Contributor


Do these config values live somewhere in our codebase? Can we point there instead? That would avoid having to update this CI file whenever a parameter changes.

Contributor Author


Good question — these values do have a runtime source-of-truth in common/config/src/main/resources/storage.conf, but it is not easily readable from a bash CI step.
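
One possible middle ground, sketched under assumptions rather than taken from this PR: generate the request body from environment variables that the CI job exports, so the workflow file carries no literal values. The variable names (`WAREHOUSE_NAME`, `S3_BUCKET`, and so on) and the helper `build_warehouse_payload` are hypothetical; the JSON keys and default values are copied from the snippet under review.

```python
import json
import os
from typing import Mapping


def build_warehouse_payload(env: Mapping[str, str] = os.environ) -> dict:
    """Build the warehouse-creation body, falling back to the CI defaults.

    The environment variable names are illustrative assumptions; only the
    JSON keys and default values come from the reviewed workflow snippet.
    """
    return {
        "warehouse-name": env.get("WAREHOUSE_NAME", "texera"),
        "project-id": env.get(
            "PROJECT_ID", "00000000-0000-0000-0000-000000000000"
        ),
        "storage-profile": {
            "type": "s3",
            "bucket": env.get("S3_BUCKET", "texera-iceberg"),
            "region": env.get("S3_REGION", "us-west-2"),
            "endpoint": env.get("S3_ENDPOINT", "http://localhost:9000"),
            "flavor": "s3-compat",
            "path-style-access": True,
            "sts-enabled": False,
        },
        "storage-credential": {
            "type": "s3",
            "credential-type": "access-key",
            "aws-access-key-id": env.get("S3_ACCESS_KEY", "texera_minio"),
            "aws-secret-access-key": env.get("S3_SECRET_KEY", "password"),
        },
    }


if __name__ == "__main__":
    # Print the body so a shell step can pipe it to curl via `-d @-`.
    print(json.dumps(build_warehouse_payload(), indent=2))
```

This does not read storage.conf itself; it only moves the literals out of the workflow file into variables that could be set in one place.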

Comment thread .github/workflows/build.yml
mengw15 added 2 commits May 9, 2026 18:40
Wire Lakekeeper + MinIO into the existing `amber-integration` CI job
and add focused integration specs that round-trip table metadata via
the REST catalog from both the Scala and Python clients.

`amber-integration`'s setup now boots MinIO + Lakekeeper (migrate ->
serve -> healthcheck), creates the `texera_lakekeeper` DB, initializes
the warehouse with an S3 storage profile, installs `dev-requirements`
for pytest, and exposes the REST URI / S3 env vars to the test steps.
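
The migrate -> serve -> healthcheck ordering implies that the job waits for Lakekeeper to report healthy before the warehouse is initialized. A minimal sketch of such a wait loop follows; the function names, the retry counts, and the health URL in the usage note are assumptions for illustration, not values from this PR.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready.
        return False


def wait_until_healthy(
    probe: Callable[[], bool], attempts: int = 30, delay_s: float = 1.0
) -> bool:
    """Call `probe` up to `attempts` times, sleeping between failures."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

A CI step could call `wait_until_healthy(lambda: http_probe(health_url))`, where `health_url` is whatever endpoint the Lakekeeper deployment exposes for health checks, and fail the job when the result is `False`.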

- `IcebergRestCatalogIntegrationSpec` (Scala, `@IntegrationTest`) builds
  a `RESTCatalog` directly via `IcebergUtil.createRestCatalog` and
  scopes it through `IcebergCatalogInstance.replaceInstance`, restoring
  the prior singleton in `afterAll`. The catalog *type* is intentionally
  not flipped to `rest` globally, so other `@IntegrationTest` specs in
  the same job keep their existing Postgres-catalog backend.
- `test_iceberg_rest_catalog_integration.py` (Python) mirrors the Scala
  spec and is tagged `pytest.mark.integration` — the Python equivalent
  of the Scala `@IntegrationTest` tag. The regular `python` job runs
  with `-m "not integration"` and excludes it; `amber-integration`
  selects it via `-m integration`. Marker registered in
  `amber/pyproject.toml`.
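
The scoping described in the first bullet (swap in a REST catalog for one spec, then restore the previous singleton) can be sketched in Python as a context manager. `CatalogHolder`, its methods, and `scoped_catalog` are hypothetical stand-ins for illustration, not the actual `IcebergCatalogInstance` API.

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class CatalogHolder:
    """Hypothetical process-wide holder for the active catalog object."""

    _instance: Optional[object] = None

    @classmethod
    def get_instance(cls) -> Optional[object]:
        return cls._instance

    @classmethod
    def replace_instance(cls, catalog: Optional[object]) -> None:
        cls._instance = catalog


@contextmanager
def scoped_catalog(catalog: object) -> Iterator[object]:
    """Install `catalog` for the duration of the block, then restore."""
    previous = CatalogHolder.get_instance()
    CatalogHolder.replace_instance(catalog)
    try:
        yield catalog
    finally:
        # Restore even if the test body raised, so later specs in the
        # same process keep seeing the catalog they expect.
        CatalogHolder.replace_instance(previous)
```

Restoring in `finally` plays the role that `afterAll` plays in the Scala spec: the swap stays local to one test module and cannot leak into other integration tests that rely on the default backend.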

Closes apache#4994
@mengw15 mengw15 force-pushed the Lakekeeper-CI-2 branch from 99b33a4 to 40a8e65 Compare May 10, 2026 01:40

Development

Successfully merging this pull request may close these issues.

Wire Lakekeeper and MinIO into GitHub Actions CI

3 participants