Add query labels via sqlcommenter comment parsing #67
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,15 @@ | ||||||
| -- Migration 001: Add labels column for sqlcommenter support | ||||||
| -- | ||||||
| -- This migration adds the `labels` column to `events_raw` for existing | ||||||
| -- installations. New installations already include this column via | ||||||
| -- docker/init/00-schema.sql. | ||||||
| -- | ||||||
| -- Run with: | ||||||
| -- clickhouse-client < docker/migrations/001_add_labels_column.sql | ||||||
| -- | ||||||
| -- Safe to re-run: ALTER TABLE ADD COLUMN IF NOT EXISTS is idempotent. | ||||||
|
|
||||||
| ALTER TABLE pg_stat_ch.events_raw | ||||||
| ADD COLUMN IF NOT EXISTS labels String DEFAULT '{}' | ||||||
| COMMENT 'Query labels from sqlcommenter comments (key=value pairs in /* */ blocks). Access subpaths directly: labels.controller, labels.action. Empty {} when no labels present. See: https://google.github.io/sqlcommenter/' | ||||||
|
||||||
| COMMENT 'Query labels from sqlcommenter comments (key=value pairs in /* */ blocks). Access subpaths directly: labels.controller, labels.action. Empty {} when no labels present. See: https://google.github.io/sqlcommenter/' | |
| COMMENT 'Query labels from sqlcommenter comments (key=value pairs in /* */ blocks). Extract fields with JSONExtractString(labels, ''controller'') and JSONExtractString(labels, ''action''). Empty {} when no labels present. See: https://google.github.io/sqlcommenter/' |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -80,6 +80,32 @@ Four materialized views provide pre-aggregated analytics: | |||||
|
|
||||||
| For view schemas, query patterns, and the `-State`/`-Merge` aggregation pattern, see [materialized views](/reference/materialized-views). | ||||||
|
|
||||||
| ## Schema migrations | ||||||
|
|
||||||
| When upgrading pg_stat_ch, new columns or schema changes may be required. Migration scripts are provided in [`docker/migrations/`](https://github.com/ClickHouse/pg_stat_ch/tree/main/docker/migrations) and are safe to re-run (idempotent). | ||||||
|
|
||||||
| Apply all pending migrations: | ||||||
|
|
||||||
| ```bash | ||||||
| for f in docker/migrations/*.sql; do | ||||||
| clickhouse-client < "$f" | ||||||
| done | ||||||
| ``` | ||||||
|
|
||||||
| Or apply a specific migration: | ||||||
|
|
||||||
| ```bash | ||||||
| clickhouse-client < docker/migrations/001_add_labels_column.sql | ||||||
| ``` | ||||||
|
|
||||||
| | Migration | Version | Description | | ||||||
| |---|---|---| | ||||||
| | `001_add_labels_column.sql` | 0.2+ | Adds `labels JSON` column for [sqlcommenter](https://google.github.io/sqlcommenter/) query label support | | ||||||
|
||||||
| | `001_add_labels_column.sql` | 0.2+ | Adds `labels JSON` column for [sqlcommenter](https://google.github.io/sqlcommenter/) query label support | | |
| | `001_add_labels_column.sql` | 0.2+ | Adds `labels String DEFAULT '{}'` column for [sqlcommenter](https://google.github.io/sqlcommenter/) query label support | |
| Original file line number | Diff line number | Diff line change | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -30,6 +30,16 @@ The table is partitioned by date (`toDate(ts_start)`) and ordered by `ts_start` | |||||||||
| Query normalization replaces literals with placeholders (`$N`). This means `SELECT * FROM users WHERE id = 42` becomes `SELECT * FROM users WHERE id = $1`. No passwords, tokens, or PII are exported in query text. | ||||||||||
| </Note> | ||||||||||
|
|
||||||||||
| ## Query labels | ||||||||||
|
|
||||||||||
| | Column | Type | Description | | ||||||||||
| |---|---|---| | ||||||||||
| | `labels` | `String DEFAULT '{}'` | Key-value labels extracted from [sqlcommenter](https://google.github.io/sqlcommenter/) comments appended to the query. For example, `/* controller='users',action='show' */` produces `{"controller":"users","action":"show"}`. Access subpaths directly in ClickHouse: `labels.controller`, `labels.action`. Empty `{}` when no labels are present. | | ||||||||||
|
||||||||||
| | `labels` | `String DEFAULT '{}'` | Key-value labels extracted from [sqlcommenter](https://google.github.io/sqlcommenter/) comments appended to the query. For example, `/* controller='users',action='show' */` produces `{"controller":"users","action":"show"}`. Access subpaths directly in ClickHouse: `labels.controller`, `labels.action`. Empty `{}` when no labels are present. | | |
| | `labels` | `String DEFAULT '{}'` | Key-value labels extracted from [sqlcommenter](https://google.github.io/sqlcommenter/) comments appended to the query. For example, `/* controller='users',action='show' */` produces `{"controller":"users","action":"show"}`. Since this column stores JSON as a string, extract values in ClickHouse with functions such as `JSONExtractString(labels, 'controller')` and `JSONExtractString(labels, 'action')`. Empty `{}` when no labels are present. | |
Copilot AI, Apr 13, 2026
This section introduces pg_stat_ch.track_labels and links to /reference/configuration, but that page currently doesn’t mention track_labels (so readers can’t discover details like context/reload semantics). Please add pg_stat_ch.track_labels to the configuration reference and consider adding a short warning that label values are user-supplied and may contain sensitive data if applications put PII/tokens in sqlcommenter comments.
| Labels are parsed from the **last** `/* */` comment block in the query text. The parser supports URL-encoded values and escaped single quotes per the sqlcommenter specification. Controlled by the [`pg_stat_ch.track_labels`](/reference/configuration) GUC (default: `true`). | |
| Labels are parsed from the **last** `/* */` comment block in the query text. The parser supports URL-encoded values and escaped single quotes per the sqlcommenter specification. Collection is controlled by the [`pg_stat_ch.track_labels`](/reference/configuration) GUC, which defaults to `true` and can be changed with a configuration reload (no server restart required). | |
| Label values are user-supplied application metadata. If applications include PII, tokens, session identifiers, or other secrets in sqlcommenter comments, those values may be exported in `labels`. Only enable and use labels with trusted, non-sensitive metadata. |
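A minimal sketch of the parsing behaviour described above (last `/* */` block, `key='value'` pairs, URL-decoded values, backslash-escaped single quotes). This is not the PR's `sqlcommenter_parse` implementation: `Label`, `UrlDecode`, and `ParseLabelsSketch` are hypothetical names chosen for illustration, and the sketch assumes pairs are comma-separated with single-quoted values.

```cpp
#include <cctype>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical label type for this sketch: (key, value).
using Label = std::pair<std::string, std::string>;

// Decode %XX escapes; malformed escapes are copied through unchanged.
std::string UrlDecode(std::string_view in) {
  auto hex = [](char c) -> int {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
  };
  std::string out;
  for (size_t i = 0; i < in.size(); ++i) {
    if (in[i] == '%' && i + 2 < in.size()) {
      int hi = hex(in[i + 1]);
      int lo = hex(in[i + 2]);
      if (hi >= 0 && lo >= 0) {
        out.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
        continue;
      }
    }
    out.push_back(in[i]);
  }
  return out;
}

// Parse key='value' pairs from the last /* */ block of the query text.
std::vector<Label> ParseLabelsSketch(std::string_view query) {
  std::vector<Label> labels;
  size_t end = query.rfind("*/");
  if (end == std::string_view::npos) return labels;
  size_t begin = query.rfind("/*", end);
  if (begin == std::string_view::npos || begin + 2 > end) return labels;
  std::string_view body = query.substr(begin + 2, end - begin - 2);

  size_t i = 0;
  while (i < body.size()) {
    // Skip pair separators and surrounding whitespace.
    while (i < body.size() &&
           (body[i] == ',' || std::isspace(static_cast<unsigned char>(body[i])))) {
      ++i;
    }
    size_t eq = body.find('=', i);
    if (eq == std::string_view::npos) break;
    std::string_view key = body.substr(i, eq - i);
    // The value must be wrapped in single quotes.
    if (eq + 1 >= body.size() || body[eq + 1] != '\'') break;

    std::string raw;
    size_t j = eq + 2;
    bool closed = false;
    for (; j < body.size(); ++j) {
      if (body[j] == '\\' && j + 1 < body.size() && body[j + 1] == '\'') {
        raw.push_back('\'');  // backslash-escaped quote inside the value
        ++j;
      } else if (body[j] == '\'') {
        closed = true;  // unescaped quote ends the value
        break;
      } else {
        raw.push_back(body[j]);
      }
    }
    if (!closed) break;  // unterminated value: stop parsing
    labels.emplace_back(UrlDecode(key), UrlDecode(raw));
    i = j + 1;
  }
  return labels;
}
```

In this sketch a malformed pair stops parsing and keeps the labels collected so far; whether the real parser discards the whole comment in that case is not stated in the PR text shown here.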
Copilot AI, Apr 13, 2026
The labels column is documented as String, but the description claims you can access subpaths via labels.controller / labels.action. That dot-notation requires a ClickHouse JSON-typed column; with String you need functions like JSONExtractString(labels, 'controller'). Please either change the column type to JSON(...) (and keep docs as-is) or update the docs/examples to match a String column.
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -75,10 +75,15 @@ class ClickHouseExporter : public StatsExporter { | |||||||||||||||||||||||||||||||||||||||||||
| shared_ptr<Column<string>> DbOperationColumn() final { return TagString("cmd_type"); } | ||||||||||||||||||||||||||||||||||||||||||||
| shared_ptr<Column<string_view>> DbQueryTextColumn() final { return RecordString("query"); } | ||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||
| void AppendLabels(const ParseResult& labels) final { | ||||||||||||||||||||||||||||||||||||||||||||
| labels_col_->Append(SerializeLabelsJson(labels)); | ||||||||||||||||||||||||||||||||||||||||||||
| } | ||||||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||||||
| void BeginBatch() final { | ||||||||||||||||||||||||||||||||||||||||||||
| block = std::make_unique<clickhouse::Block>(); | ||||||||||||||||||||||||||||||||||||||||||||
| columns.clear(); | ||||||||||||||||||||||||||||||||||||||||||||
| exported_count = 0; | ||||||||||||||||||||||||||||||||||||||||||||
| labels_col_ = Wrap<clickhouse::ColumnString, string_view>("labels"); | ||||||||||||||||||||||||||||||||||||||||||||
|
Comment on lines +79 to +86
|
||||||||||||||||||||||||||||||||||||||||||||
| labels_col_->Append(SerializeLabelsJson(labels)); | |
| } | |
| void BeginBatch() final { | |
| block = std::make_unique<clickhouse::Block>(); | |
| columns.clear(); | |
| exported_count = 0; | |
| labels_col_ = Wrap<clickhouse::ColumnString, string_view>("labels"); | |
| if (labels_col_) { | |
| labels_col_->Append(SerializeLabelsJson(labels)); | |
| } | |
| } | |
| void BeginBatch() final { | |
| block = std::make_unique<clickhouse::Block>(); | |
| columns.clear(); | |
| exported_count = 0; | |
| labels_col_.reset(); | |
| if (pg_stat_ch_track_labels) { | |
| labels_col_ = Wrap<clickhouse::ColumnString, string_view>("labels"); | |
| } |
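For context, the diff calls `SerializeLabelsJson` but its body is not shown here. The following is a hedged sketch of what such a serializer could look like, assuming labels arrive as key/value string pairs. `AppendJsonString` and `SerializeLabelsJsonSketch` are hypothetical names, and the escaping covers only what a JSON string requires (quotes, backslashes, control characters).

```cpp
#include <cstdio>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Append s to out as a quoted JSON string, escaping as JSON requires.
void AppendJsonString(std::string& out, std::string_view s) {
  out.push_back('"');
  for (unsigned char c : s) {
    switch (c) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;
      case '\n': out += "\\n";  break;
      case '\r': out += "\\r";  break;
      case '\t': out += "\\t";  break;
      default:
        if (c < 0x20) {
          // Remaining control characters use the \u00XX form.
          char buf[7];
          std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
          out += buf;
        } else {
          out.push_back(static_cast<char>(c));
        }
    }
  }
  out.push_back('"');
}

// Serialize labels as a flat JSON object; an empty list yields "{}",
// matching the column default described in the migration.
std::string SerializeLabelsJsonSketch(
    const std::vector<std::pair<std::string, std::string>>& labels) {
  std::string out = "{";
  bool first = true;
  for (const auto& [key, value] : labels) {
    if (!first) out.push_back(',');
    first = false;
    AppendJsonString(out, key);
    out.push_back(':');
    AppendJsonString(out, value);
  }
  out.push_back('}');
  return out;
}
```

Because label values are user-supplied, escaping quotes and backslashes matters: an unescaped `"` in a value would produce a string that `JSONExtractString` cannot parse on the ClickHouse side.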
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -6,6 +6,8 @@ | |||||
| #include <string> | ||||||
| #include <string_view> | ||||||
|
|
||||||
| #include "export/sqlcommenter_parse.h" | ||||||
|
|
||||||
| class StatsExporter { | ||||||
| protected: | ||||||
| using string = std::string; | ||||||
|
|
@@ -65,6 +67,10 @@ class StatsExporter { | |||||
| virtual shared_ptr<Column<string>> DbOperationColumn() = 0; | ||||||
| // Query text. CH: RecordString "query"; OTel semconv: "db.query.text". | ||||||
| virtual shared_ptr<Column<string_view>> DbQueryTextColumn() = 0; | ||||||
| // Query labels from sqlcommenter comments. Called inside the event loop. | ||||||
| // CH: serializes to JSON, appends to a String "labels" column; | ||||||
|
||||||
| // CH: serializes to JSON, appends to a String "labels" column; | |
| // CH: serializes to JSON, appends to a JSON "labels" column; |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -129,6 +129,15 @@ class OTelExporter : public StatsExporter { | |
| shared_ptr<Column<string_view>> DbQueryTextColumn() final { | ||
| return Wrap<RecordOnlyColumn<string_view>>("db.query.text"); | ||
| } | ||
| void AppendLabels(const ParseResult& labels) final { | ||
| for (int i = 0; i < labels.count; ++i) { | ||
| string attr_name = "db.query.label."; | ||
| attr_name.append(labels.labels[i].key.data(), labels.labels[i].key.size()); | ||
| string val(labels.labels[i].value); | ||
| current_log_record->SetAttribute(attr_name, val); | ||
| current_row_tags[attr_name] = std::move(val); | ||
| } | ||
|
Comment on lines +135 to +139
|
||
| } | ||
|
|
||
| bool EstablishNewConnection() final; | ||
| bool IsConnected() const final { return metrics_provider && log_provider; } | ||
|
|
||
The readiness polling loop for the OTel collector never fails the step if the collector doesn’t become ready within 30s (it just falls through). Please add a post-loop check (or track a flag) and `exit 1` if the health endpoint never responds, so CI fails fast with a clear error when the collector can’t start.