Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
145 changes: 141 additions & 4 deletions identify-slow-queries.md
Original file line number Diff line number Diff line change
Expand Up @@ -172,12 +172,149 @@ Fields related to storage engines:
- `Storage_from_kv`: introduced in v8.5.5, indicates whether this statement read data from TiKV.
- `Storage_from_mpp`: introduced in v8.5.5, indicates whether this statement read data from TiFlash.

## Use `tidb_slow_log_rules`

[`tidb_slow_log_rules`](/system-variables.md#tidb_slow_log_rules-new-in-v900) is used to define trigger rules for slow query logs, supporting multi-dimensional metric combinations. It is suitable for "targeted sampling" and "problem reproduction" of slow logs, enabling you to filter target statements based on specific metric combinations.

The triggering behavior of slow query logs depends on the configuration of `tidb_slow_log_rules`:

- If `tidb_slow_log_rules` is not set, slow query log triggering still relies on [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold) (in milliseconds).
- If `tidb_slow_log_rules` is set, the configured rules take precedence, and [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold) will be ignored.

For more information about meanings, diagnostic value, and background information of each field, see the [Fields description](#fields-description).

### Unified rule syntax and type constraints

- Rule capacity and separation: `SESSION` and `GLOBAL` each support a maximum of 10 rules. A single session can have up to 20 active rules. Rules are separated by `;`.
- Condition format: each condition uses the format `field_name:value`. Multiple conditions within a single rule are separated by `,`.
- Field and scope: field names are case-insensitive (underscores and other characters are preserved). `SESSION` rules do not support `Conn_ID`. Only `GLOBAL` rules support `Conn_ID`.
- Matching semantics:
- Numeric fields are matched using `>=`. String and boolean fields are matched using equality (`=`).
- Matching for `DB` and `Resource_group` is case-insensitive.
- Explicit operators such as `>`, `<`, and `!=` are not supported.

Type constraints are as follows:

- Numeric types (`int64`, `uint64`, `float64`) uniformly require `>= 0`. Negative values will result in a parsing error.
- `int64`: the maximum value is `2^63-1`.
- `uint64`: the maximum value is `2^64-1`.
- `float64`: the general upper limit is approximately `1.79e308`. Currently, parsing is done using Go's `ParseFloat`. While `NaN`/`Inf` can be parsed, they might lead to rules that are always true or always false. It is not recommended to use them.
- `bool`: supports `true`/`false`, `1`/`0`, and `t`/`f` (case-insensitive).
- `string`: currently does not support strings containing the separators `,` (condition separator) or `;` (rule separator), even with quotes (single or double). Escaping is not supported.
- Duplicate fields: if the same field is specified multiple times in a single rule, the last occurrence takes effect.

### Supported fields

For detailed field descriptions, diagnostic meanings, and background information, see the [field descriptions in `identify-slow-queries`](/identify-slow-queries.md#fields-description).

Unless otherwise noted, the fields in the following table follow the general matching and type rules described in [Unified rule syntax and type constraints](#unified-rule-syntax-and-type-constraints). This table lists only the currently supported field names, types, units, and a few rule-specific notes. It does not repeat each field's semantic meaning.

| Field name | Type | Unit | Notes |
| -------------------------------------- | -------- | ------ | ------------------------------ |
| `Conn_ID` | `uint` | count | Supported only in `GLOBAL` rules |
| `Session_alias` | `string` | none | - |
| `DB` | `string` | none | Case-insensitive when matched |
| `Exec_retry_count` | `uint` | count | - |
| `Query_time` | `float` | second | - |
| `Parse_time` | `float` | second | - |
| `Compile_time` | `float` | second | - |
| `Rewrite_time` | `float` | second | - |
| `Optimize_time` | `float` | second | - |
| `Wait_TS` | `float` | second | - |
| `Is_internal` | `bool` | none | - |
| `Digest` | `string` | none | - |
| `Plan_digest` | `string` | none | - |
| `Num_cop_tasks` | `int` | count | - |
| `Mem_max` | `int` | bytes | - |
| `Disk_max` | `int` | bytes | - |
| `Write_sql_response_total` | `float` | second | - |
| `Succ` | `bool` | none | - |
| `Resource_group` | `string` | none | Case-insensitive when matched |
| `KV_total` | `float` | second | - |
| `PD_total` | `float` | second | - |
| `Unpacked_bytes_sent_tikv_total` | `int` | bytes | - |
| `Unpacked_bytes_received_tikv_total` | `int` | bytes | - |
| `Unpacked_bytes_sent_tikv_cross_zone` | `int` | bytes | - |
| `Unpacked_bytes_received_tikv_cross_zone` | `int` | bytes | - |
| `Unpacked_bytes_sent_tiflash_total` | `int` | bytes | - |
| `Unpacked_bytes_received_tiflash_total` | `int` | bytes | - |
| `Unpacked_bytes_sent_tiflash_cross_zone` | `int` | bytes | - |
| `Unpacked_bytes_received_tiflash_cross_zone` | `int` | bytes | - |
| `Process_time` | `float` | second | - |
| `Backoff_time` | `float` | second | - |
| `Total_keys` | `uint` | count | - |
| `Process_keys` | `uint` | count | - |
| `cop_mvcc_read_amplification` | `float` | ratio | Ratio value (`Total_keys / Process_keys`) |
| `Prewrite_time` | `float` | second | - |
| `Commit_time` | `float` | second | - |
| `Write_keys` | `uint` | count | - |
| `Write_size` | `uint` | bytes | - |
| `Prewrite_region` | `uint` | count | - |

### Effective behavior and matching order

- Rule update behavior: every execution of `SET [SESSION|GLOBAL] tidb_slow_log_rules = '...'` overwrites the existing rules in that scope instead of appending to them.
- Rule clearing behavior: `SET [SESSION|GLOBAL] tidb_slow_log_rules = ''` clears the rules in the corresponding scope.
- If the current session has any applicable `tidb_slow_log_rules`, such as `SESSION` rules, `GLOBAL` rules for the current `Conn_ID`, or generic global rules without `Conn_ID`, the output of slow query logs is determined by rule matching results, and `tidb_slow_log_threshold` is no longer used.
- If the current session has no applicable rules, for example when both `SESSION` and `GLOBAL` rules are empty, or only `GLOBAL` rules that do not match the current `Conn_ID` are configured, slow query logging still depends on `tidb_slow_log_threshold`. Note that the unit is milliseconds.
- If you still want to use SQL execution time as a condition for writing slow logs, use `Query_time` in the rule and note that the unit is seconds.
- Rule matching logic:
- Multiple rules are combined with `OR`, while multiple field conditions within a single rule are combined with `AND`.
- `SESSION`-scope rules are matched first. If none matches, TiDB then matches `GLOBAL` rules for the current `Conn_ID`, followed by generic `GLOBAL` rules without `Conn_ID`.
- `SHOW VARIABLES LIKE 'tidb_slow_log_rules'` and `SELECT @@SESSION.tidb_slow_log_rules` return the `SESSION` rule text, or an empty string if unset. `SELECT @@GLOBAL.tidb_slow_log_rules` returns the `GLOBAL` rule text.

### Examples

- Standard format (`SESSION` scope):

```sql
SET SESSION tidb_slow_log_rules = 'Query_time: 0.5, Is_internal: false';
```

- Invalid format (`SESSION` scope does not support `Conn_ID`):

```sql
SET SESSION tidb_slow_log_rules = 'Conn_ID: 12, Query_time: 0.5, Is_internal: false';
```

- Global rule (applies to all connections):

```sql
SET GLOBAL tidb_slow_log_rules = 'Query_time: 0.5, Is_internal: false';
```

- Global rules for specific connections (applied separately to the two connections `Conn_ID:11` and `Conn_ID:12`):

```sql
SET GLOBAL tidb_slow_log_rules = 'Conn_ID: 11, Query_time: 0.5, Is_internal: false; Conn_ID: 12, Query_time: 0.6, Process_time: 0.3, DB: db1';
```

### Recommendations

- `tidb_slow_log_rules` is designed to replace the single-threshold approach. It supports combinations of multi-dimensional metric conditions, enabling more flexible and fine-grained control over slow query logging.

- In a well-provisioned test environment with 1 TiDB node (16 CPU cores, 48 GiB memory) and 3 TiKV nodes (each with 16 CPU cores and 48 GiB memory), repeated sysbench tests show that performance impact remains small when multi-dimensional slow query log rules generate millions of slow log entries within 30 minutes. However, when the log volume reaches tens of millions, TPS drops significantly and latency increases noticeably. Therefore, if business workload is high or CPU and memory resources are close to their limits, configure `tidb_slow_log_rules` carefully to avoid log flooding caused by overly broad rules. If you need to limit the log output rate, use [`tidb_slow_log_max_per_sec`](/system-variables.md#tidb_slow_log_max_per_sec-new-in-v900) to throttle it and reduce the impact on business performance.

## Related system variables

* [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold): Sets the threshold for the slow log. The SQL statement whose execution time exceeds this threshold is recorded in the slow log. The default value is 300 (ms).
* [`tidb_query_log_max_len`](/system-variables.md#tidb_query_log_max_len): Sets the maximum length of the SQL statement recorded in the slow log. The default value is 4096 (byte).
* [tidb_redact_log](/system-variables.md#tidb_redact_log): Determines whether to desensitize user data using `?` in the SQL statement recorded in the slow log. The default value is `0`, which means to disable the feature.
* [`tidb_enable_collect_execution_info`](/system-variables.md#tidb_enable_collect_execution_info): Determines whether to record the physical execution information of each operator in the execution plan. The default value is `1`. This feature impacts the performance by approximately 3%. After enabling this feature, you can view the `Plan` information as follows:
* [`tidb_slow_log_rules`](/system-variables.md#tidb_slow_log_rules-new-in-v900): see [`tidb_slow_log_rules` recommendations](#recommendations)

* [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold): sets the threshold for slow query logging. SQL statements whose execution time exceeds this threshold are recorded in the slow query log. The default value is `300ms` (milliseconds).

> **Tip:**
>
> Time-related fields in `tidb_slow_log_rules`, such as `Query_time` and `Process_time`, use seconds as the unit and can include decimals, while [`tidb_slow_log_threshold`](/system-variables.md#tidb_slow_log_threshold) uses milliseconds.

* [`tidb_slow_log_max_per_sec`](/system-variables.md#tidb_slow_log_max_per_sec-new-in-v900): sets the maximum number of slow query log entries that can be written per second. The default value is `0`. This variable is introduced in v9.0.0.
* A value of `0` means there is no limit on the number of slow query log entries written per second.
* A value greater than `0` means TiDB writes at most the specified number of slow query log entries per second. Any excess log entries are discarded and not written to the slow query log file.
* It is recommended to set this variable after enabling `tidb_slow_log_rules` to prevent rule-based slow query logging from being triggered too frequently.

* [`tidb_query_log_max_len`](/system-variables.md#tidb_query_log_max_len): sets the maximum length of the SQL statement recorded in the slow query log. The default value is 4096 (byte).

* [`tidb_redact_log`](/system-variables.md#tidb_redact_log): controls whether user data in SQL statements recorded in the slow query log is redacted and replaced with `?`. The default value is `0`, which means this feature is disabled.

* [`tidb_enable_collect_execution_info`](/system-variables.md#tidb_enable_collect_execution_info): controls whether to record the physical execution information of each operator in the execution plan. The default value is `1`. This feature impacts the performance by approximately 3%. After enabling this feature, you can view the `Plan` information as follows:

```sql
> select tidb_decode_plan('jAOIMAk1XzE3CTAJMQlmdW5jczpjb3VudChDb2x1bW4jNyktPkMJC/BMNQkxCXRpbWU6MTAuOTMxNTA1bXMsIGxvb3BzOjIJMzcyIEJ5dGVzCU4vQQoxCTMyXzE4CTAJMQlpbmRleDpTdHJlYW1BZ2dfOQkxCXQRSAwyNzY4LkgALCwgcnBjIG51bTogMQkMEXMQODg0MzUFK0hwcm9jIGtleXM6MjUwMDcJMjA2HXsIMgk1BWM2zwAAMRnIADcVyAAxHcEQNQlOL0EBBPBbCjMJMTNfMTYJMQkzMTI4MS44NTc4MTk5MDUyMTcJdGFibGU6dCwgaW5kZXg6aWR4KGEpLCByYW5nZTpbLWluZiw1MDAwMCksIGtlZXAgb3JkZXI6ZmFsc2UJMjUBrgnQVnsA');
Expand Down
28 changes: 28 additions & 0 deletions system-variables.md
Original file line number Diff line number Diff line change
Expand Up @@ -5805,6 +5805,34 @@ Query OK, 0 rows affected, 1 warning (0.00 sec)
>
> If the character check is skipped, TiDB might fail to detect invalid UTF-8 characters written by the application, cause decoding errors when `ANALYZE` is executed, and introduce other unknown encoding issues. If your application cannot guarantee the validity of the written string, it is not recommended to skip the character check.

### tidb_slow_log_max_per_sec <span class="version-mark">New in v8.5.6</span>

- Scope: GLOBAL
- Persists to cluster: Yes
- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
- Default value: `0`
- Type: Integer
- Range: `[0, 1000000]`
- This variable controls the maximum number of slow query log entries that can be written per TiDB node per second.
- A value of `0` means there is no limit on the number of slow query log entries written per second.
- A value greater than `0` means TiDB writes at most the specified number of slow query log entries per second. Any excess log entries are discarded and not written to the slow query log file.
- This variable is often used with [`tidb_slow_log_rules`](#tidb_slow_log_rules-new-in-v856) to prevent excessive slow query logs from being generated under high-workload conditions.

### tidb_slow_log_rules <span class="version-mark">New in v8.5.6</span>

- Scope: SESSION | GLOBAL
- Persists to cluster: Yes
- Applies to hint [SET_VAR](/optimizer-hints.md#set_varvar_namevar_value): No
- Default value: ""
- Type: String
- This variable defines the triggering rules for slow query logs. It supports combining multi-dimensional metrics to provide more flexible and fine-grained logging.
- For more information about how to use this system variable, see [Use `tidb_slow_log_rules`](/identify-slow-queries.md#use-tidb_slow_log_rules).

> **Tip:**
>
> - When enabling `tidb_slow_log_rules` in a production environment, it is recommended to also configure [`tidb_slow_log_max_per_sec`](#tidb_slow_log_max_per_sec-new-in-v856) to avoid excessively frequent slow query log printing.
> - It is recommended to start with stricter conditions and gradually relax them based on troubleshooting needs. For more information on performance impact, see [Recommendations](/identify-slow-queries.md#recommendations).

### tidb_slow_log_threshold

> **Note:**
Expand Down
Loading