[kafka-connect]: Add troubleshooting entry on handling large connection counts #6158

Open
kurnoolsaketh wants to merge 2 commits into main from improvement/kafka-connect-connection-count-troubleshooting

@kurnoolsaketh
Contributor

Summary

Adds an entry addressing an edge case in which high data volume connector deployments may result in a large number of open connections to ClickHouse.

@kurnoolsaketh kurnoolsaketh requested review from a team as code owners May 6, 2026 01:17
@vercel

vercel Bot commented May 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| clickhouse-docs | Ready | Preview, Comment | May 6, 2026 1:36am |
| clickhouse-docs-jp | Building | Preview, Comment | May 6, 2026 1:36am |

3 Skipped Deployments

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| clickhouse-docs-ko | Ignored | Preview | May 6, 2026 1:36am |
| clickhouse-docs-ru | Ignored | Preview | May 6, 2026 1:36am |
| clickhouse-docs-zh | Ignored | Preview | May 6, 2026 1:36am |

High insertion frequency may result in many open connections to your database. This is common in large task count or distributed connector deployments where the insert rate to a single ClickHouse instance is high. In ClickHouse Cloud, a common symptom of this issue is requests being rate limited by the cloud proxy/load balancer.

Some strategies to reduce the number of open connections are:
1. Adjust the connection pool settings on the Java client (note that these may reduce overall throughput):
Contributor Author

@chernser are java client options configurable on the connector?

Contributor

Yes, they are configured via `jdbcConnectionProperties`, which is similar to a JDBC URL. However, if the configuration contains an unknown property, it may be treated as a ClickHouse setting for V1.
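
As a rough illustration of the reply above, a sink connector configuration might pass the two Java client pool options through `jdbcConnectionProperties`. This is a sketch under stated assumptions, not a verified configuration: the URL-style `?key=value&key=value` format, the pass-through of these keys to the Java client, and the values `5` and `60000` (assumed to be milliseconds) should all be checked against the connector documentation for the version in use.

```properties
# Hypothetical sink connector settings; the values are illustrative only.
# Assumes jdbcConnectionProperties takes a URL-style query string and forwards
# these keys to the Java client (not confirmed for every connector version).
jdbcConnectionProperties=?max_open_connections=5&connection_ttl=60000
```

Lowering `max_open_connections` caps the pool for each task, so the setting is best tuned together with the task count, and it may reduce throughput.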

transforms.keyToValue.field=_key
```

#### "There are too many open connections to my ClickHouse instance" {#too-many-open-connections}
Contributor

I would give it a shorter name: "Too many DB connections".

```

#### "There are too many open connections to my ClickHouse instance" {#too-many-open-connections}
High insertion frequency may result in many open connections to your database. This is common in large task count or distributed connector deployments where the insert rate to a single ClickHouse instance is high. In ClickHouse Cloud, a common symptom of this issue is requests being rate limited by the cloud proxy/load balancer.
Contributor

This needs a careful explanation, because there may be many records (which we handle fine) and there may be many small batches.

The current text states the problem far too broadly and does not explain why it happens.

It may be worth structuring the entry this way:

<Description>

Symptoms on-prem:
Symptoms on cloud:
Metrics to check:

<What configuration to change, how to troubleshoot>

- `max_open_connections`: defaults to 10. Reducing this will bound the number of open connections per task.
- `connection_ttl`: defaults to -1 (no ttl). Setting this to >0 will eagerly reclaim connections after the ttl expires.

See the [Java client Connection & Endpoints configuration tab](https://clickhouse.com/docs/integrations/language-clients/java/client#configuration) for more details.
Contributor

The Java client is not configured via the endpoint.
But the troubleshooting guide expects the user to know how to configure the client in the connector. If that is not explained earlier, it should be fixed.


See the [Java client Connection & Endpoints configuration tab](https://clickhouse.com/docs/integrations/language-clients/java/client#configuration) for more details.

2. Increase `bufferCount`: in high data volume/throughput deployments, this will increase the number of records buffered between inserts and reduce the frequency of insert queries sent to your database. This will reduce the number of new connections needed to write to your database.
Contributor

Is it really about data volume and throughput?
It is not: before that, the sink was handling big batches just fine.


2 participants