@PlagueCZ (Contributor) commented Dec 4, 2025

Testing this is a bit complicated because the VM is usually the weakest link, so it requires a large number of VMs. Alternatively, one can send each packet out 10 times. This of course breaks communication, but in local testing it works as intended.
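For readers unfamiliar with the trick, sending each packet out 10 times can be pictured roughly as below. This is only an illustration in Python, not dpservice code; the names `amplify` and `amplified_rate` are hypothetical, and where exactly the duplication happens in the real test setup is not stated here.

```python
from itertools import chain, repeat


def amplify(packets, copies=10):
    """Yield every packet `copies` times in a row (hypothetical helper)."""
    return chain.from_iterable(repeat(pkt, copies) for pkt in packets)


def amplified_rate(base_pps, copies=10):
    """Packet rate seen by the TX queue when each packet is sent `copies` times."""
    return base_pps * copies


# Two packets, each duplicated three times.
burst = list(amplify(["pkt-a", "pkt-b"], copies=3))
```

Under this picture, a source that can only produce 30 kp/s would put roughly 300 kp/s on the TX queue with the default of 10 copies, which is why the receiving side no longer sees a valid conversation.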

I was able to reach 200-300 kp/s before the error messages started appearing.

Increasing the size of the TX queue by a factor of 4 moves the point where my local tests start failing up to 1 Mp/s, and in our real-life scenario it makes all problems go away, apart from the occasional VM being unable to cope with the traffic.

In real-life deployment, a factor of 4 still occasionally produced an error, while a factor of 8 did not. The memory allocation is not that big, so I would recommend using the larger factor for future-proofing. (It is a 130-200 MB increase in alloc_size, which still fits into the 1 GiB total heap size, and OSC reserves 2 GiB for dpservice anyway.)
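The heap budget argument can be checked with a few lines of arithmetic. The sketch below is illustrative only: the baseline allocation is a made-up number (the comment only gives the size of the increase), and it assumes "MB" means decimal megabytes.

```python
GIB = 1024 ** 3  # heap size is quoted in GiB
MB = 10 ** 6     # assumption: "MB" above means decimal megabytes


def heap_headroom(baseline_bytes, increase_bytes, heap_bytes=GIB):
    """Bytes left in the heap after the increase (negative means it does not fit)."""
    return heap_bytes - (baseline_bytes + increase_bytes)


# Hypothetical baseline; only the 130-200 MB increase is known.
ASSUMED_BASELINE = 600 * MB

# Worst case from the quoted range: a 200 MB increase.
worst_case = heap_headroom(ASSUMED_BASELINE, 200 * MB)
```

With these assumed numbers the worst case still leaves a positive headroom, and the check turns negative only once the baseline plus the increase passes 1 GiB.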

Increasing it further is not really possible because we run into a limit in DPDK's mlx5 driver (see the log message below); fortunately, this seems to be enough.

EAL log message, eal_msg: mlx5_net: port 0 Tx WQEBB count (65536) exceeds the limit (32768), try smaller queue size
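One way to read that log line, as a rough sketch: if the WQEBB count scales linearly with the queue size, and if the rejected attempt was a factor of 16 (both are assumptions, neither is stated above), then a factor of 8 lands exactly on the limit and is the largest one the driver accepts. The function name and the candidate factors are hypothetical.

```python
WQEBB_LIMIT = 32768     # limit reported in the log message
REJECTED_WQEBB = 65536  # count reported in the log message

# Assumption: the rejected attempt used a factor of 16 and scaling is linear.
ASSUMED_REJECTED_FACTOR = 16
ASSUMED_BASE_WQEBB = REJECTED_WQEBB // ASSUMED_REJECTED_FACTOR


def largest_allowed_factor(base_wqebb, limit, factors=(1, 2, 4, 8, 16)):
    """Return the largest candidate factor whose WQEBB count does not exceed the limit."""
    allowed = [f for f in factors if base_wqebb * f <= limit]
    if not allowed:
        raise ValueError("even the smallest factor exceeds the limit")
    return max(allowed)


best = largest_allowed_factor(ASSUMED_BASE_WQEBB, WQEBB_LIMIT)
```

Under those assumptions `best` comes out as 8, which matches the factor recommended above; with a different base count the result would of course differ.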

Fixes #748

@github-actions bot added the size/XS and bug labels Dec 4, 2025
@PlagueCZ marked this pull request as ready for review December 4, 2025 21:35
@PlagueCZ requested a review from a team as a code owner December 4, 2025 21:35
@guvenc (Collaborator) left a comment

LGTM

@guvenc merged commit f0c0ac0 into main Dec 5, 2025
8 checks passed
@guvenc deleted the fix/tx_queue_size branch December 5, 2025 13:14
@github-project-automation bot moved this to Done in Roadmap Dec 5, 2025
Labels: area/networking, bug, size/XS

Project status: Done

Linked issue: Unable to send packets under heavy load