
FFTO Implementation: MySQL & PostgreSQL Protocol Observation in Fast-Forward#5393

Open
renecannao wants to merge 12 commits into v3.0 from v3.0-ff_inspect

Conversation

@renecannao
Contributor

@renecannao renecannao commented Feb 19, 2026

Summary

This PR completes the implementation of the Fast Forward Traffic Observer (FFTO) for both MySQL and PostgreSQL engines. FFTO allows ProxySQL to collect and report detailed query statistics even when a connection is in Fast-Forward mode, which previously bypassed all statistics gathering.

Key Features

  • MySQL Support: Implemented a state machine in MySQLFFTO to handle both the Text and Binary (Prepared Statement) protocols. It extracts query digests from COM_QUERY, COM_STMT_PREPARE, and COM_STMT_EXECUTE, and tracks row counts for result sets.
  • PostgreSQL Support: Implemented PgSQLFFTO to handle the Simple and Extended Query protocols (Parse/Bind/Execute), correctly extracting row counts from the CommandComplete message.
  • Resource Safety: Added a bypass mechanism controlled by ffto_enabled and ffto_max_buffer_size. This ensures that sessions transferring large payloads (e.g., BLOBs) automatically disable FFTO to prevent excessive buffering.
  • Full Metric Parity: Query digests now correctly report sum_rows_affected and sum_rows_sent for fast-forwarded queries.
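The observer interface referenced throughout this PR exposes three callbacks: on_client_data, on_server_data, and on_close, fed with raw bytes in each direction. A minimal sketch of that shape follows; only the callback names and the (const char*, size_t) signatures come from this PR, while the destructor form and the byte-counting subclass are purely illustrative:

```cpp
#include <cstddef>

// Sketch of the TrafficObserver interface described in this PR; only the
// three callback names and the (const char*, size_t) signatures are taken
// from the PR text. The destructor form and the subclass are illustrative.
class TrafficObserver {
public:
    virtual ~TrafficObserver() = default;
    virtual void on_client_data(const char* data, size_t len) = 0;
    virtual void on_server_data(const char* data, size_t len) = 0;
    virtual void on_close() = 0;
};

// Hypothetical concrete observer used only to demonstrate the lifecycle:
// it counts bytes per direction and records that on_close() was invoked.
class CountingObserver : public TrafficObserver {
public:
    size_t client_bytes = 0;
    size_t server_bytes = 0;
    bool closed = false;
    void on_client_data(const char*, size_t len) override { client_bytes += len; }
    void on_server_data(const char*, size_t len) override { server_bytes += len; }
    void on_close() override { closed = true; }
};
```

A session owns one such observer behind a unique_ptr and simply forwards the bytes it is already fast-forwarding, which is why the feature can stay fully passive.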

Testing & Verification

  • Added test_ffto_mysql-t and test_ffto_pgsql-t to verify metrics.
  • Added test_ffto_bypass-t to verify resource safety thresholds.
  • Verified 100% pass rate for all new TAP tests.

Documentation

  • Added detailed Doxygen headers to MySQLFFTO.hpp, PgSQLFFTO.hpp, and TrafficObserver.hpp.
  • Documented internal protocol parsing logic in MySQLFFTO.cpp and PgSQLFFTO.cpp.

Summary by CodeRabbit

  • New Features

    • Introduced FFTO: observes fast-forward traffic for MySQL and PostgreSQL and records query metrics (duration, rows affected/sent).
    • Runtime controls added: ffto_enabled and ffto_max_buffer_size with automatic bypass for oversized queries.
  • Documentation

    • Added comprehensive FFTO design and configuration guide.
  • Tests

    • Added end-to-end tests for FFTO (MySQL, PostgreSQL) and a bypass test.
  • Chores

    • Added an informational startup log and updated ignore patterns.

- Added TrafficObserver base interface
- Implemented MySQLFFTO and PgSQLFFTO classes for protocol-aware observation
- Integrated FFTO hooks into MySQL and PostgreSQL session handlers in FAST_FORWARD state
- Added configuration variables: mysql-ffto_enabled, mysql-ffto_max_buffer_size, pgsql-ffto_enabled, pgsql-ffto_max_buffer_size
- Implemented query digest extraction and reporting parity with internal stats
- Added session-level bypass when packet size exceeds buffer threshold
- Included architectural documentation in doc/ffto_design.md
- Implemented statement ID tracking for MySQL COM_STMT_PREPARE/EXECUTE
- Implemented Parse/Bind/Execute portal mapping for PostgreSQL Extended Query
- Corrected documentation regarding binary protocol support
- Ensured metric parity for prepared statements in Fast Forward mode
- Added comprehensive MySQL FFTO test (text and binary protocols)
- Added comprehensive PgSQL FFTO test (Simple and Extended Query protocols)
- Added FFTO memory bypass logic test
- Added bash orchestration scripts for each test group
- Integrated new tests into the TAP Makefile
… scripts

Following the CI convention where all binaries ending in -t in test/tap/tests/ are automatically executed, the subdirectories and run.sh scripts are no longer needed.
- Implemented MySQL OK_Packet parsing to extract affected_rows
- Implemented PostgreSQL CommandComplete parsing to extract rows affected/sent
- Updated TAP tests to validate sum_rows_affected and sum_rows_sent metrics
- Fixed incorrect assertion in test_ffto_mysql-t.cpp
- Resolved signature mismatch in report_query_stats
- Fixed missing type declarations for Query_Processor in FFTO classes
- Verified Metric Parity for affected_rows and rows_sent
- Confirmed full coverage in TAP tests for text/binary protocols and bypass logic
- Verified clean build with make debug
…andling

- Finalized MySQL and PostgreSQL protocol state machines.
- Implemented accurate affected_rows and rows_sent capture.
- Added defensive null checks to prevent early-session crashes.
- Enhanced TAP tests with result-set draining and success verification.
- Verified 100% pass rate for MySQL CRUD, binary protocol, and memory bypass.
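The OK_Packet parsing mentioned in the commits above depends on MySQL's length-encoded integers, since affected_rows is stored in that form; a read_lenenc_int helper is also named later in the changelog. A hedged sketch of such a reader, following the documented length-encoded integer format (the function name matches the changelog entry, but this body is illustrative, not the PR's code):

```cpp
#include <cstdint>
#include <cstddef>

// Read a MySQL length-encoded integer from buf. Returns true on success and
// sets out/consumed; returns false if more bytes are needed or the prefix is
// the NULL marker (0xfb). Format: a first byte < 0xfb is the value itself;
// 0xfc, 0xfd, 0xfe prefix 2-, 3-, and 8-byte little-endian integers.
static bool read_lenenc_int(const unsigned char* buf, size_t len,
                            uint64_t& out, size_t& consumed) {
    if (len == 0) return false;
    const unsigned char first = buf[0];
    if (first < 0xfb) { out = first; consumed = 1; return true; }
    size_t n;
    switch (first) {
        case 0xfc: n = 2; break;
        case 0xfd: n = 3; break;
        case 0xfe: n = 8; break;
        default:   return false; // 0xfb: NULL marker, not an integer
    }
    if (len < 1 + n) return false; // fragmented: keep buffering
    out = 0;
    for (size_t i = 0; i < n; i++) out |= (uint64_t)buf[1 + i] << (8 * i);
    consumed = 1 + n;
    return true;
}
```

The "more bytes needed" return path matters for an observer: fast-forwarded buffers can split a packet anywhere, so a failed read means wait, not error.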
@coderabbitai

coderabbitai bot commented Feb 19, 2026

📝 Walkthrough

Adds FFTO (Fast Forward Traffic Observer) infrastructure: a TrafficObserver interface and MySQL/PgSQL concrete observers that parse fast-forwarded protocol data, extract query metrics (duration, affected rows, rows sent), report digests, and integrate into session/thread handlers with runtime controls and buffer-size bypassing.

Changes

Cohort / File(s) Summary
Core interface
include/TrafficObserver.hpp
New abstract TrafficObserver with callbacks on_client_data, on_server_data, and on_close.
MySQL FFTO headers/impl
include/MySQLFFTO.hpp, lib/MySQLFFTO.cpp
New MySQLFFTO class and implementation: state-machine parsing of MySQL packets, prepared-statement tracking, metrics extraction, and digest reporting.
PostgreSQL FFTO headers/impl
include/PgSQLFFTO.hpp, lib/PgSQLFFTO.cpp
New PgSQLFFTO class and implementation: message-oriented parsing for simple/extended protocol, portal/statement mappings, row-count extraction, and digest reporting.
Session integrations
include/MySQL_Session.h, lib/MySQL_Session.cpp, include/PgSQL_Session.h, lib/PgSQL_Session.cpp
Add m_ffto (unique_ptr), ffto_bypassed flags, lifecycle handling (on_close), and feed client/server bytes into FFTO within fast-forward paths with buffer-size bypass logic.
Thread/config plumbing
include/MySQL_Thread.h, lib/MySQL_Thread.cpp, include/PgSQL_Thread.h, lib/PgSQL_Thread.cpp, include/proxysql_structs.h
Introduce ffto_enabled (bool) and ffto_max_buffer_size (int) thread-level variables, expose them through variable lists/refresh logic and thread-local externs.
Build & misc
lib/Makefile, .gitignore, src/main.cpp
Add MySQLFFTO.oo and PgSQLFFTO.oo to build, add .gemini to .gitignore, and a startup log line in main.
Design doc
doc/ffto_design.md
New comprehensive design document describing FFTO goals, architecture, protocol handling, performance/security considerations, and integration hooks.
Tests
test/tap/tests/Makefile, test/tap/tests/test_ffto_mysql-t.cpp, test/tap/tests/test_ffto_pgsql-t.cpp, test/tap/tests/test_ffto_bypass-t.cpp
Add end-to-end tests: MySQL (text/binary), PostgreSQL (simple/extended), and bypass behavior when queries exceed buffer limits; update test targets.
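Several of the file summaries above mention state-machine parsing of MySQL packets reassembled from fragmented buffers. Standard MySQL wire framing (a 3-byte little-endian payload length followed by a 1-byte sequence id) can be sketched as follows; the struct and function names here are illustrative, not ProxySQL internals:

```cpp
#include <cstdint>
#include <cstddef>

// One MySQL wire packet: header = 3-byte little-endian payload length
// followed by a 1-byte sequence id, then the payload itself.
struct MySQLFrame {
    uint32_t payload_len;
    uint8_t  seq;
    const unsigned char* payload;
};

// Try to extract the next complete packet from a (possibly fragmented)
// buffer. Returns false when more bytes are needed, which is exactly the
// case an observer must handle by buffering across on_*_data calls.
static bool next_mysql_frame(const unsigned char* buf, size_t len,
                             MySQLFrame& f, size_t& consumed) {
    if (len < 4) return false; // header incomplete
    f.payload_len = (uint32_t)buf[0]
                  | ((uint32_t)buf[1] << 8)
                  | ((uint32_t)buf[2] << 16);
    f.seq = buf[3];
    if (len < 4 + (size_t)f.payload_len) return false; // payload incomplete
    f.payload = buf + 4;
    consumed = 4 + f.payload_len;
    return true;
}
```

This also shows why an ffto_max_buffer_size bound is needed: a frame whose declared payload exceeds the bound would otherwise force the observer to accumulate it indefinitely.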

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Session as MySQL/PgSQL<br/>Session
    participant Observer as MySQLFFTO/<br/>PgSQLFFTO
    participant Backend
    participant QP as Query<br/>Processor

    rect rgba(120, 180, 240, 0.5)
    Client->>Session: send query bytes (fast-forward)
    activate Session
    Session->>Observer: on_client_data(buf)
    activate Observer
    Observer->>Observer: parse/track query, start timer
    deactivate Observer
    Session->>Backend: forward bytes
    deactivate Session
    end

    rect rgba(200, 160, 120, 0.5)
    Backend->>Session: response bytes
    activate Session
    Session->>Observer: on_server_data(buf)
    activate Observer
    Observer->>Observer: parse response, extract rows/OK/CommandComplete
    Observer->>Observer: finalize metrics
    Observer->>QP: update_query_digest(digest, duration, rows_sent, affected_rows)
    deactivate Observer
    Session->>Client: forward response
    deactivate Session
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I hopped through bytes both short and long,

I counted rows and timed each song,
MySQL, Postgres — I kept the score,
Digests stitched and sent once more,
A tiny rabbit cheering metrics galore! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 20.45%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled.
Title Check | ✅ Passed | The title clearly and concisely describes the main change: implementing FFTO for MySQL and PostgreSQL protocol observation in Fast-Forward mode, which aligns with the substantial protocol-parsing and state-machine additions across both database engines.



@gemini-code-assist

Summary of Changes

Hello @renecannao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the Fast Forward Traffic Observer (FFTO) into ProxySQL, enabling detailed query statistics collection for both MySQL and PostgreSQL connections even when they are operating in fast-forward mode. Previously, fast-forwarded connections bypassed all statistics gathering. The FFTO passively observes the protocol streams to extract critical metadata like SQL digests, execution latency, affected rows, and rows sent, ensuring metric parity with standard query processing while maintaining high performance and resource safety through a configurable bypass mechanism for large payloads.

Highlights

  • MySQL Support: Implemented a state machine to handle Text and Binary (Prepared Statement) protocols, extracting query details and tracking result sets.
  • PostgreSQL Support: Implemented protocol handling for Simple and Extended Query protocols (Parse/Bind/Execute), correctly extracting row counts.
  • Resource Safety: Introduced a bypass mechanism for sessions with large payloads to prevent excessive buffering, controlled by *-ffto_max_buffer_size.
  • Full Metric Parity: Ensured query digests for fast-forwarded queries now correctly report affected_rows and rows_sent.


Changelog
  • .gitignore
    • Added '.gemini' to the list of ignored files.
  • doc/ffto_design.md
    • Added a new document outlining the architecture and design of the Fast Forward Traffic Observer (FFTO).
    • Detailed the executive summary, design goals, configuration variables, system architecture (TrafficObserver interface, MySQLFFTO, PgSQLFFTO), protocol/security details, performance/memory management, configuration/verification, protocol support, and limitations of FFTO.
  • include/MySQLFFTO.hpp
    • Added a new header file defining the MySQLFFTO class.
    • Declared methods for on_client_data, on_server_data, and on_close to implement the TrafficObserver interface.
    • Defined internal state variables for MySQL protocol parsing, including State enum, buffers, current query tracking, and prepared statement mapping.
  • include/MySQL_Session.h
    • Included a forward declaration for TrafficObserver.
    • Added std::unique_ptr<TrafficObserver> m_ffto to manage the FFTO instance for the session.
    • Added bool ffto_bypassed to track if FFTO has been disabled for the session due to large payloads.
  • include/MySQL_Thread.h
    • Added bool ffto_enabled and int ffto_max_buffer_size members to MySQL_Threads_Handler to control FFTO behavior.
  • include/PgSQLFFTO.hpp
    • Added a new header file defining the PgSQLFFTO class.
    • Declared methods for on_client_data, on_server_data, and on_close to implement the TrafficObserver interface.
    • Defined internal state variables for PostgreSQL protocol parsing, including State enum, buffers, current query tracking, and prepared statement/portal mapping.
  • include/PgSQL_Session.h
    • Included a forward declaration for TrafficObserver.
    • Added std::unique_ptr<TrafficObserver> m_ffto to manage the FFTO instance for the session.
    • Added bool ffto_bypassed to track if FFTO has been disabled for the session due to large payloads.
  • include/PgSQL_Thread.h
    • Added bool ffto_enabled and int ffto_max_buffer_size members to PgSQL_Threads_Handler to control FFTO behavior.
  • include/TrafficObserver.hpp
    • Added a new header file defining the TrafficObserver abstract base class.
    • Declared pure virtual methods on_client_data, on_server_data, and on_close for observing traffic.
  • include/proxysql_structs.h
    • Declared thread-local mysql_thread___ffto_enabled and mysql_thread___ffto_max_buffer_size variables.
    • Declared thread-local pgsql_thread___ffto_enabled and pgsql_thread___ffto_max_buffer_size variables.
  • lib/Makefile
    • Added MySQLFFTO.oo and PgSQLFFTO.oo to the _OBJ_CXX list for compilation.
  • lib/MySQLFFTO.cpp
    • Implemented the MySQLFFTO class methods.
    • Included logic for reassembling MySQL packets from fragmented buffers.
    • Implemented state machine for COM_QUERY, COM_STMT_PREPARE, COM_STMT_EXECUTE commands and their responses.
    • Added read_lenenc_int helper function for parsing MySQL protocol.
    • Implemented report_query_stats to integrate with ProxySQL's query digest system, tracking affected rows and rows sent.
  • lib/MySQL_Session.cpp
    • Included MySQLFFTO.hpp.
    • Initialized ffto_bypassed to false in the constructor.
    • Modified the destructor to call m_ffto->on_close() if an FFTO instance exists.
    • Integrated FFTO logic into GPFC_Statuses2 and GPFC_WaitingClientData_FastForwardSession for client data processing in FAST_FORWARD state.
    • Added logic to create MySQLFFTO instance and call on_client_data or on_server_data based on ffto_enabled and ffto_max_buffer_size.
    • Implemented bypass mechanism: if pkt.size exceeds ffto_max_buffer_size, ffto_bypassed is set to true and m_ffto is reset.
    • Integrated FFTO logic into handler() for server data processing in FAST_FORWARD state.
  • lib/MySQL_Thread.cpp
    • Added "ffto_enabled" and "ffto_max_buffer_size" to mysql_thread_variables_names.
    • Set default values for ffto_enabled (true) and ffto_max_buffer_size (1MB) in the constructor.
    • Registered ffto_enabled as a boolean variable and ffto_max_buffer_size as an integer variable in get_variables_list.
    • Added REFRESH_VARIABLE_BOOL(ffto_enabled) and REFRESH_VARIABLE_INT(ffto_max_buffer_size) to refresh_variables.
  • lib/PgSQLFFTO.cpp
    • Implemented the PgSQLFFTO class methods.
    • Included logic for reassembling PostgreSQL messages from fragmented buffers.
    • Implemented state machine for Query ('Q'), Parse ('P'), Bind ('B'), Execute ('E') client messages and CommandComplete ('C'), ReadyForQuery ('Z'), ErrorResponse ('E') server messages.
    • Added extract_pg_rows_affected helper function to parse CommandComplete tags.
    • Implemented report_query_stats to integrate with ProxySQL's query digest system, tracking affected rows and rows sent.
  • lib/PgSQL_Session.cpp
    • Included PgSQLFFTO.hpp.
    • Initialized ffto_bypassed to false in the constructor.
    • Modified the destructor to call m_ffto->on_close() if an FFTO instance exists.
    • Integrated FFTO logic into get_pkts_from_client for client data processing in FAST_FORWARD state.
    • Added logic to create PgSQLFFTO instance and call on_client_data or on_server_data based on ffto_enabled and ffto_max_buffer_size.
    • Implemented bypass mechanism: if pkt.size exceeds ffto_max_buffer_size, ffto_bypassed is set to true and m_ffto is reset.
    • Integrated FFTO logic into handler() for server data processing in FAST_FORWARD state.
  • lib/PgSQL_Thread.cpp
    • Added "ffto_enabled" and "ffto_max_buffer_size" to pgsql_thread_variables_names.
    • Set default values for ffto_enabled (true) and ffto_max_buffer_size (1MB) in the constructor.
    • Registered ffto_enabled as a boolean variable and ffto_max_buffer_size as an integer variable in get_variables_list.
    • Added pgsql_thread___ffto_enabled and pgsql_thread___ffto_max_buffer_size to refresh_variables.
  • src/main.cpp
    • Added a proxy_info("TEST_LOG: ProxySQL is starting\n"); message for testing purposes.
  • test/tap/tests/Makefile
    • Added test_ffto_mysql-t, test_ffto_pgsql-t, and test_ffto_bypass-t to the tests target.
  • test/tap/tests/test_ffto_bypass-t.cpp
    • Added a new TAP test to verify that FFTO is correctly bypassed when a query's payload size exceeds the configured mysql-ffto_max_buffer_size.
  • test/tap/tests/test_ffto_mysql-t.cpp
    • Added a new TAP test to validate MySQL FFTO functionality for both text-based queries and prepared statements, ensuring count_star, sum_rows_affected, and sum_rows_sent are correctly recorded in stats_mysql_query_digest.
  • test/tap/tests/test_ffto_pgsql-t.cpp
    • Added a new TAP test to validate PostgreSQL FFTO functionality for both simple and extended query protocols, ensuring count_star, sum_rows_affected, and sum_rows_sent are correctly recorded in stats_pgsql_query_digest.
Activity
  • Detailed design documentation for FFTO has been added.
  • New TAP tests (test_ffto_mysql-t, test_ffto_pgsql-t, test_ffto_bypass-t) have been introduced and verified to pass.
  • Doxygen headers were added to MySQLFFTO.hpp, PgSQLFFTO.hpp, and TrafficObserver.hpp.
  • Internal protocol parsing logic was documented in MySQLFFTO.cpp and PgSQLFFTO.cpp.
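The extract_pg_rows_affected helper noted above parses CommandComplete command tags, whose row count is the last space-separated token in standard tags such as "INSERT 0 5", "UPDATE 3", or "SELECT 7". A hedged sketch of that parsing (the helper name comes from the changelog, the body is illustrative, not the PR's implementation):

```cpp
#include <string>
#include <cstdint>
#include <cctype>

// Extract the row count from a PostgreSQL CommandComplete tag. Tags such as
// "INSERT 0 5", "UPDATE 3" and "SELECT 7" end with the row count; tags with
// no numeric suffix (e.g. "BEGIN") yield 0.
static uint64_t pg_rows_from_tag(const std::string& tag) {
    const size_t pos = tag.find_last_of(' ');
    if (pos == std::string::npos || pos + 1 >= tag.size()) return 0;
    uint64_t rows = 0;
    for (size_t i = pos + 1; i < tag.size(); i++) {
        if (!std::isdigit(static_cast<unsigned char>(tag[i]))) return 0;
        rows = rows * 10 + (uint64_t)(tag[i] - '0');
    }
    return rows;
}
```

Note the INSERT tag carries two numbers (an OID, then the count), which is why taking the last token rather than the first is the safe choice.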


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the Fast Forward Traffic Observer (FFTO) for both MySQL and PostgreSQL, a significant feature enhancing observability for connections in Fast-Forward mode. The implementation is well-structured, introducing a TrafficObserver interface and specific implementations for each database, along with necessary changes to session management and configuration. The addition of documentation and comprehensive tests is commendable. My review focuses on improving code maintainability by addressing duplicated logic, fixing minor style issues, removing a temporary log message, and replacing unsafe function calls in the new tests to align with best practices.


// Ensure user exists
char user_query[1024];
sprintf(user_query, "INSERT OR REPLACE INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('%s', '%s', 0, 1)", cl.username, cl.password);


high

Using sprintf is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use snprintf instead to ensure safety. This issue is also present on line 39.

Suggested change
sprintf(user_query, "INSERT OR REPLACE INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('%s', '%s', 0, 1)", cl.username, cl.password);
snprintf(user_query, sizeof(user_query), "INSERT OR REPLACE INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('%s', '%s', 0, 1)", cl.username, cl.password);


// Ensure backend server exists
char server_query[1024];
sprintf(server_query, "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);


high

Using sprintf is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use snprintf instead to ensure safety.

Suggested change
sprintf(server_query, "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);
snprintf(server_query, sizeof(server_query), "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);

void verify_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
char query[1024];
// Use a more relaxed LIKE pattern to handle potential normalization differences
sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);


high

Using sprintf is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use snprintf instead to ensure safety. This issue is also present on line 85.

    snprintf(query, sizeof(query), "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);


// Ensure backend server exists
char server_query[1024];
sprintf(server_query, "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);


high

Using sprintf is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use snprintf instead to ensure safety.

    snprintf(server_query, sizeof(server_query), "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);


void verify_pg_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
char query[1024];
sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_pgsql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);


high

Using sprintf is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use snprintf instead to ensure safety. This issue is also present on lines 84 and 92.

    snprintf(query, sizeof(query), "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_pgsql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);

Comment on lines +92 to +93
sprintf(conninfo, "host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable",
cl.pgsql_host, cl.pgsql_port, cl.pgsql_root_username, cl.pgsql_root_password);


high

Using sprintf is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use snprintf instead to ensure safety.

    snprintf(conninfo, sizeof(conninfo), "host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable", 
            cl.pgsql_host, cl.pgsql_port, cl.pgsql_root_username, cl.pgsql_root_password);

Comment on lines +490 to +491
std::unique_ptr<TrafficObserver> m_ffto;
bool ffto_bypassed;


medium

The indentation for these new members is incorrect. Please adjust it to match the surrounding code style for better readability and consistency.

	std::unique_ptr<TrafficObserver> m_ffto;
	bool ffto_bypassed;

Comment on lines +11 to +14
/**
* @class PgSQLFFTO
* @brief PostgreSQL-specific implementation of TrafficObserver.
*/


medium

This doxygen block for the PgSQLFFTO class is a duplicate of the one that follows. Please remove it to improve documentation clarity.

Comment on lines +4749 to +4761
if (mysql_thread___ffto_enabled && !ffto_bypassed) {
if (pkt.size > (size_t)mysql_thread___ffto_max_buffer_size) {
ffto_bypassed = true;
m_ffto.reset();
} else {
if (!m_ffto) {
m_ffto = std::make_unique<MySQLFFTO>(this);
}
if (m_ffto) {
m_ffto->on_client_data((const char*)pkt.ptr, pkt.size);
}
}
}


medium

This block of code for handling FFTO logic is duplicated in a few places in this file (e.g., GPFC_Statuses2 for the CONNECTING_SERVER case, and in GPFC_WaitingClientData_FastForwardSession). To improve maintainability and reduce redundancy, consider extracting this logic into a private helper method within the MySQL_Session class.

}
#endif /* PROXYSQLGENAI */
ProxySQL_Main_init_Admin_module(boostrap_info);
proxy_info("TEST_LOG: ProxySQL is starting\n");


medium

This log message appears to be for testing or debugging purposes. It should be removed before merging to keep the codebase clean.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 16

🧹 Nitpick comments (5)
include/MySQLFFTO.hpp (1)

31-31: Use override instead of virtual on the destructor for consistency.

virtual is redundant here because the base already declares the destructor virtual; override is more explicit and catches base-class changes.

♻️ Proposed fix
-    virtual ~MySQLFFTO();
+    ~MySQLFFTO() override;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/MySQLFFTO.hpp` at line 31, The destructor declaration "virtual
~MySQLFFTO();" should use override for clarity and safety; replace it with
"~MySQLFFTO() override;" in the MySQLFFTO class declaration so the compiler
enforces that it overrides a virtual base destructor (refer to the
MySQLFFTO::~MySQLFFTO symbol).
include/TrafficObserver.hpp (2)

16-16: Prefer = default for the virtual destructor.

virtual ~TrafficObserver() {} is functionally equivalent but = default is the idiomatic modern C++ form.

♻️ Proposed fix
-    virtual ~TrafficObserver() {}
+    virtual ~TrafficObserver() = default;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/TrafficObserver.hpp` at line 16, Replace the empty user-defined
virtual destructor implementation for the TrafficObserver interface with the
modern defaulted form: change the virtual destructor in class TrafficObserver
from the empty-body form (virtual ~TrafficObserver() {}) to the defaulted
declaration (virtual ~TrafficObserver() = default;), keeping it virtual to
preserve polymorphic deletion semantics.

4-4: Prefer <cstddef> over <stddef.h> in C++ code.

<stddef.h> is the C header; <cstddef> is the C++ counterpart placing size_t cleanly in std::.

♻️ Proposed fix
-#include <stddef.h>
+#include <cstddef>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/TrafficObserver.hpp` at line 4, Replace the C header include with the
C++ equivalent: change the include of <stddef.h> to <cstddef> in the header that
declares TrafficObserver (so symbols like size_t come from the C++ header);
ensure any uses of size_t either qualify as std::size_t or have an appropriate
using-declaration if needed.
include/MySQL_Thread.h (1)

418-419: ffto_* fields placed in the middle of the monitor_* variables block.

ffto_enabled and ffto_max_buffer_size are inserted between monitor_replication_lag_group_by_host and the monitor_replication_lag_interval group, fragmenting the logical monitor-variable cluster. Moving them to the end of the struct (or alongside the other non-monitor feature booleans such as commands_stats, servers_stats, etc.) would improve scannability and make the grouping intent clear.

Additionally, int ffto_max_buffer_size permits negative values; while the existing codebase uses int for many size fields, downstream consumers will need to guard against <= 0 inputs. Ensure set_variable / refresh_variables validates the range accordingly.

♻️ Suggested placement (move to end of struct, before closing brace at ~line 600)
-		bool ffto_enabled;
-		int ffto_max_buffer_size;
 		//! How frequently a replication lag check is performed. Unit: 'ms'.
 		int monitor_replication_lag_interval;
         // ... rest of monitor_* block unchanged ...
+
+		// FFTO (Fast-Forward Traffic Observer)
+		bool ffto_enabled;
+		int ffto_max_buffer_size;
 	} variables;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/MySQL_Thread.h` around lines 418 - 419, Move the two fields
ffto_enabled and ffto_max_buffer_size out of the middle of the monitor_* block
and place them with other non-monitor feature flags (for example alongside
commands_stats and servers_stats) at the end of the struct to restore logical
grouping; then add validation in set_variable and refresh_variables to guard
ffto_max_buffer_size against non-positive values (treat <=0 as invalid and
either clamp to a safe default or return an error), ensuring any assignment path
checks ffto_max_buffer_size before use.
include/PgSQLFFTO.hpp (1)

11-22: Remove the redundant first Doxygen block.

Lines 11–14 duplicate the class documentation already provided in lines 15–22. Having two @class / @brief entries for the same symbol confuses Doxygen and pollutes generated docs.

♻️ Proposed fix
-/**
- * `@class` PgSQLFFTO
- * `@brief` PostgreSQL-specific implementation of TrafficObserver.
- */
 /**
  * `@class` PgSQLFFTO
  * `@brief` Observer class for PostgreSQL traffic in Fast-Forward mode.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/PgSQLFFTO.hpp` around lines 11 - 22, Remove the duplicate Doxygen
block for the PgSQLFFTO class: keep the more detailed second block (the one that
starts with "Observer class for PostgreSQL traffic in Fast-Forward mode") and
delete the earlier, redundant /** `@class` PgSQLFFTO `@brief` ... */ block so there
is only a single Doxygen comment describing PgSQLFFTO.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@doc/ffto_design.md`:
- Around line 86-88: The MySQL bullet contains a grammar/typo issue: change
"Corrects captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement
IDs to their respective SQL text." to a clear sentence such as "MySQL Binary
Protocol: captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement
IDs to their respective SQL text."—update the wording around the
`COM_STMT_PREPARE` and `COM_STMT_EXECUTE` references so the verb and subject
agree and the sentence reads smoothly.

In `@include/MySQL_Session.h`:
- Around line 489-494: The new class member declarations (Session_Regex
**match_regexes, std::unique_ptr<TrafficObserver> m_ffto, bool ffto_bypassed,
ProxySQL_Node_Address * proxysql_node_address) are over-indented; align them
with the surrounding members (use the same single-tab indentation as members
like server_myds and use_ldap_auth) so their indentation level matches the class
style and improves readability.

In `@include/MySQLFFTO.hpp`:
- Line 76: process_client_packet currently doesn't handle _MYSQL_COM_STMT_CLOSE,
so m_statements (the unordered_map of stmt_id to SQL) is never cleared; add a
branch in process_client_packet to detect command == _MYSQL_COM_STMT_CLOSE,
extract the 4‑byte stmt_id from the packet payload (little‑endian uint32), and
call m_statements.erase(stmt_id) to remove the prepared statement entry (this
complements where process_server_packet inserts entries for prepared
statements).

In `@lib/MySQL_Session.cpp`:
- Around line 6026-6035: The current server→client handling ignores the session
bypass and buffer-size limit: ensure you only run FFTO when
mysql_thread___ffto_enabled is true AND ffto_bypassed is false, do not recreate
m_ffto here (remove/avoid the std::make_unique<MySQLFFTO>(this) in this block —
creation should remain at session init), and before calling
m_ffto->on_server_data(...) enforce mysql_thread___ffto_max_buffer_size by
skipping or truncating payloads whose .size exceeds that threshold; operate on
mybe->server_myds->PSarrayIN->pdata as shown but gated by these checks so you
honor the session-level bypass and the max buffer size.
- Line 652: reset() currently doesn’t clear session-level FFTO state, so add
explicit resets for ffto_bypassed and m_ffto after the existing fast-forward
state cleanup; update MySQL_Session::reset() to set ffto_bypassed = false and
m_ffto = nullptr (or equivalent cleared state for m_ffto) so stale FFTO observer
state and the bypass flag cannot persist across COM_CHANGE_USER or
COM_RESET_CONNECTION.

In `@lib/MySQLFFTO.cpp`:
- Line 159: The code unsafely dereferences
m_session->client_myds->myconn->options.client_flag in process_server_packet
(used at the first_byte checks) without the same null guards present in
report_query_stats; compute a local bool deprecate_eof at the start of
process_server_packet by checking m_session, m_session->client_myds and
m_session->client_myds->myconn before reading options.client_flag &
CLIENT_DEPRECATE_EOF, then replace both direct dereferences (the branches that
check first_byte == 0x00 and the other occurrence) with this local deprecate_eof
variable so the pointer chain is only evaluated once and safely guarded.
- Around line 162-176: The check for CLIENT_DEPRECATE_EOF uses the wrong packet
header byte: replace comparisons of first_byte == 0x00 with first_byte == 0xFE
in the EOF-detection logic so OK packets are correctly detected when
CLIENT_DEPRECATE_EOF is set; specifically update the condition in the
READING_COLUMNS handling and the condition in the READING_ROWS handling that
reference (m_session->client_myds->myconn->options.client_flag &
CLIENT_DEPRECATE_EOF) and first_byte, ensuring m_state transitions and
report_query_stats() are only triggered when first_byte == 0xFE.

In `@lib/PgSQL_Session.cpp`:
- Around line 2288-2300: Server-side FFTO processing currently calls
m_ffto->on_server_data(...) without bounding packet size; add the same guard
used for client packets: when pgsql_thread___ffto_enabled && !ffto_bypassed and
before calling PgSQLFFTO::on_server_data, check if pkt.size >
(size_t)pgsql_thread___ffto_max_buffer_size; if so set ffto_bypassed = true and
m_ffto.reset(), otherwise ensure m_ffto is created
(std::make_unique<PgSQLFFTO>(this)) if needed and then call
m_ffto->on_server_data((const char*)pkt.ptr, pkt.size). This mirrors the
client-side logic around pgsql_thread___ffto_max_buffer_size, ffto_bypassed, and
m_ffto to prevent unbounded server-side buffering.

In `@lib/PgSQLFFTO.cpp`:
- Around line 60-67: The loop reads a uint32_t msg_len and then passes msg_len -
4 to process_client_message (and similarly in on_server_data), which underflows
for msg_len < 4; add an explicit guard after decoding msg_len that
rejects/handles any msg_len < 4 (and optionally an upper-bound sanity check)
before computing msg_len - 4 or using it as a size_t, and return/skip/close the
connection on malformed frames; update the client loop using m_client_buffer,
the call to process_client_message(type, payload, msg_len - 4) and the
corresponding server loop/on_server_data that calls process_server_message to
use this validation so no underflow/OOB occurs.
- Line 93: The code sets m_current_query using all len bytes from the 'Q'
payload, which includes the PostgreSQL trailing NUL, causing
pgsql_query_digest_and_first_comment to see a different string; change the
assignment to strip the trailing NUL when present (e.g., check len>0 and
payload[len-1]==0 and use length (len-1) when constructing m_current_query, or
use strlen on payload) so m_current_query does not include the framing NUL
before calling pgsql_query_digest_and_first_comment.
- Around line 31-44: The function extract_pg_rows_affected builds command_tag
then uses a flawed regex that mis-captures multi-digit counts and re-compiles on
every call; replace the regex approach with simple string parsing: trim
command_tag, find the last space (or last token) to extract the numeric suffix
as rows (use std::stoull) and determine the command type by examining the
leading token (e.g., compare the first word to
"SELECT","FETCH","MOVE","UPDATE","DELETE","INSERT") to set is_select
appropriately; if you prefer to keep regex, make the regex static and change it
to unambiguous pattern (e.g., match start-of-string command word then \\s+ then
(\\d+)$) to both fix correctness and avoid recompiling per call—locate
extract_pg_rows_affected and command_tag to apply the change.
- Around line 96-118: Parsing of 'P', 'B', and 'E' messages reads
null-terminated strings without bounds checks causing buffer over-reads; update
the Parse/Bind/Execute handling to use strnlen (or equivalent) against the
available payload length to verify each null terminator lies within the provided
length before creating std::string or advancing pointers. For the 'P' case
validate stmt_name length+1 <= payload_len then create stmt_name with the
bounded length and ensure the query pointer + query_length is inside payload;
for the 'B' case validate portal_name and stmt_name each terminate within
remaining bytes before assigning m_portals[portal_name] = stmt_name; for the 'E'
case validate portal_name termination before lookup. On any validation failure,
handle safely (e.g., call on_close() or drop the message) instead of performing
the unchecked reads; refer to symbols m_statements, m_portals, m_current_query,
m_query_start_time, monotonic_time and on_close to locate where to add these
checks.

In `@src/main.cpp`:
- Line 1508: Remove the test-only log artifact from production by eliminating or
conditionalizing the proxy_info call that prints "TEST_LOG: ProxySQL is
starting\n" in main.cpp: either delete the proxy_info("TEST_LOG: ...")
invocation entirely, or wrap it in a test-only conditional (e.g., `#ifdef`
TEST_BUILD / runtime test flag) so that proxy_info is not emitted in production;
ensure any remaining startup logs use production-safe messages or move test-only
signaling into the test harness instead of using the proxy_info call.

In `@test/tap/tests/test_ffto_bypass-t.cpp`:
- Around line 52-64: The test executes a large query with mysql_query(conn,
large_query.c_str()) without asserting it succeeded and then checks for a
fragile LIKE match; change to use the test helper/MACRO that asserts execution
(e.g., MYSQL_QUERY or equivalent) to ensure the query actually ran, and replace
the run_q call that uses "SELECT count(*) FROM stats_mysql_query_digest WHERE
digest_text LIKE '%xxxx%'" with a simple total count query "SELECT count(*) FROM
stats_mysql_query_digest" and assert that this total count is 0 (use the
existing mysql_store_result/mysql_fetch_row/ok pattern to read and assert the
count).

In `@test/tap/tests/test_ffto_mysql-t.cpp`:
- Around line 25-57: In verify_digest, ensure the TAP plan emits the same number
of ok() assertions on all paths: when run_q(admin, query) fails (rc != 0) and
when no row is returned, add the two missing ok() calls (matching the
success-path checks for rows_affected and rows_sent) so the function always
emits three ok() calls; update the failure messages to indicate the digest
lookup failed for the given template_text and keep references to the same
diagnostic output; modify the branches around run_q(admin, query) and the "no
row" else to call ok(...) twice more (for expected_rows_affected and
expected_rows_sent) in addition to the existing failure ok() to preserve TAP
plan stability.

In `@test/tap/tests/test_ffto_pgsql-t.cpp`:
- Around line 27-57: The failure branch in verify_pg_digest currently calls only
one ok(0, ...) when no row is found, which desyncs the TAP plan; update the else
branch in verify_pg_digest to also emit the two additional ok(0, ...) assertions
for rows_affected and rows_sent (using template_text and
expected_rows_affected/expected_rows_sent in the messages, similar to the
success branch) so the total number of ok() calls matches both success and
failure paths, and keep existing diagnostics and mysql_free_result/res cleanup
unchanged.

---

Nitpick comments:
In `@include/MySQL_Thread.h`:
- Around line 418-419: Move the two fields ffto_enabled and ffto_max_buffer_size
out of the middle of the monitor_* block and place them with other non-monitor
feature flags (for example alongside commands_stats and servers_stats) at the
end of the struct to restore logical grouping; then add validation in
set_variable and refresh_variables to guard ffto_max_buffer_size against
non-positive values (treat <=0 as invalid and either clamp to a safe default or
return an error), ensuring any assignment path checks ffto_max_buffer_size
before use.

In `@include/MySQLFFTO.hpp`:
- Line 31: The destructor declaration "virtual ~MySQLFFTO();" should use
override for clarity and safety; replace it with "~MySQLFFTO() override;" in the
MySQLFFTO class declaration so the compiler enforces that it overrides a virtual
base destructor (refer to the MySQLFFTO::~MySQLFFTO symbol).

In `@include/PgSQLFFTO.hpp`:
- Around line 11-22: Remove the duplicate Doxygen block for the PgSQLFFTO class:
keep the more detailed second block (the one that starts with "Observer class
for PostgreSQL traffic in Fast-Forward mode") and delete the earlier, redundant
/** `@class` PgSQLFFTO `@brief` ... */ block so there is only a single Doxygen
comment describing PgSQLFFTO.

In `@include/TrafficObserver.hpp`:
- Line 16: Replace the empty user-defined virtual destructor implementation for
the TrafficObserver interface with the modern defaulted form: change the virtual
destructor in class TrafficObserver from the empty-body form (virtual
~TrafficObserver() {}) to the defaulted declaration (virtual ~TrafficObserver()
= default;), keeping it virtual to preserve polymorphic deletion semantics.
- Line 4: Replace the C header include with the C++ equivalent: change the
include of <stddef.h> to <cstddef> in the header that declares TrafficObserver
(so symbols like size_t come from the C++ header); ensure any uses of size_t
either qualify as std::size_t or have an appropriate using-declaration if
needed.

Comment on lines +86 to +88
- **Text and Binary Protocols**: FFTO supports both standard text-based queries and the binary protocol used by prepared statements.
- **MySQL Binary Protocol**: Corrects captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
- **PostgreSQL Extended Query**: Supports the multi-phase `Parse` -> `Bind` -> `Execute` sequence by tracking Statement and Portal mappings.

⚠️ Potential issue | 🟡 Minor

Typo in MySQL binary protocol bullet.

Small grammar fix improves readability.

✏️ Suggested edit
- **MySQL Binary Protocol**: Corrects captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
+ **MySQL Binary Protocol**: Correctly captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@doc/ffto_design.md` around lines 86 - 88, The MySQL bullet contains a
grammar/typo issue: change "Corrects captures `COM_STMT_PREPARE` and
`COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text." to a
clear sentence such as "MySQL Binary Protocol: captures `COM_STMT_PREPARE` and
`COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text."—update
the wording around the `COM_STMT_PREPARE` and `COM_STMT_EXECUTE` references so
the verb and subject agree and the sentence reads smoothly.

Comment on lines +489 to +494
Session_Regex **match_regexes;
std::unique_ptr<TrafficObserver> m_ffto;
bool ffto_bypassed;

ProxySQL_Node_Address * proxysql_node_address;


⚠️ Potential issue | 🟡 Minor

Indentation inconsistency — new members are over-indented relative to surrounding class members.

Lines 489–494 carry 3 levels of indentation while adjacent members (server_myds at line 411, use_ldap_auth at line 496) use a single tab.

♻️ Proposed fix
-			Session_Regex **match_regexes;
-			std::unique_ptr<TrafficObserver> m_ffto;
-			bool ffto_bypassed;
-		
-			ProxySQL_Node_Address * proxysql_node_address;
-		
+	Session_Regex **match_regexes;
+	std::unique_ptr<TrafficObserver> m_ffto;
+	bool ffto_bypassed = false;
+	ProxySQL_Node_Address * proxysql_node_address;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/MySQL_Session.h` around lines 489 - 494, The new class member
declarations (Session_Regex **match_regexes, std::unique_ptr<TrafficObserver>
m_ffto, bool ffto_bypassed, ProxySQL_Node_Address * proxysql_node_address) are
over-indented; align them with the surrounding members (use the same single-tab
indentation as members like server_myds and use_ldap_auth) so their indentation
level matches the class style and improves readability.

uint64_t m_affected_rows; ///< Number of rows affected by the current DML query.
uint64_t m_rows_sent; ///< Number of rows returned by the current SELECT query.

std::unordered_map<uint32_t, std::string> m_statements; ///< Map of stmt_id to original SQL query for prepared statements.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# Search for COM_STMT_CLOSE handling and m_statements usage
rg -n "COM_STMT_CLOSE" --type cpp -C 5

Repository: sysown/proxysql

Length of output: 4706


🏁 Script executed:

# Search for m_statements erasure/cleanup patterns
rg -n "m_statements.*erase|erase.*m_statements" --type cpp -B 2 -A 2

Repository: sysown/proxysql

Length of output: 41


🏁 Script executed:

# Find MySQLFFTO.cpp implementation
fd -e cpp "MySQLFFTO"

Repository: sysown/proxysql

Length of output: 76


🏁 Script executed:

# First, check the size and structure of MySQLFFTO.cpp
wc -l lib/MySQLFFTO.cpp

Repository: sysown/proxysql

Length of output: 80


🏁 Script executed:

# Search for process_client_packet or similar methods in MySQLFFTO.cpp
rg -n "process_client_packet|COM_STMT_CLOSE|m_statements" lib/MySQLFFTO.cpp -B 2 -A 5

Repository: sysown/proxysql

Length of output: 1646


🏁 Script executed:

cat -n lib/MySQLFFTO.cpp

Repository: sysown/proxysql

Length of output: 11182


Add COM_STMT_CLOSE handler to remove entries from m_statements.

The process_client_packet method does not handle _MYSQL_COM_STMT_CLOSE commands. The m_statements map (which stores prepared statement IDs) is populated when statements are prepared (line 135 in process_server_packet) but never cleared. In long-lived fast-forward sessions with frequent prepared-statement churn, this results in unbounded map growth for the session's lifetime.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/MySQLFFTO.hpp` at line 76, process_client_packet currently doesn't
handle _MYSQL_COM_STMT_CLOSE, so m_statements (the unordered_map of stmt_id to
SQL) is never cleared; add a branch in process_client_packet to detect command
== _MYSQL_COM_STMT_CLOSE, extract the 4‑byte stmt_id from the packet payload
(little‑endian uint32), and call m_statements.erase(stmt_id) to remove the
prepared statement entry (this complements where process_server_packet inserts
entries for prepared statements).

user_attributes=NULL;
schema_locked=false;
session_fast_forward=SESSION_FORWARD_TYPE_NONE;
ffto_bypassed=false;

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Find MySQL_Session::reset definition
rg -n "void MySQL_Session::reset" lib/MySQL_Session.cpp

# Check COM_CHANGE_USER and COM_RESET_CONNECTION usage in the file
rg -n "COM_CHANGE_USER|COM_RESET_CONNECTION" lib/MySQL_Session.cpp

# Find all occurrences of ffto_bypassed and m_ffto
rg -n "ffto_bypassed|m_ffto" --glob="*.cpp" --glob="*.hpp"

Repository: sysown/proxysql

Length of output: 4672


🏁 Script executed:

# Read the MySQL_Session::reset() method starting at line 709
sed -n '709,800p' lib/MySQL_Session.cpp | cat -n

Repository: sysown/proxysql

Length of output: 2935


🏁 Script executed:

# Read the COM_CHANGE_USER handler
sed -n '8313,8400p' lib/MySQL_Session.cpp | cat -n

# Also check COM_RESET_CONNECTION handler
sed -n '8398,8450p' lib/MySQL_Session.cpp | cat -n

Repository: sysown/proxysql

Length of output: 7001


Reset FFTO session state in MySQL_Session::reset().

ffto_bypassed and m_ffto are session-level state not cleared by reset(). Since COM_CHANGE_USER and COM_RESET_CONNECTION invoke reset() to establish a new logical session, stale FFTO observer state or the bypassed flag could persist across session resets, skewing metrics or unexpectedly skipping FFTO in the new session.

Add the following lines to reset() after the existing fast-forward state cleanup:

	backend_closed_in_fast_forward = false;
	fast_forward_grace_start_time = 0;
+	ffto_bypassed = false;
+	m_ffto.reset();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/MySQL_Session.cpp` at line 652, reset() currently doesn’t clear
session-level FFTO state, so add explicit resets for ffto_bypassed and m_ffto
after the existing fast-forward state cleanup; update MySQL_Session::reset() to
set ffto_bypassed = false and m_ffto = nullptr (or equivalent cleared state for
m_ffto) so stale FFTO observer state and the bypass flag cannot persist across
COM_CHANGE_USER or COM_RESET_CONNECTION.

Comment on lines 6026 to 6035
if (mysql_thread___ffto_enabled) {
if (!m_ffto) {
m_ffto = std::make_unique<MySQLFFTO>(this);
}
if (m_ffto) {
for (unsigned int i = 0; i < mybe->server_myds->PSarrayIN->len; i++) {
m_ffto->on_server_data((const char*)mybe->server_myds->PSarrayIN->pdata[i].ptr, mybe->server_myds->PSarrayIN->pdata[i].size);
}
}
}

⚠️ Potential issue | 🟠 Major

Honor FFTO bypass and size limits for server→client packets.

This block ignores ffto_bypassed and recreates m_ffto after a bypass, which defeats the session-level safety switch. It also doesn't apply mysql_thread___ffto_max_buffer_size to server payloads, so large result sets can still drive excessive buffering.

Recommend guarding with !ffto_bypassed, avoiding re-creation here, and applying the same size threshold to server packets.

🐛 Suggested fix
-			if (mysql_thread___ffto_enabled) {
-				if (!m_ffto) {
-					m_ffto = std::make_unique<MySQLFFTO>(this);
-				}
-				if (m_ffto) {
-					for (unsigned int i = 0; i < mybe->server_myds->PSarrayIN->len; i++) {
-						m_ffto->on_server_data((const char*)mybe->server_myds->PSarrayIN->pdata[i].ptr, mybe->server_myds->PSarrayIN->pdata[i].size);
-					}
-				}
-			}
+			if (mysql_thread___ffto_enabled && !ffto_bypassed && m_ffto) {
+				for (unsigned int i = 0; i < mybe->server_myds->PSarrayIN->len; i++) {
+					const auto& p = mybe->server_myds->PSarrayIN->pdata[i];
+					if (p.size > (size_t)mysql_thread___ffto_max_buffer_size) {
+						ffto_bypassed = true;
+						m_ffto.reset();
+						break;
+					}
+					m_ffto->on_server_data((const char*)p.ptr, p.size);
+				}
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/MySQL_Session.cpp` around lines 6026 - 6035, The current server→client
handling ignores the session bypass and buffer-size limit: ensure you only run
FFTO when mysql_thread___ffto_enabled is true AND ffto_bypassed is false, do not
recreate m_ffto here (remove/avoid the std::make_unique<MySQLFFTO>(this) in this
block — creation should remain at session init), and before calling
m_ffto->on_server_data(...) enforce mysql_thread___ffto_max_buffer_size by
skipping or truncating payloads whose .size exceeds that threshold; operate on
mybe->server_myds->PSarrayIN->pdata as shown but gated by these checks so you
honor the session-level bypass and the max buffer size.

Comment on lines +96 to +118
} else if (type == 'P') {
std::string stmt_name = reinterpret_cast<const char*>(payload);
const char* query = reinterpret_cast<const char*>(payload) + stmt_name.length() + 1;
m_statements[stmt_name] = query;
} else if (type == 'B') {
std::string portal_name = reinterpret_cast<const char*>(payload);
const char* stmt_ptr = reinterpret_cast<const char*>(payload) + portal_name.length() + 1;
std::string stmt_name = stmt_ptr;
m_portals[portal_name] = stmt_name;
} else if (type == 'E') {
std::string portal_name = reinterpret_cast<const char*>(payload);
auto pit = m_portals.find(portal_name);
if (pit != m_portals.end()) {
auto sit = m_statements.find(pit->second);
if (sit != m_statements.end()) {
m_current_query = sit->second;
m_query_start_time = monotonic_time();
m_state = AWAITING_RESPONSE;
}
}
} else if (type == 'X') {
on_close();
}

⚠️ Potential issue | 🟠 Major

No bounds check before reading null-terminated strings in Parse ('P'), Bind ('B'), and Execute ('E') messages.

Lines 97, 98, 101, 102, and 106 all construct std::string or raw pointers from reinterpret_cast<const char*>(payload) treating it as a null-terminated string, with no verification that a null byte falls within the [payload, payload+len) region. A malformed (or truncated) message causes a buffer over-read into adjacent m_client_buffer memory.

🐛 Proposed fix — use strnlen for all null-terminated reads
     } else if (type == 'P') {
-        std::string stmt_name = reinterpret_cast<const char*>(payload);
-        const char* query = reinterpret_cast<const char*>(payload) + stmt_name.length() + 1;
-        m_statements[stmt_name] = query;
+        size_t sn_len = strnlen(reinterpret_cast<const char*>(payload), len);
+        if (sn_len >= len) return; // no null terminator within bounds
+        std::string stmt_name(reinterpret_cast<const char*>(payload), sn_len);
+        size_t q_off = sn_len + 1;
+        if (q_off >= len) return;
+        size_t q_len = strnlen(reinterpret_cast<const char*>(payload) + q_off, len - q_off);
+        m_statements[stmt_name] = std::string(reinterpret_cast<const char*>(payload) + q_off, q_len);
     } else if (type == 'B') {
-        std::string portal_name = reinterpret_cast<const char*>(payload);
-        const char* stmt_ptr = reinterpret_cast<const char*>(payload) + portal_name.length() + 1;
-        std::string stmt_name = stmt_ptr;
+        size_t pn_len = strnlen(reinterpret_cast<const char*>(payload), len);
+        if (pn_len >= len) return;
+        std::string portal_name(reinterpret_cast<const char*>(payload), pn_len);
+        size_t sn_off = pn_len + 1;
+        if (sn_off >= len) return;
+        std::string stmt_name(reinterpret_cast<const char*>(payload) + sn_off,
+                              strnlen(reinterpret_cast<const char*>(payload) + sn_off, len - sn_off));
         m_portals[portal_name] = stmt_name;
     } else if (type == 'E') {
-        std::string portal_name = reinterpret_cast<const char*>(payload);
+        size_t pn_len = strnlen(reinterpret_cast<const char*>(payload), len);
+        if (pn_len >= len) return;
+        std::string portal_name(reinterpret_cast<const char*>(payload), pn_len);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/PgSQLFFTO.cpp` around lines 96 - 118, Parsing of 'P', 'B', and 'E'
messages reads null-terminated strings without bounds checks causing buffer
over-reads; update the Parse/Bind/Execute handling to use strnlen (or
equivalent) against the available payload length to verify each null terminator
lies within the provided length before creating std::string or advancing
pointers. For the 'P' case validate stmt_name length+1 <= payload_len then
create stmt_name with the bounded length and ensure the query pointer +
query_length is inside payload; for the 'B' case validate portal_name and
stmt_name each terminate within remaining bytes before assigning
m_portals[portal_name] = stmt_name; for the 'E' case validate portal_name
termination before lookup. On any validation failure, handle safely (e.g., call
on_close() or drop the message) instead of performing the unchecked reads; refer
to symbols m_statements, m_portals, m_current_query, m_query_start_time,
monotonic_time and on_close to locate where to add these checks.

}
#endif /* PROXYSQLGENAI */
ProxySQL_Main_init_Admin_module(boostrap_info);
proxy_info("TEST_LOG: ProxySQL is starting\n");

⚠️ Potential issue | 🟡 Minor

Remove the TEST_LOG: debug/test artifact from production startup logging.

proxy_info("TEST_LOG: ProxySQL is starting\n") emits unconditionally to the production error log on every startup. The TEST_LOG: prefix is a test-harness artifact and will appear in operator-visible production logs, polluting structured log pipelines. If TAP tests grep for this marker to detect startup readiness, that couples test infrastructure directly to a production log string — a fragile, non-standard approach.

Standard alternatives:

  • Remove the line entirely and use connection probing in the TAP tests to detect startup.
  • If a startup marker is genuinely needed, guard it with #ifdef TEST_BUILD or a runtime flag, or emit it only from the test harness itself.
🗑️ Proposed fix
-	proxy_info("TEST_LOG: ProxySQL is starting\n");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main.cpp` at line 1508, Remove the test-only log artifact from production
by eliminating or conditionalizing the proxy_info call that prints "TEST_LOG:
ProxySQL is starting\n" in main.cpp: either delete the proxy_info("TEST_LOG:
...") invocation entirely, or wrap it in a test-only conditional (e.g., `#ifdef`
TEST_BUILD / runtime test flag) so that proxy_info is not emitted in production;
ensure any remaining startup logs use production-safe messages or move test-only
signaling into the test harness instead of using the proxy_info call.

Comment on lines +52 to +64
// Send a query larger than 100 bytes
std::string large_query = "SELECT '";
for(int i=0; i<200; i++) large_query += "x";
large_query += "'";

mysql_query(conn, large_query.c_str());

// Verify that NO digest was recorded for this query because it was bypassed
int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest WHERE digest_text LIKE '%xxxx%'");
MYSQL_RES* res = mysql_store_result(admin);
MYSQL_ROW row = mysql_fetch_row(res);
int count = atoi(row[0]);
ok(count == 0, "Query larger than threshold was correctly bypassed (count: %d)", count);

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed against sysown/proxysql to verify the finding:

  # Search for digest_text normalization and stats_mysql_query_digest handling
  rg -n "digest_text" -g '*.cpp' -g '*.h' -A 3 -B 3 | head -100

  # Look at the MYSQL_QUERY macro definition
  rg -n "MYSQL_QUERY" -g '*.cpp' -g '*.h' -A 5 -B 2 | head -80

  # Examine the test file context around the lines in question
  sed -n '40,75p' test/tap/tests/test_ffto_bypass-t.cpp

  # Check how stats_mysql_query_digest is used elsewhere in tests
  rg -n "stats_mysql_query_digest" -g '*.cpp' -g '*.h' | head -50

  # Find the MYSQL_QUERY macro definition
  rg -n "define MYSQL_QUERY" -g '*.h' -g '*.cpp' -A 5

  # Check for query normalization/truncation in the codebase
  rg -n "normalize" -g '*.cpp' -g '*.h' | grep -i digest

  # Look at other FFTO tests to see their assertion patterns
  sed -n '20,100p' test/tap/tests/test_ffto_mysql-t.cpp

  # Check how test_ffto_mysql-t.cpp handles query success checking
  sed -n '70,130p' test/tap/tests/test_ffto_mysql-t.cpp

  # Look for bypass test patterns to compare with test_ffto_bypass-t.cpp
  rg -n "bypass" -A 5 -B 5 test/tap/tests/test_ffto_mysql-t.cpp

  # Read the full test_ffto_bypass-t.cpp to understand context
  cat test/tap/tests/test_ffto_bypass-t.cpp

  # Check if there are other bypass tests to understand the pattern
  fd -t f "bypass" test/tap/tests/ --type f


Assert query execution and simplify digest count check for bypass verification.

Line 57 executes the large query without checking success; if execution fails, no digest is recorded and the test passes incorrectly. Line 60 relies on LIKE '%xxxx%', but this pattern is fragile if digest_text is normalized or truncated. Since bypass behavior means NO digests should be recorded (the stats table was cleared at line 43), use MYSQL_QUERY to assert the query executed and check the total digest count instead.

🔧 Suggested adjustment
-    mysql_query(conn, large_query.c_str());
+    MYSQL_QUERY(conn, large_query.c_str());

-    int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest WHERE digest_text LIKE '%xxxx%'");
+    int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_ffto_bypass-t.cpp` around lines 52 - 64, The test
executes a large query with mysql_query(conn, large_query.c_str()) without
asserting it succeeded and then checks for a fragile LIKE match; change to use
the test helper/MACRO that asserts execution (e.g., MYSQL_QUERY or equivalent)
to ensure the query actually ran, and replace the run_q call that uses "SELECT
count(*) FROM stats_mysql_query_digest WHERE digest_text LIKE '%xxxx%'" with a
simple total count query "SELECT count(*) FROM stats_mysql_query_digest" and
assert that this total count is 0 (use the existing
mysql_store_result/mysql_fetch_row/ok pattern to read and assert the count).

Comment on lines +25 to +57
void verify_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
char query[1024];
// Use a more relaxed LIKE pattern to handle potential normalization differences
sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
int rc = run_q(admin, query);
if (rc != 0) {
ok(0, "Failed to query stats_mysql_query_digest for %s", template_text);
return;
}
MYSQL_RES* res = mysql_store_result(admin);
MYSQL_ROW row = mysql_fetch_row(res);
if (row) {
int count = atoi(row[0]);
uint64_t rows_affected = strtoull(row[1], NULL, 10);
uint64_t rows_sent = strtoull(row[2], NULL, 10);

ok(count >= expected_count, "Found digest: %s (count: %d, expected: %d)", row[3], count, expected_count);
ok(rows_affected == expected_rows_affected, "Affected rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_affected, (unsigned long long)expected_rows_affected);
ok(rows_sent == expected_rows_sent, "Sent rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_sent, (unsigned long long)expected_rows_sent);
} else {
ok(0, "Digest NOT found for pattern: %s", template_text);
// Dump the table to see what's actually in there
diag("Dumping stats_mysql_query_digest for debugging:");
run_q(admin, "SELECT digest_text, count_star FROM stats_mysql_query_digest");
MYSQL_RES* dump_res = mysql_store_result(admin);
MYSQL_ROW dump_row;
while (dump_res && (dump_row = mysql_fetch_row(dump_res))) {
diag(" Actual digest in table: %s", dump_row[0]);
}
if (dump_res) mysql_free_result(dump_res);
}
mysql_free_result(res);
}

⚠️ Potential issue | 🟡 Minor

Keep TAP plan count consistent on digest miss.

On failure paths (rc != 0 or no rows), only one ok() is emitted, which can make the TAP plan count short and obscure the real failure. Emit the two additional assertions even on failure to keep the plan stable.

✅ Suggested fix
     if (rc != 0) {
         ok(0, "Failed to query stats_mysql_query_digest for %s", template_text);
+        ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+        ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
         return;
     }
@@
     } else {
         ok(0, "Digest NOT found for pattern: %s", template_text);
+        ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+        ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
         // Dump the table to see what's actually in there
         diag("Dumping stats_mysql_query_digest for debugging:");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_ffto_mysql-t.cpp` around lines 25 - 57, In verify_digest,
ensure the TAP plan emits the same number of ok() assertions on all paths: when
run_q(admin, query) fails (rc != 0) and when no row is returned, add the two
missing ok() calls (matching the success-path checks for rows_affected and
rows_sent) so the function always emits three ok() calls; update the failure
messages to indicate the digest lookup failed for the given template_text and
keep references to the same diagnostic output; modify the branches around
run_q(admin, query) and the "no row" else to call ok(...) twice more (for
expected_rows_affected and expected_rows_sent) in addition to the existing
failure ok() to preserve TAP plan stability.

Comment on lines +27 to +57
void verify_pg_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
char query[1024];
sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_pgsql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
int rc = run_q(admin, query);
if (rc != 0) {
ok(0, "Failed to query stats_pgsql_query_digest for %s", template_text);
return;
}
MYSQL_RES* res = mysql_store_result(admin);
MYSQL_ROW row = mysql_fetch_row(res);
if (row) {
int count = atoi(row[0]);
uint64_t rows_affected = strtoull(row[1], NULL, 10);
uint64_t rows_sent = strtoull(row[2], NULL, 10);

ok(count >= expected_count, "Found PG digest: %s (count: %d, expected: %d)", row[3], count, expected_count);
ok(rows_affected == expected_rows_affected, "Affected rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_affected, (unsigned long long)expected_rows_affected);
ok(rows_sent == expected_rows_sent, "Sent rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_sent, (unsigned long long)expected_rows_sent);
} else {
ok(0, "PG Digest NOT found for pattern: %s", template_text);
diag("Dumping stats_pgsql_query_digest for debugging:");
run_q(admin, "SELECT digest_text, count_star FROM stats_pgsql_query_digest");
MYSQL_RES* dump_res = mysql_store_result(admin);
MYSQL_ROW dump_row;
while (dump_res && (dump_row = mysql_fetch_row(dump_res))) {
diag(" Actual PG digest in table: %s", dump_row[0]);
}
if (dump_res) mysql_free_result(dump_res);
}
mysql_free_result(res);
}

⚠️ Potential issue | 🟡 Minor

Keep TAP plan count consistent on digest miss.

Like the MySQL test helper, the failure branches only emit one ok(); that can desync the TAP plan and mask the real failure. Emit the two additional assertions even on failure.

✅ Suggested fix
     if (rc != 0) {
         ok(0, "Failed to query stats_pgsql_query_digest for %s", template_text);
+        ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+        ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
         return;
     }
@@
     } else {
         ok(0, "PG Digest NOT found for pattern: %s", template_text);
+        ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+        ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
         diag("Dumping stats_pgsql_query_digest for debugging:");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/tap/tests/test_ffto_pgsql-t.cpp` around lines 27 - 57, The failure
branch in verify_pg_digest currently calls only one ok(0, ...) when no row is
found, which desyncs the TAP plan; update the else branch in verify_pg_digest to
also emit the two additional ok(0, ...) assertions for rows_affected and
rows_sent (using template_text and expected_rows_affected/expected_rows_sent in
the messages, similar to the success branch) so the total number of ok() calls
matches both success and failure paths, and keep existing diagnostics and
mysql_free_result/res cleanup unchanged.

@sonarqubecloud

Quality Gate failed

Failed conditions:
8 Security Hotspots

See analysis details on SonarQube Cloud

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
lib/MySQLFFTO.cpp (1)

76-76: std::vector::erase from the front is O(n) on every processed packet.

Both m_client_buffer and m_server_buffer shift all remaining bytes on every consumed packet. For a fast-forward observer sitting on the hot I/O path, this linear cost accumulates. A read-offset index into the vector (advancing head without moving data) or switching to std::deque eliminates the shift entirely.

♻️ Sketch: offset-based approach for `on_client_data`
-    m_client_buffer.insert(m_client_buffer.end(), buf, buf + len);
-    while (m_client_buffer.size() >= sizeof(mysql_hdr)) {
-        const mysql_hdr* hdr = reinterpret_cast<const mysql_hdr*>(m_client_buffer.data());
-        uint32_t pkt_len = hdr->pkt_length;
-        if (m_client_buffer.size() < sizeof(mysql_hdr) + pkt_len) break;
-        const unsigned char* payload = reinterpret_cast<const unsigned char*>(m_client_buffer.data()) + sizeof(mysql_hdr);
-        process_client_packet(payload, pkt_len);
-        m_client_buffer.erase(m_client_buffer.begin(), m_client_buffer.begin() + sizeof(mysql_hdr) + pkt_len);
-    }
+    m_client_buffer.insert(m_client_buffer.end(), buf, buf + len);
+    size_t offset = 0;
+    while (m_client_buffer.size() - offset >= sizeof(mysql_hdr)) {
+        const mysql_hdr* hdr = reinterpret_cast<const mysql_hdr*>(m_client_buffer.data() + offset);
+        uint32_t pkt_len = hdr->pkt_length;
+        if (m_client_buffer.size() - offset < sizeof(mysql_hdr) + pkt_len) break;
+        const unsigned char* payload = reinterpret_cast<const unsigned char*>(m_client_buffer.data()) + offset + sizeof(mysql_hdr);
+        process_client_packet(payload, pkt_len);
+        offset += sizeof(mysql_hdr) + pkt_len;
+    }
+    if (offset > 0)
+        m_client_buffer.erase(m_client_buffer.begin(), m_client_buffer.begin() + offset);

Apply the same pattern to on_server_data.

Also applies to: 89-89

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/MySQLFFTO.cpp` at line 76, The erase call on m_client_buffer (and similar
on m_server_buffer) inside on_client_data/on_server_data causes O(n) shifts per
packet; replace it with a non-shifting scheme by introducing a read offset index
(e.g., client_head/server_head) that advances by sizeof(mysql_hdr)+pkt_len
instead of erasing, and only compact/resize the underlying vector when the head
grows past a threshold (or alternatively switch the buffers to std::deque and
use pop_front for whole-packet removal). Update all uses of
m_client_buffer/m_server_buffer access to account for the head offset (or adapt
to deque semantics) and ensure mysql_hdr parsing still reads from the correct
offset; perform the same change for the erase at the other reported location.
lib/PgSQLFFTO.cpp (1)

96-118: Missing client 'C' (Close) handler — m_statements and m_portals grow without bound.

The PostgreSQL protocol identifies frontend type byte 'C' as a Close command, whose payload names the prepared statement or portal to close. Without handling this message, entries added to m_statements (via 'P') and m_portals (via 'B') are never erased. For long-running sessions with frequent prepare-and-close cycles, both maps grow monotonically for the session lifetime, leaking memory proportional to the number of statements created.

♻️ Proposed handler (add between the `'E'` and `'X'` branches)
+    } else if (type == 'C') {  // Frontend Close (not server CommandComplete)
+        if (len < 1) return;
+        char close_type = static_cast<char>(payload[0]);
+        size_t name_len = strnlen(reinterpret_cast<const char*>(payload) + 1, len - 1);
+        std::string name(reinterpret_cast<const char*>(payload) + 1, name_len);
+        if (close_type == 'S') {
+            m_statements.erase(name);
+        } else if (close_type == 'P') {
+            m_portals.erase(name);
+        }
     } else if (type == 'X') {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/PgSQLFFTO.cpp` around lines 96 - 118, Add a handler for frontend type 'C'
(Close) to remove prepared statements/portals so m_statements and m_portals do
not leak: parse the Close payload to read the target byte (e.g., 'S' for
prepared Statement, 'P' for Portal) then the null-terminated name string (const
char* name = reinterpret_cast<const char*>(payload) + 1) and call
m_statements.erase(name) for 'S' or m_portals.erase(name) for 'P'; also handle
the empty-name case (unnamed statement/portal) consistently and make sure this
branch is placed between the existing 'E' and 'X' branches so entries are
removed when the client sends Close.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/MySQLFFTO.cpp`:
- Around line 101-126: In process_client_packet, handle the COM_STMT_CLOSE case
so m_statements doesn't leak: detect the _MYSQL_COM_STMT_CLOSE command in
MySQLFFTO::process_client_packet, extract the 4-byte statement id (same approach
as in the _MYSQL_COM_STMT_EXECUTE branch), and call m_statements.erase(stmt_id)
(and any associated cleanup) to remove the statement entry from the cache;
ensure this is done safely only when len is sufficient and the stmt_id exists in
m_statements.

In `@lib/PgSQLFFTO.cpp`:
- Around line 121-138: process_server_message currently drops subsequent
CommandComplete messages and overwrites outstanding extended-query tracking:
modify the 'C' handling so it does not early-return on m_state == IDLE but
instead accumulates rows_sent/affected_rows across multiple 'C' messages (keep
m_state until a terminal 'Z' and sum rows returned by extract_pg_rows_affected
into a per-query accumulator), and change the 'E' (Execute) path to check if
m_state == AWAITING_RESPONSE before overwriting m_current_query and
m_query_start_time—if so, first report or enqueue the prior query's stats (using
report_query_stats) or push it onto a queue, then set
m_current_query/m_query_start_time for the new execute; use the existing symbols
(process_server_message, m_state, IDLE, AWAITING_RESPONSE, 'C', 'E',
m_current_query, m_query_start_time, extract_pg_rows_affected,
report_query_stats) to locate and implement these two fixes.

---

Duplicate comments:
In `@lib/MySQL_Session.cpp`:
- Line 652: The reset() method in MySQL_Session must clear the FFTO-related
state to avoid carrying stale values across COM_CHANGE_USER/RESET; update
MySQL_Session::reset() to set ffto_bypassed = false and clear m_ffto (e.g.
nullptr or equivalent reset) so both ffto_bypassed and m_ffto are explicitly
reinitialized during reset.
- Around line 6026-6035: The server→client feed must enforce
mysql_thread___ffto_max_buffer_size and tear down/bypass FFTO when exceeded:
before calling MySQLFFTO::on_server_data in the block guarded by
mysql_thread___ffto_enabled/ffto_bypassed and using m_ffto, check the size of
the incoming packet plus the FFTO's current buffered bytes and if that sum would
exceed mysql_thread___ffto_max_buffer_size then call the teardown/bypass logic
(reset/cleanup m_ffto and set ffto_bypassed) instead of feeding data; if
MySQLFFTO lacks an accessor for current buffered bytes add a small accessor
(e.g., getBufferedBytes()/buffered_size()) or maintain a session-side counter
(m_ffto_buffered_bytes) updated when calling on_server_data so you can compare
against mysql_thread___ffto_max_buffer_size and enforce bypass/teardown.

In `@lib/MySQLFFTO.cpp`:
- Line 159: The checks that treat a packet as the CLIENT_DEPRECATE_EOF
terminator are using first_byte == 0x00 and len checks that allow misidentifying
binary row packets; update both occurrences (the conditional guarded by
m_session->client_myds->myconn->options.client_flag & CLIENT_DEPRECATE_EOF) to
test for first_byte == 0xFE and require len >= 9 (to distinguish from a legacy
5-byte EOF), so that the code path that calls report_query_stats, resets m_state
and observes m_rows_sent only triggers for the proper 0xFE terminator and not
for normal OK or binary row packets.
- Line 159: process_server_packet dereferences m_session->client_myds->myconn
unsafely; mirror report_query_stats by first computing a safe boolean (e.g.,
deprecate_eof) at the top of process_server_packet after verifying m_session,
m_session->client_myds, and m_session->client_myds->myconn are non-null, then
replace direct checks like (m_session->client_myds->myconn->options.client_flag
& CLIENT_DEPRECATE_EOF) in the branches at the current lines with that computed
deprecate_eof variable to avoid null dereference.

In `@lib/PgSQLFFTO.cpp`:
- Around line 91-95: In PgSQLFFTO::process_client_message the Simple Query ('Q')
payload is being stored into m_current_query including the PostgreSQL framing
trailing NUL, which causes pgsql_query_digest_and_first_comment to see an
embedded null and produce differing digests; fix process_client_message so when
type == 'Q' you detect and remove the final NUL (e.g., if len>0 and
payload[len-1]==0) and construct m_current_query from len-1 bytes (otherwise use
len), then set m_query_start_time and m_state as before, ensuring downstream
callers like pgsql_query_digest_and_first_comment receive the query without the
trailing NUL.
- Around line 57-68: In on_client_data, validate the parsed msg_len before using
msg_len - 4 to avoid underflow: after computing msg_len = ntohl(msg_len) and
before checking buffer bounds or calling process_client_message, ensure msg_len
>= 4 (and that m_client_buffer.size() >= 1 + msg_len); if the frame is malformed
(msg_len < 4 or the size check fails) handle it safely (e.g., log/error and
discard the offending bytes or break the loop) instead of computing msg_len - 4
and calling process_client_message with an enormous length; update logic around
m_client_buffer, msg_len, process_client_message, and the buffer erase to only
proceed when msg_len is valid.
- Around line 70-81: In on_server_data, guard against malformed frames by
verifying msg_len >= 4 before computing payload length and calling
process_server_message; if msg_len < 4 (or msg_len + 1 exceeds
m_server_buffer.size()), treat the frame as invalid (log or drop bytes) and
remove/skip the offending bytes to avoid infinite loops. Update the logic in
PgSQLFFTO::on_server_data to check msg_len >= 4 after ntohl(msg_len) and only
call process_server_message(type, payload, msg_len - 4) when that condition
holds, otherwise advance/clean m_server_buffer appropriately to recover.
- Around line 96-118: The handlers for message types 'P', 'B', and 'E' construct
std::string from reinterpret_cast<const char*>(payload) without ensuring a NUL
exists inside the message buffer, causing possible buffer-overreads; update the
code in PgSQLFFTO.cpp to validate the payload buffer length (the received length
parameter for messages) before creating strings: use a bounded search (e.g.
std::memchr) to find the terminating '\0' within the payload length for the
statement name/portal name and for the subsequent query pointer arithmetic, only
build std::string when the terminator is found, and otherwise treat the message
as malformed (log/ignore/close via on_close()); ensure when computing stmt_ptr
and copying stmt_name you calculate offsets using the found length instead of
relying on C-string termination so m_statements, m_portals, m_current_query,
m_query_start_time and m_state updates only happen for validated inputs.
- Around line 31-44: The regex in extract_pg_rows_affected is both wrong (\\S*
greedily swallows digits causing multi-digit counts to be mis-captured) and is
being recompiled on every call; change the pattern to use a
non-greedy/word-boundary form that reliably captures the trailing row count
(e.g. "(INSERT|UPDATE|DELETE|SELECT|MOVE|FETCH)\\b.*?\\b(\\d+)") and make the
std::regex a single compiled instance instead of constructing it on each call
(declare it static const outside or as a static local in
extract_pg_rows_affected); keep the rest of the logic (matches checking,
std::stoull, setting is_select for SELECT/FETCH/MOVE) the same.


Comment on lines +101 to +126
void MySQLFFTO::process_client_packet(const unsigned char* data, size_t len) {
if (len == 0) return;
uint8_t command = data[0];
if (command == _MYSQL_COM_QUERY) {
m_current_query = std::string(reinterpret_cast<const char*>(data + 1), len - 1);
m_query_start_time = monotonic_time();
m_state = AWAITING_RESPONSE;
m_affected_rows = 0; m_rows_sent = 0;
} else if (command == _MYSQL_COM_STMT_PREPARE) {
m_pending_prepare_query = std::string(reinterpret_cast<const char*>(data + 1), len - 1);
m_state = AWAITING_PREPARE_OK;
} else if (command == _MYSQL_COM_STMT_EXECUTE) {
if (len >= 5) {
uint32_t stmt_id; memcpy(&stmt_id, data + 1, 4);
auto it = m_statements.find(stmt_id);
if (it != m_statements.end()) {
m_current_query = it->second;
m_query_start_time = monotonic_time();
m_state = AWAITING_RESPONSE;
m_affected_rows = 0; m_rows_sent = 0;
}
}
} else if (command == _MYSQL_COM_QUIT) {
on_close();
}
}

⚠️ Potential issue | 🟡 Minor

COM_STMT_CLOSE not handled — m_statements grows without bound.

When a client sends COM_STMT_CLOSE, the statement ID is removed from the server's statement cache, but m_statements retains the entry indefinitely. For long-lived connections with repeated prepare/close cycles this is an unbounded memory leak.

🛡️ Proposed fix
     } else if (command == _MYSQL_COM_QUIT) {
         on_close();
+    } else if (command == _MYSQL_COM_STMT_CLOSE) {
+        if (len >= 5) {
+            uint32_t stmt_id;
+            memcpy(&stmt_id, data + 1, 4);
+            m_statements.erase(stmt_id);
+        }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/MySQLFFTO.cpp` around lines 101 - 126, In process_client_packet, handle
the COM_STMT_CLOSE case so m_statements doesn't leak: detect the
_MYSQL_COM_STMT_CLOSE command in MySQLFFTO::process_client_packet, extract the
4-byte statement id (same approach as in the _MYSQL_COM_STMT_EXECUTE branch),
and call m_statements.erase(stmt_id) (and any associated cleanup) to remove the
statement entry from the cache; ensure this is done safely only when len is
sufficient and the stmt_id exists in m_statements.

Comment on lines +121 to +138
void PgSQLFFTO::process_server_message(char type, const unsigned char* payload, size_t len) {
if (m_state == IDLE) return;
if (type == 'C') {
unsigned long long duration = monotonic_time() - m_query_start_time;
bool is_select = false;
uint64_t rows = extract_pg_rows_affected(payload, len, is_select);
if (is_select) report_query_stats(m_current_query, duration, 0, rows);
else report_query_stats(m_current_query, duration, rows, 0);
m_state = IDLE;
} else if (type == 'Z' || type == 'E') {
// ReadyForQuery or ErrorResponse. We don't always want to report stats here if 'C' already did it.
// But if we didn't get 'C', report what we have.
if (m_state == AWAITING_RESPONSE) {
unsigned long long duration = monotonic_time() - m_query_start_time;
report_query_stats(m_current_query, duration);
}
m_state = IDLE;
}

⚠️ Potential issue | 🟡 Minor

Multi-statement and pipelined queries produce incorrect digest statistics.

Two related tracking gaps:

  1. Simple Query batches — For 'Q' with "SELECT 1; SELECT 2", the simple query protocol sends a single textual query string that is parsed and immediately executed. The server returns a CommandComplete per statement. After the first 'C' sets m_state = IDLE (Line 129), all subsequent 'C' messages for the same batch are dropped by the if (m_state == IDLE) return; guard (Line 122). The reported rows_sent / affected_rows therefore reflect only the first statement, silently discarding the rest.

  2. Pipelined Extended Query — A client may send Parse → Bind → Execute → Parse → Bind → Execute → Sync without waiting for responses. In the extended-query protocol, execution of SQL commands is divided into multiple steps; the state is represented by prepared statements and portals. The second 'E' (Execute) handler on Lines 110–113 unconditionally overwrites m_current_query and m_query_start_time while m_state is already AWAITING_RESPONSE. When the first CommandComplete arrives, it reports the second query's text with the second query's start time — the first query's tracking is silently lost.

For the pipelining case, an AWAITING_RESPONSE check in the 'E' branch (reporting or queuing the prior query before overwriting) would prevent silent data loss. The multi-statement case requires accumulating rows across all 'C' messages until 'Z', which is a more substantial state-machine change.

🧰 Tools
🪛 Cppcheck (2.19.0)

[warning] 138-138: If memory allocation fails, then there is a possible null pointer dereference

(nullPointerOutOfMemory)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/PgSQLFFTO.cpp` around lines 121 - 138, process_server_message currently
drops subsequent CommandComplete messages and overwrites outstanding
extended-query tracking: modify the 'C' handling so it does not early-return on
m_state == IDLE but instead accumulates rows_sent/affected_rows across multiple
'C' messages (keep m_state until a terminal 'Z' and sum rows returned by
extract_pg_rows_affected into a per-query accumulator), and change the 'E'
(Execute) path to check if m_state == AWAITING_RESPONSE before overwriting
m_current_query and m_query_start_time—if so, first report or enqueue the prior
query's stats (using report_query_stats) or push it onto a queue, then set
m_current_query/m_query_start_time for the new execute; use the existing symbols
(process_server_message, m_state, IDLE, AWAITING_RESPONSE, 'C', 'E',
m_current_query, m_query_start_time, extract_pg_rows_affected,
report_query_stats) to locate and implement these two fixes.
