FFTO Implementation: MySQL & PostgreSQL Protocol Observation in Fast-Forward #5393
renecannao wants to merge 12 commits into v3.0
Conversation
- Added TrafficObserver base interface
- Implemented MySQLFFTO and PgSQLFFTO classes for protocol-aware observation
- Integrated FFTO hooks into MySQL and PostgreSQL session handlers in FAST_FORWARD state
- Added configuration variables: mysql-ffto_enabled, mysql-ffto_max_buffer_size, pgsql-ffto_enabled, pgsql-ffto_max_buffer_size
- Implemented query digest extraction and reporting parity with internal stats
- Added session-level bypass when packet size exceeds buffer threshold
- Included architectural documentation in doc/ffto_design.md
- Implemented statement ID tracking for MySQL COM_STMT_PREPARE/EXECUTE
- Implemented Parse/Bind/Execute portal mapping for PostgreSQL Extended Query
- Corrected documentation regarding binary protocol support
- Ensured metric parity for prepared statements in Fast Forward mode
- Added comprehensive MySQL FFTO test (text and binary protocols)
- Added comprehensive PgSQL FFTO test (Simple and Extended Query protocols)
- Added FFTO memory bypass logic test
- Added bash orchestration scripts for each test group
- Integrated new tests into the TAP Makefile
… scripts
Following the CI convention where all binaries ending in `-t` in `test/tap/tests/` are automatically executed, the subdirectories and `run.sh` scripts are no longer needed.
- Implemented MySQL OK_Packet parsing to extract affected_rows (see the sketch below)
- Implemented PostgreSQL CommandComplete parsing to extract rows affected/sent
- Updated TAP tests to validate sum_rows_affected and sum_rows_sent metrics
- Fixed incorrect assertion in test_ffto_mysql-t.cpp
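For context on the OK_Packet parsing above: `affected_rows` is the first length-encoded integer after the `0x00` header byte. A minimal sketch of the decoder, assuming a standalone helper (the name `read_lenenc_uint` is illustrative, not code from this PR):

```cpp
#include <cstdint>
#include <cstddef>

// Sketch only: MySQL length-encoded integer decoding, as needed to pull
// affected_rows out of an OK_Packet (payload[0] == 0x00, followed by two
// length-encoded integers: affected_rows and last_insert_id).
static bool read_lenenc_uint(const unsigned char* p, size_t len, size_t* pos, uint64_t* out) {
    if (*pos >= len) return false;
    unsigned char first = p[(*pos)++];
    size_t need = 0;
    if (first < 0xFB) { *out = first; return true; } // 1-byte immediate value
    else if (first == 0xFC) need = 2;                // 2-byte value follows
    else if (first == 0xFD) need = 3;                // 3-byte value follows
    else if (first == 0xFE) need = 8;                // 8-byte value follows
    else return false;                               // 0xFB (NULL) / 0xFF invalid here
    if (len - *pos < need) return false;             // truncated packet
    uint64_t v = 0;
    for (size_t i = 0; i < need; i++) v |= (uint64_t)p[*pos + i] << (8 * i); // little-endian
    *pos += need;
    *out = v;
    return true;
}
```

In an OK packet one would call this at offset 1 (just past the `0x00` header) to obtain affected_rows, then again for last_insert_id.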
- Resolved signature mismatch in report_query_stats
- Fixed missing type declarations for Query_Processor in FFTO classes
- Verified Metric Parity for affected_rows and rows_sent
- Confirmed full coverage in TAP tests for text/binary protocols and bypass logic
- Verified clean build with `make debug`
…andling
- Finalized MySQL and PostgreSQL protocol state machines.
- Implemented accurate affected_rows and rows_sent capture.
- Added defensive null checks to prevent early-session crashes.
- Enhanced TAP tests with result-set draining and success verification.
- Verified 100% pass rate for MySQL CRUD, binary protocol, and memory bypass.
📝 Walkthrough
Adds FFTO (Fast Forward Traffic Observer) infrastructure: a TrafficObserver interface and MySQL/PgSQL concrete observers that parse fast-forwarded protocol data, extract query metrics (duration, affected rows, rows sent), report digests, and integrate into session/thread handlers with runtime controls and buffer-size bypassing.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Session as MySQL/PgSQL<br/>Session
    participant Observer as MySQLFFTO/<br/>PgSQLFFTO
    participant Backend
    participant QP as Query<br/>Processor
    rect rgba(120, 180, 240, 0.5)
        Client->>Session: send query bytes (fast-forward)
        activate Session
        Session->>Observer: on_client_data(buf)
        activate Observer
        Observer->>Observer: parse/track query, start timer
        deactivate Observer
        Session->>Backend: forward bytes
        deactivate Session
    end
    rect rgba(200, 160, 120, 0.5)
        Backend->>Session: response bytes
        activate Session
        Session->>Observer: on_server_data(buf)
        activate Observer
        Observer->>Observer: parse response, extract rows/OK/CommandComplete
        Observer->>Observer: finalize metrics
        Observer->>QP: update_query_digest(digest, duration, rows_sent, affected_rows)
        deactivate Observer
        Session->>Client: forward response
        deactivate Session
    end
```
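The observer contract implied by this flow is small. A sketch of what the interface could look like, assuming the method names used in the diagram and in the session hooks quoted later in this review (the actual `include/TrafficObserver.hpp` may differ):

```cpp
#include <cstddef>

// Sketch only: the TrafficObserver contract implied by the sequence diagram.
// Concrete MySQLFFTO/PgSQLFFTO observers parse protocol bytes, track the
// in-flight query, and report a digest to the Query Processor when done.
class TrafficObserver {
public:
    virtual ~TrafficObserver() = default;
    // Raw client->server bytes observed while the session is in FAST_FORWARD.
    virtual void on_client_data(const char* data, size_t len) = 0;
    // Raw server->client bytes; responses are parsed and metrics finalized here.
    virtual void on_server_data(const char* data, size_t len) = 0;
    // Connection teardown; flush any pending per-query state.
    virtual void on_close() = 0;
};
```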
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
Summary of Changes
Hello @renecannao, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces the Fast Forward Traffic Observer (FFTO) into ProxySQL, enabling detailed query statistics collection for both MySQL and PostgreSQL connections even when they are operating in fast-forward mode. Previously, fast-forwarded connections bypassed all statistics gathering. The FFTO passively observes the protocol streams to extract critical metadata like SQL digests, execution latency, affected rows, and rows sent, ensuring metric parity with standard query processing while maintaining high performance and resource safety through a configurable bypass mechanism for large payloads.
Activity
Code Review
This pull request introduces the Fast Forward Traffic Observer (FFTO) for both MySQL and PostgreSQL, a significant feature enhancing observability for connections in Fast-Forward mode. The implementation is well-structured, introducing a TrafficObserver interface and specific implementations for each database, along with necessary changes to session management and configuration. The addition of documentation and comprehensive tests is commendable. My review focuses on improving code maintainability by addressing duplicated logic, fixing minor style issues, removing a temporary log message, and replacing unsafe function calls in the new tests to align with best practices.
```cpp
// Ensure user exists
char user_query[1024];
sprintf(user_query, "INSERT OR REPLACE INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('%s', '%s', 0, 1)", cl.username, cl.password);
```
Using `sprintf` is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use `snprintf` instead to ensure safety. This issue is also present on line 39.

```diff
- sprintf(user_query, "INSERT OR REPLACE INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('%s', '%s', 0, 1)", cl.username, cl.password);
+ snprintf(user_query, sizeof(user_query), "INSERT OR REPLACE INTO mysql_users (username, password, default_hostgroup, fast_forward) VALUES ('%s', '%s', 0, 1)", cl.username, cl.password);
```
```cpp
// Ensure backend server exists
char server_query[1024];
sprintf(server_query, "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);
```
Using `sprintf` is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use `snprintf` instead to ensure safety.

```diff
- sprintf(server_query, "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);
+ snprintf(server_query, sizeof(server_query), "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);
```
```cpp
void verify_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
    char query[1024];
    // Use a more relaxed LIKE pattern to handle potential normalization differences
    sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
```
Using `sprintf` is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use `snprintf` instead to ensure safety. This issue is also present on line 85.

```cpp
snprintf(query, sizeof(query), "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
```
```cpp
// Ensure backend server exists
char server_query[1024];
sprintf(server_query, "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);
```
Using `sprintf` is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use `snprintf` instead to ensure safety.

```cpp
snprintf(server_query, sizeof(server_query), "INSERT OR REPLACE INTO mysql_servers (hostgroup_id, hostname, port) VALUES (0, '%s', %d)", cl.mysql_host, cl.mysql_port);
```
```cpp
void verify_pg_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
    char query[1024];
    sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_pgsql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
```
Using `sprintf` is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use `snprintf` instead to ensure safety. This issue is also present on lines 84 and 92.

```cpp
snprintf(query, sizeof(query), "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_pgsql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
```

```cpp
sprintf(conninfo, "host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable",
        cl.pgsql_host, cl.pgsql_port, cl.pgsql_root_username, cl.pgsql_root_password);
```
Using `sprintf` is unsafe as it doesn't perform bounds checking, which can lead to buffer overflows. It's recommended to use `snprintf` instead to ensure safety.

```cpp
snprintf(conninfo, sizeof(conninfo), "host=%s port=%d user=%s password=%s dbname=postgres sslmode=disable",
         cl.pgsql_host, cl.pgsql_port, cl.pgsql_root_username, cl.pgsql_root_password);
```

```cpp
std::unique_ptr<TrafficObserver> m_ffto;
bool ffto_bypassed;
```
```cpp
/**
 * @class PgSQLFFTO
 * @brief PostgreSQL-specific implementation of TrafficObserver.
 */
```
```cpp
if (mysql_thread___ffto_enabled && !ffto_bypassed) {
    if (pkt.size > (size_t)mysql_thread___ffto_max_buffer_size) {
        ffto_bypassed = true;
        m_ffto.reset();
    } else {
        if (!m_ffto) {
            m_ffto = std::make_unique<MySQLFFTO>(this);
        }
        if (m_ffto) {
            m_ffto->on_client_data((const char*)pkt.ptr, pkt.size);
        }
    }
}
```
This block of code for handling FFTO logic is duplicated in a few places in this file (e.g., GPFC_Statuses2 for the CONNECTING_SERVER case, and in GPFC_WaitingClientData_FastForwardSession). To improve maintainability and reduce redundancy, consider extracting this logic into a private helper method within the MySQL_Session class, for example:
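A hypothetical private helper along these lines could centralize the logic (the method name is illustrative, not taken from the PR; `PtrSize_t` matches the packet type used in the duplicated blocks):

```cpp
// Sketch only: one place for the per-packet FFTO gating that is currently
// duplicated across session handlers. Honors the enable flag, the
// session-level bypass, and the max-buffer-size threshold.
void MySQL_Session::ffto_observe_client_packet(const PtrSize_t& pkt) {
    if (!mysql_thread___ffto_enabled || ffto_bypassed) return;
    if (pkt.size > (size_t)mysql_thread___ffto_max_buffer_size) {
        ffto_bypassed = true; // oversized payload: stop observing this session
        m_ffto.reset();
        return;
    }
    if (!m_ffto) {
        m_ffto = std::make_unique<MySQLFFTO>(this);
    }
    m_ffto->on_client_data((const char*)pkt.ptr, pkt.size);
}
```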
```cpp
}
#endif /* PROXYSQLGENAI */
ProxySQL_Main_init_Admin_module(boostrap_info);
proxy_info("TEST_LOG: ProxySQL is starting\n");
```
Actionable comments posted: 16
🧹 Nitpick comments (5)
include/MySQLFFTO.hpp (1)
31-31: Use `override` instead of `virtual` on the destructor for consistency.
`virtual` is redundant here because the base already declares the destructor `virtual`; `override` is more explicit and catches base-class changes.
♻️ Proposed fix

```diff
- virtual ~MySQLFFTO();
+ ~MySQLFFTO() override;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/MySQLFFTO.hpp` at line 31, The destructor declaration "virtual ~MySQLFFTO();" should use override for clarity and safety; replace it with "~MySQLFFTO() override;" in the MySQLFFTO class declaration so the compiler enforces that it overrides a virtual base destructor (refer to the MySQLFFTO::~MySQLFFTO symbol).

include/TrafficObserver.hpp (2)
16-16: Prefer `= default` for the virtual destructor.
`virtual ~TrafficObserver() {}` is functionally equivalent but `= default` is the idiomatic modern C++ form.
♻️ Proposed fix

```diff
- virtual ~TrafficObserver() {}
+ virtual ~TrafficObserver() = default;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/TrafficObserver.hpp` at line 16, Replace the empty user-defined virtual destructor implementation for the TrafficObserver interface with the modern defaulted form: change the virtual destructor in class TrafficObserver from the empty-body form (virtual ~TrafficObserver() {}) to the defaulted declaration (virtual ~TrafficObserver() = default;), keeping it virtual to preserve polymorphic deletion semantics.
4-4: Prefer `<cstddef>` over `<stddef.h>` in C++ code.
`<stddef.h>` is the C header; `<cstddef>` is the C++ counterpart placing `size_t` cleanly in `std::`.
♻️ Proposed fix

```diff
-#include <stddef.h>
+#include <cstddef>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/TrafficObserver.hpp` at line 4, Replace the C header include with the C++ equivalent: change the include of <stddef.h> to <cstddef> in the header that declares TrafficObserver (so symbols like size_t come from the C++ header); ensure any uses of size_t either qualify as std::size_t or have an appropriate using-declaration if needed.

include/MySQL_Thread.h (1)
418-419: `ffto_*` fields placed in the middle of the `monitor_*` variables block.
`ffto_enabled` and `ffto_max_buffer_size` are inserted between `monitor_replication_lag_group_by_host` and the `monitor_replication_lag_interval` group, fragmenting the logical monitor-variable cluster. Moving them to the end of the struct (or alongside the other non-monitor feature booleans such as `commands_stats`, `servers_stats`, etc.) would improve scanability and make the grouping intent clear.
Additionally, `int ffto_max_buffer_size` permits negative values; while the existing codebase uses `int` for many size fields, downstream consumers will need to guard against `<= 0` inputs. Ensure `set_variable`/`refresh_variables` validates the range accordingly.
♻️ Suggested placement (move to end of struct, before closing brace at ~line 600)

```diff
-	bool ffto_enabled;
-	int ffto_max_buffer_size;
 	//! How frequently a replication lag check is performed. Unit: 'ms'.
 	int monitor_replication_lag_interval;
 	// ... rest of monitor_* block unchanged ...
+
+	// FFTO (Fast-Forward Traffic Observer)
+	bool ffto_enabled;
+	int ffto_max_buffer_size;
 } variables;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/MySQL_Thread.h` around lines 418 - 419, Move the two fields ffto_enabled and ffto_max_buffer_size out of the middle of the monitor_* block and place them with other non-monitor feature flags (for example alongside commands_stats and servers_stats) at the end of the struct to restore logical grouping; then add validation in set_variable and refresh_variables to guard ffto_max_buffer_size against non-positive values (treat <=0 as invalid and either clamp to a safe default or return an error), ensuring any assignment path checks ffto_max_buffer_size before use.

include/PgSQLFFTO.hpp (1)
11-22: Remove the redundant first Doxygen block.
Lines 11–14 duplicate the class documentation already provided in lines 15–22. Having two `@class`/`@brief` entries for the same symbol confuses Doxygen and pollutes generated docs.
♻️ Proposed fix

```diff
-/**
- * @class PgSQLFFTO
- * @brief PostgreSQL-specific implementation of TrafficObserver.
- */
 /**
  * @class PgSQLFFTO
  * @brief Observer class for PostgreSQL traffic in Fast-Forward mode.
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@include/PgSQLFFTO.hpp` around lines 11 - 22, Remove the duplicate Doxygen block for the PgSQLFFTO class: keep the more detailed second block (the one that starts with "Observer class for PostgreSQL traffic in Fast-Forward mode") and delete the earlier, redundant /** `@class` PgSQLFFTO `@brief` ... */ block so there is only a single Doxygen comment describing PgSQLFFTO.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@doc/ffto_design.md`:
- Around line 86-88: The MySQL bullet contains a grammar/typo issue: change
"Corrects captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement
IDs to their respective SQL text." to a clear sentence such as "MySQL Binary
Protocol: captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement
IDs to their respective SQL text."—update the wording around the
`COM_STMT_PREPARE` and `COM_STMT_EXECUTE` references so the verb and subject
agree and the sentence reads smoothly.
In `@include/MySQL_Session.h`:
- Around line 489-494: The new class member declarations (Session_Regex
**match_regexes, std::unique_ptr<TrafficObserver> m_ffto, bool ffto_bypassed,
ProxySQL_Node_Address * proxysql_node_address) are over-indented; align them
with the surrounding members (use the same single-tab indentation as members
like server_myds and use_ldap_auth) so their indentation level matches the class
style and improves readability.
In `@include/MySQLFFTO.hpp`:
- Line 76: process_client_packet currently doesn't handle _MYSQL_COM_STMT_CLOSE,
so m_statements (the unordered_map of stmt_id to SQL) is never cleared; add a
branch in process_client_packet to detect command == _MYSQL_COM_STMT_CLOSE,
extract the 4‑byte stmt_id from the packet payload (little‑endian uint32), and
call m_statements.erase(stmt_id) to remove the prepared statement entry (this
complements where process_server_packet inserts entries for prepared
statements).
In `@lib/MySQL_Session.cpp`:
- Around line 6026-6035: The current server→client handling ignores the session
bypass and buffer-size limit: ensure you only run FFTO when
mysql_thread___ffto_enabled is true AND ffto_bypassed is false, do not recreate
m_ffto here (remove/avoid the std::make_unique<MySQLFFTO>(this) in this block —
creation should remain at session init), and before calling
m_ffto->on_server_data(...) enforce mysql_thread___ffto_max_buffer_size by
skipping or truncating payloads whose .size exceeds that threshold; operate on
mybe->server_myds->PSarrayIN->pdata as shown but gated by these checks so you
honor the session-level bypass and the max buffer size.
- Line 652: reset() currently doesn’t clear session-level FFTO state, so add
explicit resets for ffto_bypassed and m_ffto after the existing fast-forward
state cleanup; update MySQL_Session::reset() to set ffto_bypassed = false and
m_ffto = nullptr (or equivalent cleared state for m_ffto) so stale FFTO observer
state and the bypass flag cannot persist across COM_CHANGE_USER or
COM_RESET_CONNECTION.
In `@lib/MySQLFFTO.cpp`:
- Line 159: The code unsafely dereferences
m_session->client_myds->myconn->options.client_flag in process_server_packet
(used at the first_byte checks) without the same null guards present in
report_query_stats; compute a local bool deprecate_eof at the start of
process_server_packet by checking m_session, m_session->client_myds and
m_session->client_myds->myconn before reading options.client_flag &
CLIENT_DEPRECATE_EOF, then replace both direct dereferences (the branches that
check first_byte == 0x00 and the other occurrence) with this local deprecate_eof
variable so the pointer chain is only evaluated once and safely guarded.
- Around line 162-176: The check for CLIENT_DEPRECATE_EOF uses the wrong packet
header byte: replace comparisons of first_byte == 0x00 with first_byte == 0xFE
in the EOF-detection logic so OK packets are correctly detected when
CLIENT_DEPRECATE_EOF is set; specifically update the condition in the
READING_COLUMNS handling and the condition in the READING_ROWS handling that
reference (m_session->client_myds->myconn->options.client_flag &
CLIENT_DEPRECATE_EOF) and first_byte, ensuring m_state transitions and
report_query_stats() are only triggered when first_byte == 0xFE.
In `@lib/PgSQL_Session.cpp`:
- Around line 2288-2300: Server-side FFTO processing currently calls
m_ffto->on_server_data(...) without bounding packet size; add the same guard
used for client packets: when pgsql_thread___ffto_enabled && !ffto_bypassed and
before calling PgSQLFFTO::on_server_data, check if pkt.size >
(size_t)pgsql_thread___ffto_max_buffer_size; if so set ffto_bypassed = true and
m_ffto.reset(), otherwise ensure m_ffto is created
(std::make_unique<PgSQLFFTO>(this)) if needed and then call
m_ffto->on_server_data((const char*)pkt.ptr, pkt.size). This mirrors the
client-side logic around pgsql_thread___ffto_max_buffer_size, ffto_bypassed, and
m_ffto to prevent unbounded server-side buffering.
In `@lib/PgSQLFFTO.cpp`:
- Around line 60-67: The loop reads a uint32_t msg_len and then passes msg_len -
4 to process_client_message (and similarly in on_server_data), which underflows
for msg_len < 4; add an explicit guard after decoding msg_len that
rejects/handles any msg_len < 4 (and optionally an upper-bound sanity check)
before computing msg_len - 4 or using it as a size_t, and return/skip/close the
connection on malformed frames; update the client loop using m_client_buffer,
the call to process_client_message(type, payload, msg_len - 4) and the
corresponding server loop/on_server_data that calls process_server_message to
use this validation so no underflow/OOB occurs.
- Line 93: The code sets m_current_query using all len bytes from the 'Q'
payload, which includes the PostgreSQL trailing NUL, causing
pgsql_query_digest_and_first_comment to see a different string; change the
assignment to strip the trailing NUL when present (e.g., check len>0 and
payload[len-1]==0 and use length (len-1) when constructing m_current_query, or
use strlen on payload) so m_current_query does not include the framing NUL
before calling pgsql_query_digest_and_first_comment.
- Around line 31-44: The function extract_pg_rows_affected builds command_tag
then uses a flawed regex that mis-captures multi-digit counts and re-compiles on
every call; replace the regex approach with simple string parsing: trim
command_tag, find the last space (or last token) to extract the numeric suffix
as rows (use std::stoull) and determine the command type by examining the
leading token (e.g., compare the first word to
"SELECT","FETCH","MOVE","UPDATE","DELETE","INSERT") to set is_select
appropriately; if you prefer to keep regex, make the regex static and change it
to unambiguous pattern (e.g., match start-of-string command word then \\s+ then
(\\d+)$) to both fix correctness and avoid recompiling per call—locate
extract_pg_rows_affected and command_tag to apply the change.
- Around line 96-118: Parsing of 'P', 'B', and 'E' messages reads
null-terminated strings without bounds checks causing buffer over-reads; update
the Parse/Bind/Execute handling to use strnlen (or equivalent) against the
available payload length to verify each null terminator lies within the provided
length before creating std::string or advancing pointers. For the 'P' case
validate stmt_name length+1 <= payload_len then create stmt_name with the
bounded length and ensure the query pointer + query_length is inside payload;
for the 'B' case validate portal_name and stmt_name each terminate within
remaining bytes before assigning m_portals[portal_name] = stmt_name; for the 'E'
case validate portal_name termination before lookup. On any validation failure,
handle safely (e.g., call on_close() or drop the message) instead of performing
the unchecked reads; refer to symbols m_statements, m_portals, m_current_query,
m_query_start_time, monotonic_time and on_close to locate where to add these
checks.
In `@src/main.cpp`:
- Line 1508: Remove the test-only log artifact from production by eliminating or
conditionalizing the proxy_info call that prints "TEST_LOG: ProxySQL is
starting\n" in main.cpp: either delete the proxy_info("TEST_LOG: ...")
invocation entirely, or wrap it in a test-only conditional (e.g., `#ifdef`
TEST_BUILD / runtime test flag) so that proxy_info is not emitted in production;
ensure any remaining startup logs use production-safe messages or move test-only
signaling into the test harness instead of using the proxy_info call.
In `@test/tap/tests/test_ffto_bypass-t.cpp`:
- Around line 52-64: The test executes a large query with mysql_query(conn,
large_query.c_str()) without asserting it succeeded and then checks for a
fragile LIKE match; change to use the test helper/MACRO that asserts execution
(e.g., MYSQL_QUERY or equivalent) to ensure the query actually ran, and replace
the run_q call that uses "SELECT count(*) FROM stats_mysql_query_digest WHERE
digest_text LIKE '%xxxx%'" with a simple total count query "SELECT count(*) FROM
stats_mysql_query_digest" and assert that this total count is 0 (use the
existing mysql_store_result/mysql_fetch_row/ok pattern to read and assert the
count).
In `@test/tap/tests/test_ffto_mysql-t.cpp`:
- Around line 25-57: In verify_digest, ensure the TAP plan emits the same number
of ok() assertions on all paths: when run_q(admin, query) fails (rc != 0) and
when no row is returned, add the two missing ok() calls (matching the
success-path checks for rows_affected and rows_sent) so the function always
emits three ok() calls; update the failure messages to indicate the digest
lookup failed for the given template_text and keep references to the same
diagnostic output; modify the branches around run_q(admin, query) and the "no
row" else to call ok(...) twice more (for expected_rows_affected and
expected_rows_sent) in addition to the existing failure ok() to preserve TAP
plan stability.
In `@test/tap/tests/test_ffto_pgsql-t.cpp`:
- Around line 27-57: The failure branch in verify_pg_digest currently calls only
one ok(0, ...) when no row is found, which desyncs the TAP plan; update the else
branch in verify_pg_digest to also emit the two additional ok(0, ...) assertions
for rows_affected and rows_sent (using template_text and
expected_rows_affected/expected_rows_sent in the messages, similar to the
success branch) so the total number of ok() calls matches both success and
failure paths, and keep existing diagnostics and mysql_free_result/res cleanup
unchanged.
---
Nitpick comments:
In `@include/MySQL_Thread.h`:
- Around line 418-419: Move the two fields ffto_enabled and ffto_max_buffer_size
out of the middle of the monitor_* block and place them with other non-monitor
feature flags (for example alongside commands_stats and servers_stats) at the
end of the struct to restore logical grouping; then add validation in
set_variable and refresh_variables to guard ffto_max_buffer_size against
non-positive values (treat <=0 as invalid and either clamp to a safe default or
return an error), ensuring any assignment path checks ffto_max_buffer_size
before use.
In `@include/MySQLFFTO.hpp`:
- Line 31: The destructor declaration "virtual ~MySQLFFTO();" should use
override for clarity and safety; replace it with "~MySQLFFTO() override;" in the
MySQLFFTO class declaration so the compiler enforces that it overrides a virtual
base destructor (refer to the MySQLFFTO::~MySQLFFTO symbol).
In `@include/PgSQLFFTO.hpp`:
- Around line 11-22: Remove the duplicate Doxygen block for the PgSQLFFTO class:
keep the more detailed second block (the one that starts with "Observer class
for PostgreSQL traffic in Fast-Forward mode") and delete the earlier, redundant
/** `@class` PgSQLFFTO `@brief` ... */ block so there is only a single Doxygen
comment describing PgSQLFFTO.
In `@include/TrafficObserver.hpp`:
- Line 16: Replace the empty user-defined virtual destructor implementation for
the TrafficObserver interface with the modern defaulted form: change the virtual
destructor in class TrafficObserver from the empty-body form (virtual
~TrafficObserver() {}) to the defaulted declaration (virtual ~TrafficObserver()
= default;), keeping it virtual to preserve polymorphic deletion semantics.
- Line 4: Replace the C header include with the C++ equivalent: change the
include of <stddef.h> to <cstddef> in the header that declares TrafficObserver
(so symbols like size_t come from the C++ header); ensure any uses of size_t
either qualify as std::size_t or have an appropriate using-declaration if
needed.
```
- **Text and Binary Protocols**: FFTO supports both standard text-based queries and the binary protocol used by prepared statements.
- **MySQL Binary Protocol**: Corrects captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
- **PostgreSQL Extended Query**: Supports the multi-phase `Parse` -> `Bind` -> `Execute` sequence by tracking Statement and Portal mappings.
```
Typo in MySQL binary protocol bullet.
Small grammar fix improves readability.
✏️ Suggested edit
```diff
- **MySQL Binary Protocol**: Corrects captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
+ **MySQL Binary Protocol**: Correctly captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```
- **Text and Binary Protocols**: FFTO supports both standard text-based queries and the binary protocol used by prepared statements.
- **MySQL Binary Protocol**: Correctly captures `COM_STMT_PREPARE` and `COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text.
- **PostgreSQL Extended Query**: Supports the multi-phase `Parse` -> `Bind` -> `Execute` sequence by tracking Statement and Portal mappings.
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@doc/ffto_design.md` around lines 86 - 88, The MySQL bullet contains a
grammar/typo issue: change "Corrects captures `COM_STMT_PREPARE` and
`COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text." to a
clear sentence such as "MySQL Binary Protocol: captures `COM_STMT_PREPARE` and
`COM_STMT_EXECUTE`, tracking statement IDs to their respective SQL text."—update
the wording around the `COM_STMT_PREPARE` and `COM_STMT_EXECUTE` references so
the verb and subject agree and the sentence reads smoothly.
```cpp
Session_Regex **match_regexes;
std::unique_ptr<TrafficObserver> m_ffto;
bool ffto_bypassed;

ProxySQL_Node_Address * proxysql_node_address;
```
Indentation inconsistency — new members are over-indented relative to surrounding class members.
Lines 489–494 carry 3 levels of indentation while adjacent members (server_myds at line 411, use_ldap_auth at line 496) use a single tab.
♻️ Proposed fix
```diff
-			Session_Regex **match_regexes;
-			std::unique_ptr<TrafficObserver> m_ffto;
-			bool ffto_bypassed;
-
-	ProxySQL_Node_Address * proxysql_node_address;
-
+	Session_Regex **match_regexes;
+	std::unique_ptr<TrafficObserver> m_ffto;
+	bool ffto_bypassed = false;
+	ProxySQL_Node_Address * proxysql_node_address;
```

📝 Committable suggestion
```cpp
	Session_Regex **match_regexes;
	std::unique_ptr<TrafficObserver> m_ffto;
	bool ffto_bypassed = false;
	ProxySQL_Node_Address * proxysql_node_address;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@include/MySQL_Session.h` around lines 489 - 494, The new class member
declarations (Session_Regex **match_regexes, std::unique_ptr<TrafficObserver>
m_ffto, bool ffto_bypassed, ProxySQL_Node_Address * proxysql_node_address) are
over-indented; align them with the surrounding members (use the same single-tab
indentation as members like server_myds and use_ldap_auth) so their indentation
level matches the class style and improves readability.
```cpp
uint64_t m_affected_rows; ///< Number of rows affected by the current DML query.
uint64_t m_rows_sent;     ///< Number of rows returned by the current SELECT query.

std::unordered_map<uint32_t, std::string> m_statements; ///< Map of stmt_id to original SQL query for prepared statements.
```
🧩 Analysis chain
🏁 Script executed:

```bash
# Search for COM_STMT_CLOSE handling and m_statements usage
rg -n "COM_STMT_CLOSE" --type cpp -C 5
```

Repository: sysown/proxysql
Length of output: 4706

🏁 Script executed:

```bash
# Search for m_statements erasure/cleanup patterns
rg -n "m_statements.*erase|erase.*m_statements" --type cpp -B 2 -A 2
```

Repository: sysown/proxysql
Length of output: 41

🏁 Script executed:

```bash
# Find MySQLFFTO.cpp implementation
fd -e cpp "MySQLFFTO"
```

Repository: sysown/proxysql
Length of output: 76

🏁 Script executed:

```bash
# First, check the size and structure of MySQLFFTO.cpp
wc -l lib/MySQLFFTO.cpp
```

Repository: sysown/proxysql
Length of output: 80

🏁 Script executed:

```bash
# Search for process_client_packet or similar methods in MySQLFFTO.cpp
rg -n "process_client_packet|COM_STMT_CLOSE|m_statements" lib/MySQLFFTO.cpp -B 2 -A 5
```

Repository: sysown/proxysql
Length of output: 1646

🏁 Script executed:

```bash
cat -n lib/MySQLFFTO.cpp
```

Repository: sysown/proxysql
Length of output: 11182
Add COM_STMT_CLOSE handler to remove entries from m_statements.
The process_client_packet method does not handle _MYSQL_COM_STMT_CLOSE commands. The m_statements map (which stores prepared statement IDs) is populated when statements are prepared (line 135 in process_server_packet) but never cleared. In long-lived fast-forward sessions with frequent prepared-statement churn, this results in unbounded map growth for the session's lifetime.
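A sketch of the missing branch, under the assumption that `process_client_packet` exposes the command byte and the remaining payload (the variable names here are illustrative):

```cpp
// Sketch only: evict the prepared-statement entry when the client closes it.
// COM_STMT_CLOSE carries a 4-byte little-endian statement id after the
// command byte; `payload`/`payload_len` are assumed to point past that byte.
if (command == _MYSQL_COM_STMT_CLOSE) {
    if (payload_len >= 4) {
        uint32_t stmt_id = (uint32_t)(unsigned char)payload[0]
                         | ((uint32_t)(unsigned char)payload[1] << 8)
                         | ((uint32_t)(unsigned char)payload[2] << 16)
                         | ((uint32_t)(unsigned char)payload[3] << 24);
        m_statements.erase(stmt_id); // bounds the map in long-lived sessions
    }
}
```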
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@include/MySQLFFTO.hpp` at line 76, process_client_packet currently doesn't
handle _MYSQL_COM_STMT_CLOSE, so m_statements (the unordered_map of stmt_id to
SQL) is never cleared; add a branch in process_client_packet to detect command
== _MYSQL_COM_STMT_CLOSE, extract the 4‑byte stmt_id from the packet payload
(little‑endian uint32), and call m_statements.erase(stmt_id) to remove the
prepared statement entry (this complements where process_server_packet inserts
entries for prepared statements).
```cpp
user_attributes=NULL;
schema_locked=false;
session_fast_forward=SESSION_FORWARD_TYPE_NONE;
ffto_bypassed=false;
```
🧩 Analysis chain
🏁 Script executed:

```bash
# Find MySQL_Session::reset definition
rg -n "void MySQL_Session::reset" lib/MySQL_Session.cpp
# Check COM_CHANGE_USER and COM_RESET_CONNECTION usage in the file
rg -n "COM_CHANGE_USER|COM_RESET_CONNECTION" lib/MySQL_Session.cpp
# Find all occurrences of ffto_bypassed and m_ffto
rg -n "ffto_bypassed|m_ffto" --glob="*.cpp" --glob="*.hpp"
```

Repository: sysown/proxysql
Length of output: 4672

🏁 Script executed:

```bash
# Read the MySQL_Session::reset() method starting at line 709
sed -n '709,800p' lib/MySQL_Session.cpp | cat -n
```

Repository: sysown/proxysql
Length of output: 2935

🏁 Script executed:

```bash
# Read the COM_CHANGE_USER handler
sed -n '8313,8400p' lib/MySQL_Session.cpp | cat -n
# Also check COM_RESET_CONNECTION handler
sed -n '8398,8450p' lib/MySQL_Session.cpp | cat -n
```

Repository: sysown/proxysql
Length of output: 7001
Reset FFTO session state in MySQL_Session::reset().
ffto_bypassed and m_ffto are session-level state not cleared by reset(). Since COM_CHANGE_USER and COM_RESET_CONNECTION invoke reset() to establish a new logical session, stale FFTO observer state or the bypassed flag could persist across session resets, skewing metrics or unexpectedly skipping FFTO in the new session.
Add the following lines to reset() after the existing fast-forward state cleanup:
```diff
 	backend_closed_in_fast_forward = false;
 	fast_forward_grace_start_time = 0;
+	ffto_bypassed = false;
+	m_ffto.reset();
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/MySQL_Session.cpp` at line 652, reset() currently doesn’t clear
session-level FFTO state, so add explicit resets for ffto_bypassed and m_ffto
after the existing fast-forward state cleanup; update MySQL_Session::reset() to
set ffto_bypassed = false and m_ffto = nullptr (or equivalent cleared state for
m_ffto) so stale FFTO observer state and the bypass flag cannot persist across
COM_CHANGE_USER or COM_RESET_CONNECTION.
lib/MySQL_Session.cpp
Outdated
```cpp
if (mysql_thread___ffto_enabled) {
    if (!m_ffto) {
        m_ffto = std::make_unique<MySQLFFTO>(this);
    }
    if (m_ffto) {
        for (unsigned int i = 0; i < mybe->server_myds->PSarrayIN->len; i++) {
            m_ffto->on_server_data((const char*)mybe->server_myds->PSarrayIN->pdata[i].ptr, mybe->server_myds->PSarrayIN->pdata[i].size);
        }
    }
}
```
Honor FFTO bypass and size limits for server→client packets.
This block ignores ffto_bypassed and recreates m_ffto after a bypass, which defeats the session-level safety switch. It also doesn’t apply mysql_thread___ffto_max_buffer_size to server payloads, so large resultsets can still drive excessive buffering.
Recommend guarding with !ffto_bypassed, avoiding re-creation here, and applying the same size threshold to server packets.
🐛 Suggested fix
```diff
-	if (mysql_thread___ffto_enabled) {
-		if (!m_ffto) {
-			m_ffto = std::make_unique<MySQLFFTO>(this);
-		}
-		if (m_ffto) {
-			for (unsigned int i = 0; i < mybe->server_myds->PSarrayIN->len; i++) {
-				m_ffto->on_server_data((const char*)mybe->server_myds->PSarrayIN->pdata[i].ptr, mybe->server_myds->PSarrayIN->pdata[i].size);
-			}
-		}
-	}
+	if (mysql_thread___ffto_enabled && !ffto_bypassed && m_ffto) {
+		for (unsigned int i = 0; i < mybe->server_myds->PSarrayIN->len; i++) {
+			const auto& p = mybe->server_myds->PSarrayIN->pdata[i];
+			if (p.size > (size_t)mysql_thread___ffto_max_buffer_size) {
+				ffto_bypassed = true;
+				m_ffto.reset();
+				break;
+			}
+			m_ffto->on_server_data((const char*)p.ptr, p.size);
+		}
+	}
```

🤖 Prompt for AI Agents
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/MySQL_Session.cpp` around lines 6026 - 6035, The current server→client
handling ignores the session bypass and buffer-size limit: ensure you only run
FFTO when mysql_thread___ffto_enabled is true AND ffto_bypassed is false, do not
recreate m_ffto here (remove/avoid the std::make_unique<MySQLFFTO>(this) in this
block — creation should remain at session init), and before calling
m_ffto->on_server_data(...) enforce mysql_thread___ffto_max_buffer_size by
skipping or truncating payloads whose .size exceeds that threshold; operate on
mybe->server_myds->PSarrayIN->pdata as shown but gated by these checks so you
honor the session-level bypass and the max buffer size.
```cpp
} else if (type == 'P') {
    std::string stmt_name = reinterpret_cast<const char*>(payload);
    const char* query = reinterpret_cast<const char*>(payload) + stmt_name.length() + 1;
    m_statements[stmt_name] = query;
} else if (type == 'B') {
    std::string portal_name = reinterpret_cast<const char*>(payload);
    const char* stmt_ptr = reinterpret_cast<const char*>(payload) + portal_name.length() + 1;
    std::string stmt_name = stmt_ptr;
    m_portals[portal_name] = stmt_name;
} else if (type == 'E') {
    std::string portal_name = reinterpret_cast<const char*>(payload);
    auto pit = m_portals.find(portal_name);
    if (pit != m_portals.end()) {
        auto sit = m_statements.find(pit->second);
        if (sit != m_statements.end()) {
            m_current_query = sit->second;
            m_query_start_time = monotonic_time();
            m_state = AWAITING_RESPONSE;
        }
    }
} else if (type == 'X') {
    on_close();
}
```
No bounds check before reading null-terminated strings in Parse ('P'), Bind ('B'), and Execute ('E') messages.
Lines 97, 98, 101, 102, and 106 all construct std::string or raw pointers from reinterpret_cast<const char*>(payload) treating it as a null-terminated string, with no verification that a null byte falls within the [payload, payload+len) region. A malformed (or truncated) message causes a buffer over-read into adjacent m_client_buffer memory.
🐛 Proposed fix — use strnlen for all null-terminated reads
```diff
 } else if (type == 'P') {
-    std::string stmt_name = reinterpret_cast<const char*>(payload);
-    const char* query = reinterpret_cast<const char*>(payload) + stmt_name.length() + 1;
-    m_statements[stmt_name] = query;
+    size_t sn_len = strnlen(reinterpret_cast<const char*>(payload), len);
+    if (sn_len >= len) return; // no null terminator within bounds
+    std::string stmt_name(reinterpret_cast<const char*>(payload), sn_len);
+    size_t q_off = sn_len + 1;
+    if (q_off >= len) return;
+    size_t q_len = strnlen(reinterpret_cast<const char*>(payload) + q_off, len - q_off);
+    m_statements[stmt_name] = std::string(reinterpret_cast<const char*>(payload) + q_off, q_len);
 } else if (type == 'B') {
-    std::string portal_name = reinterpret_cast<const char*>(payload);
-    const char* stmt_ptr = reinterpret_cast<const char*>(payload) + portal_name.length() + 1;
-    std::string stmt_name = stmt_ptr;
+    size_t pn_len = strnlen(reinterpret_cast<const char*>(payload), len);
+    if (pn_len >= len) return;
+    std::string portal_name(reinterpret_cast<const char*>(payload), pn_len);
+    size_t sn_off = pn_len + 1;
+    if (sn_off >= len) return;
+    std::string stmt_name(reinterpret_cast<const char*>(payload) + sn_off,
+                          strnlen(reinterpret_cast<const char*>(payload) + sn_off, len - sn_off));
     m_portals[portal_name] = stmt_name;
 } else if (type == 'E') {
-    std::string portal_name = reinterpret_cast<const char*>(payload);
+    size_t pn_len = strnlen(reinterpret_cast<const char*>(payload), len);
+    if (pn_len >= len) return;
+    std::string portal_name(reinterpret_cast<const char*>(payload), pn_len);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/PgSQLFFTO.cpp` around lines 96 - 118, Parsing of 'P', 'B', and 'E'
messages reads null-terminated strings without bounds checks causing buffer
over-reads; update the Parse/Bind/Execute handling to use strnlen (or
equivalent) against the available payload length to verify each null terminator
lies within the provided length before creating std::string or advancing
pointers. For the 'P' case validate stmt_name length+1 <= payload_len then
create stmt_name with the bounded length and ensure the query pointer +
query_length is inside payload; for the 'B' case validate portal_name and
stmt_name each terminate within remaining bytes before assigning
m_portals[portal_name] = stmt_name; for the 'E' case validate portal_name
termination before lookup. On any validation failure, handle safely (e.g., call
on_close() or drop the message) instead of performing the unchecked reads; refer
to symbols m_statements, m_portals, m_current_query, m_query_start_time,
monotonic_time and on_close to locate where to add these checks.
```cpp
}
#endif /* PROXYSQLGENAI */
ProxySQL_Main_init_Admin_module(boostrap_info);
proxy_info("TEST_LOG: ProxySQL is starting\n");
```
Remove the TEST_LOG: debug/test artifact from production startup logging.
proxy_info("TEST_LOG: ProxySQL is starting\n") emits unconditionally to the production error log on every startup. The TEST_LOG: prefix is a test-harness artifact and will appear in operator-visible production logs, polluting structured log pipelines. If TAP tests grep for this marker to detect startup readiness, that couples test infrastructure directly to a production log string — a fragile, non-standard approach.
Standard alternatives:
- Remove the line entirely and use connection probing in the TAP tests to detect startup.
- If a startup marker is genuinely needed, guard it with `#ifdef TEST_BUILD` or a runtime flag, or emit it only from the test harness itself (sketched below).
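For the second option, the guard could look like this (`TEST_BUILD` is an illustrative macro name, not one defined by this PR):

```cpp
#ifdef TEST_BUILD
    // Emitted only in test builds, so the TAP harness can grep for readiness
    // without the marker leaking into production logs.
    proxy_info("TEST_LOG: ProxySQL is starting\n");
#endif
```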
🗑️ Proposed fix

```diff
- proxy_info("TEST_LOG: ProxySQL is starting\n");
```

📝 Committable suggestion
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main.cpp` at line 1508, Remove the test-only log artifact from production
by eliminating or conditionalizing the proxy_info call that prints "TEST_LOG:
ProxySQL is starting\n" in main.cpp: either delete the proxy_info("TEST_LOG:
...") invocation entirely, or wrap it in a test-only conditional (e.g., `#ifdef`
TEST_BUILD / runtime test flag) so that proxy_info is not emitted in production;
ensure any remaining startup logs use production-safe messages or move test-only
signaling into the test harness instead of using the proxy_info call.
```cpp
// Send a query larger than 100 bytes
std::string large_query = "SELECT '";
for(int i=0; i<200; i++) large_query += "x";
large_query += "'";

mysql_query(conn, large_query.c_str());

// Verify that NO digest was recorded for this query because it was bypassed
int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest WHERE digest_text LIKE '%xxxx%'");
MYSQL_RES* res = mysql_store_result(admin);
MYSQL_ROW row = mysql_fetch_row(res);
int count = atoi(row[0]);
ok(count == 0, "Query larger than threshold was correctly bypassed (count: %d)", count);
```
🧩 Analysis chain
🏁 Script executed:

```bash
# Search for digest_text normalization and stats_mysql_query_digest handling
rg -n "digest_text" -g '*.cpp' -g '*.h' -A 3 -B 3 | head -100
```

Repository: sysown/proxysql
Length of output: 8992

🏁 Script executed:

```bash
# Look at the MYSQL_QUERY macro definition
rg -n "MYSQL_QUERY" -g '*.cpp' -g '*.h' -A 5 -B 2 | head -80
```

Repository: sysown/proxysql
Length of output: 6502

🏁 Script executed:

```bash
# Examine the test file context around the lines in question
sed -n '40,75p' test/tap/tests/test_ffto_bypass-t.cpp
```

Repository: sysown/proxysql
Length of output: 1184

🏁 Script executed:

```bash
# Check how stats_mysql_query_digest is used elsewhere in tests
rg -n "stats_mysql_query_digest" -g '*.cpp' -g '*.h' | head -50
```

Repository: sysown/proxysql
Length of output: 6345

🏁 Script executed:

```bash
# Find MYSQL_QUERY macro definition
rg -n "define MYSQL_QUERY" -g '*.h' -g '*.cpp' -A 5
```

Repository: sysown/proxysql
Length of output: 7815

🏁 Script executed:

```bash
# Check for query normalization/truncation in the codebase
rg -n "normalize" -g '*.cpp' -g '*.h' | grep -i digest
```

Repository: sysown/proxysql
Length of output: 50372

🏁 Script executed:

```bash
# Look at other FFTO tests to see their assertion patterns
sed -n '20,100p' test/tap/tests/test_ffto_mysql-t.cpp
```

Repository: sysown/proxysql
Length of output: 3927

🏁 Script executed:

```bash
# Check how test_ffto_mysql.cpp handles query success checking
sed -n '70,130p' test/tap/tests/test_ffto_mysql-t.cpp
```

Repository: sysown/proxysql
Length of output: 2958

🏁 Script executed:

```bash
# Look for bypass test patterns in test_ffto_mysql.cpp to compare with test_ffto_bypass-t.cpp
rg -n "bypass" -A 5 -B 5 test/tap/tests/test_ffto_mysql-t.cpp
```

Repository: sysown/proxysql
Length of output: 41

🏁 Script executed:

```bash
# Read the full test_ffto_bypass-t.cpp to understand context
cat test/tap/tests/test_ffto_bypass-t.cpp
```

Repository: sysown/proxysql
Length of output: 2641

🏁 Script executed:

```bash
# Check if there are other bypass tests to understand the pattern
fd -t f "bypass" test/tap/tests/ --type f
```

Repository: sysown/proxysql
Length of output: 96
Assert query execution and simplify digest count check for bypass verification.
Line 57 executes the large query without checking success; if execution fails, no digest is recorded and the test passes incorrectly. Line 60 relies on LIKE '%xxxx%', but this pattern is fragile if digest_text is normalized or truncated. Since bypass behavior means NO digests should be recorded (the stats table was cleared at line 43), use MYSQL_QUERY to assert the query executed and check the total digest count instead.
🔧 Suggested adjustment
```diff
- mysql_query(conn, large_query.c_str());
+ MYSQL_QUERY(conn, large_query.c_str());
- int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest WHERE digest_text LIKE '%xxxx%'");
+ int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest");
```

📝 Committable suggestion
```cpp
// Send a query larger than 100 bytes
std::string large_query = "SELECT '";
for(int i=0; i<200; i++) large_query += "x";
large_query += "'";

MYSQL_QUERY(conn, large_query.c_str());

// Verify that NO digest was recorded for this query because it was bypassed
int rc = run_q(admin, "SELECT count(*) FROM stats_mysql_query_digest");
MYSQL_RES* res = mysql_store_result(admin);
MYSQL_ROW row = mysql_fetch_row(res);
int count = atoi(row[0]);
ok(count == 0, "Query larger than threshold was correctly bypassed (count: %d)", count);
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/tap/tests/test_ffto_bypass-t.cpp` around lines 52 - 64, The test
executes a large query with mysql_query(conn, large_query.c_str()) without
asserting it succeeded and then checks for a fragile LIKE match; change to use
the test helper/MACRO that asserts execution (e.g., MYSQL_QUERY or equivalent)
to ensure the query actually ran, and replace the run_q call that uses "SELECT
count(*) FROM stats_mysql_query_digest WHERE digest_text LIKE '%xxxx%'" with a
simple total count query "SELECT count(*) FROM stats_mysql_query_digest" and
assert that this total count is 0 (use the existing
mysql_store_result/mysql_fetch_row/ok pattern to read and assert the count).
```cpp
void verify_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
    char query[1024];
    // Use a more relaxed LIKE pattern to handle potential normalization differences
    sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_mysql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
    int rc = run_q(admin, query);
    if (rc != 0) {
        ok(0, "Failed to query stats_mysql_query_digest for %s", template_text);
        return;
    }
    MYSQL_RES* res = mysql_store_result(admin);
    MYSQL_ROW row = mysql_fetch_row(res);
    if (row) {
        int count = atoi(row[0]);
        uint64_t rows_affected = strtoull(row[1], NULL, 10);
        uint64_t rows_sent = strtoull(row[2], NULL, 10);

        ok(count >= expected_count, "Found digest: %s (count: %d, expected: %d)", row[3], count, expected_count);
        ok(rows_affected == expected_rows_affected, "Affected rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_affected, (unsigned long long)expected_rows_affected);
        ok(rows_sent == expected_rows_sent, "Sent rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_sent, (unsigned long long)expected_rows_sent);
    } else {
        ok(0, "Digest NOT found for pattern: %s", template_text);
        // Dump the table to see what's actually in there
        diag("Dumping stats_mysql_query_digest for debugging:");
        run_q(admin, "SELECT digest_text, count_star FROM stats_mysql_query_digest");
        MYSQL_RES* dump_res = mysql_store_result(admin);
        MYSQL_ROW dump_row;
        while (dump_res && (dump_row = mysql_fetch_row(dump_res))) {
            diag("  Actual digest in table: %s", dump_row[0]);
        }
        if (dump_res) mysql_free_result(dump_res);
    }
    mysql_free_result(res);
}
```
Keep TAP plan count consistent on digest miss.
On failure paths (rc != 0 or no rows), only one ok() is emitted, which can make the TAP plan count short and obscure the real failure. Emit the two additional assertions even on failure to keep the plan stable.
✅ Suggested fix
```diff
     if (rc != 0) {
         ok(0, "Failed to query stats_mysql_query_digest for %s", template_text);
+        ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+        ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
         return;
     }
@@
     } else {
         ok(0, "Digest NOT found for pattern: %s", template_text);
+        ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+        ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
         // Dump the table to see what's actually in there
         diag("Dumping stats_mysql_query_digest for debugging:");
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/tap/tests/test_ffto_mysql-t.cpp` around lines 25 - 57, In verify_digest,
ensure the TAP plan emits the same number of ok() assertions on all paths: when
run_q(admin, query) fails (rc != 0) and when no row is returned, add the two
missing ok() calls (matching the success-path checks for rows_affected and
rows_sent) so the function always emits three ok() calls; update the failure
messages to indicate the digest lookup failed for the given template_text and
keep references to the same diagnostic output; modify the branches around
run_q(admin, query) and the "no row" else to call ok(...) twice more (for
expected_rows_affected and expected_rows_sent) in addition to the existing
failure ok() to preserve TAP plan stability.
```cpp
void verify_pg_digest(MYSQL* admin, const char* template_text, int expected_count, uint64_t expected_rows_affected = 0, uint64_t expected_rows_sent = 0) {
    char query[1024];
    sprintf(query, "SELECT count_star, sum_rows_affected, sum_rows_sent, digest_text FROM stats_pgsql_query_digest WHERE digest_text LIKE '%%%s%%'", template_text);
    int rc = run_q(admin, query);
    if (rc != 0) {
        ok(0, "Failed to query stats_pgsql_query_digest for %s", template_text);
        return;
    }
    MYSQL_RES* res = mysql_store_result(admin);
    MYSQL_ROW row = mysql_fetch_row(res);
    if (row) {
        int count = atoi(row[0]);
        uint64_t rows_affected = strtoull(row[1], NULL, 10);
        uint64_t rows_sent = strtoull(row[2], NULL, 10);

        ok(count >= expected_count, "Found PG digest: %s (count: %d, expected: %d)", row[3], count, expected_count);
        ok(rows_affected == expected_rows_affected, "Affected rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_affected, (unsigned long long)expected_rows_affected);
        ok(rows_sent == expected_rows_sent, "Sent rows for %s: %llu (expected: %llu)", row[3], (unsigned long long)rows_sent, (unsigned long long)expected_rows_sent);
    } else {
        ok(0, "PG Digest NOT found for pattern: %s", template_text);
        diag("Dumping stats_pgsql_query_digest for debugging:");
        run_q(admin, "SELECT digest_text, count_star FROM stats_pgsql_query_digest");
        MYSQL_RES* dump_res = mysql_store_result(admin);
        MYSQL_ROW dump_row;
        while (dump_res && (dump_row = mysql_fetch_row(dump_res))) {
            diag("  Actual PG digest in table: %s", dump_row[0]);
        }
        if (dump_res) mysql_free_result(dump_res);
    }
    mysql_free_result(res);
}
```
Keep TAP plan count consistent on digest miss.
Like the MySQL test helper, the failure branches only emit one ok(); that can desync the TAP plan and mask the real failure. Emit the two additional assertions even on failure.
✅ Suggested fix
if (rc != 0) {
ok(0, "Failed to query stats_pgsql_query_digest for %s", template_text);
+ ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+ ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
return;
}
@@
} else {
ok(0, "PG Digest NOT found for pattern: %s", template_text);
+ ok(0, "Affected rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_affected);
+ ok(0, "Sent rows for %s: expected %llu", template_text, (unsigned long long)expected_rows_sent);
diag("Dumping stats_pgsql_query_digest for debugging:");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@test/tap/tests/test_ffto_pgsql-t.cpp` around lines 27 - 57, The failure
branch in verify_pg_digest currently calls only one ok(0, ...) when no row is
found, which desyncs the TAP plan; update the else branch in verify_pg_digest to
also emit the two additional ok(0, ...) assertions for rows_affected and
rows_sent (using template_text and expected_rows_affected/expected_rows_sent in
the messages, similar to the success branch) so the total number of ok() calls
matches both success and failure paths, and keep existing diagnostics and
mysql_free_result/res cleanup unchanged.
Actionable comments posted: 2
🧹 Nitpick comments (2)
lib/MySQLFFTO.cpp (1)
76-76: `std::vector::erase` from the front is O(n) on every processed packet.
Both `m_client_buffer` and `m_server_buffer` shift all remaining bytes on every consumed packet. For a fast-forward observer sitting on the hot I/O path, this linear cost accumulates. A read-offset index into the vector (advancing the head without moving data) or switching to `std::deque` eliminates the shift entirely.
♻️ Sketch: offset-based approach for `on_client_data`
- m_client_buffer.insert(m_client_buffer.end(), buf, buf + len);
- while (m_client_buffer.size() >= sizeof(mysql_hdr)) {
-     const mysql_hdr* hdr = reinterpret_cast<const mysql_hdr*>(m_client_buffer.data());
-     uint32_t pkt_len = hdr->pkt_length;
-     if (m_client_buffer.size() < sizeof(mysql_hdr) + pkt_len) break;
-     const unsigned char* payload = reinterpret_cast<const unsigned char*>(m_client_buffer.data()) + sizeof(mysql_hdr);
-     process_client_packet(payload, pkt_len);
-     m_client_buffer.erase(m_client_buffer.begin(), m_client_buffer.begin() + sizeof(mysql_hdr) + pkt_len);
- }
+ m_client_buffer.insert(m_client_buffer.end(), buf, buf + len);
+ size_t offset = 0;
+ while (m_client_buffer.size() - offset >= sizeof(mysql_hdr)) {
+     const mysql_hdr* hdr = reinterpret_cast<const mysql_hdr*>(m_client_buffer.data() + offset);
+     uint32_t pkt_len = hdr->pkt_length;
+     if (m_client_buffer.size() - offset < sizeof(mysql_hdr) + pkt_len) break;
+     const unsigned char* payload = reinterpret_cast<const unsigned char*>(m_client_buffer.data()) + offset + sizeof(mysql_hdr);
+     process_client_packet(payload, pkt_len);
+     offset += sizeof(mysql_hdr) + pkt_len;
+ }
+ if (offset > 0)
+     m_client_buffer.erase(m_client_buffer.begin(), m_client_buffer.begin() + offset);

Apply the same pattern to `on_server_data`.

Also applies to: 89-89
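A variant of the same idea keeps the read offset across calls and compacts only once the consumed prefix passes a threshold, which the agent prompt below also suggests. A sketch, assuming a new size_t member m_client_head initialized to 0 (not in the PR); note that std::deque, the other option mentioned above, lacks contiguous storage, so the mysql_hdr cast would first need a copy:

// Sketch only: persistent read offset with lazy compaction.
void MySQLFFTO::on_client_data(const unsigned char* buf, size_t len) {
	m_client_buffer.insert(m_client_buffer.end(), buf, buf + len);
	while (m_client_buffer.size() - m_client_head >= sizeof(mysql_hdr)) {
		const mysql_hdr* hdr = reinterpret_cast<const mysql_hdr*>(m_client_buffer.data() + m_client_head);
		uint32_t pkt_len = hdr->pkt_length;
		if (m_client_buffer.size() - m_client_head < sizeof(mysql_hdr) + pkt_len) break;
		process_client_packet(m_client_buffer.data() + m_client_head + sizeof(mysql_hdr), pkt_len);
		m_client_head += sizeof(mysql_hdr) + pkt_len;
	}
	// Compact rarely: only when the consumed prefix grows past a threshold.
	if (m_client_head > 65536) {
		m_client_buffer.erase(m_client_buffer.begin(), m_client_buffer.begin() + m_client_head);
		m_client_head = 0;
	}
}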
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/MySQLFFTO.cpp` at line 76, The erase call on m_client_buffer (and similar on m_server_buffer) inside on_client_data/on_server_data causes O(n) shifts per packet; replace it with a non-shifting scheme by introducing a read offset index (e.g., client_head/server_head) that advances by sizeof(mysql_hdr)+pkt_len instead of erasing, and only compact/resize the underlying vector when the head grows past a threshold (or alternatively switch the buffers to std::deque and use pop_front for whole-packet removal). Update all uses of m_client_buffer/m_server_buffer access to account for the head offset (or adapt to deque semantics) and ensure mysql_hdr parsing still reads from the correct offset; perform the same change for the erase at the other reported location.

lib/PgSQLFFTO.cpp (1)
96-118: Missing client `'C'` (Close) handler — `m_statements` and `m_portals` grow without bound.
The PostgreSQL protocol identifies frontend type byte `'C'` as a Close command, whose payload names the prepared statement or portal to close. Without handling this message, entries added to `m_statements` (via `'P'`) and `m_portals` (via `'B'`) are never erased. For long-running sessions with frequent prepare-and-close cycles, both maps grow monotonically for the session lifetime, leaking memory proportional to the number of statements created.
♻️ Proposed handler (add between the `'E'` and `'X'` branches)
+ } else if (type == 'C') { // Frontend Close (not server CommandComplete)
+     if (len < 1) return;
+     char close_type = static_cast<char>(payload[0]);
+     size_t name_len = strnlen(reinterpret_cast<const char*>(payload) + 1, len - 1);
+     std::string name(reinterpret_cast<const char*>(payload) + 1, name_len);
+     if (close_type == 'S') {
+         m_statements.erase(name);
+     } else if (close_type == 'P') {
+         m_portals.erase(name);
+     }
  } else if (type == 'X') {

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/PgSQLFFTO.cpp` around lines 96 - 118, Add a handler for frontend type 'C' (Close) to remove prepared statements/portals so m_statements and m_portals do not leak: parse the Close payload to read the target byte (e.g., 'S' for prepared Statement, 'P' for Portal) then the null-terminated name string (const char* name = reinterpret_cast<const char*>(payload) + 1) and call m_statements.erase(name) for 'S' or m_portals.erase(name) for 'P'; also handle the empty-name case (unnamed statement/portal) consistently and make sure this branch is placed between the existing 'E' and 'X' branches so entries are removed when the client sends Close.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/MySQLFFTO.cpp`:
- Around line 101-126: In process_client_packet, handle the COM_STMT_CLOSE case
so m_statements doesn't leak: detect the _MYSQL_COM_STMT_CLOSE command in
MySQLFFTO::process_client_packet, extract the 4-byte statement id (same approach
as in the _MYSQL_COM_STMT_EXECUTE branch), and call m_statements.erase(stmt_id)
(and any associated cleanup) to remove the statement entry from the cache;
ensure this is done safely only when len is sufficient and the stmt_id exists in
m_statements.
In `@lib/PgSQLFFTO.cpp`:
- Around line 121-138: process_server_message currently drops subsequent
CommandComplete messages and overwrites outstanding extended-query tracking:
modify the 'C' handling so it does not early-return on m_state == IDLE but
instead accumulates rows_sent/affected_rows across multiple 'C' messages (keep
m_state until a terminal 'Z' and sum rows returned by extract_pg_rows_affected
into a per-query accumulator), and change the 'E' (Execute) path to check if
m_state == AWAITING_RESPONSE before overwriting m_current_query and
m_query_start_time—if so, first report or enqueue the prior query's stats (using
report_query_stats) or push it onto a queue, then set
m_current_query/m_query_start_time for the new execute; use the existing symbols
(process_server_message, m_state, IDLE, AWAITING_RESPONSE, 'C', 'E',
m_current_query, m_query_start_time, extract_pg_rows_affected,
report_query_stats) to locate and implement these two fixes.
---
Duplicate comments:
In `@lib/MySQL_Session.cpp`:
- Line 652: The reset() method in MySQL_Session must clear the FFTO-related
state to avoid carrying stale values across COM_CHANGE_USER/RESET; update
MySQL_Session::reset() to set ffto_bypassed = false and clear m_ffto (e.g.
nullptr or equivalent reset) so both ffto_bypassed and m_ffto are explicitly
reinitialized during reset.
- Around line 6026-6035: The server→client feed must enforce
mysql_thread___ffto_max_buffer_size and tear down/bypass FFTO when exceeded:
before calling MySQLFFTO::on_server_data in the block guarded by
mysql_thread___ffto_enabled/ffto_bypassed and using m_ffto, check the size of
the incoming packet plus the FFTO's current buffered bytes and if that sum would
exceed mysql_thread___ffto_max_buffer_size then call the teardown/bypass logic
(reset/cleanup m_ffto and set ffto_bypassed) instead of feeding data; if
MySQLFFTO lacks an accessor for current buffered bytes add a small accessor
(e.g., getBufferedBytes()/buffered_size()) or maintain a session-side counter
(m_ffto_buffered_bytes) updated when calling on_server_data so you can compare
against mysql_thread___ffto_max_buffer_size and enforce bypass/teardown.
In `@lib/MySQLFFTO.cpp`:
- Line 159: The checks that treat a packet as the CLIENT_DEPRECATE_EOF
terminator are using first_byte == 0x00 and len checks that allow misidentifying
binary row packets; update both occurrences (the conditional guarded by
m_session->client_myds->myconn->options.client_flag & CLIENT_DEPRECATE_EOF) to
test for first_byte == 0xFE and require len >= 9 (to distinguish from a legacy
5-byte EOF), so that the code path that calls report_query_stats, resets m_state
and observes m_rows_sent only triggers for the proper 0xFE terminator and not
for normal OK or binary row packets.
- Line 159: process_server_packet dereferences m_session->client_myds->myconn
unsafely; mirror report_query_stats by first computing a safe boolean (e.g.,
deprecate_eof) at the top of process_server_packet after verifying m_session,
m_session->client_myds, and m_session->client_myds->myconn are non-null, then
replace direct checks like (m_session->client_myds->myconn->options.client_flag
& CLIENT_DEPRECATE_EOF) in the branches at the current lines with that computed
deprecate_eof variable to avoid null dereference.
In `@lib/PgSQLFFTO.cpp`:
- Around line 91-95: In PgSQLFFTO::process_client_message the Simple Query ('Q')
payload is being stored into m_current_query including the PostgreSQL framing
trailing NUL, which causes pgsql_query_digest_and_first_comment to see an
embedded null and produce differing digests; fix process_client_message so when
type == 'Q' you detect and remove the final NUL (e.g., if len>0 and
payload[len-1]==0) and construct m_current_query from len-1 bytes (otherwise use
len), then set m_query_start_time and m_state as before, ensuring downstream
callers like pgsql_query_digest_and_first_comment receive the query without the
trailing NUL.
- Around line 57-68: In on_client_data, validate the parsed msg_len before using
msg_len - 4 to avoid underflow: after computing msg_len = ntohl(msg_len) and
before checking buffer bounds or calling process_client_message, ensure msg_len
>= 4 (and that m_client_buffer.size() >= 1 + msg_len); if the frame is malformed
(msg_len < 4 or the size check fails) handle it safely (e.g., log/error and
discard the offending bytes or break the loop) instead of computing msg_len - 4
and calling process_client_message with an enormous length; update logic around
m_client_buffer, msg_len, process_client_message, and the buffer erase to only
proceed when msg_len is valid.
- Around line 70-81: In on_server_data, guard against malformed frames by
verifying msg_len >= 4 before computing payload length and calling
process_server_message; if msg_len < 4 (or msg_len + 1 exceeds
m_server_buffer.size()), treat the frame as invalid (log or drop bytes) and
remove/skip the offending bytes to avoid infinite loops. Update the logic in
PgSQLFFTO::on_server_data to check msg_len >= 4 after ntohl(msg_len) and only
call process_server_message(type, payload, msg_len - 4) when that condition
holds, otherwise advance/clean m_server_buffer appropriately to recover.
- Around line 96-118: The handlers for message types 'P', 'B', and 'E' construct
std::string from reinterpret_cast<const char*>(payload) without ensuring a NUL
exists inside the message buffer, causing possible buffer-overreads; update the
code in PgSQLFFTO.cpp to validate the payload buffer length (the received length
parameter for messages) before creating strings: use a bounded search (e.g.
std::memchr) to find the terminating '\0' within the payload length for the
statement name/portal name and for the subsequent query pointer arithmetic, only
build std::string when the terminator is found, and otherwise treat the message
as malformed (log/ignore/close via on_close()); ensure when computing stmt_ptr
and copying stmt_name you calculate offsets using the found length instead of
relying on C-string termination so m_statements, m_portals, m_current_query,
m_query_start_time and m_state updates only happen for validated inputs.
- Around line 31-44: The regex in extract_pg_rows_affected is both wrong (\\S*
greedily swallows digits causing multi-digit counts to be mis-captured) and is
being recompiled on every call; change the pattern to use a
non-greedy/word-boundary form that reliably captures the trailing row count
(e.g. "(INSERT|UPDATE|DELETE|SELECT|MOVE|FETCH)\\b.*?\\b(\\d+)") and make the
std::regex a single compiled instance instead of constructing it on each call
(declare it static const outside or as a static local in
extract_pg_rows_affected); keep the rest of the logic (matches checking,
std::stoull, setting is_select for SELECT/FETCH/MOVE) the same.
---
Nitpick comments:
In `@lib/MySQLFFTO.cpp`:
- Line 76: The erase call on m_client_buffer (and similar on m_server_buffer)
inside on_client_data/on_server_data causes O(n) shifts per packet; replace it
with a non-shifting scheme by introducing a read offset index (e.g.,
client_head/server_head) that advances by sizeof(mysql_hdr)+pkt_len instead of
erasing, and only compact/resize the underlying vector when the head grows past
a threshold (or alternatively switch the buffers to std::deque and use pop_front
for whole-packet removal). Update all uses of m_client_buffer/m_server_buffer
access to account for the head offset (or adapt to deque semantics) and ensure
mysql_hdr parsing still reads from the correct offset; perform the same change
for the erase at the other reported location.
In `@lib/PgSQLFFTO.cpp`:
- Around line 96-118: Add a handler for frontend type 'C' (Close) to remove
prepared statements/portals so m_statements and m_portals do not leak: parse the
Close payload to read the target byte (e.g., 'S' for prepared Statement, 'P' for
Portal) then the null-terminated name string (const char* name =
reinterpret_cast<const char*>(payload) + 1) and call m_statements.erase(name)
for 'S' or m_portals.erase(name) for 'P'; also handle the empty-name case
(unnamed statement/portal) consistently and make sure this branch is placed
between the existing 'E' and 'X' branches so entries are removed when the client
sends Close.
void MySQLFFTO::process_client_packet(const unsigned char* data, size_t len) {
	if (len == 0) return;
	uint8_t command = data[0];
	if (command == _MYSQL_COM_QUERY) {
		m_current_query = std::string(reinterpret_cast<const char*>(data + 1), len - 1);
		m_query_start_time = monotonic_time();
		m_state = AWAITING_RESPONSE;
		m_affected_rows = 0; m_rows_sent = 0;
	} else if (command == _MYSQL_COM_STMT_PREPARE) {
		m_pending_prepare_query = std::string(reinterpret_cast<const char*>(data + 1), len - 1);
		m_state = AWAITING_PREPARE_OK;
	} else if (command == _MYSQL_COM_STMT_EXECUTE) {
		if (len >= 5) {
			uint32_t stmt_id; memcpy(&stmt_id, data + 1, 4);
			auto it = m_statements.find(stmt_id);
			if (it != m_statements.end()) {
				m_current_query = it->second;
				m_query_start_time = monotonic_time();
				m_state = AWAITING_RESPONSE;
				m_affected_rows = 0; m_rows_sent = 0;
			}
		}
	} else if (command == _MYSQL_COM_QUIT) {
		on_close();
	}
}
COM_STMT_CLOSE not handled — m_statements grows without bound.
When a client sends COM_STMT_CLOSE, the statement ID is removed from the server's statement cache, but m_statements retains the entry indefinitely. For long-lived connections with repeated prepare/close cycles this is an unbounded memory leak.
🛡️ Proposed fix
} else if (command == _MYSQL_COM_QUIT) {
on_close();
+ } else if (command == _MYSQL_COM_STMT_CLOSE) {
+ if (len >= 5) {
+ uint32_t stmt_id;
+ memcpy(&stmt_id, data + 1, 4);
+ m_statements.erase(stmt_id);
+ }
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/MySQLFFTO.cpp` around lines 101 - 126, In process_client_packet, handle
the COM_STMT_CLOSE case so m_statements doesn't leak: detect the
_MYSQL_COM_STMT_CLOSE command in MySQLFFTO::process_client_packet, extract the
4-byte statement id (same approach as in the _MYSQL_COM_STMT_EXECUTE branch),
and call m_statements.erase(stmt_id) (and any associated cleanup) to remove the
statement entry from the cache; ensure this is done safely only when len is
sufficient and the stmt_id exists in m_statements.
void PgSQLFFTO::process_server_message(char type, const unsigned char* payload, size_t len) {
	if (m_state == IDLE) return;
	if (type == 'C') {
		unsigned long long duration = monotonic_time() - m_query_start_time;
		bool is_select = false;
		uint64_t rows = extract_pg_rows_affected(payload, len, is_select);
		if (is_select) report_query_stats(m_current_query, duration, 0, rows);
		else report_query_stats(m_current_query, duration, rows, 0);
		m_state = IDLE;
	} else if (type == 'Z' || type == 'E') {
		// ReadyForQuery or ErrorResponse. We don't always want to report stats here if 'C' already did it.
		// But if we didn't get 'C', report what we have.
		if (m_state == AWAITING_RESPONSE) {
			unsigned long long duration = monotonic_time() - m_query_start_time;
			report_query_stats(m_current_query, duration);
		}
		m_state = IDLE;
	}
Multi-statement and pipelined queries produce incorrect digest statistics.
Two related tracking gaps:
- Simple Query batches — For `'Q'` with `"SELECT 1; SELECT 2"`, the simple query protocol sends a single textual query string that is parsed and immediately executed. The server returns a CommandComplete per statement. After the first `'C'` sets `m_state = IDLE` (Line 129), all subsequent `'C'` messages for the same batch are dropped by the `if (m_state == IDLE) return;` guard (Line 122). The reported `rows_sent`/`affected_rows` therefore only reflects the first statement, silently discarding the rest.
- Pipelined Extended Query — A client may send Parse → Bind → Execute → Parse → Bind → Execute → Sync without waiting for responses. In the extended-query protocol, execution of SQL commands is divided into multiple steps; the state is represented by prepared statements and portals. The second `'E'` (Execute) handler on Lines 110–113 unconditionally overwrites `m_current_query` and `m_query_start_time` while `m_state` is already `AWAITING_RESPONSE`. When the first CommandComplete arrives, it reports the second query's text with the second query's start time — the first query's tracking is silently lost.
For the pipelining case, an AWAITING_RESPONSE check in the 'E' branch (reporting or queuing the prior query before overwriting) would prevent silent data loss. The multi-statement case requires accumulating rows across all 'C' messages until 'Z', which is a more substantial state-machine change.
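A sketch of the accumulation direction; the m_acc_* members are hypothetical additions (initialized to 0), and the report_query_stats(query, duration, affected, sent) signature is inferred from the code above:

// Hypothetical sketch: sum CommandComplete counts across a multi-statement
// batch and report once, at ReadyForQuery ('Z') or ErrorResponse ('E').
void PgSQLFFTO::process_server_message(char type, const unsigned char* payload, size_t len) {
	if (m_state == IDLE) return;
	if (type == 'C') {
		bool is_select = false;
		uint64_t rows = extract_pg_rows_affected(payload, len, is_select);
		if (is_select) m_acc_rows_sent += rows;  // keep accumulating,
		else m_acc_rows_affected += rows;        // stay in AWAITING_RESPONSE
	} else if (type == 'Z' || type == 'E') {
		unsigned long long duration = monotonic_time() - m_query_start_time;
		report_query_stats(m_current_query, duration, m_acc_rows_affected, m_acc_rows_sent);
		m_acc_rows_affected = 0;
		m_acc_rows_sent = 0;
		m_state = IDLE;
	}
}

The pipelining half would additionally need the client 'E' (Execute) path to flush or queue the prior query's accumulator before overwriting m_current_query.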
🧰 Tools
🪛 Cppcheck (2.19.0)
[warning] 138-138: If memory allocation fails, then there is a possible null pointer dereference
(nullPointerOutOfMemory)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@lib/PgSQLFFTO.cpp` around lines 121 - 138, process_server_message currently
drops subsequent CommandComplete messages and overwrites outstanding
extended-query tracking: modify the 'C' handling so it does not early-return on
m_state == IDLE but instead accumulates rows_sent/affected_rows across multiple
'C' messages (keep m_state until a terminal 'Z' and sum rows returned by
extract_pg_rows_affected into a per-query accumulator), and change the 'E'
(Execute) path to check if m_state == AWAITING_RESPONSE before overwriting
m_current_query and m_query_start_time—if so, first report or enqueue the prior
query's stats (using report_query_stats) or push it onto a queue, then set
m_current_query/m_query_start_time for the new execute; use the existing symbols
(process_server_message, m_state, IDLE, AWAITING_RESPONSE, 'C', 'E',
m_current_query, m_query_start_time, extract_pg_rows_affected,
report_query_stats) to locate and implement these two fixes.


Summary
This PR completes the implementation of the Fast Forward Traffic Observer (FFTO) for both MySQL and PostgreSQL engines. FFTO allows ProxySQL to collect and report detailed query statistics even when a connection is in Fast-Forward mode, which previously bypassed all statistics gathering.
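As a usage sketch, a test enables the observer through the admin interface before driving fast-forward traffic. The variable names come from this PR; the SET syntax is assumed to follow ProxySQL's usual admin conventions, and run_q is the helper from the TAP tests above:

// Sketch: enable FFTO from an admin connection before the workload.
run_q(admin, "SET mysql-ffto_enabled='true'");
run_q(admin, "SET mysql-ffto_max_buffer_size=1048576"); // bytes; larger packets trigger bypass
run_q(admin, "LOAD MYSQL VARIABLES TO RUNTIME");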