Conversation


@CPerezz CPerezz commented Jan 21, 2026

Summary

Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz.

Motivation

Standardized execution metrics are critical for:

  • Cross-client performance comparison
  • Network health monitoring
  • Data-driven protocol research

Real-world example: The EIP-7907 analysis used execution metrics to measure code read latency, per-call overhead scaling, and block execution breakdown. Without standardized metrics across clients, such analysis cannot be validated cross-client.

References

  • Cross-Client Execution Metrics initiative: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ
  • Data-driven analysis on EIP-7907: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON Format

{
  "level": "warn",
  "msg": "Slow block",
  "block": { "number": ..., "hash": ..., "gas_used": ..., "tx_count": ... },
  "timing": { "execution_ms": ..., "total_ms": ... },
  "throughput": { "mgas_per_sec": ... },
  "state_reads": { "accounts": ..., "storage_slots": ..., "code": ..., "code_bytes": ... },
  "state_writes": { "accounts": ..., "storage_slots": ... },
  "cache": {
    "account": { "hits": ..., "misses": ..., "hit_rate": ... },
    "storage": { ... },
    "code": { ... }
  },
  "evm": { "sload": ..., "sstore": ..., "calls": ..., "creates": ... }
}
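
For concreteness, here is a minimal sketch of how a block processor might assemble and emit such a record with Jackson (the PR itself uses an ObjectMapper; the helper, threshold constant, and logger name below are illustrative, not the PR's actual code):

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class SlowBlockJsonSketch {
  private static final ObjectMapper MAPPER = new ObjectMapper();
  private static final Logger LOG = LoggerFactory.getLogger("SlowBlock");
  private static final long THRESHOLD_MS = 1_000L; // illustrative default

  /** Emits the standardized slow-block record when the block exceeded the threshold. */
  static void maybeLogSlowBlock(
      final long number,
      final String hash,
      final long gasUsed,
      final int txCount,
      final double executionMs,
      final double totalMs) {
    if (totalMs < THRESHOLD_MS) {
      return;
    }
    final ObjectNode root = MAPPER.createObjectNode();
    root.put("level", "warn").put("msg", "Slow block");
    final ObjectNode block = root.putObject("block");
    block.put("number", number).put("hash", hash).put("gas_used", gasUsed).put("tx_count", txCount);
    root.putObject("timing").put("execution_ms", executionMs).put("total_ms", totalMs);
    root.putObject("throughput").put("mgas_per_sec", gasUsed / 1_000_000.0 / (executionMs / 1_000.0));
    // state_reads, state_writes, cache and evm would be filled in the same way.
    LOG.warn(root.toString());
  }
}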

@CPerezz CPerezz force-pushed the feature/execution-metrics-standardization branch 2 times, most recently from ed18b5c to a97b4a2 Compare January 21, 2026 13:44
@macfarla macfarla moved this to Open PRs in 26.1.0 Release Jan 22, 2026
@macfarla

@CPerezz thanks for this - can you follow the instructions here to add commit signoff which will fix the DCO https://github.com/hyperledger/besu/pull/9660/checks?check_run_id=61142420367

Implements execution metrics following the cross-client specification:
https://github.com/ethereum/execution-specs/blob/main/docs/execution-metrics-spec.md

- Add ExecutionStats class for collecting block processing statistics
- Add EXECUTION metric category to BesuMetricCategory
- Add execution timing (execution, validation, commit phases)
- Add slow block logging in JSON format (threshold: 1000ms)
- Track gas usage and transaction counts per block

Signed-off-by: CPerezz <[email protected]>
Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong and CPerezz: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ

The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, code, code_bytes
- state_writes: accounts, storage_slots
- cache: account/storage/code hits, misses, hit_rate
- evm: sload, sstore, calls, creates

Signed-off-by: CPerezz <[email protected]>
Convert timing getter methods from long to double to preserve
sub-millisecond precision in slow block JSON output.

Changes:
- getExecutionTimeMs(), getStateReadTimeMs(), etc: long -> double
- Division changed from integer (/ 1_000_000) to float (/ 1_000_000.0)
- JSON format string: %d -> %.3f for 3 decimal places
- Update test assertions for double comparisons

Signed-off-by: CPerezz <[email protected]>
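
The precision change described in this commit amounts to dividing a nanosecond counter by a floating-point constant and formatting with three decimals; a minimal sketch (field and method names here are illustrative):

final class TimingSketch {
  private long executionTimeNanos; // accumulated from System.nanoTime() deltas

  // Returns milliseconds as a double so sub-millisecond durations survive.
  double getExecutionTimeMs() {
    return executionTimeNanos / 1_000_000.0; // floating-point division, not integer
  }

  String formatForJson() {
    // %.3f keeps three decimal places, e.g. "0.742" instead of "0".
    return String.format("%.3f", getExecutionTimeMs());
  }
}
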
Add tracking for EIP-7702 delegation set/cleared operations as part of
the cross-client execution metrics standardization effort.

New fields in ExecutionStats:
- eip7702DelegationsSet: Number of EIP-7702 delegations set
- eip7702DelegationsCleared: Number of EIP-7702 delegations cleared

Includes increment methods and JSON output in toSlowBlockJson().
These fields will be 0 for pre-Pectra blocks per spec.

Signed-off-by: CPerezz <[email protected]>
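
A minimal sketch of what these counters might look like (the real ExecutionStats layout may differ):

final class Eip7702CountersSketch {
  private long eip7702DelegationsSet;
  private long eip7702DelegationsCleared;

  // Called when a transaction sets a code delegation for an account.
  void incrementEip7702DelegationsSet() {
    eip7702DelegationsSet++;
  }

  // Called when an existing delegation is cleared.
  void incrementEip7702DelegationsCleared() {
    eip7702DelegationsCleared++;
  }
}
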
@CPerezz CPerezz force-pushed the feature/execution-metrics-standardization branch from 06da4e4 to 09fdb2b Compare January 26, 2026 14:17
@macfarla macfarla self-assigned this Jan 28, 2026

@macfarla macfarla left a comment

logging logic looks ok. a few comments.

@CPerezz is this PR meant to add metrics as well? Looks like you've added the category but no metrics

we prob want to check for performance impact


/** Threshold in milliseconds for slow block logging. */
private static final long SLOW_BLOCK_THRESHOLD_MS =
Long.getLong("besu.execution.slowBlockThresholdMs", 1000L);

prob worth documenting that this option is configurable via system prop

Do you prefer system prop for standardization reasons? It would be more besu-idiomatic to use a cli flag (which automatically gives you toml and envvar config too)

@macfarla macfarla removed their assignment Jan 28, 2026

@siladu siladu left a comment

Excited to get these changes in. Have done a quick first pass. I would like to assess performance impact and I need to think about the ThreadLocal approach some more.

@@ -75,6 +79,12 @@ TransactionReceipt create(
}

private static final Logger LOG = LoggerFactory.getLogger(AbstractBlockProcessor.class);
private static final Logger SLOW_BLOCK_LOG = LoggerFactory.getLogger("SlowBlock");

Is there a reason to require a separate logger here?

pros: avoids potential noise for existing besu users (including non-Ethereum networks)
cons: you might need some extra config to see these logs - that can be a runtime log4j config though so might suit your use case fine.


Comment on lines +26 to +28
public final class ExecutionStatsHolder {

private static final ThreadLocal<ExecutionStats> CURRENT = new ThreadLocal<>();

At first glance I'm skeptical about this approach. It has clean separation, but I need to have a think about whether there's a more Besu-appropriate way, like using a Tracer or even a log4j MDC. Also need to consider performance impact.

@@ -266,6 +265,9 @@ public void persist(final BlockHeader blockHeader, final StateRootCommitter comm
success = true;
} finally {
if (success) {
// Track commit time (writing state to DB)

I wonder how informative this stat is going to be with something like RocksDB where writes are fast but compaction/reads are more significant. Still, might be interesting to compare with other clients.


boolean parallelizedTxFound = false;
int nbParallelTx = 0;

// Execution metrics tracking
final ExecutionStats executionStats = new ExecutionStats();

Is this intended to run at all times? I would implement this via a Tracer which you can add via ProtocolContext and behind a besu flag to enable/disable when not needed.

The tracer should cover all your use cases from what I can see.

@lu-pinto lu-pinto left a comment

As noted in the comment, I believe this should be done via a Tracer, using dependency injection rather than static calls.

- Remove package.json and package-lock.json (accidental inclusions)
- Remove unused EXECUTION metric category from BesuMetricCategory
- Remove unused runWith() method from ExecutionStatsHolder

Addresses review feedback from macfarla.

Signed-off-by: Carlos Perez <[email protected]>
Signed-off-by: CPerezz <[email protected]>
Address reviewer feedback by implementing the execution metrics collection
as a BlockAwareOperationTracer instead of embedding it directly in
AbstractBlockProcessor.

Key changes:
- Create SlowBlockTracer implementing BlockAwareOperationTracer
- Move slow block logging and metrics collection into the tracer
- Support tracer composition (wraps delegate tracer)
- Centralize state access counters in EvmOperationCounters
- Remove direct ExecutionStats management from AbstractBlockProcessor
- Delete obsolete unit tests in favor of integration test

The SlowBlockTracer is enabled via system property (threshold >= 0)
and wraps the existing block import tracer. Future work will convert
the system property to a CLI flag for better Besu integration.

Signed-off-by: CPerezz <[email protected]>

CPerezz commented Jan 29, 2026

@lu-pinto @macfarla @siladu

All minor things were addressed in 85f3801

Then I did the big refactor to lower the perf impact as much as possible in: 92919d1

LMK your thoughts. Happy to give it another go if you still find that there are things that should be addressed!

@macfarla

running some nodes with this to verify on our side @CPerezz

@ahamlat ahamlat left a comment

Thanks @CPerezz for the proposal, I really like the idea of standardising these metrics across all clients.
I think the PR is not ready as-is with the current design. This is how we should implement these metrics:

  • All counters and opcode-related metrics should be handled by a tracer, with no changes to the EVM operations themselves, the same way the debug_trace* RPC calls work (see the sketch after this list).
  • The tracer should be enabled behind a new flag that adds these metrics, so we keep the possibility to disable them.
  • The latency metrics can be inserted in the code, but they need to take parallel transaction execution into account and do as little work as possible, since they are on the hot path.
  • In general, the implementation needs to account for parallel transaction execution: some transactions executed in parallel are not committed, either because the sequential thread is faster or because of a conflict.
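
A rough sketch of the tracer-based counting mentioned in the first bullet, assuming a hook that is invoked before each opcode (the exact Besu tracer interface and method names may differ, and the class below is hypothetical):

import java.util.concurrent.atomic.LongAdder;

// Opcodes are counted from a tracing hook instead of from the EVM operation
// classes, so the hot path stays untouched when the tracer is disabled.
final class OpcodeCountingSketch {
  final LongAdder sloads = new LongAdder();
  final LongAdder sstores = new LongAdder();
  final LongAdder calls = new LongAdder();
  final LongAdder creates = new LongAdder();

  // Would be called from a tracePreExecution-style hook with the opcode byte.
  void onOpcode(final int opcode) {
    switch (opcode) {
      case 0x54 -> sloads.increment(); // SLOAD
      case 0x55 -> sstores.increment(); // SSTORE
      case 0xF1, 0xF2, 0xF4, 0xFA -> calls.increment(); // CALL / CALLCODE / DELEGATECALL / STATICCALL
      case 0xF0, 0xF5 -> creates.increment(); // CREATE / CREATE2
      default -> { /* not tracked */ }
    }
  }
}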

private static final ObjectMapper JSON_MAPPER = new ObjectMapper();

private final long slowBlockThresholdMs;
private final BlockAwareOperationTracer delegate;

There is no need for delegation if we don't want to stop tracing because of an interruption.

@@ -62,6 +63,8 @@ public OperationResult execute(final MessageFrame frame, final EVM evm) {
return new OperationResult(cost, ExceptionalHaltReason.INSUFFICIENT_GAS);
} else {
frame.pushStackItem(getStorageValue(account, UInt256.fromBytes(key), frame));
// Track SLOAD for cross-client execution metrics
EvmOperationCounters.incrementSload();

Using a tracer means that we shouldn't need explicit calls like this in the EVM operations.

*/
public final class EvmOperationCounters {

private static final ThreadLocal<Counters> COUNTERS = ThreadLocal.withInitial(Counters::new);

I think ThreadLocal is a good idea in its own right, but you need to handle the case of parallel execution and conflict detection.
You need to accumulate the results of all threads at the end of block execution and discard the counters of parallel executions whose transactions were not committed, otherwise some opcodes will be counted twice. This is a tricky one BTW, cc. @matkt.

@@ -98,6 +99,7 @@ public int getStackItemsProduced() {
* @return the {@link Account}, or {@code null} if it does not exist
*/
protected Account getAccount(final Address address, final MessageFrame frame) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

@@ -113,6 +115,7 @@ protected Account getAccount(final Address address, final MessageFrame frame) {
* @return the {@link MutableAccount}, or {@code null} if it does not exist
*/
protected MutableAccount getMutableAccount(final Address address, final MessageFrame frame) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

@@ -105,6 +106,9 @@ public OperationResult execute(final MessageFrame frame, final EVM evm) {
frame.storageWasUpdated(key, newValue);
frame.getEip7928AccessList().ifPresent(t -> t.addSlotAccessForAccount(address, key));

// Track SSTORE for cross-client execution metrics
EvmOperationCounters.incrementSstore();

Using a tracer means we should not change the code on the hot path.

@@ -200,6 +159,9 @@ public void codeSuccess(final MessageFrame frame, final OperationTracer operatio
final MutableAccount contract =
frame.getWorldUpdater().getOrCreate(frame.getContractAddress());
contract.setCode(contractCode);
// Track code write for cross-client execution metrics
EvmOperationCounters.incrementCodeWrites();
EvmOperationCounters.addCodeBytesWritten(contractCode.size());

Using a tracer means we should not change the code on the hot path.

account.setCode(Bytes.concatenate(CODE_DELEGATION_PREFIX, codeDelegationAddress.getBytes()));
account.setCode(Bytes.concatenate(CODE_DELEGATION_PREFIX, codeDelegationAddress));
// Track EIP-7702 delegation set for cross-client execution metrics
EvmOperationCounters.incrementEip7702DelegationsSet();

Using a tracer means we should not change the code on the hot path.

@@ -73,6 +74,7 @@ default MutableAccount createAccount(final Address address) {
* #createAccount(Address)} (and thus all his fields will be zero/empty).
*/
default MutableAccount getOrCreate(final Address address) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

@@ -105,6 +107,7 @@ default MutableAccount getOrCreateSenderAccount(final Address address) {
* @return the account {@code address}, or {@code null} if the account does not exist.
*/
default MutableAccount getSenderAccount(final MessageFrame frame) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

matkt commented Jan 30, 2026

As Ameziane said, I think we can use a custom tracer to count opcodes (writes, reads, etc.). Block processing already supports passing tracers, so we can use the same mechanism.

Regarding parallelization: the code already allows passing a custom tracer during parallel execution, which will be different from the block's tracer. Then, during execution consolidation, if there are no conflicts we can take the tracer from the transaction executed in the background and merge it with the block's; otherwise, we re-run the transaction and the block's main tracer is the one that counts. We already have an object to store the result of a transaction executed in parallel (ParallelizedTransactionContext), where we could put the tracing result while waiting for consolidation.
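
A minimal sketch of the merge-on-consolidation idea (the types below are simplified stand-ins, not the actual ParallelizedTransactionContext API):

import java.util.List;

final class TracerMergeSketch {

  /** Simplified per-transaction counters collected by a transaction-scoped tracer. */
  record TxCounters(long sloads, long sstores, long accountReads) {
    TxCounters plus(final TxCounters other) {
      return new TxCounters(
          sloads + other.sloads, sstores + other.sstores, accountReads + other.accountReads);
    }
  }

  /** Result of a transaction speculatively executed in parallel. */
  record ParallelTxResult(TxCounters counters, boolean committedWithoutConflict) {}

  // Only counters from parallel executions that were actually committed are merged
  // into the block totals; conflicting transactions are re-run sequentially, where
  // the block's main tracer counts them instead.
  static TxCounters consolidate(
      final TxCounters blockCounters, final List<ParallelTxResult> parallelResults) {
    TxCounters merged = blockCounters;
    for (final ParallelTxResult result : parallelResults) {
      if (result.committedWithoutConflict()) {
        merged = merged.plus(result.counters());
      }
    }
    return merged;
  }
}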
