Conversation


@CPerezz CPerezz commented Jan 21, 2026

Summary

Implement standardized JSON format for slow block logging to enable cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative proposed by Gary Rong and CPerezz.

Motivation

Standardized execution metrics are critical for:

  • Cross-client performance comparison
  • Network health monitoring
  • Data-driven protocol research

Real-world example: The EIP-7907 analysis used execution metrics to measure code read latency, per-call overhead scaling, and block execution breakdown. Without standardized metrics across clients, such analysis cannot be validated cross-client.

References

  • Cross-Client Execution Metrics initiative: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ
  • Data-driven analysis on EIP-7907: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON Format

{
  "level": "warn",
  "msg": "Slow block",
  "block": { "number": ..., "hash": ..., "gas_used": ..., "tx_count": ... },
  "timing": { "execution_ms": ..., "total_ms": ... },
  "throughput": { "mgas_per_sec": ... },
  "state_reads": { "accounts": ..., "storage_slots": ..., "code": ..., "code_bytes": ... },
  "state_writes": { "accounts": ..., "storage_slots": ... },
  "cache": {
    "account": { "hits": ..., "misses": ..., "hit_rate": ... },
    "storage": { ... },
    "code": { ... }
  },
  "evm": { "sload": ..., "sstore": ..., "calls": ..., "creates": ... }
}
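
For concreteness, here is a minimal sketch of how a block processor might assemble and emit such a record with Jackson (the PR itself uses an ObjectMapper; the helper, threshold constant, and logger name below are illustrative, not the PR's actual code):

import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class SlowBlockJsonSketch {
  private static final ObjectMapper MAPPER = new ObjectMapper();
  private static final Logger LOG = LoggerFactory.getLogger("SlowBlock");
  private static final long THRESHOLD_MS = 1_000L; // illustrative default

  /** Emits the standardized slow-block record when the block exceeded the threshold. */
  static void maybeLogSlowBlock(
      final long number,
      final String hash,
      final long gasUsed,
      final int txCount,
      final double executionMs,
      final double totalMs) {
    if (totalMs < THRESHOLD_MS) {
      return;
    }
    final ObjectNode root = MAPPER.createObjectNode();
    root.put("level", "warn").put("msg", "Slow block");
    final ObjectNode block = root.putObject("block");
    block.put("number", number).put("hash", hash).put("gas_used", gasUsed).put("tx_count", txCount);
    root.putObject("timing").put("execution_ms", executionMs).put("total_ms", totalMs);
    root.putObject("throughput").put("mgas_per_sec", gasUsed / 1_000_000.0 / (executionMs / 1_000.0));
    // state_reads, state_writes, cache and evm would be filled in the same way.
    LOG.warn(root.toString());
  }
}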

@CPerezz CPerezz force-pushed the feature/execution-metrics-standardization branch 2 times, most recently from ed18b5c to a97b4a2 Compare January 21, 2026 13:44
@macfarla macfarla moved this to Open PRs in 26.1.0 Release Jan 22, 2026
@macfarla

@CPerezz thanks for this - can you follow the instructions here to add commit signoff which will fix the DCO https://github.com/hyperledger/besu/pull/9660/checks?check_run_id=61142420367

Implements execution metrics following the cross-client specification:
https://github.com/ethereum/execution-specs/blob/main/docs/execution-metrics-spec.md

- Add ExecutionStats class for collecting block processing statistics
- Add EXECUTION metric category to BesuMetricCategory
- Add execution timing (execution, validation, commit phases)
- Add slow block logging in JSON format (threshold: 1000ms)
- Track gas usage and transaction counts per block

Signed-off-by: CPerezz <[email protected]>
Implement standardized JSON format for slow block logging to enable
cross-client performance analysis and protocol research.

This change is part of the Cross-Client Execution Metrics initiative
proposed by Gary Rong and CPerezz: https://hackmd.io/dg7rizTyTXuCf2LSa2LsyQ

The standardized metrics enabled data-driven analysis like the EIP-7907
research: https://ethresear.ch/t/data-driven-analysis-on-eip-7907/23850

JSON format includes:
- block: number, hash, gas_used, tx_count
- timing: execution_ms, total_ms
- throughput: mgas_per_sec
- state_reads: accounts, storage_slots, code, code_bytes
- state_writes: accounts, storage_slots
- cache: account/storage/code hits, misses, hit_rate
- evm: sload, sstore, calls, creates

Signed-off-by: CPerezz <[email protected]>
Convert timing getter methods from long to double to preserve
sub-millisecond precision in slow block JSON output.

Changes:
- getExecutionTimeMs(), getStateReadTimeMs(), etc: long -> double
- Division changed from integer (/ 1_000_000) to float (/ 1_000_000.0)
- JSON format string: %d -> %.3f for 3 decimal places
- Update test assertions for double comparisons

Signed-off-by: CPerezz <[email protected]>
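
The precision change described in this commit amounts to dividing a nanosecond counter by a floating-point constant and formatting with three decimals; a minimal sketch (field and method names here are illustrative):

final class TimingSketch {
  private long executionTimeNanos; // accumulated from System.nanoTime() deltas

  // Returns milliseconds as a double so sub-millisecond durations survive.
  double getExecutionTimeMs() {
    return executionTimeNanos / 1_000_000.0; // floating-point division, not integer
  }

  String formatForJson() {
    // %.3f keeps three decimal places, e.g. "0.742" instead of "0".
    return String.format("%.3f", getExecutionTimeMs());
  }
}
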
Add tracking for EIP-7702 delegation set/cleared operations as part of
the cross-client execution metrics standardization effort.

New fields in ExecutionStats:
- eip7702DelegationsSet: Number of EIP-7702 delegations set
- eip7702DelegationsCleared: Number of EIP-7702 delegations cleared

Includes increment methods and JSON output in toSlowBlockJson().
These fields will be 0 for pre-Pectra blocks per spec.

Signed-off-by: CPerezz <[email protected]>
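
A minimal sketch of what these counters might look like (the real ExecutionStats layout may differ):

final class Eip7702CountersSketch {
  private long eip7702DelegationsSet;
  private long eip7702DelegationsCleared;

  // Called when a transaction sets a code delegation for an account.
  void incrementEip7702DelegationsSet() {
    eip7702DelegationsSet++;
  }

  // Called when an existing delegation is cleared.
  void incrementEip7702DelegationsCleared() {
    eip7702DelegationsCleared++;
  }
}
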
@CPerezz CPerezz force-pushed the feature/execution-metrics-standardization branch from 06da4e4 to 09fdb2b Compare January 26, 2026 14:17
@macfarla macfarla self-assigned this Jan 28, 2026

@macfarla macfarla left a comment

logging logic looks ok. a few comments.

@CPerezz is this PR meant to add metrics as well? Looks like you've added the category but no metrics

we prob want to check for performance impact


/** Threshold in milliseconds for slow block logging. */
private static final long SLOW_BLOCK_THRESHOLD_MS =
Long.getLong("besu.execution.slowBlockThresholdMs", 1000L);

prob worth documenting that this option is configurable via system prop

Do you prefer system prop for standardization reasons? It would be more besu-idiomatic to use a cli flag (which automatically gives you toml and envvar config too)

@macfarla macfarla removed their assignment Jan 28, 2026

@siladu siladu left a comment

Excited to get these changes in. Have done a quick first pass. I would like to assess performance impact and I need to think about the ThreadLocal approach some more.

@@ -75,6 +79,12 @@ TransactionReceipt create(
}

private static final Logger LOG = LoggerFactory.getLogger(AbstractBlockProcessor.class);
private static final Logger SLOW_BLOCK_LOG = LoggerFactory.getLogger("SlowBlock");

Is there a reason to require a separate logger here?

pros: avoids potential noise for existing besu users (including non-Ethereum networks)
cons: you might need some extra config to see these logs - that can be a runtime log4j config though so might suit your use case fine.


Comment on lines +26 to +28
public final class ExecutionStatsHolder {

private static final ThreadLocal<ExecutionStats> CURRENT = new ThreadLocal<>();

At first glance I'm skeptical about this approach. It has clean separation, but I need to have a think about whether there's a more Besu-appropriate way, like using a Tracer or even a log4j MDC. Also need to consider performance impact.

@@ -266,6 +265,9 @@ public void persist(final BlockHeader blockHeader, final StateRootCommitter comm
success = true;
} finally {
if (success) {
// Track commit time (writing state to DB)

I wonder how informative this stat is going to be with something like RocksDB where writes are fast but compaction/reads are more significant. Still, might be interesting to compare with other clients.


boolean parallelizedTxFound = false;
int nbParallelTx = 0;

// Execution metrics tracking
final ExecutionStats executionStats = new ExecutionStats();

Is this intended to run at all times? I would implement this via a Tracer which you can add via ProtocolContext and behind a besu flag to enable/disable when not needed.

The tracer should cover all your use cases from what I can see.

@lu-pinto lu-pinto left a comment

As noted in the comment, I believe this should be done via a Tracer, using dependency injection rather than static calls.

- Remove package.json and package-lock.json (accidental inclusions)
- Remove unused EXECUTION metric category from BesuMetricCategory
- Remove unused runWith() method from ExecutionStatsHolder

Addresses review feedback from macfarla.

Signed-off-by: Carlos Perez <[email protected]>
Signed-off-by: CPerezz <[email protected]>
Address reviewer feedback by implementing the execution metrics collection
as a BlockAwareOperationTracer instead of embedding it directly in
AbstractBlockProcessor.

Key changes:
- Create SlowBlockTracer implementing BlockAwareOperationTracer
- Move slow block logging and metrics collection into the tracer
- Support tracer composition (wraps delegate tracer)
- Centralize state access counters in EvmOperationCounters
- Remove direct ExecutionStats management from AbstractBlockProcessor
- Delete obsolete unit tests in favor of integration test

The SlowBlockTracer is enabled via system property (threshold >= 0)
and wraps the existing block import tracer. Future work will convert
the system property to a CLI flag for better Besu integration.

Signed-off-by: CPerezz <[email protected]>

CPerezz commented Jan 29, 2026

@lu-pinto @macfarla @siladu

All minor things were addressed in 85f3801

Then I did the big refactor to lower the perf impact as much as possible in: 92919d1

LMK your thoughts. Happy to give it another go if you still find that there are things that should be addressed!

@macfarla

running some nodes with this to verify on our side @CPerezz

@ahamlat ahamlat left a comment

Thanks @CPerezz for the proposal, I really like the idea of standardising these metrics across all clients.
I think the PR is not ready as-is with the current design. This is how we should implement these metrics:

  • All counters and opcode-related metrics should be handled by a tracer, with no changes to the EVM operations themselves, the same way the debug_trace* RPC calls work (see the sketch after this list).
  • The tracer should be enabled behind a new flag that adds these metrics, so we keep the possibility to disable them.
  • The latency metrics can be inserted in the code, but they need to take parallel transaction execution into account and do as little work as possible, since they are on the hot path.
  • In general, the implementation needs to account for parallel transaction execution: some transactions executed in parallel are not committed, either because the sequential thread is faster or because of a conflict.
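
A rough sketch of the tracer-based counting mentioned in the first bullet, assuming a hook that is invoked before each opcode (the exact Besu tracer interface and method names may differ, and the class below is hypothetical):

import java.util.concurrent.atomic.LongAdder;

// Opcodes are counted from a tracing hook instead of from the EVM operation
// classes, so the hot path stays untouched when the tracer is disabled.
final class OpcodeCountingSketch {
  final LongAdder sloads = new LongAdder();
  final LongAdder sstores = new LongAdder();
  final LongAdder calls = new LongAdder();
  final LongAdder creates = new LongAdder();

  // Would be called from a tracePreExecution-style hook with the opcode byte.
  void onOpcode(final int opcode) {
    switch (opcode) {
      case 0x54 -> sloads.increment(); // SLOAD
      case 0x55 -> sstores.increment(); // SSTORE
      case 0xF1, 0xF2, 0xF4, 0xFA -> calls.increment(); // CALL / CALLCODE / DELEGATECALL / STATICCALL
      case 0xF0, 0xF5 -> creates.increment(); // CREATE / CREATE2
      default -> { /* not tracked */ }
    }
  }
}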

private static final ObjectMapper JSON_MAPPER = new ObjectMapper();

private final long slowBlockThresholdMs;
private final BlockAwareOperationTracer delegate;

There is no need for delegation if we don't want to stop tracing because of an interruption.

@@ -62,6 +63,8 @@ public OperationResult execute(final MessageFrame frame, final EVM evm) {
return new OperationResult(cost, ExceptionalHaltReason.INSUFFICIENT_GAS);
} else {
frame.pushStackItem(getStorageValue(account, UInt256.fromBytes(key), frame));
// Track SLOAD for cross-client execution metrics
EvmOperationCounters.incrementSload();

Using a tracer means that we shouldn't need explicit calls like this in the EVM operations.

*/
public final class EvmOperationCounters {

private static final ThreadLocal<Counters> COUNTERS = ThreadLocal.withInitial(Counters::new);

I think ThreadLocal is a good idea in its own right, but you need to handle the case of parallel execution and conflict detection.
You need to accumulate the results of all threads at the end of block execution and discard the counters of parallel executions whose transactions were not committed, otherwise some opcodes will be counted twice. This is a tricky one BTW, cc. @matkt.

@@ -98,6 +99,7 @@ public int getStackItemsProduced() {
* @return the {@link Account}, or {@code null} if it does not exist
*/
protected Account getAccount(final Address address, final MessageFrame frame) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

@@ -113,6 +115,7 @@ protected Account getAccount(final Address address, final MessageFrame frame) {
* @return the {@link MutableAccount}, or {@code null} if it does not exist
*/
protected MutableAccount getMutableAccount(final Address address, final MessageFrame frame) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

@@ -105,6 +106,9 @@ public OperationResult execute(final MessageFrame frame, final EVM evm) {
frame.storageWasUpdated(key, newValue);
frame.getEip7928AccessList().ifPresent(t -> t.addSlotAccessForAccount(address, key));

// Track SSTORE for cross-client execution metrics
EvmOperationCounters.incrementSstore();

Using a tracer means we should not change the code on the hot path.

@@ -200,6 +159,9 @@ public void codeSuccess(final MessageFrame frame, final OperationTracer operatio
final MutableAccount contract =
frame.getWorldUpdater().getOrCreate(frame.getContractAddress());
contract.setCode(contractCode);
// Track code write for cross-client execution metrics
EvmOperationCounters.incrementCodeWrites();
EvmOperationCounters.addCodeBytesWritten(contractCode.size());

Using a tracer means we should not change the code on the hot path.

account.setCode(Bytes.concatenate(CODE_DELEGATION_PREFIX, codeDelegationAddress.getBytes()));
account.setCode(Bytes.concatenate(CODE_DELEGATION_PREFIX, codeDelegationAddress));
// Track EIP-7702 delegation set for cross-client execution metrics
EvmOperationCounters.incrementEip7702DelegationsSet();

Using a tracer means we should not change the code on the hot path.

@@ -73,6 +74,7 @@ default MutableAccount createAccount(final Address address) {
* #createAccount(Address)} (and thus all his fields will be zero/empty).
*/
default MutableAccount getOrCreate(final Address address) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

@@ -105,6 +107,7 @@ default MutableAccount getOrCreateSenderAccount(final Address address) {
* @return the account {@code address}, or {@code null} if the account does not exist.
*/
default MutableAccount getSenderAccount(final MessageFrame frame) {
EvmOperationCounters.incrementAccountReads();

Using a tracer means we should not change the code on the hot path.

matkt commented Jan 30, 2026

As Ameziane said, I think we can use a custom tracer to count opcodes (writes, reads, etc.). Block processing already supports passing tracers, so we can use the same mechanism.

Regarding parallelization: the code already allows passing a custom tracer during parallel execution, which will be different from the block's tracer. Then, during execution consolidation, if there are no conflicts we can take the tracer from the transaction executed in the background and merge it with the block's; otherwise, we re-run the transaction and the block's main tracer is the one that counts. We already have an object to store the result of a transaction executed in parallel (ParallelizedTransactionContext), where we could put the tracing result while waiting for consolidation.
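
A minimal sketch of the merge-on-consolidation idea (the types below are simplified stand-ins, not the actual ParallelizedTransactionContext API):

import java.util.List;

final class TracerMergeSketch {

  /** Simplified per-transaction counters collected by a transaction-scoped tracer. */
  record TxCounters(long sloads, long sstores, long accountReads) {
    TxCounters plus(final TxCounters other) {
      return new TxCounters(
          sloads + other.sloads, sstores + other.sstores, accountReads + other.accountReads);
    }
  }

  /** Result of a transaction speculatively executed in parallel. */
  record ParallelTxResult(TxCounters counters, boolean committedWithoutConflict) {}

  // Only counters from parallel executions that were actually committed are merged
  // into the block totals; conflicting transactions are re-run sequentially, where
  // the block's main tracer counts them instead.
  static TxCounters consolidate(
      final TxCounters blockCounters, final List<ParallelTxResult> parallelResults) {
    TxCounters merged = blockCounters;
    for (final ParallelTxResult result : parallelResults) {
      if (result.committedWithoutConflict()) {
        merged = merged.plus(result.counters());
      }
    }
    return merged;
  }
}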
