Compute operations to reprice in EIP-7904

Maria Silva, January 2026

In this report, we present which operations should have a cost increase with EIP-7904 and describe the methodology to pick them. The analysis can be reproduced in the 0.5-gasbench_data_eda notebook.

Methodology

Data

The raw benchmark data was generated by running the EEST benchmark suite with the Nethermind benchmarking tooling. The database is hosted on a PostgreSQL server managed by the Nethermind team.

Each test was run multiple times to isolate random variations in runtime and outliers. The data was collected between 2026-01-05 and 2026-01-22.

The tests still use the Prague fork. A similar analysis is still needed for the Osaka fork.

All benchmarks were run on the performance branches of each client using the following hardware specification:

| Specification | Value |
| --- | --- |
| spec_processor_type | x86_64 |
| spec_system_os | Linux |
| spec_kernel_release | 6.8.0-53-generic |
| spec_kernel_version | #55-Ubuntu SMP PREEMPT_DYNAMIC |
| spec_machine_arch | x86_64 |
| spec_processor_arch | 64bit |
| spec_cpu_model | AMD EPYC 7713 64-Core Processor |
| spec_num_cpus | 32 |

Data processing

The raw benchmark data was processed as follows:

  1. Parsing test metadata: Each test title was parsed to extract the test file, test name, test parameters, fork version, and the target opcode or precompile being tested.

  2. Filtering invalid data: Tests with execution time of 0ms were excluded. The ethrex client was also excluded from the analysis as it is still in early development.

  3. Removing outliers: For each (client, test, opcode) combination, outliers were identified using the interquartile range (IQR) method:

    • Lower threshold: Q1 - 1.5 × IQR
    • Upper threshold: Q3 + 1.5 × IQR
    • Data points outside these thresholds were flagged as outliers and excluded from the worst-case analysis.
  4. Computing worst-case performance: For each (client, test, opcode) combination, the minimum non-outlier MGas/s value was selected to represent the worst-case execution performance.

  5. Aggregating by opcode: For each opcode, the worst-performing test across all clients was identified, along with the second-worst client's performance on that same test.
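Steps 3 to 5 can be sketched with pandas on toy data. This is a minimal illustration of the IQR filtering and worst-case aggregation described above; the column names and values are hypothetical, not the actual database schema.

```python
import pandas as pd

# Toy benchmark runs for two (client, test, opcode) combinations.
# Column names and values are illustrative, not the actual schema.
runs = pd.DataFrame({
    "client": ["besu"] * 6 + ["geth"] * 6,
    "test":   ["t1"] * 12,
    "opcode": ["MULMOD"] * 12,
    "mgas_s": [21.0, 22.0, 20.6, 23.0, 21.5, 5.0,    # 5.0 is an outlier
               80.0, 82.0, 81.0, 79.0, 83.0, 200.0], # 200.0 is an outlier
})

# Step 3: flag outliers per (client, test, opcode) with the IQR method.
grp = runs.groupby(["client", "test", "opcode"])["mgas_s"]
q1 = grp.transform(lambda s: s.quantile(0.25))
q3 = grp.transform(lambda s: s.quantile(0.75))
iqr = q3 - q1
clean = runs[runs["mgas_s"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Step 4: worst case = minimum non-outlier MGas/s per combination.
worst = (clean.groupby(["client", "test", "opcode"])["mgas_s"]
              .min().rename("worst_mgas_s").reset_index())

# Step 5: per (opcode, test), keep the worst and second-worst clients.
ranked = worst.sort_values("worst_mgas_s").groupby(["opcode", "test"]).head(2)
```

Using `transform` keeps the thresholds aligned row-by-row with the original frame, so outlier removal is a single boolean mask rather than a per-group loop.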

Selecting candidates

Operations were selected as candidates for repricing based on the following criteria:

  1. Performance threshold: The worst-case MGas/s must be below 60 MGas/s. By increasing the costs of these operations, we will be able to increase our base throughput 3x from the current 20 MGas/s.

  2. Multi-client validation: To avoid penalizing all clients for a single client's implementation inefficiency, the second-worst client's performance is also considered. If the second-worst client achieves significantly better performance (>20% above the threshold), the operation is flagged for client optimization rather than repricing.

  3. Excluding very slow tests: Tests with worst-case MGas/s below 20 MGas/s were analyzed separately to understand whether the slowness is due to specific test parameters or possible errors in the data. After Osaka, we are running at 20 MGas/s, so we should not observe tests with values lower than this.
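The three criteria above can be summarized as a small decision function. This is a sketch of the selection logic only; the threshold names and the returned labels are my own, not part of the pipeline.

```python
# Thresholds from the selection criteria above.
THRESHOLD = 60.0   # MGas/s ceiling below which an operation is underpriced
FLOOR = 20.0       # below this, the test itself is inspected for data issues
TOLERANCE = 1.2    # second-worst >20% above threshold => single-client issue

def classify(worst_mgas: float, second_worst_mgas: float) -> str:
    """Apply the three selection criteria to one operation."""
    if worst_mgas >= THRESHOLD:
        return "adequately priced"
    if worst_mgas < FLOOR:
        return "inspect test data"
    if second_worst_mgas > THRESHOLD * TOLERANCE:
        return "client optimization"
    return "reprice"

# Example values (MULMOD and EQ worst cases from the benchmark data):
classify(20.60, 57.02)   # MULMOD: multiple clients struggle -> reprice
classify(22.80, 122.96)  # EQ: only one client struggles -> optimize client
```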

Performance results

Run times distribution by client

The benchmark data covers 237,733 test runs across 5 clients (Besu, Erigon, Geth, Nethermind, and Reth) on the Prague fork. The overall distribution of MGas/s shows significant variation across tests and clients.

[Figure: mgas_distribution_by_client]

The boxplot above shows that different clients have different performance characteristics.

Worst vs. second-worst client

An important consideration for repricing is whether poor performance is isolated to a single client or affects multiple implementations. This distinction separates genuine candidates for protocol-level repricing from cases that call for single-client optimization.

The chart shows the number of tests in which each client was the worst performer. We can see that Besu is the worst performer in the majority of tests, followed by Erigon.

[Figure: worst_client_count]

The next plot shows the distribution of the performance ratio between the worst client and second-worst client. Each boxplot shows this distribution by the client (i.e., when each client is the worst performer).

[Figure: worst_second_worst_gap]

For a majority of tests, the gap between the worst and second-worst client is small (<20%), suggesting that the worst client is not significantly underperforming in relation to the other clients. However, we do see some tests where the gap is much wider. These tests are most frequent when Besu is the worst client.
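The gap metric behind this plot can be sketched as the ratio of the second-worst to the worst client's MGas/s on each test. The data below is illustrative (loosely modeled on the EQ and blake2f rows of the results table), not the real per-test dataset.

```python
import pandas as pd

# Per-test worst-case MGas/s for each client; illustrative values only.
worst_cases = pd.DataFrame({
    "test":   ["eq_test"] * 3 + ["blake2f_test"] * 3,
    "client": ["besu", "geth", "reth"] * 2,
    "mgas_s": [22.80, 122.96, 130.00, 55.00, 50.07, 47.48],
})

rows = []
for test, g in worst_cases.groupby("test"):
    ranked = g.sort_values("mgas_s")
    worst, second = ranked.iloc[0], ranked.iloc[1]
    rows.append({
        "test": test,
        "worst_client": worst["client"],
        # Ratio > 1.2 marks a wide gap (single-client underperformance).
        "gap": second["mgas_s"] / worst["mgas_s"],
    })
gaps = pd.DataFrame(rows)
```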

Underpriced operations at 60 MGas/s

The following table shows operations with worst-case performance below 60 MGas/s:

| Operation | Type | Worst MGas/s | Worst Client | Second Worst MGas/s |
| --- | --- | --- | --- | --- |
| MULMOD | opcode | 20.60 | besu | 57.02 |
| MODEXP | precompile | 21.63 | geth | 25.06 |
| EQ | opcode | 22.80 | besu | 122.96 |
| SDIV | opcode | 23.14 | besu | 85.18 |
| REVERT | opcode | 23.38 | besu | 110.41 |
| SMOD | opcode | 24.99 | besu | 67.98 |
| MOD | opcode | 25.27 | besu | 70.98 |
| SAR | opcode | 27.15 | besu | 134.36 |
| MUL | opcode | 27.60 | besu | 147.87 |
| SUB | opcode | 28.80 | besu | 122.66 |
| DIV | opcode | 29.94 | besu | 88.54 |
| SHIFT | opcode | 30.47 | besu | 124.32 |
| point evaluation | precompile | 31.75 | erigon | 31.85 |
| RETURN | opcode | 32.95 | besu | 103.85 |
| ADDMOD | opcode | 32.98 | besu | 91.12 |
| CALLCODE | opcode | 34.78 | besu | 112.96 |
| CALLDATALOAD | opcode | 35.52 | besu | 77.65 |
| CALL | opcode | 35.60 | besu | 98.94 |
| DELEGATECALL | opcode | 36.30 | besu | 127.70 |
| SELFDESTRUCT | opcode | 36.65 | besu | 628.81 |
| STATICCALL | opcode | 37.60 | besu | 105.77 |
| CALLDATACOPY | opcode | 38.08 | besu | 193.12 |
| KECCAK | opcode | 38.49 | besu | 70.90 |
| SHL | opcode | 39.64 | besu | 136.58 |
| SHR | opcode | 41.84 | besu | 134.38 |
| BLS12_G1ADD | precompile | 41.97 | besu | 73.64 |
| XOR | opcode | 47.34 | besu | 122.21 |
| blake2f | precompile | 47.48 | reth | 50.07 |
| ecAdd | precompile | 47.89 | besu | 63.51 |
| BLS12_G2ADD | precompile | 49.01 | besu | 69.11 |
| SHA2-256 | precompile | 52.29 | besu | 235.32 |
| AND | opcode | 54.66 | besu | 122.87 |
| identity | precompile | 54.74 | besu | 178.01 |
| OR | opcode | 54.91 | besu | 126.59 |
| ecRecover | precompile | 55.04 | besu | 58.41 |
| TLOAD | opcode | 55.90 | erigon | 789.42 |
| CALLDATASIZE | opcode | 56.91 | besu | 134.27 |
| MSTORE | opcode | 57.07 | besu | 145.72 |
| ecPairing | precompile | 57.34 | nethermind | 67.85 |
| ecMul | precompile | 58.66 | reth | 90.32 |

This table is then split into two categories below: Candidates for repricing (where multiple clients struggle) and Operations requiring client optimization (where only one client struggles).

As expected, there are a significant number of operations where Besu performs below 60 MGas/s while the rest of the clients achieve significantly higher performance. These are the likely cases where a single-client optimization is needed.

MODEXP is still performing at 21.63 MGas/s; however, we expect this value to change on the Osaka branch, as this operation was already repriced there. We need to rerun this analysis on the newest fork.

Slow tests

We also observed tests performing at less than 20 MGas/s. Since these are likely caused by issues in the data, we exclude them from the underpriced operations. However, we need to confirm whether these are actually issues in the tests and not a new bottleneck. The tests in question are the following:

Final list

Candidates for repricing

The following operations are candidates for repricing under EIP-7904. These are operations where the worst-case performance falls below 60 MGas/s and the second-worst client does not perform significantly better, indicating that multiple implementations struggle:

| Operation | Type | Worst MGas/s | Worst Client | Second Worst MGas/s | Second Worst / Worst |
| --- | --- | --- | --- | --- | --- |
| MULMOD | opcode | 20.60 | besu | 57.02 | 2.77× |
| MODEXP | precompile | 21.63 | geth | 25.06 | 1.16× |
| SMOD | opcode | 24.99 | besu | 67.98 | 2.72× |
| MOD | opcode | 25.27 | besu | 70.98 | 2.81× |
| point evaluation | precompile | 31.75 | erigon | 31.85 | 1.00× |
| KECCAK | opcode | 38.49 | besu | 70.90 | 1.84× |
| BLS12_G1ADD | precompile | 41.97 | besu | 73.64 | 1.75× |
| blake2f | precompile | 47.48 | reth | 50.07 | 1.05× |
| ecAdd | precompile | 47.89 | besu | 63.51 | 1.33× |
| BLS12_G2ADD | precompile | 49.01 | besu | 69.11 | 1.41× |
| ecRecover | precompile | 55.04 | besu | 58.41 | 1.06× |
| ecPairing | precompile | 57.34 | nethermind | 67.85 | 1.18× |

Operations requiring client optimization

The following operations have poor performance on a single client but acceptable performance on others. These should be addressed through client optimization rather than protocol-level repricing:

| Operation | Type | Worst MGas/s | Worst Client | Second Worst MGas/s | Gap |
| --- | --- | --- | --- | --- | --- |
| EQ | opcode | 22.80 | besu | 122.96 | 5.4× |
| SDIV | opcode | 23.14 | besu | 85.18 | 3.7× |
| REVERT | opcode | 23.38 | besu | 110.41 | 4.7× |
| SAR | opcode | 27.15 | besu | 134.36 | 4.9× |
| MUL | opcode | 27.60 | besu | 147.87 | 5.4× |
| SUB | opcode | 28.80 | besu | 122.66 | 4.3× |
| DIV | opcode | 29.94 | besu | 88.54 | 3.0× |
| SHIFT | opcode | 30.47 | besu | 124.32 | 4.1× |
| RETURN | opcode | 32.95 | besu | 103.85 | 3.2× |
| ADDMOD | opcode | 32.98 | besu | 91.12 | 2.8× |
| CALLCODE | opcode | 34.78 | besu | 112.96 | 3.2× |
| CALLDATALOAD | opcode | 35.52 | besu | 77.65 | 2.2× |
| CALL | opcode | 35.60 | besu | 98.94 | 2.8× |
| DELEGATECALL | opcode | 36.30 | besu | 127.70 | 3.5× |
| SELFDESTRUCT | opcode | 36.65 | besu | 628.81 | 17.2× |
| STATICCALL | opcode | 37.60 | besu | 105.77 | 2.8× |
| CALLDATACOPY | opcode | 38.08 | besu | 193.12 | 5.1× |
| SHL | opcode | 39.64 | besu | 136.58 | 3.4× |
| SHR | opcode | 41.84 | besu | 134.38 | 3.2× |
| XOR | opcode | 47.34 | besu | 122.21 | 2.6× |
| SHA2-256 | precompile | 52.29 | besu | 235.32 | 4.5× |
| AND | opcode | 54.66 | besu | 122.87 | 2.2× |
| identity | precompile | 54.74 | besu | 178.01 | 3.3× |
| OR | opcode | 54.91 | besu | 126.59 | 2.3× |
| TLOAD | opcode | 55.90 | erigon | 789.42 | 14.1× |
| CALLDATASIZE | opcode | 56.91 | besu | 134.27 | 2.4× |
| MSTORE | opcode | 57.07 | besu | 145.72 | 2.6× |
| ecMul | precompile | 58.66 | reth | 90.32 | 1.5× |

The next step is to reach out to the individual client teams to assess the reasons for the slow performance and whether it can be improved by Glamsterdam.

Client feedback

The Besu team provided the following feedback: