FWSS gas & cost calculator

Fully dynamic: gas values are queried live from FOC Observer on every page load. provePossession cost uses a logarithmic model fitted to 670 real mainnet datasets (R²=0.9551). Base fee from Glif. FIL price from CoinGecko.


provePossession gas model — logarithmic fit (R²=0.9551)

gas = 158.67M + 8.485M × log₂(pieces)

Fit from 670 real mainnet datasets · MAE = 5.1M gas · R² = 0.9551 · updated 2026-04-29

nextProvingPeriod is flat at ~124M gas (independent of piece count)

Methodology & model explanation — how are these numbers calculated?

1. How Filecoin gas works

Every transaction on Filecoin burns gas. The cost in FIL is: gas_units × base_fee ÷ 10¹⁸. Gas units measure the computation performed — they are deterministic and do not change with market conditions. Base fee is the market price per gas unit in attoFIL (1 FIL = 10¹⁸ attoFIL), and it fluctuates with network congestion.
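As a minimal sketch of the formula above (the function names and the sample inputs are illustrative, not taken from the calculator's source):

```python
ATTOFIL_PER_FIL = 10**18  # 1 FIL = 10^18 attoFIL


def gas_cost_fil(gas_units: int, base_fee_attofil: int) -> float:
    """Cost in FIL of burning `gas_units` at `base_fee_attofil` per gas unit."""
    return gas_units * base_fee_attofil / ATTOFIL_PER_FIL


def gas_cost_usd(gas_units: int, base_fee_attofil: int, fil_price_usd: float) -> float:
    """The same cost converted to USD at a given FIL price."""
    return gas_cost_fil(gas_units, base_fee_attofil) * fil_price_usd


if __name__ == "__main__":
    # Hypothetical inputs: 124M gas, base fee of 100 attoFIL, FIL at $2.00.
    print(gas_cost_fil(124_000_000, 100))
    print(gas_cost_usd(124_000_000, 100, 2.0))
```

Because gas units are fixed per operation, only the base fee and the FIL price move the result.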

Filecoin's base fee mechanism was introduced in FIP-0001, inspired by but distinct from EIP-1559. The adjustment rule is the same (at most ±12.5% per epoch, depending on whether gas used exceeds or falls short of the 5B/block target), and Filecoin burns the base fee just as Ethereum does under EIP-1559. Filecoin adds further mechanisms: GasPremium is applied to GasLimit (not GasUsed), and an OverEstimationBurn penalises over-estimated gas limits.
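The adjustment rule can be sketched as follows. This is a simplified EIP-1559-style update under stated assumptions: it takes the average gas used per block as input and ignores any client-specific details such as packing-efficiency corrections or a minimum base fee. The function and constant names are illustrative.

```python
BLOCK_GAS_TARGET = 5_000_000_000  # 5B gas per block, as quoted above
MAX_CHANGE_DENOM = 8              # 1/8 = 12.5% maximum change per epoch


def next_base_fee(base_fee: int, avg_gas_used_per_block: int) -> int:
    """Simplified base fee update: move up or down by at most 12.5%."""
    delta = avg_gas_used_per_block - BLOCK_GAS_TARGET
    # Clamp so the change never exceeds one full target in either direction.
    delta = max(-BLOCK_GAS_TARGET, min(BLOCK_GAS_TARGET, delta))
    change = base_fee * delta // BLOCK_GAS_TARGET // MAX_CHANGE_DENOM
    return base_fee + change


if __name__ == "__main__":
    print(next_base_fee(1_000, 10_000_000_000))  # blocks full: fee rises
    print(next_base_fee(1_000, 5_000_000_000))   # exactly on target: unchanged
    print(next_base_fee(1_000, 0))               # blocks empty: fee falls
```

Sustained congestion compounds: several consecutive full epochs each raise the fee by up to 12.5% of the previous value.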

The base fee is fetched live from Filecoin mainnet via Glif's public Lotus node (Filecoin.ChainHead). Historical averages are computed from actual FWSS on-chain transactions in the FOC Observer database — they reflect what FWSS users really paid, not theoretical estimates.

2. Empirical gas measurements

Gas unit values for createDataSet, addPieces, nextProvingPeriod, and terminateService are empirical averages from all real mainnet transactions since FWSS v1.2.0 (block 5,864,769), queried live from FOC Observer on every page load. These are not estimates or assumptions — they are measured values.

—

createDataSet txns

—

addPieces txns

—

provePossession txns

—

nextProvingPeriod txns

3. The provePossession model — why logarithmic?

Unlike other operations, provePossession gas is not constant — it depends on how many pieces are in the dataset. In PDP, a piece is a Merkle root representing a file or data chunk (not a leaf). The PDP protocol always samples exactly 5 random pieces per proof (hardcoded). To sample a random piece, the contract must traverse the dataset's on-chain Sum Tree (Fenwick tree), which has depth proportional to log₂(N_pieces). More pieces → deeper Sum Tree traversal → more gas per challenge.
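To illustrate why lookup cost grows with log₂ of the piece count, here is a generic textbook Fenwick (sum) tree with a prefix-sum search. It is a sketch, not the PDPVerifier implementation: the class name, method names, and step counter are hypothetical.

```python
class SumTree:
    """Fenwick tree over per-piece leaf counts (generic sketch)."""

    def __init__(self, leaf_counts):
        self.n = len(leaf_counts)
        self.total = sum(leaf_counts)
        self.tree = [0] * (self.n + 1)
        for index, count in enumerate(leaf_counts):
            self.add(index, count)

    def add(self, index, delta):
        """Add `delta` leaves to the piece at 0-based `index`."""
        i = index + 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def find(self, leaf_offset):
        """Return (piece_index, steps) for the piece containing `leaf_offset`."""
        if not 0 <= leaf_offset < self.total:
            raise ValueError("leaf_offset out of range")
        pos, remaining, steps = 0, leaf_offset, 0
        bit = 1 << (self.n.bit_length() - 1)  # largest power of two <= n
        while bit:
            nxt = pos + bit
            if nxt <= self.n and self.tree[nxt] <= remaining:
                pos = nxt
                remaining -= self.tree[nxt]
            bit >>= 1
            steps += 1
        return pos, steps


if __name__ == "__main__":
    tree = SumTree([4, 2, 8])   # three pieces with 4, 2 and 8 leaves
    print(tree.find(5))         # offset 5 falls inside the second piece
    print(SumTree([1] * 1024).find(500)[1], SumTree([1] * 2048).find(500)[1])
```

In this sketch a 1,024-piece tree needs 11 search steps and a 2,048-piece tree needs 12, so each doubling of the piece count adds one step per challenge.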

Real mainnet data confirms this clearly: a dataset with 130 small pieces costs ~209M gas to prove, while a dataset with 1 large piece of similar total data costs only ~147M gas. Total data size does not drive proving cost; piece count does. A logarithmic regression across all mainnet datasets gives R²=0.9551 with the formula gas ≈ 158.67M + 8.485M × log₂(pieces).

This gives a logarithmic relationship between piece count and gas cost — each doubling of pieces adds a constant amount of gas (~8.5M), not a proportional amount.

gas_provePossession(N) = α + β × log₂(N)
α = 158,670,000 gas (base cost, independent of piece count)
β = 8,485,000 gas (cost per doubling of piece count)

Example: N=1 → 158.7M gas | N=1,000 → 243.2M gas | N=1,000,000 → 327.8M gas
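The model is easy to evaluate directly. The sketch below uses the α and β values quoted above; the function name is illustrative.

```python
import math

ALPHA = 158_670_000  # base cost in gas, independent of piece count
BETA = 8_485_000     # additional gas per doubling of piece count


def prove_possession_gas(pieces: int) -> float:
    """Modelled provePossession gas for a dataset with `pieces` pieces."""
    if pieces < 1:
        raise ValueError("a dataset needs at least one piece")
    return ALPHA + BETA * math.log2(pieces)


if __name__ == "__main__":
    for n in (1, 100, 1_000_000):
        print(f"N={n:>9,} -> {prove_possession_gas(n) / 1e6:.1f}M gas")
```

Going from 1,024 to 2,048 pieces adds exactly β to the estimate, which is the "constant amount per doubling" property described above.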

4. How the model was fitted — and why not multilinear?

FIPs discussion #761 proposes a multilinear regression model for Filecoin gas estimation. That proposal targets the miner actor cron job, where several structurally distinct operations contribute independently to gas — live partitions, fault partitions, precommit expiries — each with a different marginal cost. Because these contributions are additive and the variance across the dataset is driven by multiple independent factors, a weighted multilinear model is appropriate there.

For provePossession, the situation is structurally different. A multilinear approach was considered, but the data shows a single dominant predictor: piece count, via the Sum Tree depth mechanism described above. All other candidate variables are either constant across all observations (challenge count is always 5, hardcoded in the PDP contract) or collinear with piece count (raw dataset size, tree depth). Adding further predictors would introduce model complexity without improving fit.

A single-predictor logarithmic regression therefore captures the data as well as a multilinear model would — with less risk of overfitting and easier interpretability. The model was fitted using ordinary least squares (OLS) on 670 real mainnet datasets.

670

datasets in training set

0.9551

R² (variance explained)

3.2M

MAE (mean abs error)

1 → 868k

piece count range

On weights: the multilinear model in FIPs discussion #761 uses a weighted combination of subgroup regressions (e.g. separate fits for different precommit expiry regimes) to improve accuracy across heterogeneous data. For provePossession, no such subgroup structure exists — the logarithmic transformation is sufficient to linearise the relationship uniformly across the full range of piece counts (1 to 868k). Weights are therefore not needed and not used. If future protocol changes introduce structurally new sources of gas variance (for example, variable challenge counts or new proof types), the model should be revisited and the multilinear approach from FIPs #761 considered.
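The single-predictor OLS fit described in this section can be sketched in a few lines of standard-library Python. The data below is synthetic (generated from the published coefficients with no noise), purely to show that the procedure recovers α and β; it is not the FOC Observer dataset, and the function name is illustrative.

```python
import math


def fit_log_model(piece_counts, gas_values):
    """Ordinary least squares for gas = alpha + beta * log2(pieces).

    Returns (alpha, beta, r_squared, mean_abs_error).
    """
    xs = [math.log2(n) for n in piece_counts]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(gas_values) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, gas_values))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [y - (alpha + beta * x) for x, y in zip(xs, gas_values)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in gas_values)
    r_squared = 1 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / count
    return alpha, beta, r_squared, mae


if __name__ == "__main__":
    # Synthetic, noise-free sample spanning the observed piece-count range.
    counts = [1, 2, 4, 8, 1024, 868_000]
    gas = [158_670_000 + 8_485_000 * math.log2(n) for n in counts]
    print(fit_log_model(counts, gas))
```

With real, noisy observations the same function would report an R² below 1 and a non-zero MAE.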

5. nextProvingPeriod — why a constant?

nextProvingPeriod manages the proof schedule (which epoch the next proof is due). It does not access the Merkle tree of pieces, so its gas cost is independent of piece count. Across 32,393 real transactions, the gas is essentially flat at ~124M with a standard deviation of ~7.7M. A constant (the empirical mean, fetched live from FOC Observer) is therefore the correct model.

6. Data sources and reproducibility

Value	Source	How fetched
Gas unit averages	FOC Observer REST API	Live query on page load (`POST /sql`)
Base fee (current)	Glif public Lotus node	Live (`Filecoin.ChainHead`)
Base fee (historical)	FOC Observer transaction data	Aggregated from `pdp_pieces_added.effective_gas_price`
FIL/USD price	CoinGecko public API	Live on page load
Model coefficients (α, β)	OLS regression on FOC Observer data	Hardcoded (refitted periodically)
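A minimal sketch of the base fee lookup listed in the table, using a JSON-RPC call to `Filecoin.ChainHead`. The endpoint URL and the response field names (`Blocks`, `ParentBaseFee`) are assumptions about the public Glif/Lotus API rather than details given on this page, so verify them against the node you use.

```python
import json
import urllib.request

# Assumed public endpoint; replace with the Lotus node you actually use.
GLIF_RPC_URL = "https://api.node.glif.io/rpc/v1"


def chain_head_request() -> bytes:
    """JSON-RPC 2.0 request body for Filecoin.ChainHead."""
    payload = {"jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 1}
    return json.dumps(payload).encode("utf-8")


def parse_base_fee(response: dict) -> int:
    """Extract the base fee (attoFIL) from a ChainHead response.

    Assumes the tipset lists block headers under "Blocks" and that each
    header carries "ParentBaseFee" as a decimal string.
    """
    blocks = response["result"]["Blocks"]
    return int(blocks[0]["ParentBaseFee"])


def fetch_base_fee(url: str = GLIF_RPC_URL, timeout: float = 10.0) -> int:
    """Query the node and return the current base fee in attoFIL."""
    request = urllib.request.Request(
        url, data=chain_head_request(), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return parse_base_fee(json.load(resp))
```

Keeping the parsing separate from the network call makes it possible to test the extraction against a saved response.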

provePossession cost by piece count

Model: gas = 158.67M + 8.485M × log₂(pieces), R²=0.9551, 670 real datasets. A piece is a Merkle root — more pieces = deeper Sum Tree traversal = more gas per proof.

Base fee

Base fee: — attoFIL

FIL price

FIL: —

Pieces in dataset

Quick jump:

Batch size (pieces per addPieces tx)

📌 Note 1 — Do these values update when I refresh FOC Observer data?

No. The batching model (gas/tx = 140M + 66M × k) is a static fit — it reflects the cost structure of PDPVerifier.sol, not usage volume. It would only change if the contract is upgraded to modify addPieces logic. The values that do update on every page load (from FOC Observer) are: addPieces ×1–×10 empirical averages, provePossession, nextProvingPeriod, createDataSet, and terminateService.

📌 Note 2 — How does the model work above ×10 pieces per batch?

The model is gas/tx = 140M + 66M × batch_size, fitted from ×1–×10 real mainnet data (R²=0.9551). It works because each addPieces tx has:
• Fixed overhead ~140M: FVM tx base cost + Sum Tree setup + FWSS listener callback (cross-contract FVM call)
• Marginal ~66M per piece: one SSTORE to insert each leaf into the Sum Tree. Note: piecesAdded() FWSS callback does NOT call modifyRailRate — rate updates happen only in nextProvingPeriod()
Amortizing the fixed overhead across more pieces drives gas/piece toward the ~66M asymptote. The linear fit extrapolates well because the per-piece cost is a constant FVM SSTORE (FEVM maps EVM opcodes to FVM storage, cost is fixed per slot write) — there is no structural reason it would change at larger batch sizes.

⚠ Beyond ×10: extrapolation only. As of today, zero mainnet transactions have been observed with more than 10 pieces per batch. The ×10 limit appears to be a client-side cap, not a contract constraint. If future FOC Observer data shows txns with batch > 10, this calculator will be updated with observed values and the model re-validated. The extrapolation is conservative — if anything, shared Sum Tree ancestor nodes (FVM-agnostic tree property) at larger batches could make the actual cost lower than predicted.
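The batching model in these notes can be evaluated as below. The constants are the ~140M fixed and ~66M per-piece figures quoted above; the function names are illustrative, and anything beyond ×10 remains an extrapolation.

```python
FIXED_GAS = 140_000_000      # per-tx overhead from the static fit
PER_PIECE_GAS = 66_000_000   # marginal gas per piece in the batch


def add_pieces_gas_per_tx(batch_size: int) -> int:
    """Modelled gas for one addPieces tx carrying `batch_size` pieces."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return FIXED_GAS + PER_PIECE_GAS * batch_size


def total_add_gas(pieces: int, batch_size: int) -> int:
    """Total gas to add `pieces` pieces, with a smaller final batch if needed."""
    full_batches, remainder = divmod(pieces, batch_size)
    total = full_batches * add_pieces_gas_per_tx(batch_size)
    if remainder:
        total += add_pieces_gas_per_tx(remainder)
    return total


def batching_saving(pieces: int, batch_size: int) -> float:
    """Fraction of gas saved compared with one piece per tx."""
    if pieces < 1:
        raise ValueError("pieces must be at least 1")
    return 1 - total_add_gas(pieces, batch_size) / total_add_gas(pieces, 1)


if __name__ == "__main__":
    for k in (1, 5, 10):
        print(k, add_pieces_gas_per_tx(k) / k / 1e6, "M gas per piece")
    print(f"saving at x10: {batching_saving(100, 10):.1%}")
```

Under this model, 100 pieces in batches of 10 cost 8,000M gas against 20,600M unbatched, a saving of about 61%.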

addPieces / day

removePieces / day

Daily

provePossession

—

gas / proof

Proving gas

—

PP + NPP

addPieces

—

USD

removePieces

—

USD

Batching saving

—

vs ×1 no batching

Total / day

—

USD

Monthly (30 days)

Proving cost

—

30 × daily PP+NPP

addPieces cost

—

30 × daily add

removePieces cost

—

30 × daily remove

Total / 30 days

—

USD

Piece count slider is log₂ scale · click labels to jump · removePieces: observed values K≤144, K^0.614 extrapolation beyond

Gas values — source per operation

Operations marked LIVE use fresh averages from FOC Observer on every page load. MODEL uses the logarithmic fit, recalculated with your piece count.

Base fee — live & historical

Current value fetched live from Glif's Lotus node (Filecoin.ChainHead). Historical averages are computed from the effective_gas_price field on real FWSS on-chain transactions in the FOC Observer database — these reflect what users actually paid.

Current (Glif live)

—

attoFIL · latest block

Historical low

—

attoFIL · min observed

24h avg

—

attoFIL · FOC Observer

30d avg

—

attoFIL · FOC Observer

90d avg

—

attoFIL · FOC Observer

High spike

—

attoFIL · observed max

Select scenario

Custom base fee (attoFIL)

FIL price

Live from CoinGecko. Override manually to model different price scenarios.

FIL price (USD) —

Minimum floor price — computed from live data & model

$—

Minimum a client must pay per dataset per 30 days to cover SP daily proving gas.
Formula: (gas_PP(N) + gas_NPP) × base_fee ÷ 10¹⁸ × FIL_price × 30
where gas_PP(N) = 158.67M + 8.485M × log₂(N). Updates automatically.
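A sketch of the floor price formula, assuming the ~124M nextProvingPeriod constant quoted elsewhere on this page (the live page uses the current empirical mean instead). The function name and sample inputs are illustrative.

```python
import math

ALPHA = 158_670_000   # provePossession base gas
BETA = 8_485_000      # provePossession gas per doubling of piece count
NPP_GAS = 124_000_000  # assumed flat nextProvingPeriod gas


def floor_price_usd(pieces: int, base_fee_attofil: int, fil_price_usd: float,
                    days: int = 30, npp_gas: int = NPP_GAS) -> float:
    """Minimum USD per dataset over `days` to cover daily proving gas."""
    if pieces < 1:
        raise ValueError("a dataset needs at least one piece")
    daily_gas = ALPHA + BETA * math.log2(pieces) + npp_gas
    daily_fil = daily_gas * base_fee_attofil / 10**18
    return daily_fil * fil_price_usd * days


if __name__ == "__main__":
    # Hypothetical inputs: 100 pieces, base fee 1,000,000 attoFIL, FIL at $2.00.
    print(floor_price_usd(100, 1_000_000, 2.0))
```

The result scales linearly with base fee and FIL price but only logarithmically with piece count.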

Usage parameters

The gas values above are unit costs per single operation (live from mainnet). These parameters define your usage volume — the cost breakdown below multiplies unit costs × your volume to give total cost for the period.

Each parameter drives a specific cost: Pieces per dataset → provePossession gas (log₂N model). Pieces to add → addPieces cost. Batching → pieces per tx, reduces cost per piece. Datasets × Observation period → scales all proving costs.

Observation period — multiplies all costs 30 days

Datasets — multiplies provePossession + NPP 1

Pieces per dataset — drives provePossession gas via log₂(N) 100

Pieces to add (uploads) — drives addPieces cost 100

Batching

→ addPieces: 100 txns → provePossession: 215.0M gas for 100 pieces

Pieces to remove (optional) 0

⚠ removePieces — cost structure, scaling, and alternatives

Why it's expensive: The PDPVerifier contract stores each dataset's pieces in an on-chain Sum Tree (Fenwick tree), used to efficiently sample random challenges during proving. Removing a piece calls sumTreeRemove() once per piece, which rewrites O(log N) storage slots per piece — each SSTORE costs ~5,000 gas units (FVM/FEVM). This is unavoidable given the current data structure.

Scaling (real mainnet data, 159 txns):
×1: ~437M | ×5: ~1,010M | ×10: ~1,700M | ×25: ~3,760M | ×50: ~6,380M | ×80: ~9,530M

Gas/piece decreases as K grows, from ~437M/piece at ×1 down to ~120M/piece at ×80. Why? The Sum Tree has internal nodes that are shared by multiple pieces. When you remove piece #1 and piece #2 in the same tx, they share many ancestor nodes in the tree. An SSTORE on an already-written slot within the same tx costs only ~200 gas instead of ~5,000 gas (warm access), so the second and subsequent pieces get their ancestor updates much cheaper. This "warm cache" effect grows as K grows, because more pieces share more nodes, which is why the marginal cost keeps falling.

There is likely a true asymptote around ~60–80M gas/piece, corresponding to the irreducible cost of the leaf-level operations that cannot be shared. The data at ×125–126 shows ~65M/piece (2 txns: sparse, treat as indicative). Beyond that, the cost/piece probably flattens.

Can we do better? (not today — requires a contract upgrade): The scheduleRemovals() function already exists in PDPVerifier and was designed to defer removals to nextProvingPeriod, spreading the cost across the daily proving cycle. However, it's not wired up to FWSS in a way that amortizes the Sum Tree update cost — the tree still gets updated per-piece. A future upgrade could implement lazy deletion: mark pieces as "removed" in a bitmap (cheap), rebuild the tree only when needed for challenges, or batch-update the tree during the proving period when gas cost is already being paid.

Best strategy today:

Cost breakdown — total for the period

daily: paid by the SP every day per dataset, regardless of client activity · upload: paid when adding data · one-time: paid once per dataset lifecycle

Operation	Txns	Gas/tx	Total gas	FIL cost	USD	Type
Total

Summary metrics

Total cost

—

for 30 days

Cost per piece (upload)

—

USD · amortised

Batching saving

—

gas per piece vs ×1

SP daily proving cost

—

per dataset · per day

SP 30-day proving cost

—

per dataset

provePossession gas

—

for — pieces

USD cost by operation type

provePossession gas vs piece count — model fit

Blue dots = real on-chain data (670 datasets). Green line = logarithmic model. X axis is log scale.

These costs are separate from the per-dataset costs in the Calculator tab. They happen at the FilecoinPay and SP Registry layer — the economic infrastructure underneath FWSS.

SP settlement & payment layer parameters

SP rail settlements per tx 10 rails

Settlement txns (SP, over period) 1 txns

Client deposits (fp_deposit) 1 txns

Operator approvals (fp_operator_approval) 1 txns

Complete FWSS gas cost reference — all operations, all layers

Every on-chain operation in the FWSS stack, with real mainnet data, technical explanation, who pays, and path-forward notes where improvements are possible. Data from FOC Observer (block 5,864,769 → present). Amounts in gas units; cost in FIL = gas × base_fee ÷ 10¹⁸.

Dataset lifecycle — paid by client or operator

Operation	Avg gas	Txns	Who pays	When	Notes
`createDataSet`	~1,114M	381	Client	Once per dataset	Creates PDPVerifier proof set + FWSS listener + 2–3 FilecoinPay rails (PDP, CDN, cache-miss). Also fires `fp_rail_created` + `fp_rail_lockup_modified` + `fp_rail_rate_modified` — all inside the same tx, gas already included. Path forward: cost is dominated by 3 separate storage writes for rail creation; batching rail setup could reduce this ~20–30%.
`addPieces ×1`	~315M	330,786	Client/SP	Per upload batch	Inserts piece CID + size into PDPVerifier Sum Tree. Single piece = base overhead only. By far the most frequent operation on mainnet (330k txns). Batching up to ×9 pieces per tx reduces cost per piece significantly — see addPieces scaling below.
`addPieces ×N (batched)`	315M + ~86M/piece	~730 (×2–×9)	Client/SP	Per upload batch	Each additional piece in the same tx adds ~86M gas (amortized). Max observed on mainnet: ×9 pieces = ~927M gas. Path forward: the max batch size is not enforced by the contract — ×9 is the current client-side limit. Raising it would reduce per-piece cost further. Marginal cost per piece decreases with batch size (shared Sum Tree ancestor nodes within same tx — FVM-agnostic tree property).
`removePieces ×K`	437M + ~170M×(K-1)	159	Client/SP	When deleting data	Calls `sumTreeRemove()` per piece — each rewrites O(log N) Sum Tree nodes. Scales sublinearly due to shared Sum Tree ancestor nodes within same tx (FVM-agnostic tree property): gas/piece falls from ~437M at ×1 to ~120M at ×80, with likely asymptote ~60–80M/piece. Path forward: lazy deletion via bitmap marking would reduce this to O(1) per piece; tree update deferred to `nextProvingPeriod`. Requires contract upgrade. Today: use terminate+recreate if removing >65% of pieces.
`terminateService`	~138M	399	Client/SP	Once, end of lifecycle	Terminates PDP proving + all FilecoinPay rails. Gas is flat — does not scale with piece count or dataset size. Emits `fwss_service_terminated` + `fwss_pdp_payment_terminated`. Note: this terminates the PDP storage layer; CDN service can be terminated separately via `terminateCDNService`.
`terminateCDNService`	~266M	3	Client/SP	Once, CDN only	Terminates only the CDN payment rails (cdn_rail + cache_miss_rail), leaving PDP storage active. Useful if client wants to stop paying for CDN but keep storage proving. Very few mainnet examples — treat gas as approximate.

Proving — paid by SP (daily, automatic)

Operation	Avg gas	Txns	Who pays	When	Notes
`provePossession`	158.67M + 8.485M×log₂(N)	~400k+	SP	Daily, per dataset	Submits 5 Merkle inclusion proofs. Gas scales logarithmically with piece count N (R²=0.9551 on 670 datasets). At N=1: ~159M. At N=1M: ~328M. The logarithmic scaling is fundamental to the PDP design: 5 challenges traverse O(log N) tree depth. Path forward: challenge count (currently hardcoded at 5) could be made dynamic; reducing it to 3 could save up to ~40% of gas, at the cost of weaker security guarantees.
`nextProvingPeriod`	~124M	~400k+	SP	Daily, per dataset	Advances proving window, generates next challenge epoch. Gas is ~124M flat (mainnet), independent of piece count. Includes FWSS callback: updatePaymentRates() → FilecoinPay.modifyRailPayment() (1 call base, 3 calls if CDN). Calibnet benchmark says ~54M; the difference is real FilecoinPay settlement on mainnet. Paired with `provePossession`, together ~283–452M gas/dataset/day for N between 1 and 1M. Important: this tx also processes any pending `scheduledRemovals`, which is why removals are deferred to this call, but the Sum Tree update cost is still paid here per removed piece.
`pdp_proof_fee_paid` (FIL)	~191M gas	11,428	SP	Inside provePossession	A FIL-denominated proof fee paid inside the `provePossession` tx — the gas is already counted in provePossession's total. The fee amount (in FIL, not gas) is calculated by the `PDPFees` library based on data size, proving duration, and current gas costs. The fee is zeroed when base fee is very high (gas cost >5% of expected storage reward). This is an additional FIL cost beyond gas — separate from the USDFC payment rails.

FilecoinPay payment layer — infrastructure costs

Operation	Avg gas	Txns	Who pays	When	Notes
`fp_deposit`	~146M	124	Client	Before creating datasets	Deposits USDFC into the client's FilecoinPay account. Required before `createDataSet` — the account must have enough USDFC to cover rail lockups. Not per-dataset; one deposit can fund many datasets. Path forward: gas cost is low (~146M) and unavoidable; no significant optimization opportunity.
`fp_operator_approval`	~138M	44	Client	Once per operator	Authorizes an operator (e.g. FWSS contract) to create and manage payment rails on behalf of the client, up to a specified rate and lockup allowance. Required once before any FWSS interaction. Gas is flat and small. Path forward: none needed.
`fp_rail_created`	~1,114M	784	Client (via FWSS)	Inside createDataSet	Creates a payment rail between client and SP. Fires inside `createDataSet` — gas already counted there. Each dataset creates 2–3 rails (PDP storage, CDN delivery, cache-miss). Rail creation dominates the gas cost of `createDataSet`. Path forward: combining rail creation into a single storage write rather than 3 separate calls could reduce `createDataSet` cost significantly.
`fp_rail_settled`	~99M (×1 rail) → ~853M (×10) → ~9.7B (×65)	4,216	SP	Periodic — to collect payments	SP calls this to settle accumulated USDFC from clients. Gas scales linearly with the number of rails settled in one tx (~80M per additional rail). SPs typically batch many rails in a single tx (most common: 10 rails/tx = ~853M gas). One SP settled 65 rails in one tx (9.7B gas!) — extreme but valid. This is the SP's revenue-collection cost, separate from all per-dataset costs. Frequency depends on SP: some settle daily, some weekly. Path forward: rail settlement gas could be reduced if FilecoinPay implemented a merkle-proof-based batch settlement instead of iterating rails on-chain.
`fp_withdrawal`	~107M	2	Client or SP	When withdrawing USDFC	Withdraws USDFC from FilecoinPay back to wallet. Very rare on mainnet (only 2 txns). Low gas cost.
`fp_rail_rate_modified`	~317M	331,782	System (FWSS contract)	On global pricing update	Triggered automatically when FWSS updates the global storage price — fires on every active rail. With 632+ datasets × 2 rails each, a single price update generates 1,200+ txns. Gas paid by the FWSS contract, not by users. Very high total gas impact on the network during price updates. Path forward: instead of updating all rails on a price change, a "reference rate" pattern could let rails inherit a global rate without individual updates — would reduce this to ~1 tx per price change instead of N×rails txns.
`fp_burn_for_fees`	~98M	7	Keeper (auction)	Dutch auction, periodic	FilecoinPay accumulates a network fee (% of all settlements). Keepers bid FIL (which is burned) to claim the accumulated USDFC. Dutch auction — price starts high, decays over time. Not a user cost; paid by external keepers. Gas is low (~98M). Economic note: this creates a deflationary FIL pressure proportional to FWSS settlement volume.

CDN layer (FilBeam) — only if with_cdn=true

Operation	Avg gas	Txns	Who pays	When	Notes
`fwss_cdn_rails_topped_up`	~1,374M	5	Client	When CDN lockup runs low	Adds USDFC to the CDN lockup (cdn_rail + cache_miss_rail). CDN rails require a fixed lockup (not just streaming rate) to cover burst delivery costs. If the lockup runs out, CDN service may degrade. Very few mainnet examples (5 txns) — treat as approximate. Gas is high because it modifies two rail lockups simultaneously.
`fb_usage_reported`	~9M	3	FilBeam operator	Periodic CDN usage rollup	FilBeam reports off-chain CDN usage on-chain — total bytes served, cache hit ratio, cache miss bytes. Very cheap (~9M gas). Triggers CDN settlement. Not a user cost. Note: `cdn_bytes_used` = total egress (hits + misses); `cache_miss_bytes_used` = subset that required origin fetch from SP. Cache hit ratio = 1 - (cache_miss / cdn_total).
`fb_cdn_settlement`	~122M	3	FilBeam operator	After usage report	Settles the CDN payment rail based on reported usage. Charged against the CDN lockup. Gas is low and paid by FilBeam. Very few mainnet examples — CDN usage is early-stage.
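The cache hit ratio defined in the `fb_usage_reported` row is a one-line calculation. The sketch below uses the two field names from that row as parameter names; the function itself and its zero-traffic convention are illustrative.

```python
def cache_hit_ratio(cdn_bytes_used: int, cache_miss_bytes_used: int) -> float:
    """Cache hit ratio = 1 - (cache_miss / cdn_total).

    `cdn_bytes_used` is total egress (hits + misses); `cache_miss_bytes_used`
    is the subset that required an origin fetch from the SP.
    """
    if cdn_bytes_used < 0 or cache_miss_bytes_used < 0:
        raise ValueError("byte counts must be non-negative")
    if cache_miss_bytes_used > cdn_bytes_used:
        raise ValueError("cache misses cannot exceed total egress")
    if cdn_bytes_used == 0:
        return 0.0  # convention chosen here: no traffic means no hits
    return 1 - cache_miss_bytes_used / cdn_bytes_used


if __name__ == "__main__":
    # Hypothetical report: 1,000 bytes served, 250 of them cache misses.
    print(cache_hit_ratio(1_000, 250))
```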

Never observed on mainnet (0 txns)

pdp_storage_provider_changed, fwss_data_set_sp_changed — SP migration (designed but not yet used). pdp_data_set_deleted, pdp_fee_update_proposed — dataset deletion and fee governance. spr_provider_registered, spr_product_added — SP onboarding (may use different block range). These operations exist in the contract and are available, but have not been exercised since FWSS v1.2.0 launch.

Analysis based on reading PDPVerifier.sol source code (823 lines, commit 5d59f00) and real mainnet gas data from FOC Observer. Mandate: reduce gas consumption of the PDP protocol. Improvements are ranked by impact × implementation complexity.

LOW PRIORITY

1. Skip Pyth oracle call when proof fee is zero (not applicable today — fee is always > 0 on mainnet)

Every provePossession unconditionally calls the Pyth oracle (IPyth.getPrice()) to compute the proof fee. This external call costs ~20,000 gas regardless of outcome. On mainnet, 100% of 30,909 pdp_proof_fee_paid events have fee > 0 (min observed: 33,469 attoFIL, max: ~3.9T attoFIL) — the fee model never produces zero fees under current parameters. This optimization does not apply today and has been moved to low priority. It would only become relevant if the fee model were changed to allow zero-fee proving periods (e.g. during bootstrapping or for small datasets below a floor threshold).

Gas saved / txn

~20,000

when fee = 0

Complexity

Low

3 lines of code

Contract upgrade

Yes (minor)

no state migration

    // Current (provePossession, ~L575):
    uint256 fee = PDPFees.proofFee(PYTH.getPrice(FIL_USD_PRICE_FEED_ID), ...);

    // Proposed: only call the oracle when a non-zero fee is expected
    uint256 expectedFee = PDPFees.estimateProofFee(...);
    uint256 fee = expectedFee > 0 ? PDPFees.proofFee(PYTH.getPrice(FIL_USD_PRICE_FEED_ID), ...) : 0;

Where in code: PDPVerifier.sol lines 575–591, provePossession() function, fee calculation block. The Pyth call is PYTH.getPrice(FIL_USD_PRICE_FEED_ID) — a cross-contract call that always costs gas even if the result is unused.

MEDIUM IMPACT

2. Struct packing: merge 3 separate mappings into 1

The contract stores piece data across three separate mapping(uint256 => mapping(uint256 => ...)): pieceCids, pieceLeafCounts, and sumTreeCounts. Every operation that touches a piece (addPieces, provePossession, removePieces) must do 3 separate SLOAD / SSTORE operations — one per mapping. Each cold SLOAD costs 2,100 gas. Packing into a single struct reduces this to 1 SLOAD per piece.

Gas saved / piece

~4,200

2 fewer SLOAD per piece

Complexity

Medium

data migration needed

Impact on addPieces

~10–15%

from 281M → ~240M gas

    // Current (PDPVerifier.sol L113–136): 3 separate mappings
    mapping(uint256 => mapping(uint256 => Cids.Cid)) pieceCids;
    mapping(uint256 => mapping(uint256 => uint256)) pieceLeafCounts;
    mapping(uint256 => mapping(uint256 => uint256)) sumTreeCounts;

    // Proposed: 1 mapping, 1 SLOAD per piece
    struct PieceRecord { Cids.Cid cid; uint128 leafCount; uint128 sumTree; }
    mapping(uint256 => mapping(uint256 => PieceRecord)) pieces;

Caveat: requires a migration script to rewrite all existing piece data into the new struct layout on upgrade. The team already noted this area as a performance TODO (// TODO PERF: https://github.com/FILCAT/pdp/issues/16). With 1.09M pieces across 636 active datasets on mainnet, the migration must be done carefully to avoid hitting block gas limits.

MEDIUM IMPACT

3. addPieces light variant: skip listener extraData allocation

Every addPieces call allocates a full IPDPTypes.PieceData[] array in memory and passes it to listener.piecesAdded() via the callback. Memory allocation in Solidity costs gas proportional to array size, and if the listener doesn't actually use pieceData (which FWSS may not in all paths), the allocation is pure waste. An addPiecesLight() variant with no callback data would save ~5–10% on the most frequent operation on mainnet (1.15M addPieces ×1 transactions, avg 281M gas each).

Gas saved / txn

~5–10%

of addPieces cost

Complexity

Low

additive, no breaking change

Txns affected

1.15M+

addPieces ×1 on mainnet

Where in code: PDPVerifier.sol lines 425–447, addPieces() function. The IPDPTypes.PieceData[] memory pieceData array is constructed for every call before the listener callback, regardless of whether the listener reads it. A new addPiecesLight() function would skip this construction for callers that don't need it.

HIGH IMPACT

4. Batch sumTreeRemove: O(K × log N) → O(log N + K)

removePieces calls sumTreeRemove() once per piece in the batch. Each call rewrites O(log N) storage slots, at ~5,000 gas per cold SSTORE and ~200 gas per warm SSTORE. Real mainnet data shows gas/piece falling from 407M at ×1 to ~60M at ×143, because pieces removed in the same tx share Sum Tree ancestor nodes (an FVM-agnostic tree property), so each subsequent SSTORE on an already-warm slot is cheaper. A rewritten batch version could compute the total delta across all K pieces and apply it in a single tree traversal, reducing the SSTORE count from O(K × log N) to O(log N + K).

Gas saved (×1)

~30–50%

at small K

Complexity

High

rewrites core algorithm

Asymptote today

~60M/piece

at ×143 (mainnet)

Real data (mainnet, 253 removePieces txns):

Pieces removed	Avg gas	Gas/piece	Txns
×1	407M	407M	59
×5	1,010M	202M	6
×9	840M	93M	19
×20	1,755M	88M	6
×50	5,900M	118M	6
×80	9,530M	119M	1
×143	8,484M	~59M	4

The drop at ×9 and ×143 is consistent with the warm-slot effect: pieces removed together share Sum Tree ancestor nodes within the same tx (FVM-agnostic). A batch-aware algorithm would achieve the ~60M/piece asymptote even at ×1 by computing all deltas before writing to storage.
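The sharing of ancestor nodes can be made concrete with a generic Fenwick tree. The sketch below counts how many node updates K removals trigger when done one piece at a time, versus how many distinct nodes are touched overall. It is an illustration of the tree property only: the layout is a textbook Fenwick tree, not necessarily the one in PDPVerifier.sol, and the function names are hypothetical.

```python
def update_path(index: int, size: int) -> list:
    """1-based Fenwick nodes written when the 0-based leaf `index` changes."""
    path = []
    i = index + 1
    while i <= size:
        path.append(i)
        i += i & -i
    return path


def count_node_writes(indices, size: int):
    """Return (per_piece_writes, distinct_nodes) for removing `indices`.

    per_piece_writes: node updates if every removal walks its own path.
    distinct_nodes:   nodes a batch-aware pass would need to write once.
    """
    indices = list(indices)
    per_piece = sum(len(update_path(i, size)) for i in indices)
    distinct = len({node for i in indices for node in update_path(i, size)})
    return per_piece, distinct


if __name__ == "__main__":
    # Eight adjacent pieces removed from a 1,024-slot tree.
    print(count_node_writes(range(8), 1024))
```

For eight adjacent pieces in a 1,024-slot tree this gives 76 per-piece node updates against 15 distinct nodes. Pieces that are far apart in the tree share fewer ancestors, so the gap narrows for scattered removals.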

QUICK WIN

5. Cache scheduled removal counter to avoid array .length SLOAD

Every call to scheduleRemovals() reads scheduledRemovals[setId].length from storage to check the MAX_ENQUEUED_REMOVALS = 2000 cap. Reading an array's length from a storage mapping costs a cold SLOAD (2,100 gas) on the first access. Adding a separate mapping(uint256 => uint256) scheduledRemovalCount that is incremented/decremented on schedule/flush would make this check cost ~200 gas (warm read from a simple slot) instead of 2,100.

Gas saved / call

~1,900

cold → warm SLOAD

Complexity

Low

1 extra uint mapping

ARCHITECTURAL

6. Lazy deletion via bitmap: defer Sum Tree updates to nextProvingPeriod

The root cause of removePieces being expensive is that the Sum Tree must be updated immediately and consistently — because provePossession uses it to sample challenge indices. The existing scheduleRemovals() was designed to defer this, but the actual Sum Tree writes still happen eagerly. A true lazy deletion approach would mark pieces as "removed" in a cheap bitmap (mapping(uint256 => mapping(uint256 => bool)), cost: 1 SSTORE per piece = ~5,000 gas vs ~400M today), then re-sample during provePossession if a challenge lands on a removed piece. The Sum Tree would be updated lazily during nextProvingPeriod, when gas is already being paid by the SP.

removePieces cost (new)

~5,000 gas/piece

vs 407M today at ×1

Complexity

Very High

changes proving semantics

Tradeoff

Cost shifts

to SP, not eliminated

Important nuance: this doesn't eliminate gas — it moves the cost from the client/operator paying for removePieces to the SP paying during nextProvingPeriod. Total gas consumption across the network stays similar. The benefit is economic: the party removing data (client) no longer pays a large upfront gas cost; instead the SP's ongoing proving cost increases marginally. Requires careful protocol design to avoid griefing vectors where clients remove many pieces to inflate SP proving costs.

Summary — ranked by implementation priority

Optimization	Gas impact	Complexity	Upgrade	Notes
#1 Skip Pyth when fee=0	~20k/prove	N/A today	Minor	Fee is always >0 on mainnet (min 33k attoFIL). Only relevant if fee model changes to allow zero-fee periods.
#3 addPieces light variant	~5–10% addPieces	Low	Minor	New additive function. No breaking change. Affects 1.15M+ txns on mainnet.
#5 Cached removal counter	~1,900/schedule	Low	Minor	1 extra uint mapping. Simple bookkeeping change. No migration.
#2 Struct packing (3 mappings → 1)	~10–15% addPieces	Medium	Migration	Already flagged as TODO PERF in code. Requires migrating 1.09M pieces. High payoff.
#4 Batch sumTreeRemove	~30–50% removePieces	High	Major	Rewrites core Fenwick tree algorithm. Highest removePieces impact without semantic changes.
#6 Lazy deletion bitmap	~99% removePieces	Very High	Architectural	Shifts cost to SP proving. Changes protocol economics. Requires anti-griefing design.

Gas data from FOC Observer mainnet (1.19M addPieces txns, 30,879 provePossession txns, 253 removePieces txns). Source code: PDPVerifier.sol @ 5d59f00.