1. How Filecoin gas works
Every transaction on Filecoin burns gas. The cost in FIL is: gas_units × base_fee ÷ 10¹⁸. Gas units measure the computation performed — they are deterministic and do not change with market conditions. Base fee is the market price per gas unit in attoFIL (1 FIL = 10¹⁸ attoFIL), and it fluctuates with network congestion.
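As a minimal sketch of the formula above, the conversion can be done in exact integer attoFIL and turned into FIL only at the end. The function and constant names are illustrative, and the base fee in the example is an assumed value, not a measurement.

```python
from fractions import Fraction

ATTOFIL_PER_FIL = 10**18  # 1 FIL = 10^18 attoFIL

def gas_cost_fil(gas_units: int, base_fee_attofil: int) -> float:
    """Cost in FIL = gas_units * base_fee / 10^18 (base fee in attoFIL per gas unit)."""
    cost_attofil = gas_units * base_fee_attofil  # exact integer arithmetic
    return float(Fraction(cost_attofil, ATTOFIL_PER_FIL))

# Illustrative only: 124M gas units at an assumed base fee of 100 attoFIL
example_cost = gas_cost_fil(124_000_000, 100)
print(f"{example_cost:.2e} FIL")
```

Keeping the product in integers avoids rounding until the final division, which matters because base fees are small integers and gas counts are large.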
Filecoin's base fee mechanism was introduced in FIP-0001, inspired by but distinct from EIP-1559. The adjustment rule is the same (up to ±12.5% per epoch, depending on whether gas used exceeds or falls short of the 5B/block target), and, as in EIP-1559, the base fee is burned rather than paid to block producers. Filecoin differs in its additional mechanisms: GasPremium is applied on GasLimit (not GasUsed), and an OverEstimationBurn is charged for over-estimated gas limits.
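The adjustment rule can be illustrated with a simplified EIP-1559-style update. This is a sketch of the general rule only: the constants mirror the figures quoted above, the function name is hypothetical, and details of the real Lotus implementation (for example, how gas is averaged across the blocks of a tipset) are deliberately not modelled.

```python
BLOCK_GAS_TARGET = 5_000_000_000  # 5B gas target per block, as quoted in the text
MAX_CHANGE_DENOM = 8              # 1/8 = 12.5% maximum change per epoch

def next_base_fee(base_fee: int, gas_used: int, target: int = BLOCK_GAS_TARGET) -> int:
    """Simplified EIP-1559-style base fee update, in integer attoFIL."""
    delta = gas_used - target
    # Clamp the deviation so a single epoch moves the fee by at most +/-12.5%
    delta = max(-target, min(target, delta))
    change = base_fee * delta // target // MAX_CHANGE_DENOM
    return base_fee + change

# A full block (2x target) raises the fee by 12.5%; an empty one lowers it by 12.5%
print(next_base_fee(1000, 10_000_000_000), next_base_fee(1000, 0))
```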
The base fee is fetched live from Filecoin mainnet via Glif's public Lotus node (Filecoin.ChainHead). Historical averages are computed from actual FWSS on-chain transactions in the FOC Observer database — they reflect what FWSS users really paid, not theoretical estimates.
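A minimal sketch of the live base fee lookup follows. It assumes the Glif endpoint URL shown below and assumes that the `Filecoin.ChainHead` result carries the base fee as a decimal string in the `ParentBaseFee` field of each block header; the helper names are hypothetical and this is not the calculator's actual code.

```python
import json
import urllib.request

GLIF_RPC_URL = "https://api.node.glif.io/rpc/v1"  # assumed public endpoint

def chain_head_request() -> dict:
    """Build the JSON-RPC 2.0 payload for Filecoin.ChainHead."""
    return {"jsonrpc": "2.0", "method": "Filecoin.ChainHead", "params": [], "id": 1}

def parse_base_fee(response: dict) -> int:
    """Extract the base fee in attoFIL from a ChainHead response."""
    blocks = response["result"]["Blocks"]
    # Every block in a tipset shares the same parent base fee, so the first suffices
    return int(blocks[0]["ParentBaseFee"])

def fetch_base_fee(url: str = GLIF_RPC_URL, timeout: float = 10.0) -> int:
    """POST the request and return the current base fee (performs network I/O)."""
    body = json.dumps(chain_head_request()).encode("utf-8")
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return parse_base_fee(json.load(resp))
```

Separating `parse_base_fee` from the network call makes the parsing logic checkable against a recorded response without contacting the node.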
2. Empirical gas measurements
Gas unit values for createDataSet, addPieces, nextProvingPeriod, and terminateService are empirical averages from all real mainnet transactions since FWSS v1.2.0 (block 5,864,769), queried live from FOC Observer on every page load. These are not estimates or assumptions — they are measured values.
3. The provePossession model — why logarithmic?
Unlike other operations, provePossession gas is not constant — it depends on how many pieces are in the dataset. In PDP, a piece is a Merkle root representing a file or data chunk (not a leaf). The PDP protocol always samples exactly 5 random pieces per proof (hardcoded). To sample a random piece, the contract must traverse the dataset's on-chain Sum Tree (Fenwick tree), which has depth proportional to log₂(N_pieces). More pieces → deeper Sum Tree traversal → more gas per challenge.
Real mainnet data confirms this clearly: a dataset with 130 small pieces costs ~209M gas to prove, while a dataset with 1 large piece of similar total data costs only ~147M gas. Total data size does not drive proving cost; piece count does. A logarithmic regression across all mainnet datasets gives R²=0.9551 with the formula gas ≈ 158.67M + 8.485M × log₂(pieces).
This gives a logarithmic relationship between piece count and gas cost — each doubling of pieces adds a constant amount of gas (~8.5M), not a proportional amount.
gas_PP(N) = α + β × log₂(N)
α = 158,670,000 gas (base cost, independent of piece count)
β = 8,485,000 gas (cost per doubling of piece count)
Example: N=1 → 158.7M gas | N=1,000 → 243.2M gas | N=1,000,000 → 327.8M gas
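The model can be evaluated directly from α and β. The sketch below uses the two coefficients listed above; the function name is illustrative.

```python
import math

ALPHA = 158_670_000  # base cost in gas, independent of piece count
BETA = 8_485_000     # additional gas per doubling of piece count

def prove_possession_gas(n_pieces: int) -> float:
    """Estimated provePossession gas: alpha + beta * log2(N)."""
    if n_pieces < 1:
        raise ValueError("a dataset must contain at least one piece")
    return ALPHA + BETA * math.log2(n_pieces)

# Each doubling of the piece count adds the same amount of gas (beta)
for n in (1, 2, 4, 8):
    print(n, round(prove_possession_gas(n)))
```

Because the predictor is log₂(N), the difference between N and 2N pieces is always β, regardless of how large N already is.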
4. How the model was fitted — and why not multilinear?
FIPs discussion #761 proposes a multilinear regression model for Filecoin gas estimation. That proposal targets the miner actor cron job, where several structurally distinct operations contribute independently to gas — live partitions, fault partitions, precommit expiries — each with a different marginal cost. Because these contributions are additive and the variance across the dataset is driven by multiple independent factors, a weighted multilinear model is appropriate there.
For provePossession, the situation is structurally different. A multilinear approach was considered, but the data shows a single dominant predictor: piece count, via the Sum Tree depth mechanism described above. All other candidate variables are either constant across all observations (challenge count is always 5, hardcoded in the PDP contract) or collinear with piece count (raw dataset size, tree depth). Adding further predictors would introduce model complexity without improving fit.
A single-predictor logarithmic regression therefore captures the data as well as a multilinear model would — with less risk of overfitting and easier interpretability. The model was fitted using ordinary least squares (OLS) on 670 real mainnet datasets.
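The fitting procedure described here amounts to an ordinary least squares fit of gas against log₂(pieces). The sketch below uses the closed-form OLS solution in plain Python; the function name is hypothetical, and the data is synthetic and noise-free, generated from the published coefficients purely to show that the fit recovers them. It is not the mainnet dataset.

```python
import math

def fit_log2_ols(pieces, gas):
    """OLS fit of gas = alpha + beta * log2(pieces); returns (alpha, beta, r_squared)."""
    x = [math.log2(p) for p in pieces]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(gas) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, gas))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    ss_res = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, gas))
    ss_tot = sum((yi - mean_y) ** 2 for yi in gas)
    return alpha, beta, 1 - ss_res / ss_tot

# Synthetic points spanning the piece-count range mentioned in the text
pieces = [1, 2, 16, 130, 1_000, 868_000]
gas = [158_670_000 + 8_485_000 * math.log2(p) for p in pieces]
alpha, beta, r2 = fit_log2_ols(pieces, gas)
print(round(alpha), round(beta), round(r2, 4))
```

On real data the residuals are non-zero and R² falls below 1; the unweighted form shown here matches the statement that no weights are used.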
For provePossession, no such subgroup structure exists: the logarithmic transformation is sufficient to linearise the relationship uniformly across the full range of piece counts (1 to 868k). Weights are therefore not needed and not used. If future protocol changes introduce structurally new sources of gas variance (for example, variable challenge counts or new proof types), the model should be revisited and the multilinear approach from FIPs #761 considered.
5. nextProvingPeriod — why a constant?
nextProvingPeriod manages the proof schedule (which epoch the next proof is due). It does not access the Merkle tree of pieces, so its gas cost is independent of piece count. Across 32,393 real transactions, the gas is essentially flat at ~124M with a standard deviation of ~7.7M. A constant (the empirical mean, fetched live from FOC Observer) is therefore the correct model.
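One way to check that a constant is adequate is the coefficient of variation (standard deviation divided by mean). The sketch below computes it for an arbitrary list of gas measurements and for the two summary figures quoted above; the function name is illustrative and no mainnet samples are included.

```python
from statistics import mean, pstdev

def constant_model(gas_samples):
    """Return (mean, coefficient of variation) for a list of gas measurements."""
    mu = mean(gas_samples)
    return mu, pstdev(gas_samples) / mu

# Using the summary statistics from the text: ~7.7M std dev on a ~124M mean
reported_cv = 7_700_000 / 124_000_000
print(f"coefficient of variation: {reported_cv:.1%}")
```

A spread of roughly 6% around the mean, with no dependence on piece count, is what makes the empirical mean a reasonable stand-in for every transaction.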
6. Data sources and reproducibility
| Value | Source | How fetched |
|---|---|---|
| Gas unit averages | FOC Observer REST API | Live query on page load (POST /sql) |
| Base fee (current) | Glif public Lotus node | Live (Filecoin.ChainHead) |
| Base fee (historical) | FOC Observer transaction data | Aggregated from pdp_pieces_added.effective_gas_price |
| FIL/USD price | CoinGecko public API | Live on page load |
| Model coefficients (α, β) | OLS regression on FOC Observer data | Hardcoded (refitted periodically) |
gas = 158.67M + 8.485M × log₂(pieces), R²=0.9551, 670 real datasets.
A piece is a Merkle root — more pieces = deeper Sum Tree traversal = more gas per proof.
Formula:
(gas_PP(N) + gas_NPP) × base_fee ÷ 10¹⁸ × FIL_price × 30
where gas_PP(N) = 158.67M + 8.485M × log₂(N). Updates automatically.
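The formula can be sketched as a single function. The trailing factor of 30 is read here as 30 proving periods per month; that interpretation, the parameter names, and the example inputs are assumptions for illustration rather than values taken from the calculator.

```python
import math

def monthly_proving_cost_usd(n_pieces: int, gas_npp: float, base_fee_attofil: float,
                             fil_price_usd: float, periods_per_month: int = 30) -> float:
    """(gas_PP(N) + gas_NPP) * base_fee / 10^18 * FIL_price * periods_per_month."""
    gas_pp = 158_670_000 + 8_485_000 * math.log2(n_pieces)   # provePossession model
    fil_per_period = (gas_pp + gas_npp) * base_fee_attofil / 10**18
    return fil_per_period * fil_price_usd * periods_per_month

# Illustrative inputs only: 1 piece, 124M gas for nextProvingPeriod,
# an assumed base fee of 100 attoFIL and an assumed FIL price of 1 USD
print(monthly_proving_cost_usd(1, 124_000_000, 100, 1.0))
```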
| Operation | Txns | Gas/tx | Total gas | FIL cost | USD | Type |
|---|---|---|---|---|---|---|
| Total | | | | | | |