What Is a Datachain?

The datachain category exists because most blockchains were not designed to hold significant volumes of data. Storage on a generic Layer-1 like Ethereum is technically possible but expensive by design. Writing a 32-byte word to smart contract storage on Ethereum costs around 20,000 gas (plus a cold-access surcharge per EIP-2929) because every full node has to keep that data in state. Unbounded onchain storage would push node hardware requirements up and degrade decentralization, so Ethereum prices storage high to constrain state growth. The consensus mechanism does not pay anyone for storage; it charges for it. Datachains close that gap by building storage incentives directly into the protocol.
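To make that cost concrete, the figures above can be turned into a back-of-the-envelope calculation. The sketch below assumes every 32-byte word lands in a fresh, previously empty storage slot (20,000 gas for the write plus the 2,100-gas cold-access surcharge from EIP-2929) and ignores the base transaction cost, calldata, and refunds. The function and constant names are illustrative and do not come from any library.

```python
import math

SSTORE_SET_GAS = 20_000      # writing a non-zero value into an empty slot
COLD_SLOT_SURCHARGE = 2_100  # EIP-2929 surcharge for first access to a slot
WORD_BYTES = 32              # size of one EVM storage word


def storage_write_gas(num_bytes: int) -> int:
    """Estimate gas to write num_bytes into fresh contract storage slots.

    Simplified model: one cold, empty slot per 32-byte word, nothing else.
    """
    if num_bytes < 0:
        raise ValueError("num_bytes must be non-negative")
    words = math.ceil(num_bytes / WORD_BYTES)
    return words * (SSTORE_SET_GAS + COLD_SLOT_SURCHARGE)


if __name__ == "__main__":
    print(storage_write_gas(1024))  # 32 words * 22,100 gas = 707200
```

Under these assumptions a single kilobyte consumes 707,200 gas, and a megabyte would need roughly 724 million gas, far more than fits in one Ethereum block.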

What makes a datachain a datachain

The defining property of a datachain is the storage incentive at the consensus level. A datachain's consensus mechanism pays miners or validators to store data, the same way a generic blockchain's consensus mechanism pays them to produce blocks. Storage becomes part of the work the protocol natively rewards.

This single property shapes everything else about how a datachain behaves. Because miners are paid for storage, large data items are economically viable to keep onchain. Because the incentive lives at the protocol level, storage durability comes from the chain itself, with the same consensus guarantees that secure block production. Because the consensus mechanism is doing the storage accounting, applications that depend on durable onchain data have a native chain to build on.

Different datachains implement the storage incentive differently. Filecoin runs a marketplace where storage providers and clients enter deals with defined terms, and providers prove they are holding the data through Proof-of-Replication and Proof-of-Spacetime. Arweave uses an endowment model designed for permanent storage, with miners earning AR tokens by proving access to randomly recalled historical blocks (Succinct Proofs of Random Access, or SPoRA). Walrus distributes data via Red Stuff erasure coding across storage nodes in the Sui ecosystem and verifies storage through an asynchronous challenge protocol. Irys uses Useful Proof of Work, tying miner rewards to continuous storage proofs at the consensus level. The differences shape what each datachain is best suited to, but the unifying property is the consensus-level reward for storage.
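Each of these schemes has its own proof construction, but they share a common shape: a node is challenged on data it cannot predict in advance and must answer with something only a holder of that data could produce. The sketch below illustrates that shape with a generic Merkle-tree spot check. It is a toy model, not SPoRA, Proof-of-Replication, Proof-of-Spacetime, or any other protocol's actual algorithm, and every function name in it is hypothetical.

```python
from __future__ import annotations

import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Build every level of a Merkle tree: leaf hashes first, root last."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [_h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # toy padding: duplicate the last node
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(chunks: list[bytes], index: int) -> tuple[bytes, list[bytes]]:
    """Storage node: answer a challenge with the chunk and its sibling path."""
    path = []
    i = index
    for level in merkle_levels(chunks)[:-1]:
        path.append(level[i ^ 1])  # sibling of node i on this level
        i //= 2
    return chunks[index], path


def verify(root: bytes, index: int, chunk: bytes, path: list[bytes]) -> bool:
    """Verifier: recompute the root from the response; needs only the root."""
    node = _h(chunk)
    i = index
    for sibling in path:
        node = _h(node + sibling) if i % 2 == 0 else _h(sibling + node)
        i //= 2
    return node == root


if __name__ == "__main__":
    chunks = [f"chunk-{n}".encode() for n in range(5)]
    root = merkle_levels(chunks)[-1][0]          # what the verifier keeps
    challenge = secrets.randbelow(len(chunks))   # unpredictable chunk index
    chunk, path = prove(chunks, challenge)
    print(verify(root, challenge, chunk, path))  # True
```

The verifier stores only the 32-byte root, yet a node that has discarded the challenged chunk cannot produce a passing response. Real protocols add replication, timing, and economic penalties on top of this basic idea.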

Why this is a distinct category

Datachains are a distinct category because they sit in a gap that generic blockchains, decentralized storage networks, and data availability layers each leave open.

Generic Layer-1 blockchains, including Ethereum and Solana, are optimized for execution. Their consensus mechanisms reward block production and transaction ordering. Storage exists but is treated as a tax on execution. Holding a kilobyte of data in an Ethereum smart contract's state costs thousands of times what holding the same kilobyte on a datachain costs, because Ethereum's consensus mechanism is not paying anyone to keep it there.

Decentralized storage networks without a blockchain, such as IPFS, take a different approach. IPFS is a content-addressed file system that any node can participate in, but it has no consensus mechanism. Nodes can pin or unpin content at will, and the network does not pay anyone to store data. Persistence depends on someone choosing to pin a file, often through a third-party pinning service. IPFS is useful but is not a datachain because there is no protocol-level incentive to store.
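The persistence gap is easy to see in a toy model of a single node. The sketch below is not the IPFS API: a plain SHA-256 hex digest stands in for a real content identifier, and the class and method names are hypothetical. It only shows that content-addressed data survives on a node for as long as that node chooses to pin it.

```python
from __future__ import annotations

import hashlib
from typing import Optional


class ToyContentStore:
    """One node's content-addressed block store with pinning (toy model)."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}
        self._pins: set[str] = set()

    def add(self, data: bytes, pin: bool = False) -> str:
        """Store data under the hash of its contents and return the address."""
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        if pin:
            self._pins.add(address)
        return address

    def unpin(self, address: str) -> None:
        self._pins.discard(address)

    def get(self, address: str) -> Optional[bytes]:
        return self._blocks.get(address)

    def collect_garbage(self) -> int:
        """Drop every unpinned block and return how many were removed."""
        unpinned = [a for a in self._blocks if a not in self._pins]
        for address in unpinned:
            del self._blocks[address]
        return len(unpinned)


if __name__ == "__main__":
    node = ToyContentStore()
    kept = node.add(b"pinned report", pin=True)
    lost = node.add(b"unpinned report")
    node.collect_garbage()
    print(node.get(kept) is not None, node.get(lost) is None)  # True True
```

Nothing in this model pays the node to keep a block, so durability rests entirely on the pin set, which is the property the paragraph above describes.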

Data availability (DA) layers, such as Celestia and EigenDA, are designed for a narrower job: making blob data retrievable by rollups for short, defined retention windows. DA layers do not retain data indefinitely, do not target application-level storage, and do not generally support smart contract execution. DA layers are category-adjacent infrastructure that serves rollups; the datachain category serves applications that need durable storage with smart contract access to it.

A datachain sits in the gap these adjacent categories leave: persistent or flexible-duration storage, incentivized at the consensus level, on a blockchain that can support smart contracts.

Examples of datachains

Four projects illustrate how the datachain category gets implemented in practice.

Arweave. A Layer-1 designed for permanent storage. Arweave's consensus mechanism, Succinct Proofs of Random Access (SPoRA), rewards miners for proving access to randomly recalled historical blocks, with a Verifiable Delay Function pacing the work. An endowment-style fee model funds storage for a minimum 200-year horizon based on declining storage cost projections.

Filecoin. A Layer-1 that combines a storage marketplace with consensus-level proofs. Clients and storage providers enter storage deals; providers prove they are holding the data using Proof-of-Replication and Proof-of-Spacetime. Filecoin's economic incentive sits at the deal layer, with consensus-level verification of those deals.

Walrus. A storage system operating in the Sui ecosystem. Walrus uses Red Stuff, a two-dimensional erasure coding scheme, to distribute data across storage nodes and an asynchronous challenge protocol to verify availability. Storage incentives are paid in WAL tokens. Walrus fits the datachain category because storage is incentivized at the protocol level, with the structural note that Sui acts as the control layer for some consensus functions instead of Walrus running as a fully standalone chain.

Irys. A Layer-1 datachain that combines storage incentives with an EVM-compatible execution environment. Storage is incentivized through Useful Proof of Work, where miners earn rewards by storing data and producing storage proofs. Irys is also the canonical example of the next category evolution: a programmable datachain, defined below.

These four illustrate the range: permanent storage with simple economic incentives (Arweave), marketplace-mediated storage with strong cryptographic proofs (Filecoin), audited erasure-coded storage within a larger ecosystem (Walrus), and storage paired with native execution (Irys). All four are datachains because all four pay nodes to store data through the consensus mechanism.
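Arweave's endowment model, mentioned above, rests on a geometric series: if the yearly cost of storing a piece of data falls by a fixed fraction each year, the total cost over any horizon stays bounded. The sketch below is a simplified model and not Arweave's actual pricing formula. The first-year cost of 1.0 and the 0.5% annual decline are placeholder assumptions chosen for illustration, and both function names are hypothetical.

```python
def endowment_cost(first_year_cost: float, annual_decline: float, years: int) -> float:
    """Total cost of storing data for `years` years when the yearly cost
    shrinks by the fraction `annual_decline` each year.

    Closed form of sum(first_year_cost * (1 - annual_decline)**t
                       for t in range(years)).
    """
    if not 0 < annual_decline < 1:
        raise ValueError("annual_decline must be between 0 and 1")
    if years < 0:
        raise ValueError("years must be non-negative")
    ratio = 1 - annual_decline
    return first_year_cost * (1 - ratio ** years) / annual_decline


def perpetual_cost(first_year_cost: float, annual_decline: float) -> float:
    """Limit of endowment_cost as the horizon grows without bound."""
    if not 0 < annual_decline < 1:
        raise ValueError("annual_decline must be between 0 and 1")
    return first_year_cost / annual_decline


if __name__ == "__main__":
    # Placeholder inputs: first-year cost 1.0, costs falling 0.5% per year.
    print(endowment_cost(1.0, 0.005, 200))
    print(perpetual_cost(1.0, 0.005))
```

With these placeholder inputs, 200 years of storage costs roughly 127 times the first-year cost, and storing the data forever costs 200 times the first-year cost. A finite upfront payment can therefore fund an open-ended horizon as long as costs keep declining at least as fast as assumed.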

Programmable datachains

A programmable datachain is the next evolution of the datachain category. It adds something the first generation generally lacked: native execution that can use the stored data directly during smart contract runtime.

In a first-generation datachain, stored data lives under the protocol's storage guarantees but does not interact directly with smart contract execution. A smart contract on a different chain wanting to use that data typically goes through an oracle, a bridge, or a separate retrieval service. The stored data is reachable but lives outside the execution environment that runs the smart contracts.

A programmable datachain closes that gap. Storage and execution live in the same Layer-1, and the execution environment can read stored data as an input during smart contract execution. Smart contracts can branch on the data's contents, enforce rules over it, and write new data back to storage during the same execution, without leaving the protocol that holds the data.
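The read, branch, and write-back flow can be pictured with a small in-memory model. The sketch below is purely illustrative: it is not Irys's API, an IrysVM precompile, or EVM code, and every name in it (the class, the contract function, the 100-unit threshold, the sensor-style payload) is a hypothetical stand-in. It shows only the structural point that storage and execution share one system, so a contract receives stored bytes as a direct input.

```python
from __future__ import annotations

import hashlib
from typing import Callable, Optional


class ToyProgrammableChain:
    """Toy model: one object holds both the stored data and the executor."""

    def __init__(self) -> None:
        self._storage: dict[str, bytes] = {}

    def store(self, data: bytes) -> str:
        data_id = hashlib.sha256(data).hexdigest()
        self._storage[data_id] = data
        return data_id

    def read(self, data_id: str) -> bytes:
        return self._storage[data_id]

    def execute(
        self,
        contract: Callable[[bytes], Optional[bytes]],
        data_id: str,
    ) -> Optional[str]:
        """Run a contract on stored bytes; persist any output in the same call."""
        result = contract(self.read(data_id))  # stored data is a direct input
        return self.store(result) if result is not None else None


def threshold_contract(stored: bytes) -> Optional[bytes]:
    """Hypothetical contract: branch on the data's contents."""
    readings = [int(x) for x in stored.decode().split(",")]
    if max(readings) > 100:
        return f"alert:max={max(readings)}".encode()  # written back to storage
    return None


if __name__ == "__main__":
    chain = ToyProgrammableChain()
    data_id = chain.store(b"12,140,7")
    alert_id = chain.execute(threshold_contract, data_id)
    print(chain.read(alert_id))  # b'alert:max=140'
```

In a first-generation design the equivalent of `chain.read` would sit behind an oracle, bridge, or retrieval service outside the execution environment; here it is an ordinary call inside `execute`.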

Irys is the first programmable datachain. The Irys whitepaper describes the goal directly: smart contracts that "read and act on onchain bytes at hot-access latency". IrysVM, the EVM-compatible execution environment, exposes a precompile that streams stored data directly into a running smart contract. The data abstraction this enables is called Programmable Data.

Programmable datachains are still a small subcategory; most existing datachains belong to the first generation. As more projects in the category add native execution that can use stored data, the programmable subcategory will grow.

Why datachains exist

Datachains exist because a class of applications needs persistent onchain data with consensus-level economic guarantees about its durability. Generic blockchains cannot offer this affordably. Non-blockchain storage networks like IPFS cannot offer it with consensus-backed guarantees. Datachains natively offer it, and programmable datachains go further by giving smart contracts direct access to stored data during execution.

A few application categories rely on this combination.

Verifiable datasets and AI provenance. A model checkpoint, a training set, or an evaluation dataset can be stored on a datachain with an onchain identifier. Applications can use that identifier to confirm they are reading the same dataset that was originally written. A programmable datachain extends this further: a smart contract can read the dataset directly during execution and enforce rules over its contents.

Decentralized Physical Infrastructure Networks (DePIN). Sensor networks, GPU networks, wireless networks, and other hardware-bound networks generate high-frequency data. A datachain stores that data with onchain provenance. A programmable datachain lets a payment smart contract read the data during execution to compute operator rewards.

Application metadata and reusable onchain datasets. Once a dataset is on a datachain, multiple applications can read it. Verifiable identity records, content registries, reputation graphs, and shared application state can all live on a datachain. Storing them onchain makes them available to any application that needs them.

Autonomous agents. An onchain agent's decision history, inputs, and outputs can be written to a datachain. A programmable datachain lets the agent's smart contract read its own history during execution and discover what other agents have written to the same chain. The shared chain becomes a discovery layer for multi-agent systems; agents can find and act on each other's outputs without an offchain coordinator.

The common thread: applications that need data the chain itself proves is real, durable, and available.
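The identifier check described under verifiable datasets can be sketched in a few lines. The example assumes the onchain identifier is a SHA-256 hex digest of the dataset's bytes; real datachains assign identifiers in their own formats, so treat the digest scheme and both function names as hypothetical.

```python
import hashlib
from typing import Iterable


def dataset_digest(chunks: Iterable[bytes]) -> str:
    """Hash a dataset incrementally so large files never sit fully in memory."""
    hasher = hashlib.sha256()
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()


def matches_onchain_id(chunks: Iterable[bytes], onchain_id: str) -> bool:
    """Return True if the bytes read back hash to the recorded identifier."""
    return dataset_digest(chunks) == onchain_id.lower()


if __name__ == "__main__":
    original = [b"row-1\n", b"row-2\n"]
    recorded_id = dataset_digest(original)  # stands in for the onchain identifier
    print(matches_onchain_id(original, recorded_id))                   # True
    print(matches_onchain_id([b"row-1\n", b"row-X\n"], recorded_id))   # False
```

An application that runs this check before training or evaluation knows it is working from the dataset that was originally written, not a modified copy.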

Datachain compared to nearby categories

Property	Datachain	Execution blockchain (generic L1)	Decentralized storage (non-blockchain)	Data availability layer
Storage incentivized at consensus level	Yes	No	N/A (no consensus)	No
Native execution layer	Varies (yes in programmable datachains)	Yes	No	No
Smart contracts can read stored data during execution	Only in programmable datachains	Only their own contract state	N/A	No
Storage durability	Persistent, variable retention	Limited, expensive	Not guaranteed (depends on pinning)	Short retention windows
Examples	Arweave, Filecoin, Walrus, Irys	Ethereum, Solana	IPFS	Celestia, EigenDA

In prose:

A datachain is different from a generic Layer-1 because storage is part of what the consensus mechanism rewards. A datachain is different from IPFS because IPFS has no blockchain and no consensus-level storage incentive. A datachain is different from a DA layer because a DA layer is designed for short blob retention to serve rollups, while a datachain holds data durably and gives smart contracts access to it.

FAQ

Is a datachain the same as decentralized storage?

No. Decentralized storage networks like IPFS hold data across distributed nodes but are not blockchains and have no consensus mechanism that pays for storage. A datachain is a Layer-1 blockchain whose consensus mechanism includes storage incentives. The difference matters because a datachain offers protocol-level durability guarantees that a non-blockchain storage network cannot.

Is Arweave a datachain?

Yes. Arweave is a Layer-1 blockchain whose consensus mechanism rewards miners for storing data, and it is one of the longer-running projects in the datachain category. Arweave's design optimizes for permanent storage with a simple fee-once economic model.

Is Filecoin a datachain?

Yes. Filecoin is a Layer-1 blockchain whose consensus mechanism includes proofs of storage. Filecoin's model is marketplace-mediated: clients and storage providers enter deals, and providers prove they are holding the data with Proof-of-Replication and Proof-of-Spacetime.

Is a datachain the same as a data availability layer?

No. Data availability layers like Celestia and EigenDA serve a narrower purpose: making blob data retrievable by rollups for short, defined retention windows. They do not target application-level storage, do not retain data indefinitely, and do not typically support smart contract execution. A datachain holds data for application use, with consensus-level incentives for durable storage.

What is a programmable datachain?

A programmable datachain is a datachain whose execution environment can use stored data directly during smart contract execution. In a first-generation datachain, stored data lives under the protocol's storage guarantees but is generally not readable by smart contracts during their own execution. A programmable datachain closes that gap by integrating storage and execution in the same Layer-1, so smart contracts can read stored data, branch on its contents, enforce rules, and write new data back to storage in the same execution. Irys is the first programmable datachain.

Datachain, in one paragraph

A datachain is a Layer-1 blockchain whose consensus mechanism includes economic incentives for storing data. Miners or validators earn rewards for holding data, making large-scale data storage a native function of the protocol. Datachains include Arweave, Filecoin, Walrus, and Irys, among others; the differences across them come down to how they implement storage, verification, and execution. The next evolution of the category is the programmable datachain, where stored data becomes directly usable by smart contracts during execution. Irys is the canonical example of that evolution.

For implementation details on programmable datachains and the precompile that gives smart contracts access to stored data, see Programmable Data and the Irys docs at docs.irys.xyz.