Summary of Decentralized Storage Network

Hacker Dōjo Workshop:
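Bounty link: Hacker Dōjō | Research topic: Decentralized Storage Network Paradigm | Bounties | DoraHacks
Author: 0xhhh @EthStorage
This project is funded by Hacker Dōjo. To republish this article, please contact: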
Telegram: @HackerDojo0
WeChat: @HackerDojo0

Why do we need a Decentralized Storage Network?

The short answer is that we need a Decentralized Storage Network (DSN) to store large amounts of data in a decentralized manner.

As more and more Dapps emerge, we clearly need a better decentralized platform on which to store their data.
When a Dapp uses a centralized storage platform, access to the data and ownership of it are controlled by the cloud service provider, because the data lives on the provider's servers. Centralized cloud providers have also lost data to disasters on many occasions.
At the same time, when a Dapp uses web2 cloud services, its data must also be accessed through web2 methods such as the HTTP protocol, which makes the decentralized application less decentralized. So we want a more decentralized way to store data.

Here, we summarize the features of traditional cloud services:

  • Long-term decentralized storage solution for other Rollups
    • Optimistic Rollups (OR), ZK Rollups (ZKR)
  • Fully decentralized front ends with dynamic websites
    • Filecoin/AR can only serve static ones
  • Fully on-chain NFTs
  • Web3 Social Dapps
  • Blockchain Games
  • Low Storage Fee

And we also summarize the features of a classic blockchain system:

  • Friendly data interoperability
  • Full replicas
  • Independent data ownership
  • Web3 access protocol
  • Permissionless Access
  • High Storage Fee

What leads to the high storage fees of classic blockchains, compared with the low storage fees of traditional cloud services?

The answer is the number of replicas. Traditional cloud services keep fewer replicas and so have lower storage fees, while blockchains keep more replicas and so have higher storage fees.

A blockchain must maintain more replicas because its nodes need to agree on a unified world state.

This means that we can reduce the storage fee of a blockchain by reducing the number of replicas. However, doing so causes problems because, without full replicas, there is no unified state.

So we can conclude that a DSN is a decentralized network with a low number of replicas.
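The relationship between replica count and fee can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every replica costs the same per gigabyte; the price and the replica counts are hypothetical values, not measurements of any real service or chain.

```python
def storage_fee(size_gb: float, price_per_gb_per_replica: float, replicas: int) -> float:
    """Total fee under the simplifying assumption that each replica costs the same."""
    if replicas < 1:
        raise ValueError("at least one replica is required")
    return size_gb * price_per_gb_per_replica * replicas


# Hypothetical numbers, chosen only to show the proportion.
PRICE = 0.02          # assumed cost per GB per replica
FEW_REPLICAS = 3      # assumed replica count for a cloud-style service
MANY_REPLICAS = 10_000  # assumed node count for a fully replicated chain

cloud_fee = storage_fee(100, PRICE, FEW_REPLICAS)
chain_fee = storage_fee(100, PRICE, MANY_REPLICAS)

# Under this model the fee ratio equals the replica ratio.
fee_ratio = chain_fee / cloud_fee
print(f"fee ratio: {fee_ratio:.1f}")
```

Under this linear model the fee scales directly with the replica count, which is why lowering the number of replicas is the lever a DSN pulls.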

2. Problems caused by reducing replicas

2.1 Nodes store only the data they are interested in

DSN nodes can select the data they want to store; they are not required to store all data.

As in the diagram below, every node stores only its own partition of the files.

2.2 It is hard for nodes to access all the data of the DSN

No node has all the data of the DSN, and the DSN cannot ensure that a node will share the data it holds when other nodes request it. So we need a layer that publicizes all DSN data and lets any DSN node download the user-produced data it is interested in.

This is very different from a classic blockchain, where all nodes store the same data and every validator and full node can easily access any data stored on the chain.

We call this layer the publication layer, and we define the Proof of Publication as follows.

  • Proof of Publication: Ensures that the data initially shows up on the network, so that nodes can choose to download and store the data of interest or simply ignore it.
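The idea can be sketched as a toy model: a shared publication layer makes every blob visible to all nodes, and each node applies its own interest filter. The class names, the `interested` predicate, and the use of SHA-256 as the commitment are assumptions made for this illustration; they stand in for whatever commitment scheme and networking a real DSN uses.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class PublicationLayer:
    """Toy publication layer: every published blob is visible to every node."""
    blobs: Dict[str, bytes] = field(default_factory=dict)  # commitment -> data

    def publish(self, data: bytes) -> str:
        commitment = hashlib.sha256(data).hexdigest()  # stand-in commitment
        self.blobs[commitment] = data
        return commitment


@dataclass
class Node:
    name: str
    interested: Callable[[bytes], bool]  # hypothetical per-node interest filter
    store: Dict[str, bytes] = field(default_factory=dict)

    def sync(self, layer: PublicationLayer) -> None:
        """Download only the blobs of interest, checking each against its commitment."""
        for commitment, data in layer.blobs.items():
            if not self.interested(data):
                continue  # the node is free to ignore this data
            if hashlib.sha256(data).hexdigest() == commitment:
                self.store[commitment] = data


layer = PublicationLayer()
image_id = layer.publish(b"image:cat")
text_id = layer.publish(b"text:hello")

image_node = Node("image-node", lambda d: d.startswith(b"image:"))
image_node.sync(layer)
```

After `sync`, `image_node.store` holds only the image blob, even though both blobs were available to it on the publication layer.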

How to design the publication layer

Next, I will show a specific case: how EthStorage designed its publication layer.

EthStorage uses the Ethereum DA layer as its publication layer, as in the diagram below.

Later, a large number of decentralized applications with fully on-chain storage appeared.

Fully on-chain storage ensures that a Dapp's data can be accessed by any client of the blockchain while benefiting from the blockchain's consensus mechanism (a native web3 access method). It also ensures that the stored data will not be lost, because every node participating in consensus must save a replica of the world state.

But it also leads to high storage costs.

So we expect a Decentralized Storage Network to solve the issues above:

  • data access
  • data ownership
  • data loss
  • high storage cost

All Decentralized Storage Networks run like a blockchain, but with fewer replicas.

So the question becomes: what happens if we reduce the number of replicas of the data?

  1. Nodes do not need to store all data; they can choose to store only the data they are interested in.
  2. We need a way to release the data to be stored to the entire DSN, so that nodes are able to choose which data to store.
  3. Data is stored off-chain, but the settlement of storage fees and the data commitments are stored on-chain.
  4. Storage Providers need to submit proofs of storage to the on-chain storage settlement to prove that they still store the data off-chain, and in return they are paid for the storage service.
  5. We need a way to search for any piece of data, because no node has the full set of data.
  6. We need to design an economic model that ensures the DSN keeps enough replicas of every piece of data so that it is never lost.

We name the content described in the second point Proof of Publication.

We name the content described in the first, third, and fourth points Proof of Storage.
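To illustrate how an on-chain commitment can be used to check off-chain data, here is a minimal sketch of a generic Merkle-tree challenge and response. It is not EthStorage's actual proof system: the chunking, the padding rule (repeating the last hash on odd levels), the seeded random challenge, and all function names are assumptions for this example, and a production design would need a more careful tree construction.

```python
import hashlib
import random


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(chunks):
    """Return all tree levels, leaves first; odd levels are padded with their last hash."""
    if not chunks:
        raise ValueError("need at least one chunk")
    level = [h(c) for c in chunks]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def prove(levels, index):
    """Collect the sibling hash at every level below the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify(root, chunk, index, path):
    """Recompute the root from one chunk and its sibling path."""
    node = h(chunk)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


# Off-chain: the provider holds the chunks and the full tree.
chunks = [f"chunk-{i}".encode() for i in range(5)]
levels = merkle_levels(chunks)

# On-chain (simulated): only the root commitment is kept.
onchain_root = levels[-1][0]

# Challenge: a hypothetical random chunk index; a real system would derive it on-chain.
challenge = random.Random(7).randrange(len(chunks))
path = prove(levels, challenge)

honest_ok = verify(onchain_root, chunks[challenge], challenge, path)
forged_ok = verify(onchain_root, b"forged", challenge, path)
```

A provider that still holds the challenged chunk can answer with the chunk and its sibling path, while a provider that discarded the data cannot produce a chunk that hashes back to the committed root.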

We name the content described in the fifth point Proof of Retrievability.

We name the content described in the sixth point the Storage Economic Model.

So if we want to build a practical decentralized storage network, we need to develop a system that provides Proof of Publication, Proof of Storage, Proof of Retrievability, and a Storage Economic Model.
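Because no node holds everything, a reader first has to find out who holds a given piece of data and then check what comes back. The sketch below is one hypothetical way to model that lookup: `RetrievalIndex`, `announce`, `locate`, and `retrieve` are invented names, and SHA-256 is assumed as the commitment so that a response can be verified without trusting the responding node.

```python
import hashlib
from collections import defaultdict


class RetrievalIndex:
    """Toy index: which nodes claim to hold which commitment."""

    def __init__(self):
        self.holders = defaultdict(set)  # commitment -> node names

    def announce(self, node_name: str, commitment: str) -> None:
        self.holders[commitment].add(node_name)

    def locate(self, commitment: str):
        return sorted(self.holders.get(commitment, ()))


def retrieve(commitment, index, nodes):
    """Ask each claimed holder in turn; accept only data that matches the commitment."""
    for name in index.locate(commitment):
        data = nodes[name].get(commitment)
        if data is not None and hashlib.sha256(data).hexdigest() == commitment:
            return data
    return None  # nobody served valid data


data = b"user file"
cid = hashlib.sha256(data).hexdigest()

# node-a announces the data but serves garbage; node-b is honest.
nodes = {"node-a": {cid: b"garbage"}, "node-b": {cid: data}}
index = RetrievalIndex()
index.announce("node-a", cid)
index.announce("node-b", cid)

found = retrieve(cid, index, nodes)
missing = retrieve("0" * 64, index, nodes)
```

Here the lookup skips the node that returns invalid data and falls back to the honest holder, and it returns `None` for a commitment nobody has announced.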