Recently, DAS founder Tim Yang (杨敏) built DAS, a decentralized account service, on Nervos CKB. Drawing on that work, Tim is writing the article series "Understanding CKB Application Development from DAS" to explain his design ideas and development process, and to show how to build product-level applications on CKB, a public chain built on a UTXO architecture. In this first article, Tim introduces the first big problem the team faced when designing DAS: how to ensure that each DAS account is unique. We hope you enjoy reading it.
Opening
DAS (Decentralized Account Services) is a decentralized account service built on CKB. The project aims to give the new world of crypto an account system that is censorship-resistant, unique, and recognizable. In its first stage DAS resembles Ethereum's ENS, with some features that improve on ENS. But DAS aims to be more than a better ENS: it is an attempt to offer a new answer to the puzzle of "decentralized accounts/identity" in the crypto world.
DAS is not a concept product. It is currently running on the CKB testnet and is expected to launch on mainnet in the near future. You can try the test version at https://da.services
DAS is a blockchain application built on CKB. Among so many public chains, why did we choose CKB? There are two reasons:
- PoW consensus + Cell (UTXO) model
- Custom cryptographic primitives (a highly open architecture), which let a DAS account be held by an address from any public chain.
CKB is a rare public chain platform that builds a smart contract environment on top of the UTXO model and advocates "off-chain computation, on-chain verification". These ideas and designs are well thought out and forward-looking, but they also bring a new paradigm for decentralized application development. Developers used to centralized application development or Ethereum smart contract development will feel quite uncomfortable when they first encounter CKB. In addition, there is no benchmark application yet, which leaves developers unsure what CKB can do and whether learning it is really worth the effort.
That is the purpose of the "Understanding CKB Application Development from DAS" series. We have compiled the problems, thinking, and solutions from our DAS practice into a series of articles to show how we build product-level applications on CKB. We hope this helps more developers understand what CKB can do and how to do it.
Two notes before we begin:
When facing a problem, the ideas and solutions we adopted are not necessarily optimal; quite possibly they are not. But if ideas and solutions that fit our scenario can inspire others, the goal will have been achieved.
This series assumes that readers already understand the Cell model and the "off-chain computation, on-chain verification" model.
How to ensure the uniqueness of the DAS account
In this first article, we discuss the first thorny problem DAS faced:
Every DAS account needs a Cell to store its data. Cells are created by different transactions, which means the global state of the DAS system is scattered across the chain. At the same time, each DAS account must be unique. So when a DAS account is registered, how do we determine whether that account already exists?
Let's generalize the question: when inserting into a data set held in distributed storage, how do we ensure that each piece of data is unique?
For developers used to centralized applications or Ethereum smart contracts, making sure registered accounts are not duplicated takes almost no thought. You can keep all the data in the contract's storage; since the data is stored centrally, you only need to look up whether an entry exists before inserting it.
With CKB's Cell model, however, data is stored in each user's own space, and a script cannot search all the data on the chain. It is impossible to place every existing Cell in the inputs of one transaction. Even if it were possible, the on-chain script could not know whether the transaction builder really included all the required Cells in the inputs.
We will list every solution we considered for ensuring uniqueness. We analyze the ones we rejected because we hope that seeing the "detours" we took helps you adapt to CKB's development paradigm and avoid those detours yourself.
Before discussing the solutions, we should state our design principles, since these principles ultimately determined which solution we adopted. In descending order of priority, they are:
- Degree of decentralization: for the goals of DAS, decentralization is the most basic principle
- User experience: a technical solution must not bring a bad user experience
- Engineering complexity: the simpler architecture is often the more effective one
- Low cost: keep costs as low as possible
If you only care about the final solution, you can jump straight to "Solution 6".
Solution 1: Store all accounts in a Cell
This is the most intuitive solution; after all, an Ethereum smart contract can do exactly this. Create a GlobalStatusCell and store all registered accounts in its data. When a new registration occurs, the transaction takes the GlobalStatusCell as an input and the modified GlobalStatusCell as an output. The type script checks whether the newly registered account already exists. If it does, the script returns non-zero and the transaction fails. If it does not, the script checks that the new account is included in the output GlobalStatusCell and returns 0; the transaction succeeds and registration is complete.
This idea is not feasible for two reasons:
- Cell competition problem: every new registration must spend the GlobalStatusCell as an input, and a live Cell can be spent only once. At any moment, therefore, only one registration request can be processed. Users who lose the competition for the Cell have to rebuild and sign their transaction again and again until they win it.
- Space cost problem: CKB is a layered architecture, and the total state space on Layer 1 is limited to about 80 GB. Storing data there means buying storage space with CKB. If 1 million DAS accounts are eventually registered, the capacity this GlobalStatusCell requires will be huge. That said, the storage grows gradually with the number of registrations, so each user only pays CKB for the incremental space of their own registration, and the cost per user is still acceptable.
In fact, the Cell competition problem is something to watch for whenever you develop applications on CKB. Its impact on user experience can be fatal.
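To make the Cell competition problem concrete, here is a minimal Python sketch. It is a toy model, not CKB code: the `Tx` type, the `process` function, and the cell ids are hypothetical names chosen for illustration. Three users build registration transactions against the same live GlobalStatusCell; only the first one processed can spend it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    sender: str
    input_cell: str   # id of the live cell this transaction spends
    output_cell: str  # id of the cell this transaction creates


def process(live_cells: set, txs: list):
    """Apply transactions in order; a cell can be spent only while it is live."""
    accepted, rejected = [], []
    for tx in txs:
        if tx.input_cell in live_cells:
            live_cells.remove(tx.input_cell)  # the spent cell is now dead
            live_cells.add(tx.output_cell)
            accepted.append(tx.sender)
        else:
            rejected.append(tx.sender)  # must rebuild and re-sign the transaction
    return accepted, rejected


# All three users saw GlobalStatusCell#0 as live when they built their transactions.
live = {"GlobalStatusCell#0"}
txs = [Tx(f"user{i}", "GlobalStatusCell#0", f"GlobalStatusCell#{i + 1}") for i in range(3)]
accepted, rejected = process(live, txs)
print("accepted:", accepted)
print("rejected:", rejected)
```

Every rejected user has to fetch the new live cell, rebuild the transaction, and sign again, which is exactly the experience described above.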
Solution 2: Distribute all accounts across multiple Cells
Since putting all accounts in one GlobalStatusCell causes competition, how about spreading them across multiple Cells? For example, hash the account name and put all registered accounts whose hashes share the same first 3 hex digits into the same SubStatusCell. A new registration must then consume the corresponding SubStatusCell to modify its data.
This scheme still has problems:
Some Cell competition remains. Creating SubStatusCells by the first 3 hex digits of the hash requires 4096 (16^3) SubStatusCells up front. With 50 concurrent registration requests in one cycle, the birthday problem gives about a 26% chance that at least two of them hit the same SubStatusCell. A load of 50 concurrent requests is fairly demanding and may not occur early on, but keep in mind:
- Because the number of SubStatusCells is fixed, the probability of competition at a given load is the same at every stage of the project. And "probability" means uncertainty: the effect on user experience may be nil, or it may be large.
- Initialization has a cost. If each SubStatusCell needs only 100 CKB of capacity at the start, initializing all of them still takes 409,600 CKB.
Again: when developing on CKB, always keep track of how much CKB storage space your application will occupy, because the total state space is extremely limited.
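The two figures quoted for this scheme can be checked with a few lines of Python. This is a sketch under one assumption: account-name hashes are spread uniformly over the SubStatusCells. The function name `collision_probability` is ours, not part of DAS.

```python
from math import prod


def collision_probability(buckets: int, requests: int) -> float:
    """Chance that at least two of `requests` uniformly random registrations
    fall into the same one of `buckets` SubStatusCells."""
    if requests > buckets:
        return 1.0  # more requests than cells: a collision is certain
    no_collision = prod((buckets - i) / buckets for i in range(requests))
    return 1.0 - no_collision


BUCKETS = 16 ** 3  # 3 hex digits of the hash give 4096 SubStatusCells
p = collision_probability(BUCKETS, 50)
init_cost = BUCKETS * 100  # 100 CKB of capacity per SubStatusCell

print(f"collision probability with 50 concurrent requests: {p:.1%}")
print(f"CKB needed to initialize all SubStatusCells: {init_cost:,}")
```

The result is roughly 26% for 50 concurrent requests, and 409,600 CKB for initialization.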
Solution 3: The official DAS service decides whether an account has been registered
All registrations must go through the official DAS service. Once the service decides that an account can be registered, it signs a transaction with the official private key and issues the DAS account Cell to the user. This scheme is very simple to implement, but its problems are just as obvious:
- Not decentralized: how can we be sure that the official service's uniqueness check is correct? What if the official service deliberately acts maliciously? What if it is made to act maliciously by a program failure or a poorly protected private key?
- Dirty data problem: whatever form the malice takes, a centralized check is an off-chain check and cannot guarantee uniqueness absolutely. Dirty data may therefore appear on the chain at any time. How is it cleaned up? A dirty data cleanup mechanism has to be introduced.
- Availability problem: if the official service goes down, the entire registration service is unavailable.
Solution 4: Go multi-centralized and let multiple off-chain nodes decide whether an account has been registered
For example, pick 7 "trustworthy" organizations as super nodes, each managing its own private key. Each super node runs the super node service and stores all registered accounts in its own centralized database. When a registration request appears (that is, a user constructs a Cell containing the registration information), each super node decides whether the account has already been registered. If not, it signs a transaction with its private key and releases a Cell stating that "this super node considers this account registrable". Once more than 4 super nodes have released such a Cell, one of the nodes collects all of these Cells as evidence and creates the DAS account.
This approach seems to solve some of the problems of Solution 3, but it introduces more:
- Trust problem: what makes an organization "trustworthy", and how should we select these 7 nodes? An organization's ethics may be trustworthy without its behavior being so. We could look for the most credible organizations to run nodes, but in the early stage of a project such organizations have little incentive to maintain one.
- Dirty data problem: because program bugs are inevitable, the super nodes may all make the same wrong judgment. When such a consistent error occurs, there must be a mechanism for cleaning up the dirty data.
- Node rotation problem: nodes will inevitably need to be rotated, for example after a private key is lost. How is rotation carried out, through off-chain negotiation or on-chain consensus? Off-chain negotiation requires an open and transparent governance process; on-chain consensus requires a complex engineering implementation.
- Complexity: this covers both engineering complexity and governance complexity. Much of the work has drifted away from the business logic of the dApp itself. If every application developer had to handle so many issues that are only loosely related to business logic, applications could not be built efficiently. This also means that the multi-centralized approach cannot be the best practice.
Solution 5: Do not deduplicate at registration; deduplicate at resolution
Since deduplicating at registration is so complicated, skip it. Anyone can "register" any account at any time. When a user queries the resolution records of an account, the resolver finds the earliest "registration" of that account and returns it to the user as the legitimate one.
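The "earliest registration wins" rule can be sketched as follows. This is an illustrative Python model only: the `Registration` fields and the tie-break by transaction position within a block are assumptions of the sketch, not something the article specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Registration:
    account: str
    owner: str
    block_number: int  # height of the block that contains the "registration"
    tx_index: int      # position inside that block (assumed tie-breaker)


def resolve(registrations: list, account: str) -> Optional[Registration]:
    """Return the earliest registration of `account`, or None if there is none."""
    candidates = [r for r in registrations if r.account == account]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.block_number, r.tx_index))


records = [
    Registration("a.bit", "bob", block_number=120, tx_index=3),
    Registration("a.bit", "alice", block_number=100, tx_index=7),
    Registration("b.bit", "carol", block_number=110, tx_index=0),
]
winner = resolve(records, "a.bit")
print(winner.owner if winner else "not registered")
```

Note that uniqueness here depends entirely on every client applying exactly the same rule, including the tie-breaker.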
The main problem with this unusual approach is how to ensure that every client runs a "correct" resolver:
- Will all developers run the same resolver?
- When the official resolver is upgraded, will the developers who run it upgrade in time, and are they able to?
Unless everyone is guaranteed to always run the same, latest resolver, the system is bound to become inconsistent at the application level. That leads to various forms of fraud, and eventually everyone loses confidence in the system.
Solution 6: Ordered linked list
Finally, here is the scheme DAS adopted: an ordered linked list.
Let's restate the problem we want to solve in more general terms:
For a distributed data set, how do we ensure that each piece of data is unique when inserting data?
The answer is a logically ordered linked list. Thanks to @guiqing for the inspiration.
Each registered DAS account has a Cell, called an AccountCell, that stores its information. We require all AccountCells to be ordered, for example in ascending lexicographic order of account name. When a new DAS account is registered, its AccountCell must be inserted at the right position so that the order is preserved.
In simplified form, an AccountCell has two fields that matter here: account_id, which identifies the account, and next_account_id, which holds the account_id of the next AccountCell in the list.
Note: in the examples that follow, the value of account_id is the account name, purely for ease of presentation. In practice, DAS uses the first 10 digits of the hash of the account name.
Assume a.bit and b.bit are already on the chain, and a user wants to register d.bit. Before registration, the linked list is: a.bit → b.bit
After registration, the linked list is: a.bit → b.bit → d.bit
Next, a user wants to register c.bit. After that registration, the linked list is: a.bit → b.bit → c.bit → d.bit
As this shows, registering a new account requires modifying the next_account_id field of the AccountCell just before it in the linked list. That means constructing a transaction that consumes that preceding Cell and creates a corresponding new one. Deciding which Cell to modify, that is, where in the list the new DAS account belongs, is done automatically by the user's registration program based on the on-chain state (this is the off-chain computation).
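The off-chain side of this process can be sketched in Python. This is a simplified model, not the DAS implementation: account_id is the plain account name (as in the note above), and `TAIL` is a hypothetical end marker introduced here so that the last AccountCell has something to point to. How DAS actually terminates the list is not covered in this article. The function names are ours as well.

```python
from dataclasses import dataclass, replace

TAIL = "\U0010FFFF"  # hypothetical end marker that sorts after every account name


@dataclass(frozen=True)
class AccountCell:
    account_id: str       # simply the account name in this sketch
    next_account_id: str  # account_id of the next AccountCell in the list


def find_parent(cells, new_id):
    """Off-chain step: find the cell after which the new account must be inserted."""
    for cell in cells:
        if cell.account_id < new_id < cell.next_account_id:
            return cell
    raise ValueError(f"{new_id} is already registered or has no valid position")


def register(cells, new_id):
    """Model a registration: consume the parent cell, create the updated
    parent and the new AccountCell, and return the new set of live cells."""
    parent = find_parent(cells, new_id)
    new_parent = replace(parent, next_account_id=new_id)
    new_cell = AccountCell(new_id, parent.next_account_id)
    return [c for c in cells if c != parent] + [new_parent, new_cell]


def chain(cells):
    """Walk the next_account_id links from the head and list the account ids."""
    by_id = {c.account_id: c for c in cells}
    pointed_to = {c.next_account_id for c in cells}
    current = next(c for c in cells if c.account_id not in pointed_to)  # head
    order = [current.account_id]
    while current.next_account_id != TAIL:
        current = by_id[current.next_account_id]
        order.append(current.account_id)
    return order


cells = [AccountCell("a.bit", "b.bit"), AccountCell("b.bit", TAIL)]
cells = register(cells, "d.bit")
print(" -> ".join(chain(cells)))
cells = register(cells, "c.bit")
print(" -> ".join(chain(cells)))
```

Because each registration touches only one existing cell, registrations that land in different parts of the list do not interfere with each other.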
What if the registration program accidentally (or a user maliciously) constructs a transaction that tries to create a duplicate account or insert the account at the wrong position? This is where our type script comes in: such a transaction fails and is never packed into a block (this is the on-chain verification).
A Cell's type script runs both when the Cell is an input and when it is an output. Our type script can check, for example:
- In inputs, whether the account_id of the parent AccountCell is less than the account_id of the newly registered account
- In inputs, whether the next_account_id of the parent AccountCell is greater than the account_id of the newly registered account
- In outputs, whether the next_account_id of the new parent AccountCell equals the account_id of the newly registered account
- In outputs, whether the next_account_id of the newly registered account equals the next_account_id of the parent AccountCell in inputs
If all of these checks pass and the transaction structure also meets the other necessary conditions, the type script returns 0, meaning the transaction is valid. Once the transaction is included in a block, the account registration is complete and the state of the DAS system is updated. Transactions that do not meet these conditions are invalid, and the registration does not succeed.
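The four checks listed above can be written down as a small Python function. This is a sketch of the logic only: a real CKB type script runs on-chain in CKB-VM and reads cells through syscalls, and it would also need the "other necessary conditions" that are not modeled here. The function name and its arguments are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountCell:
    account_id: str       # simply the account name in this sketch
    next_account_id: str  # account_id of the next AccountCell in the list


def verify_registration(old_parent, new_parent, new_account) -> int:
    """Return 0 if the four ordering checks pass, 1 otherwise.

    old_parent  : parent AccountCell in inputs
    new_parent  : updated parent AccountCell in outputs
    new_account : newly registered AccountCell in outputs
    """
    checks = (
        # inputs: the parent sorts before the new account
        old_parent.account_id < new_account.account_id,
        # inputs: the parent's old successor sorts after the new account
        old_parent.next_account_id > new_account.account_id,
        # outputs: the parent now points at the new account
        new_parent.next_account_id == new_account.account_id,
        # outputs: the new account takes over the parent's old successor
        new_account.next_account_id == old_parent.next_account_id,
    )
    return 0 if all(checks) else 1


# Valid: insert c.bit between b.bit and d.bit.
ok = verify_registration(
    AccountCell("b.bit", "d.bit"),
    AccountCell("b.bit", "c.bit"),
    AccountCell("c.bit", "d.bit"),
)
# Duplicate: c.bit already follows b.bit, so the second check fails.
dup = verify_registration(
    AccountCell("b.bit", "c.bit"),
    AccountCell("b.bit", "c.bit"),
    AccountCell("c.bit", "c.bit"),
)
print(ok, dup)
```

A duplicate can never pass, because for an existing account no parent cell satisfies both strict inequalities at once.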
As you can see, this scheme satisfies all four design principles we set out earlier.
Further thoughts
Is checking for duplicate data really this complicated on CKB?
We have to understand that the root cause of this "complexity" is the UTXO model, which leads to data being stored in scattered locations.
Then why does CKB use the UTXO model? Isn't ETH's account model good enough?
The UTXO model and the account model each have advantages and disadvantages. Some advantages of the UTXO model are:
- Parallel computation. On ETH, all transactions from a single account must be serialized; if one transaction gets stuck, none of the following ones can proceed.
- User data is stored in the user's own UTXO (Cell) rather than in a contract. Isn't that more in line with the spirit of decentralization?
We should recognize that the "complexity" we feel comes mostly from not yet being used to the new paradigm.
On-chain verification should be regarded as a protocol
As you can see, the constraints of a type script are more like a protocol. It stipulates what inputs and outputs a transaction must have, but who creates the transaction and how it is created are not the protocol's concern.
Does Solution 6 also have the Cell competition problem?
Yes. If several newly registered accounts all need to be inserted directly after the same AccountCell, they compete for that Cell. In the next article, we will introduce a mechanism we call "Keeper" that completely solves the Cell competition problem on top of Solution 6.
Finally, as we mentioned at the beginning:
When facing a problem, the ideas and solutions we adopted are not necessarily optimal; quite possibly they are not. But if ideas and solutions that fit our scenario can inspire others, the goal will have been achieved.
To be continued...
In the next article, Tim will introduce a mechanism called "Keeper" for dealing with the Cell competition problem. Feel free to leave a reminder for the next installment at https://talk.nervos.org/t/das-ckb-das/5669
If you have more experience using DAS, or insights from developing on CKB, please join the discussion on the Nervos Talk forum:
https://talk.nervos.org/