Introduction
A Merkle tree is a data structure used to store and verify data in a distributed system. It is a type of binary tree that is composed of a set of data blocks called “leaves”, which are connected by cryptographic hashes, forming a tree-like structure. The use of Merkle trees allows for the efficient and secure sharing of data between multiple parties, providing a way to verify the integrity of the data without the need to share the entire dataset. In this article, we will discuss the structure of Merkle trees, how they are used, and the benefits they offer.
07/14/2022

in-depthTechnical Basics

in-depthTechnical Basics
Main
- Merkle tree (hash tree) is an algorithm that allows you to get one hash for many pieces of data. The method is used to determine the integrity of files and verify information.
- A hash tree can be represented as a structure with branches branching from its base to intermediate nodes. At the ends of the branches are placed leaves representing data fragments. At the base of the tree is the root hash (Merkle root). The latter is a required element of the bitcoin block header.
- The root hash allows each transaction to be verified. For verification, only the block header and the operation authentication path need to be downloaded. Thanks to the Merkle tree, the amount of necessary calculations is reduced, which makes it possible to implement Simplified Payment Verification (SPV).
Who invented the merkle tree concept and when?
The hash tree concept was created by Professor Ralph Merkle. He invented a way to represent digital signatures in 1979. technology patent owns Stanford University.
The scientist suggested using a binary hash tree. Merkle also made significant contributions to the development of cryptography. He is best known for the 1987 publication “Digital signature based on conventional encryption function“.
What is the merkle tree for?
A centralized system provides data from a single source that all users rely on. The latter guarantees the correctness of the received information.
Blockchain is a distributed database. The information in it is stored on a set of independent nodes (nodes). A node cannot accept messages from other participants without checking them. The node needs to determine if the block contains valid transactions.
Merkle trees can be used to reduce computational costs. They allow you to reduce the amount of downloaded data and optimize their verification due to hashing.
The method is used in networks of bitcoin, Ethereum and other cryptocurrencies. With its help, a data string is obtained that verifies a group of transactions. The algorithm is also used in file systems and databases. Using Merkle trees, information is checked for errors and synchronized.
How does a Merkle tree work?
The Merkle tree is built from the bottom up. Values in leaf nodes are obtained by hashing data fragments. The nodes of the next level contain a hash of the sum of the two children. Used to combine data concatenation. The operation is repeated for nodes of the next levels until one hash is obtained. If the number of elements is odd, one of them is duplicated or carried over to the next level unchanged.
When building a tree, a single hash is obtained, which is called the Merkle root. The latter represents all pieces of data. Thus, a Merkle tree is a one-way hash function.
The algorithm allows you to build a binary structure in which the node values are formed from two strings. The last property provides the ability to verify a large amount of data without recalculating hashes for all fragments. The computational cost of determining the authenticity of a single element is much lower in this case.
To check the correctness of the array and its integrity, the root hash must be compared with the reference value. Fragments can be transaction data or parts of files.
How is a merkle tree used in bitcoin?
The bitcoin blockchain is formed from fragments that are written to the end of the chain. The latter contain data on transfers between users. The number of transactions, as well as the amount of information, is changeable, so the block does not have a fixed size.
To optimize calculations, bitcoin nodes form headers. They include the following elements:
- block version number;
- hash of the previous block;
- merkle tree root;
- timestamp;
- mining difficulty goal;
- the one-time code (nonce) used to generate the block.
The header does not include transactions and is 80 bytes. Since it is generated every ten minutes, the amount of data increases by 4.2 megabytes per year.
Transaction information is hashed, which makes it possible to obtain their identifiers. Transfer data is stored in hexadecimal format. The root hash represents all the transactions of the block. To find the latter, a Merkle tree is built. The data is processed according to the algorithm:
- There are hashes (identifiers) of transactions included in the block: hash(L1), hash(L2), hash(L3) and so on. They form the leaves of the tree.
- The hashes from the sum of two adjacent identifiers are placed on the next level: hash(hash(L1) + hash(L2)). In a binary tree, the number of nodes at each level must be even. Otherwise, the hash from the last cell is duplicated and placed in an additional element.
- The process of hashing the sum of the data is repeated until a Merkle root is obtained.
The resulting hash confirms the validity of each transaction. When forming the blockchain, the nodes use only the headers of the previous blocks.
In August 2017, an update to the Segregated Witness protocol was implemented. The latter uses a different structuring of transactional data, which makes it possible to reduce the block size. The adoption of the update has reduced the load on the bitcoin blockchain.
What are the benefits of a Merkle tree?
Hash trees make it easier to check whether a transaction belongs to a specific block and ensure the integrity of information during its transmission. The method is required for simplified payment verification. Satoshi Nakamoto suggested using SPV in bitcoin white paper.
If the node has significant computing resources, it can download all the blocks and find the hash of each transaction. After that, Merkle trees are formed. The latter allow you to check the integrity of the data and the validity of each operation. If a node’s resources are limited, it can only request a block header and transaction IDs.
Light Clients download the header and authentication path (Merkle proof) for the transaction being verified. They request information from the full node. The authentication path includes hashes from each pair of tree nodes that are on the path from the top to the transaction.
To verify the operation, it is necessary to find the Merkle root. The transaction is confirmed if the received hash matches the string contained in the block header. It is almost impossible to select the required Merkle root from another data set, which guarantees the validity of the operation.
The SPV method allows a light client to effectively interact with the blockchain and reduces the amount of information downloaded. For example, a block size with five transactions can be 500 kilobytes, while a merkle proof is only 140 bytes.
What hash tree is used in Ethereum?
A binary Merkle tree is well suited for an array represented as a list. Its structure remains unchanged, which is convenient for hashing transactions. Ethereum uses a different way to represent data – a prefix tree.
The latter allows you to store information in an associative array. Rows are keys that define the positions of its elements. To form the structure, the branches of the tree are denoted by different symbols so that the element key uniquely identifies it.
Unlike Bitcoin, the Ethereum blockchain uses three hash trees:
- transactions;
- states;
- a tree containing data on the results of transactions.
The block header includes three root hashes. Ethereum allows you to create lightweight clients that can perform a basic set of operations:
- check the presence of a transaction in the block;
- confirm the existence of a specific address;
- determine the user’s balance;
- find out the result of an operation or a smart contract.
These actions are carried out without full loading of blocks. Hash trees simplify calculations, allowing you to run lightweight clients on personal computers, laptops, and smartphones.
The algorithm for processing transactional data for Ethereum is similar to that for bitcoin. Interaction with the state tree is more complex. The array element key defines the user’s address, and its value represents the account balance.
A feature of a hash tree is the need for frequent data updates and the need to add and remove addresses. To implement the algorithm, a mutable structure is required. Its parameters are limited in order to prevent a DDoS attack that allows an attacker to create a tree that is too deep. Otherwise, updating the structure and conducting operations will take a significant amount of time.
The root of the tree must be determined solely by the data and not depend on its parameters. Accordingly, it is necessary to be able to make updates in any order.
The prefix tree allows us to solve these difficulties. In Ethereum, each structure element has 16 children. This approach is optimal for encoding nodes in hexadecimal format.
In a prefix tree, to obtain a key, you must specify the characters in a row corresponding to the branches. They define the path from the root to the selected element. The key value is dynamic, allowing new nodes to be added or removed.
Found a mistake in the text? Select it and press CTRL+ENTER
CryptoNewsHerald Newsletters: Keep your finger on the pulse of the bitcoin industry!
Conclusion
A Merkle Tree is an essential data structure in blockchain technology, allowing for efficient and secure verification of data. It is a powerful tool that is used to store and transfer data in a secure and efficient manner, while ensuring the integrity of the data. Merkle Trees are used in many different applications, such as digital signatures and financial transactions. With its ability to provide security, reliability, and scalability, Merkle Trees are an essential component of the blockchain technology.
Comments (No)