Ethereum State Data Structures

Ethereum state is stored in four different modified Merkle Patricia tries (MMPTs): the transaction, receipt, state, and storage tries. Each block has one transaction trie, one receipt trie, and one state trie, which are referenced by their root hashes in the block header. Every contract deployed on Ethereum also has a storage trie that holds the contract's persistent variables. Each storage trie is referenced by its root hash in the state account object, which is stored in the state trie leaf node corresponding to that contract's address.

Trie Node IPLD

This is the general MMPT node IPLD schema used by all of the MMPTs below (everything below except the AccountSnapshot). The tries are split out and explicitly typed below to show the different purposes and contents of each trie structure.

# TrieNode IPLD
# Node IPLD values are RLP encoded; node IPLD multihashes are always the KECCAK_256 hash of the RLP encoded node bytes, and the codec depends on the type of the trie
type TrieNode union {
    | TrieBranchNode "branch"
    | TrieExtensionNode "extension"
    | TrieLeafNode "leaf"
} representation keyed


# The below are the expanded representations for the different types of TrieNodes: branch, extension, and leaf
type TrieBranchNode struct {
    Child0 nullable Child
    Child1 nullable Child
    Child2 nullable Child
    Child3 nullable Child
    Child4 nullable Child
    Child5 nullable Child
    Child6 nullable Child
    Child7 nullable Child
    Child8 nullable Child
    Child9 nullable Child
    ChildA nullable Child
    ChildB nullable Child
    ChildC nullable Child
    ChildD nullable Child
    ChildE nullable Child
    ChildF nullable Child
    Value  nullable Value
}

# Value union type used to more explicitly type the different values stored in leaf nodes in the different tries
type Value union {
    | Transaction "tx"
    | Receipt "rct"
    | Account "state"
    | Bytes "storage"
} representation keyed

# Child union type
# The value is a (CID) link to a TrieNode, except in the case where the RLP encoding
# of the TrieNode is smaller than the hash the link is derived from (32 bytes); in that case the TrieNode is embedded directly
# In practice this is only possible for a storage trie, where a leaf node (partial path + value) can be shorter than 32 bytes
type Child union {
    | Link &TrieNode
    | TrieNode TrieNode
} representation kinded

type TrieExtensionNode struct {
    PartialPath Bytes
    Child Child
}

type TrieLeafNode struct {
    PartialPath Bytes
    Value       Value
}
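
The Child union above switches between a link and an embedded node depending on the size of the node's RLP encoding. The following is a minimal Python sketch of that size check, not a reference implementation. The helper names (`rlp_encode`, `is_inlined`) and the sample leaves are hypothetical, and the leaf is assumed to be serialized as a two-item RLP list of a hex-prefix encoded partial path and a value, as in the Ethereum trie specification.

```python
def _length_prefix(length, offset):
    # Short payloads carry their length in the prefix byte itself;
    # longer ones append the big-endian length after the prefix byte.
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """Minimal RLP encoder for bytes and (nested) lists of bytes."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single low byte encodes as itself
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload


def is_inlined(encoded_node):
    # A child is embedded in its parent when its RLP encoding is
    # shorter than the 32-byte KECCAK_256 hash that would link to it.
    return len(encoded_node) < 32


# Hypothetical small storage leaf: path byte 0x31 (leaf flag, odd length,
# nibble 1) and the RLP-encoded value 0x05.
small_leaf = rlp_encode([bytes([0x31]), rlp_encode(bytes([0x05]))])

# Hypothetical large storage leaf: a full 32-byte key path (33 bytes with
# the hex-prefix flag byte) and a 32-byte value.
large_leaf = rlp_encode([bytes([0x20]) + bytes(32), rlp_encode(bytes([0xFF]) * 32)])

print(small_leaf.hex(), is_inlined(small_leaf))
print(len(large_leaf), is_inlined(large_leaf))
```

In this sketch the small leaf would appear in its parent as the TrieNode member of Child, while the large leaf would be stored as its own block and referenced through the Link member.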

Transaction Trie IPLD

This is the IPLD schema type for transaction trie nodes.

# TxTrieNode is an IPLD block for a node in the transaction trie
type TxTrieNode TrieNode
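
The schema does not state how transaction trie keys are formed. In Ethereum, the key for a transaction is the RLP encoding of its index within the block, and the trie is traversed one nibble (half byte) of that key at a time. The Python sketch below illustrates this under those assumptions; the function names are hypothetical, and the integer encoder only covers indices whose byte length is below 56, which is far beyond any realistic block.

```python
def rlp_encode_uint(n):
    """RLP-encode a non-negative integer as a minimal big-endian byte string."""
    if n == 0:
        return b"\x80"  # zero is encoded as the empty byte string
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(raw) == 1 and raw[0] < 0x80:
        return raw  # a single low byte encodes as itself
    return bytes([0x80 + len(raw)]) + raw


def to_nibbles(data):
    """Split each byte into its high and low 4-bit halves."""
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    return nibbles


def tx_trie_path(index):
    """Nibble path under which the transaction at `index` is stored."""
    return to_nibbles(rlp_encode_uint(index))


for i in (0, 1, 127, 128):
    print(i, tx_trie_path(i))
```

The receipt trie uses the same keying scheme, so the same sketch applies to receipt indices.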

Receipt Trie IPLD

This is the IPLD schema type for receipt trie nodes.

# RctTrieNode is an IPLD block for a node in the receipt trie
type RctTrieNode TrieNode

State Trie IPLD

This is the IPLD schema type for state trie nodes.

# StateTrieNode is an IPLD block for a node in the state trie
type StateTrieNode TrieNode

State Account IPLD

This is the IPLD schema for a state account in the Ethereum state trie.

# Account is the Ethereum consensus representation of accounts.
# These objects are stored in the state trie leaves.
type Account struct {
    Nonce    Uint
    Balance  Balance
    
    # CID link to the storage trie root node
    # This CID links down to all of the storage nodes that exist for this account at this block
    # This CID uses the EthStorageTrie codec (0x98)
    # If this is a contract account the multihash is a KECCAK_256 hash of the RLP encoded root storage node
    # If this is an externally controlled account, the multihash is a KECCAK_256 hash of an RLP encoded empty byte array
    StorageRootCID &StorageTrieNode
    
    # CID link to the bytecode for this account
    # This CID uses the Raw codec (0x55)
    # If this is a contract account the multihash is a KECCAK_256 hash of the contract byte code for this contract
    # If this is an externally controlled account the multihash is the KECCAK_256 hash of an empty byte array
    CodeCID &ByteCode
}

type ByteCode bytes
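
As an illustration of the two links in Account, the Python sketch below assembles binary CIDv1 values from a KECCAK_256 digest and the codecs named in the comments above (0x98 and 0x55). It is a sketch under assumptions: the function names are hypothetical, the multihash code 0x1b for KECCAK_256 is taken from the multicodec table, and the two digests are the well-known hashes for an empty trie root and for empty code rather than values computed here.

```python
import base64

KECCAK_256 = 0x1B        # multihash code (assumed from the multicodec table)
ETH_STORAGE_TRIE = 0x98  # codec for StorageRootCID
RAW = 0x55               # codec for CodeCID


def varint(n):
    """Unsigned LEB128 varint, as used by multiformats."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def cid_v1(codec, digest):
    """Binary CIDv1: version, codec, then the KECCAK_256 multihash."""
    if len(digest) != 32:
        raise ValueError("KECCAK_256 digests are 32 bytes")
    multihash = varint(KECCAK_256) + varint(len(digest)) + digest
    return varint(1) + varint(codec) + multihash


def cid_to_string(cid):
    """Multibase base32 (lowercase, unpadded, 'b' prefix)."""
    return "b" + base64.b32encode(cid).decode("ascii").lower().rstrip("=")


# KECCAK_256 of the RLP encoding of an empty byte array (empty trie root)
EMPTY_TRIE_ROOT = bytes.fromhex(
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421")
# KECCAK_256 of an empty byte array (no contract code)
EMPTY_CODE_HASH = bytes.fromhex(
    "c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470")

storage_root_cid = cid_v1(ETH_STORAGE_TRIE, EMPTY_TRIE_ROOT)
code_cid = cid_v1(RAW, EMPTY_CODE_HASH)
print(cid_to_string(storage_root_cid))
print(cid_to_string(code_cid))
```

Because 0x98 is above 0x7f, the storage trie codec takes two varint bytes, so the two CIDs differ in length by one byte even though both digests are 32 bytes.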

Storage Trie IPLD

This is the IPLD schema type for storage trie nodes.

# StorageTrieNode is an IPLD block for a node in the storage trie
# The root node of the storage trie is referenced in an Account by the StorageRootCID
type StorageTrieNode TrieNode
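
To show how the node types fit together, the Python sketch below walks a nibble path from a root node down to a leaf value. It is illustrative only. Nodes are modeled as plain dicts that mirror the keyed TrieNode union, partial paths are assumed to be already decoded into lists of nibbles, links are modeled as strings resolved through a hypothetical in-memory `store`, and embedded children are nested dicts. For a real storage trie the path would be the nibbles of the KECCAK_256 hash of the storage slot key, which is not computed here.

```python
def lookup(node, path, store):
    """Return the Value stored at the nibble path, or None if absent."""
    while node is not None:
        if isinstance(node, str):
            node = store[node]  # Link member of Child: fetch the block
            continue
        kind, body = next(iter(node.items()))  # keyed union: one entry
        if kind == "leaf":
            return body["Value"] if body["PartialPath"] == path else None
        if kind == "extension":
            prefix = body["PartialPath"]
            if path[:len(prefix)] != prefix:
                return None
            path = path[len(prefix):]
            node = body["Child"]
        elif kind == "branch":
            if not path:
                return body.get("Value")
            # Child0 .. ChildF, selected by the next nibble
            node = body.get("Child%X" % path[0])
            path = path[1:]
        else:
            raise ValueError("unknown node kind: %r" % kind)
    return None


# Hypothetical blocks: one leaf stored separately and referenced by link
store = {
    "cid-leaf": {"leaf": {"PartialPath": [3, 4], "Value": {"storage": b"\x2a"}}},
}

# Root extension -> branch with one linked child and one embedded child
root = {"extension": {"PartialPath": [1], "Child": {"branch": {
    "ChildA": "cid-leaf",
    "Child2": {"leaf": {"PartialPath": [5], "Value": {"storage": b"\x01"}}},
}}}}

print(lookup(root, [1, 0xA, 3, 4], store))
print(lookup(root, [1, 2, 5], store))
print(lookup(root, [1, 3, 0], store))
```

The third lookup follows an empty branch slot and therefore yields no value, which corresponds to a nullable Child being null in the schema.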