Intro to Advanced Data Layouts

Advanced Data Layouts ("ADLs") are an IPLD convention for customizing how to see and interact with some data.

A slightly more technical definition is: an ADL is some code which is applied to some Data Model data in order to make it look like another Node; or when writing, an ADL presents a single NodeBuilder which transforms any input into other Nodes (or possibly even several Nodes, perhaps even across several blocks when serialized!).
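A minimal sketch of that definition, in Python rather than any real IPLD library: the write side turns one logical map into several stored nodes, and the read side presents those nodes as a single map again. Every name here (`shard_index`, `build_sharded_map`, `ShardedMapView`, the `shard-N` strings standing in for CIDs, and the dict used as a block store) is an assumption made for illustration, and the placement rule is a toy, not a real HAMT.

```python
from collections.abc import Mapping


def shard_index(key, shard_count):
    # Toy placement rule (an assumption): byte sum modulo shard count.
    return sum(key.encode("utf-8")) % shard_count


def build_sharded_map(entries, shard_count, blocks):
    """Write side: turn one logical map into several stored nodes."""
    shards = [{} for _ in range(shard_count)]
    for key, value in entries.items():
        shards[shard_index(key, shard_count)][key] = value
    links = []
    for i, shard in enumerate(shards):
        link = f"shard-{i}"  # hypothetical link; stands in for a real CID
        blocks[link] = shard
        links.append(link)
    return {"shards": links}  # the root node is plain Data Model data


class ShardedMapView(Mapping):
    """Read side: present the root and its shards as a single map."""

    def __init__(self, root, blocks):
        self._links = root["shards"]
        self._blocks = blocks

    def __getitem__(self, key):
        link = self._links[shard_index(key, len(self._links))]
        return self._blocks[link][key]  # only one shard is consulted

    def __iter__(self):
        for link in self._links:
            yield from self._blocks[link]

    def __len__(self):
        return sum(len(self._blocks[link]) for link in self._links)


blocks = {}
root = build_sharded_map({"apple": 1, "banana": 2, "cherry": 3}, 2, blocks)
view = ShardedMapView(root, blocks)
```

Code that receives `view` can treat it like any other map; the fact that its entries are spread over two stored nodes is hidden behind the ADL.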

One of the most common uses of ADLs is sharded datastructures. However, other uses are possible. (For example, IPFS uses ADLs to make UnixFS's user-facing pathing work. Some people have researched using ADLs as part of encryption system design. More examples will be discussed below!)

There are some forms of loose standardization for ADLs that are commonly used. It's worth looking for existing ADLs that do what you need before rolling your own! (For example: If you're looking for a sharded, scalable map -- you're not the first! That's just one example of code you'll find you can share with others.)

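Learn briefly about where ADLs fit and what problems they solve in the sections "Where are ADLs in the big picture?", "Schemas vs ADLs", and "Applications of ADLs".

Then, learn more about the details of what ADLs are and the boundaries of their interface in "How ADLs Work" and its subsections.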

Quick Examples

Where are ADLs in the big picture?

Read the docs about the Data Model first, if you haven't already. ADLs build upon the concepts that are introduced and standardized by the Data Model.

ADLs appear at a middle level of most stacks, if they're present.

ADLs are also entirely optional parts of IPLD: they're useful, but they're not the first thing you need to implement if building a new IPLD library in a new language.

Schemas vs ADLs

Both Schemas and ADLs can be described as "lenses" for data, but they have different purposes and scopes. Schemas only allow very specific "lenses", are designed to be fast, and are mostly intended for structuralizing data and validating it. ADLs have a much broader scope: ADLs allow arbitrary plugins, can contain complex data transformations, and can even trigger multiple data load and store operations internally (as they do when used for sharding algorithms).
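The contrast can be sketched in Python. The first function imitates a Schema-style lens (a struct with a tuple representation): it is fixed, purely structural, and does no I/O. The second imitates an ADL: arbitrary code that may load several blocks to assemble one list. Both functions, the `load` callback, and the block layout are hypothetical and not taken from any IPLD library.

```python
def person_schema_lens(node):
    """Schema-style lens: validate a [name, age] list and name its fields."""
    if not (isinstance(node, list) and len(node) == 2):
        raise ValueError("expected a 2-element list")
    name, age = node
    if not isinstance(name, str) or not isinstance(age, int):
        raise ValueError("expected [string, int]")
    return {"name": name, "age": age}


def linked_list_adl(node, load):
    """ADL-style lens: follow links, loading blocks, to build one list."""
    items = []
    while node is not None:
        items.extend(node["items"])
        # `load` is an assumed callback that fetches a block by its link.
        node = load(node["next"]) if node["next"] is not None else None
    return items


# Hypothetical block store with one extra block reachable from the head node.
blocks = {"b2": {"items": [3, 4], "next": None}}
head = {"items": [1, 2], "next": "b2"}

person = person_schema_lens(["Ann", 30])
numbers = linked_list_adl(head, blocks.__getitem__)
```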

Neither Schemas nor ADLs depend on the other. Both are optional parts of IPLD, and they can be used together or independently.

Applications of ADLs

ADLs are generally used to make some complex system simpler, or more legible.

Because ADLs make complex data structures readable and writable as "just" a Node, all the features of IPLD that work over regular Nodes work over ADLs, too.

For example, a Selector can walk an ADL just as it would walk any other Node.

This reusability makes many features available to systems built with ADLs, with a minimum of development effort.

In particular, the Selectors story is quite powerful, because without ADLs there is no good fallback. Having a Selector walk over the inner state of an unknown data structure (let's take a HAMT as an example, though the principle is general) is only possible if you know the load factor of that structure, or other specific details of its internal state. For many applications of Selectors -- especially, say, the use of Selectors to ask someone else on the internet to send you data that you don't already have -- this would make Selectors all but useless. However, by running Selectors over an ADL, things work out nicely.
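The same principle can be sketched with plain path resolution standing in for a real Selector engine. The `resolve` function below knows only about maps and lists, yet it gives the same answer over an ordinary map and over a hypothetical two-shard ADL view. `resolve`, `TwoShardMap`, and the split at the letter "n" are all assumptions made for this example.

```python
from collections.abc import Mapping


def resolve(node, path):
    """Generic pathing: handles maps and lists, and knows nothing about ADLs."""
    for segment in path.split("/"):
        if isinstance(node, Mapping):
            node = node[segment]
        else:
            node = node[int(segment)]
    return node


class TwoShardMap(Mapping):
    """Hypothetical ADL view: keys before "n" live in one shard, the rest in another."""

    def __init__(self, low, high):
        self._low, self._high = low, high

    def _shard(self, key):
        return self._low if key < "n" else self._high

    def __getitem__(self, key):
        return self._shard(key)[key]

    def __iter__(self):
        yield from self._low
        yield from self._high

    def __len__(self):
        return len(self._low) + len(self._high)


plain = {"users": {"alice": {"age": 30}, "zed": {"age": 41}}}
sharded = {"users": TwoShardMap({"alice": {"age": 30}}, {"zed": {"age": 41}})}

from_plain = resolve(plain, "users/zed/age")
from_sharded = resolve(sharded, "users/zed/age")
```

The caller of `resolve` never needs to know how many shards exist or how keys are assigned to them, which is the property that makes remote requests by Selector workable.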

How ADLs Work

ADLs make nodes look like another node

  • emphasis on one node as the result: whether it is a map, a list, bytes, or something else, it is exactly one.
  • include concrete example of what kind of transformation you'd be better off doing with schema.

ADL interior data is still Data Model

  • clarify that without ADL code activated, the raw data can still be read and even traversed... just differently.
  • clarify that codecs and ADLs compose, there's a clear layering there.
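A rough illustration of both notes, assuming JSON as a stand-in for a real IPLD codec and a made-up "chunked string" ADL: the codec layer turns bytes into Data Model data, that data can be read directly, and the ADL layer is applied on top only when wanted.

```python
import json

# A serialized block, as a codec might store it (JSON is only a stand-in here).
raw_block = b'{"chunks": ["hello, ", "sharded ", "world"]}'

# Layer 1: the codec turns bytes into Data Model data.
data = json.loads(raw_block)

# Without any ADL code, the data is still ordinary maps, lists, and strings,
# so it can be read and traversed as-is.
second_chunk = data["chunks"][1]


# Layer 2: a hypothetical ADL interprets the same data as a single string node.
def chunked_string_adl(node):
    return "".join(node["chunks"])


as_one_node = chunked_string_adl(data)
```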

ADLs use code

ADLs use code, and some sort of plugin system is needed in IPLD libraries to support this.

How exactly those plugin systems work, and what kind of format the code needs to be authored in, and exactly what interfaces need to be adhered to: these will all vary per IPLD library and the language the IPLD library is in.
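As one possible shape for such a plugin system, here is a minimal registry sketch in Python. It is not the interface of any actual IPLD library; `ADL_REGISTRY`, `register_adl`, `reify`, and the "chunked-string" ADL are all hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: ADL name -> function that presents raw data as a new node.
ADL_REGISTRY: Dict[str, Callable[[Any], Any]] = {}


def register_adl(name):
    """Decorator that installs an ADL implementation under a name."""
    def decorator(fn):
        ADL_REGISTRY[name] = fn
        return fn
    return decorator


@register_adl("chunked-string")
def chunked_string(node):
    # Present a map holding a list of chunks as one string.
    return "".join(node["chunks"])


def reify(name, node):
    """Apply the named ADL, failing loudly if no plugin is installed."""
    try:
        adl = ADL_REGISTRY[name]
    except KeyError:
        raise LookupError(f"no ADL registered under {name!r}") from None
    return adl(node)


result = reify("chunked-string", {"chunks": ["a", "b", "c"]})
```

Failing loudly on an unknown name is a deliberate choice in this sketch: silently returning the raw data would hide the fact that the ADL code was never applied.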

(Someday, a system for portable ADL code would be neat. However, we currently consider that a research problem: some notes can be found in open-research/ADLs-we-can-autoexecute.)

How do I know when to use an ADL to read data?

We call this "the signalling problem".

It has its own page: go to ADL signalling to learn more.