Token Flow
Token Flow
The complexity of storage data is why so many crypto-novices ask the same question over and over again: “How can I iterate over hash maps (e.g. balanceOf of any ERC20 token)?” and are so surprised to hear: “You cannot!”.
For many use cases this is the only way to go. Let’s explore the options – you could build local databases or use one of existing data providers like The Graph or Dune Analytics to query the history of events. If you are a blockchain data analyst you know this pattern well: “I want a history of state so I have to aggregate historical events to recreate smart contract values at a given point in time”.
Ethereum Data Warehouse provides an easier way to do this. Its major and breakthrough feature is the detailed history of state and storage changes or state and storage diffs. It enables new types of queries that answer some basic questions in the simplest and most reliable way possible – by directly using variables of smart contracts.
Let’s see how we answer some frequent questions.
Question 1 - What is the full list of token holders at a point in time?
The traditional events-based approach takes in all the transfers and looks for unique addresses at the end. Depending on the token, this can be a massive undertaking. This would be something like this:
- Analyze the contract source code to fully understand its logic.
- Identify all types of events that can be used to identify new holders (typically it is enough to use Transfer events but, not always. WETH for example can also emit a Deposit event instead of Transfer so you have to use both of them).
- Make sure that new holders cannot be added without emitting any event (in such case you probably should also aggregate relevant internal calls).
- Select all identified events since the contract deployment and aggregate them to get a distinct list of holders.
Now with TokenFlow’s decoded storage diffs this task becomes much easier:
- Identify the variable that acts as the register of token holders and their balances. In this case it’s the variable balanceOf, which is a mapping (address => uint)
- Query all distinct keys used in this mapping.
A sample query to get the full list of holders of DAI would be:
select
distinct(hashmap_keys[0]) as token_holder
from
storage_diffs
where
contract_address = '0x6b175474e89094c44da98b954eedeac495271d0f'
and variable = 'balanceOf'
and not reverted
The decoded data makes this task trivial! You can just inspect the storage variable names for any contract and then query them to get their values.
Question 2 - How can I see the total supply of a token in time?
Assuming you preprocessed the events and calls data as before to get a list of transfers for a specific token, which can be a challenge of itself, you will then have to do the following:
- Understand that mints are USUALLY (not always) encoded as transfers from zero (0x0000…0)
- Understand that burns are USUALLY (not always) encoded as transfers to zero address
- Aggregate everyting
A sample query for SHIBA INU on our own TRANSFERS table:
with daily_changes as (
select date(block_timestamp) as date,
sum(iff(sender = '0x0000000000000000000000000000000000000000',amount,
iff(receiver = '0x0000000000000000000000000000000000000000', -amount, 0))) as daily_change
from transfers
where token_address='0x95ad61b0a150d79219dcf64e1e6cc01f0b64c4ce'
group by date)
select distinct date, sum(daily_change) over (order by date) as balance
from daily_changes
order by date desc;
This only works if the specific token is only created or destroyed by mints and burns. If a token has some sort of yield or interest mechanism, this approach will not give a comprehensive overview! Sometimes, contracts can use any arbitrary address for mints and burns, ie. 0x00000…1, 0x00000…2, in addition to 0x0. To get a complete overview, you should check for all the possible “burn” addresses.
With the storage diff approach, this is as easy as reading the most recent value for the _totalSupply (in the case of the SHIBA INU token) storage variable of the contract!
And this will ALWAYS give the correct answer provided the contract code increases or decreases total supply with every creation or destruction of a token unit as it should.
We will make use of the last_value window function to make things easier.
select
distinct(date(block_timestamp)) as day,
(
last_value(curr_value) over (
partition by date(block_timestamp)
order by block_number, order_index)
):: INTEGER as balance
from storage_diffs
where
contract_address = '0x95ad61b0a150d79219dcf64e1e6cc01f0b64c4ce'
and variable = '_totalSupply'
and not reverted
order by day desc
Question 3 - Which contracts store or use my address?
With the amount of hacks we saw in the past, that should be a basic security question, yet the traditional approach does not offer us any useful tooling here.
Let’s use TokenFlow instead then – simply look for all variables that use my address as a key and storage values that are equal to my address.
select
distinct(contract_address)
from storage_diffs
where
array_contains('0x0035fc5208ef989c28d47e552e92b0c507d2b318'::variant, hashmap_keys)
or curr_value='0x0035fc5208ef989c28d47e552e92b0c507d2b318'
Closing thoughts
When looking at the examples above, it’s quite obvious that with currently available tooling, there’s a huge chunk of Ethereum data that is entirely inaccessible to most data analysts.
TokenFlow’s EDW aims to change this paradigm so that:
- Entire history is accessible to everyone with basic SQL knowledge by providing a semantic, decoded layer on top of raw data
- Complex queries are 1) possible to execute and 2) cheaper/faster to run thanks to using storage history versus just events
- Most useful master datasets like dictionaries of contracts, tokens, pools, vaults, etc. are distilled for ease of use and standardisation of queries.
Excited to know more? Talk to us and let’s work together!
More articles
Raw data trap. The hidden value in decoded data.
Raw blockchain data is a public good, objective and true. It’s only natural to start building your analytics with it, but very quickly you realise

Focus on your use case. Leave the analytics to us.
Token Flow provides emerging teams with fully hosted data pipelines and an analytics team that works with and for them. It delivers more than just a product, it answers questions.

Understanding Blockchain State and Contract Storage: A Simplified Guide to Diffs
Understanding Blockchain State and Contract Storage: A Simplified Guide to Diffs Blockchain technology, while revolutionary, can often seem like a maze of complex concepts and