People often ask us what fundamentally sets Web3 data analysis apart from conventional data analysis. Isn't it all just data? The answer is not straightforward. At its core, both involve processing and interpreting data, but the way data is collected, stored, and used in Web3 differs from traditional methods.
In this post, we’re not going to dive into all the differences between Web2 and Web3 terminology. Instead, we’re going to focus on major differences with respect to analyzing data on-chain (on a blockchain) vs data that lives off-chain (in a database). What makes on-chain data particularly interesting is that it is public and transparent, making it possible for anyone to do on-chain data analysis. This is the reason why Crypto Twitter is full of amazing market, project and smart contract analysis.
The Differences Between Web2 and Web3 Data Analytics
Centralized vs Decentralized
In Web2, data analytics typically centers on collecting as much user data as possible and storing it in a centralized database. The goal is usually to analyze user behavior or sales data in order to drive sales and advertising. This works well for use cases in e-commerce, CRM and financial services. A frequently raised concern is that this data is owned by big companies and used without user consent, which raises privacy issues.
In Web3, transparency and ownership of assets are core values. Data is stored publicly on-chain and, because it is processed and stored in a decentralized manner (blockchain nodes are spread worldwide), it cannot be removed.
Web3 data is generated when users interact with DApps (decentralized applications) or smart contracts (self-executing pieces of code on a blockchain), or when they transfer cryptocurrencies. This data is usually used to analyze market trends, user/wallet behavior or DApp performance (tracking smart contract activity, volume, etc.).
What Does a Typical Web3 Data Stack Look Like?
In Web2, analytics relies heavily on trackers, centralized data platforms and SaaS. Well-known examples include Google Analytics and Salesforce. We won’t go further into specific Web2 analytics stacks, as plenty of content on them is already available. Instead, we will focus on the specifics of Web3 data stacks.
In Web3, data is public: anyone can query it, and anyone can run a blockchain node to retrieve it, which is a significant shift from the centralized data in Web2. Nodes are essentially computers that connect to a blockchain network, allowing them to participate in verifying and recording transactions. Running a node gives direct access to real-time blockchain data.
However, running a node requires technical expertise and resources, as blockchain data is vast and continuously growing. For this reason, many projects opt for hosted node solutions provided by specialized services. Nodes are useful for getting raw blockchain data, but this data is often not standardized, cleaned or decoded into a human-readable format. Hence, handling raw blockchain data poses another challenge for many users and developers.
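As a rough illustration of what talking to a node looks like, the sketch below builds a JSON-RPC request for Ethereum's `eth_getBlockByNumber` method. It only constructs the request body and does not send it. The helper name `build_block_request` and the block number are our own assumptions for the example.

```python
import json


def build_block_request(block_number=None, full_transactions=False, request_id=1):
    """Build the JSON-RPC body for eth_getBlockByNumber.

    block_number=None asks for the latest block; integers are sent as
    hex strings, which is how JSON-RPC expects quantities to be encoded.
    """
    block_param = "latest" if block_number is None else hex(block_number)
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBlockByNumber",
        # Second parameter: True returns full transaction objects,
        # False returns only transaction hashes.
        "params": [block_param, full_transactions],
        "id": request_id,
    }


# Hypothetical block number, chosen only for illustration.
payload = build_block_request(block_number=17_000_000)
print(json.dumps(payload))
```

In practice this body would be sent as an HTTP POST to your own node or to a hosted node provider. The response comes back with most numeric fields hex-encoded, which is the kind of raw data that still needs decoding.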
With the recent rise of NFTs and DeFi, there is a need for data to power DeFi dashboards and display NFT metrics. This is why data indexers have emerged that provide structured, indexed data through easy-to-use interfaces (Etherscan, Dune Analytics) and APIs (The Graph).
How is Web3 Data Analysis Done?
Web3 data analysis differs not only in how data is stored and processed but also in the type of data itself. The challenge with Web3 data is the complexity, scale, pseudo-anonymity and diversity of the data across different blockchain networks. Networks such as Bitcoin, Ethereum and chains built on other virtual machines each come with their own data structures and formats.
Blockchain data consists mainly of blocks containing transactions. Each transaction contains a set of fields that the user has set, such as a receiver, an amount of tokens or a gas limit. Gas measures the computational work needed to conduct a transaction or execute a smart contract, and the fee the user pays is based on it. Many data fields are also encoded in a hexadecimal representation and need to be decoded into a human-readable format. Blockchain explorers like Etherscan structure this data so that users can look up their on-chain transactions.
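To show what turning hexadecimal fields into readable values can look like, here is a minimal sketch. The `raw_tx` dictionary is a made-up transaction shaped like an Ethereum JSON-RPC response. The field selection and the `decode_transaction` helper are our own assumptions, not a complete decoder.

```python
from decimal import Decimal

WEI_PER_ETHER = Decimal(10**18)
WEI_PER_GWEI = Decimal(10**9)

# Hypothetical raw transaction with hex-encoded fields (made-up values).
raw_tx = {
    "to": "0x" + "ab" * 20,
    "value": "0xde0b6b3a7640000",  # amount in wei
    "gas": "0x5208",               # gas limit
    "gasPrice": "0x4a817c800",     # price per unit of gas, in wei
}


def decode_transaction(tx):
    """Convert hex-encoded quantities into human-readable numbers."""
    value_wei = int(tx["value"], 16)
    gas_limit = int(tx["gas"], 16)
    gas_price_wei = int(tx["gasPrice"], 16)
    return {
        "to": tx["to"],
        "value_ether": Decimal(value_wei) / WEI_PER_ETHER,
        "gas_limit": gas_limit,
        "gas_price_gwei": Decimal(gas_price_wei) / WEI_PER_GWEI,
        # Upper bound on the fee: gas limit times gas price.
        "max_fee_ether": Decimal(gas_limit * gas_price_wei) / WEI_PER_ETHER,
    }


decoded = decode_transaction(raw_tx)
for field, value in decoded.items():
    print(f"{field}: {value}")
```

Using `Decimal` instead of floats avoids rounding errors when converting between wei and ether, which matters once amounts are summed across many transactions.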
Explorers are good for looking up single transactions. For market analysis, there are various Web3 data analysis platforms that have indexed and structured blockchain metrics to make analysts' work easier (e.g. Messari, Nansen). Also popular are self-service data platforms where anyone can query blockchain data using SQL (Dune, Flipside), a language that data analysts already know from Web2.
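The SQL-based approach can be sketched locally with Python's built-in `sqlite3` module. The `transactions` table, its columns and its rows below are toy assumptions for illustration. They do not reflect the actual schema of Dune, Flipside or any other platform.

```python
import sqlite3

# In-memory database with a toy, already-decoded transactions table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (block_date TEXT, from_address TEXT, value_eth REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2024-01-01", "0xaaa", 1.5),
        ("2024-01-01", "0xbbb", 0.5),
        ("2024-01-02", "0xaaa", 2.0),
    ],
)

# Daily transaction count, active wallets and transferred volume.
daily_metrics = conn.execute(
    """
    SELECT block_date,
           COUNT(*)                     AS tx_count,
           COUNT(DISTINCT from_address) AS active_wallets,
           SUM(value_eth)               AS volume_eth
    FROM transactions
    GROUP BY block_date
    ORDER BY block_date
    """
).fetchall()

for row in daily_metrics:
    print(row)
```

The query itself is ordinary SQL, which is why analysts coming from Web2 can be productive quickly. What differs on a real platform is the table layout and the need to understand what each on-chain field means.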
The big challenge with blockchain data lies in interpreting transactions, events, traces and smart contracts, and in understanding how a sequence of events and actors ties together. That is a topic we can dive deeper into in a future blog post.
How Do DApps Process Their Data?
The data solutions above suit analysts and users doing one-off data analysis. For DApps that process their data in real time, the situation is different. It is important to understand that on a blockchain the data layer is shared. Unlike in Web2, where every application has its own separate database, on a blockchain all projects' data is stored together, whether it comes from a transfer, an NFT project or a DeFi lending protocol. So how do DApps filter out their own project data in this sea of data?
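Conceptually, the simplest answer is to filter the shared data by the address of the project's own smart contract. The sketch below does this over a small in-memory list of event logs. The addresses, events and the `logs_for_contract` helper are hypothetical and only illustrate the idea.

```python
# Hypothetical contract addresses (40 hex characters each, made up).
MY_CONTRACT = "0x" + "ab" * 20
OTHER_CONTRACT = "0x" + "cd" * 20

# Toy event logs from a shared data layer, mixing several projects.
shared_logs = [
    {"address": MY_CONTRACT, "event": "Transfer", "block": 100},
    {"address": OTHER_CONTRACT, "event": "Deposit", "block": 100},
    # Same contract, written in mixed case as explorers often display it.
    {"address": "0x" + "Ab" * 20, "event": "Approval", "block": 101},
    {"address": OTHER_CONTRACT, "event": "Borrow", "block": 102},
]


def logs_for_contract(logs, contract_address):
    """Keep only the logs emitted by the given contract.

    Addresses are compared case-insensitively because the same address
    can appear in lowercase or mixed-case form.
    """
    target = contract_address.lower()
    return [log for log in logs if log["address"].lower() == target]


my_logs = logs_for_contract(shared_logs, MY_CONTRACT)
print([log["event"] for log in my_logs])
```

Scanning everything like this does not scale to a real chain, which is why DApps usually rely on indexed data rather than filtering raw data themselves.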
For real-time data retrieval or operational metrics, many DApps turn to data providers or popular GraphQL solutions, like The Graph, which return data filtered by their smart contract address and the metrics they are interested in. The Graph allows DApps to define their own APIs, known as subgraphs, and to retrieve specific metrics from the blockchain to analyze and display to their end users.
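A request to a subgraph is a regular GraphQL request sent over HTTP. The sketch below only builds the request body. The `transfers` entity and its fields are hypothetical, since every subgraph defines its own schema, and the endpoint to send it to depends on the subgraph you are querying.

```python
import json

# Hypothetical subgraph schema: an entity collection called `transfers`
# with the fields id, from, to, value and timestamp.
TRANSFERS_QUERY = """
query RecentTransfers($first: Int!) {
  transfers(first: $first, orderBy: timestamp, orderDirection: desc) {
    id
    from
    to
    value
    timestamp
  }
}
"""


def build_subgraph_request(first=5):
    """Return the JSON body for a GraphQL request asking for recent transfers."""
    return json.dumps({"query": TRANSFERS_QUERY, "variables": {"first": first}})


body = build_subgraph_request(first=10)
print(body)
```

The DApp frontend would send this body as a POST request to its subgraph endpoint and render the returned entities. The filtering by contract address happens when the subgraph is defined, not in the query itself.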
Conclusion
Analyzing blockchain data can be quite daunting at the beginning, but with the right fundamental knowledge of blockchains and smart contracts, plus solid data analysis skills, these gaps can be overcome. The Web3 data space is still emerging and a lot of tools are not mature yet, but that’s what makes this space exciting to us. The fact that all data is open and public means anyone can roll up their sleeves and start analyzing data!
We are constantly helping companies in both Web2 and Web3 educate themselves about data analytics, blockchains and their intersection: Web3 data analytics.
If you want to know more, leave a message and we’ll get in touch!
We can’t wait to see which data insights you are building!