Analyzing the Bitcoin Blockchain using the Hadoop Ecosystem – A first Approach

Bitcoin and other crytocurrencies have drawn a lot of attention of companies, public organizations and individuals. While many use cases exists there is still a long road ahead to make them part of everybody’s life.

The recently released first version of the open source hadoopycryptoledger library is a first attempt to make this happen. It currently allows analyzing the Bitcoin blockchain together with any data using Hadoop ecosystem tools. The Bitcoin blockchain is a distributed ledger containing all transactions executed over the Bitcoin network.

Hence, virtually all use cases related to analysis of the Bitcoin blockchain are possible. Some examples:

  • Predict Bitcoin exchange prices by analysing the Blockchain together with pricing information from Bitcoin exchanges
  • Explore relationships between counterparties in the blockchain
  • Explore impact of Bitcoin miners on the Bitcoin ecosystem
  • Trace Bitcoin money flows around the network
  • Link news events with Bitcoin blockchain data
  • Link economic data with Blockchain transactions

Currently the library provides a Hadoop File Format to analyze the Blockchain with any Hadoop application. For example, one can develop a Hadoop MapReduce, Spark or TEZ josb.

There are several enhancements planned for the library over the coming weeks, such as

  • Provide an example how to use Spark with the hadoopcryptoledger library
  • Integration of Blockchain data into Hive to enable end users to use SQL queries to analyze the blockchain
  • A flume source for receiving new Bitcoin blocks
  • Adding support for more crypto ledgers, such as Ethereum
Advertisements