Spark+Scala+Graphx: Analyzing the Bitcoin Transaction Graph

The hadoopcryptoledger library provides now an example how you can generate a Bitcoin Transaction Graph using the Big Data graph analysis technologies Spark+Scala+Graphx. Basically it demonstrates how to read the Bitcoin Blockchain from HDFS, transform it into a graph with Bitcoin addresses as vertices and transactions between them as edges. The example returns the 5 top bitcoin addresses having the most input transactions. This could indicate that they belong to Mixing services that try to obfuscate transactions between two addresses. The graph exemplified in the following figure showing four vertices with transactions between them:

transactiongraph

Of course this is just one example. You can think about numerous of other analysis related to this graph using algorithms such as strongly connected components or PageRank. Particularly if you connect it with other data that you collect related to the blockchain. You can also use this graph to do visual analytics on it.

In the coming weeks, further extensions are planned to be published:

  • Some common analytics pattern to analyze the Bitcoin economy

  • Some technical patterns, such as Bitcoin block validation

  • A flume source for receiving new Bitcoin blocks including Economic and technical consensus (storing and accessing it in the Hadoop ecosystem, e.g. in Hbase)

  • Adding support for more crypto ledgers, such as Ethereum

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s