Cameron Wilson for Educative

Posted on • Originally published at educative.io

How to become a Blockchain Developer: A Complete Guide

Blockchain is still very much in its infancy but the opportunities for developers to get involved are not only exciting but abundant.

Blockchain has made its way into many different industries, including supply chain, automotive, and finance, but it hasn’t been without its faults. The first real use case for Blockchain arrived with cryptocurrency, and in particular Bitcoin. In fact, the Bitcoin blockchain was launched in 2009 by a person or group using the pseudonym Satoshi Nakamoto.

While Blockchain may be new and still met with some skepticism, it is a highly sought-after skill set at companies of all sizes. If you’re able to develop and deploy Blockchain networks and have experience with Hyperledger Fabric, it’s safe to say you’ll be in high demand. But how do you go about becoming a blockchain developer? How do you build Blockchain applications? Let’s find out how you can get started with Blockchain development in 2020 and beyond.

Want to build your first blockchain application and deploy a fabric network? Check out our newest course on blockchain with Hyperledger Fabric.

Here’s what will be covered:

  • What is Blockchain Technology?
  • Pre-reqs for understanding Blockchain
  • The big picture: How is data stored in Blockchain?
  • The different Blockchain platforms
  • Most popular Blockchain programming languages
  • Introduction to Hyperledger Fabric
  • Why specialize in Hyperledger Fabric
  • Getting started with Blockchain development

Let’s dive in!

What is Blockchain Technology?

Blockchain is a form of Distributed Ledger Technology (DLT). It is used to create a storage system that holds data in a distributed and immutable manner.

These are the key features from a technological standpoint.

Immutability

This means that once data is written to a blockchain data store or ledger, it cannot be changed – it’s there forever. In contrast, in a standard Relational Database, no matter how much security you implement, the data can be accessed and modified on the file system on which the data is persisted. This could be done by a corrupt admin or a hacker.

A blockchain system ensures that if even a single bit of data is changed at any level of the ledger, the system reports an invalid state. And since the data is distributed across multiple systems, the data in its valid state can be recovered from one of the other systems.

Distribution

As long as you see data on a blockchain and it’s in a valid state on a majority of distributed nodes, you can trust that data to be accurate. This trust is key. It is achieved in a blockchain system by replicating the datastore on a number of peers (hosts) on the internet. If a misbehaving peer enters an invalid state, the other peers can filter it out. As long as a majority of peers agree on a common valid state, you can trust the data stored on that system. This replication also provides high availability.

This trust is vital! No other system in the past has been able to develop this by design.

Pre-reqs for understanding Blockchain

Let’s dive into some prerequisites that you’ll need to really understand Blockchain. Here, we’ll be discussing Hash Functions and Public Key Cryptography.

Hash Functions

A hash function is a one-way function that maps input data X to a hash value Y:

HASH(X) = Y

Such that:

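  • It is computationally infeasible to find another X’ with HASH(X’) equal to Y. In practice the mapping behaves as if it were one to one, although collisions must exist in theory because the output size is fixed.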
  • The size of Y is fixed and the size of X can be arbitrary.
  • Given Y you can not calculate X. It’s a one-way function!
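The properties above can be observed with a few lines of Python. This is a minimal sketch that assumes SHA-256 (introduced below) from Python's standard `hashlib` module as the hash function; the helper name `sha256_hex` is made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different sizes (1 byte and 1,000,000 bytes).
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Fixed-size output: both digests are 64 hex characters (256 bits).
print(len(short_digest), len(long_digest))

# Deterministic: the same input always gives the same digest.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))

# Changing a single character of the input gives a different digest.
print(sha256_hex(b"hello"))
print(sha256_hex(b"hellp"))
```

Nothing in the output lets you work backwards from a digest to the input, which is the one-way property in practice.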


Hash Functions for Checking Integrity

This means that I can take a huge text file and compute its digest using a hash function. If I send that file along with its computed hash to a receiver, let’s say Bob, then Bob can recompute the hash to check that the file’s contents were not corrupted in transmission. Many download sites publish hash values for the same reason, so that you can verify the integrity of a downloaded file.
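The exchange with Bob could look roughly like this. It is a sketch only: the function names `digest` and `verify` and the sample contents are assumptions, and SHA-256 from Python's standard `hashlib` stands in for whichever hash function the sender and receiver agree on.

```python
import hashlib


def digest(content: bytes) -> str:
    """Sender side: compute the hash that travels with the file."""
    return hashlib.sha256(content).hexdigest()


def verify(content: bytes, expected_digest: str) -> bool:
    """Receiver side: recompute the hash and compare it with the one sent."""
    return digest(content) == expected_digest


original = b"A huge text file would go here."
sent_digest = digest(original)

received_ok = original
received_corrupt = b"A huge text file would go hexe."  # one byte changed in transit

print(verify(received_ok, sent_digest))       # True
print(verify(received_corrupt, sent_digest))  # False
```

Note that this check only catches corruption when the hash itself arrives intact. If an attacker can replace both the file and the hash, a plain hash is not enough.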

SHA256 Hash Function

There are multiple standardized hash functions in use, the most notable being SHA256.

You can find libraries that implement SHA256 for virtually every technology, so you never have to write your own SHA256 implementation.

Here's an example of data entered and the SHA256 value:

[Figure: example input text and its SHA256 hash value]
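A comparable example can be reproduced with Python's standard `hashlib` module. The helper name `sha256_of` is an assumption for illustration; the input "abc" is a commonly used test string.

```python
import hashlib


def sha256_of(text: str) -> str:
    """Return the SHA256 hash of a UTF-8 string as hexadecimal text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


for text in ["abc", "abd"]:
    # Each digest is 64 hexadecimal characters, i.e. 32 bytes or 256 bits.
    print(repr(text), "->", sha256_of(text))

# sha256_of("abc") is
# ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

The two inputs differ by a single letter, yet their hash values look unrelated.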

Quick notes on Hash Functions:

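  • For practical purposes, HASH(x) = HASH(y) implies x = y, because finding two different inputs with the same hash is computationally infeasible
  • SHA256 produces a 256-bit (32-byte) hash value, so there are 2^256 possible hash values
  • The same input always produces the same hash value, and different inputs produce different hash values with overwhelming probability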

Public Key Cryptography

Public key cryptography is a cryptographic system used for encrypting and decrypting data. Unlike a hash function, it is not one-way: once encrypted, the data can be decrypted if you have the required key.

You start by generating a special, related pair of keys. These keys can only be generated together, as the output of a single execution of a key generation algorithm.


Key usage:

Either key in the pair can lock, or encrypt, the data. To unlock, or decrypt, it we need the other key from the same pair. No other key makes the data readable.


Quick notes on Public Key Cryptography:

  • If a person possesses only the public key, the person cannot derive the corresponding private key from it.
  • The key difference between hashing data and encrypting it with public key cryptography is that the input can be recovered from the encrypted output, given the corresponding key, whereas a hash cannot be reversed.
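To make the key-pair idea concrete, here is a toy version of RSA, one well-known public key scheme. It is a deliberately insecure sketch: the primes are tiny, there is no padding, and the function names are made up for illustration. Real applications should use a vetted cryptography library.

```python
# Toy RSA with tiny primes. Illustration only: real keys are thousands of
# bits long, and real systems add padding and use audited libraries.


def generate_toy_keypair():
    """Generate both keys together from the same secret primes."""
    p, q = 61, 53                # secret primes (far too small for real use)
    n = p * q                    # modulus shared by both keys
    phi = (p - 1) * (q - 1)
    e = 17                       # public exponent, coprime with phi
    d = pow(e, -1, phi)          # private exponent (needs Python 3.8+)
    return (e, n), (d, n)        # (public key, private key)


def apply_key(key, value: int) -> int:
    """Lock or unlock a number with one key of the pair."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


public_key, private_key = generate_toy_keypair()
message = 65  # the message is a number smaller than the modulus

# Lock with the public key, unlock with the private key.
ciphertext = apply_key(public_key, message)
recovered = apply_key(private_key, ciphertext)
print(recovered == message)  # True

# The reverse direction also works: lock with the private key,
# unlock with the public key. This is the basis of digital signatures.
signature = apply_key(private_key, message)
print(apply_key(public_key, signature) == message)  # True
```

The private exponent is computed from the secret primes, which is why the two keys can only be produced together and why holding the public key alone does not reveal the private one at realistic key sizes.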

The big picture: How is data stored in Blockchain?

In a blockchain system, data is stored in blocks of transactions. The most common definition is as follows:

A timestamped log of transactions that is replicated on peer networks.

Distributed Consensus

Now, some of these peers might be malicious and intentionally report a tampered version of the blockchain data. The network as a whole uses majority agreement to reach consensus on the current state of the data, and any non-conformant or outlier peers are ignored or blocked.

This means that for a blockchain network to be fair and valid, most of the nodes have to be honest. If an attacker controls a majority of the nodes (the so-called 51% attack), the network is compromised. In a large, globally distributed network, this is very difficult to achieve.
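The majority rule can be sketched in a few lines. This is a simplification for illustration, not an actual consensus protocol: each peer is assumed to report a hash of its ledger state, and the peer names, hash strings, and the function `majority_state` are all hypothetical.

```python
from collections import Counter


def majority_state(reported_hashes: dict):
    """Return (agreed_hash, outlier_peers).

    agreed_hash is None when no state is reported by a strict majority,
    in which case every peer is listed as an outlier.
    """
    if not reported_hashes:
        return None, []
    counts = Counter(reported_hashes.values())
    state, votes = counts.most_common(1)[0]
    if votes * 2 <= len(reported_hashes):
        return None, sorted(reported_hashes)
    outliers = sorted(p for p, h in reported_hashes.items() if h != state)
    return state, outliers


reports = {
    "peer1": "abc123",
    "peer2": "abc123",
    "peer3": "abc123",
    "peer4": "ffff00",  # tampered or corrupted copy
}
state, outliers = majority_state(reports)
print(state, outliers)  # abc123 ['peer4']
```

If half or more of the peers reported the tampered hash, no honest majority would exist, which is the situation the 51% figure describes.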

Data Storage in Blocks

Let’s see how blockchain stores data in blocks. Each block stores a data blob (usually a list of transactions), its block number, and the hash of the previous block.

Each block is represented by a JSON object here for better understanding:

[Figure: a chain of blocks shown as JSON objects, each with a block number, a data blob, and a prevHash field]

Let’s look at the previous block hash, referenced as “prevHash” in the above diagram.

When a new block is created, the hash of the previous block is calculated and stored in the new block. If the previous block is altered later, the next block becomes invalid, because the prevHash stored in it no longer matches the actual hash of the previous block.

Using one-way hash functions, the data in the blockchain is safeguarded from tampering.
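The prevHash mechanism can be sketched as follows. This is a minimal illustration, not how any particular platform stores blocks: the field names `blockNumber` and `data`, the all-zero hash for the first block, and the sample transactions are assumptions, while `prevHash` follows the naming used above.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full contents (serialized with sorted keys)."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, data: list) -> None:
    """Create a new block that stores the hash of the current last block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"blockNumber": len(chain), "data": data, "prevHash": prev_hash})


def is_valid(chain: list) -> bool:
    """Check that every block's prevHash matches the block before it."""
    return all(
        chain[i]["prevHash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["Alice pays Bob 5"])
append_block(chain, ["Bob pays Carol 2"])
append_block(chain, ["Carol pays Dave 1"])
print(is_valid(chain))  # True

# Tamper with a block in the middle of the chain.
chain[1]["data"] = ["Bob pays Carol 200"]
print(is_valid(chain))  # False: block 2's prevHash no longer matches
```

Because each prevHash covers the previous block's own prevHash, hiding the change would require recomputing every later block as well, on every peer that holds a copy.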

You have seen how we can store data in a structure called blockchain. In a distributed blockchain network:

Multiple peers each run a process (peer.exe) that maintains a ledger on their local storage.

This process connects to process instances running on other machines to receive updates on new blocks, transactions, health and validity checks, etc., to keep itself up to date.

A transaction/block can be appended by any peer and is then broadcast to all peers.

Since multiple peers are adding transactions/blocks simultaneously, the consensus protocol along with the ledger implementation ensures “validity” and ordering of transactions in blocks forming the blockchain.
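As a rough illustration of why a shared ordering matters, the sketch below collects transactions submitted by several peers, puts them in one agreed order, and cuts them into blocks that every peer then stores. It is a stand-in only: sorting by timestamp is an assumption made for simplicity, real consensus protocols are considerably more involved, and all names here are hypothetical.

```python
def order_into_blocks(pending: list, block_size: int) -> list:
    """Put pending transactions in one agreed order and batch them into blocks."""
    # Sort by timestamp, using the peer name to break ties deterministically.
    ordered = sorted(pending, key=lambda tx: (tx["timestamp"], tx["peer"]))
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


# Transactions arrive from different peers in arbitrary order.
pending = [
    {"peer": "peer2", "timestamp": 3, "tx": "C"},
    {"peer": "peer1", "timestamp": 1, "tx": "A"},
    {"peer": "peer3", "timestamp": 2, "tx": "B"},
]

blocks = order_into_blocks(pending, block_size=2)

# Every peer receives the same blocks in the same order.
ledgers = {name: [list(block) for block in blocks]
           for name in ("peer1", "peer2", "peer3")}

print([[tx["tx"] for tx in block] for block in blocks])  # [['A', 'B'], ['C']]
print(ledgers["peer1"] == ledgers["peer2"] == ledgers["peer3"])  # True
```

Since all peers apply the same blocks in the same order, their ledgers end up identical even though the transactions originated on different machines.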

The different Blockchain platforms

There are many platforms available to professional blockchain developers, so let’s explore the two most notable.

Ethereum

At its simplest, Ethereum is an open software platform based on blockchain technology that enables developers to build and deploy decentralized applications, or dapps. Applications run as smart contracts on the Ethereum Virtual Machine, which lets many different applications take advantage of decentralization and makes it usable for mass consumption.

Ethereum is a generic platform with a smart contracts engine, so you can apply it almost anywhere. However, because it is permissionless and fully transparent, it comes at a cost in privacy and scalability.

Hyperledger Fabric

Hyperledger Fabric targets enterprise blockchain applications, with an emphasis on security. Compared to Ethereum, privacy and scalability are less of an issue because the network is permissioned.

Hyperledger Fabric is designed to provide a framework with which enterprises can put together their own individual blockchain networks that can scale and are more secure.

Most popular Blockchain programming languages

Blockchain developers have a slew of languages that are available to them, so let’s explore a few.

Solidity

If you’re looking to specialize in Ethereum, then you want to become familiar with Solidity. This programming language is specifically used to write smart contracts. It is object-oriented, statically typed, and was designed around the ECMAScript syntax to make it more familiar for web developers.

Go

Go is well suited to blockchain development, especially for working with Hyperledger Fabric, which is itself written in Go. This statically typed, compiled language has the performance a blockchain coding language needs. It’s one of the best programming languages for blockchain when it comes to developing a system that is both efficient and fast. To get started with Go, you can visit An Introduction to Programming in Go.

Python

What makes Python one of the best blockchain coding languages is its open source support. You can find third-party Python plugins and libraries for almost every problem you encounter when developing your blockchain project. Here’s a great free Python course that’ll help you learn Python from scratch.

C++

C++ is a great blockchain programming language for reasons such as its precise control over memory, advanced multi-threading capabilities, and core object-oriented features. The object-oriented feature of this blockchain coding language gives developers the ability to bind the data and the methods together, just like how blockchain binds blocks with chains. Learn C++ for free today and get started with Blockchain development.

JavaScript

Thanks to Node.js, developers can now build blockchain applications with JavaScript.

Why choose JS over the other languages?

Well for one, it’s already running on most systems. That means you do not need to worry about integration and can concentrate exclusively on the application logic. Learn to code in JavaScript today.

Introduction to Hyperledger Fabric

Hyperledger Fabric is a “blockchain platform for the enterprise” that was created by IBM and is now hosted by the Linux Foundation. It is open source and modular, allowing different modules to be used plug-and-play style, which lets it meet a wide variety of enterprise requirements. It is designed to provide the speed and scalability that public chains lack because of their proof-of-work requirement, which is essentially nonce mining.

Hyperledger Fabric is ideal for building a permissioned, private blockchain business network. Private here means that the network is not open for just anyone to run a peer or transact on it. For enterprises, this is a big requirement, and Hyperledger Fabric meets it. Enterprises need more control over their data access policies. They also need a permissioned network so they can implement access control according to their own requirements.

At a high level, this is how a Hyperledger Fabric network works. The permission issuer issues or revokes permissions for all participants and infrastructure components of the network. This access control in Fabric is based on X.509 public key infrastructure (PKI), which means there is a trusted certificate authority that issues certificates to all participants.

Smart contracts hold logic that defines who can change what on the ledger. And participants write transactions on the ledger by invoking smart contracts.
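The idea that a smart contract decides who can change what can be sketched in plain Python. This is not the Fabric API: Fabric chaincode is written in languages such as Go, JavaScript, or Java, and the invoker's identity comes from its certificate. The class `ToyLedger`, the roles, and the `WRITE_POLICY` table below are all hypothetical.

```python
class ToyLedger:
    """In-memory stand-in for a ledger: current state plus an append-only log."""

    def __init__(self):
        self.state = {}
        self.log = []


# Hypothetical policy: which roles may write keys with a given prefix.
WRITE_POLICY = {
    "shipment:": {"shipper"},
    "payment:": {"bank"},
}


def update_asset(ledger: ToyLedger, invoker_role: str, key: str, value: str) -> None:
    """Toy 'smart contract': check the policy, then record the transaction."""
    allowed = any(
        key.startswith(prefix) and invoker_role in roles
        for prefix, roles in WRITE_POLICY.items()
    )
    if not allowed:
        raise PermissionError(f"role {invoker_role!r} may not write {key!r}")
    ledger.state[key] = value
    ledger.log.append((invoker_role, key, value))


ledger = ToyLedger()
update_asset(ledger, "shipper", "shipment:42", "in transit")  # accepted

try:
    update_asset(ledger, "shipper", "payment:42", "paid")     # rejected
except PermissionError as err:
    print(err)

print(ledger.state)  # {'shipment:42': 'in transit'}
```

Participants never write to the ledger directly. Every change goes through the contract function, so the access rules are enforced in one place.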

Why specialize in Hyperledger Fabric

By understanding and working with Hyperledger principles, you’ll have the opportunity to contribute to the Hyperledger Fabric project, which is hosted by the Linux Foundation. It’s a global collaboration that spans industries, so your knowledge and expertise could have a considerable impact, and as more industries make the move to blockchain, this project will only become more relevant. We’re already seeing companies like Walmart, McDonalds, Nestle, and Dole implement this technology.

The Linux Foundation is striving to standardize the process with this project and the opportunities to get involved are both exciting and abundant. Distributed ledger technology is going to be one of the biggest transformations of the digital world. Business transactions, the internet of things, and our resources could be safer and more secure using blockchain solutions.

Getting started with Blockchain development

While there was a lot covered in this post, there is still much more to learn and get your hands on.

In this Blockchain course, Hands-on Blockchain with Hyperledger Fabric, you’ll build upon the concepts mentioned in this post. You’ll also get to deploy your own blockchain network, deploy chaincode on it, and create an application that invokes your chaincode running in a Fabric network. You will also learn to manage Fabric user identities in your application using wallets.

This course is a great starting point for engineers looking to develop expertise in blockchain technology with a specialty in Hyperledger Fabric.

Happy learning!
