DEV Community

Sahil Sojitra

Demystifying Public Key Cryptography: Exploring Ethereum, Digital Signatures, and Wallet Management

Ethereum relies on cryptography, a branch of mathematics that is used extensively in computer security. Cryptography is more than just secret writing (encryption). It can also prove knowledge of a secret without revealing it and show that data is genuine. These tools are essential for Ethereum and other blockchain systems, and they are widely used in Ethereum applications.

Cryptography provides mathematical tools called cryptographic proofs. These tools allow us to show we know a secret without telling anyone what it is. For example, a digital signature can prove we own a secret key without showing the key itself. Digital fingerprints, called hashes, help us check if data is real and hasn't been tampered with.

Contrary to what you might think, Ethereum doesn't use encryption right now. This means that all communications and transactions in Ethereum are readable by everyone. It's done this way to make sure everyone can verify what's happening and reach an agreement. However, in the future, more advanced cryptographic tools may be used to encrypt some calculations while still keeping everyone on the same page.

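Public key cryptography (PKC) is crucial in Ethereum for controlling who owns funds. It uses a pair of keys: a private key for creating digital signatures and a public key for verifying them. In general-purpose PKC a key pair can also be used for encryption and decryption, but Ethereum uses it only for signing. We'll go over the basics of PKC and see how it establishes ownership in Ethereum.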

With these foundations in place, the rest of this post looks at keys, addresses, hash functions, and address encodings. In the next post, we'll dive into wallets and see how they help manage keys effectively. This way, we can fully embrace Ethereum's decentralized world.

Keys and Addresses

In Ethereum, there are two types of accounts: externally owned accounts (EOAs) and contracts. EOAs are owned by individuals and are linked to digital private keys, Ethereum addresses, and digital signatures. Private keys are essential for interacting with Ethereum, as they uniquely determine Ethereum addresses. It's important to note that private keys should be kept private and never shared or stored on the Ethereum network.

Accessing and Controlling Funds

Digital signatures, created using private keys, play a crucial role in Ethereum transactions. A valid digital signature is required for a transaction to be included in the blockchain. Control over an account and its funds is granted to anyone who possesses the private key. Digital signatures in Ethereum transactions prove ownership of funds by demonstrating ownership of the private key.

Public and Private Keys

In Ethereum, keys come in pairs: a private (secret) key and a public key. The public key is similar to a bank account number, while the private key acts as a secret PIN, providing control over the account. Ethereum users rarely encounter private keys directly, as they are typically stored in encrypted files managed by Ethereum wallet software.

Ethereum Addresses

In Ethereum transactions, the recipient is represented by an Ethereum address, similar to the beneficiary account details in a bank transfer. Ethereum addresses for EOAs are generated from the public key portion of a key pair. It's worth noting that not all Ethereum addresses represent public-private key pairs; some can represent contracts, which don't have private keys.

Public Key Cryptography and Cryptocurrency

Public key cryptography (also called "asymmetric cryptography") is a core part of modern-day information security. The key exchange protocol, first published in the 1970s by Martin Hellman, Whitfield Diffie, and Ralph Merkle, was a monumental breakthrough that incited the first big wave of public interest in the field of cryptography.

The Power of Mathematical Functions

Public key cryptography relies on mathematical functions with unique properties. These functions are easy to compute, but their inverse calculation is extremely challenging. For example, multiplying two large prime numbers together is simple, but finding the prime factors of the product (prime factorization) is extremely difficult.

Trapdoor Functions and Secrets

Certain mathematical functions can be easily reversed if you possess secret information. Suppose you are given the number 8,018,009 and told that it is the product of two primes. Finding those primes by hand takes a lot of trial and error. But if I tell you that one of the prime factors is 2,003, you can trivially find the other one with a simple division: 8,018,009 ÷ 2,003 = 4,003. Such functions are often called trapdoor functions because they are very difficult to invert unless you are given a piece of secret information that can be used as a shortcut to reverse the function.
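The factoring example can be tried out in a few lines of Python. This is only a sketch: the helper name `smallest_prime_factor` is made up for illustration, and trial division is the most naive search method, not what a real attacker would use. The numbers are so small that both directions finish instantly; the gap only becomes dramatic with primes that are hundreds of digits long.

```python
def smallest_prime_factor(n: int) -> int:
    """Find the smallest prime factor of n by trial division (the slow direction)."""
    if n % 2 == 0:
        return 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 2
    return n  # no divisor found, so n itself is prime


product = 8_018_009

# Without the secret: try candidate divisors one at a time.
p = smallest_prime_factor(product)
q = product // p

# With the secret (one factor is 2,003): a single division is enough.
other = product // 2_003

print(p, q, other)  # 2003 4003 4003
```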

Elliptic Curve Cryptography

Elliptic curve arithmetic is a special kind of math used in cryptography. It works with points on an elliptic curve, using arithmetic modulo a prime number. Multiplying a point by a number is easy, but the inverse (working out which number was used, the equivalent of division) is practically impossible. This is called the discrete logarithm problem, and no efficient way to solve it on today's computers is known. Because of this, elliptic curve cryptography is used a lot in modern computer systems. It's the basis for things like private keys and digital signatures in cryptocurrencies like Ethereum.

Public and Private Key Pair in Ethereum

In Ethereum, a public-private key pair is generated using public key cryptography. The private key is kept secret and controls access to an Ethereum account. The public key is derived from the private key, and the account's address is in turn derived from the public key. Together they allow for secure transactions and interactions with smart contracts.

The Role of Digital Signatures

A digital signature is like a special code used to sign messages. In Ethereum transactions, the transaction details are used as the message. Elliptic curve cryptography, a mathematical method, combines the transaction details with a private key to create a unique code called the digital signature. When sending a transaction on Ethereum to access an account or perform actions with smart contracts, it must be accompanied by a digital signature created using the private key associated with the Ethereum address.

Verification and Trust

The great thing about elliptic curve mathematics is that anyone can verify the validity of a transaction by checking if the digital signature matches the transaction details and Ethereum address. The verification process doesn't require knowing the private key; it remains private. However, the verification process ensures that the transaction could only come from someone who possesses the private key corresponding to the public key behind the Ethereum address. This is the "magic" of public key cryptography.
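To make the sign-then-verify idea concrete, here is a minimal ECDSA sketch over secp256k1, the curve Ethereum uses. It is an illustration under several assumptions, not how Ethereum clients are implemented: the function names are invented, SHA-256 from the standard library stands in for Keccak-256 as the message hash, the message is a plain byte string rather than an encoded transaction, and the recovery value that Ethereum signatures also carry is left out. It needs Python 3.8 or later for `pow(x, -1, m)`.

```python
import hashlib
import secrets

# secp256k1 domain parameters: the curve y^2 = x^3 + 7 over the field of integers mod P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # group order
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def multiply(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(message: bytes) -> int:
    # Stand-in hash (assumption): Ethereum hashes with Keccak-256 instead.
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(private_key: int, message: bytes):
    """Return an (r, s) signature; only the private key holder can do this."""
    z = digest(message)
    while True:
        nonce = secrets.randbelow(N - 1) + 1
        r = multiply(nonce, G)[0] % N
        s = pow(nonce, -1, N) * (z + r * private_key) % N
        if r and s:
            return r, s


def verify(public_key, message: bytes, signature) -> bool:
    """Check a signature using only the public key."""
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(multiply(digest(message) * w % N, G), multiply(r * w % N, public_key))
    return point is not None and point[0] % N == r


private_key = secrets.randbelow(N - 1) + 1
public_key = multiply(private_key, G)
signature = sign(private_key, b"send 1 ether to Bob")
print(verify(public_key, b"send 1 ether to Bob", signature))
print(verify(public_key, b"send 9 ether to Bob", signature))
```

Note that `verify` never sees `private_key`: it works from the public key, the message, and the signature alone.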

There is no encryption as part of the Ethereum protocol—all messages that are sent as part of the operation of the Ethereum network can (necessarily) be read by everyone. As such, private keys are only used to create digital signatures for transaction authentication.

Private Keys

A private key is a randomly chosen number that holds the utmost importance in Ethereum. It is the foundation of user control over funds associated with their Ethereum address and access to authorized contracts. The private key's primary role is to generate signatures required for spending ether, providing proof of ownership in transactions. It is crucial to keep the private key secret to prevent unauthorized access to the associated funds and contracts. Additionally, backing up and protecting the private key is essential to prevent permanent loss of funds in case of accidental misplacement.

The private key used in Ethereum is essentially a number. A simple method to generate private keys at random is by using a coin, pencil, and paper. By flipping a coin 256 times, you can obtain a sequence of binary digits that can be used as a random private key for an Ethereum wallet. From this private key, the corresponding public key and address can be derived.

Generating a Private Key

The process of generating a private key starts with finding a secure source of randomness, or entropy. The private key is essentially a number picked between 1 and just under 2^256. The method used to pick this number should be unpredictable and non-deterministic. Ethereum software relies on the random number generator of the underlying operating system, which collects entropy from hard-to-predict events; some tools also ask the user to move the mouse or press random keys on the keyboard for a few seconds. Cosmic radiation noise on the computer's microphone channel could also serve as an alternative source of entropy.

More precisely, the private key can be any nonzero number below n, the order of the secp256k1 elliptic curve that Ethereum uses, which is a massive value slightly less than 2^256. To create a private key, a 256-bit number is randomly chosen and checked for validity within this range. Typically, a larger string of random bits collected from a cryptographically secure source is fed into a 256-bit hash algorithm like Keccak-256 or SHA-256. If the resulting number falls within the valid range, it is considered a suitable private key. Otherwise, the process is repeated with another random number.
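The pick-and-check loop can be sketched with Python's `secrets` module, which draws from the operating system's cryptographically secure random number generator. The function name `generate_private_key` is hypothetical, and this simplified version skips the optional hashing step and tests the raw 256 random bits directly. It is meant to show the range check, not to replace a wallet.

```python
import secrets

# Order of the secp256k1 group; valid private keys are 1 .. N - 1.
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def generate_private_key() -> int:
    """Draw 256 random bits from the OS and retry until they fall in range."""
    while True:
        candidate = secrets.randbits(256)
        if 1 <= candidate < N:
            return candidate


key = generate_private_key()
print(f"{key:064x}")  # 64 hex digits; different on every run
```

Because N is so close to 2^256, the retry branch is almost never taken in practice.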

Important Considerations

Generating a private key is an offline process that doesn't require communication with the Ethereum network or anyone else. True randomness is crucial to ensure the uniqueness of the private key. Choosing a number yourself or using a weak random number generator increases the risk of someone else guessing the key and gaining access to your funds. Therefore, the private key should be truly random and unguessable. Since you never need to memorize a private key, you are free to pick it in the best possible way: with true randomness.

Public Key

An Ethereum public key represents a point on an elliptic curve, defined by a set of x and y coordinates that satisfy the equation of the curve.

In simpler terms, an Ethereum public key consists of two numbers that are derived from the private key through a one-way calculation. It is easy to calculate a public key from a private key, but computationally infeasible to calculate the private key from the public key.

Calculation and Irreversibility:

The public key is obtained by multiplying the private key with a constant point called the generator point using elliptic curve multiplication. This process is practically irreversible. The resulting public key (K) is calculated as K = k * G, where k is the private key, G is the generator point, and * represents the elliptic curve "multiplication" operator. It is important to note that this multiplication is different from regular integer arithmetic.

In other words, to obtain the public key, the private key is multiplied by a generator point on the elliptic curve. However, there is no practical division operation, so it is not possible to calculate the private key by simply dividing the public key by the generator point. This one-way mathematical function underpins the security of public key cryptography in Ethereum and other cryptocurrencies.

Elliptic curve multiplication is a "one-way" function used in cryptography. It's easy to perform the multiplication, but infeasible to reverse it through division. By using this function, the private key owner can generate a public key and share it, without worrying about anyone calculating the private key from the public key. This mathematical property enables the creation of secure digital signatures, proving ownership of Ethereum funds and control over contracts.
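As a rough illustration of K = k * G, the sketch below implements point addition and scalar multiplication on secp256k1 in plain Python. The curve constants are the published secp256k1 parameters; the function names and the example private key are made up for the demonstration, and real wallets use hardened, constant-time libraries rather than code like this. It needs Python 3.8 or later for `pow(x, -1, m)`.

```python
# secp256k1: the curve y^2 = x^3 + 7 over the integers modulo the prime P.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # order of G
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a point plus its mirror image
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P  # tangent line
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P  # line through both points
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def multiply(k, point):
    """Compute k * point by repeated doubling and adding."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def on_curve(point) -> bool:
    x, y = point
    return (y * y - x * x * x - 7) % P == 0


k = 0x1234567890ABCDEF  # example private key only; never use a guessable value
K = multiply(k, G)
print(f"x = {K[0]:064x}")
print(f"y = {K[1]:064x}")
print(on_curve(K))
```

Going from `k` to `K` takes a few hundred point operations at most. There is no matching `divide(K, G)` function to write: recovering `k` from `K` is the discrete logarithm problem.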

Elliptic Curve Cryptography Explained

Both Elliptic Curve Cryptography (ECC) and RSA serve the purpose of generating public and private keys for secure communication between parties. However, ECC offers certain advantages over RSA. For example, a 256-bit key in ECC provides similar security to a 3072-bit key in RSA. ECC also requires significantly less storage space and bandwidth compared to RSA, making it suitable for resource-constrained systems like smartphones, embedded computers, and cryptocurrency networks.

The Trapdoor Function in ECC:

ECC is built on a different trapdoor function from RSA: instead of multiplying and factoring large numbers, it relies on hopping between points on an elliptic curve. Visualizing this trapdoor function can be likened to a mathematical game of pool.

[Figure: ECC_ROUNDS]

Here's a simplified explanation of the algorithm:

  1. Start at an arbitrary point on the curve (Point A).
  2. Use the dot function to find a new point (Point B).
  3. Repeat the dot function, hopping around the curve, until reaching the final point (Point E).

This algorithm follows a specific pattern:

  1. A dot B = -C (Line from A to B intersects at -C).
  2. Reflect across the X-axis from -C to C.
  3. A dot C = -D (Line from A to C intersects at -D).
  4. Reflect across the X-axis from -D to D.
  5. A dot D = -E (Line from A to D intersects at -E).
  6. Reflect across the X-axis from -E to E.

Answers to Some ECC Questions

How is the second point determined? If the dot function involves connecting two points with a line, isn't it necessary to have an initial point and another point to begin with?

To find the second point in the elliptic curve arithmetic, we can perform a dot function operation. This operation involves drawing a line between two points. However, in this case, we don't need a separate second point to begin with.

Let's assume the first point is called P. The dot function of P with itself, denoted as P dot P, results in a second point, which we will call -R.

In other words, P dot P = -R.

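What does P dot P represent? With only one point, the line through "both" points becomes the tangent line to the curve at P. The point where that tangent crosses the curve again is -R. This can be visualized in the following diagram:

[Figure: ECC]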

So, by performing the dot function on a point P with itself, we obtain the tangent line at P, which gives us the second point -R.

If the number of hops represents the private key, wouldn't it be possible to iterate through the hops until reaching the endpoint by counting them?

No, it would not be feasible to simply count the hops until reaching the endpoint if the number of hops is extremely large, such as a number close to 2^256. Computing each hop individually, like P dot P dot P dot P..., would take far longer than anyone could wait.

However, if you possess the knowledge of the specific number of hops, there is an alternative technique utilizing an exponentiation trick that enables a much faster determination of the endpoint. This method leverages the properties of elliptic curve operations. For instance, by using the concept of doubling, if you are aware that 2P is equivalent to P dot P, you can calculate 4P as 2P dot 2P. This approach allows for exponentially faster calculations even for exceedingly large values.
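The difference between hopping one step at a time and using the doubling trick can be seen on a toy curve. The sketch below assumes the curve y^2 = x^3 + 7 over the integers modulo 17, which is far too small to be secure and is chosen only so the numbers stay readable. The function names are invented, and `add` performs a full hop (the dot step followed by the reflection). It needs Python 3.8 or later for `pow(x, -1, m)`.

```python
P = 17            # toy field size (assumption for illustration only)
START = (15, 13)  # on the curve: 13^2 and 15^3 + 7 are both 16 modulo 17


def add(a, b):
    """One hop: combine two points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def hop_one_at_a_time(k, point):
    """Reach k * point with k - 1 successive hops."""
    result, steps = point, 0
    for _ in range(k - 1):
        result = add(result, point)
        steps += 1
    return result, steps


def double_and_add(k, point):
    """Reach k * point using the binary digits of k."""
    result, steps = None, 0
    while k:
        if k & 1:
            result = add(result, point)
            steps += 1
        k >>= 1
        if k:
            point = add(point, point)  # 2P, 4P, 8P, ...
            steps += 1
    return result, steps


slow_point, slow_steps = hop_one_at_a_time(1000, START)
fast_point, fast_steps = double_and_add(1000, START)
print(slow_point == fast_point, slow_steps, fast_steps)
```

Both routes land on the same point, but the second needs a number of steps proportional to the number of bits in k rather than to k itself. That shortcut is available only to someone who already knows k.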

Cryptographic Hash Functions

Cryptographic hash functions play a crucial role in Ethereum and various cryptographic systems. They are widely recognized as the workhorses of modern cryptography due to their extensive use in securing data. In this section, we will explore the basic properties of hash functions and understand why they are so indispensable in Ethereum and other secure platforms. Hash functions are essential in transforming Ethereum public keys into addresses and creating digital fingerprints for data verification.

Understanding Hash Functions

In simple terms, a hash function is a special type of function that converts data of any size into a fixed-size output, known as the hash. Cryptographic hash functions, specifically, have specific properties that make them valuable for securing platforms like Ethereum.

Key Properties of Cryptographic Hash Functions

  1. Determinism: A specific input always generates the same hash output.
  2. Verifiability: Computing the hash of a message is efficient.
  3. Noncorrelation: Even a minor change in the input data should result in a drastically different hash output, so that the output cannot be correlated with the original message.
  4. Irreversibility: It is practically impossible to compute the original message from its hash, requiring a brute-force search.
  5. Collision protection: It should be extremely difficult to find two different messages that produce the same hash output. For the hash function Ethereum uses, no collision has been publicly demonstrated.
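Some of these properties are easy to observe directly. The sketch below uses `hashlib.sha3_256` from the Python standard library as a stand-in, since Keccak-256 itself is not part of the standard library; the helper names and the sample messages are made up for illustration.

```python
import hashlib


def fingerprint(data: bytes) -> bytes:
    # Stand-in (assumption): SHA3-256 instead of Ethereum's Keccak-256.
    return hashlib.sha3_256(data).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


first = fingerprint(b"pay Alice 10 ether")
again = fingerprint(b"pay Alice 10 ether")
changed = fingerprint(b"pay Alice 11 ether")

print(first == again)                               # determinism
print(len(first), len(fingerprint(b"x" * 10_000)))  # fixed-size output
print(differing_bits(first, changed))               # noncorrelation
```

For a well-behaved hash function, the last number is typically close to half of the 256 output bits, even though the two inputs differ in a single character.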

Applications and Uses of Cryptographic Hash Functions

Cryptographic hash functions serve various security purposes, including:

  1. Data fingerprinting: Creating unique identifiers for data.
  2. Message integrity: Detecting errors or tampering in messages.
  3. Proof of work: Supporting consensus algorithms, like the mining process Ethereum used before its move to proof of stake.
  4. Authentication: Protecting passwords and enhancing key security.
  5. Pseudorandom number generation: Generating random values for cryptographic operations.
  6. Message commitment: Ensuring commitment and later revealing of messages.
  7. Unique identifiers: Assigning distinct labels to different entities.

Ethereum’s Cryptographic Hash Function: Keccak-256

Ethereum relies on the Keccak-256 cryptographic hash function for various purposes. Keccak was designed as a candidate for the SHA-3 Cryptographic Hash Function Competition that the National Institute of Standards and Technology (NIST) announced in 2007. It was selected as the winner in 2012 and was standardized as Federal Information Processing Standard (FIPS) 202 in 2015.

Keccak-256 and the NIST Controversy

During Ethereum's development, NIST was still in the process of finalizing the SHA-3 standard based on Keccak. However, concerns arose regarding NIST's adjustments to Keccak's parameters because of revelations made by whistleblower Edward Snowden. These revelations suggested that the National Security Agency (NSA) had influenced NIST's Dual_EC_DRBG random-number generator standard, raising doubts about the integrity of NIST's standards in general.

The Ethereum Foundation's Decision

Given the controversy and uncertainty surrounding NIST's modifications, the standardization of SHA-3 experienced delays. Consequently, the Ethereum Foundation made the decision to implement the original Keccak algorithm proposed by its inventors, rather than using the modified SHA-3 standard endorsed by NIST. This choice aimed to ensure the security and reliability of Ethereum's cryptographic operations.

While Ethereum documents and code may mention "SHA-3," they often refer to a specific variant called Keccak-256 instead of the officially standardized FIPS-202 SHA-3. The slight implementation differences mainly involve padding parameters. However, these distinctions are significant because Keccak-256 and FIPS-202 SHA-3 produce different hash outputs for the same input.
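The padding difference can be demonstrated with a compact pure-Python sponge, modeled on the Keccak team's public reference style. This is a slow teaching sketch with invented helper names, limited to 32-byte outputs; production code should use a vetted library. The only thing that separates the two hash functions below is the padding byte: 0x01 for the original Keccak, 0x06 for FIPS 202 SHA-3.

```python
import hashlib

MASK = (1 << 64) - 1


def rol(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK


def keccak_f(lanes):
    """Keccak-f[1600] permutation on a 5x5 grid of 64-bit lanes."""
    lfsr = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ rol(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def sponge(data: bytes, suffix: int) -> bytes:
    """256-bit sponge (rate 136 bytes); suffix selects the padding rule."""
    rate = 136
    state = bytearray(200)

    def permute():
        lanes = [[int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
                  for y in range(5)] for x in range(5)]
        lanes = keccak_f(lanes)
        for x in range(5):
            for y in range(5):
                state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")

    offset = block = 0
    while offset < len(data):
        block = min(len(data) - offset, rate)
        for i in range(block):
            state[i] ^= data[offset + i]
        offset += block
        if block == rate:
            permute()
            block = 0
    state[block] ^= suffix      # the one place the two variants differ
    state[rate - 1] ^= 0x80
    permute()
    return bytes(state[:32])


def keccak256(data: bytes) -> bytes:
    return sponge(data, 0x01)   # original Keccak padding, as used by Ethereum


def sha3_256(data: bytes) -> bytes:
    return sponge(data, 0x06)   # FIPS 202 padding


message = b"hello"
print(keccak256(message).hex())
print(sha3_256(message).hex())
print(sha3_256(message) == hashlib.sha3_256(message).digest())
```

The last line cross-checks the sketch against the standard library's SHA3-256, which makes it a useful sanity test whenever the permutation code is edited.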

Ethereum Addresses

Ethereum addresses are unique identifiers in the form of hexadecimal numbers. They are derived from the public key: take its Keccak-256 hash and keep only the last 20 bytes (the least significant ones).

Difference from Bitcoin Addresses

Unlike Bitcoin addresses, which clients encode with a built-in checksum to guard against typos, Ethereum addresses are presented in user interfaces as raw hexadecimal without any checksum. The reason behind this choice was the expectation that higher layers of the Ethereum system would handle checksums if needed, as addresses would be abstracted through name services.

Challenges Faced

However, the development of these higher layers progressed slower than expected. This led to issues in the early days of Ethereum, including funds lost due to mistyped addresses and input validation errors. Additionally, the adoption of alternative encodings by wallet developers was slow due to the delayed implementation of Ethereum name services.

Inter Exchange Client Address Protocol (ICAP)

The Inter Exchange Client Address Protocol (ICAP) is a method for encoding Ethereum addresses that offers convenience, error-checking, and compatibility with the International Bank Account Number (IBAN) format. ICAP provides a flexible and interoperable way to represent Ethereum addresses and registered names.

IBAN and ICAP

IBAN is a widely used standard for identifying bank account numbers in international wire transfers. ICAP serves as a decentralized alternative to IBAN, specifically designed for Ethereum addresses.

Structure and Components

Similar to IBAN, ICAP uses alphanumeric characters (letters and numbers), with a maximum length of 34 characters. It introduces a special country code, "XE," which stands for "Ethereum." ICAP includes a two-character checksum and three variations of an account identifier.

ICAP Variations:

  1. Direct: This encoding represents the 155 least significant bits of an Ethereum address. Because that is less than the full 160 bits, it only works for addresses that start with one or more zero bytes. Its advantage is that it stays within IBAN's field length and checksum rules. Example: XE60HAMICDXSV5QXVJA7TJW47Q9CHWKJD (33 characters).
  2. Basic: Similar to the Direct encoding, but allows representation of any Ethereum address. However, it is not compatible with IBAN field validation. Example: XE18CHDJBPLTBCJ03FE9O2NS0BPOJVQCU2P (35 characters).
  3. Indirect: This variation encodes an identifier that links to an Ethereum address through a name registry provider. It includes an asset identifier, a name service, and a human-readable name. Example: XE##ETHXREGKITTYCATS (20 characters), where the "##" should be replaced by the computed checksum characters.
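The Direct and Basic variations can be sketched as a base-36 conversion plus the standard IBAN check digits (ISO 7064 mod 97-10). The function names below are hypothetical, the base-36 account part is left unpadded (an assumption, since implementations may pad it to a fixed width), and the Indirect variation is not covered.

```python
import string

ALPHABET = string.digits + string.ascii_uppercase  # base-36 digits 0-9, A-Z


def to_base36(number: int) -> str:
    digits = ""
    while number:
        number, remainder = divmod(number, 36)
        digits = ALPHABET[remainder] + digits
    return digits or "0"


def as_decimal_string(text: str) -> str:
    """IBAN rule: digits stay as they are, letters become 10..35."""
    return "".join(str(int(ch, 36)) for ch in text)


def icap_encode(address_hex: str) -> str:
    """Encode a hex address as XE + two check digits + base-36 account part."""
    account = to_base36(int(address_hex, 16))
    remainder = int(as_decimal_string(account + "XE" + "00")) % 97
    return f"XE{98 - remainder:02d}{account}"


def icap_is_valid(icap: str) -> bool:
    """Move the first four characters to the end and test the mod-97 rule."""
    return int(as_decimal_string(icap[4:] + icap[:4])) % 97 == 1


def icap_decode(icap: str) -> str:
    return f"{int(icap[4:], 36):040x}"


address = "001d3f1ef827552ae1114027bd3ecf1f086ba0f9"
encoded = icap_encode(address)
print(encoded, len(encoded))
print(icap_is_valid(encoded))
print(icap_decode(encoded) == address)
```

An account part of up to 30 base-36 characters corresponds to the Direct variation. A full 160-bit address that does not start with a zero byte needs 31 characters and therefore falls under Basic.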

Hex Encoding with Checksum in Capitalization (EIP-55)

Ethereum Improvement Proposal 55 (EIP-55) offers a solution to enhance the accuracy of Ethereum addresses by introducing a checksum. This proposal addresses the slow implementation of ICAP and name services, providing a simple and effective way to prevent errors when working with Ethereum addresses.

EIP-55 Checksum

To ensure the integrity of Ethereum addresses, EIP-55 modifies the capitalization of certain characters in the address. By changing the capitalization, a checksum is embedded within the address itself. This checksum acts as a safeguard against mistakes during address input or reading.

Implementation and Benefits

Wallets and applications that support EIP-55 can verify the checksum within an Ethereum address. This enables them to detect and alert users about potential errors, improving address accuracy. However, wallets that do not support EIP-55 will still accept and process addresses without considering the checksum.

EIP-55 is quite simple to implement. We take the Keccak-256 hash of the lowercase hexadecimal address. This hash acts as a digital fingerprint of the address, giving us a convenient checksum. Any small change in the input (the address) should cause a big change in the resulting hash (the checksum), allowing us to detect errors effectively. The hash of our address is then encoded in the capitalization of the address itself. Let’s break it down, step by step:

  1. Hash the lowercase address, without the 0x prefix:

Keccak256("001d3f1ef827552ae1114027bd3ecf1f086ba0f9") =
23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9695d9a19d8f673ca991deae1

  2. Capitalize each alphabetic address character if the corresponding hex digit of the hash is greater than or equal to 0x8. This is easier to show if we line up the address and the hash:

Address: 001d3f1ef827552ae1114027bd3ecf1f086ba0f9
Hash   : 23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9...

In our address, there is a letter 'd' in the fourth position. We compare it to the fourth character of the hash, which is the number 6. Since 6 is less than 8, we keep the 'd' in lowercase. Moving on, the sixth character in our address is the letter 'f'. We compare it to the sixth character of the hash, which is the hex digit 'c' (12 in decimal). Since 12 is greater than 8, we change the 'f' in our address to uppercase 'F'. We continue this process for every letter in the address; digits are left as they are. It's important to note that we only use the first 20 bytes (40 hex characters) of the hash as a checksum, because the address has only 20 bytes (40 hex characters) whose capitalization can carry it.

Check the resulting mixed-capitals address yourself and see if you can tell which characters were capitalized and which characters they correspond to in the address hash:

Address: 001d3F1ef827552Ae1114027BD3ECF1f086bA0F9
Hash   : 23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9...
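
The capitalization step is short enough to write out. The sketch below uses a made-up function name and takes the hash as an input, copied from the walkthrough above, instead of computing it, because Keccak-256 is not available in Python's standard library.

```python
def to_checksum_case(address: str, hash_hex: str) -> str:
    """Uppercase each hex letter whose matching hash digit is 0x8 or higher."""
    return "".join(
        ch.upper() if ch.isalpha() and int(digit, 16) >= 8 else ch
        for ch, digit in zip(address.lower(), hash_hex)
    )


address = "001d3f1ef827552ae1114027bd3ecf1f086ba0f9"
# First 40 hex digits of the Keccak-256 hash quoted in the walkthrough.
address_hash = "23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9"

print(to_checksum_case(address, address_hash))
# 001d3F1ef827552Ae1114027BD3ECF1f086bA0F9
```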

Detecting an error in an EIP-55 encoded address

Now, let’s look at how EIP-55 addresses will help us find an error. Let’s assume we have printed out an Ethereum address, which is EIP-55 encoded:

0x001d3F1ef827552Ae1114027BD3ECF1f086bA0F9

Now let’s make a basic mistake in reading that address. The character before the last one is a capital F. For this example let’s assume we misread that as a capital E, and we type the following (incorrect) address into our wallet:

0x001d3F1ef827552Ae1114027BD3ECF1f086bA0E9

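A wallet that supports EIP-55 notices the mixed capitalization and tries to validate the address. It converts the address to lowercase and computes the Keccak-256 hash, which now begins with 5429b5d9460122fb4b11af9cb88b7bb76d892886. Even though the address has only changed by one character (in fact, only one bit, as e and f are one bit apart), the hash of the address has changed radically. That’s the property of hash functions that makes them so useful for checksums!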

Now, let’s line up the two and check the capitalization:

001d3F1ef827552Ae1114027BD3ECF1f086bA0E9
5429b5d9460122fb4b11af9cb88b7bb76d892886...

It’s all wrong! Several of the alphabetic characters are incorrectly capitalized. Remember that the capitalization is the encoding of the correct checksum.

The capitalization of the address we input doesn’t match the checksum just calculated, meaning something has changed in the address, and an error has been introduced.
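
The check a wallet performs can be sketched as follows. The function name is invented, and both hash prefixes are copied from the examples above instead of being computed, since Keccak-256 is not in Python's standard library.

```python
def capitalization_matches(address: str, hash_hex: str) -> bool:
    """Check every hex letter's case against the matching hash digit."""
    for ch, digit in zip(address, hash_hex):
        if ch.isalpha() and ch.isupper() != (int(digit, 16) >= 8):
            return False
    return True


# Each pair: the address as typed, and the hash of its lowercase form.
good = ("001d3F1ef827552Ae1114027BD3ECF1f086bA0F9",
        "23a69c1653e4ebbb619b0b2cb8a9bad49892a8b9")
typo = ("001d3F1ef827552Ae1114027BD3ECF1f086bA0E9",
        "5429b5d9460122fb4b11af9cb88b7bb76d892886")

print(capitalization_matches(*good))  # True
print(capitalization_matches(*typo))  # False
```

An address written entirely in lowercase would also fail this strict check. Wallets usually treat such addresses as carrying no checksum at all and do not reject them.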
