What is BIP39? And Why Should You Care?
Last Updated on 12. August 2024 by Martin Schuster
The Bitcoin Improvement Proposal 39 (BIP39) introduced the so-called mnemonic code, also known as a mnemonic phrase. A mnemonic code is a group of easy-to-remember words used to generate deterministic wallets. With these words, it is possible to regenerate the private and public keys of a Bitcoin wallet. It is therefore important for the user to keep the mnemonic phrase secret from everybody else. This article describes how mnemonic creation works and how the seed for the wallet keys is derived from it.
The process of generating a wallet with the BIP39 method is split into two parts.
- First, generating the mnemonic phrase.
- Second, converting this phrase into a binary seed. The seed is then used to generate the wallet details.
BIP39 was introduced because exchanging these phrases with another person or another wallet service is easier and less error-prone than exchanging a binary or hexadecimal seed. The goal was to find a way to transport computer-generated randomness in a human-readable transcription.
Part 1: Generating a mnemonic phrase
The first step is to generate the entropy, a sequence of random bits (0s and 1s). BIP39 mnemonic phrases are 12, 15, 18, 21, or 24 words long, which corresponds to an entropy length of at least 128 and at most 256 bits that must be a multiple of 32. The next step is to generate a checksum: the entropy is hashed with SHA256, and the first ENT / 32 bits of the hash are taken, where ENT is the entropy length in bits. This checksum is appended to the end of the initial entropy. The resulting sequence of bits is then split into groups of 11 bits, and each group selects one word, so the number of groups is the number of words in the mnemonic phrase. The words are picked from a wordlist of 2048 words because 11 bits encode a decimal range from 0 to 2047. The table below shows the mnemonic phrase length for each allowed entropy length.
Entropy (bits) | Checksum (bits) | Entropy + Checksum (bits) | Mnemonic phrase (words) |
128 | 4 | 132 | 12 |
160 | 5 | 165 | 15 |
192 | 6 | 198 | 18 |
224 | 7 | 231 | 21 |
256 | 8 | 264 | 24 |
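The arithmetic behind the table can be reproduced in a few lines. The following is a minimal sketch; the function name `mnemonic_layout` is chosen here for illustration and is not part of BIP39.

```python
def mnemonic_layout(entropy_bits: int) -> tuple[int, int, int]:
    """Return (checksum_bits, total_bits, word_count) for an entropy length."""
    if entropy_bits % 32 != 0 or not 128 <= entropy_bits <= 256:
        raise ValueError("entropy must be 128 to 256 bits and a multiple of 32")
    checksum_bits = entropy_bits // 32          # ENT / 32
    total_bits = entropy_bits + checksum_bits   # always a multiple of 11
    return checksum_bits, total_bits, total_bits // 11


# Print one row per allowed entropy length, in the same order as the table
for ent in range(128, 257, 32):
    print(ent, *mnemonic_layout(ent))
```

Because the checksum grows by one bit for every 32 bits of entropy, the total always lands on a multiple of 11, which is why no padding is needed.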
Wordlist
A word list in BIP39 describes a list of 2048 carefully curated words that obey the following rules.
- Smart selection of words: The first four letters are enough to identify a word unambiguously.
- Similar words avoided: Pairs of words like “quick” and “quickly” or “build” and “built” are avoided because they make remembering the sentence more difficult and more error-prone.
- Sorted wordlist: For an efficient lookup of the words, the wordlist is sorted.
Most wallets only support the English wordlist, so it is recommended to use this one. Wordlists for other languages may contain native characters as long as they are encoded in UTF-8 using Normalization Form Compatibility Decomposition (NFKD).
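The wordlist rules above can be checked and exploited in code. The sketch below uses a five-word sample in place of the full 2048-word list; the helper names `has_unique_prefixes` and `index_of` are made up for this illustration.

```python
import bisect
import unicodedata

# Small sample standing in for the full 2048-word list (assumption for brevity)
SAMPLE_WORDS = ["abandon", "ability", "able", "about", "above"]


def has_unique_prefixes(words: list[str], length: int = 4) -> bool:
    """True if no two words share the same first `length` letters."""
    prefixes = [w[:length] for w in words]
    return len(set(prefixes)) == len(prefixes)


def index_of(words: list[str], word: str) -> int:
    """Binary search in a sorted wordlist; raises ValueError if absent."""
    word = unicodedata.normalize("NFKD", word)
    i = bisect.bisect_left(words, word)
    if i == len(words) or words[i] != word:
        raise ValueError(f"{word!r} is not in the wordlist")
    return i


print(has_unique_prefixes(SAMPLE_WORDS))       # True
print(has_unique_prefixes(["build", "built"])) # False, both start with "buil"
print(index_of(SAMPLE_WORDS, "about"))         # 3
```

The binary search only works because the list is sorted, which is the practical benefit of the third rule.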
Example: creating a mnemonic phrase
To better understand what has been explained above, we’re going to create a 12-word mnemonic phrase ourselves. First, we need 128 bits of entropy. You can write down 128 randomly chosen ones and zeroes, toss a coin 128 times, or use a tool to generate the entropy.
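If you prefer to generate the entropy programmatically, one possible approach is Python's `secrets` module, which draws from the operating system's cryptographically strong random source. This is a sketch, and `generate_entropy` is a name chosen for this example.

```python
import secrets


def generate_entropy(bits: int = 128) -> bytes:
    """Return `bits` bits of cryptographically strong random data."""
    if bits % 32 != 0 or not 128 <= bits <= 256:
        raise ValueError("entropy must be 128 to 256 bits and a multiple of 32")
    return secrets.token_bytes(bits // 8)


entropy = generate_entropy(128)
# Show the entropy as ones and zeroes, zero-padded to the full 128 bits
print(format(int.from_bytes(entropy, "big"), "0128b"))
```

The output differs on every run, so it will not match the example entropy used in the rest of this walkthrough.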
Example Entropy:
10101000100010101101010101001010101111010000001111111001010000111110100101011010101010101010101011111000111000011010111000111001
Here, we have 128 bits of entropy. Dividing 128 by 32 gives the number of checksum bits, which is 4, so we need the first 4 bits of the SHA256 hash of our entropy. The hash must be computed over the raw entropy bytes, not over the text string of ones and zeroes. To avoid errors, we recommend converting the binary entropy into its hexadecimal equivalent first. You are welcome to use our number system converter for this. After the conversion, please consider using our Hash calculator tool for the SHA256 hash. Make sure SHA256 is selected as the method and hexadecimal is selected as the encoding method.
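The same conversion and hashing can be done without online tools. The sketch below assumes Python's standard `hashlib` module; the variable names are illustrative.

```python
import hashlib

ENTROPY_BITS = (
    "10101000100010101101010101001010101111010000001111111001010000111110100101"
    "011010101010101010101011111000111000011010111000111001"
)

# Parse the bit string as a number, then store it in exactly 16 bytes so that
# any leading zero bits are preserved
entropy = int(ENTROPY_BITS, 2).to_bytes(len(ENTROPY_BITS) // 8, "big")
print(entropy.hex())  # a88ad54abd03f943e95aaaaaf8e1ae39

# Hash the raw bytes, not the text string of ones and zeroes
digest = hashlib.sha256(entropy).hexdigest()
print(digest)
```

Hashing the ASCII text of the bit string instead of the bytes is a common mistake and yields a different digest.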
Example SHA256 Hash:
252c48d5d7c566c35b79b255ad4f186cfed7315ee8abc532e851405ca85d9c6f
SHA256 hashes are typically written in hexadecimal, but we need the first 4 bits. Since every hexadecimal character represents 4 bits, we just need to take the first character and convert it to binary.
First character of the SHA256 in Binary:
0010
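For entropy lengths other than 128 bits, the checksum is no longer exactly one hexadecimal character, so it helps to work on the bit level. The following sketch generalizes the step; `checksum_bits` is a hypothetical helper name, and the hex string is the example entropy converted to hexadecimal.

```python
import hashlib


def checksum_bits(entropy: bytes) -> str:
    """Return the first ENT / 32 bits of SHA256(entropy) as a bit string."""
    n = len(entropy) * 8 // 32
    digest = hashlib.sha256(entropy).digest()
    # Expand the 32-byte digest to 256 bits and keep only the first n
    all_bits = format(int.from_bytes(digest, "big"), "0256b")
    return all_bits[:n]


entropy = bytes.fromhex("a88ad54abd03f943e95aaaaaf8e1ae39")
print(checksum_bits(entropy))
```

For 128 bits of entropy the result is four characters long, which is the binary form of the first hexadecimal character of the hash.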
Now we can append these 4 bits to the end of our initial entropy.
Entropy with checksum:
101010001000101011010101010010101011110100000011111110010100001111101001010110101010101010101010111110001110000110101110001110010010
The next step is to split these 132 bits into groups of 11 bits each. The number of groups gives the length of our mnemonic phrase, and the value of each group is the index of a word in the list. In this example, we use this English wordlist. Be careful when converting the 11-bit values to decimal: the numbers run from 0 to 2047, so if your copy of the wordlist is numbered starting at 1 (for example, by line number), you have to add one to find the correct word.
# | Bitgroup | Number (0 – 2047) | Word |
1 | 10101000100 | 1348 | possible |
2 | 01010110101 | 693 | find |
3 | 01010010101 | 661 | famous |
4 | 01111010000 | 976 | key |
5 | 00111111100 | 508 | display |
6 | 10100001111 | 1295 | peanut |
7 | 10100101011 | 1323 | pistol |
8 | 01010101010 | 682 | fetch |
9 | 10101010111 | 1367 | priority |
10 | 11000111000 | 1592 | shove |
11 | 01101011100 | 860 | high |
12 | 01110010010 | 914 | inch |
Mnemonic phrase:
possible find famous key display peanut pistol fetch priority shove high inch
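The split into 11-bit groups is easy to script. The sketch below reproduces the index column of the table; `to_indices` and `to_words` are names chosen for this example, and loading the actual 2048-word list from a file is left out.

```python
ENTROPY_BITS = (
    "10101000100010101101010101001010101111010000001111111001010000111110100101"
    "011010101010101010101011111000111000011010111000111001"
)
CHECKSUM_BITS = "0010"  # taken from the example above


def to_indices(bits: str) -> list[int]:
    """Split a bit string into 11-bit groups and convert each to an integer."""
    if len(bits) % 11 != 0:
        raise ValueError("bit length must be a multiple of 11")
    return [int(bits[i:i + 11], 2) for i in range(0, len(bits), 11)]


def to_words(indices: list[int], wordlist: list[str]) -> str:
    """Look up each 0-based index in a 2048-entry wordlist."""
    if len(wordlist) != 2048:
        raise ValueError("wordlist must contain exactly 2048 words")
    return " ".join(wordlist[i] for i in indices)


indices = to_indices(ENTROPY_BITS + CHECKSUM_BITS)
print(indices)
# → [1348, 693, 661, 976, 508, 1295, 1323, 682, 1367, 1592, 860, 914]
```

Since Python lists are 0-based, the indices can be used directly with a wordlist read into a list, with no need to add one.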
We now have our mnemonic phrase, which can be used to create our Bitcoin wallet details. For this, we need to transform these words into a seed, which is explained in the next section of this article.
Part 2: Transforming mnemonic phrase to seed
We use the Password-Based Key Derivation Function 2 (PBKDF2) to generate the seed from the mnemonic phrase. PBKDF2 is a standardized function for deriving a key from a password. The mnemonic phrase (UTF-8 encoded, in NFKD form) is the password in this case. The salt is the string “mnemonic” followed by an optional passphrase that the user may choose; if no passphrase is given, the empty string is used. BIP39 sets the iteration count to 2048 and uses HMAC-SHA512 as the pseudo-random function. Simply speaking, the mnemonic is hashed repeatedly, 2048 times, with “mnemonic” + passphrase as the salt. The derived seed is 512 bits (64 bytes) long. This seed can then be used with a method like BIP32 to generate a deterministic wallet.
Seed for our example:
884d90448cf16e09a6285ba44cf0b2bb0e8b517cf704bd97282040cabe32b3148dcb10fe969ea3d8f87640ad7212cb852b70fd026fe6f60a5b3d1df2e8242469
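The derivation maps directly onto `hashlib.pbkdf2_hmac` from Python's standard library. The following is a minimal sketch, assuming an empty passphrase for the example phrase; `mnemonic_to_seed` is a name chosen here for illustration.

```python
import hashlib
import unicodedata


def mnemonic_to_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """Derive the 64-byte seed from a mnemonic and an optional passphrase."""
    # Both inputs are NFKD-normalized and UTF-8 encoded
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    # 2048 iterations of HMAC-SHA512, 64 bytes (512 bits) of output
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


phrase = "possible find famous key display peanut pistol fetch priority shove high inch"
print(mnemonic_to_seed(phrase).hex())
```

Note that the seed derivation does not validate the checksum or the wordlist; any string is accepted as a mnemonic, and a different passphrase produces a completely different seed.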
This concludes our explanation of BIP39. We also have a convenient tool if you want to experiment with this a bit further.
Deterministic Wallet Generator
Disclaimer: The addresses are for demonstration purposes only. For production use, it is recommended to use more secure generation algorithms.
Source: https://github.com/bitcoin/bips/blob/master/bip-0039.mediawiki