Mastering Monero: Notes

2020-08-13

These are notes on Mastering Monero. They are by no means exhaustive. I have skipped portions here and there based on my prior knowledge and what I wanted to record. The book itself isn't long, so any portion that I have skipped here can quickly be referenced in the source.

Introduction to Cryptocurrencies and Monero

In traditional banking there is no actual movement of assets. Two banks simply edit their databases to reflect that funds have been transferred.

• This system requires trusting banks to keep honest ledgers, to actually redeem the IOUs held by users, and to process only legitimate txns.

• Funds can take days to settle and the entire process is opaque.

Blockchains are a technology allowing networks to establish decentralized consensus. Sharing a ledger makes it possible to build currencies on top of a blockchain.

• This is a trustless, decentralized, and immutable technology.

• Traditional banking requires multiple txns, separate ledgers, and trust in more than one bank just for a single txn. Bitcoin on the other hand uses a single ledger that allows for simplicity, no third-party risks, and pseudonymity.

• People can now store their wealth or transfer money without having to trust external institutions.

But pseudonymity is not ideal. When you receive a crypto payment, you don't learn the sender's name and only see their address. However, since anyone can access a complete copy of the entire blockchain on which every account balance and history is public, you can indirectly find out quite a bit of information about the sender. This is worse than the traditional banking system, in which you cannot simply find out a person's balance and entire history of txns.

• If you go to a merchant and provide them with your address, they can immediately look up your balance. The merchant can then manipulate the price or even rob you.

• You could be surveilled by any person who at some point sends crypto to your address which can lead to discrimination and all kinds of troubles.

• You could end up owning tainted coins which are blacklisted due to how they were previously obtained (perhaps by a previous owner).

• You generate a ton of sensitive info every day. All of this is recorded by scores of different entities, resulting in vast centralized troves of all your personal and sensitive details. This is concerning due to data breaches and also due to personalized manipulation by tech companies that monitor you.

Nicolas van Saberhagen published the CryptoNote protocol in 2013, and it became the foundation for the anonymously created Bytecoin. The anonymous user thankful_for_today revealed that Bytecoin had already been mostly mined and was thus potentially dangerously centralized, so he incorporated its features into a new currency, Monero, launched in 2014.

Getting Started: Receiving, Storing and Sending Monero

A wallet handles all the cryptography so that you only need to manage a seed and address(es). The wallet uses the seed to locate and spend moneroj and to generate addresses. If you lose your seed, there is no way to recover access to your moneroj.

A hot wallet holds small amounts for day-to-day use, and a cold wallet is more secure and used for long-term savings or large amounts. You can connect your wallet to a remote node instead of storing the entire blockchain on your device. Nodes are computers that have downloaded the entire blockchain and sync others' wallets and relay others' txns. Running a node is different from mining.

Your address is never stored on the blockchain thanks to stealth addresses, and each seed can generate multiple subaddresses that all deposit to the same wallet. Each Monero account has a single primary address. Wallets may wait up to 20 minutes for confirmation before marking funds as received.

You can share a view-only version of your wallet, which can see incoming txns but cannot send or view outgoing funds (good for transparency). This involves sharing a secret view key.

You can check a txn key to verify proof of payment.

Operational security in keeping your funds safe:

• Never say how much Monero you own.

• Keep your seed safe and backed up.

• Before sending a large txn, test the address by sending a smaller amount first.

How Monero Works

When you set up a wallet for the first time, it generates a new seed that is kept secret and used to access moneroj on the blockchain. Initialization is done on the device (can be offline).

• The wallet calculates a set of private keys and a set of public keys generated together.

• When someone sends you Monero, they broadcast a txn that transfers moneroj into a new entry on the ledger that only your private keys can unlock.

• The output of the txn is stored on the blockchain for you to access. Receiving moneroj means gaining another output. Spending moneroj means consuming an output as an *input* and generating a new output for someone else. The moneroj that you own are just outputs on the blockchain that your private keys unlock, and your wallet balance is their total sum.

A digital signature is a method for confirming the authenticity and source of data or a message.

• Signatures can be verified against the public key to confirm the identity of the signer and verify that the message is complete and unmodified.

• To spend an output, you compose a message to the network, sign the message with your private key, and then broadcast the resulting txn. The network checks the validity of the signature.

In Bitcoin, each message explicitly declares which outputs are being spent. This is useful for maintaining a record of unspent txn outputs (UTXOs) which can be valid inputs for new txns. But such a straightforward proof of ownership is bad for privacy. Monero has a lot of different privacy technologies:

• RingCT: conceals txn amount. Allows the sender to prove that they have enough moneroj for a txn without revealing the value of that amount by using commitments and range proofs.

• Sending moneroj involves committing the amount in a private way, revealing enough info for the network to confirm txn legitimacy while not disclosing amount itself.

• Range proofs ensure that the committed amount is between zero and a certain number.

• Ring signatures: obfuscate which output was spent (so that sender is protected).

• One member of a group digitally signs a message on behalf of the group (minimum 7 members as of 2018), while mixing in the public keys of the other members, so that it is unclear which group member signed the message.

• You can verify that one of the ring members signed the message, but you can't determine exactly which member.

• In other words, the keys from multiple outputs are blended in order to hide which output is actually being spent. The other outputs (decoys) are semi-randomly chosen from past outputs on the blockchain not belonging to the sender.

• So, an outside observer cannot prove that an output has been spent. The fact that an output appears in a ring signature is inconclusive, since that output could be truly spent in that txn or it could be a decoy.

• But this brings up the problem of double spending. An output can appear in a signature before and after it has been spent. Key images are generated and recorded with each txn, uniquely derived from the actual output being spent. The network doesn't know which ring member maps to the key image, but they can check whether the key image has been used before.

• Stealth addresses: ensure that the recipient's address is not recorded on blockchain. Each txn is sent to a unique disposable one-time address.

• Wallet address is a 95-character string incorporating the public view key and the public spend key, both of which are derived from the seed.

• When someone sends you funds, they use this address along with some randomness to generate a unique one-time public key (stealth address).

• Kovri: based on the Invisible Internet Project (I2P), which uses routing to create private networks distributed across the Internet. Kovri obfuscates broadcast origin and conceals network signs of Monero activity (hides your physical location).

• Why is this necessary? The IP address of a device can easily be linked to a physical location and identity. A node broadcasting to the network reveals its IP address, so a receiver can identify the location of the sender. Network traffic through nodes is also visible to ISPs. Multiple txns from one IP address could imply that they are connected.
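The key-image check from the ring signature bullets above can be sketched as a simple set lookup. This is only an illustration of the bookkeeping: `KeyImageLedger` and the string key images are hypothetical stand-ins, whereas real key images are curve points derived from the spent output.

```python
class KeyImageLedger:
    """Toy ledger that tracks spent key images (hypothetical helper)."""

    def __init__(self):
        self.seen = set()

    def accept(self, key_image):
        """Record a new key image and return True; reject a repeat."""
        if key_image in self.seen:
            return False  # the same output is being spent a second time
        self.seen.add(key_image)
        return True


ledger = KeyImageLedger()
first = ledger.accept("key-image-of-output-A")   # first spend
second = ledger.accept("key-image-of-output-A")  # attempted double spend
print(first, second)
```

The network never learns which ring member produced the key image; it only needs to know whether the image has appeared before.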

The Monero network

When your wallet broadcasts a message, the network temporarily stores it in a list of pending txns called the memory pool. Miners collect these unconfirmed txns into blocks. Each block contains a set of txns, a hash pointer to the previous block, and a nonce which is a special string given by the miner.

A hash proves that each block is directly linked to an unaltered version of the previous block, so the smallest modification attempt by an attacker will raise a red flag on every subsequent block. The nonce is extremely computationally difficult to find. It is impossible to plan ahead for calculating the nonce.
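A rough sketch of why a hash pointer exposes tampering, under simplifying assumptions: blocks are plain dicts, SHA-256 stands in for Monero's Keccak-based hashing, and the helper names (`block_hash`, `build_chain`, `first_broken_link`) are made up for this example.

```python
import hashlib


def block_hash(prev_hash, txns, nonce):
    """Hash a toy block's contents, including the pointer to its parent."""
    payload = f"{prev_hash}|{','.join(txns)}|{nonce}".encode()
    return hashlib.sha256(payload).hexdigest()


def build_chain(blocks):
    """Link toy blocks; each entry stores the hash of the previous block."""
    chain, prev = [], "0" * 64
    for txns, nonce in blocks:
        chain.append({"prev": prev, "txns": txns, "nonce": nonce})
        prev = block_hash(prev, txns, nonce)
    return chain


def first_broken_link(chain):
    """Index of the first block whose stored pointer no longer matches the
    recomputed hash of its parent, or -1 if the chain is intact."""
    prev = "0" * 64
    for i, block in enumerate(chain):
        if block["prev"] != prev:
            return i
        prev = block_hash(block["prev"], block["txns"], block["nonce"])
    return -1


chain = build_chain([(["a->b"], 1), (["b->c"], 2), (["c->d"], 3)])
intact = first_broken_link(chain)
chain[0]["txns"] = ["a->mallory"]   # tamper with the oldest block
broken = first_broken_link(chain)
print(intact, broken)
```

Editing the oldest block changes its hash, so the pointer stored in the next block stops matching. An attacker would have to redo the nonce for that block and every block after it.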

Nodes are the backbone of the peer-to-peer network. All nodes are equal participants and are hosted on all sorts of computers. A newly initialized node downloads the entire blockchain and verifies the validity of each txn and block. It receives these transmissions from many peer nodes.

• A wallet needs access to a copy of the blockchain. Since it cannot find the relevant unspent outputs on its own, it must communicate with a synchronized node before crafting txns.

• Running a local node means locally storing/verifying the entire blockchain so your wallet can interact with your own copy of the ledger.

• Most wallets use a remote node instead, since running a local node requires more than 60 GB disk space.

Miners collect unconfirmed txns from their memory pool, check their validity (proof, signatures, key image), draft a list of txns to include in a block, include a hash of the previous block, and work to find the nonce to complete the block. Upon finding the nonce, the miner announces their version of the block and the rest of the network appends it to their copy of the blockchain.
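The nonce search can be illustrated with a toy miner. This is a sketch under loud assumptions: SHA-256 replaces Monero's actual PoW hash, the "block" is just a string, and `difficulty_bits` is a made-up knob for how small the hash must be.

```python
import hashlib


def toy_hash(prev_hash, txns, nonce):
    """Hash of a toy block, interpreted as a big integer."""
    digest = hashlib.sha256(f"{prev_hash}|{txns}|{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def mine(prev_hash, txns, difficulty_bits):
    """Brute-force a nonce so that the block hash falls below the target."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while toy_hash(prev_hash, txns, nonce) >= target:
        nonce += 1
    return nonce


def is_valid(prev_hash, txns, nonce, difficulty_bits):
    """Checking a claimed nonce takes one hash, unlike the search."""
    return toy_hash(prev_hash, txns, nonce) < (1 << (256 - difficulty_bits))


nonce = mine("00" * 32, "alice->bob", difficulty_bits=12)
print(nonce, is_valid("00" * 32, "alice->bob", nonce, 12))
```

Each extra difficulty bit doubles the expected number of attempts, while verification stays a single hash.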

Network latency can cause momentary splits in the blockchain as two miners can independently complete two different block versions at the same height.

• Because of this, all miners agree to a decentralized consensus protocol in which they work on mining the next block in the longest chain.

• So, even if there are two different chains with a difference at the most recent block, another miner will soon solve a subsequent block and add it to their chain, thus causing all other miners to adopt that chain and discard the orphaned block. Txns that were only included in the orphaned block remain in the memory pool and will be mined in a subsequent block.

Finding the nonce that completes the block is a hard puzzle that is only solvable by brute force. A miner who successfully mines a block is paid in a block reward and in a txn fee.

• Block reward adds a freshly-minted coinbase txn to the miner's address (coin emission).

• Coin emission rate will smoothly go down to 0.6 XMR per 2-minute block in 2022 and then stay constant.

• The tail emission of 0.6 XMR per block increases the supply by less than one percent a year. This ensures that miners stay incentivized to mine.

• Bitcoin on the other hand will remove the block reward once all 21 million bitcoins have been mined around the year 2140.

• Fees included with the txns are collected. A larger fee means a greater likelihood that the txn is included sooner by the miner.
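A quick back-of-the-envelope check of the "less than a percent" claim for the tail emission. The 2-minute block time and 0.6 XMR reward come from the notes above; the circulating supply of roughly 18 million XMR is my own assumption and not a figure from the book.

```python
BLOCK_TIME_MINUTES = 2
TAIL_REWARD_XMR = 0.6
ASSUMED_SUPPLY_XMR = 18_000_000  # assumption, not from the notes

blocks_per_year = 365 * 24 * 60 // BLOCK_TIME_MINUTES
annual_emission = TAIL_REWARD_XMR * blocks_per_year
inflation_pct = 100 * annual_emission / ASSUMED_SUPPLY_XMR
print(blocks_per_year, annual_emission, inflation_pct)
```

Because the emission is constant while the supply keeps growing, the percentage shrinks slowly every year.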

A Proof of Work (PoW) system is one that couples important network functions with the search for a useless nonce. It enforces decentralization by requiring validation to be submitted with a nonce.

• Miners measure how quickly they can work toward mining blocks in their hashrate (hashes per second). The network hashrate refers to the total hashrate of all miners.

• PoW prevents censorship or preferential treatment of txns and prevents double spending attack (unless there is a 51% attack).

• To keep adding a new block every 2 minutes (in order to quickly confirm txns) even as more miners join and the network hashrate increases, the nonce puzzle is adjusted over time to be more difficult.

• Bitcoin was launched with the vision that anybody with a computer could mine and earn rewards. But mining in Bitcoin has become centralized with farms of extremely expensive ASICs dominating and putting most miners out of business. This nullifies the benefits of decentralization such as universal access to mining, censorship resistance, and network resilience.

• Monero doesn't use Bitcoin's "CPU-hard" SHA-256 hash algorithm but rather a "memory-hard" algorithm called CryptoNight. As of 2018, even CPU mining is feasible for Monero. When ASICs were discovered to be secretly mining a large percentage of the hashrate, there were immediate steps taken to tweak the algorithm in a way that made ASICs useless (possible since ASICs are made to do only one thing and cannot be reprogrammed). Monero now changes the mining algorithm slightly at each network update.

A Deep Dive Into Monero & Cryptography

Monero prints final addresses and keys in base-58 (similar to base-64 but modified to avoid ambiguous characters).

The elliptic curve that Monero uses is the twisted Edwards curve Ed25519, which is birationally equivalent to the Montgomery curve Curve25519. It is expressed as $$-x^2+y^2=1-(\frac{121665}{121666})x^2y^2$$ which is the general twisted Edwards equation $$ax^2+y^2=1+dx^2y^2$$ with parameters $$a=-1, d=-\frac{121665}{121666}$$.

Elliptic curve discrete logarithm problem: After adding the curve generator point to itself many times, the resulting point cannot be used to determine how many times the operation occurred.

Whereas Bitcoin uses asymmetric encryption with just two keys (private key and public key), Monero uses a framework with four keys:

1. Public view key: verifying address validity

2. Private view key: viewing balance, fees, txn amounts

3. Public spend key: txn verification

4. Private spend key: txn signing (i.e. sending moneroj)

The public address is a representation of the public view key and the public spend key, whereas Bitcoin uses a hash of the single public key.

• The address also contains a checksum and a network byte, which is used to differentiate between different cryptocurrencies and networks (such as testnet, stagenet, or mainnet).

• The checksum catches mistakes or typos to help the sender enter a valid address.

• There are 4 components of the address, totalling 69 bytes, which base-58 encoding expands to a 95-character string:

1. 1 byte for the network and address type identifier

2. 32 bytes for the public spend key

3. 32 bytes for the public view key

4. 4 bytes for the checksum (hash created with Keccak function on previous 65 bytes, then shortened to first 4 bytes)
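The 69-byte to 95-character expansion can be checked with a small encoder. This is a sketch, assuming (from my understanding of Monero, not from the book) that the base-58 encoding works on 8-byte blocks that each become 11 characters, with a shorter final block, and that the mainnet primary-address network byte is 18. The keys and checksum below are zero-filled placeholders. A real checksum needs Keccak-256, which is not the same as `hashlib.sha3_256`.

```python
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
# Characters needed to encode a block of 0..8 bytes in base-58.
ENCODED_BLOCK_SIZES = [0, 2, 3, 5, 6, 7, 9, 10, 11]


def encode_block(block):
    """Encode up to 8 bytes as a fixed-width base-58 string."""
    num = int.from_bytes(block, "big")
    chars = []
    for _ in range(ENCODED_BLOCK_SIZES[len(block)]):
        num, rem = divmod(num, 58)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def monero_base58(data):
    """Encode data block by block (assumed Monero-style base-58)."""
    return "".join(encode_block(data[i:i + 8]) for i in range(0, len(data), 8))


network_byte = bytes([18])  # assumed mainnet primary-address prefix
pub_spend = bytes(32)       # placeholder public spend key
pub_view = bytes(32)        # placeholder public view key
checksum = bytes(4)         # placeholder; real one is Keccak-based
raw = network_byte + pub_spend + pub_view + checksum
address = monero_base58(raw)
print(len(raw), len(address), address[0])
```

Eight full blocks give 88 characters and the remaining 5 bytes give 7 more, which is where the 95 comes from.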

A (collision-resistant) hash function is necessary for generating addresses and keys. Monero uses the CryptoNote hash algorithm which is built on SHA3 (designed by non-NSA engineers). CryptoNote is Keccak-256 with 32-byte output for both txn and block hashing. The user's operating system provides the initial seed/entropy source to which Monero repeatedly applies Keccak hashing, with each output serving as the input for the next hash.

A seed is a unique 256-bit integer, often represented as a 64-digit base-16 number or as a phrase which is a 24-"digit" base-1626 "number".

• Multiply the private keys by elliptic curve generator point to yield the public spend and public view keys (deterministic method).

• How to derive the private view key? Hash the seed with Keccak-256 to produce a 256 bit integer that the sc_reduce32 function ensures is compatible with the elliptic curve.

• Seed is the private spend key.

• If you set up a wallet with just the private view key but not the private spend key, then you make a view-only wallet that can see all incoming txns but cannot spend or see outgoing txns. Use case: a charity shares its private view key to ensure transparency, and a donor can verify that their funds have been received. It is important that a view-only wallet cannot see outgoing txns, in order to hide whether an output has been spent (if this weren't the case, then every other appearance of a revealed spent output in past or future ring signatures would be known to be a decoy).
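The seed sizes above can be sanity-checked with a little arithmetic. Two things here are my own assumptions rather than statements from the book: that the mnemonic maps each 32-bit chunk of the seed to 3 words, and that `sc_reduce32` amounts to reducing a little-endian 32-byte integer modulo the Ed25519 group order.

```python
GROUP_ORDER = 2**252 + 27742317777372353535851937790883648493
WORDLIST_SIZE = 1626
SEED_BITS = 256


def sc_reduce32(data):
    """Reduce a 32-byte little-endian integer modulo the group order."""
    value = int.from_bytes(data, "little") % GROUP_ORDER
    return value.to_bytes(32, "little")


hex_digits = SEED_BITS // 4           # 4 bits per base-16 digit
words = (SEED_BITS // 32) * 3         # assumed: 3 words per 32-bit chunk
fits = WORDLIST_SIZE**3 >= 2**32      # 3 words can cover 32 bits
reduced = sc_reduce32(b"\xff" * 32)   # largest possible 32-byte input
print(hex_digits, words, fits, int.from_bytes(reduced, "little") < GROUP_ORDER)
```

The reduction matters because a raw 256-bit hash can exceed the group order, while a valid private key has to be a scalar below it.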

Even though Monero has built-in privacy with ring signatures, stealth addresses, and RingCT, data can still be collected "off-chain" (from other sources).

• The privacy of Monero is circumvented if two senders communicate and discover that they have sent moneroj to the same address. This can be avoided by generating multiple subaddresses and sharing a unique one with each sender.

• Subaddresses are derived from the same keys as the primary address, so they route to the same wallet balance, but they are cryptographically unlinkable.

• Recall that for the primary address we have $$(PubV_\theta,PubS_\theta)=(PrivV_\theta,PrivS_\theta)G$$ For each index, the subaddress public spend key is $$PubS_i=H(PrivV_\theta || i)G+PubS_\theta$$ and the public view key is $$PubV_i=PrivV_\theta*PubS_i$$

• Subaddress public keys are encoded in base-58 in the same way, but they all begin with the digit 8 (network byte). This is important because sending to a subaddress is handled differently.

• For a subaddress, the txn private key is multiplied by the subaddress public spend key (rather than by $$G$$) to form the public txn key.

• To receive an output $$X$$ with public txn key $$R$$, a wallet needs to scan a txn to ascertain whether it belongs to the owner.

• For a primary address, checks if $$X=H(PrivV_\theta*R)G+PubS_\theta$$

• For a subaddress, checks if $$PubS_i=X-H(PrivV_\theta*R)G$$

• There are 3 different methods of private key derivation:

1. Original (non-deterministic): private spend key and private view key are independently and randomly chosen (no longer recommended).

2. Mnemonic (deterministic or Electrum): all keys derived from single private spend key aka the seed (as described in above notes: seed hashed to get private view key).

3. MyMonero: shorter seed phrase which is hashed to get the private spend key and hashed again to get the private view key.

Calculating the stealth address to which to send according to CryptoNote: $$X=H(r*PubV||i)G+PubS$$ where $$r$$ is the txn private key (256 bit scalar only known to sender) and $$i$$ is the output index. The recipient scans the blockchain for txns belonging to them, takes the public txn key $$R=rG$$, and checks for which output $$X=H(PrivV*R||i)G+PubS$$. Both sides compute the same hash input because $$r*PubV=r*PrivV*G=PrivV*R$$.
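To convince myself that the sender and recipient really arrive at the same one-time key, here is a small pure-Python sketch of the derivation on the Ed25519 curve. It is heavily simplified and not Monero's implementation: SHA3-256 stands in for the Keccak-based $$H$$, points are hashed through an ad hoc byte encoding, affine arithmetic is used for readability, and all helper names are my own.

```python
import hashlib
import secrets

P = 2**255 - 19                                      # field prime
L = 2**252 + 27742317777372353535851937790883648493  # order of G
D = -121665 * pow(121666, -1, P) % P                 # curve constant d


def add(a, b):
    """Add two points on -x^2 + y^2 = 1 + d x^2 y^2 (affine coordinates)."""
    (x1, y1), (x2, y2) = a, b
    t = D * x1 * x2 * y1 * y2 % P
    x3 = (x1 * y2 + x2 * y1) * pow(1 + t, -1, P) % P
    y3 = (y1 * y2 + x1 * x2) * pow((1 - t) % P, -1, P) % P
    return (x3, y3)


def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = (0, 1)  # identity element
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc


def base_point():
    """Recover the standard generator from y = 4/5 with even x."""
    y = 4 * pow(5, -1, P) % P
    xx = (y * y - 1) * pow(D * y * y + 1, -1, P) % P
    x = pow(xx, (P + 3) // 8, P)
    if (x * x - xx) % P:
        x = x * pow(2, (P - 1) // 4, P) % P
    return (P - x if x % 2 else x, y)


def hash_to_scalar(pt, index):
    """Stand-in for H: SHA3-256 of the point and index, reduced mod L."""
    data = pt[0].to_bytes(32, "little") + pt[1].to_bytes(32, "little")
    data += index.to_bytes(4, "little")
    return int.from_bytes(hashlib.sha3_256(data).digest(), "little") % L


G = base_point()
priv_view = secrets.randbelow(L - 1) + 1
priv_spend = secrets.randbelow(L - 1) + 1
pub_view, pub_spend = mul(priv_view, G), mul(priv_spend, G)

# Sender: pick txn private key r, publish R = rG, pay to X.
r, i = secrets.randbelow(L - 1) + 1, 0
R = mul(r, G)
X = add(mul(hash_to_scalar(mul(r, pub_view), i), G), pub_spend)

# Recipient: recompute the same point from R and the private view key.
X_check = add(mul(hash_to_scalar(mul(priv_view, R), i), G), pub_spend)
# Only the private spend key holder can form the matching one-time private key.
x = (hash_to_scalar(mul(priv_view, R), i) + priv_spend) % L
print(X == X_check, mul(x, G) == X)
```

The view key is enough to recognize the output, but spending it needs `x`, which also depends on the private spend key. That split is what makes view-only wallets possible.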

As an auditing feature, coinbase txn amounts are not masked. All other txns use RingCT which has two components:

• First, the amount is encrypted by a key derived from public information in the recipient's address so that only the recipient can read it.

• Then, the amount is integrated into a Pedersen commitment, from which nobody can retrieve the amount but anybody can verify that the outputs balance the inputs.

• The sender proves that the masked number can be generated as the sum of positive powers of two without revealing the powers (range proof).

Ring signatures cannot be cryptographically examined to determine the true signer. Every ring signature produces a key image derived from the output actually being spent, and the key image does not reveal the true signer. Since the network cannot identify which outputs are spent, it instead keeps track of the key images.

• Let $$H_S,H_p$$ be hash functions returning scalars in the field and points in the curve group, respectively, and $$G$$ be a publicly known fixed point. Consider a simple example of signing a message $$M$$ with 3 ring members (in reality 11 required).

• Suppose the wallet has randomly selected to put true source of funds in slot 2, so you retrieve public keys $$P_1, P_3$$.

• Choose random $$u$$ and form commitment $$c_3 = H_S(M,uG,uH_p(P_2))$$

• Choose random $$s_3, s_1$$ and form commitments $$c_1=H_S(M,s_3G+c_3P_3,s_3H_p(P_3)+c_3p_2H_p(P_2)),$$ $$c_2=H_S(M,s_1G+c_1P_1,s_1H_p(P_1)+c_1p_2H_p(P_2))$$

• Define $$s_2=u-c_2p_2$$ and key image $$J=p_2H_p(P_2)$$.

• Send signature $$(c_1,s_1,s_2,s_3,J)$$.

• Note that because $$P_2=p_2G, u=s_2+c_2p_2$$ we have that $$c_3=H_S(M,s_2G+c_2P_2,s_2H_p(P_2)+c_2p_2H_p(P_2))$$ and this looks just like the other commitments because of $$u$$.
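The 3-member example above can be run end to end to check that the commitments close the ring. This is a toy sketch of the algebra only, with loud caveats: SHA3-256 stands in for $$H_S$$, and $$H_p$$ is replaced by a hash-derived multiple of $$G$$, which is NOT secure (it makes the key image computable from $$P_2$$ and so reveals the signer). The verifier logic is my reconstruction from the signing steps and all helper names are hypothetical.

```python
import hashlib
import secrets

P = 2**255 - 19
L = 2**252 + 27742317777372353535851937790883648493
D = -121665 * pow(121666, -1, P) % P


def add(a, b):
    """Twisted Edwards point addition in affine coordinates."""
    (x1, y1), (x2, y2) = a, b
    t = D * x1 * x2 * y1 * y2 % P
    return ((x1 * y2 + x2 * y1) * pow(1 + t, -1, P) % P,
            (y1 * y2 + x1 * x2) * pow((1 - t) % P, -1, P) % P)


def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = (0, 1)
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt, k = add(pt, pt), k >> 1
    return acc


def base_point():
    """Recover the standard generator from y = 4/5 with even x."""
    y = 4 * pow(5, -1, P) % P
    xx = (y * y - 1) * pow(D * y * y + 1, -1, P) % P
    x = pow(xx, (P + 3) // 8, P)
    if (x * x - xx) % P:
        x = x * pow(2, (P - 1) // 4, P) % P
    return (P - x if x % 2 else x, y)


G = base_point()


def h_s(*parts):
    """H_S stand-in: SHA3-256 over the repr of the inputs, reduced mod L."""
    digest = hashlib.sha3_256(repr(parts).encode()).digest()
    return int.from_bytes(digest, "little") % L


def h_p(pt):
    """H_p stand-in (NOT secure): a hash-derived multiple of G."""
    return mul(h_s("hash-to-point", pt), G)


def link(msg, s, c, pub, key_image):
    """One ring step: H_S(M, sG + cP, s*H_p(P) + c*J)."""
    return h_s(msg, add(mul(s, G), mul(c, pub)),
               add(mul(s, h_p(pub)), mul(c, key_image)))


def sign(msg, p2, P1, P2, P3):
    """Sign with the private key of slot 2, following the steps above."""
    J = mul(p2, h_p(P2))
    u, s1, s3 = (secrets.randbelow(L - 1) + 1 for _ in range(3))
    c3 = h_s(msg, mul(u, G), mul(u, h_p(P2)))
    c1 = link(msg, s3, c3, P3, J)
    c2 = link(msg, s1, c1, P1, J)
    s2 = (u - c2 * p2) % L
    return (c1, s1, s2, s3, J)


def verify(msg, sig, P1, P2, P3):
    """Walk the ring from c1 and check that it closes on itself."""
    c1, s1, s2, s3, J = sig
    c2 = link(msg, s1, c1, P1, J)
    c3 = link(msg, s2, c2, P2, J)
    return link(msg, s3, c3, P3, J) == c1


keys = [secrets.randbelow(L - 1) + 1 for _ in range(3)]
P1, P2, P3 = (mul(k, G) for k in keys)
sig = sign("pay 1 XMR", keys[1], P1, P2, P3)
print(verify("pay 1 XMR", sig, P1, P2, P3), verify("pay 2 XMR", sig, P1, P2, P3))
```

The verifier treats all three slots identically, which is the point: nothing in the check distinguishes the real signer's slot from the decoys.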

Monero stores its blockchain with the Lightning Memory Mapped Database (LMDB).

Blocks have three components: header, base txn, and list of txn IDs.

1. Block header: major_version, minor_version, timestamp, prev_id, nonce

2. Base txn (single txn for the coinbase reward): version, unlock_time, input_num=1, input_type=0xff, height, output_num, outputs

3. Txn ID list: number of identifiers followed by the Keccak hashes of the txn bodies

The block ID is a hash of the size of block_header, block_header, Merkle root hash, and number of txns. The Merkle root hash keeps the txns of a block tamper-free.

The block size is dynamic, unlike Bitcoin whose 1 MB fixed block size has caused scaling issues leading to bottlenecks that cause high fees and delayed processing. Monero on the other hand allows miners to use larger blocks to accommodate increased traffic but includes a penalty function that decreases the block reward for oversized blocks. For the median size of the last 100 blocks $$M_N$$ and block size $$B$$, if $$B>M_N$$ and the block size is greater than 300 kB then the penalty is the base reward multiplied by $$(\frac{B}{M_N}-1)^2$$.
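The penalty rule is easy to try out numerically. A minimal sketch, assuming sizes are given in kB and using a made-up function name `block_penalty`:

```python
MIN_FREE_BLOCK_KB = 300  # blocks up to this size are never penalized


def block_penalty(base_reward, block_kb, median_kb):
    """Penalty = base_reward * (B / M_N - 1)^2 for oversized blocks."""
    if block_kb <= median_kb or block_kb <= MIN_FREE_BLOCK_KB:
        return 0.0
    return base_reward * (block_kb / median_kb - 1) ** 2


at_median = block_penalty(10.0, 400, 400)
oversized = block_penalty(10.0, 600, 400)
print(at_median, oversized)
```

Because the penalty grows quadratically, a slightly oversized block costs the miner very little, while a block approaching twice the median gives up nearly the whole reward.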

Larger txns incur a higher fee. Per kB, the fee is $$\frac{R}{R_\theta}*\frac{M_\theta}{M}*F_\theta*\frac{60}{300}*4$$ where $$R$$ is the base reward, $$R_\theta$$ is the reference base reward (10 XMR), $$M$$ is the block size limit, $$M_\theta$$ is the minimum block size limit (300 kB), and $$F_\theta$$ is .002 XMR. This takes into account the increase in median block size relative to minimum block size.
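Plugging numbers into the fee formula, as a sketch: the constants are the reference values quoted above, while the function name and the example inputs are my own.

```python
REFERENCE_REWARD_XMR = 10.0  # R_theta
MIN_BLOCK_LIMIT_KB = 300.0   # M_theta
BASE_FEE_XMR = 0.002         # F_theta


def fee_per_kb(base_reward, block_limit_kb):
    """Per-kB fee: R/R_theta * M_theta/M * F_theta * 60/300 * 4."""
    return (base_reward / REFERENCE_REWARD_XMR
            * MIN_BLOCK_LIMIT_KB / block_limit_kb
            * BASE_FEE_XMR * 60 / 300 * 4)


reference_fee = fee_per_kb(10.0, 300.0)  # both ratios equal 1
doubled_limit = fee_per_kb(10.0, 600.0)  # block size limit has doubled
print(reference_fee, doubled_limit)
```

The fee falls as the block reward shrinks and as the block size limit grows, so it scales down with both emission and network capacity.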

Skipped

Monero integration for developers

Zooko's triangle: difficulty of designing name systems that are simultaneously secure, decentralized, and human-meaningful. Because Monero addresses are not human-meaningful, the Monero Core Team released OpenAlias (a text DNS record containing a prefix and a recipient address) as a human-readable way to communicate addresses.

Developers integrating Monero can use Monero's C++ API or the remote procedure call (RPC) interface which is accessible with HTTP requests. RPC is simple to implement and allows generating addresses and subaddresses and transferring funds, but it does not scale effectively for big applications.

Plenty of development specifics skipped here. Refer to the source.