## Bitcoin And Cryptocurrency Technologies: Notes

2020-08-13

These are notes on the Princeton Bitcoin book by Narayanan et al. There is a Coursera course that accompanies the book, and all course content is free. I was deciding between this book and Mastering Bitcoin. Though this book was pretty good, I imagine the latter being more comprehensive from a development standpoint. I have merely printed my own notes, meaning that I have skipped certain portions that did not interest me.

### Introduction

In today’s financial system, we use a mixture of credit and cash.

• We measure debt in the amount of cash it would take to settle it.

• A cash system needs to be bootstrapped by an initial supply, but it avoids the possibility of defaulting on a debt.

• Cash is anonymous and enables offline txns.

• Bitcoin is somewhat anonymous but not as much as cash. Also, it does not work fully offline but relies on a P2P network instead of a central server.

In online payments, providing credit card details directly to a seller can be a security risk.

• PayPal is an intermediary architecture that erases the need to provide a seller with your credit card details or even with your identiy.

• A competing architecture to PayPal was SET, which failed because it required all users to get a certificate (from a certificate authority). Bitcoin sidesteps this problem by requiring only a public key for a user’s identity.

Double spending problem: if you receive a piece of data representing a unit of virtual cash, you can make copies of it and pass it on to different people. David Chaum’s innovation (blind signature):

• Alice issues new note to Bob.

• Bob picks serial number.

• Bob writes serial number on note and hides it from Alice.

• Alice signs note without seeing serial number.

• If Bob picks a number that’s already been picked, he ends up with a note that can’t be spent since the blind signature can be verified against the original message.

Digicash and others relied on USD or gold. To make a currency out of thin air, it needs to be scarce. This is the objective of Bitcoin mining. Desired properties of computational puzzles:

• Puzzle should be specific to a message (sender, receiver, contents, and timestamp).

• Receiver can check solution w/o having to repeat solving process.

• Puzzle should be independent of other puzzles.

• Recipients can adjust difficulty of puzzle solutions they will accept (using hash functions).

The idea of Bitcoin’s blockchain is to collect documents in a tree structure into blocks and link the blocks (Bitcoin does this and also gets rid of signatures).

“Bitcoin is an implementation of Wei Dai’s b-money proposal on Cypherpunks in 1998 and Nick Szabo’s Bitgold proposal” – Satoshi

### Ch 1: Introduction to Cryptography & Cryptocurrencies

Cryptocurrencies make heavy use of cryptography for securely encoding the rules of the system in the system itself.

A hash function is a mathematical function with variable size input and fixed size output (assume 256 bits) and is efficiently computable. A cryptographic hash function has some additional properties:

• Collision resistance: $$H$$ is a collision resistant hash function (CRHF) if it’s infeasible to find distinct $$x,y$$ such that $$H(x)=H(y)$$.

• Collisions are guaranteed, and we just need to examine $$2^{256}+1$$ inputs to find one. Due to the birthday paradox, we are pretty much guaranteed to have a collision with just $$2^{130}+1$$ randomly chosen inputs.

• But such a brute force attack taking advantage of the birthday paradox is infeasible.

• No hash function is proven to be collision resistant.

• Application (message digests): store a hash of a file locally and upload the file to a server, then compare the local hash to the computed hash of the file once downloaded in order to verify absence of tampering.

• Hiding: Having chosen a secret $$r$$ from a distribution with high min-entropy, given $$y=H(r||x)$$ it is infeasible to find $$x$$.

• Min-entropy measures the predictability of an outcome. High means the distribution is very spread out.

• Application (commitments): Commitment scheme has two algorithms: 1) $$com(msg,nonce)$$, which is a cryptographic hash function $$H(nonce||msg)$$, and 2) $$vrfy(com,msg,nonce)$$, and has the properties of hiding (commitment doesn’t reveal message) and binding (can’t change your mind). Generate nonce, apply $$com$$ to the message, and publish the commitment. If we want to reveal the value, publish the nonce and message so anyone can apply $$vrfy$$.

• Puzzle friendliness: For every possible $$n$$-bit output $$y$$, having chosen $$k$$ from a distribution with high min-entropy, it is infeasible to find $$x$$ s.t. $$H(k||x)=y$$ in time significantly less than $$2^n$$.

• Application (search puzzle): The puzzle is to find an input to a hash function (with a value selected from a high min-entropy distribution) whose output falls within a target set. Puzzle-friendliness implies there is no better strategy than trying random values.

SHA-256, the hash function that Bitcoin uses, takes $$768$$-bit input and produces $$256$$-bit output. It uses the Merkle-Damgard transform and pads the input so that the length is multiple of $$512$$ bits.

A hash pointer consists of a pointer to where some info is stored in addition to a cryptographic hash of the info, so that you can both retrieve info and verify that the retrieved info has not changed.

A blockchain is a linked list of blocks containing data as well as a hash pointer to the previous block. So, you have the location of the value of the previous block and a digest of that value. You store the head of the list (ie the hash pointer to the most recent block).

A Merkle tree is a binary tree using hash pointers. Group the leaves, which are the blocks, into pairs and then for each pair make a structure with a hash pointer to each block. Keep building hash pointers up the tree until the root. Tampering with a block can be detected by storing the hash pointer at the root. Proof of membership for a block in the tree required only $$log(n)$$ blocks out of $$n$$ for verification.

A digital signature scheme has three algos: 1) $$(sk,pk)=generateKeys(keysize)$$, 2) $$sig=sign(sk,msg)$$, and 3) $$isValid=vrfy(pk,msg,sig)$$, and requires that $$vrfy(pk,msg,sign(sk,msg)) = 1$$ and that signatures are existentially unforgeable. Except verification, the algorithms should be randomized.

Bitcoin uses a signature scheme called ECDSA (elliptic curve digital signature algorithm) over the elliptic curve secp256k1 which provides ~$$128$$-bit security.

We treat a public key from a digital signature scheme as an identity of a person. This means that you can make a new identity whenever you want. The probability of someone else generating the same key as you is negligible (given a good source of randomness). But using the same identity, however random-looking, over time makes a series of statements that can be used to identify the real-world person.

Key challenges faced by a cryptocurrency to solve: double spending, removing central trusted authority, minting new coins.

### Ch 2: How Bitcoin Achieves Decentralization

Decentralization is not all or nothing. While SMTP is an open standard and anyone can set up an email server, the market is dominated by a handful of centralized providers. While Bitcoin is decentralized, Bitcoin exchanges and wallets may be centralized. We can measure Bitcoin’s decentralization with 5 questions:

1. Who maintains the ledger?

2. Who has authority over transaction validity?

3. Who creates new bitcoins?

4. Who determines how the rules of the system change?

5. How do bitcoins acquire exchange value?

We want to achieve a distributed consensus protocol, ie a protocol of $$n$$ nodes (which are either honest or malicious/faulty) with input values that 1) must terminate with all honest nodes in agreement on the value and 2) the agreed-upon value must have been generated by an honest node.

• A transaction is broadcast to all nodes that comprise the network, but the parties involved in the transaction need not run a node.

• The nodes must agree on which transactions were broadcast and their order of broadcast.

• Consensus is achieved block-by-block. At any given point, each of the nodes have a ledger of a sequence of blocks on which they’ve reached consensus.

• Each node has a pool of outstanding transactions for which consensus has not yet been achieved. Each node might have a different version of the outstanding transaction pool.

• Two obstacles: imperfections in the network (s.a. latency or nodes crashing), and malicious nodes (refer to Fischer-Lynch Paterson impossibility result).

By incorporating incentives and randomness, Bitcoin violates certain assumptions in traditional models of consensus which were built for distributed databases.

A consequence of pseudonymity in Bitcoin is a Sybil attack: a malicious adversary creates copies of nodes to look like there are a lot of different participants.

A simplified Bitcoin consensus algorithm based on implicit consensus: assuming that it is possible to randomly select one node, and if one person has multiple nodes then to only select one of those nodes, in each round a random node is selected and proposes the next block and other nodes implicitly accept (only if all txns are valid) or reject the block by either extending the chain on that block (including the block’s hash in the next block they create) or by ignoring that block.

• Cannot steal bitcoins or deny service to a person, but can double spend:

• Alice sends Bob some bitcoins, and also sends them to a different address she controls. Only one of these transactions can be included in the blockchain, and other nodes don’t have a moral heuristic to see which is the malicious one. The node that chooses the next block will decide on one branch to extend. Alice can increase the likelihood of the double spend being approved by network latency or by bribing the next block. If the double spend block is chosen and the next node also chooses that branch and so on, over time the block containing the txn to Bob that is ignored is called an orphan block

• Bob, if he were cautious, would not provide the paid-for service to Alice even after her transaction to him was included in one block. He should wait to provide his service only if the next several nodes build on the block with that transaction. Honest nodes always extend the longest valid branch they see, so the double-spend probability decreases exponentially with the # of confirmations.

Protection against invalid transactions is cryptographic, but protection against double-spending is purely by consensus.

Nodes need an incentive for behaving honestly. It’s hard to punish a malicious node, so instead we reward honest nodes. There are two mechanisms to do this:

1. Block reward: the node that creates a block can include a create coin transactionn to itself in that block (the reward value halves every 210,000 blocks or roughly every 4 years according to a geometric series with sum 21 million). This node can only collect the reward if the block in question ends up on the long-term consensus branch. Need another incentive for when new bitcoins eventually run out.

2. Transaction fees: creator of transaction can choose to make output value less than the input value and then collect the difference.

Proof of work (PoW): approximate the selection of a random node by instead selecting nodes in proportion to a resource that is hoped nobody can monopolize, e.g. computing power. This is sort of a tax on identity creation and therefore on the Sybil attack

PoW in Bitcoin is done using hash puzzles. The node proposing a new block must find a number ("nonce") such that $$H(nonce||prev\_hash||txn||txn||...||txn)<target$$

• If the function is puzzle friendly, then the only way to succeed is to try nonces one by one. This means the process is probabilistic (Bernoulli trials), which can be approximated by a continuous probability process called Poisson process).

• Rather than picking a random node, nodes are independently competing to solve these puzzles (mining).

• Three properties of hash puzzles:

1. Difficult to compute the nonce, since the target space is $$10^{-20}$$ of the size of the output space)

2. Adjust the target space for more miners so that a new block is still produced every 10 minutes.

3. It's trivial to verify that a node has computed PoW correctly.

Proof of stake (PoS): selecting nodes in proportion to their ownership of the currency.

A miner profits if the the block reward + transaction fees are greater than the hardware cost + operating costs.

Ownership of bitcoins is nothing more than other nodes agreeing that a given party owns those bitcoins.

A cryptocurrency needs bootstrapping to get off the ground so that an adversary cannot take over 50 percent of the new block creation.

51 percent attack: consensus fails and one attacker controls a majority of the mining power. What can he do?

• He can’t steal coins from an existing address

• He can suppress some transactions but he can’t suppress the transactions from being broadcast to other nodes. So even if the attack succeeds, the other nodes will be aware of it.

• He can’t change the block reward.

• He can destroy confidence in Bitcoin and cause the exchange rate to plummet.

### Ch 3: Mechanics of Bitcoin

Bitcoin transactions (abbreviated to txns):

• Txns specify a number of inputs (coins being consumed, ie created in a previous txn) and a number of outputs (coins being created). In a txn that mints a new currency, no coins are consumed.

• Each txn has a unique ID.

• The entirety of a txn output must be consumed by another txn. So if Alice received 25 BTC in a previous txn, and if she wants to send 17 of those BTC to Bob, she must create a new txn that sends 17 BTC to Bob and 8 BTC to herself.

• To verify that the txn above is valid, we need to look up the txn output Alice references and then make sure it has a value of 25 BTC and that it hasn’t already been spent.

• Txns can have many inputs and many outputs, so it is possible to receive money in two different txns and then create a txn that consolidates the values with both as inputs.

• Txns can have two inputs owned by two people, so it is possible to do joint payments. But this requires two signatures, one from each of the senders.

• Three parts to a txn (it’s just text with fields and values):

1. Metadata: txn size, number of inputs, number of outputs, hash of txn (which is the unique ID), lock time

2. Inputs: array of inputs, each containing a hash of a previous txn, an index of particular outputs from the previous txn, and a signature script

3. Outputs: array of outputs, each containing a value and a script that includes the hash of the public key of the recipient. The sum of output values must be at most the sum of input values.

Bitcoin scripts:

• A txn output specifies a script containing an address and saying “this can be redeemed by a public key that hashes to this address, along with a signature from the owner of that public key.” In the scripting language, this is:

OP_DUP
OP_HASH160
69e02e18...
OP_EQUALVERIFY
OP_CHECKSIG

• The input, too, contains a script with the signature. Concatenate the input script scriptSig and the output script scriptPubKey and the result must run successfully for the txn to be valid. The combined script is:

<sig>
<pubKey>
--------
OP_DUP
OP_HASH160
<pubKeyHash?>
OP_EQUALVERIFY
OP_CHECKSIG

• The scripting language is stack-based, was built just for Bitcoin, and is similar to Forth.

• There are 256 instructions, each represented by one byte. Special instruction OP_CHECKMULTISIG checks multiple signatures.

• Executing the script just means pushing or popping on the stack. Data instructions push onto the stack. Opcodes perform a function and often take input data that is on top of the stack.

• 99.9 percent of the scripts that have been used in the history of Bitcoin are exactly the same script – this one!

• Proof-of-burn: script that can never be redeemed and is used to establish that coins have been destroyed

• Pay-to-script-hash: Instead of telling the sender “send your coins to the hash of this public key” (complicated), the receiver can tell the sender “send your coins to the hash of this script.”

• Used in multisig. Sender need not worry that receiver is using multisig; she just needs to send to the receiver’s P2SH address and the receiver must specify the fancy script

• Escrow transactions: using third party when Alice wants to receive goods first and Bob wants to receive payment first. Multisig involved.

• Green addresses: Alice wants to pay Bob, who is either offline or can’t wait for an hour to have 6 blocks confirm the txn, so Alice goes to a 3rd party (s.a. a bank) who deducts money from Alice’s account with the bank and sends the money to Bob.

• Bob accepts a real-world guarantee from the bank that the bank will not double spend. This is not a Bitcoin-enforced guarantee.

• Efficient micro-payments: Alice wants to continually pay Bob small amounts of money. A separate txn for each payment will create too many txn fees. We want to combine the small payments.

• Alice starts with a multisig txn that pays the maximum amount she would need to spend and requires both parties to sign.

• For each micro-payment, Alice signs a txn with an incrementing amount of money paid to Bob and the rest to herself

• Once Alice is done, she stops sending txns and Bob publishes and signs the last txn. The previous txns get discarded.

• What if Bob doesn’t sign the last txn and Alice loses the full value she paid at the beginning? Solve this with lock time.

• Lock time: Before the micro-payments begin, Alice and Bob sign a txn refunding Alice’s money back to her, but the refund is locked until some time in the future. The lock time is specified in a parameter in the txn metadata.

• Smart contracts: the applications discussed so far are contracts for which we have some degree of technical enforcement in Bitcoin (as opposed to in centralized authorities s.a. laws or courts) via scripts, miners, and txn validation.

• The blockchain has two hash-based data structures:

1. Hash chain of blocks: Each block has a header, a hash pointer to some txn data, and a hash pointer to the previous block.

• Header: has info for the mining puzzle; hash of header starts with a lot of 0s; also has a nonce, a timestamp, and a value indicating how difficult the block was to find. Header is the only thing needed for verifying a block

2. All txns in a block are arranged in a Merkle tree.

• A coinbase transaction is one that creates new coins.

Bitcoin network:

• All nodes are equal and can join at any time.

• Publish a txn through a flooding algorithm (gossip protocol): Alice wants to pay Bob, so her client creates a txn and her node sends it to all its peer nodes that check the txn and pass it in turn to their peer nodes and so on. Nodes put the txn in a pool of txns that aren’t on the blockchain yet. Nodes don’t duplicate txns within their pool.

• In case of conflicting txns, the default behavior is for nodes to hang onto whichever txn they hear first (leading to race conditions in memory pools).

• Validating a block involves validating the block header and every txn in the block.

• A node forwards only those blocks that build on the branch that is the longest according to that node's perspective of the blockchain.

• Propagation time of the flooding algorithm is proportional to the size of the block (0 to 350 KB).

• Lightweight nodes: also called Simple Payment Verification (SPV) clients, these only store the pieces of the blockchain that they need to verify specific txns that they care about

• Can’t check that every txn in a block is actually valid because the node only downloads the block headers and txns but not the txn history and doesn’t know the set of unspent txn outputs

• Trust full nodes to have validated all the other txns

• Bitcoin's limitations: average time per block, size of blocks, number of signature operations in a block, divisibility of the currency, total number of bitcoins, and block reward structure

• Hard coded limit of 1 MB blocks. Each txn is >= 250 bytes, so at most 4000 txns per block. Blocks found every 10 minutes means 7 txns per second. This is a very low number (e.g. Visa typically handles 2000 txns per second). Also, increasingly it becomes common for the 1 MB space to be filled up withinn the 10 minutes, resulting in many txns having to wait additional blocks to make their way into the blockchain

• Hard fork: new features are introduced and would recognize blocks as valid that the old software would reject. Most nodes upgrade and some don’t, so the longest branch will have blocks that un-upgraded nodes consider invalid. The un-upgraded nodes work on a branch that excludes blocks with the new feature. The chain splits to never join, and every node in the network is on exactly one side.

• Soft fork: new features are introduced and would make validation rules stricter. Provided that most nodes upgrade, these nodes will be able to enforce the new, tighter, rules. Blocks considered valid by new miners are also considered valid by old miners. Old miners will figure out that their blocks are being rejected. There will be many small temporary forks but no permanent split.

### Ch 4: How to Store and Use Bitcoins

Storing bitcoins is really all about storing and managing Bitcoin secret keys.

There are three goals in storing keys: being able to spend when you want to; no one else being able to spend your coins; spending being convenient to do.

• Storing keys on local device: most convenient, but least secure (hot storage).

• Instead store on a wallet software (cold storage) where coins are stored offline

Keys can be communicated as a text string (key bits converted to base 58) or a QR code.

Cold storage doesn’t have to be online to receive coins; the hot storage just needs to know the cold address.

Hierarchical wallet: the cold side can use an unbounded number of addresses and the hot side can know these addresses, but only with a short, 1-time communication between the two sides (ECDSA)

• In key generation, generate “address generation info” and “private key generation info” instead of directly a single address and private key. The info is used to generate a sequence of addresses. There is a similar process for the private key sequence. The private keys and addresses match up.

• Address generation info doesn’t leak private key generation info.

• Private key gen info: $$k,x,y$$

• $$i$$-th secret: $$x_i=y+H(k||i)$$

• Address gen info: $$k,g^y$$

• $$i$$-th public key: $$g^{x_i}=g^{H(k||i)}*g^y$$

• $$i$$-th address: $$H(g^{x_i})$$

How to store cold information?

• Store in a device that is locked (thumb drive, laptop locked in a safe)

• Paper wallet

• Tamper resistant device

• Brain wallet: access to coins controlled simply with a passphrase. Passphrase is hashed to obtain key pair. Need to be able to turn passphrase into key pair.

• Adversary can try many passphrases, generate addresses with them, and if he finds unspent txns at those addresses he can transfer them to itself

• In practice, use a slow function to derive the private key from the passphrase so that it takes as long as possible for the attacker to try all possibilities (key stretching)

These methods of cold storage all involve a single point of failure. This can be solved using secret sharing:

• Idea: divide secret into $$N$$ pieces in such a way that any $$K$$ pieces, but no fewer, allow a reconstruction of the secret.

• We can do this with any $$N,K$$ as long as the latter is no more than the former. We always need to use a polynomial of degree $$K-1$$. Lagrange interpolation allows reconstruction of a polynomial of degree $$K-1$$ from any $$K$$ points on its curve. Even if an adversary learns $$K-1$$ shares, we are safe. We can tolerate the loss of up to $$N-K$$ shares.

• But when we bring the shares together from different devices, we recreate the single point of vulnerability. We can solve this with threshold cryptography, in which a threshold signature is produced in a decentralized fashion without ever reconstructing the private key on any single device (take a key, split it into shares, store them separately, and sign txns without reconstructing the key)

• Can also avoid the failure with multisig (split control of an address between multiple independent keys)

Ways in which you can use other people’s services to store and manage your bitcoins:

• Online wallets: your own wallet stored in the cloud, accessed with a web interface. The service can access your keys, which are ideally encrypted (but this involves trusting them not to leak your keys or your password)

• Exchanges: like a bank, the exchange promises to pay back your currency on demand. You can deposit bitcoins or fiat currency, you can make/receive payments, and you can exchange bitcoins/fiat by finding a customer who wants to do the opposite exchange as you. There are a few risks with exchanges:

1. Bank run: everyone wants their money back at once, which is a problem because of fractional reserves

2. The bank is a Ponzi scheme.

3. The bank gets hacked.

Payment services make it easy for merchants to accept, process, and exchange Bitcoin. They handle receiving the bitcoins and making deposits. They absorb all of the risk. Because of high volume, it needs to actively participate in exchange markets. Example of a Bitcoin purchase:

• User picks item on merchant website. Merchant delivers webpage to pay with bitcoin which includes txn ID and amount to pay.

• User clicks pay with bitcoin button to trigger HTTPS request to payment service, passing merchant ID, txn ID, and amount.

• Payment service initiations interaction with user, letting him know details of how to pay. User initiates bitcoin payment through their wallet

• Payment service redirects browser to merchant, passing on message that everything looks okay so far (txn broadcast to network but hasn’t received confirmations yet)

• Merchant is sent a confirmation, meaning that the payment service owes the merchant money. Merchant ships goods to user

### Ch 5: Bitcoin Mining

Miners validate every txn, build and store all the blocks, and reach a consensus on which blocks to include in the blockchain. Nodes have to perform 6 tasks in order to mine:

1. Listen for txns and validate them.

2. Maintain blockchain by listening for new blocks and validating them.

3. Assemble a candiate block.

4. Find a nonce that makes your block valid (most difficult part).

5. Hope your block is accepted.

6. Profit?

A miner compiles valid txns from his pool into a Merkle tree, up to the block size limit. He then creates a block with a header pointing to the previous block. He keeps trying different nonces looking for the one that causes the block’s hash to be under the target ($$2^{32}$$ possibilities).

• If none of the tried nonce values work, the miner increments the nonce in the coinbase txn and then once again starts searching nonces in the block header (no better strategy then brute force).

• Changing the coinbase nonce changes the entire Merkle tree, so this is an expensive operation.

No two miners work on the same puzzle unless they share a public key, which would only happen if they are in the same mining pool.

The mining difficulty changes every 2016 blocks (~every 2 weeks) and is adjusted based on how efficient the miners were over the period of the previous 2016 blocks.

• Generally, as more miners come online and mining hardware gets more efficient, blocks are found faster and the difficulty is increased so that it always takes about ten minutes to find a block.

• But within the 2 week period, the time to find a block would be expected to decrease.

CPU mining: Miners simply ran a small piece of code on a general purpose CPU to search over nonces linearly. This had the speed of 20 million hashes per second, which meant several hundred thousand years at the $$2^{67}$$ difficulty level to find a valid block.

GPU mining: Miners began to mine with their graphics card with a language called OpenCL.

• Graphics cards are designed for parallelism so they have many Arithmetic Logic Units (ALUs) that can be used for simultaneous SHA-256 computations.

• They can be overclocked, which is profitable since the few errors from overheating can be more than offset by running the chip faster.

• One motherboard/CPU running the node can drive many graphics cards.

• Disadvantages: draws large amount of power, and doesn’t have good cooling. Also, not fast enough (on average 200 million hashes per second, so 300 years to find a block at the $$2^{67}$$difficulty level).

FPGA mining: Field Programmable Gate arrays. At one billion hashes per second, only a marginal improvement over GPUs.

ASIC mining: Bitcoin mining is dominated by application specific integrated circuits, chips designed specifically for Bitcoin. But the cost of electricity and cooling means individual miners are unlikely to profit.

Professional mining: Centers buy ASICs newer and more efficient than the ones available to the public.

• Future considerations: Is there a way to re-incorporate small miners? Does professional mining violate Satoshi’s original vision?

Landauer’s principle: says any non-reversible computation (ie those that lose information) uses a minimum amount of energy. Flipping a bit in a non-reversible way requires using a minimum number of joules. SHA-256, being non-reversible, leads to a lot of energy consumption.

A Poisson distribution ($$N$$ independent trials each with a $$\lambda/N$$ chance of success as $$N$$ approaches infinity) approximates the expected number of blocks found by a miner.

• There is high variance, as it is very likely to find no blocks or 1 block, and less likely to find more blocks in a year. This means there is high uncertainty.

• Mining pools: mutual insurance for miners. A group of miners attempt to mine a block with a designated coinbase recipient (pool manager) who receives the rewards and distributes them to the pool. The pool manager knows each person’s contribution to the pool with mining shares (ie nearly valid blocks) to prove they are indeed working.

• In 2014, one pool (Ghash dot io) got so big that it had a majority of the entire network capacity. This sort of a situation is still a concern.

• Forking attack: Malicious miner sends txn to Bob and receives some good (Bob even waits for 6 confirmations). Miner forks the chain to create a longer branch with a double spend. The fork can become the consensus chain if the miner has a majority of the hash power.

• Temporary block-withholding attack: You find a block, but don’t disclose it to the network immediately. You do some more mining in the hopes of finding two blocks in a row before the rest of the network finds even one.If you’re ahead by two blocks, all the mining effort of the other miners will be wasted (selfish mining). If another miner announces a block while you look for a second block, you announce the first block and the network must decide which one to mine on. This involves a race, since you want other miners to hear about your block first. With some assumptions, this is actually an improvement over the default strategy.

### Ch 6: Anonymity Basics

Is Bitcoin anonymous? Do we want a cryptocurrency that is truly anonymous?

Bitcoin only achieves pseudonymity. If something has both pseudonymity and unlinkability (different interactions should not be able to be tied to each other) then it has anonymity.

• Bitcoin’s pseudonymity is usually not enough.

• A given address’ entire txn history can be looked up by anyone.

• Businesses will often want real life info such as a shipping address.

• Even if you send a txn through a circuitous route of txns in order to obscure the ultimate recipient, it is possible to infer something from the fact that $$x$$ bitcoins left one address and almost the same amount of bitcoins ended up at another address.

Instead of desiring complete unlinkability, we only try to maximize a txn’s anonymity set, which is the set of txns which an adversary cannot distinguish from the txn.

• Distinguishing between “good” uses of anonymity and “bad” uses of anonymity are impossible for miners to accomplish.

• Shared spending is evidence of joint control. If Alice has bitcoins at three separate addresses and combines them as inputs into a single txn, then anyone can infer that all of those inputs are under the control of the same person. In this way an adversary can find clusters of addresses.

Deanonymization techniques:

• Transaction graph analysis (hard to solve): analyzing graphs of txns in the blockchain.

• Clustering addresses: transact with a person, identify one of their addresses, and then tag their entire cluster of addresses.

• Network-layer deanonymization (easy to solve): If sufficiently many nodes on the network collude, they can figure out the first node to broadcast any txn (presumably a node run by the user who created the txn) and then link the txn to the node’s IP address.

• Easy to solve with Tor + some tweaks.

Mixing is a way to counter transaction graph analysis by using an intermediary to swap bitcoins.

• Online wallets can be mixers but they also maintain records that link your deposit to your withdrawal and also may record your real identity.

• Dedicated mixers are services where you send bitcoins to a provided address and tell them the destination address. The mixer sends other bitcoins to the specified address.

• Guidelines for improving anonymity and security of mixes:

• Use a series of mixers.

• All mixers should agree on a fixedsize for incoming mix transactions so that all txns going through any mix would look identical.

• Client side should be automated.

• Mixing fees should be all-or-nothing and applied probabilistically so that either the whole chunk is swallowed with a small probability or it is returned in its entirety.

• Unfortunately, no robust mix ecosystem exists.

• Coinjoin is an example of a decentralized mixer. It uses a P2P protocol by which a group of users can mix their coins (no theft possible):

1. Find peers who want to mix.

2. Exchange input/output addresses with peer group in such a way that even the peers do not know the input/output mappings (use anonymous communication protocol s.a. Tor).

3. Construct the txn and collect signatures.

• One user constructs a txn corresponding to these inputs and outputs.

• The unsigned txn is passed around with each user verifying correctness of input/output addresses and then signing.

• A user can refuse to sign, preventing the protocol from completing (denial of service attack)

Zerocoin and Zerocash incorporate mixing at the protocol level so that you don’t need to trust mixes, peers, intermediaries, or even miners. They are not compatible with Bitcoin. To understand Zerocoin, first consider Basecoin which is a currency that uses Zerocoin as a trading mechanism.

• A zerocoin can be redeemed for a new basecoin.

• Zerocoins are minted using a commitment by generating a serial number$$s$$ and random secret $$r$$, computing $$Commit(s,r)$$, publishing the commitment, and thus burning a basecoin to create a zerocoin.

• To spend zerocoin and redeem basecoin, use a zero-knowledge proof to thus avoid revealing the serial or secret.

• With a ZKP, you can prove a statement without revealing any additional information that leads to that statement being true. E.g. you can prove the statement "I know $$r$$ s.t. $$Commit(S,r)$$ is one of the commitments on the chain" without revealing the secret.

• Create spend txn containing S and the ZKP

• Miners verify ZKP

• Miners check that S has never been used before (to avoid double spend)

• Output of spend txn is new basecoin

• There is a one-time setup of public parameter$$N$$, where $$N=pq$$ and $$p,q$$ are large primes. The public parameter is used by everyone for lifetime of the system.

Zerocash builds on Zerocoin and uses zk-SNARK which is a way of making ZKP verification more efficient so that there is no need for basecoin. As a result, txn amounts are no longer inside the commitments or visible on the chain. The ledger only records existence of txns and proofs, not addresses and not values.

### Ch 7: Community, Politics, and Regulation

The genius of Bitcoin is consolidating consensus about rules, about history, and that coins are valuable.

Bitcoin Core is the most widely used Bitcoin software. Substaintial changes go through a process called Bitcoin Improvement Proposals. Bitcoin Core developers, (as of 2015, are Gavin Andresen, Jeff Garzik, Greg Maxwell, J van der Laan, Pieter Wuille. If the lead developers behave badly, the community could choose to fork the software.

Stakeholders in Bitcoin:

• Core developers: write the rulebook.

• Miners: write history and decide on txn validity.

• Investors: decide on value.

• Merchants and customers: generate demand.

• Payment services: handle txns.

• Advocacy groups: fund developers and talk to governments.

Roots of Bitcoin:

• Cypherpunks: libertarian movement surrounding public-key cryptography that believed in strong privacy.

• Chaum's digital cash.

• Satoshi: released Bitcoin whitepaper in 2008, was active until mid-2010 when he handed over control to other developers, and mined a huge number of bitcoins that have never been moved.

### Ch 8: Alternative Mining Puzzles

Bitcoin’s mining puzzle is a partial hash-preimage puzzle.

Essential puzzle requirements:

• quick to verify
• progress-freeness: chance of winning in a unit of time should be proportional to the hash power used (should be memoryless)

ASIC-resistance means a puzzle design for which existing general purpose CPUs are already the cheapest and most efficient devices. This is impossible, so we just want to reduce the gap between ASICs and general purpose CPUs.

• Memory-hard puzzles require large amounts of memory. Scrypt is a memory-hard hash function used in Litecoin, but there are Scrypt ASICs available.

• Any altcoin with a new mining puzzle is likely to have an initial “ASIC honeymoon” phase in which GPU/FGPA mining will be profitable because the large upfront costs of designing ASICs will deter manufacturers from making an investment on new, unproven cryptocurrencies.

Virtual mining: Rather than spending money on power and equipment to prove who has invested the most in mining, simply allocate mining power directly to all currency holders in proportion to how much currency they hold.

• Pros:

• Removes the wasteful aspect of mining where the miner purchases and consumes equipment and electricity.

• May reduce trend towards centralization.

• All miners will be stakeholders and will thus have incentive to do things that would benefit the system as a whole(PoS).

• Nothing-at-stake problem: a miner launching a failed attack by creating a fork will not have the opportunity cost of earning mining rewards because the miner can mine the longest chain while simultaneously trying to create a fork.

### Ch 9: Bitcoin as a Platform

Bitcoin as an append-only-log:

• Secure timestamping: To prove we know $$x$$ at time $$T$$, we publish $$H(r||x)$$ at time $$T$$, and at some later point reveal $$r,x$$.

• Do this with a OP_RETURN <H(data)> txn.

• Illicit content: people have published links to illicit content on the chain to make it dangerous to downlaod the chain and run a full node.

• Overlay currency: use Bitcoin as it exists today and write all data we need for a new currency directly onto the Bitcoin blockchain.

Bitcoin as smart property:

• This is possible due to non-fungibility.

• Colored coins: “Stamp” some bitcoins with a “color” (ie add some metadata) to be able to track them as they change hands.

• Can apply colored coins to owning stock in a company, to claims on real-world property, or to DNS (Namecoin).

Secure multi-party lotteries:

• Parties each pick a large random string, publish the hash, check that the hashes are distinct, reveal the string, check that the revealed strings match the hashes, and then compute the final outcome (according to the protocol).

• Fairness: we want all the parties to be forced to reveal their commitment within some time limit. Instead of hash commitments, use timed hash commitments so that a txn can't be claimed before the time specified in the nLockTime parameter.

Bitcoin as public randomness source:

• Cryptographic beacon: a service that provides a public source of randomness by continuisly publishing new random data at a regular rate.

• Sunspots and cosmic background radiation are very slow and require unrealistic measurement precision for all observers to get the same value, and stock prices involve trusting an exchange.

• Perhaps no one can predict or influence what the next block hash will be withut actually mining it. The first few bits will be zero, but the remaining bits are only predictable by finding a winning block. So we apply a randomness extractor to the value of the block header for every block in the chain.

### Ch 10: Altcoins and the Cryptocurrency Ecosystem

Every altcoin needs a reason to exist. In the simplest case, the altcoin just changes some of the parameters of Bitcoin.

• Launching an altcoin: create a new reference client (typically by forking an existing code base), bootstrap adoption of the coin, and attract stakeholders (miners especially important in order to have enough hash power to prevent double-spending and forks).

• Pump-and-dump: exaggerate coin’s potential and drum up interest, then sell.

• Initial allocation: there are alternatives to Bitcoin mining.

• Developers may “pre-mine” the currency or set up a currency “pre-sale”

• One motivation for alternatives is to avoid the centralization of mining that is present in Bitcoin.

• Proof-of-burn: users claim units of altcoin in proportion to a quantity of bitcoins they provably destroy. Called a “one-way peg” or “price ceiling”, and ensures an altcoin will be worth at most one bitcoin.

• Can compare altcoins in terms of market capitalization and mining power.

• Altcoin infanticide: Very powerful miners can easily carry out an attack against a small altcoin, causing forks and enough havoc to kill it.

Namecoin: Decentralized version of DNS.

• Data entries are name/value pairs with unique names (except for updates).

• Only the user who created a name entry can update that name. Users can transfer control of names to others.

• Rationale: existing DNS puts too much control over a critical component of Internet into the hands of a single entity.

Litecoin: Payment system launched shortly after Namecoin that has been forked more times than even Bitcoin.

• Has a memory-hard mining puzzle in order to have GPU resistance (you could mine on a CPU when you needed a GPU for Bitcoin), but this has transitioned to ASICs by now (so this was a failed goal).

• Has some minor parameter changes (e.g. blocks arrive every 2.5 minutes).

Peercoin: First altcoin to use PoS mining.

• Administrators have a trusted public key to use for assigning “blessed” blocks every so often. This means it’s not really decentralized.

Dogecoin: A close fork of Litecoin distinguished not by tech but by community values of generosity and humor.

Merge mining: mining on multiple currencies simultaneously, without dividing mining resources. Leads to greater mining centralization in altcoins.

Cross-chain atomic swaps: exchanging between quantities of different coins, despite txns one coin being entirely independent of a txn on another coin. This involves commitments and time-locked deposits, but it’s also complex and doesn’t prevent denial of service.

Side chains: altcoins whose prices are bilaterally pegged to the price of Bitcoin. They use Bitcoin as a sort of reserve currency to remove volatility during the bootstrapping phase. The naive way to do this is encode all the sidechain’s rules into Bitcoin. A better way is SPV proofs:

• SPV nodes are used by lightweight clients to verify block headers, just to look for evidence that the txn they care about is in the longest branch, valid or not, and that it has received some confirmations.

• The idea is to extend Bitcoin’s script with an instruction to verify a proof that a txn happened in the sidechain. The nodes doing the verification would be fully validating as far as Bitcoin is concerned, but they would also be doing lightweight SPV verification for the sidechain.

• But Bitcoin nodes would still have to connect to the sidechain’s network and track all the sidechain block headers.

• We want that when a txn tries to convert from sidechain to Bitcoin, it contains all the information needed by Bitcoin nodes to verify its legitimacy.

• To reference a sidechain txn in Bitcoin, a user provides proof of 1) inclusion of the sidechain txn in a sidechain block, and 2) sidechain block headers showing that this block has received a certain number of confirmations (PoW).

• Bitcoin nodes verify these claims, and wait to allow other users to present evidence that the presented headers are not on the longest branch. If such evidence is presented, the txn is not accepted.

Bitcoin’s scripting has a small instruction set that isn’t Turing complete, requiring new new altcoins for each new application-specific functionality. Ethereum is a single altcoin that aims to provide a Turing-complete programming language for writing scripts or “contracts”.

• An Ethereum contract is a program uploaded to the blockchain for a small fee and is executed by Ethereum’s virtual machine.

• Solidity is the high-level language.

• Ethereum uses gas to prevent infinite loops. Executing each virtual machine instruction costs some gas.

• Gas is paid for by Ethereum’s currency ether. Every txn specifies the gas price, ie how much ether it will pay per unit of gas consumed (similar to Bitcoin’s txn fee).

• Also, every call specifies how much gas it is willing to spend (gas limit). If limit is reched, execution halts, all changes to program state are undone, and the miner pockets the gas anyway.

• Whereas Bitcoin has a txn-based ledger, Ethereum has an account-based ledger.

• Every block contains a digest of the current state (balance and txn count) of every address as well as the state (balance and storage) of every contract ($$2^{264}$$ bytes of storage).

### Ch 11: Decentralized Institutions: The Future of Bitcoin?

The key difference between Bitcoin and the previous attempts at digital cash is decentralization. The core innovation of Bitcoin that enables decentralization is the blockchain.

Can we decentralize everything? We need to ask four questions:

1. What is being decentralized?

2. What is the level of decentralization?

3. What blockchain is deployed?

4. What security mechanism does it use?