Bitcoin Basic Research

Whitepaper

Some questions from my reading of the original bitcoin whitepaper.

It seems like the bitcoin distributed network is very dependent on sequencing of events: ordering + linearizability. For example, the proof-of-work concept + “work” on a block. Once the work has been completed to satisfy the proof-of-work for that block, the block cannot be changed without redoing the work. Later blocks are chained on after it, so redoing that block would also mean redoing all of the blocks that come after it. How in the world is it possible to make this concurrent or parallel? If I start work on a block, how do I know no one else is working on that block? Does the rest of the network have to wait for me to be done with that block?
- The system is highly competitive: the hash of the most recent block acts as an input to the Proof of Work challenge for the next (currently unsolved/non-existent) block. Every miner in the system is free to build on that most recent block, and they all work on the next block in parallel. Nobody waits for anyone else; whoever finds a valid proof of work first broadcasts their block.
- Follow-up question: okay, so the PoW is parallelized and multiple miners can be working on the next block in parallel, of which only one will “win”. But is the acceptance of new blocks synchronous? Presumably there can only be one new block added at a time, right?
  - Yes, but even the acceptance/verification process is parallelized: every node verifies incoming blocks independently. If two conflicting blocks are found at around the same time, the chain temporarily forks, and each node keeps working on whichever block it received first. These temporary conflicts (“forks” in the blockchain) are resolved with the “longest chain wins” rule: as soon as one branch is extended and becomes longer, nodes switch to it and the shorter branch is abandoned. The result is an eventually linearizable blockchain. Kind of like eventual consistency, but eventual linearizability.
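A rough sketch of the fork-resolution rule, to make it concrete. The `Block` class and function names here are made up for illustration and are not Bitcoin Core's data structures. One assumption worth flagging: in practice nodes measure "longest" by cumulative proof-of-work rather than raw block count, so that is what this sketch compares. When every block has the same difficulty, the two measures agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical minimal block: just a label and its difficulty target."""
    name: str
    target: int  # lower target = more work needed to find this block


def block_work(block: Block) -> int:
    # Expected number of hash attempts needed to land at or below the target.
    return 2**256 // (block.target + 1)


def chain_work(chain: list[Block]) -> int:
    # Total work behind a branch is the sum of the work of its blocks.
    return sum(block_work(b) for b in chain)


def best_chain(chains: list[list[Block]]) -> list[Block]:
    # max() returns the first maximal element, so on a tie the branch
    # listed (seen) first stays selected.
    return max(chains, key=chain_work)


T = 2**240  # arbitrary toy target shared by all blocks
genesis = Block("genesis", T)
branch_a = [genesis, Block("a1", T)]
branch_b = [genesis, Block("b1", T), Block("b2", T)]

winner = best_chain([branch_a, branch_b])
print([b.name for b in winner])  # ['genesis', 'b1', 'b2']
```

Both branches are valid while the fork lasts; branch_b wins only because it has more work behind it once `b2` is found.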
Does the original conception of “honest nodes” and “malicious nodes” still apply? Mainly in regard to the idea that in order to spoof a transaction or compromise the blockchain, a malicious actor would need to procure enough compute power to outpace all of the “honest” nodes. It seems like it would be trivial in today’s age of GPU compute, cloud, and extremely fast benchmarks for some modern day languages like Rust.
- This is 100% still relevant today, and has kind of stood the test of time even as the bitcoin network has evolved. It’s theoretically possible that an attacker could compromise the blockchain with something known as a “51% attack”, in which they control a majority of the network’s hash power (compute, not node count) and use it to rewrite or manipulate the blockchain. However, in practice, we can consider that extremely unlikely due to the following reasons:
  1. Hardware: Bitcoin mining has gone from relying on CPUs, to GPUs, to very specialized purpose-built hardware called Application-Specific Integrated Circuits (ASICs). Without bitcoin-mining-specific ASICs as your hardware, it’s essentially impossible to keep up. There’s a limited amount of these in the world and it’d be infeasible to procure enough to perform a 51% attack, let alone procure enough electricity to run them.
  2. Network Scale: See point one re: hardware + electricity limitations, with the additional note that a 51% attack would need to out-hash the combined compute of every honest miner in the network. The network is also independently verified by a large number of nodes, somewhere between 11k-20k.
  3. Economic Disincentives: The design of Bitcoin is all about incentives, and that’s also baked into its security model. Acting maliciously, such as attempting a 51% attack, would undermine the network’s integrity and, consequently, the value of the attacker’s investment and mined Bitcoins. Thus, it’s in miners’ best interest to support the network’s stability and security.
- While 51% attacks on the bitcoin network are infeasible in practice, the same cannot be said of smaller altcoins, in which a combination of small network scale + hardware cloud rentals have led to successful attacks.
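To put numbers on "outpace the honest nodes": section 11 of the whitepaper includes a short C function, `AttackerSuccessProbability`, for the chance that an attacker with a fraction `q` of the hash power ever catches up from `z` blocks behind. Below is a straight Python port of it. The only thing added here (my assumption, not in the whitepaper's code) is the early return for `q >= 0.5`, since the formula only applies when the attacker has a minority.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability an attacker with hash-power share q catches up from z blocks behind.

    Port of AttackerSuccessProbability from section 11 of the whitepaper.
    """
    if q >= 0.5:
        # Added guard: with half or more of the hash power the attacker
        # is expected to catch up eventually.
        return 1.0
    p = 1.0 - q          # honest share of the hash power
    lam = z * (q / p)    # expected attacker progress while honest nodes mine z blocks
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam)
        for i in range(1, k + 1):
            poisson *= lam / i
        total -= poisson * (1 - (q / p) ** (z - k))
    return total


for z in (0, 1, 5):
    print(z, round(attacker_success_probability(0.1, z), 7))
```

The whitepaper's own table lists P=0.2045873 for q=0.1, z=1 and P=0.0009137 for q=0.1, z=5, so the odds drop off very quickly with each confirmation as long as the attacker stays in the minority.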
How many nodes are there on the network?
- Hard to say for sure, but it’s probably between 11k and 20k. Can check Bitnodes for an estimate.
What is the mempool and who manages it?
- Every node in the bitcoin network listens for new unconfirmed transactions being broadcast. These are transactions that nobody has confirmed yet; they sit waiting to be baked into a block. There is no single copy of the mempool. Each node maintains its own “copy”, which it updates as new transactions arrive, so the contents can differ from node to node.
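A toy model of the "every node has its own mempool" idea. Everything here (the `Mempool` class, its methods, the byte strings standing in for transactions) is hypothetical and far simpler than a real node, which also validates transactions, enforces fee and size policies, and evicts old entries.

```python
import hashlib


class Mempool:
    """Hypothetical per-node pool of unconfirmed transactions, keyed by a toy id."""

    def __init__(self) -> None:
        self.txs: dict[str, bytes] = {}

    def add(self, raw_tx: bytes) -> str:
        # Toy id: double SHA-256 of the raw bytes, hex-encoded.
        txid = hashlib.sha256(hashlib.sha256(raw_tx).digest()).hexdigest()
        self.txs.setdefault(txid, raw_tx)  # ignore duplicates
        return txid

    def remove_confirmed(self, txids: list[str]) -> None:
        # Once a block confirms these transactions they leave the pool.
        for txid in txids:
            self.txs.pop(txid, None)


node_a, node_b = Mempool(), Mempool()
tx1 = node_a.add(b"alice pays bob")
node_b.add(b"alice pays bob")
node_b.add(b"carol pays dave")  # node_a has not heard about this one yet

print(len(node_a.txs), len(node_b.txs))  # 1 2

node_b.remove_confirmed([tx1])  # a block containing tx1 arrives at node_b
print(len(node_b.txs))  # 1
```

The two pools disagree for a while and nothing breaks, because the mempool is only a waiting room; the blockchain is the part that has to converge.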
How does the network verify completed blocks? Is that somehow also included in the PoW challenge?
What is the network’s difficulty target? How is it decided?
- The difficulty target is a threshold that a new block’s hash must fall below for the block to be accepted by the network. The lower the target, the harder it is to meet: lower = more difficult. Interestingly, the difficulty target is re-calculated every 2016 blocks, which is 2ish weeks. The algorithm looks at how long the last 2016 blocks actually took to mine and scales the target up or down accordingly, with the goal of maintaining a frequency of one new block every 10 minutes. So, one block every 10 minutes for 2 weeks yields 2016 blocks (6 per hour * 24 hours * 14 days). There’s no one “authoritative” source of the difficulty target. Instead, it’s an algorithm that any node on the network can run.
- Also, the hash of a new block is not predictable. The hash function itself is deterministic (same input, same output), but there is no way to know the output without computing it. The hash depends on the transactions in the block, a nonce chosen by the miner, and the previous block’s hash. Miners essentially guess different nonce values (among other potential variations in the block) until they find a combination that produces a valid hash. This process is highly parallelizable, and the chance of finding a valid hash increases with the amount of computational power a miner can apply to the problem.
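A minimal sketch of both ideas above: the retargeting arithmetic and the nonce-guessing loop. Assumptions to flag: the function names are mine, the "header" is just the previous hash + some transaction bytes + a nonce rather than the real 80-byte header, and the retarget works on a plain integer target instead of the compact encoding real nodes use. The factor-of-4 clamp on the adjustment mirrors what Bitcoin's retargeting does, but treat the rest as a toy.

```python
import hashlib

BLOCKS_PER_RETARGET = 2016
TARGET_BLOCK_SECONDS = 10 * 60
EXPECTED_TIMESPAN = BLOCKS_PER_RETARGET * TARGET_BLOCK_SECONDS  # 1,209,600 s = 2 weeks


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by how fast or slow the last 2016 blocks were mined."""
    # Limit a single adjustment to at most 4x in either direction.
    clamped = max(EXPECTED_TIMESPAN // 4, min(actual_timespan, EXPECTED_TIMESPAN * 4))
    # Blocks came too fast -> timespan is short -> target shrinks -> harder.
    return old_target * clamped // EXPECTED_TIMESPAN


def block_hash(prev_hash: bytes, tx_data: bytes, nonce: int) -> bytes:
    # Toy header layout (hypothetical): previous hash + transaction bytes + nonce.
    header = prev_hash + tx_data + nonce.to_bytes(4, "little")
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def mine(prev_hash: bytes, tx_data: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until the hash, read as an integer, is below the target."""
    for nonce in range(max_nonce):
        digest = block_hash(prev_hash, tx_data, nonce)
        if int.from_bytes(digest, "little") < target:
            return nonce, digest
    return None  # nonce space exhausted; a real miner would vary the block instead


easy_target = 2**244  # roughly 1 in 4096 hashes qualifies, so this finishes quickly
found = mine(b"\x00" * 32, b"alice pays bob", easy_target)
print(found is not None)

# If the last 2016 blocks took one week instead of two, the target halves.
print(retarget(easy_target, EXPECTED_TIMESPAN // 2) == easy_target // 2)  # True
```

Each nonce attempt is independent of every other, which is why the search splits cleanly across machines: two miners can take different nonce ranges (or different transaction sets) and never need to coordinate.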
I’m a bit confused about how nodes “interact” with the “network”. For example, when an unconfirmed transaction is broadcast to the network, how does the client publishing that message make it available to the network at large? Is there a network interface? Is there a protocol? Does the client query one single node in the network and take that as the source of truth for the network at large? The broadcast of unconfirmed transactions is just one example; I’m curious to learn how nodes interact with the network/query the network, and what protocol is used. HTTP? Is there a URL or something they can use to connect to the network?
- Remember that Bitcoin is a peer-to-peer network. This is fundamentally different from the client-server model that HTTP and URLs are built around. Instead, Bitcoin nodes use a custom protocol on top of TCP to ferry binary messages between each other.
- There are, however, DNS seed servers that will direct you to a node or nodes. A node “joins” the network by either querying the DNS seed servers for another node’s IP address or by connecting first to a known node, e.g., a hardcoded IP address.
- A node may maintain many connections to multiple nodes on the network for redundancy/resiliency if one node drops.
- Bitcoin nodes do not use URLs or HTTP to communicate. Instead, they use IPs and ports to identify each other and ferry payloads between nodes using TCP, as mentioned. The default port for Bitcoin is 8333.
- All nodes must independently verify transactions and blocks, and never simply trust what a peer tells them. A node may receive conflicting information from the multiple nodes it’s connected to, so it checks everything against the consensus rules itself before accepting or rejecting it. This means that even if a node receives incorrect or fraudulent information, it will not accept or relay this information if it violates the rules.
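To show what "ferrying binary over TCP" looks like, here is a sketch of how a message is framed in the Bitcoin P2P protocol: a 24-byte header (4-byte network magic, 12-byte null-padded command name, 4-byte little-endian payload length, 4-byte checksum taken from the double SHA-256 of the payload) followed by the payload. The helper names `build_message` and `parse_header` are mine. The sketch only builds and parses bytes; it does not open a socket or perform the version handshake a real node needs.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # marks the start of a mainnet message


def checksum(payload: bytes) -> bytes:
    # First 4 bytes of SHA-256(SHA-256(payload)).
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def build_message(command: str, payload: bytes = b"") -> bytes:
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command name must fit in 12 bytes")
    header = (
        MAINNET_MAGIC
        + name.ljust(12, b"\x00")          # command, padded with null bytes
        + struct.pack("<I", len(payload))  # payload length, little-endian uint32
        + checksum(payload)
    )
    return header + payload


def parse_header(data: bytes) -> tuple[bytes, str, int, bytes]:
    magic, name, length, check = struct.unpack("<4s12sI4s", data[:24])
    return magic, name.rstrip(b"\x00").decode("ascii"), length, check


msg = build_message("verack")  # an acknowledgement message with an empty payload
print(msg.hex())
print(parse_header(msg))
```

A node would write bytes like these to a TCP connection on a peer's IP and port (8333 by default), and the peer would use the checksum and its own validation rules to decide whether to act on the message.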

Last modified: February 19, 2024