Forge Your Own Blockchain: A First-Principles Python Cookbook

Abstract

A blockchain is not a mystical artefact. It is four well-understood cryptographic primitives - a collision-resistant hash, a digital signature scheme, a Merkle tree, and a proof-of-work - composed in a specific order and bolted onto a tiny append-only data structure. This cookbook builds a working one in roughly four hundred lines of Python, using only the standard library and one well-vetted third-party package for elliptic-curve signing.^{1Python Cryptographic Authority. cryptography: Python library exposing cryptographic recipes and primitives. https://cryptography.io/}

The intent is pedagogical, not production-grade. The chain we build runs on a single process, has no peer-to-peer network, no consensus protocol, no fee market, and no smart contract layer. What it does have is every floor-level property that any production blockchain inherits: tamper-evident block linkage, cryptographically authenticated writes, efficient membership proofs, and a quantifiable cost to rewrite history. Once the floor is concrete, the rest of the field - Bitcoin's UTXO model, Ethereum's account model, modern proof-of-stake, zero-knowledge rollups - becomes a series of engineering choices on top of primitives you understand from the inside.

How To Use This Cookbook

Read it linearly. Each section introduces one primitive, shows the relevant code, and then composes that primitive into the next layer. The full source file forge.py is reproduced section by section as the prose develops; by the end of section six you will have transcribed the entire implementation. Three later sections run the code three different ways - a build-and-validate demo, a tamper attack, and a difficulty study - and report numbers measured on a real machine.

Prerequisites are modest. A reader comfortable with Python at the level of "I can write a class and use a dictionary" will not be lost. No prior cryptography knowledge is assumed. We treat SHA-256, ECDSA, and elliptic curves as black boxes whose specifications and security arguments are cited rather than re-derived.^{2National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. August 2015. https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf}^{3Standards for Efficient Cryptography Group. SEC 2: Recommended Elliptic Curve Domain Parameters, Version 2.0. January 2010. https://www.secg.org/sec2-v2.pdf}

The Four Primitives

The whole construction reduces to four ideas.

Primitive	What It Does	Specification
Cryptographic hash	Reduces arbitrary data to a 32-byte fingerprint that is infeasible to forge a collision for	NIST FIPS 180-4 (SHA-256)^{2National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. August 2015. https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf}
Digital signature	Lets the holder of a private key produce a 64-byte tag that anyone with the matching public key can verify	SEC 2 v2.0 (secp256k1) + ECDSA^{3Standards for Efficient Cryptography Group. SEC 2: Recommended Elliptic Curve Domain Parameters, Version 2.0. January 2010. https://www.secg.org/sec2-v2.pdf}
Merkle tree	Compresses a set of items into one root hash, with O(log N) membership proofs	Merkle 1988^{4Merkle, Ralph C. A Digital Signature Based on a Conventional Encryption Function. In Advances in Cryptology - CRYPTO '87, LNCS vol. 293, pp. 369–378. Springer, 1988. https://doi.org/10.1007/3-540-48184-2_32}
Proof-of-work	Turns compute into a tunable cost so writes cannot be silently rewritten	Nakamoto 2008, §4^{5Nakamoto, Satoshi. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. https://bitcoin.org/bitcoin.pdf}

Every other property of a blockchain - immutability, distributed trust, censorship resistance, ordering guarantees - is a consequence of how these four are arranged. We will introduce each one, write code for it, then compose them.

Setup

The implementation depends on Python 3.10 or later and one third-party package, cryptography, which provides hardened bindings to OpenSSL's elliptic-curve routines.^{1Python Cryptographic Authority. cryptography: Python library exposing cryptographic recipes and primitives. https://cryptography.io/} Everything else comes from the standard library.

python3 -m venv .venv
source .venv/bin/activate
pip install cryptography

Confirm the install:

python -c "import cryptography; print(cryptography.__version__)"

You should see a version string of 42.x or newer. The cookbook was developed against 47.0.0 on Python 3.14.

Create one file, forge.py, and follow along. We open with the imports and two small helpers.

from __future__ import annotations

import hashlib
import json
import secrets
import sys
import time
from dataclasses import dataclass, field

from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.asymmetric.utils import (
    decode_dss_signature,
    encode_dss_signature,
)
from cryptography.exceptions import InvalidSignature


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def hex_(b: bytes) -> str:
    return b.hex()

sha256 is wrapped because we will call it on dozens of different inputs and the wrapper makes the resulting code read like prose. That is the entire setup.

Primitive 1: Cryptographic Hash

A cryptographic hash function takes arbitrary bytes and returns a fixed-length digest with three properties: it is fast to compute forward, infeasible to invert, and infeasible to find two distinct inputs that hash to the same digest. The function we use, SHA-256, returns 32 bytes (256 bits). The best published collision attack against the full algorithm is the generic birthday bound at roughly 2¹²⁸ work, which is not practical with any technology that exists or is on the horizon.^{2National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. August 2015. https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf}

For our purposes the hash function is the one tool that makes everything else possible. We will use it as a fingerprint for transactions, as the leaf and internal node operation in our Merkle tree, as the linkage between blocks, and as the puzzle in proof-of-work. The Python wrapper above is all the implementation we need; the primitive itself is delegated to hashlib, which links against OpenSSL's FIPS 180-4-conformant SHA-256.^{2National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. August 2015. https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf}

A worked example. Two strings differing in one character produce digests that share almost no structure:

>>> sha256(b"battery PACK-001 SoH 0.94").hex()
'af3b0c…'   # placeholder prefix, not the real digest; SHA-256 is deterministic, so the real output is the same on every machine
>>> sha256(b"battery PACK-001 SoH 0.95").hex()
'2d71e8…'

This avalanche property is what gives us tamper detection: changing a single byte anywhere in a block produces a completely different block hash, which breaks the linkage that the next block depends on.
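To put a number on the avalanche effect, the sketch below counts how many of the 256 output bits differ between the digests of the two strings used above. The helper name bit_difference is an illustrative assumption and is not part of forge.py; for a well-behaved hash the count should land somewhere near 128, half of the output bits.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # XOR leaves a 1 wherever the inputs disagree; count those 1 bits.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


d1 = hashlib.sha256(b"battery PACK-001 SoH 0.94").digest()
d2 = hashlib.sha256(b"battery PACK-001 SoH 0.95").digest()

differing = bit_difference(d1, d2)
print(f"{differing} of {len(d1) * 8} bits differ")
```

The two inputs differ in only a couple of bits, yet roughly half of the digest bits flip. That is the behaviour tamper detection relies on.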

Primitive 2: Digital Signatures

A signature scheme is what lets us say "this transaction came from a specific identity, and no one without that identity's private key could have produced this signature." We use ECDSA (Elliptic Curve Digital Signature Algorithm) over the secp256k1 curve, which is the same curve Bitcoin and Ethereum use.^{3Standards for Efficient Cryptography Group. SEC 2: Recommended Elliptic Curve Domain Parameters, Version 2.0. January 2010. https://www.secg.org/sec2-v2.pdf} secp256k1 is not the most modern choice - Ed25519 is faster, smaller, and deterministic by construction^{6Josefsson, S. and Liusvaara, I. Edwards-Curve Digital Signature Algorithm (EdDSA). RFC 8032, January 2017. https://www.rfc-editor.org/rfc/rfc8032} - but secp256k1 is what Bitcoin, Ethereum, and the many chains derived from them run on, so a reader who graduates from this cookbook into that tooling will see the same curve. The tradeoff is worth naming once and then moving on.

A keypair is one 32-byte private scalar and one curve point that serialises to 33 bytes in compressed form. A signature is a pair of integers (r, s) modulo the curve order, which we serialise as 64 raw bytes.

CURVE = ec.SECP256K1()


def new_keypair() -> ec.EllipticCurvePrivateKey:
    return ec.generate_private_key(CURVE)


def public_bytes(key: ec.EllipticCurvePrivateKey) -> bytes:
    return key.public_key().public_bytes(
        encoding=serialization.Encoding.X962,
        format=serialization.PublicFormat.CompressedPoint,
    )


def public_key_from_bytes(b: bytes) -> ec.EllipticCurvePublicKey:
    return ec.EllipticCurvePublicKey.from_encoded_point(CURVE, b)


def sign(key: ec.EllipticCurvePrivateKey, message: bytes) -> bytes:
    der = key.sign(message, ec.ECDSA(hashes.SHA256()))
    r, s = decode_dss_signature(der)
    return r.to_bytes(32, "big") + s.to_bytes(32, "big")


def verify(pub_bytes: bytes, message: bytes, signature: bytes) -> bool:
    if len(signature) != 64:
        return False
    r = int.from_bytes(signature[:32], "big")
    s = int.from_bytes(signature[32:], "big")
    der = encode_dss_signature(r, s)
    try:
        # from_encoded_point raises ValueError on a malformed public key;
        # treat that as a failed verification rather than letting it propagate.
        public_key_from_bytes(pub_bytes).verify(der, message, ec.ECDSA(hashes.SHA256()))
        return True
    except (InvalidSignature, ValueError):
        return False

Two implementation notes deserve attention. First, the underlying library produces signatures in DER encoding - a self-describing, variable-length format that typically comes to 70 to 72 bytes for this curve - so we strip the encoding to get the raw (r, s) integers and re-encode them as a fixed 64-byte field. A fixed-width r-then-s layout is a common choice in blockchain formats (Ethereum, for instance, stores r and s as 32-byte values alongside a one-byte recovery id) and makes our serialised blocks predictable in size. Second, verify returns a boolean rather than raising. In a chain validator, a single bad signature should reject one block, not crash the loop.
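The fixed-width packing that sign and verify rely on can be exercised on its own, without any elliptic-curve maths. The sketch below uses made-up values for r and s (they are not a real signature) and the hypothetical helper names pack_rs and unpack_rs, purely to show that to_bytes(32, "big") yields 64 bytes however small either integer happens to be.

```python
def pack_rs(r: int, s: int) -> bytes:
    """Serialise (r, s) as two 32-byte big-endian integers, 64 bytes in total."""
    return r.to_bytes(32, "big") + s.to_bytes(32, "big")


def unpack_rs(signature: bytes) -> tuple[int, int]:
    """Inverse of pack_rs; rejects anything that is not exactly 64 bytes."""
    if len(signature) != 64:
        raise ValueError("expected a 64-byte signature")
    return (
        int.from_bytes(signature[:32], "big"),
        int.from_bytes(signature[32:], "big"),
    )


# Illustrative values only: a small r (many leading zero bytes) and a large s.
r, s = 0x1234, (1 << 255) + 7
packed = pack_rs(r, s)

print(len(packed))                   # 64
print(unpack_rs(packed) == (r, s))   # True
```

Leading zero bytes are kept, which is what holds the field width constant. DER drops them and adds a padding byte when the top bit of an integer is set, which is why its length varies from one signature to the next.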

Primitive 3: Transactions As Signed Assertions

A transaction in our chain is a signed assertion. The author is identified by a public key, the payload is an arbitrary JSON object, and the signature commits to a canonical serialisation of both. By making the payload opaque, we keep the chain layer agnostic to what the chain is actually attesting to - capacity readings, intake records, status changes, anything an upstream application wants to commit to.

@dataclass(frozen=True)
class Transaction:
    """A signed assertion. Payload is opaque JSON-serialisable data."""

    author_pub: bytes
    payload: dict
    signature: bytes

    def signing_bytes(self) -> bytes:
        body = {"author_pub": hex_(self.author_pub), "payload": self.payload}
        return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()

    def verify(self) -> bool:
        return verify(self.author_pub, self.signing_bytes(), self.signature)

    def hash(self) -> bytes:
        return sha256(self.signing_bytes() + self.signature)


def make_transaction(
    key: ec.EllipticCurvePrivateKey, payload: dict
) -> Transaction:
    pub = public_bytes(key)
    body = {"author_pub": hex_(pub), "payload": payload}
    msg = json.dumps(body, sort_keys=True, separators=(",", ":")).encode()
    return Transaction(author_pub=pub, payload=payload, signature=sign(key, msg))

Two things make this code safer than the naïve version. First, the sort_keys=True, separators=(",", ":") arguments to json.dumps produce a canonical serialisation: the same payload always serialises to the same bytes, regardless of the order in which keys were inserted into the dictionary. Without this, a verifier that rebuilt a semantically identical payload with a different key order would re-serialise it to different bytes than the ones that were signed, and a perfectly valid signature would fail verification.

Second, the transaction hash includes the signature, not just the body. This means the Merkle leaf changes if either the payload or the signature changes, which closes a subtle attack surface where someone could swap a signature and leave the body intact.
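The effect of canonical serialisation is easy to see in isolation. The sketch below is a minimal illustration that uses only the standard library; the helper name canonical is an assumption made for this example, and it uses the same json.dumps arguments as signing_bytes.

```python
import json

# Two dicts with the same contents but different insertion order.
a = {"asset": "PACK-001", "soh": 0.94}
b = {"soh": 0.94, "asset": "PACK-001"}


def canonical(obj: dict) -> bytes:
    """Serialise with sorted keys and no optional whitespace."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


print(a == b)                            # True: equal as dicts
print(json.dumps(a) == json.dumps(b))    # False: default output follows insertion order
print(canonical(a) == canonical(b))      # True: canonical bytes agree
print(canonical(a))                      # b'{"asset":"PACK-001","soh":0.94}'
```

Since the signature covers bytes rather than dictionaries, only the canonical form gives the signer and the verifier the same message.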

Primitive 4: Merkle Trees

A Merkle tree compresses a list of items into a single root hash, with the property that membership of any one item can be proved using only the sibling hashes along the path from leaf to root.^{4Merkle, Ralph C. A Digital Signature Based on a Conventional Encryption Function. In Advances in Cryptology - CRYPTO '87, LNCS vol. 293, pp. 369–378. Springer, 1988. https://doi.org/10.1007/3-540-48184-2_32} For a tree with N leaves, a proof is approximately log₂(N) hashes - 32 bytes each at SHA-256. A block with one thousand transactions can be proved to contain a specific transaction in roughly 320 bytes, without revealing the other 999.

The construction follows Bitcoin's convention: when a level has an odd number of nodes, duplicate the last node so the next level can be paired up. RFC 6962 (Certificate Transparency) takes a different approach to odd levels;^{7Laurie, B., Langley, A., and Kasper, E. Certificate Transparency. RFC 6962, June 2013. https://www.rfc-editor.org/rfc/rfc6962} either is fine, but a verifier must agree with the producer on which one is in use.

def merkle_root(leaves: list[bytes]) -> bytes:
    if not leaves:
        return b"\x00" * 32
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[str, bytes]]:
    """Return the sibling-hash path for `index`. Each step is ('L'|'R', hash)."""
    if not 0 <= index < len(leaves):
        raise IndexError(index)
    proof: list[tuple[str, bytes]] = []
    level = list(leaves)
    idx = index
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling_idx = idx ^ 1
        side = "R" if idx % 2 == 0 else "L"
        proof.append((side, level[sibling_idx]))
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        idx //= 2
    return proof


def verify_merkle_proof(
    leaf: bytes, proof: list[tuple[str, bytes]], root: bytes
) -> bool:
    h = leaf
    for side, sibling in proof:
        h = sha256(h + sibling) if side == "R" else sha256(sibling + h)
    return h == root

The verification function is independent of the original leaf list. A verifier with only the leaf, the proof path, and the root can confirm membership without trusting the prover and without seeing any other leaves. This is what gives a blockchain its "publish once, prove forever" property.

A small worked example clarifies the side convention:

>>> leaves = [sha256(f"leaf-{i}".encode()) for i in range(7)]
>>> root = merkle_root(leaves)
>>> proof = merkle_proof(leaves, 3)
>>> verify_merkle_proof(leaves[3], proof, root)
True

For seven leaves, the proof has three steps, which matches ceil(log2(7)). For one million leaves, the proof would have 20 steps, totalling 640 bytes plus the leaf itself.
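The step counts and the 'L'/'R' side convention can be checked without hashing anything. The sketch below mirrors the level-halving loop of merkle_proof but tracks only level sizes and positions; the function name proof_shape and the constant HASH_BYTES are assumptions introduced for this illustration.

```python
def proof_shape(n_leaves: int, index: int) -> list[str]:
    """Return the 'L'/'R' side of each sibling on the path from leaf to root."""
    if not 0 <= index < n_leaves:
        raise IndexError(index)
    sides = []
    size, idx = n_leaves, index
    while size > 1:
        if size % 2 == 1:
            size += 1  # odd level: the last node is duplicated
        # An even position has its sibling on the right, an odd one on the left.
        sides.append("R" if idx % 2 == 0 else "L")
        size //= 2
        idx //= 2
    return sides


HASH_BYTES = 32  # SHA-256 digest size

print(proof_shape(7, 3))  # ['L', 'L', 'R']
for n in (7, 1_000, 1_000_000):
    steps = len(proof_shape(n, 0))
    print(f"{n:>9} leaves: {steps:>2} steps, {steps * HASH_BYTES} bytes")
```

For leaf 3 of 7, the sibling sits to the left at the first two levels and to the right at the last. The loop reproduces the 3, 10, and 20 steps quoted in the text for 7, one thousand, and one million leaves.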

Primitive 5: Blocks And Proof-Of-Work

A block is a header plus a body. The header carries the metadata that links blocks together and supports the proof-of-work check. The body carries the transactions. The header is what gets hashed; the header's commitment to the body happens through the Merkle root.

@dataclass
class Block:
    prev_hash: bytes
    merkle_root_: bytes
    timestamp_ns: int
    difficulty_bits: int
    nonce: int = 0
    transactions: list[Transaction] = field(default_factory=list)

    def header_bytes(self) -> bytes:
        return (
            self.prev_hash
            + self.merkle_root_
            + self.timestamp_ns.to_bytes(8, "big")
            + self.difficulty_bits.to_bytes(1, "big")
            + self.nonce.to_bytes(8, "big")
        )

    def block_hash(self) -> bytes:
        return sha256(self.header_bytes())

The header serialises to exactly 81 bytes - 32 + 32 + 8 + 1 + 8. Hashing the header gives the block hash. Including the previous block's hash means any attempt to alter an old block changes its hash, which breaks the next block's prev_hash field.
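One way to double-check the 81-byte layout is to describe it with the standard library's struct module and compare the result against the to_bytes concatenation. This is only a cross-check under assumed sample values (the Merkle root bytes, timestamp, and nonce below are placeholders); forge.py itself does not use struct.

```python
import struct

# Big-endian, no padding: 32-byte prev hash, 32-byte Merkle root,
# 8-byte timestamp, 1-byte difficulty, 8-byte nonce.
HEADER_FORMAT = ">32s32sQBQ"

prev_hash = b"\x00" * 32
merkle_root_ = b"\x11" * 32  # placeholder bytes, not a real root
timestamp_ns = 1_700_000_000_000_000_000
difficulty_bits = 16
nonce = 42

packed = struct.pack(
    HEADER_FORMAT, prev_hash, merkle_root_, timestamp_ns, difficulty_bits, nonce
)

# The same bytes, built the way Block.header_bytes does it.
manual = (
    prev_hash
    + merkle_root_
    + timestamp_ns.to_bytes(8, "big")
    + difficulty_bits.to_bytes(1, "big")
    + nonce.to_bytes(8, "big")
)

print(struct.calcsize(HEADER_FORMAT))  # 81
print(packed == manual)                # True
```

The one-byte difficulty field caps difficulty_bits at 255, which is more than enough for a 256-bit hash target in this toy chain.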

Proof-of-work is a single inequality: the block hash, treated as a 256-bit unsigned integer, must be less than 2^(256 − d), where d is the difficulty in leading zero bits. Mining is the act of incrementing the nonce until the inequality holds.^{5Nakamoto, Satoshi. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. https://bitcoin.org/bitcoin.pdf}

def hash_meets_target(block_hash: bytes, difficulty_bits: int) -> bool:
    """True iff `block_hash` has at least `difficulty_bits` leading zero bits."""
    n = int.from_bytes(block_hash, "big")
    return n < (1 << (256 - difficulty_bits))


def mine(block: Block, max_nonce: int = 1 << 40) -> tuple[Block, int]:
    """Increment nonce until the header hash meets the difficulty target."""
    attempts = 0
    while block.nonce < max_nonce:
        attempts += 1
        if hash_meets_target(block.block_hash(), block.difficulty_bits):
            return block, attempts
        block.nonce += 1
    raise RuntimeError("nonce space exhausted")

Because SHA-256 outputs behave, for practical purposes, like uniformly random 256-bit values, the probability that any given nonce satisfies the inequality is 2^(-d), so the expected number of nonces to try is 2^d. We will measure this empirically in the difficulty study at the end.
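The integer inequality and the "leading zero bits" wording describe the same test. The sketch below checks that on a hand-built 32-byte value; hash_meets_target is repeated so the example runs on its own, while leading_zero_bits is a hypothetical helper added only for this comparison.

```python
def hash_meets_target(block_hash: bytes, difficulty_bits: int) -> bool:
    """Integer form, as in the text: hash < 2^(256 - d)."""
    return int.from_bytes(block_hash, "big") < (1 << (256 - difficulty_bits))


def leading_zero_bits(block_hash: bytes) -> int:
    """Number of zero bits before the first 1 bit in a 256-bit digest."""
    return 256 - int.from_bytes(block_hash, "big").bit_length()


# A hand-built 32-byte value with exactly 16 leading zero bits (0x0000ff00...).
sample = bytes.fromhex("0000ff" + "00" * 29)

print(leading_zero_bits(sample))      # 16
print(hash_meets_target(sample, 16))  # True
print(hash_meets_target(sample, 17))  # False
```

Out of the 2^256 possible digests, exactly 2^(256 − d) fall below the target, which is where the 2^(-d) success probability per nonce comes from.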

The Chain

Composing the primitives gives us the chain. A chain holds a list of blocks and exposes two operations: append a new block of transactions, and validate the entire chain end-to-end. Append rejects any block containing an invalid signature, and validation checks every linkage, every Merkle root, every proof-of-work target, and every signature.

GENESIS_PREV_HASH = b"\x00" * 32


@dataclass
class Chain:
    difficulty_bits: int = 16
    blocks: list[Block] = field(default_factory=list)

    def append(self, transactions: list[Transaction]) -> tuple[Block, int]:
        for tx in transactions:
            if not tx.verify():
                raise ValueError("rejected: invalid signature on a transaction")
        leaves = [tx.hash() for tx in transactions]
        prev = self.blocks[-1].block_hash() if self.blocks else GENESIS_PREV_HASH
        block = Block(
            prev_hash=prev,
            merkle_root_=merkle_root(leaves),
            timestamp_ns=time.time_ns(),
            difficulty_bits=self.difficulty_bits,
            transactions=list(transactions),
        )
        block, attempts = mine(block)
        self.blocks.append(block)
        return block, attempts

    def is_valid(self) -> bool:
        prev = GENESIS_PREV_HASH
        for block in self.blocks:
            if block.prev_hash != prev:
                return False
            # A block must not declare an easier target than the chain requires;
            # otherwise a rewritten block could claim difficulty 0 and skip the work.
            if block.difficulty_bits < self.difficulty_bits:
                return False
            if not hash_meets_target(block.block_hash(), block.difficulty_bits):
                return False
            recomputed = merkle_root([tx.hash() for tx in block.transactions])
            if recomputed != block.merkle_root_:
                return False
            for tx in block.transactions:
                if not tx.verify():
                    return False
            prev = block.block_hash()
        return True

Notice that there is no separate notion of a "genesis block" in the data model - the first block is simply the one whose prev_hash is the all-zero sentinel. This keeps the chain logic uniform and avoids a special case at index zero.

That is the entire chain. Roughly two hundred lines of code, four cryptographic primitives, and a single append operation. Everything else is exercising it.

Running The Chain

The first demo builds a three-block chain in which two parties - call them Alice and Bob - submit signed attestations against asset identifiers. The payloads here are stand-ins; the chain does not interpret them, it only verifies their signatures and orders them.

def demo() -> None:
    print("=== demo: build a 3-block chain ===")
    alice = new_keypair()
    bob = new_keypair()

    chain = Chain(difficulty_bits=16)
    txs1 = [
        make_transaction(alice, {"asset": "PACK-001", "soh": 0.94, "step": "intake"}),
        make_transaction(bob, {"asset": "PACK-002", "soh": 0.81, "step": "intake"}),
    ]
    block, attempts = chain.append(txs1)
    print(f"block 1 mined in {attempts} attempts, hash={hex_(block.block_hash())[:16]}...")

    txs2 = [make_transaction(alice, {"asset": "PACK-001", "soh": 0.93, "step": "test"})]
    block, attempts = chain.append(txs2)
    print(f"block 2 mined in {attempts} attempts, hash={hex_(block.block_hash())[:16]}...")

    txs3 = [
        make_transaction(bob, {"asset": "PACK-002", "soh": 0.80, "step": "test"}),
        make_transaction(alice, {"asset": "PACK-003", "soh": 0.97, "step": "intake"}),
        make_transaction(bob, {"asset": "PACK-002", "decision": "second-life"}),
    ]
    block, attempts = chain.append(txs3)
    print(f"block 3 mined in {attempts} attempts, hash={hex_(block.block_hash())[:16]}...")

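    print(f"chain valid: {chain.is_valid()}")
    n_txs = sum(len(b.transactions) for b in chain.blocks)
    print(f"chain length: {len(chain.blocks)} blocks, {n_txs} transactions")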
    leaves = [tx.hash() for tx in chain.blocks[2].transactions]
    proof = merkle_proof(leaves, 0)
    ok = verify_merkle_proof(leaves[0], proof, chain.blocks[2].merkle_root_)
    print(f"merkle proof for block-3 tx[0]: {len(proof)} steps, verify={ok}")

Running it on the author's machine produced this output:

=== demo: build a 3-block chain ===
block 1 mined in 23444 attempts, hash=00000b1be94c6e70...
block 2 mined in 26587 attempts, hash=000011210560faed...
block 3 mined in 35271 attempts, hash=00007f52c078cf4a...
chain valid: True
chain length: 3 blocks, 6 transactions
merkle proof for block-3 tx[0]: 2 steps, verify=True

The three blocks at difficulty 16 took about 28,400 SHA-256 evaluations each on average to mine, which is below the theoretical expectation of 2^16 = 65,536 but within the wide noise band you get from three samples. The Merkle proof for the first transaction in block three is two steps long because that block contains three transactions (Bitcoin-style padding rounds the level up to four, so the tree has depth two). Verification recomputes the root from the leaf and the two sibling hashes and confirms it matches the root committed to in the block header.

Tamper Detection

The whole point of a blockchain is that nobody can rewrite history without producing a record that looks visibly broken to anyone holding the next block's header. The tamper demo is a three-step attack that fails at every step.

def tamper_demo() -> None:
    print("=== tamper demo: modify a block-1 transaction, watch the chain break ===")
    alice = new_keypair()
    chain = Chain(difficulty_bits=12)

    chain.append([make_transaction(alice, {"asset": "PACK-001", "soh": 0.94})])
    chain.append([make_transaction(alice, {"asset": "PACK-001", "soh": 0.93})])
    chain.append([make_transaction(alice, {"asset": "PACK-001", "soh": 0.92})])

    print(f"before tamper:  is_valid = {chain.is_valid()}")

    # Attack 1: change the payload but reuse the original signature.
    target_tx = chain.blocks[0].transactions[0]
    forged = Transaction(
        author_pub=target_tx.author_pub,
        payload={"asset": "PACK-001", "soh": 0.99},  # inflate SoH
        signature=target_tx.signature,
    )
    chain.blocks[0].transactions[0] = forged
    print(
        f"after  tamper:  is_valid = {chain.is_valid()}"
        "  (merkle root and signature checks both fail for the forged tx)"
    )

    # Attack 2: re-sign the desired payload with a fresh key.
    mallory = new_keypair()
    re_signed = make_transaction(mallory, {"asset": "PACK-001", "soh": 0.99})
    chain.blocks[0].transactions[0] = re_signed
    leaves = [tx.hash() for tx in chain.blocks[0].transactions]
    new_root = merkle_root(leaves)
    print(f"re-signed tx: merkle root drift = {new_root != chain.blocks[0].merkle_root_}")
    print(
        "chain still invalid because the block-1 header still commits to the "
        f"original merkle root: is_valid = {chain.is_valid()}"
    )

The output:

=== tamper demo: modify a block-1 transaction, watch the chain break ===
before tamper:  is_valid = True
after  tamper:  is_valid = False  (merkle root and signature checks both fail for the forged tx)
re-signed tx: merkle root drift = True
chain still invalid because the block-1 header still commits to the original merkle root: is_valid = False

The demo runs two attacks, and a third follows by the same reasoning. First, an attacker modifies the payload of an existing transaction without re-signing it; the next call to is_valid rejects the chain, because the altered payload changes the transaction hash (so the recomputed Merkle root no longer matches the header) and the reused signature no longer matches the body. Second, the attacker generates a new keypair and re-signs the payload they want; this clears the signature check, but the new transaction hashes to a different leaf, which produces a different Merkle root, so the block header still commits to a root that the transactions no longer imply. Third, even if the attacker were to recompute the Merkle root and re-mine block one to satisfy the proof-of-work target, block two's prev_hash would still point at the original block-one hash, so the linkage breaks at that boundary. To rewrite block one, the attacker would also have to re-mine every subsequent block - which is precisely the cost that proof-of-work is designed to impose.
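The cost of that re-mining is simple arithmetic. The sketch below assumes every block shares one difficulty, as in Chain above, and uses a 0-based block index; the function name expected_rewrite_hashes is hypothetical and not part of forge.py.

```python
def expected_rewrite_hashes(chain_length: int, block_index: int, difficulty_bits: int) -> int:
    """Expected SHA-256 evaluations to re-mine one block and every later block.

    Assumes a uniform difficulty across the chain and a 0-based block index.
    """
    if not 0 <= block_index < chain_length:
        raise IndexError(block_index)
    blocks_to_remine = chain_length - block_index
    # Each block costs 2^d attempts in expectation.
    return blocks_to_remine * (1 << difficulty_bits)


# The tamper demo: 3 blocks at difficulty 12, attacker rewrites the first block.
print(expected_rewrite_hashes(3, 0, 12))      # 12288
# The same attack against a 1,000-block chain at difficulty 20.
print(expected_rewrite_hashes(1_000, 0, 20))  # 1048576000
```

The cost grows linearly with the number of blocks buried on top of the target and exponentially with the difficulty, so older records are more expensive to rewrite.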

Difficulty Study

The claim from Nakamoto's paper is that proof-of-work imposes a tunable computational cost: difficulty d requires expected 2^d hash attempts to find a satisfying nonce.^{5Nakamoto, Satoshi. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. https://bitcoin.org/bitcoin.pdf} We can verify this empirically by mining many blocks at each of several difficulty settings and comparing the measured mean to the prediction.

def difficulty_study(samples_per_d: int = 30, max_d: int = 20) -> None:
    print(f"{'bits':>4}  {'samples':>7}  {'expected':>14}  {'measured_mean':>14}  {'ratio':>6}")
    key = new_keypair()
    for d in (8, 12, 14, 16, 18, max_d):
        attempts_seen = []
        for _ in range(samples_per_d):
            tx = make_transaction(key, {"d": d, "n": secrets.token_hex(4)})
            block = Block(
                prev_hash=GENESIS_PREV_HASH,
                merkle_root_=merkle_root([tx.hash()]),
                timestamp_ns=time.time_ns(),
                difficulty_bits=d,
                transactions=[tx],
            )
            _, attempts = mine(block)
            attempts_seen.append(attempts)
        mean = sum(attempts_seen) / len(attempts_seen)
        expected = float(1 << d)
        print(f"{d:>4}  {samples_per_d:>7}  {expected:>14,.0f}  {mean:>14,.0f}  {mean / expected:>6.2f}")

Thirty samples per difficulty setting, on the same laptop, produced this:

bits  samples        expected   measured_mean   ratio
   8       30             256             292    1.14
  12       30           4,096           5,036    1.23
  14       30          16,384          20,695    1.26
  16       30          65,536          65,070    0.99
  18       30         262,144         281,387    1.07
  20       30       1,048,576       1,091,006    1.04

Every difficulty setting is within 26% of the theoretical expectation, which is inside the noise band you should expect at thirty samples per setting. (At thirty samples, the relative standard error on a geometric-distributed mean with parameter p = 2^-d is approximately 1/sqrt(30) ≈ 18%, so the largest deviation here is about one and a half standard errors.) Pushing the sample count to several hundred per setting would tighten the ratio toward 1.00. The takeaway is that each additional difficulty bit roughly doubles the expected work, exactly as the 2^d expectation predicts.
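The noise-band argument can be checked against the table directly. The sketch below copies the measured means from the table above and expresses each deviation in units of the approximate standard error; the variable names are illustrative, and the 1/sqrt(n) figure relies on the approximation that a geometric distribution with small p has a standard deviation close to its mean.

```python
import math

SAMPLES = 30
# (difficulty bits, measured mean attempts) copied from the table above.
measured = [
    (8, 292),
    (12, 5_036),
    (14, 20_695),
    (16, 65_070),
    (18, 281_387),
    (20, 1_091_006),
]

# Relative standard error of a 30-sample mean, assuming std dev ~ mean.
rel_se = 1 / math.sqrt(SAMPLES)
print(f"relative standard error ~ {rel_se:.3f}")

for bits, mean in measured:
    ratio = mean / (1 << bits)
    z = (ratio - 1) / rel_se  # deviation in units of standard error
    print(f"{bits:>2} bits  ratio={ratio:.2f}  z={z:+.2f}")
```

Every row lands within two standard errors of the 2^d prediction, which is about what thirty samples per setting can be expected to show.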

Tests

A blockchain implementation that does not have tests is a science-fair project. The cookbook ships with six smoke tests that cover signature round-trips, transaction verification, Merkle proofs at every leaf index, full-chain build-and-validate, tamper detection, and rejection of invalid signatures at append time.

def test_signature_roundtrip():
    key = new_keypair()
    msg = b"battery PACK-001 SoH 0.94"
    sig = sign(key, msg)
    assert verify(public_bytes(key), msg, sig)
    assert not verify(public_bytes(key), msg + b"!", sig)


def test_merkle_proof_each_leaf():
    leaves = [sha256(f"leaf-{i}".encode()) for i in range(7)]
    root = merkle_root(leaves)
    for i, leaf in enumerate(leaves):
        proof = merkle_proof(leaves, i)
        assert verify_merkle_proof(leaf, proof, root), f"proof failed for index {i}"


def test_tamper_is_detected():
    key = new_keypair()
    chain = Chain(difficulty_bits=8)
    chain.append([make_transaction(key, {"asset": "PACK-001", "soh": 0.94})])
    chain.append([make_transaction(key, {"asset": "PACK-001", "soh": 0.93})])
    target = chain.blocks[0].transactions[0]
    chain.blocks[0].transactions[0] = Transaction(
        author_pub=target.author_pub,
        payload={"asset": "PACK-001", "soh": 0.99},
        signature=target.signature,
    )
    assert not chain.is_valid()

The full test file test_forge.py lives next to forge.py in the post's source tree. Running it on a clean checkout produces:

PASS  test_signature_roundtrip
PASS  test_transaction_verifies
PASS  test_merkle_proof_each_leaf
PASS  test_chain_appends_and_validates
PASS  test_tamper_is_detected
PASS  test_invalid_signature_rejected_at_append

6/6 passed

What This Is Not

It is worth being explicit about everything this cookbook does not deliver, because each absence corresponds to a real design problem in production blockchain systems.

There is no peer-to-peer network. The chain runs in one process. Real blockchains gossip transactions and blocks across thousands of nodes, with all the failure modes that distributed systems imply: network partitions, eclipse attacks, Sybil attacks.

There is no consensus protocol. We mine and append in a single thread; there is no concept of competing forks or longest-chain selection. Nakamoto consensus, GHOST, Tendermint, HotStuff, and Casper are all attempts to extend a single-node chain like ours into a multi-node setting where actors might disagree about which block is "next."^{5Nakamoto, Satoshi. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. https://bitcoin.org/bitcoin.pdf}

There is no transaction ordering protocol beyond append-time arrival. Real chains have mempools, fee markets, and miner-extractable value (MEV) problems.

There is no UTXO or account model. Our transactions are opaque payloads. Bitcoin's UTXO model and Ethereum's account model are both ways of giving transactions semantics - they say what the chain means, not just what it records.

There is no smart contract VM. Ethereum's EVM, Solana's BPF runtime, and Move all let transactions execute code against on-chain state. We have none of that; our transactions are inert assertions.

There is no privacy primitive. Our payloads and signatures are public. Confidential transactions, ring signatures, and zero-knowledge rollups are all techniques for hiding parts of a transaction while still proving it is well-formed.

There is no production key management. We hold private keys in process memory. A real attestation system needs hardware security modules, threshold signatures, or remote attestation of signing enclaves.

Each of these is a worthwhile engineering problem in its own right. The point of this cookbook is that a reader who understands the floor - what we built here - can read any of those higher-layer specifications and recognise which primitive they are extending and which one they are leaving alone.

Where This Leads

Once you have a working chain, the next natural questions are about applications. The format of our Transaction.payload is deliberately wide open - a plain dict of JSON-serialisable values - because the chain layer should not care what an attestation says. It only guarantees that whoever signed it had the private key, that the transaction is committed to a specific block, and that nobody can rewrite the historical record without redoing the proof-of-work for the altered block and every block after it.

That guarantee is the building block for any system whose value depends on irrefutable, timestamped, identity-bound assertions about the world. The structure of the payload, the schema of the assertions, the rules for who is allowed to mint, the linkage between on-chain identifiers and off-chain physical identity, the dispute process when someone signs an assertion that turns out to be wrong - all of those are application layer choices that sit on top of a chain like the one above.

Future cookbooks in this series will pick up exactly there. The next cookbook will take this chain and harden the parts you would have to harden first to deploy it: schema validation on payloads, persistence to disk with crash safety, a deterministic difficulty retargeting algorithm, and a minimal HTTP interface for submitting and querying attestations. The cookbook after that will introduce a second node and Nakamoto-style consensus - at which point the chain becomes a system, not a single-machine toy.

The code for this cookbook is in the post's source folder under _workspace/code/forge.py along with test_forge.py and the cover-image generator. It is roughly four hundred lines, runs on Python 3.10 and later, and depends only on the cryptography package. Read it, type it out, mine a few blocks, watch the tamper demo break the chain in three different ways. The point of the cookbook is not the code; the point is the intuition that what looked from the outside like a black box was four primitives all along.

1. Python Cryptographic Authority. cryptography: Python library exposing cryptographic recipes and primitives. https://cryptography.io/
2. National Institute of Standards and Technology. FIPS PUB 180-4: Secure Hash Standard. August 2015. https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf
3. Standards for Efficient Cryptography Group. SEC 2: Recommended Elliptic Curve Domain Parameters, Version 2.0. January 2010. https://www.secg.org/sec2-v2.pdf
4. Merkle, Ralph C. A Digital Signature Based on a Conventional Encryption Function. In Advances in Cryptology - CRYPTO '87, LNCS vol. 293, pp. 369–378. Springer, 1988. https://doi.org/10.1007/3-540-48184-2_32
5. Nakamoto, Satoshi. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. https://bitcoin.org/bitcoin.pdf
6. Josefsson, S. and Liusvaara, I. Edwards-Curve Digital Signature Algorithm (EdDSA). RFC 8032, January 2017. https://www.rfc-editor.org/rfc/rfc8032
7. Laurie, B., Langley, A., and Kasper, E. Certificate Transparency. RFC 6962, June 2013. https://www.rfc-editor.org/rfc/rfc6962