Architecture Deep Dive

This document provides a detailed technical explanation of how rng.dev achieves verifiable, unbiasable randomness. For a simpler introduction, see How It Works.

System Overview

Data Flow

Entropy Mixing Pipeline

Timing Model

Database Schema

Comparison Architecture

Deployment Architecture

Design Goals

Primary Goals

Goal	Description
Unbiasability	No party can meaningfully influence the output
Unpredictability at commit	No party can predict randomness at the moment we commit
Public Verifiability	Anyone can verify any round using public data
Independence	No third-party dependencies
Transparency	Open source, fully auditable

Secondary Goals

Goal	Description
Simplicity	Minimal moving parts, easy to audit
Perpetual Operation	No expiration date, no chain exhaustion
Forkability	Anyone can run their own instance
Low Latency	New randomness every 1 second

Security Architecture

We implement a multi-layer security architecture:

Layer 1: Multi-Source External Entropy
   └─ 8 independent blockchains (Aptos, Arbitrum, Base, Bitcoin, Cardano, Ethereum, Solana, Sui)
   └─ Different consensus mechanisms, finality times, jurisdictions
   └─ Security requires only ONE honest source among eight

Layer 2: Block Hashes + Transaction IDs
   └─ Block hashes from block producers
   └─ Transaction IDs from external users
   └─ TXIDs provide entropy with different trust assumptions

Layer 3: Time-Delayed Mixing
   └─ Round N randomness uses inputs from Round N AND Round N+1
   └─ We commit to inputs_N BEFORE inputs_N+1 exist
   └─ Blockchains ARE our commitment layer

Why Transaction IDs Add Entropy

Transaction IDs provide additional security because:

External Origin: Transactions come from users, not block producers
Unpredictable Ordering: Which TX lands in position 1 depends on mempool state, fees, timing
No Grinding: Unlike block hashes, TXIDs cannot be "ground" — they're determined by transaction content
Cross-Chain Independence: Manipulating TX ordering on 8 chains simultaneously is impractical

Transaction Selection Rules:

Chain	Selection	Rationale
Bitcoin	2nd transaction (index 1)	Skip coinbase TX (miner-controlled)
Others	1st transaction (index 0)	Standard selection
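
The selection rule in the table can be sketched as a small helper. This is illustrative only: the function name, the lowercase chain names, and the assumption that a block's transactions arrive as an ordered list of TXID strings are all hypothetical, not part of the published API.

```python
from typing import Optional, Sequence


def select_txid(chain: str, txids: Sequence[str]) -> Optional[str]:
    """Pick the TXID used as entropy for one block.

    Assumption: `txids` is in block order and `chain` is a lowercase name.
    Bitcoin skips index 0 (the miner-controlled coinbase transaction).
    """
    index = 1 if chain == "bitcoin" else 0
    # Blocks without a transaction at that index contribute no TXID.
    return txids[index] if len(txids) > index else None
```

A block that has no transaction at the selected index returns `None` here, which corresponds to the empty-block case handled by the encoding rules.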

Important Note on Source Independence

Eight blockchains ≠ eight fully independent entropy sources. They share:

Internet infrastructure and datacenters
Some overlapping validators/operators
Time synchronization dependencies
Market conditions affecting mining/staking

The effective entropy is closer to max(source_entropy) than sum(all_sources).

However, this doesn't weaken security because:

SHA3 mixing ensures any unpredictable input produces unpredictable output
The security claim is "one honest source" not "eight independent sources"
Different chains have different attack surfaces (PoW vs PoS, different finality)

The Core Formula

# Round N: Commit phase (at time T)
inputs_N = fetch_from_blockchains()
commit_N = SHA3(encode(inputs_N))
publish(commit_N)  # Public commitment

# Round N+1: Reveal phase (at time T + 1 second)
inputs_N1 = fetch_from_blockchains()  # These blocks didn't exist at time T
random_N = SHA3(commit_N | encode(inputs_N1))
publish(random_N, inputs_N, inputs_N1)  # Full verification data

Why This Works

At commit time (T): We publish commit_N = SHA3(inputs_N). The blockchain blocks that will form inputs_N+1 do not exist yet.
At reveal time (T + 1s): We fetch inputs_N+1 from new blockchain blocks and compute random_N = SHA3(commit_N | inputs_N+1).
Verification: Anyone can:
- Verify commit_N matches SHA3(inputs_N)
- Verify random_N matches SHA3(commit_N | inputs_N+1)
- Verify inputs against public blockchain explorers
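
A minimal sketch of the first two checks follows, using the simplified formula from this section rather than the sequential mixing described later. The function names are hypothetical, and the byte layout (the commit as a lowercase hex string joined to the next round's encoded inputs with a `|` byte) is an assumption made for illustration.

```python
import hashlib


def sha3_hex(data: bytes) -> str:
    """SHA3-256 digest as a lowercase hex string."""
    return hashlib.sha3_256(data).hexdigest()


def make_commit(encoded_inputs_n: bytes) -> str:
    # commit_N = SHA3(encode(inputs_N))
    return sha3_hex(encoded_inputs_n)


def make_random(commit_n: str, encoded_inputs_n1: bytes) -> str:
    # random_N = SHA3(commit_N | encode(inputs_N+1)); byte layout is assumed.
    return sha3_hex(commit_n.encode("utf-8") + b"|" + encoded_inputs_n1)


def verify_round(commit_n: str, random_n: str,
                 encoded_inputs_n: bytes, encoded_inputs_n1: bytes) -> bool:
    """Recompute both hashes from the published inputs and compare."""
    return (make_commit(encoded_inputs_n) == commit_n
            and make_random(commit_n, encoded_inputs_n1) == random_n)
```

The third check, comparing the inputs against public blockchain explorers, has to be done against the chains themselves and is not covered by this sketch.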

Security Properties

Property	Guarantee	Mechanism
Unbiasability	No one could have biased the output	Requires at least one honest source in round N+1
Unpredictability at commit	Cannot predict round N when commit_N is published	Depends on future blockchain state
Operator Neutrality	Operator cannot predict or bias	Committed before next round's blocks exist
Verifiability	Anyone can verify any round	All inputs public; deterministic algorithm

Critical Security Insight:

The system does NOT require all eight chains to be secure. It requires at least one honest entropy source in round N+1.

Because the final output is:

random_N = SHA3(commit_N || inputs_N+1)

If any component of inputs_N+1 is unpredictable to the attacker, the output is unpredictable.

Understanding the Timing Model

This section clarifies exactly what our security guarantee is — and what it is not.

The Phases of Each Round

Time 0 (Commit):
  ├─ Round N begins
  ├─ Block cache contains data from round N-1
  └─ Outcome is UNDETERMINED
      → Blocks that will be used don't exist yet

Time 0-999ms (Continuous Block Arrival):
  ├─ Fast chains finalize new blocks throughout the round
  ├─ Block cache is continuously updated as blocks arrive
  ├─ A block finalizing at T+999ms WILL be included in round N
  └─ Outcome keeps changing until snapshot
      → Each new block changes the potential output
      → TXIDs are unpredictable (depend on global user transactions)

Time 1000ms (Snapshot + Reveal):
  ├─ We snapshot the latest cached block for each chain
  ├─ Whatever is in the cache at this moment becomes the input
  ├─ We compute and publish random_N
  └─ Outcome is OFFICIALLY PUBLISHED, round N+1 begins

Why Blocks Arriving Late Can't Be Front-Run

Fast chains like Sui and Solana finalize 2-4 blocks per second. A block could finalize at T+999ms and still be included in round N. This might seem like it gives attackers time to react, but it doesn't:

Block producers can't control TXIDs: We use the transaction ID at index 0 (or 1 for Bitcoin). This is determined by which user transaction arrives first in the mempool — not by the block producer.
TXIDs change with every block: Even if an attacker sees a block finalize at T+800ms, a new block at T+950ms will have different transactions, changing the output entirely.
The cache is a moving target: The output isn't "locked in" until T+1000ms. Any new block arriving before the snapshot changes the result.
Prediction requires controlling all 8 chains: Even with perfect knowledge of 7 chains, one unpredictable TXID makes the output unpredictable.

Snapshot Semantics

Critical for verification: Each round uses the latest cached block for each chain at T+1000ms (end of round).

Timing	Block Included?
Block finalized at 950ms	✅ Yes - captured before snapshot
Block finalized at 1000ms	✅ Yes - at snapshot boundary
Block finalized at 1001ms	❌ No - included in next round

This means:

Deterministic - same query gives same result
Verifiable - anyone can query "latest block before timestamp X"
Explicit - we document exactly what we capture
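
The boundary behaviour in the table can be expressed as a short selection function. This is a sketch under stated assumptions: the `Block` type and its `finalized_at_ms` field (milliseconds since round start) are hypothetical names, not the beacon's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Block:
    height: int
    finalized_at_ms: int  # assumed: milliseconds since the round started


def snapshot_block(blocks: Iterable[Block], snapshot_ms: int = 1000) -> Optional[Block]:
    """Return the latest block finalized at or before the snapshot time."""
    eligible = [b for b in blocks if b.finalized_at_ms <= snapshot_ms]
    # The boundary is inclusive: 1000ms is in, 1001ms rolls to the next round.
    return max(eligible, key=lambda b: b.finalized_at_ms, default=None)
```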

Which Block Do We Use?

Fast chains like Solana (~400ms finality) may finalize 2-3 blocks between round N and N+1. We always fetch the latest finalized block at the moment of the API call — not the first block after the commit.

Chain	Finality	Blocks per Round	Which Block?
Aptos	~900ms	1-2	Latest at snapshot
Arbitrum	~250ms	3-4	Latest at snapshot
Base	~2s	0-1	Latest at snapshot
Bitcoin	~60 min	0	Latest with 6 confirmations
Cardano	~20s	0-1	Latest at snapshot
Ethereum	~15 min	0	Latest finalized
Solana	~400ms	2-3	Latest at snapshot
Sui	~400ms	2-3	Latest at snapshot

Why we ALWAYS take the latest block (not the first):

Taking the latest block at snapshot time is critical for security. If we committed to the "first block after round start," attackers could:

See the block finalize early (e.g., at T+100ms)
Have 900ms to compute the output and take action before we reveal
Front-run the beacon by placing bets, trades, or other actions

By taking the latest block at T+1000ms:

The target keeps moving — new blocks arrive until the snapshot
No advance computation — attackers can't lock in a result early
Maximum unpredictability — uses the most recently finalized consensus
Simple verification — "latest finalized at timestamp X" is deterministic

What We Guarantee

Our security guarantee is UNBIASABILITY, not secrecy.

Guarantee	Description
We DO guarantee	No one could have biased the output at the time it was determined
We DO NOT guarantee	No one can compute the output before we publish it

The output becomes determined by blockchain consensus before we officially publish it. During this window, observers watching the blockchains could compute the same result we will publish.

This is not a bug — it's inherent to any system using public blockchain data.

Why This Is Still Secure

The critical insight: by the time anyone can compute the output, it's too late to change it.

At commit time (T=0), the blocks that will determine the output don't exist
No miner knew what to grind towards
No attacker could have influenced block production to achieve a specific output
By the time the blocks finalize, it's too late — consensus has spoken

Implications for Applications

For lotteries, gambling, and selection processes:

Timing	Security
Close entries BEFORE commit time	Entries locked before outcome is predictable
Close entries AFTER commit time	VULNERABLE — fast observers may compute outcome first

Correct lottery timing:

T = -5 sec:   Entries close (outcome completely unknown)
T = 0:        Round begins (outcome unknown, depends on future blocks)
T = 1.0s:     Snapshot + reveal (outcome published)

The commit is your "randomness lock-in point" — not the reveal.
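
An application can enforce this rule with a simple guard before accepting a round. The sketch below is hypothetical: the function name, the optional safety margin, and the use of seconds relative to the commit are assumptions for illustration.

```python
def entries_close_safely(entries_close_at: float, commit_at: float,
                         margin_s: float = 0.0) -> bool:
    """True if entries closed at least `margin_s` seconds before the commit.

    Both times are in seconds on the same clock (for example, Unix time).
    """
    return entries_close_at + margin_s <= commit_at
```

With the timeline above, `entries_close_safely(-5.0, 0.0)` holds, while entries closing at any point after `T = 0` are rejected.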

Comparison to Other Beacons

Beacon	Observation Window	Trust Model
NIST	Output unpredictable until publication (hash chain)	Trust NIST
drand	Output unpredictable until threshold reached	Trust League of Entropy
rng.dev	Output unpredictable at commit; computable after blocks finalize	Trust public blockchains

Our model trades a small observation window for radical independence — no hash chains to exhaust, no threshold committees to trust.

Why Not Hash Chains?

Hash chain precommitment is a powerful technique used by NIST's Randomness Beacon. We seriously considered it and chose not to implement it.

How Hash Chains Work

s_0 = random_seed
s_1 = SHA3(s_0)
s_2 = SHA3(s_1)
...
s_N = SHA3(s_{N-1})

# Publish s_N as commitment
# Reveal s_{N-1}, s_{N-2}, ... backwards
# Each reveal verifiable: SHA3(s_i) == s_{i+1}
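
For comparison, the same construction as a runnable sketch. The function names are hypothetical, and operating on raw digest bytes is an assumption; NIST's actual beacon format differs in detail.

```python
import hashlib
from typing import List


def build_chain(seed: bytes, length: int) -> List[bytes]:
    """Return [s_0, s_1, ..., s_length] with s_{i+1} = SHA3-256(s_i)."""
    chain = [seed]
    for _ in range(length):
        chain.append(hashlib.sha3_256(chain[-1]).digest())
    return chain


def verify_reveal(revealed: bytes, previously_published: bytes) -> bool:
    """Check SHA3(s_i) == s_{i+1} for a newly revealed s_i."""
    return hashlib.sha3_256(revealed).digest() == previously_published
```

The whole list has to be generated up front and kept secret until each value is revealed, which is the storage and operational burden discussed in the next section.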

Why We Rejected Hash Chains

Issue	Impact
Storage overhead	10M values × 64 bytes = 640MB; must be stored securely
Expiration risk	Chain eventually exhausts; requires regeneration ceremony
Operational complexity	Air-gapped generation, encrypted storage, backup procedures
Single point of failure	If chain is compromised or lost, must restart from scratch
No additional security	Our time-delayed mixing already provides operator neutrality

The Key Insight

Blockchains are our commitment layer.

When we publish commit_N = SHA3(inputs_N), the blocks that will form inputs_N+1 don't exist yet. This provides the same forward integrity guarantee that hash chains offer:

With hash chain: Operator committed to future values at genesis
With our approach: Operator committed to inputs_N before inputs_N+1 existed

The difference: our commitment is renewed every round using fresh, unpredictable blockchain data. No storage, no expiration, no ceremony.

Why Not drand as Primary Source?

drand (League of Entropy) is an excellent randomness beacon that uses threshold BLS signatures. We seriously considered using it as our primary source and chose not to.

How drand Works

Multiple independent organizations run nodes
Each round, nodes contribute partial signatures
Threshold (e.g., 10 of 16) required to produce valid randomness
Output is a BLS signature verifiable by anyone

Benefits of drand

Threshold security: No single party can predict or bias output
Cryptographic proofs: BLS signatures provide strong guarantees
Established trust: League of Entropy includes Cloudflare, EPFL, Protocol Labs

Why We Didn't Use drand as Primary

Issue	Impact
Third-party dependency	Our randomness depends on external infrastructure
Trust requirement	Must trust the League of Entropy participants
Not forkable	You can't easily run your own drand network
Different trust model	drand's security comes from trusted parties; ours from public blockchains

Our Design Philosophy

We wanted radical independence:

No reliance on any institution
Anyone can fork and run their own beacon
Security comes from public infrastructure (blockchains) not trusted parties

This is a philosophical choice, not a technical critique of drand. For many use cases, drand is excellent. For users who want maximum independence, our approach is better suited.

Self-Hosted Option

For self-hosted deployments, users can optionally add drand as an additional entropy source. See Self-Hosted Guide for details.

Sequential Entropy Mixing

We use sequential mixing rather than flat concatenation to provide stronger grinding resistance.

Flat mixing (NOT used):

output = SHA3(input_1 || input_2 || input_3 || input_4 || input_5 || input_6 || input_7 || input_8)

Sequential mixing (our approach):

R0 = commit_hash
R1 = SHA3(R0 || aptos_input)
R2 = SHA3(R1 || arbitrum_input)
R3 = SHA3(R2 || base_input)
R4 = SHA3(R3 || bitcoin_input)
R5 = SHA3(R4 || cardano_input)
R6 = SHA3(R5 || ethereum_input)
R7 = SHA3(R6 || solana_input)
R8 = SHA3(R7 || sui_input)
output = R8

Why This Is Stronger

With sequential mixing, entropy gets "locked in" at each step:

After any honest source contributes, the intermediate state is unpredictable
Later attackers cannot compute what they're mixing into
Even if 7 of 8 sources are compromised, one honest source = secure output

Formal security property:

If at least one source contributes sufficient min-entropy from the attacker's point of view, and SHA3-256 is modeled as a random oracle, the output is computationally indistinguishable from random.
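
The R0 to R8 chain can be sketched in a few lines. The function name is hypothetical, and the byte layout is an assumption taken from the `recompute_steps` format in the verification payload: each intermediate state is carried forward as a lowercase hex string and joined to the next source value with `|`.

```python
import hashlib
from typing import Mapping, Optional


def sequential_mix(commit_hash: str, inputs: Mapping[str, Optional[str]]) -> str:
    """Fold each source into the running state in alphabetical order."""
    state = commit_hash  # R0
    for name in sorted(inputs):
        value = inputs[name]
        if value is None:
            continue  # missing sources are omitted entirely
        # R_i = SHA3(R_{i-1} + '|' + source_value), hex-encoded (assumed layout)
        state = hashlib.sha3_256(f"{state}|{value}".encode("utf-8")).hexdigest()
    return state
```

Because sources are sorted by name before mixing, the result does not depend on the order in which the inputs were fetched.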

Canonical Encoding Specification

For deterministic verification, the encoding must be precisely specified. Different implementations producing different encodings = different hashes = verification failure.

Encoding Rules

Rule	Specification
Character encoding	UTF-8, no BOM
Source ordering	Alphabetical by source name (case-sensitive)
Separator	Pipe character `|` (ASCII 124)
No trailing separator	String ends with last source value
No whitespace	No spaces, newlines, or padding
Null handling	Missing sources omitted entirely (not "null")

Input String Format per Source

All sources use the standardized format: {identifier}:{hash}:{tx_id}

For empty blocks (no user transactions), the format is: {identifier}:{hash}

Source	Format	Example
Aptos	`{block_height}:{block_hash}:{txid}`	`12345678:0xabc...:0xdef...`
Arbitrum	`{block_number}:{block_hash}:{txid}`	`442950038:0x9a8...:0xa4b...`
Base	`{block_number}:{block_hash}:{txid}`	`43507585:0x6f0...:0x63b...`
Bitcoin	`{block_height}:{block_hash}:{txid}`	`831245:00000000...3f:abc...`
Cardano	`{slot}:{block_hash}:{txid}`	`9876543:abc123...:def...`
Ethereum	`{block_number}:{block_hash}:{txid}`	`19234567:0x8a3f2e...b9:0x...`
Solana	`{slot}:{blockhash}:{txid}`	`245678901:5eykt4Uy...:3abc...`
Sui	`{checkpoint}:{digest}:{txid}`	`12345678:abc...:def...`

Note: Bitcoin uses the 2nd transaction (index 1) because the coinbase at index 0 is miner-controlled.
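
Building one source string, including the empty-block case, might look like the following. The helper name and parameter names are hypothetical; only the `{identifier}:{hash}:{tx_id}` layout comes from the specification above.

```python
from typing import Optional


def format_source_value(identifier: int, block_hash: str,
                        txid: Optional[str] = None) -> str:
    """Return '{identifier}:{hash}:{tx_id}', or '{identifier}:{hash}' for empty blocks."""
    parts = [str(identifier), block_hash]
    if txid:
        parts.append(txid)
    return ":".join(parts)
```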

Verification Algorithm

def canonical_encode(sources: dict) -> bytes:
    """
    Canonical encoding for beacon inputs.
    Returns UTF-8 bytes ready for SHA3-256 hashing.
    """
    # 1. Sort sources alphabetically
    sorted_names = sorted(sources.keys())

    # 2. Filter out None/missing sources
    present_sources = [name for name in sorted_names if sources[name] is not None]

    # 3. Join with pipe separator
    combined = "|".join(sources[name] for name in present_sources)

    # 4. Encode as UTF-8
    return combined.encode("utf-8")
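
As a usage sketch, the snippet below repeats the function in compact form so it runs on its own, then encodes a made-up input set with one missing source. The source values are placeholders, not real chain data.

```python
import hashlib
from typing import Dict, Optional


def canonical_encode(sources: Dict[str, Optional[str]]) -> bytes:
    """Compact restatement of the canonical encoding for this example."""
    present = [name for name in sorted(sources) if sources[name] is not None]
    return "|".join(sources[name] for name in present).encode("utf-8")


# Placeholder values; "bitcoin" is missing and is therefore omitted entirely.
example = {"solana": "3:cc:dd", "bitcoin": None, "aptos": "1:aa:bb"}
encoded = canonical_encode(example)            # b"1:aa:bb|3:cc:dd"
commit_hash = hashlib.sha3_256(encoded).hexdigest()
```

The dictionary's insertion order does not matter: sorting by source name fixes the order before joining.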

Verification Payload

Every round includes a complete verification payload:

{
  "round": 12345,
  "output": "a3f2e8c9d1b4f7e2a9c3d8b1e4f7a2c5d8e1b4f7a2c5d8e1b4f7a2c5d8e1b4f7",
  "die_value": 4,
  "generated_at": "2024-01-15T14:32:07Z",

  "verification": {
    "algorithm": "SHA3-256-sequential",
    "mixing_order": "alphabetical",
    "separator": "|",

    "inputs": [
      {
        "source": "aptos",
        "value": "12345678:0xabc...:0xdef...",
        "verifiable": true,
        "verification_url": "https://explorer.aptoslabs.com/block/12345678"
      },
      {
        "source": "bitcoin",
        "value": "831245:00000000000000000002a7c4...:abc...",
        "verifiable": true,
        "verification_url": "https://mempool.space/block/831245"
      }
      // ... other sources
    ],

    "commit_hash": "b7e2f1a9c3d8...",

    "recompute_steps": [
      "R0 = 'b7e2f1a9c3d8...' (commit_hash)",
      "R1 = SHA3(R0 + '|' + aptos_value)",
      "R2 = SHA3(R1 + '|' + arbitrum_value)",
      "R3 = SHA3(R2 + '|' + base_value)",
      "R4 = SHA3(R3 + '|' + bitcoin_value)",
      "R5 = SHA3(R4 + '|' + cardano_value)",
      "R6 = SHA3(R5 + '|' + ethereum_value)",
      "R7 = SHA3(R6 + '|' + solana_value)",
      "R8 = SHA3(R7 + '|' + sui_value) = output"
    ]
  }
}

What Auditors Can Verify

Check	Method	Trust Level
Hash correctness	Recompute `R1` through `R8` from `commit_hash` and the inputs, then check `R8 == output`	Mathematical certainty
Blockchain inputs	Query public explorers via verification URLs	Independently verifiable
Sequence integrity	No gaps in round numbers	Detectable manipulation
Timing consistency	Rounds spaced at expected intervals	Statistical verification
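
The sequence and timing checks lend themselves to a short audit script. This is a sketch only: the function names, the 1-second expected interval taken from the design goals, and the tolerance value are assumptions, and it expects round numbers and timestamps already sorted in publication order.

```python
from typing import List, Sequence, Tuple


def find_round_gaps(rounds: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (previous, current) pairs where round numbers are not consecutive."""
    return [(prev, cur) for prev, cur in zip(rounds, rounds[1:]) if cur != prev + 1]


def find_timing_anomalies(timestamps: Sequence[float], expected_s: float = 1.0,
                          tolerance_s: float = 0.2) -> List[Tuple[float, float]]:
    """Return (previous, current) timestamp pairs whose spacing is off."""
    return [(prev, cur) for prev, cur in zip(timestamps, timestamps[1:])
            if abs((cur - prev) - expected_s) > tolerance_s]
```

An empty result from both functions means no gaps or spacing anomalies were found in the sampled range; it says nothing about rounds outside that range.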

Known Limitations

Limitation	Impact	Mitigation
Observation window	Output computable ~0.5s before official publication	Use round start as lock-in point
Snapshot cutoff	Blocks after T+1000ms excluded from round	Explicitly documented; deterministic
Blockchain dependencies	If all 8 chains fail, no randomness	Extremely unlikely; degraded mode with partial sources
RPC provider trust	Must trust at least one provider per source	Multiple fallbacks; users can verify on explorers
No cryptographic proofs	Statistical validation only	Sufficient for most use cases

Verifier Architecture

Independent verifiers can confirm beacon outputs in real-time using simultaneous hash exchange.

Verifier Protocol

T+1000ms: Both beacon and verifiers snapshot blockchain data
T+1001ms: Both compute hash independently
T+1001ms: Both send their hash simultaneously (cross in flight)
T+1010ms: Both receive and compare locally
T+1000ms: Round N revealed, round N+1 begins

Key properties:

Hashes cross in flight — neither party sees the other's hash before sending
Verifiers watch the same 8 chains with same snapshot semantics
Mismatch detection is immediate; logged for investigation

Verifier Requirements

Requirement	Specification
Data sources	Same 8 chains, same RPC endpoints preferred
Snapshot time	T+1000ms (end of round)
Algorithm	Sequential SHA3-256, alphabetical order
Time sync	NTP synchronized, <50ms drift

Time Synchronization

Accurate time is critical. Both beacon and verifiers must agree on when each round starts.

Recommended NTP configuration (chrony):

server time.cloudflare.com iburst
server time.google.com iburst
server time.aws.amazon.com iburst
server time.nist.gov iburst

# Reject sources whose root distance exceeds 100ms
maxdistance 0.1
# Step the clock if the offset exceeds 10ms, but only during the first 3 updates
makestep 0.01 3

Monitoring: The beacon logs NTP status on startup and warns if drift exceeds 200ms.
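
A rough sketch of how a verifier might map clock time to rounds and classify its measured clock offset against the two thresholds mentioned in this document (under 50ms for verifiers, a warning above 200ms). The function names, the status strings, and the `genesis` parameter are hypothetical; how the offset is obtained from the NTP client is left out.

```python
import math

ROUND_SECONDS = 1.0  # one round per second, per the design goals


def round_index(unix_time: float, genesis: float = 0.0) -> int:
    """Index of the round containing `unix_time` (genesis is an assumed epoch)."""
    return math.floor((unix_time - genesis) / ROUND_SECONDS)


def drift_status(offset_ms: float, verifier_limit_ms: float = 50.0,
                 warn_ms: float = 200.0) -> str:
    """Classify a measured clock offset in milliseconds."""
    magnitude = abs(offset_ms)
    if magnitude > warn_ms:
        return "warn"          # beyond the beacon's warning threshold
    if magnitude >= verifier_limit_ms:
        return "out_of_spec"   # verifier requirement is <50ms
    return "ok"
```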

Appropriate Uses

Use Case	Why It Works
Lotteries & gambling	Unbiasable, verifiable, audit trail
Scientific experiments	Reproducible, transparent random seeds
AI/ML training	Consistent random initialization across runs
Governance selection	Fair jury/audit sampling
Games & entertainment	Transparent randomness players can verify
Distributed systems	Coordination requiring shared randomness

Inappropriate Uses

Use Case	Why It Doesn't Work	Alternative
Cryptographic key generation	Output is public and computable by observers, so it cannot serve as secret key material	Use `/dev/urandom` or hardware RNG
Situations requiring threshold signatures	Different trust model	Use drand
