Skip to main content

FNP - Security - Byzantine Fault Tolerance & Threat Model

Summary (Explain Like I’m 5)

Imagine a treasure hunt with 100 kids, but some are liars trying to trick everyone: Problem: You ask a kid “Where’s the treasure?”
  • Good kids tell the truth
  • Bad kids lie to mess things up
  • How do you trust the answer?
Solution - Byzantine Fault Tolerance:
  • Ask many kids and use majority vote
  • Even if 20 kids lie, 80 truth-tellers win
  • No single kid can trick everyone
In FNP: If one server is hacked or malicious:
  • ✓ It can’t forge signatures (Dilithium stops it)
  • ✓ It can’t reorder other users’ edits (LSEQ positions are fixed)
  • ✓ It can’t delete without audit trail (tombstones + Dilithium)
  • ✓ The document stays cryptographically valid even if 1 replica is evil

Technical Deep Dive

Byzantine Fault Tolerance is the ability to tolerate malicious (Byzantine) nodes that:
  • Refuse to participate
  • Send corrupted data
  • Arbitrarily deviate from protocol
  • Collude with other malicious nodes
FNP’s Threat Model (Realistic for collaborative editing):
ThreatAttacker PowerDefense
Signature ForgeryForge Dilithium signaturesCryptographically impossible (2^256 work)
Replay AttackReuse old operationsLamport clock + operation_id uniqueness
Delete ForgeryDelete other user’s editsOnly signer can delete (Dilithium)
Position CollisionTwo edits same LSEQ positionPosition generation impossible to predict
Timestamp ManipulationFake Lamport clock valuesServer checks monotonic increase
DoS AttackSend millions of invalid opsSignature verification before processing
Network SplitPartition replicasEventual consistency handles it
Colluding ReplicasMultiple evil serversDeterministic merge defeats coordination
Metadata LeakageLearn edit patternsBlind server verification (partial)

1. Signature Forgery Impossibility

Attacker goal: Forge Alice's signature on "DELETE all edits"

Attack attempt:
1. Attacker gets Alice's public key (vk_alice) [public]
2. Attacker tries to forge Dilithium signature
   - Must find (c̃, z) such that Verify(vk_alice, msg, (c̃, z)) = accept
   - Requires solving Module-LWE: find short vector in lattice
   - Module-LWE is NIST-hard: 2^256 operations minimum
3. Attacker computes for 100 years... still can't forge

Result: ✗ IMPOSSIBLE for all practical purposes
Why Module-LWE is hard:
  • Finding short vector in high-dimensional lattice = exponential work
  • No quantum algorithm known (contrasts with RSA/ECDSA)
  • Even with quantum computer: still 2^128 work (secure enough)

2. Replay Attack Prevention

Operation format in FNP:

{
  site_id: "alice@example.com",
  operation_id: 42,           ← unique per Alice
  lamport_clock: 1000,        ← monotonically increasing
  timestamp: 2025-11-28T10:00Z,
  content: "Hello",
  position: (1.5, "alice", 42),
  signature: DILITHIUM_SIG
}

Replay attack scenario:
1. Attacker captures operation #42 (INSERT "Hello" at 1.5)
2. Attacker sends it again next week
3. Server checks: operation_id 42 from alice@example.com
4. Server has: "operation 42 already applied" [deduplication log]
5. Server rejects: ✗ Duplicate operation

Prevention:
- Each site_id maintains monotonic operation_id (like Lamport clock)
- Server stores: {(site_id, operation_id) → timestamp}
- Replayed operation: operation_id is old → rejected

3. Delete Operation Security (Cryptographic Audit Trail)

Delete operation structure:

{
  type: "DELETE",
  position_id: (2.0, "alice", 100),  ← Which edit to delete?
  deletion_timestamp: 1001,
  signer: "alice@example.com",
  signature: DILITHIUM_SIG
}

Security properties:
1. Only Alice can delete Alice's edits (her signature required)
2. Delete is timestamped and signed (audit trail)
3. Tombstone remains: (2.0, "alice", 100, visible: false)
4. Bob/server can verify: "Alice deleted this, not server error"

Attack scenario:
- Attacker: "Alice wants to delete all her edits"
- Attacker sends: DELETE messages for all alice edits
- Problem: No Dilithium signature (attacker doesn't have alice's secret key)
- Server: ✗ Rejects all unsigned deletes

Result: Only Alice can delete Alice's edits (cryptographically enforced)

4. Position Collision Impossibility

LSEQ position generation (unpredictable):

position = (base_seq, site_id, clock)

where:
- base_seq: derived from parent position + random bits
- site_id: Alice_ID (constant, unique)
- clock: Lamport timestamp (increases, not guessable)

Collision attempt:
- Attacker wants to create operation at same position as Alice's #100
- Alice's position: (1.5, "alice", 100)
- Attacker generates position: (1.5, "alice", 100)
- Problem 1: Can't use site_id "alice" (attacker is "malicious_replica")
- Problem 2: Even with different site_id, positions compare by (level, digits)
- Problem 3: Parent-based generation means previous history matters

Result: ✗ Position collision requires breaking LSEQ generator

5. Lamport Clock Monotonicity Check

Server-side verification:

Received operation from Alice:
{
  lamport_clock: 1000,
  operation_id: 42
}

Server's state:
{
  last_lamport_from_alice: 999,
  last_operation_from_alice: 41
}

Check:
if (lamport_clock <= last_lamport_from_alice):
  reject("Clock didn't advance—replay or out-of-order")

if (operation_id <= last_operation_from_alice):
  reject("Operation ID didn't advance—replay")

if (lamport_clock != last_lamport_from_alice + 1):
  mark("Gap in clock—acceptable with offline edits")

Accept operation and update:
{
  last_lamport_from_alice: 1000,
  last_operation_from_alice: 42
}

Result: ✓ Prevents replay and out-of-order operations

6. Colluding Replicas Can’t Reorder Edits

Scenario: Two evil servers collude to reorder operations

Evil replica 1: "Reorder edits: (1.0, 'x') then (2.0, 'y')"
Evil replica 2: "Reorder edits: (2.0, 'y') then (1.0, 'x')"

Result when merging:
- CRDT commutativity check: sort positions by (level, digits)
- Position 1.0 < 2.0 always
- Merge produces: (1.0, 'x'), (2.0, 'y')
- Both replicas converge to same order
- Colluding attempt failed!

Why? Positions are immutable after insertion (no "reordering" in CRDT)

Mermaid Diagrams

Key Terms

  • Byzantine Fault → Node deviates arbitrarily (lies, corrupts, delays)
  • Byzantine Fault Tolerance → System works despite malicious nodes
  • Cryptographic Binding → Operation bound to signer via Dilithium
  • Audit Trail → Timestamped, signed record of all operations
  • Eventual Consistency → All honest replicas converge despite partition
  • Monotonic Clock → Clock value never decreases (prevents replay)
  • Signer-Based Deletion → Only message signer can delete their own message
  • Non-Repudiation → Signer can’t deny signing (Dilithium provides this)

Q/A

Q: What if the server is hacked? A: Server can’t forge signatures (attacker doesn’t have users’ secret keys). Can’t reorder edits (CRDT enforces position order). Can’t delete others’ edits (only signer can). Can’t replay ops (Lamport clock check). Hacked server can’t break protocol—at worst, goes offline. Q: Does FNP need a majority quorum like Raft? A: No. FNP uses deterministic CRDT merge—no voting needed. Any subset of replicas converging gives same result. This is weaker than Byzantine consensus (which needs 2f+1 nodes for f faults) but stronger than traditional replication. Q: What if attacker is Alice herself? A: Alice has legitimate signing key. She can delete her own edits (that’s fine). She can forge signatures only on her own operations. She can’t forge Bob’s signatures (different secret key). Design assumes: Users trust their own devices (local security is prerequisite). Q: Can an attacker learn what users are editing? A: Server sees operation positions, timestamps, sizes. Server doesn’t see content (encrypted). Blind server verification means server can’t verify correctness of proof. Attack surface: timing patterns, operation sizes, edit frequency—not operation content. Q: What if two servers disagree on operation order? A: CRDT resolves it! Positions are absolute (not relative to server). Any subset of servers applying operations deterministically converges. Order disagreement is impossible because LSEQ positions create total ordering. Q: How is FNP different from Blockchain? A: Blockchain needs consensus (expensive). FNP uses CRDT (no consensus). Blockchain works with arbitrary adversaries. FNP assumes users keep their keys secret. Blockchain is decentralized; FNP assumes trusted client devices. FNP is much faster and simpler.

Example / Analogy

Notarized Document Analogy: Without Byzantine tolerance (fragile):
  • Server: “Alice edited the document”
  • If server hacked: attacker can forge edits, delete things
  • No way to verify: “Was this really Alice?”
With Byzantine tolerance (robust):
  • Notary (Dilithium): “Alice personally signed this edit”
  • Attacker hacks server: “I’ll say Bob edited this”
  • Problem: Attacker doesn’t have Bob’s signing stamp (secret key)
  • Notary rejects: “Stamp is forged”
  • Document remains cryptographically proven authentic
FNP Real-world:
  • Company document, 50 editors, 1 server hacked
  • Hacker tries: “I deleted all security reviews”
  • Problem: Each review is signed by original author
  • Server: “Which author’s secret key signed this delete?”
  • Hacker: “Uh… I don’t have anyone’s secret key”
  • System: ✓ Rejects all unsigned deletes
  • Document stays secure despite hacked server

Cross-References: Dilithium Signatures, LSEQ CRDT, Lamport Clocks, Security & Zero Trust Category: Security | Distributed Systems | Byzantine Tolerance | Threat Model Difficulty: Advanced ⭐⭐⭐⭐⭐ Updated: 2025-11-28