RFC 9106: WEAVE - P2P Mesh Network Protocol (Core Specification)

1. OVERVIEW

WEAVE (Wireless Edge-Aware Vector Exchange) is a Latency-Based Causal Broadcast (LCB) protocol for multi-hop mesh networks with strict latency bounds and Byzantine fault tolerance. Key properties:
  • P99 latency < 8ms (default), configurable 5ms-15ms
  • Causal ordering: receive events after all causally-prior events
  • Byzantine-resilient routing: detects and isolates faulty peers
  • Multi-underlay support: WebRTC/QUIC/BLE/TCP fallback

2. PROTOCOL MODEL

2.1 Network Topology

         ┌─────────────┐
         │   Peer 1    │
         │ (Mobile)    │
         └──────┬──────┘
                │ WebRTC
     ┌──────────┼──────────┐
     │          │          │
┌────▼────┐ ┌──▼──────┐ ┌──▼─────┐
│ Peer 2  │ │ Peer 3  │ │ Peer 4 │
│(Router) │ │(Desktop)│ │(Mobile)│
└────┬────┘ └──┬──────┘ └───┬────┘
     │ QUIC    │ TCP        │ BLE
     └─────────┼────────────┘

          ┌────▼────┐
          │ Peer 5  │
          │(Server) │
          └─────────┘

Each peer can reach others via multiple underlays.
WEAVE selects path dynamically based on latency + reliability.

2.2 Vector Clock Model

Each peer maintains a logical timestamp:
VectorClock = Map<SiteID, u64>

Peer A:  { A: 10, B: 5, C: 7 }
Peer B:  { A: 8, B: 12, C: 6 }

Happens-Before Relation:
  VC1 < VC2 iff:
    - VC1[i] ≤ VC2[i] for all i
    - VC1[j] < VC2[j] for some j

Example: { A: 5 } < { A: 5, B: 1 } < { A: 6, B: 2 }
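The happens-before test above can be sketched in Python; entries absent from a clock are treated as 0, so `{A: 5}` and `{A: 5, B: 0}` compare as equal:

```python
def happens_before(vc1, vc2):
    """VC1 < VC2: every component of VC1 is <= the matching component
    of VC2, and at least one is strictly smaller. Absent peers count as 0."""
    keys = set(vc1) | set(vc2)
    return (all(vc1.get(k, 0) <= vc2.get(k, 0) for k in keys)
            and any(vc1.get(k, 0) < vc2.get(k, 0) for k in keys))

def concurrent(vc1, vc2):
    """Neither clock dominates the other: the events are causally unrelated."""
    return not happens_before(vc1, vc2) and not happens_before(vc2, vc1)
```

With the clocks from 2.2, Peer A's `{A: 10, B: 5, C: 7}` and Peer B's `{A: 8, B: 12, C: 6}` are concurrent: neither dominates the other.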

2.3 Causal Ordering

Events at Peer A:
  e1: InsertOp("x") [VC: {A: 1, B: 0}]
  e2: InsertOp("y") [VC: {A: 2, B: 0}] (caused by e1)

Broadcast to Peer B:

Order of delivery:
  1. Peer B receives e2 first (arrives via fast path)
  2. Peer B checks: VC(e2) = {A: 2, B: 0}
  3. Peer B requires A's event 1 first: VC(e2)[A] = 2, but B has delivered no events from A
  4. Peer B queues e2 until e1 arrives
  5. e1 arrives: VC(e1) = {A: 1, B: 0} ✓
  6. Deliver both: e1 then e2 ✓ CAUSAL ORDER PRESERVED

3. LATENCY-BASED CAUSAL BROADCAST ALGORITHM

3.1 Message Broadcast

Broadcast(message):
  1. Increment local VC: vc[self] += 1
  2. Tag message: message.vc = current_vc
  3. Send to all peers via best-latency underlays

Pseudocode:
  vc[self] += 1
  message.vector_clock = vc.copy()
  for peer in reachable_peers:
    underlay = select_underlay(peer)  // WebRTC/QUIC/BLE
    send_async(message, underlay)
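The broadcast step as a minimal runnable sketch; `select_underlay` and `send_async` are caller-supplied hooks here (hypothetical signatures, not a reference implementation):

```python
def broadcast(payload, self_id, vc, reachable_peers, select_underlay, send_async):
    """LCB broadcast: bump our own component, tag the message with a
    snapshot of the clock, then fan out over the best underlay per peer."""
    vc[self_id] = vc.get(self_id, 0) + 1
    message = {"sender": self_id, "vc": dict(vc), "payload": payload}
    for peer in reachable_peers:
        send_async(message, select_underlay(peer))  # e.g. WebRTC/QUIC/BLE
    return message
```

Note the clock is copied into the message: later local increments must not mutate an already-sent timestamp.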

3.2 Message Delivery

OnReceive(message):
  1. Extract message VC
  2. Check causal dependencies
  3. Queue if missing causally-prior events
  4. When dependencies met: deliver & update local VC

Pseudocode:
  msg_vc = message.vector_clock
  missing_vc = find_missing_causal_deps(msg_vc)

  if not missing_vc.empty():
    queue_pending(message)
    request_missing(missing_vc)
  else:
    deliver(message)
    vc[sender] = msg_vc[sender]  // sender's component advances to the delivered event
    check_pending_queue()

3.3 Pending Event Queue

PendingQueue: Vec<(Sender, VC, Message)>  // re-scanned after each delivery (VCs are only partially ordered)

On deliver(msg):
  vc_local = update_vc(vc_local, msg.vc)

  // Re-scan the queue until no further message becomes deliverable
  repeat:
    delivered_any = false
    for (sender, queued_vc, queued_msg) in pending_queue:
      if can_deliver(sender, queued_vc, vc_local):
        deliver(queued_msg)
        pending_queue.remove(queued_vc)
        delivered_any = true
  until not delivered_any

can_deliver(sender, msg_vc, vc_local) -> bool:
  // Deliverable iff this is the next event from its sender and every
  // other dependency in its clock has already been delivered locally
  if msg_vc[sender] != vc_local[sender] + 1:
    return false  // Not the next event from sender
  for peer in msg_vc.keys() where peer != sender:
    if msg_vc[peer] > vc_local[peer]:
      return false  // Missing causal event
  return true
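Sections 3.2-3.3 combined as a small runnable sketch: received messages go into a pending list, and the list is re-scanned after every delivery until nothing further is deliverable (the message shape is an illustrative dict, not a wire format):

```python
def can_deliver(msg_vc, sender, vc_local):
    """Deliverable iff this is the next event from its sender and every
    other dependency recorded in its clock has already been delivered."""
    for peer, count in msg_vc.items():
        if peer == sender:
            if count != vc_local.get(peer, 0) + 1:
                return False  # not the next event from this sender
        elif count > vc_local.get(peer, 0):
            return False      # missing causal event from another peer
    return True

def on_receive(message, vc_local, pending, delivered):
    """Queue the message, then drain everything that has become deliverable."""
    pending.append(message)
    progress = True
    while progress:
        progress = False
        for msg in list(pending):
            if can_deliver(msg["vc"], msg["sender"], vc_local):
                pending.remove(msg)
                delivered.append(msg["payload"])
                vc_local[msg["sender"]] = msg["vc"][msg["sender"]]
                progress = True
```

This replays the e1/e2 scenario from 2.3: e2 arrives first and is queued; once e1 arrives, both deliver in causal order.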

4. MULTI-UNDERLAY ROUTING

4.1 Underlay Selection

Each peer maintains latency metrics per underlay:
message UnderlayMetrics {
  enum Underlay {
    WEBRTC = 0;
    QUIC = 1;
    BLE = 2;
    TCP = 3;
  }

  Underlay type = 1;
  uint32 p99_latency_ms = 2;
  float packet_loss_rate = 3;
  uint32 bandwidth_mbps = 4;
  bool available = 5;
}

message PeerMetrics {
  string peer_id = 1;
  repeated UnderlayMetrics underlay_metrics = 2;
  uint32 vector_clock_size = 3;
  uint64 last_seen_ms = 4;
}

4.2 Latency-Based Route Selection

SelectUnderlay(target_peer) -> Underlay:
  1. Get all available underlays for target
  2. Sort by P99 latency
  3. Select lowest latency (but with packet loss < 5%)
  4. If primary unavailable, fallback to next

Algorithm:
  available = [
    (WEBRTC, 3ms, 0.1%),
    (QUIC, 5ms, 0.5%),
    (TCP, 20ms, 1%)
  ]

  for (underlay, latency, loss) in sorted_by_latency(available):
    if latency < config.p99_target and loss < 5%:
      return underlay

  return lowest_latency(available)  // No underlay meets the target: best effort
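The selection rule as a runnable sketch; metrics are illustrative `(name, p99_latency_ms, loss_rate)` tuples rather than the UnderlayMetrics wire format, and the list is assumed pre-filtered to underlays that are currently up:

```python
def select_underlay(available, p99_target_ms=8.0, max_loss=0.05):
    """Pick the fastest underlay that meets both the latency target and
    the loss bound; if none qualifies, fall back to the fastest available."""
    by_latency = sorted(available, key=lambda u: u[1])
    for name, latency, loss in by_latency:
        if latency < p99_target_ms and loss < max_loss:
            return name
    return by_latency[0][0]  # best effort when nothing meets the target
```

Note a fast-but-lossy underlay is skipped: low latency alone does not qualify.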

4.3 Underlay Characteristics

  Underlay | Latency   | Bandwidth    | Reliability | Use Case
  ---------+-----------+--------------+-------------+----------------
  WebRTC   | 2-5ms     | 10-100 Mbps  | 99.5%       | LAN P2P
  QUIC     | 5-15ms    | 50-500 Mbps  | 99.9%       | WAN unicast
  BLE      | 50-200ms  | 1 Mbps       | 70%         | Low-power nodes
  TCP      | 10-100ms  | 1-1000 Mbps  | 99%         | Fallback

5. BYZANTINE FAULT DETECTION

5.1 Anomaly Detection

Detect faulty peers by monitoring:
1. Vector Clock Anomalies:
   - Peer reports impossible VC (future timestamp)
   - VC doesn't advance monotonically
   - VC includes unknown peers

2. Latency Anomalies:
   - Message arrives violating causality
   - P99 latency > 3x baseline
   - Sudden packet loss spike

3. Ordering Anomalies:
   - Peer delivers messages out of causal order
   - Duplicate message IDs (replay attack)
   - Invalid cryptographic signatures

5.2 Reputation System

message PeerReputation {
  string peer_id = 1;
  uint32 total_messages = 2;
  uint32 anomalies_detected = 3;
  float reputation_score = 4;  // [0.0, 1.0]
  uint64 last_updated_ms = 5;
}

reputation = 1.0 - (anomalies / total_messages)

if reputation < 0.2:
  action = DISCONNECT    // Remove from network
elif reputation < 0.5:
  action = QUARANTINE    // Stop trusting peer
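The scoring rule as a sketch, with the stricter threshold checked first so a peer below 0.2 is disconnected rather than merely quarantined; the zero-history case returning TRUST is an assumption, not specified above:

```python
def reputation_action(total_messages, anomalies_detected):
    """Returns (score, action) per the thresholds in 5.2."""
    if total_messages == 0:
        return 1.0, "TRUST"  # no history yet (assumption: new peers start neutral)
    score = 1.0 - anomalies_detected / total_messages
    if score < 0.2:
        return score, "DISCONNECT"  # remove from network
    if score < 0.5:
        return score, "QUARANTINE"  # stop trusting peer
    return score, "TRUST"
```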

5.3 Byzantine Resilience

With vector clocks:
  - Non-Byzantine peer can reconstruct causal history
  - Byzantine peer can't forge causal order (breaks VC monotonicity)
  - System tolerates f < n/2 Byzantine peers (detection by majority voting)

Example (n=5 peers, f=2 Byzantine):
  Peer 1 (honest):   VC = {1: 10, 2: 8, 3: 7, 4: 6, 5: 5}
  Peer 2 (honest):   VC = {1: 10, 2: 8, 3: 7, 4: 6, 5: 5}
  Peer 3 (honest):   VC = {1: 10, 2: 8, 3: 7, 4: 6, 5: 5}
  Peer 4 (Byzantine): VC = {1: 100, 2: 8, 3: 7, 4: 999, 5: 5} ✗
  Peer 5 (Byzantine): VC = {1: 1, 2: 8, 3: 7, 4: 6, 5: 5} ✗

  Consensus: Majority of 3 honest peers agree on VC
  Byzantine peers detected and quarantined
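The majority check in the example can be sketched as follows: peers whose reported clock differs from the strict-majority value are flagged. This is a sketch of the detection step only, not a full consensus protocol, and it is sound only when f < n/2:

```python
from collections import Counter

def vc_majority(reports):
    """reports: {peer_id: vector_clock}. Returns (consensus_vc, suspects),
    or (None, all peers) when no strict majority exists."""
    tally = Counter(tuple(sorted(vc.items())) for vc in reports.values())
    winner, votes = tally.most_common(1)[0]
    if votes <= len(reports) // 2:
        return None, set(reports)  # no strict majority: cannot decide
    consensus = dict(winner)
    suspects = {p for p, vc in reports.items() if vc != consensus}
    return consensus, suspects
```

Running it on the five reports above flags peers 4 and 5.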

6. CONVERGENCE GUARANTEES

6.1 Strong Eventual Consistency (SEC)

Theorem: Two non-partitioned peers eventually reach identical state

Proof:
  1. All broadcast messages reach all peers (eventually)
  2. Causal order delivery ensures deterministic state merge
  3. If M1 sent by peer A and M2 sent by peer B:
     - If causal: both receive in same order (M1, M2)
     - If concurrent: both use CRDT merge (FNP/TNP) → same result
  4. Therefore: lim(t→∞) state(A) = state(B) ✓

6.2 Partition Handling

Before partition:
  Peer A ←─────────→ Peer B
          Partition
  Peer A ──────X────── Peer B

During partition:
  Both continue operating (diverge)
  Each maintains its VC independently

After heal:
  Exchange all missed messages (from both sides)
  Causal order determines merge sequence
  State converges ✓

Example:
  A's VC during partition: {A: 100, B: 50}  (A last heard from B at 50)
  B's VC during partition: {A: 50, B: 100}  (B last heard from A at 50)

  After heal, B receives A's events 51-100 (VC > local)
  A receives B's events 51-100 (VC > local)
  Both merge using FNP/TNP deterministically ✓
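After a heal, each side's clock becomes the component-wise maximum of the two diverged clocks, i.e. the smallest clock that dominates both. A one-line sketch:

```python
def merge_vc(vc_a, vc_b):
    """Component-wise maximum of two vector clocks; absent entries count as 0."""
    return {k: max(vc_a.get(k, 0), vc_b.get(k, 0)) for k in set(vc_a) | set(vc_b)}
```

With the partition example above, both sides converge to the same clock `{A: 100, B: 100}`.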

7. CONVERGENCE TIME ANALYSIS

7.1 Latency Bound

P99 latency = time for message to reach 99% of peers
With LCB:
  Network depth D = max hops to reach any peer
  Underlay latency L = max hop latency
  Queue latency Q = time waiting for causal deps

  P99(delivery) ≤ D×L + Q

Default config:
  D = 3 hops (typical mesh)
  L = 3ms per hop
  Q = 2ms (causal deps)

  P99 ≤ 3×3 + 2 = 11ms (target: 8ms, reachable by reducing D or L through topology optimization)

7.2 Causal Dependency Queue

Best case: Message has no causal dependencies → O(1)
  Deliver immediately

Worst case: Message depends on all other peers
  Queue until all causal events received

Typical case: Message depends only on the sender's previous event → O(1)
  The queue drains as soon as that event arrives

8. STATE MACHINE

WEAVE Peer State Machine:

        ┌─────────────────────┐
        │   INITIALIZING      │
        │ • Load local state  │
        │ • Bootstrap VC      │
        └──────────┬──────────┘
                   │ complete
        ┌──────────▼──────────┐
        │    LISTENING        │
        │ • Accept peers      │
        │ • Exchange VC       │
        └──────────┬──────────┘
                   │ peer_connected
        ┌──────────▼──────────┐
        │    SYNCING          │
        │ • Exchange history  │
        │ • Merge causal gaps │
        └──────────┬──────────┘
                   │ sync_complete
        ┌──────────▼──────────┐
        │    OPERATIONAL      │◄─────┐
        │ • Broadcast msgs    │      │
        │ • Deliver causally  │      │
        │ • Monitor metrics   │──────┘
        └──────────┬──────────┘  periodic_sync
                   │ peer_lost
        ┌──────────▼──────────┐
        │ DEGRADED_MODE       │
        │ • Single peer       │
        │ • Limited topology  │
        └──────────┬──────────┘
                   │ peers_restored
                   └─────────────────┘

9. INTEGRATION WITH FNP/VEST/TNP

9.1 Message Content

WEAVE broadcasts messages from upper layers:
TNP Operations:
  Broadcast(TNPOperation) → WEAVE
  WEAVE vector-clock tags → causal delivery
  → FNP merges deterministically
  → VEST audit signatures for non-repudiation

FNP Operations:
  Broadcast(FNPOperation) → WEAVE
  WEAVE ensures causal order (all deps received first)
  → FNP three-way merge produces identical state

VEST Signatures:
  Broadcast(AuditSignature) → WEAVE
  WEAVE ensures temporal ordering
  → VEST audit trail tamper-evident

9.2 Clock Synchronization

Lamport clock  ← provided by TNP (total order)
Vector clock   ← maintained locally by WEAVE (causal order)
Roughtime      ← external time source, consumed by VEST (wall-clock bounds)

Synchronized:
  VEST timestamp checks message arrival
  TNP Lamport provides total order
  WEAVE Vector provides causal order
  All three coordinated for strong consistency

10. CONFORMANCE REQUIREMENTS

All WEAVE implementations MUST:
  1. ✅ Implement Latency-Based Causal Broadcast
  2. ✅ Support vector clock ordering
  3. ✅ Queue causal dependencies
  4. ✅ Support multi-underlay routing (WebRTC + fallback)
  5. ✅ Monitor and report latency metrics
  6. ✅ Detect Byzantine anomalies
  7. ✅ Handle network partitions correctly
  8. ✅ Guarantee strong eventual consistency
  9. ✅ Achieve P99 latency < config.max_latency_ms

11. EXAMPLES

Example 1: Causal Delivery

Timeline:
  t=0:  Peer A broadcasts "insert X" (VC: {A:1})
  t=1:  Peers B and C receive "insert X"
  t=2:  Peer C broadcasts "insert Y" (VC: {A:1, C:1})
  t=3:  Peer B broadcasts "insert Z" (VC: {A:1, B:1})
  t=4:  Peer B receives "insert Y"; Peer C receives "insert Z"

Causal dependencies:
  "insert X" (VC={A:1}) → "insert Y" (VC={A:1,C:1})
  "insert X" (VC={A:1}) → "insert Z" (VC={A:1,B:1})
  "insert Y" ∥ "insert Z"  (concurrent: neither clock dominates the other)

Delivery order (all peers):
  1. "insert X"
  2. "insert Y" and "insert Z" (concurrent, FNP merge)

Result: All peers state identical ✓

REFERENCES

  • “Latency-Based Causal Broadcast” (distributed systems literature)
  • RFC 9100: TNP (temporal ordering)
  • RFC 9104: FNP (deterministic merge)
  • RFC 9102: VEST (signatures)

END OF RFC 9106