The Stochastic Ceiling: Probabilistic Byzantine Limits in Scaling Networks

March 24, 2015 · 17 min read

Denis Tumpic

Grand Inquisitor at Technica Necesse Est

Luka Pogrešnik

Biohacker Full of Errors

Gen Duh

Biohacker of Ghosts in DNA

Krüsz Prtvoč

Latent Invocation Mangler

Introduction: The Biohacker's Dilemma in Decentralized Biology

You've built your first distributed biological sensor network. Three Arduino-based PCR thermocyclers, each running a custom fork of the OpenPCR firmware, sample saliva from members of your household every $4$ hours. Each node independently runs a variant of a BFT (Byzantine Fault Tolerant) consensus algorithm, specifically PBFT with $n = 3f + 1$, to agree on the presence of a pathogen signature. You've read the papers. You know that to tolerate one faulty node you need four in total. For two, seven. Three? Ten. You've wired everything together with MQTT brokers, added TLS certificates from Let's Encrypt, and even attached a Raspberry Pi as a "trusted" coordinator. You feel proud.

A note on scientific iteration: This document is a living record. In the spirit of rigorous science, we prioritize empirical accuracy over legacy. Content may be discarded or updated as better evidence emerges, so that this resource reflects our most current understanding.

Then, one night, your system reports a false positive: “SARS-CoV-2 detected in kitchen tap water.” But you never tested the tap. You tested three people. All negative.

You check the logs. One node, your cousin's old Raspberry Pi 3B running a modified 2018 build of Raspbian, had a corrupted SD card. It started emitting random base64 strings as “sequence reads.” The other two nodes, both properly calibrated, reported negative. But because your system was configured for n = 3f + 1 with f=1, it accepted the outlier. The consensus algorithm didn't fail; it worked as designed. Your trust is what failed.

This is not a bug. It is a mathematical inevitability.

Welcome to the Stochastic Trust Maximum (STM): the point at which increasing the number of nodes in a distributed biological system reduces overall trustworthiness instead of increasing it. This is not theoretical. It happens in garage labs, university biohacker collectives, and DIY CRISPR diagnostic kits. And if you are adding nodes because “more is better,” you are not building resilience; you are building a statistical trap.

In this document, we'll unpack why traditional BFT consensus, designed for data centers and financial ledgers, is fundamentally mismatched with biological systems. We'll derive the Stochastic Trust Maximum using probability theory, show how it manifests in real biohacker setups, and then give you a practical, hands-on protocol for optimizing your node count based on real failure rates, not textbook assumptions.

This isn't about trusting more nodes. It's about trusting the right nodes, and knowing when to stop adding them.

The BFT Myth in Biological Contexts

What BFT Was Designed For

Byzantine Fault Tolerance (BFT) was conceived in the 1980s by Leslie Lamport, Robert Shostak, and Marshall Pease to solve the “Byzantine Generals Problem”: a distributed computing puzzle in which some generals (nodes) may be traitors who send conflicting orders to allied armies. The solution: with n generals and up to f traitors, you need at least n ≥ 3f + 1 to reach agreement.

This is mathematically elegant. In a controlled environment (say, a data center with identical hardware, secure boot, and monitored network traffic) it works. Nodes are predictable. Failures are rare. Malice is an edge case.

But biological systems? They are chaotic.

Your PCR machine has no secure environment. It runs on a $35 Raspberry Pi with an unpatched kernel. Your temperature sensor drifts by 0.7°C over time. Your DNA extraction kit has a 3% contamination rate. Your lab assistant forgets to calibrate the centrifuge. The Wi-Fi drops every time the microwave runs.

In BFT, “malice” is assumed to be intentional. In biology, it’s mostly accidental.

Yet most DIY bio-consensus protocols still enforce n = 3f + 1. Why? Because it’s what the papers say. Because “it’s proven.” But proving something in a controlled simulation is not the same as deploying it in a garage lab with 12-year-old kids running the nodes.

Let’s reframe this: BFT assumes adversarial malice. Biology assumes stochastic failure.

These are not the same.

Stochastic Reliability Theory: The Math Behind the Mess

Defining the Problem Mathematically

Let’s define:

n = total number of nodes in your system
p = probability that any single node fails (due to hardware error, software bug, contamination, user error, etc.)
f = number of faulty nodes the system can tolerate (typically set to floor((n−1)/3) in BFT)
P(success) = probability that the system reaches correct consensus
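The relationship between $n$ and $f$ is simple enough to sketch in Python. The helper names below (`max_tolerated_faults`, `min_nodes_for`) are illustrative assumptions, not part of any BFT library.

```python
def max_tolerated_faults(n: int) -> int:
    """Largest f satisfying n >= 3f + 1, i.e. floor((n - 1) / 3)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


def min_nodes_for(f: int) -> int:
    """Smallest n that tolerates f faulty nodes under n >= 3f + 1."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


if __name__ == "__main__":
    for f in (1, 2, 3):
        print(f"f={f} needs n >= {min_nodes_for(f)}")
    for n in (5, 6, 8, 20):
        print(f"n={n} tolerates f={max_tolerated_faults(n)}")
```

Note that nodes added between two thresholds (for example the fifth and sixth) leave $f$ unchanged.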

We’re not assuming malicious actors. We’re assuming random failures. This is critical.

In a typical BFT setup, consensus fails if more than $f$ nodes fail. So the probability of system failure is:

$P(\text{failure}) = \sum_{k=f+1}^{n} \left[C(n,k) \times p^k \times (1-p)^{n-k}\right]$

Where $C(n,k)$ is the binomial coefficient: "number of ways to choose $k$ faulty nodes from $n$ total."

This is the binomial distribution of node failures. And it’s not linear.
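As a minimal sketch, the sum above translates almost term for term into Python using only the standard library. The function name `bft_failure_probability` is an assumption made for illustration, and the model assumes independent node failures, as in the definitions above.

```python
from math import comb
from typing import Optional


def bft_failure_probability(n: int, p: float, f: Optional[int] = None) -> float:
    """P(more than f of n independent nodes fail), each failing with probability p."""
    if f is None:
        f = (n - 1) // 3  # default BFT tolerance: floor((n - 1) / 3)
    # Sum the binomial probabilities for k = f+1 .. n faulty nodes.
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(f + 1, n + 1))


if __name__ == "__main__":
    # Four nodes, 10% per-node failure rate, tolerating one fault.
    print(round(bft_failure_probability(4, 0.1), 4))
```

Passing `f` explicitly lets you evaluate protocols whose tolerance is not the BFT default.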

Let’s run a simple example.

Case Study: Your 5-Node Bio-Sensor Array

You have five nodes. You assume each has a $10\%$ chance of failing independently ($p = 0.1$). You set $f=1$. BFT needs $n \geq 3f+1 = 4$ nodes for that, and you have five, so $f=1$ is supported.

You think: "With $5$ nodes, I can tolerate one failure. That's robust."

But what's the actual probability that more than one node fails?

$\begin{aligned} P(\text{failure} > 1) &= 1 - [P(0 \text{ failures}) + P(1 \text{ failure})] \\ &= 1 - [C(5,0)(0.9)^5 + C(5,1)(0.1)(0.9)^4] \\ &= 1 - [0.59049 + 5 \times 0.1 \times 0.6561] \\ &= 1 - [0.59049 + 0.32805] \\ &= 1 - 0.91854 = \mathbf{0.08146} \end{aligned}$

So there is an $8.1\%$ chance that more than one node fails and takes your consensus with it.

Now, what if you add a sixth node? With $n=6$, $f$ is still $1$ (since $3f+1 \leq 6 \rightarrow f \leq 1.67$, so $\text{floor}=1$). Same tolerance.

$\begin{aligned} P(\text{failure} > 1) &= 1 - [C(6,0)(0.9)^6 + C(6,1)(0.1)(0.9)^5] \\ &= 1 - [0.531441 + 6 \times 0.1 \times 0.59049] \\ &= 1 - [0.531441 + 0.354294] \\ &= 1 - 0.885735 = \mathbf{0.114265} \end{aligned}$

Your failure probability just increased from $8.1\%$ to $11.4\%$.

You added a node—and made your system less reliable.

This is the Stochastic Trust Maximum in action.
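If the closed-form numbers feel too abstract, a quick Monte Carlo run is one way to sanity-check them. The sketch below assumes independent failures; the function name, trial count, and seed are arbitrary choices for illustration.

```python
import random


def simulate_failure_rate(n: int, p: float, f: int,
                          trials: int = 200_000, seed: int = 42) -> float:
    """Estimate P(more than f of n nodes fail) by random sampling."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    system_failures = 0
    for _ in range(trials):
        # Each node fails independently with probability p.
        faulty = sum(rng.random() < p for _ in range(n))
        if faulty > f:
            system_failures += 1
    return system_failures / trials


if __name__ == "__main__":
    for n in (5, 6):
        rate = simulate_failure_rate(n, p=0.1, f=1)
        print(f"n={n}: simulated failure rate ~ {rate:.3f}")
```

The estimates should land close to the analytic values of about $0.081$ and $0.114$, with the six-node array failing more often than the five-node one.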

The STM Curve: A Graph of Inevitability

Let's tabulate $P(\text{failure} > f)$ for different $n$, with $p=0.1$ and $f = \lfloor (n-1)/3 \rfloor$.

$n$	$f$	$P(\text{failure} > f)$
$3$	$0$	$0.271$
$4$	$1$	$0.0523$
$5$	$1$	$0.0815$
$6$	$1$	$0.1143$
$7$	$2$	$0.0257$
$8$	$2$	$0.0381$
$9$	$2$	$0.0530$
$10$	$3$	$0.0128$
$15$	$4$	$0.0127$
$20$	$6$	$0.0024$

Notice the pattern?

At $n=4$, $P(\text{failure})$ drops sharply because $f$ increases from $0$ to $1$.
But at $n=5,6$? $P(\text{failure})$ rises even though $f$ is unchanged.
At $n=7$, it drops again because $f$ increases to $2$.
But then at $n=8,9$? It rises again.

The curve is not monotonic. It's a sawtooth: the failure probability falls each time $f$ steps up (at $n = 3f+1$) and climbs with every node added in between.

This is the Stochastic Trust Maximum: the point where adding more nodes increases system failure probability due to binomial growth in multi-node failures.

For $p=0.1$, the lowest failure probability within each tolerance tier occurs at $n = 3f+1$ ($n=4$, $7$, $10$), and a node added past that point buys nothing until the next tier is reached: $n=6$ ($0.1143$) is more than twice as likely to fail as $n=4$ ($0.0523$).
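A sweep like this can be computed directly from the failure formula. The following is a sketch under the same independence assumption; `failure_probability` and `sweep` are hypothetical helper names.

```python
from math import comb


def failure_probability(n: int, p: float) -> float:
    """P(more than f = floor((n - 1) / 3) of n independent nodes fail)."""
    f = (n - 1) // 3
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(f + 1, n + 1))


def sweep(p: float, node_counts):
    """Return (n, f, failure probability) rows for each node count."""
    return [(n, (n - 1) // 3, failure_probability(n, p)) for n in node_counts]


if __name__ == "__main__":
    previous = None
    for n, f, prob in sweep(p=0.1, node_counts=range(3, 11)):
        # Mark whether this node count raised or lowered the failure probability.
        trend = "" if previous is None else ("up" if prob > previous else "down")
        print(f"n={n:2d}  f={f}  P(failure)={prob:.4f}  {trend}")
        previous = prob
```

The `up`/`down` column makes the sawtooth visible without a plot: the probability goes down whenever $f$ steps up and up otherwise.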

For $p=0.2$ ? Let's recalculate:

$n$	$f$	$P(\text{failure} > f)$
$3$	$0$	$0.488$
$4$	$1$	$0.1808$
$5$	$1$	$0.2627$
$6$	$1$	$0.3446$
$7$	$2$	$0.148$

Here the tier minima are $n=4$ ($0.1808$) and $n=7$ ($0.148$), and $n=7$ is the lower of the two.

At $p=0.3$ :

$n$	$f$	$P(\text{failure} > f)$
$3$	$0$	$0.657$
$4$	$1$	$0.3483$
$5$	$1$	$0.4718$
$6$	$1$	$0.5798$
$7$	$2$	$0.353$

Tier minima again at $n=4$ and $n=7$; this time $n=4$ is marginally lower.

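At $p=0.4$:

$n$	$f$	$P(\text{failure} > f)$
$3$	$0$	$0.784$
$4$	$1$	$0.5248$
$5$	$1$	$0.6630$
$6$	$1$	$0.7667$
$7$	$2$	$0.5801$

Minimum at $n=4$, but every configuration here fails more often than it succeeds.

Wait: $n=4$ beats its neighbors $n=3$, $5$, and $6$ in every table.

The Universal STM Rule

Evaluating the failure formula across $p \in [0.01, 0.5]$, we observe:

$\boxed{\text{Within each tolerance tier (fixed } f\text{), the Stochastic Trust Maximum (STM) sits at } n = 3f+1\text{; every node added beyond it raises the failure probability.}}$

In other words:

If you are not going to reach the next tier, stop at $n = 3f+1$: $4$, $7$, $10$. Nodes $5$ and $6$ only add ways to fail.
If your nodes are fairly reliable ($p \leq 0.2$ in the tables above), a full tier step helps: $n=7$ beats $n=4$.
Around $p = 0.3$, $n=4$ and $n=7$ are nearly tied, and three extra nodes buy you nothing.
If your nodes fail more than a third of the time, the expected number of faulty nodes ($np$) outruns $f \approx n/3$, and larger networks tend to fail more often, as at $p=0.4$. Stop. Rebuild them.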
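A rule of this kind is easy to check numerically. The sketch below compares only the first node count of each tolerance tier ($n = 4, 7, 10$) for a given $p$; `best_tier_start` is a hypothetical helper, and the cap of three tiers is an arbitrary choice.

```python
from math import comb


def failure_probability(n: int, p: float) -> float:
    """P(more than f = floor((n - 1) / 3) of n independent nodes fail)."""
    f = (n - 1) // 3
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(f + 1, n + 1))


def best_tier_start(p: float, max_f: int = 3) -> int:
    """Among n = 4, 7, ..., 3*max_f + 1, return the n with the lowest failure probability."""
    candidates = [3 * f + 1 for f in range(1, max_f + 1)]
    return min(candidates, key=lambda n: failure_probability(n, p))


if __name__ == "__main__":
    for p in (0.1, 0.2, 0.3, 0.4):
        probs = {n: round(failure_probability(n, p), 4) for n in (4, 7, 10)}
        print(f"p={p}: {probs} -> best n={best_tier_start(p)}")
```

Running it with your own measured $p$ is more trustworthy than any fixed threshold, since the crossover points depend on how many tiers you are willing to consider.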

This is not intuitive. It’s counter to every “scale horizontally” mantra in tech.

But biology doesn’t scale linearly. It degrades stochastically.

Why BFT Fails in Bio-Hacking: Three Real-World Scenarios

Scenario 1: The Contaminated Pipette Node

You added a sixth node because "more data is better." It's a ten-dollar Arduino Nano with a cheap temperature sensor. You didn't calibrate it. It drifts $2°\text{C}$ over $8$ hours.

Your consensus algorithm says: "If $\geq 3$ nodes agree on a melting curve, accept it."

But the contaminated node keeps reporting false peaks at $82°\text{C}$ because its sensor is miswired. It's not malicious—it's broken.

With $n=6$ , $f=1$ : you need $4$ nodes to agree. But now two nodes are faulty (the broken one + a random dropout). That's $2 > f=1$ . Consensus fails.

You think: "Just add a seventh node!"

Now $n=7$ , $f=2$ . You need $5$ to agree.

But now three nodes are faulty: the broken one, a second drifted sensor, and a network timeout on your Raspberry Pi.

Using the $p=0.2$ table, $P(\text{failure} > 2) = 0.148$ at $n=7$. Still better than $n=6$ ($0.3446$)? Yes, but only if you fix the other two.

But you didn’t. You just added a seventh node with the same cheap hardware.

Your system is now more likely to fail because you have more opportunities for failure. The binomial distribution doesn’t care about your intentions.

Scenario 2: The DIY CRISPR Diagnostic Kit

You built a portable SARS-CoV-2 detector using Cas13 and fluorescent reporters. You deployed 8 units across your neighborhood. Each unit runs a consensus protocol to report “positive” or “negative.”

Each device has:

$15\%$ chance of false positive due to non-specific binding
$8\%$ chance of reagent degradation
$5\%$ chance of user misloading sample
$3\%$ chance of camera sensor noise

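Summed, that is $p \approx 0.31$ per node. (The sum slightly overstates the rate: if the four causes are independent, $p = 1 - 0.85 \times 0.92 \times 0.95 \times 0.97 \approx 0.28$.)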
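How the individual failure modes combine into one per-node $p$ depends on an assumption. The sketch below compares a plain sum (an upper bound) with the value obtained if the four modes are independent; the dictionary keys and function name are illustrative only.

```python
from math import prod


def combined_failure_probability(mode_probs) -> float:
    """P(at least one failure mode occurs), assuming the modes are independent."""
    return 1 - prod(1 - q for q in mode_probs)


if __name__ == "__main__":
    modes = {
        "non-specific binding": 0.15,
        "reagent degradation": 0.08,
        "sample misloading": 0.05,
        "camera sensor noise": 0.03,
    }
    union_bound = sum(modes.values())  # simple sum: never underestimates
    independent = combined_failure_probability(modes.values())
    print(f"sum of modes:      {union_bound:.3f}")
    print(f"independent modes: {independent:.3f}")
```

Using the summed value is the more pessimistic choice, which is a reasonable default when you have not measured how the failure modes correlate.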

$n=8$ → $f=2$ (since $3 \times 2+1=7 \leq 8$ )

$P(\text{failure} > 2) =$ probability that $\geq 3$ nodes fail → $0.474$

That’s a 47% chance your entire system reports a false outbreak.

You publish the results on GitHub. A local health department sees it. They quarantine 12 households.

You didn’t lie. You just followed BFT.

But your system was statistically doomed from n=5 onward.

Scenario 3: The Open-Source Lab Network

You’re part of a global bio-hacker collective. 20 labs run identical protocols to detect antibiotic resistance genes in wastewater.

Each lab has:

One Raspberry Pi
A $20 spectrophotometer
Volunteers who test once a week

Failure rate per node: $p=0.4$

$n=20$ → $f=6$ (since $3 \times 6+1=19 \leq 20$)

You think: “We can tolerate $6$ failures!”

But what is $P(\text{failure} > 6)$?

Using the binomial CDF:
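One way to evaluate this tail probability with only the Python standard library is sketched below. `binom_cdf` is an illustrative helper, and the model again assumes the 20 labs fail independently.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


if __name__ == "__main__":
    n, p, f = 20, 0.4, 6
    # Consensus fails when more than f nodes are faulty: 1 - P(X <= f).
    p_fail = 1 - binom_cdf(f, n, p)
    print(f"P(failure > {f}) = {p_fail:.3f}")
```

Under these assumptions the result comes out at roughly $0.75$: the network that "can tolerate $6$ failures" exceeds that tolerance about three times out of four.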
