A Finite-Element Analogy For Distributed Computing Resilience: Predictive, Non-Invasive Resiliency Engineering Beyond Chaos Testing
Abstract
Distributed computing resilience is often evaluated by reactive, empirical approaches such as chaos engineering, which, while valuable, require injecting failures and cannot fully anticipate emergent fragility. This paper develops a predictive, non-invasive framework by adapting finite-element analysis (FEA) principles to distributed systems. We model nodes as elements with capacities, communication as stiffness couplings, workloads as load vectors, and performance degradation as displacements. Fragility is measured using a von Mises-style stress formulation. This revised manuscript incorporates reviewer feedback by adding a clear problem statement, introducing a concise state-of-the-art section, providing narrative bridges before equations, presenting telemetry-based parameter estimation, expanding proofs into formal theorems, and including a fully worked 4-node toy example with numerical results and TikZ/PGFPlots figures. The approach yields resilience scores, fragility indicators, and closed-form critical-load thresholds, validated with illustrative calculations.
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
This is an Open Access article distributed under the terms of the Attribution-Noncommercial 4.0 International License [CC BY-NC 4.0], which requires that reusers give credit to the creator. It allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, for noncommercial purposes only.