Autonomous agents (robots, drones) rely on complex "Perceive-Plan-Act" loops powered by opaque hardware accelerators (NPUs/GPUs). Current validation focuses on model logic (adversarial attacks) or silicon durability and overlooks the critical Driver-Hardware Interface. In this blind spot, transient hardware faults (e.g., bit-flips) do not merely cause crashes; they cause "State Poisoning": silent corruption of an agent's world model that leads to confident yet catastrophic behavioral failures.
We propose F-EDGE, a hardware-agnostic framework to stress-test the cognitive reliability of COTS accelerators (NVIDIA Orin, Rockchip, Coral) (see figure below).
It operates via a novel Kernel-Level "Man-in-the-Middle" Architecture:
- Proxy Injection: Replaces standard device nodes (e.g., /dev/rknpu) with proxy modules to intercept and corrupt tensor buffers in monolithic drivers.
- Symbol Hooking: Uses Kprobes on Unified Memory drivers (e.g., nvidia-uvm) to inject faults during CPU-GPU data migration.
- Cognitive Injection/Fuzzing: A state-aware engine that triggers faults contextually (e.g., specifically during "Path Planning" vs. "Idle") to map hardware errors to behavioral outcomes.
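The state-aware injection idea in the list above can be illustrated with a small user-space sketch. This is not the F-EDGE implementation, which is described as operating in the kernel through proxy device nodes and Kprobes; it only shows the gating logic, assuming the agent exposes its current state as a string. All names here (`StateAwareInjector`, `maybe_corrupt`, `pack_tensor`, `unpack_tensor`, the state labels) are hypothetical.

```python
import random
import struct
from dataclasses import dataclass, field


@dataclass
class StateAwareInjector:
    """Hypothetical user-space stand-in for a state-aware fault engine."""
    target_states: set          # agent states in which faults may fire
    flip_probability: float     # chance of corrupting a buffer per call
    rng: random.Random = field(default_factory=lambda: random.Random(0))
    log: list = field(default_factory=list)

    def maybe_corrupt(self, state: str, buf: bytearray) -> bool:
        """Flip one random bit in buf if the agent is in a targeted state."""
        if state not in self.target_states or not buf:
            return False
        if self.rng.random() >= self.flip_probability:
            return False
        bit = self.rng.randrange(len(buf) * 8)
        buf[bit // 8] ^= 1 << (bit % 8)   # single-bit transient fault
        self.log.append((state, bit))     # record context for later analysis
        return True


def pack_tensor(values):
    """Serialize floats as a little-endian float32 buffer."""
    return bytearray(struct.pack(f"<{len(values)}f", *values))


def unpack_tensor(buf):
    """Decode a little-endian float32 buffer back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 4}f", bytes(buf)))


if __name__ == "__main__":
    injector = StateAwareInjector({"path_planning"}, flip_probability=1.0)
    tensor = pack_tensor([0.5, 1.0, -2.0, 3.25])
    injector.maybe_corrupt("idle", tensor)           # not targeted: untouched
    injector.maybe_corrupt("path_planning", tensor)  # targeted: one bit flips
    print(unpack_tensor(tensor), injector.log)
```

Logging the agent state alongside the flipped bit position is what would let a later analysis step relate a specific hardware error to the behavior observed afterwards.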
The framework will be validated on a representative hierarchical agent scenario (e.g., a Visual Language Model (VLM)-based Planner combined with an NPU/GPU-based Controller). By injecting faults into the driver layer, we will trace error propagation from silicon to semantic failure (e.g., misclassification).
The project is expected to deliver:
- Open-Source Toolchain: A verifiable suite of proxy drivers for major Edge AI platforms.
- Failure Taxonomy: The first classification mapping Silicon Faults to Agentic Behaviors (Transient Glitch vs. Mission Kill).
- New Metric: Definition of Mean Time to Agent Failure (MTTAF), establishing a standardized benchmark for trustworthy autonomous systems.
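The text names MTTAF but does not define it. As a rough sketch, assuming it is estimated the same way as a classic mean-time-to-failure figure (total operating time under injection divided by the number of agent failures), it could be computed from campaign logs as follows. The `Trial` record, the `Outcome` labels, and the extra `MASKED` class are illustrative assumptions, not part of the proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    MASKED = "masked"                      # assumed extra class: no visible effect
    TRANSIENT_GLITCH = "transient_glitch"  # agent recovers on its own
    MISSION_KILL = "mission_kill"          # agent cannot complete its task


@dataclass
class Trial:
    runtime_s: float   # how long the agent ran under fault injection
    outcome: Outcome


def mttaf(trials, failure=frozenset({Outcome.MISSION_KILL})) -> Optional[float]:
    """MTTF-style estimate: total runtime divided by number of agent failures."""
    total = sum(t.runtime_s for t in trials)
    failures = sum(1 for t in trials if t.outcome in failure)
    return total / failures if failures else None  # undefined with no failures


if __name__ == "__main__":
    campaign = [
        Trial(100.0, Outcome.MISSION_KILL),
        Trial(300.0, Outcome.TRANSIENT_GLITCH),
        Trial(200.0, Outcome.MISSION_KILL),
    ]
    print(mttaf(campaign))  # 600 s of runtime over 2 mission kills
```

Which outcomes count as an "agent failure" is a design choice; here only mission kills do, and the `failure` parameter lets a stricter benchmark count transient glitches as well.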
Today, AI workloads rely heavily on hardware accelerators for critical autonomy. The ability to independently verify driver and hardware robustness is a matter of technological sovereignty and safety. F-EDGE provides the methodology needed to bridge the gap between silicon faults and semantic understanding, so that the next generation of Edge AI agents is not only intelligent but demonstrably trustworthy.