When the U.S. Navy recently redeployed tactical assets from Southeast Asia to the Mediterranean, the mission changed overnight. The environment changed. The threat landscape changed. And every AI system running on those platforms had to change with it, or become a liability.

That kind of operational reality doesn’t show up in a benchmark. It doesn’t get tested in a lab. And it has nothing to do with whether the underlying model is good. Most programs encounter this the hard way. As our own Jon Brookshire, Director of Integrated Systems at Latent AI, who leads our drone control systems work, put it plainly: “AI is just one piece of a much larger system architecture. The tracking is not the hard problem. The AI model handles that. The hard problem shifts to the next tallest pole in the tent.”

Getting AI ready for the field isn’t primarily a model problem. It’s an edge AI systems engineering problem, and it’s the root cause of most AI deployment failures. Edge AI systems fail in the field when adaptability is treated as an afterthought rather than a systems engineering requirement. Unfortunately, most of the industry is still building as if adaptability were an afterthought.

Here’s what that distinction means in practice. A model trained on desert environments may perform perfectly on every benchmark you run and still fail the moment it’s deployed elsewhere. An adversary that understands your AI’s known performance parameters will simply operate outside them. In Ukraine, both Russian and Ukrainian forces have responded to drone AI targeting by physically modifying their vehicles, welding cage structures, chains, and “hedgehog” cable systems directly onto tanks in the field. According to reporting by Euromaidan Press, forces on both sides have become rapid innovators in vehicle protection, continuously adapting countermeasures that directly challenge the assumptions on which AI systems were trained. The geometry changes. The visual signature changes. The model that worked yesterday stops working today.

This isn’t only a defense problem. A predictive maintenance model trained on a factory floor in a controlled Midwestern facility will degrade when that same company opens a plant in Southeast Asia, where heat, humidity, and dust levels are a completely different engineering reality. An autonomous vehicle system optimized for well-marked suburban roads will encounter the same systems engineering challenge when it hits an unmarked rural intersection for the first time. The conditions change. The assumptions fail. And if the system can’t adapt, it doesn’t matter how good the model was in testing.
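One way to make "the conditions changed" measurable is to compare what a deployed system sees against what it saw in training. The following is a minimal sketch under stated assumptions, not a description of any Latent AI tooling: the function names, the temperature readings, and the threshold of 3.0 are all hypothetical, and a real system would likely monitor many signals with a more robust statistical test.

```python
from statistics import mean, stdev


def drift_score(baseline, field):
    """Shift of the field mean from the baseline mean,
    measured in baseline standard deviations."""
    spread = stdev(baseline)
    if spread == 0:
        raise ValueError("baseline has no variation; cannot scale the shift")
    return abs(mean(field) - mean(baseline)) / spread


def has_drifted(baseline, field, threshold=3.0):
    """Flag the deployment when field conditions sit far outside
    what the model saw in training. The threshold is an assumption."""
    return drift_score(baseline, field) > threshold


# Hypothetical ambient temperature readings in degrees Celsius.
midwest_plant = [20.5, 21.0, 21.5, 22.0, 21.0, 20.0, 21.5, 22.5]
new_plant = [33.0, 34.5, 35.0, 36.0, 34.0, 35.5]

if has_drifted(midwest_plant, new_plant):
    print("Operating conditions differ from training; review or retune the model.")
```

A check like this does not fix the model. It only turns a silent assumption into an explicit signal that the system, or its operator, can act on.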

The ability to retune a model in the field, without a server, without an ML expert, without a trip back to the lab, isn’t a nice-to-have. It’s what separates a system that survives mission reality from one that doesn’t. As our CTO Sek Chai described it: “At that moment in the battle, you need to tune your model to win that scenario. Give that capability to the warfighter.” The same principle applies to any operator in any environment where conditions change faster than a development cycle can respond.

But adaptability can’t be bolted on after the fact. It has to be engineered from the beginning, spanning the model, hardware, update pipeline, and deployment architecture. It requires treating the whole system as the unit of performance, not just the algorithm.

This is where the traditional development process breaks down. Machine learning engineers optimize for algorithmic accuracy. Embedded engineers optimize for hardware constraints. The two teams meet at the end, discover the mismatch, and start cutting corners. The result is always what Sek calls “the average of everything,” a system that neither performs to its algorithmic potential nor meets its hardware requirements. Deployment failures at this stage account for ninety percent of the AI models that never make it to production, largely because of this late-stage mismatch.

The better approach is to make AI deployment an engineering problem rather than an art form. That means building with hardware requirements in mind from day one. It means accumulating telemetry across platforms, environments, and deployment scenarios so that you know how a model-hardware combination will actually perform before you put it in the field. It means being able to answer the question every program manager and operations leader asks first: how long will this take? Because if the answer is unknown, planning becomes impossible and operations stall.
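To illustrate the idea of answering from accumulated telemetry rather than guesswork, here is a minimal sketch. Everything in it is an assumption made for illustration: the `TelemetryRecord` fields, the model and hardware names, the latency figures, and the minimum-sample rule are hypothetical and do not describe Latent AI's actual data or tools.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class TelemetryRecord:
    model: str          # hypothetical model identifier
    hardware: str       # hypothetical target device
    latency_ms: float   # measured inference latency


def build_index(records):
    """Group past latency measurements by (model, hardware) pair."""
    index = defaultdict(list)
    for record in records:
        index[(record.model, record.hardware)].append(record.latency_ms)
    return index


def estimate_latency(index, model, hardware, min_samples=3):
    """Summarize past measurements for a pair, or return None when
    there is too little history to give a defensible answer."""
    samples = sorted(index.get((model, hardware), []))
    if len(samples) < min_samples:
        return None
    rank = math.ceil(0.95 * len(samples))  # nearest-rank 95th percentile
    return {
        "median_ms": median(samples),
        "p95_ms": samples[rank - 1],
        "samples": len(samples),
    }


history = [
    TelemetryRecord("detector-v1", "board-a", ms)
    for ms in (18.0, 20.0, 22.0, 25.0, 40.0)
]
index = build_index(history)

known = estimate_latency(index, "detector-v1", "board-a")
unknown = estimate_latency(index, "detector-v1", "board-b")
print(known)    # summary built from five past measurements
print(unknown)  # None: no history for this pair, so the answer is "unknown"
```

The design choice worth noting is the explicit `None`: a pair with no history is reported as unknown instead of being given an optimistic guess, which is the situation where planning stalls.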

At Latent AI, we’ve spent years accumulating that knowledge, terabytes of real-world telemetry across hundreds of thousands of device hours, spanning the programs and platforms where edge AI actually has to work. Not to make the deployment process more elegant. To make it deterministic. So that when conditions change, when a fleet redeploys, when an adversary adapts, when a factory shifts production lines, when the mission changes, the AI can change with it.

Jon put it simply when describing what field tests almost always reveal: “They fail for reasons not related to the core technology. It’s the assumptions you didn’t realize you were making. And when you get out there, they don’t hold.”

The systems that survive aren’t the ones with the highest benchmark scores. They’re the ones built to keep working when the assumptions run out.

That’s the engineering problem worth solving. That’s what we’re building toward.

Read the full white paper, “AI Under Fire: Designing Edge AI Systems to Survive Mission Reality,” for the complete framework.