Bezos’s public embrace of the Pentagon comes as Amazon is challenging the award of a $10 billion cloud computing contract called JEDI, or the Joint Enterprise Defense Infrastructure, to Microsoft. That system will be key to Shanahan’s AI ambitions, giving him the computing power and the shared infrastructure to crunch massive data sets and unify disparate systems.
It was the lack of such a cloud system that convinced Shanahan of its importance. When he ran Maven, he couldn’t digitally access the surveillance footage he needed, instead having to dispatch his subordinates to fetch it. “We had cases where we had trucks going around and picking up tapes of full-motion video,” Shanahan says. “That would have been a hell of a lot easier had there been an enterprise cloud solution.”
To push updates to the system, Shanahan’s team similarly had to travel to physically install newer versions at military installations. Today, Maven is getting software updates every month or so—fast for government work, but still not fast enough, he adds.
But JEDI isn’t going to solve all of Shanahan’s problems, chief among them the poor quality of data. Take just one JAIC project, a predictive maintenance tool for the military’s ubiquitous UH-60 Black Hawk helicopter that tries to figure out when key components are about to break. When they started collecting data from across the various branches, Shanahan’s team discovered that the Army’s Black Hawk was instrumented slightly differently than a version used by Special Operations Command, generating different data for machines that are essentially identical.
“In every single instance the data is never quite in the quality that you’re looking for,” he says. “If it exists, I have not seen a pristine set of data yet.”
Data quality is one of the chief pitfalls in applying artificial intelligence to military systems; a computer will never know what it doesn’t know. “There are risks that algorithms trained on historical data might face battlefield conditions that are different than the ones they trained on,” says Michael Horowitz, a professor at the University of Pennsylvania.
Shanahan argues that a rigorous testing and evaluation program will mitigate that risk, and it might very well be manageable when trying to predict the moment an engine blade will crack. But it becomes a different question entirely in a shooting war, fought at a scale and speed the AI has never seen.
The at times unpredictable nature of computer reasoning presents a thorny problem when paired with the mind of a human being. A computer may reach a baffling conclusion, one that the human teamed with it has to decide whether to trust. When Google’s AlphaGo defeated Lee Sedol, one of the world’s best Go players, in 2016, there was a moment in the match when Lee simply stood up from his chair and left the room. His computer adversary had made such an ingenious and unexpected move (from a human perspective) that Lee was flummoxed. “I’ve never seen a human play this move,” one observer said. “So beautiful.”
Imagine a weapons system giving a human commander a similarly incomprehensible course of action in the heat of a high-stakes conflict. It’s a problem the US military is actively working on, but one for which it doesn’t have a ready solution. The Defense Advanced Research Projects Agency is working on a program to come up with “explainable AI,” which aims to turn the black box of a machine-learning system into one that can provide the reasoning for the decisions it makes.
To build that trust, Shanahan notes, commanders need to be educated in the technology early on. Projects using computer vision and satellite imagery to understand flooding and wildfire risks allow his team to learn by doing and build up expertise. “You have to understand the art of the possible or else it’s all science fiction,” he says.
But key bureaucratic hurdles also stand in Shanahan’s way. A congressionally mandated report on the Pentagon’s AI initiatives released this week finds that the DoD lacks “baselines and metrics” to assess progress, that the JAIC’s role within the DoD ecosystem remains unclear, and that the JAIC lacks the authority to deliver on its goals. It also offers a dismal assessment of the Pentagon’s testing and verification regime as “nowhere close to ensuring the performance and safety of AI applications, particularly where safety-critical systems are concerned.”