At first glance, the intelligence report is remarkable in that it looks unremarkable:
There is a thesis statement, followed by some key supporting information, all clearly laid out in complete, legible sentences.
Only on closer inspection does the language betray the document: an odd arrangement of facts and anecdotes, colorful descriptions not typically found in dry analysis, the clumsy introduction of subjects that are suddenly abandoned and, overall, sentences that don’t seem to follow each other.
A news editor would be appalled by the jumble, but officials at the nation’s lead spy agency were elated by it for a simple reason: The report was written not by a human, but by an algorithm, and in just a few seconds.
The 526-word paper, titled “Artificial Intelligence Advancements in Military Ranks,” was among the reports compiled with lightning speed by computer code during a recent contest put on by the Office of the Director of National Intelligence (ODNI). The point of the contest, called the Xpress Challenge and launched in May, was to determine “just how far along we are toward achieving the goal of machine-generated finished intelligence.”
“Current Intelligence Community (IC) analytic production methods are labor intensive, time consuming and struggle to scale effectively against ever-growing volumes of information as well as the speed with which new threats emerge,” an ODNI official told RealClearLife. “Although analytic aids and tools exist to assist IC analysts, there are occasions when policy-makers require analysis-driven conclusions and options in near-real time. The intent of the Xpress Challenge was to explore the state-of-the-art for machine-based opportunities for enhancing and speeding up these IC analytic production processes.”
Essentially, how good is an algorithm at doing the complex and delicate work of a human analyst?
The sample report above was produced by code developed by French programmer Simon Cazals, who beat out nearly 400 other contestants to win the contest’s $150,000 prize. But he acknowledged that there’s quite a long way to go.
“I think this project was really challenging. I had a lot of fun doing it, and was sometimes positively surprised with some sentences that the algorithm would come up with,” Cazals told RealClearLife over email. “But I also ran into some output [that didn’t make] any sense. I believe getting a human-like product would require a lot of time and effort, and I was probably far from this result.”
The contest focused on what Dennis Gleeson, a former director of strategy in the CIA’s Directorate of Analysis, previously said was a major hurdle for “deep learning” algorithms: understanding “unstructured data” that is difficult to quantify computationally but makes perfect sense to human eyes and ears.
A human intelligence analyst can pull from a dozen disparate sources — from spy cables to signals intelligence to news articles to taped speeches to tweetstorms — to compile an analytic product, Gleeson said. The process may be labor intensive, but it’s relatively straightforward for analysts since most of those sources are designed for human consumption. Translating those sources into the zeros and ones a machine can digest, however, is a trick that even the speediest and smartest of computer brains haven’t mastered yet.
The Xpress Challenge was not a real-world simulator and simplified things significantly by having contestants draw from a single archive of unstructured data: more than 15,000 news articles published by SIGNAL magazine, a national security publication produced by the Armed Forces Communications and Electronics Association (AFCEA).
Once the contestants submitted their code, ODNI would ask a question of the programs and see what popped out the other side. ODNI then judged the responses as if they were human-produced analytical products by giving various elements rankings of Poor (0), Fair (1), Good (2), or Excellent (3). “A score of Good (2) is considered the threshold for meeting IC-wide standards,” the ODNI official said.
“Although no Xpress solver achieved an ‘Excellent (3)’ score in any category, it is highly encouraging that solvers were able to produce many ‘Fair (1)’ scores and even several ‘Good (2)’ scores… On average, the current state-of-the-art was just short of ‘Fair (1)’ across the evaluation categories used,” the official said.
The sample report, which was provided to RealClearLife by the ODNI, was created in response to this (kind of meta) question: “What developments related to artificial intelligence are most impactful to the national security of the United States?”
The automated report focused on artificial intelligence in the military and appeared to extract relevant near-complete sentences from SIGNAL articles before reconfiguring them into a more legible, semi-logical format.
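That extract-and-reassemble behavior resembles classic extractive summarization. As a rough illustration only — the contest entries’ actual code was not published, and Cazals’ method may differ entirely — the sketch below scores sentences from a small corpus by their overlap with a question, weights rarer words more heavily, and returns the top-ranked sentences. The function name and the log-frequency weighting are illustrative assumptions, not details from the challenge.

```python
import math
import re
from collections import Counter

def extractive_summary(articles, query, num_sentences=3):
    """Toy extractive summarizer: rank sentences by overlap with the
    query, weighting each shared word by how rarely it appears across
    all sentences (an IDF-style weight)."""
    # Split each article into sentences on end-of-sentence punctuation.
    sentences = []
    for doc in articles:
        parts = re.split(r'(?<=[.!?])\s+', doc)
        sentences.extend(p.strip() for p in parts if p.strip())

    def tokens(text):
        return re.findall(r'[a-z]+', text.lower())

    # Count, for each word, how many sentences contain it.
    doc_freq = Counter()
    for s in sentences:
        doc_freq.update(set(tokens(s)))
    n = len(sentences)

    query_words = set(tokens(query))

    def score(sentence):
        # Sum log(n / df) over words the sentence shares with the query;
        # words appearing in every sentence contribute nothing.
        return sum(math.log(n / doc_freq[w])
                   for w in tokens(sentence)
                   if w in query_words)

    ranked = sorted(sentences, key=score, reverse=True)
    return ranked[:num_sentences]
```

A real system would then need a second, much harder step — rewriting the extracted sentences into a coherent narrative — which is where, as the sample report shows, the state of the art still falls short.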
It’s clearly not to the level of top-quality human analysis, but what the code’s report lacked in quality, it made up for in speed. Another ODNI official said in a press release that Cazals’ program provided responses “in about 10 seconds.”
The official, ODNI Directorate of Science and Technology program manager for the challenge David Isaacson, said that as quality improves, that kind of speed “may afford decision-makers a parallel intelligence production model that allows them to rapidly determine if such a machine-generated output [is] ‘good enough’ for their pressing information needs.”
The ODNI official who spoke to RealClearLife was cautiously optimistic about this approach, but appeared aware of the potential concern over making national security decisions based on human-free analysis.
“[W]e hope that with prominent, clearly-crafted caveats, sophisticated decision makers would understand the initial limitations of this approach while appreciating the reduced lead time and increased breadth in coverage it would provide,” the official said.
The official said the ODNI is nowhere near done with these types of contests and hopes to expand them to include more questions and a real-world breadth of sources.
“Although this was instructive, [intelligence community] researchers will now need to establish and employ conditions that more closely approximate those of IC analysts and their customers to begin exploring and assessing in earnest the benefits of the machine analytics approach offered by the Xpress Challenge,” the official said.
In the meantime, Cazals said he thinks algorithms like his can be used now to “help to extract relevant data very quickly and summarize information to help analysts” as they go about their work.
“In the future, it will surely get better at these tasks but my knowledge of natural language processing is pretty limited and I cannot tell what will be the state-of-the-art in five to 10 years,” he said. (Gleeson thinks an algorithm may have a hand in writing one of the most sensitive national security documents on the planet, the President’s Daily Brief, within the decade.)
But the ODNI official said there are at least some critical questions about any national security development that computers may forever have a difficult time answering — the same ones with which humans often struggle.
“[F]or example, the ‘why’ behind a given incident, or detecting slow shifts in the strategic underpinnings of important events,” he said. “To continue meeting these enduring, critical needs, the Intelligence Community will still require its human analysts to guide collection, build expertise on their accounts, and pursue sophisticated intelligence products.”