Communicating Uncertainty in #Wargaming Outcomes

In late 1990, Frank Chadwick, the president of Game Designers Workshop, received a letter from an Army soldier deployed to the Middle East. The letter described how the members of his unit were huge fans of the printed, commercial wargames developed by Chadwick’s company. The letter also described an exigent need for which they were seeking immediate help. They wanted to purchase another copy of a particular, hard-to-get wargame. During game play there in the desert, a sandstorm had blown away the board and all the pieces. It turned out that as tensions continued to rise in the Persian Gulf, this unit had made tactical wargaming part of their everyday preparation for the imminent combat operations.

This is just one of many anecdotes in the upcoming book, On Wargaming, by Matt Caffrey. In it, he describes how the use of wargames varies widely, from the tactical example above to the most strategic, especially in the lead-up to Desert Storm. At each of these levels, leaders from the company commander to the President used the outcome of these wargames to support decisions, but the conclusions drawn from these wargames vary as widely as the level of their use. At the tactical level, we have detailed, quantitative information about weapons effectiveness, troop numbers, and terrain. We can analyze the outcomes of these wargames with our most powerful analytic tools.

At the strategic level, however, the outcomes are not so tidy. The Persian Gulf War illustrates this profoundly and serves as an historical example of the most comprehensive and effective use of wargames in US history. Surprisingly, some of Caffrey’s most profound findings about wargaming and Desert Storm are not conclusions. Instead, they are questions.

The impact of wargaming on the Gulf War was enormous--and mostly positive. Yet, Coalition casualty indications were over 20 times too high. These indications had real political and military consequences. How would these events shape the evolution of wargaming in the US and around the world?[i]

The George H.W. Bush administration used the casualty estimates to shape many of its decisions. Did a single leader stop to ask, “How much confidence do we have in that estimate?” If so, did that same leader ask, “How could we have so grossly missed the mark?” Caffrey asks similar questions, but one suggestion in particular stands out:

The intelligence community routinely includes confidence assessments within the products they provide to decision makers. They may say, “We have high confidence in this report as it is based on multiple highly trusted sources,” or, “we have low confidence in this report as it is based on a single fairly new source.” At the time of this writing, it is still unusual for reports on wargames to include explicit statements on their assessed level of confidence.[ii]

That leads to a particular line of thought about wargaming: If a game were played one hundred times, would the outcome be the same every time? We do not expect the tactical results to be identical from one play to the next, but would the strategic outcomes be consistent across countless repeats of game play? Is there any chance that strategic outcomes would vary? If so, how much? Small variations in the strategic outcomes are certainly to be expected in repeated wargaming, but is it plausible that some percentage of outcomes would suggest a completely opposite strategy or strategic outcome? These questions are what we mean when we ask, “How much confidence do we have in the outcome?” Unfortunately, we are ill-prepared to answer this question, though not for a dearth of tools and technology to make such assessments. Instead, there is a chance that most of us would not accurately comprehend such a confidence statement, largely because we lack a shared language of confidence and uncertainty.

To help us build a vocabulary for answering these questions, I would like to propose three foundational rules. First, we should express wargame outcomes both qualitatively and quantitatively. Second, we should attempt to describe the range of possible outcomes. Finally, we ought to assess the frequency of potential outcomes.

Before further developing such a lexicon, we should do away with the word confidence, because it carries both technical and emotional baggage. I would prefer that we refer to it as “communicating uncertainty.” By leading with the word uncertainty, we approach the subject with the appropriate amount of humility. In a sense, we are saying, “My answer might be wrong, and I know that.” Additionally, it acknowledges that wargaming is not a crystal ball, not a prediction about the future, but a model. That attitude disarms the tendency to be defensive or argumentative, and it opens the ears of the listeners, making them more receptive to what follows. It also keeps them from bringing preconceived notions to what they hear, as it strips away the technical baggage associated with the other term.

Since a wargame is in fact a game, we can use the common vernacular of winning and losing as a foundation for establishing an appropriate language for communicating uncertainty. The widespread familiarity with sports allows us to apply analogies to shape our understanding of uncertainty and the need to communicate it clearly. Our first principle for communicating uncertainty illustrates the efficacy of this approach.

  1. Express the outcome both qualitatively and quantitatively. Every time we discuss the results of a wargame or the effectiveness of a strategy in a wargame, we should consider both of these characteristics. Every “answer” should have these measures. Asking the question “who won?” should serve as the qualitative baseline, the minimum expression of the outcome. The notion of winning may be ambiguous, but asking the question can at least force the players to state what they already believe. Introducing a quantitative descriptor may help in the analysis and decision. When we report the score, it gives us a more precise understanding of the outcome, but it also gives us a more general notion about the uncertainty of the results. For example, if we heard that a football game had a final score of 3-5, we would easily deduce that this type of outcome was highly unlikely. This is more difficult in the realm of wargames, because “score” may not be well-defined, but this idea of reporting the score transitions us to the second principle of understanding and communicating uncertainty.
  2. Describe the range of possible outcomes. Any time we execute a wargame, the outcome is just one of many possibilities. That single result is still valuable, because beforehand we may have had no intuition at all about what outcomes were even possible. So we get one data point, and that is very important. However, it would be better if we knew the total range of possible outcomes. To continue our sports analogy, we see that the range of outcomes includes winning, losing, or possibly even a tie. These we have expressed qualitatively. Once we examine it quantitatively, the range of possible outcomes takes on an even more complex but descriptive characteristic. From our football example, we can begin to picture the range of outcomes. A game in which neither team scored at all would be extremely boring and result in a score of 0-0. This anchors one end of the spectrum of outcomes. We can imagine a situation in which a team scores 100 points (remotely possible but highly unlikely), and so we begin to solidify our notion about the uncertainty of such an outcome, mixing both qualitative and quantitative notions in a wonderful way and setting the stage for the final principle.

  3. Assess the frequency of potential outcomes. This is inherent in the above point. A football game in which one team scores 100 points is highly unlikely. “Highly unlikely” is the portion of that statement that describes the frequency of such an outcome. If we watched hundreds of football games, we might never observe such a high scoring game. It would certainly only be observed on an infrequent basis. A 0-0 tie might be slightly more likely. Somewhere in the middle of this spectrum, we can imagine many games with scores in the 20s and 30s, and thus we have a notion about the frequency of these outcomes. In this example, we can further refine our idea about frequency by applying rigorous probabilistic ideas and quantitative descriptors. Hence, we have advanced significantly in our ability to understand the outcome of sports games with three simple principles. There is a good chance we can do the same in our ideas about wargames.
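
The three principles above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `summarize_outcomes` and the list of football scores are hypothetical and invented for the example, not data from any real season or wargame. Each result is labeled qualitatively (win, loss, tie) and quantitatively (scoring margin), then the range and the frequency of outcomes are reported.

```python
from collections import Counter


def summarize_outcomes(scores):
    """Summarize (our_score, their_score) pairs using the three principles."""
    if not scores:
        raise ValueError("need at least one game result")

    # Principle 1: pair a qualitative label with a quantitative margin.
    margins = [ours - theirs for ours, theirs in scores]
    labels = ["win" if m > 0 else "loss" if m < 0 else "tie" for m in margins]

    # Principle 2: describe the range of outcomes actually observed.
    margin_range = (min(margins), max(margins))

    # Principle 3: assess how often each kind of outcome occurred.
    counts = Counter(labels)
    frequency = {k: counts[k] / len(scores) for k in ("win", "loss", "tie")}

    return {"margins": margins, "range": margin_range, "frequency": frequency}


# Hypothetical scores, invented purely for illustration.
games = [(24, 17), (10, 13), (31, 28), (20, 20), (35, 14)]
summary = summarize_outcomes(games)
print(summary["range"])      # worst and best scoring margin observed
print(summary["frequency"])  # share of wins, losses, and ties
```

With only five games the frequencies are crude, which is itself part of the message: the fewer the repetitions, the less weight the frequency estimate can bear.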

In the first example cited above, a commercial wargame has a relatively well-defined outcome in terms of winners and losers. The rules, pieces, and constraints of the game, together with a roll of the dice, determine the outcome, and from these facts, we can derive rigorous quantitative models of each principle above. Furthermore, the repeated game play by the soldiers would allow us to construct a reasonable statistical estimate with less effort than the exhaustive mathematical derivation. However, that may not be necessary. We can agree that this type of wargame was useful to exercise the thought processes, decision skills, and forethought of the players. We can also agree that the outcomes would not accurately predict the infinitely more nuanced situation in front of these battle-ready units. That last statement is sufficient to communicate the uncertainty about the wargame outcome in that case.
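
To make the idea of a statistical estimate from repeated play concrete, here is a minimal sketch of replaying a dice-driven game many times. The combat rule is an assumption invented for this example (each round both sides roll one six-sided die, the higher roll removes one opposing unit, and ties favor the defender); it is not taken from any Game Designers Workshop title. The function names, unit counts, and seed are likewise hypothetical.

```python
import random


def play_once(rng, attackers, defenders):
    """Play one battle under the assumed rule; return units left per side."""
    while attackers > 0 and defenders > 0:
        attack_roll = rng.randint(1, 6)
        defense_roll = rng.randint(1, 6)
        if attack_roll > defense_roll:
            defenders -= 1
        else:
            # Assumed rule: ties go to the defender.
            attackers -= 1
    return attackers, defenders


def replay(n_games=1000, attackers=10, defenders=8, seed=1990):
    """Repeat the same game and summarize win frequency and loss range."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    results = [play_once(rng, attackers, defenders) for _ in range(n_games)]
    attacker_wins = sum(1 for _, d_left in results if d_left == 0)
    attacker_losses = [attackers - a_left for a_left, _ in results]
    return {
        "attacker_win_rate": attacker_wins / n_games,
        "attacker_loss_range": (min(attacker_losses), max(attacker_losses)),
    }


report = replay()
print(report)
```

The same starting position produces different endings from one play to the next, and the spread of those endings is what a single game report cannot show.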

At the strategic level, it makes sense to do a more thoughtful and complete review of the wargame outcome. The Desert Storm wargame reporting cited earlier in this article included at least two critical pieces of information upon which we will focus: the declaration of a victor in the combat operation, and an estimate of casualties. It is also safe to assume that the estimate of casualties was coincident with US victory.

  1. Express the outcome both qualitatively and quantitatively. When we consider this expression of the outcome, we immediately see that the estimate was quantitative, 20x, where x was the actual number of casualties. Is this number of casualties reasonable or extremely unreasonable? (Though any loss of life is undesirable, the decision to go to war comes with the responsibility to make decisive judgments like this.) These words are qualitative descriptors of the outcome.
  2. Describe the range of possible outcomes. Avoidance of combat operations or some freakish overwhelming victory may result in zero casualties. This establishes a lower bound on the range of outcomes. An estimate of the upper bound should be based on the wargame initial conditions, total number of units deployed, etc. With these two limits, we can assess the casualty estimate. This also gives us the ability to determine relative losses as a percentage of the total force. This proportion will further inform the decisions. A casualty estimate that seems excessive in the abstract may appear small as a fraction of the total force. This seems more reasonable, more likely. Casualty estimates that represent annihilation would seem far more unlikely and may force the players to question the setup of the wargame, as modern war at the theater level does not usually represent the Alamo.

  3. Assess the frequency of potential outcomes. One might wonder how many times Desert Storm planners and leaders repeated the strategic wargame. Usually, wargames played at this level are too costly in time, manpower, etc., to repeat, which makes it more difficult to assess the frequency of a given outcome. In such a case, overlaying an existing probability distribution, a known curve shape, may serve as a reasonable assumption and tool for assessing frequency. Game outcomes can also serve to update the shape of this curve. This step in the communicating uncertainty process is the most complex, difficult, and nuanced, but with that difficulty come greater information and a more thorough understanding of uncertainty. If a wargame report suggested to Gulf War leaders that 50% of wargame simulations resulted in casualty estimates of this magnitude, then decision makers could stand firmly on such an estimate. If no such frequency was ever contemplated, we must blame ourselves as strategists and wargamers. The more often strategic leaders and wargame practitioners make these kinds of assessments, the better we will get and the richer our data set for making more rigorous inferences.
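
One way to picture "overlaying a known curve shape" and then letting game outcomes update it is a Beta distribution over the chance that a game produces casualties above some threshold. This is a swapped-in textbook technique (a Beta-Binomial update), offered as a sketch and not as a method described by Caffrey or used by Gulf War planners. The flat prior, the threshold idea, and the five game results below are all hypothetical.

```python
import random


def update_beta(prior_a, prior_b, exceeded_flags):
    """Update a Beta(a, b) curve with games that did or did not exceed the threshold."""
    hits = sum(1 for flag in exceeded_flags if flag)
    misses = len(exceeded_flags) - hits
    return prior_a + hits, prior_b + misses


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def beta_interval(a, b, level=0.90, draws=20000, seed=0):
    """Approximate a central interval by sampling (standard library only)."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(a, b) for _ in range(draws))
    tail = (1 - level) / 2
    lower = samples[int(tail * draws)]
    upper = samples[int((1 - tail) * draws) - 1]
    return lower, upper


# Assumed flat prior: before any games, every frequency is equally plausible.
prior = (1, 1)
# Hypothetical record: did each of five games exceed the casualty threshold?
game_results = [True, False, False, True, False]

posterior = update_beta(*prior, game_results)
low, high = beta_interval(*posterior)
print(posterior, beta_mean(*posterior), (low, high))
```

After only a handful of games the interval stays wide, and reporting that width alongside the estimate is one way of telling a decision maker how firmly the number can be stood upon.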

Each of these three rules and our hindsight about Desert Storm yield a starting point for exploring the language of uncertainty and its application to communicating wargame outcomes. This is only a starting point, and it befits us to recall that “no plan survives first contact with the enemy.” The next step in the journey will be applying these principles to ongoing wargames, putting our ideas to the test, and we can expect that our understanding of these concepts will evolve. Just as Chadwick’s board game did not inexorably determine the outcome of any Desert Storm battles, neither should we expect these three guidelines to provide a perfect silhouette of the future battlespace or strategic environment. Instead, we should use them like painted outlines of the practice pitch, boundary markers on the playing field, of which MacArthur said, “On these fields of friendly strife are sown the seeds that on other days, on other fields will bear the fruits of victory.” So should we use these tenets during wargame play to strengthen and shape our ability to encounter uncertainty in wartime.

With these games will come a far greater deluge of information, requiring of leaders a greater skill, a more urgent need to make sense of it all and inform decisions. Since the dawn of man and war, we have seen technology improve our ability to strike targets and wage war, and we should expect the same learning curve in our application of these three principles for communicating uncertainty together with advances in simulation and computation. At the dawn of airpower in World War I, hundreds of bombs fell before single targets were destroyed. Today we hit single targets within hundreds of centimeters. In the next war, we will be required to use information, like the uncertainty implicit in the outcomes of a hundred wargames, to create strategic effects with the same precision. This simple introduction to communicating uncertainty may be analogous to those early days, to a single bomb dropped in the First World War. Hopefully, though, the utility of these ideas is more readily apparent and their potential will be realized more quickly.

Mark Jones, Jr. is an experimental test pilot, USAF Reservist, and writer. He is also a member of the Military Writers Guild. The opinions expressed are his alone, and do not reflect those of the U.S. Air Force Reserve, the Department of Defense, or the U.S. Government.

Header Image: Image by PAXSims.


[i] Caffrey, Matthew. “On Wargaming: How Wargames have Shaped History and How They may Shape the Future.” Draft Manuscript, 2017. (Vol 1, pg 103).

[ii] Ibid. (Vol 2, pg 66).