Being paranoid doesn't mean that they aren't out to get you
After last week's game, I came home convinced that something was awry with many of my dice. After a Google search, I came across several blogs and websites claiming that mass-produced dice such as those made by GW or Chessex can be irregular enough to produce skewed results. This is particularly true of the d20s that various RPGs use. Many of these unbalanced dice tend to overproduce results at both extremes; a die that rolls too many ones or twos will often roll too many fives and sixes as well. I soon realized that my initial tests were worthless; an unbalanced die that favors mostly ones and sixes will still tend to have an average roll of 3.5, the same average as a fair die. As a Space Marine player who usually needs only 3+ and 4+ to do what I want, I don't want dice that will produce an unusual number of fives and sixes (which are overkill) but also give me a ton of ones and twos (which represent failed armor saves and missed shots). Instead of making the expected 66.7% of my armor saves or to-hit rolls, extremely skewed dice could drop my armor save and to-hit rate to nearly 50%.
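To make the point about averages concrete, here is a minimal Python sketch. The "extreme" die is a made-up worst case that only ever lands on 1 or 6; it is not one of my actual dice, and the function names are just for illustration.

```python
from fractions import Fraction

FACES = range(1, 7)

def mean_roll(probs):
    """Average roll, given the probability of each face 1..6."""
    return sum(face * p for face, p in zip(FACES, probs))

def success_rate(probs, target):
    """Probability of rolling `target` or higher (e.g. target=3 for a 3+)."""
    return sum(p for face, p in zip(FACES, probs) if face >= target)

fair = [Fraction(1, 6)] * 6
# Hypothetical worst case: the die only ever shows a 1 or a 6, equally often.
extreme = [Fraction(1, 2), 0, 0, 0, 0, Fraction(1, 2)]

for name, probs in (("fair", fair), ("extreme", extreme)):
    print(name, "mean:", float(mean_roll(probs)),
          "3+ rate:", float(success_rate(probs, 3)))
```

Both dice average 3.5, but the fair die makes a 3+ two thirds of the time while the extreme one manages only half, which is why averaging rolls can't catch this kind of bias.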
I eventually came across this D&D blog, which described using a Pearson's chi-square test to validate whether or not a d6 or a d20 is fair. This test has the advantage of detecting when individual numbers come up too often or not often enough rather than depending on averaging results. The equation for a d6 is:
Σ[(O-E)^2/E] > 11.070
Where "O" is the observed frequency and "E" is the expected frequency. The value of 11.070 is the chi-square critical value at the 5% significance level for a system with five degrees of freedom; i.e., the number of possible results (six for a d6) minus 1. The equation essentially says that if the sum exceeds this critical value, a fair die would produce results that lopsided less than 5% of the time, so the claim that the die is fair is probably false.
To use this equation you roll a die a certain number of times and record how many times each side shows up. For each of the six possible die results, you subtract the expected frequency (the total number of rolls divided by six) from the observed number of times that the side actually showed up, and square the difference. You divide that squared value by the expected frequency and then add up the resulting numbers from each of the six possible results. If the sum is greater than 11.070, it is statistically probable that the die is not fair. Although 30 rolls (with an expected frequency of 5 hits for each side) is considered to be a minimum, the same D&D blog noted that a 30-roll test is likely to miss slightly unbalanced dice. A better test would use more than 100 rolls per die.
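For anyone who would rather script it than build a spreadsheet, the steps above can be sketched in a few lines of Python. This is only an illustration: the function names and the two sample tallies are invented, and the threshold is the 11.070 figure quoted above.

```python
CRITICAL_VALUE_D6 = 11.070  # threshold for a d6 (five degrees of freedom)

def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all faces, where E assumes a fair die."""
    total_rolls = sum(observed)
    expected = total_rolls / len(observed)  # e.g. 30 rolls / 6 faces = 5
    return sum((o - expected) ** 2 / expected for o in observed)

def looks_unfair(observed, critical_value=CRITICAL_VALUE_D6):
    """True when the statistic exceeds the critical value."""
    return chi_square_statistic(observed) > critical_value

# Made-up tallies for faces 1..6 over 30 rolls each (not real data).
balanced = [5, 4, 6, 5, 5, 5]
lopsided = [12, 2, 3, 3, 2, 8]

for name, tally in (("balanced", balanced), ("lopsided", lopsided)):
    print(name, round(chi_square_statistic(tally), 2), looks_unfair(tally))
```

The first tally comes out around 0.4 and passes easily; the second comes out around 16.8 and fails.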
I decided to apply this test to my own dice and started by placing them in a grid and assigning each die a letter and number designation from A1 to F6. I'm crazy, but not crazy enough to roll all 36 of my dice over 100 times, so I started with just 30 rolls per die. I put the data into a spreadsheet that executed the equation and also calculated how often the die gave a result of 2+ (e.g., a Terminator armor save, the usual value needed to kill infantry with anti-tank weapons), of 3+ (e.g., a power armor save, the value needed for a Marine to hit), and of 4+ (e.g., a Scout armor save, the value to wound with most sniper weapons). I used the conditional formatting feature to automatically color-code results that I thought were suspicious. After crunching the numbers, I found that none of my dice failed after only 30 rolls, but that several were close. The blog's warning about a 30-roll test being insufficient became apparent when one die failed to roll a single 1 but was still able to pass the test (although just barely).
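Here is a rough Python equivalent of what the spreadsheet calculates for each die. The tally below is hypothetical (it is not the actual data from my die); it just shows how a die can roll no 1s at all in 30 rolls and still stay under 11.070.

```python
def chi_square_statistic(counts):
    """Sum of (O - E)^2 / E, with E = total rolls / number of faces."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def rate_at_least(counts, target):
    """Share of rolls that came up `target` or higher.

    counts[0] is the tally for face 1, counts[5] the tally for face 6.
    """
    return sum(counts[target - 1:]) / sum(counts)

# Hypothetical 30-roll tally: no 1s, every other face tied at 6.
no_ones = [0, 6, 6, 6, 6, 6]

print("chi-square:", round(chi_square_statistic(no_ones), 2))
for target in (2, 3, 4):
    print(f"{target}+ rate: {rate_at_least(no_ones, target):.2%}")
```

With the other five faces perfectly even, the statistic is only about 6, well short of the cutoff. Any extra unevenness among the remaining faces pushes it higher, which is how a real die with no 1s can end up just barely passing.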
Those dice that were close to failing or that I found suspicious (18 out of the 36) were given an additional 60 rolls (90 total rolls with an expected frequency of 15 hits per side). The additional results showed that a few of the 18 dice had merely had a bad series of rolls earlier and that they were relatively well balanced. However, four dice outright failed the chi-square test. The worst offender rolled a one more than a quarter of the time (27.78%) and made a 3+ armor save only 53.33% of the time versus the expected 66.67%. Another die rolled a 4+ only 33.33% of the time rather than the expected 50%. Of the four failures, only one rolled unusually high.
I would call this one "The Widowmaker" if Astartes had widows
Some would call this one "lucky". I retired it, too.
After more than 2,000 individual dice rolls, I decided to "retire" the four failed dice as well as five others whose chi-square results were greater than 6 (a little more than half of the 11.070 critical value). And yes, I also removed the high-rolling die, which had the second highest degree of bias. In short, I removed a quarter of my dice (9 of the 36) based on this test. Fortunately I play Space Marines, so I don't really need all 36 dice anyway. Oddly enough, I can't bring myself to throw out the biased dice. They're segregated in their own little Ziploc bag labeled with dire warnings.
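My retirement rule boils down to two cutoffs, sketched below in Python. The 11.070 threshold comes from the test itself; the cutoff of 6 is just my own rule of thumb, and the die labels and statistics in the example are invented.

```python
CRITICAL_VALUE = 11.070  # outright failure of the chi-square test
SUSPECT_VALUE = 6.0      # informal "retire it anyway" cutoff, not a statistical standard

def triage(statistic):
    """Sort a die into 'failed', 'suspect', or 'keep' from its chi-square result."""
    if statistic > CRITICAL_VALUE:
        return "failed"
    if statistic > SUSPECT_VALUE:
        return "suspect"
    return "keep"

# Invented results keyed by grid position (A1 to F6 in my layout).
results = {"A1": 3.1, "B4": 7.2, "C2": 12.5}

retired = [die for die, stat in results.items() if triage(stat) != "keep"]
print(retired)
```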
I would recommend the above test for any player who thinks his dice are giving him a raw deal. I was somewhat relieved to find that, although a lot of my bad luck may be a matter of perception, a significant amount of it may be due to biased dice. And if your dice don't turn out to be biased, then at least you have mathematical proof that it's all in your head.
I guess the question is, what can a tabletop gamer do about the problem? There are a variety of dice companies that promise balanced dice, but they charge as much for five dice as Chessex does for 36. Plus, Chessex sells a greater variety of colors and styles that are great for those players who want to match their dice to their army. I've simply decided to order a second box of Chessex dice (blue and white to match Ultramarine veterans) with the intention of performing another set of chi-square tests to weed out the biased ones. The test takes me about as long to finish as two long games of 40K. If that saves me from any more dice-sabotaged games, then it was worth it.