# Generalizing the German Tank Problem

• Steven Miller, Williams College
Keywords: German Tank Problem; uniform distribution; discrete setting; continuous setting

### Abstract

The German Tank Problem dates back to World War II, when the Allies used a statistical approach to estimate the number of enemy tanks produced or in the field from the serial numbers observed after battles. Assuming that the tanks are labeled consecutively starting from 1, if we observe k tanks out of a total of N, with the largest observed serial number being m, then the best estimate for N is m(1 + 1/k) - 1. We call an estimate "best" when it is closest to the actual number of tanks. We explore many generalizations. First, we looked at the discrete and continuous one-dimensional cases. We attempted to improve the original formula by using different estimators, such as the second largest and the Lth largest tank, and, motivated by portfolio theory, tested whether a weighted average of different estimators would have smaller variance; however, the original formula, which uses the largest tank, proved to be the best. The continuous case was similar. Then we looked at the discrete and continuous square and circle variants, where we pick pairs instead of points. These were more complex, as we had to deal with problems in geometry and number theory, such as curvature issues in the circle and the fact that not every number is representable as a sum of two squares. In the cases where we could not derive exact formulas, we derived approximate ones. For the discrete and continuous square we tested various statistics, but found that the largest observed component of our pairs is the best statistic to use; the scaling factor in both cases is (2k+1)/(2k). For the circle we took motivation from the equation of a circle: in the continuous case we looked at the square root of X^2 + Y^2, and in the discrete case we looked at X^2 + Y^2 and took a square root at the end to estimate r. Interestingly, the scaling factors (numbers, generally a little greater than 1, by which we multiply the observed statistic to obtain our estimate) differed between the cases.
Lastly, we generalized the problem to L-dimensional squares and circles. The discrete and continuous squares proved to be similar to the two-dimensional square problem. For the L-dimensional circle, however, we had to use formulas for the volume of the L-ball and approximate the number of lattice points inside it. The discrete circle formula was particularly interesting, as it has no dependence on L.
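
The two explicit formulas quoted in the abstract, m(1 + 1/k) - 1 for the one-dimensional discrete case and the scaling factor (2k+1)/(2k) for the square, can be checked numerically with a small simulation. The Python sketch below is only an illustration and is not taken from the work itself: the function names are hypothetical, the tanks are assumed to be sampled without replacement, and the continuous square is assumed to be [0, side] x [0, side] with independent uniform coordinates.

```python
import random


def estimate_total(m, k):
    """One-dimensional discrete estimate of N from the largest of k serials."""
    return m * (1 + 1 / k) - 1


def estimate_side(largest_component, k):
    """Square estimate: scale the largest observed component by (2k+1)/(2k)."""
    return largest_component * (2 * k + 1) / (2 * k)


def mean_tank_estimate(n_tanks, k, trials, seed=0):
    """Average the tank estimate over many simulated battles.

    Assumption: k distinct serials are drawn uniformly from 1..n_tanks.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        serials = rng.sample(range(1, n_tanks + 1), k)
        total += estimate_total(max(serials), k)
    return total / trials


def mean_square_estimate(side, k, trials, seed=0):
    """Average the side estimate over many simulated samples of k pairs.

    Assumption: each pair has two independent coordinates, uniform on
    [0, side], so k pairs contribute 2k observed components.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        largest = max(rng.uniform(0, side) for _ in range(2 * k))
        total += estimate_side(largest, k)
    return total / trials


if __name__ == "__main__":
    print(estimate_total(60, 4))  # 74.0
    print(mean_tank_estimate(1000, 10, 20000))
    print(mean_square_estimate(100.0, 10, 20000))
```

For example, observing k = 4 tanks with largest serial m = 60 gives an estimate of 60(1 + 1/4) - 1 = 74. Under the stated assumptions, both simulated averages should land close to the true value, which is what one expects from estimators built by rescaling the expected maximum.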