The German Tank Problem with Multiple Factories

Authors

  • Steven J. Miller, Williams College
  • Kishan Sharma, University of Cambridge
  • Andrew K. Yang, University of Cambridge

DOI:

https://doi.org/10.46787/pump.v7i0.4249

Keywords:

German Tank Problem; sufficient statistics; complete statistics; sampling without replacement; linear estimators

Abstract

During the Second World War, estimates of the number of tanks deployed by Germany were critically needed. The Allies adopted a successful statistical approach to estimate this information: assuming that the tanks are sequentially numbered starting from 1, if we observe k tanks from an unknown total of N, then the best linear unbiased estimator for N is M(1 + 1/k) - 1, where M is the maximum observed serial number. However, in many situations the original German Tank Problem is insufficient, since typically there are l > 1 factories, and tanks produced by different factories may have serial numbers in disjoint ranges that are often widely separated.
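The single-factory estimator quoted above can be sketched in a few lines of Python. This is only an illustration of the formula M(1 + 1/k) - 1; the function name `estimate_total` and the sample serial numbers are hypothetical and do not come from the paper.

```python
def estimate_total(serials):
    """Estimate N from observed serial numbers drawn without replacement
    from 1..N, using the formula M(1 + 1/k) - 1 from the abstract.

    `serials` is a hypothetical input: a list of distinct positive integers.
    """
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)  # M, the maximum observed serial number
    return m * (1 + 1 / k) - 1


# Illustrative data only: four observed tanks, largest serial number 60.
print(estimate_total([19, 40, 42, 60]))
```

The estimate is always at least M, since the largest observed serial number is a lower bound on N; the correction term M/k - 1 shrinks as more tanks are observed.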

Clark, Gonye and Miller presented an unbiased estimator for N when the minimum serial number is unknown. Provided one identifies which samples correspond to which factory, one can then estimate each factory's range; summing the sizes of these ranges yields an estimate for the rival's total productivity. We construct an efficient procedure to estimate the total productivity and prove that it is effective when log l / log k is sufficiently small. In the final section, we show that, given information about the gaps, we can construct an estimator that performs orders of magnitude better when we have a small number of samples.
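When every sample is already labelled with its factory, the per-factory approach described above can be sketched as follows. This is not the paper's procedure, which is not given in the abstract. It is a minimal sketch that assumes the unknown-minimum estimator takes the form s(1 + 2/(k - 1)) - 1, where s is the spread (maximum minus minimum) of the k serial numbers observed from one factory. That form follows from the standard expectations E[max] = k(N + 1)/(k + 1) and E[min] = (N + 1)/(k + 1) for sampling without replacement from N consecutive integers. The function names and the data are hypothetical.

```python
def estimate_range_size(serials):
    """Estimate the size N of one factory's block of consecutive serial
    numbers when the starting number is unknown.

    Assumed form: s(1 + 2/(k - 1)) - 1, with s = max - min of the sample.
    """
    k = len(serials)
    if k < 2:
        raise ValueError("need at least two samples to measure a spread")
    spread = max(serials) - min(serials)
    return spread * (1 + 2 / (k - 1)) - 1


def estimate_total_production(samples_by_factory):
    """Sum the per-factory estimates.

    `samples_by_factory` is a hypothetical mapping from a factory label
    to the list of serial numbers attributed to that factory.
    """
    return sum(estimate_range_size(s) for s in samples_by_factory.values())


# Illustrative data only: two factories with widely separated ranges.
observed = {
    "A": [105, 120, 145],
    "B": [1010, 1030, 1050, 1070, 1090],
}
print(estimate_total_production(observed))
```

The sketch relies on the factory labels being known. Without them, the samples must first be grouped by factory, which is the harder step the abstract refers to.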

Published

2024-11-10

How to Cite

Miller, S. J., Sharma, K., & Yang, A. K. (2024). The German Tank Problem with Multiple Factories. The PUMP Journal of Undergraduate Research, 7, 231–257. https://doi.org/10.46787/pump.v7i0.4249