Dit (unit)

The hartley (symbol Hart), also called a ban, or a dit (short for "decimal digit"),[1][2][3] is a logarithmic unit that measures information or entropy, based on base 10 logarithms and powers of 10. One hartley is the information content of an event if the probability of that event occurring is 1/10.[4] It is therefore equal to the information contained in one decimal digit (or dit), assuming a priori equiprobability of each possible value. It is named after Ralph Hartley.

If base 2 logarithms and powers of 2 are used instead, then the unit of information is the shannon or bit, which is the information content of an event if the probability of that event occurring is 1/2. Natural logarithms and powers of e define the nat.
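
As a minimal sketch of these definitions, the information content of an event with probability p can be computed as the negative logarithm of p in the chosen base. The function name `information_content` and its parameters are illustrative assumptions, not part of any standard.

```python
import math

def information_content(p: float, base: float = 10.0) -> float:
    """Return -log_base(p), the information content of an event of probability p.

    base=10 gives hartleys (bans, dits), base=2 gives shannons (bits),
    and base=e gives nats.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in the interval (0, 1]")
    return -math.log(p) / math.log(base)

# An event of probability 1/10 carries about one hartley.
print(information_content(1 / 10, base=10))
# An event of probability 1/2 carries about one shannon.
print(information_content(1 / 2, base=2))
# An event of probability 1/e carries about one nat.
print(information_content(1 / math.e, base=math.e))
```

The three printed values are each equal to 1 up to floating-point rounding.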

One ban corresponds to ln(10) nat = log2(10) Sh, or approximately 2.303 nat, or 3.322 bit (3.322 Sh).[a] A deciban is one tenth of a ban (or about 0.332 Sh); the name is formed from ban by the SI prefix deci-.
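
The conversion factors above follow directly from the change-of-base rule for logarithms. The sketch below illustrates them; the constant and function names are assumptions chosen for this example.

```python
import math

# Conversion factors for one ban (hartley).
NAT_PER_BAN = math.log(10)    # ln(10), about 2.303 nat
BIT_PER_BAN = math.log2(10)   # log2(10), about 3.322 Sh (bit)

def ban_to_nat(bans: float) -> float:
    """Convert an amount of information from bans to nats."""
    return bans * NAT_PER_BAN

def ban_to_bit(bans: float) -> float:
    """Convert an amount of information from bans to shannons (bits)."""
    return bans * BIT_PER_BAN

def deciban_to_bit(decibans: float) -> float:
    """Convert decibans (tenths of a ban) to shannons (bits)."""
    return ban_to_bit(decibans / 10.0)

print(round(ban_to_nat(1), 3))      # nats in one ban
print(round(ban_to_bit(1), 3))      # shannons in one ban
print(round(deciban_to_bit(1), 3))  # shannons in one deciban
```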

Though there is no associated SI unit, information entropy is part of the International System of Quantities, defined by International Standard IEC 80000-13 of the International Electrotechnical Commission.

History

The hartley is named after Ralph Hartley, who suggested in 1928 that information be measured using a logarithmic base equal to the number of distinguishable states in its representation, which would be base 10 for a decimal digit.[5][6]

The ban and the deciban were invented by Alan Turing with Irving John "Jack" Good in 1940, to measure the amount of information that could be deduced by the codebreakers at Bletchley Park using the Banburismus procedure, towards determining each day's unknown setting of the German naval Enigma cipher machine. The name was inspired by the enormous sheets of card, printed in the town of Banbury about 30 miles away, that were used in the process.[7]

Good argued that the sequential summation of decibans to build up a measure of the weight of evidence in favour of a hypothesis is essentially Bayesian inference.[7] Donald A. Gillies, however, argued that the ban is, in effect, the same as Karl Popper's measure of the severity of a test.[8]

Usage as a unit of odds

The deciban is a particularly useful unit for log-odds, notably as a measure of information in Bayes factors, odds ratios (ratio of odds, so log is difference of log-odds), or weights of evidence. Ten decibans correspond to odds of 10:1, 20 decibans to odds of 100:1, and so on. According to Good, a change in a weight of evidence of 1 deciban (i.e., a change in the odds from evens to about 5:4) is about as finely as humans can reasonably be expected to quantify their degree of belief in a hypothesis.[9]
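
The relationship between decibans, odds, and probability can be sketched as follows. The function names are illustrative assumptions, and the Bayes factors in the example are hypothetical values chosen only to show that adding weights of evidence in decibans is equivalent to multiplying the corresponding odds.

```python
import math

def odds_to_decibans(odds: float) -> float:
    """Express odds (e.g. 10.0 for 10:1) as a weight of evidence in decibans."""
    return 10.0 * math.log10(odds)

def decibans_to_odds(decibans: float) -> float:
    """Convert a weight of evidence in decibans back to odds."""
    return 10.0 ** (decibans / 10.0)

def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of a hypothesis to a probability."""
    return odds / (1.0 + odds)

# Hypothetical example: start at even odds (0 decibans) and observe three
# independent pieces of evidence with these assumed Bayes factors.
bayes_factors = [2.0, 5.0, 0.5]

prior_decibans = odds_to_decibans(1.0)
# Summing decibans corresponds to multiplying the prior odds by each factor.
posterior_decibans = prior_decibans + sum(odds_to_decibans(bf) for bf in bayes_factors)
posterior_odds = decibans_to_odds(posterior_decibans)

print(round(posterior_decibans, 2))                   # total weight of evidence
print(round(posterior_odds, 2))                       # equals 2 * 5 * 0.5 up to rounding
print(round(odds_to_probability(posterior_odds), 3))  # posterior probability
```

Working in decibans turns each Bayesian update into an addition, which is convenient for tallying evidence by hand.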

Odds corresponding to integer decibans can often be well approximated by simple integer ratios; these are collated below. The table gives each value to two decimal places, a simple approximating ratio (to within about 5%), and a more accurate ratio (to within about 1%) where the simple one is inaccurate:

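| decibans | exact value | approx. value | approx. ratio | accurate ratio | probability |
|----------|-------------|---------------|---------------|----------------|-------------|
| 0        | 10^(0/10)   | 1             | 1:1           |                | 50%         |
| 1        | 10^(1/10)   | 1.26          | 5:4           |                | 56%         |
| 2        | 10^(2/10)   | 1.58          | 3:2           | 8:5            | 61%         |
| 3        | 10^(3/10)   | 2.00          | 2:1           |                | 67%         |
| 4        | 10^(4/10)   | 2.51          | 5:2           |                | 71.5%       |
| 5        | 10^(5/10)   | 3.16          | 3:1           | 19:6, 16:5     | 76%         |
| 6        | 10^(6/10)   | 3.98          | 4:1           |                | 80%         |
| 7        | 10^(7/10)   | 5.01          | 5:1           |                | 83%         |
| 8        | 10^(8/10)   | 6.31          | 6:1           | 19:3, 25:4     | 86%         |
| 9        | 10^(9/10)   | 7.94          | 8:1           |                | 89%         |
| 10       | 10^(10/10)  | 10            | 10:1          |                | 91%         |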
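
The numeric columns of the table can be recomputed with a short script, sketched here under the assumption that the probability column is odds / (1 + odds). The helper names `table_row` and `relative_error` are hypothetical.

```python
def table_row(decibans: int) -> tuple:
    """Return (decibans, odds to 2 decimals, probability in percent to 1 decimal)."""
    odds = 10.0 ** (decibans / 10.0)
    probability = odds / (1.0 + odds)
    return decibans, round(odds, 2), round(100.0 * probability, 1)

def relative_error(numerator: int, denominator: int, decibans: int) -> float:
    """Relative error of the ratio numerator:denominator against 10^(decibans/10)."""
    exact = 10.0 ** (decibans / 10.0)
    return abs(numerator / denominator - exact) / exact

# Print the odds and probability for each integer number of decibans.
for db in range(11):
    print(table_row(db))

# Compare the simple ratio 3:2 and the more accurate 8:5 at 2 decibans.
print(relative_error(3, 2, 2))
print(relative_error(8, 5, 2))
```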

Notes

  1. ^ This value, approximately 10/3 but slightly less, can be understood simply: 3 decimal digits carry slightly less information than 10 binary digits (since 10^3 = 1000 < 1024 = 2^10), so 1 decimal digit carries slightly less information than 10/3 binary digits.

References

  1. ^ Klar, Rainer (1970-02-01). "1.8.1 Begriffe aus der Informationstheorie" [1.8.1 Terms used in information theory]. Digitale Rechenautomaten – Eine Einführung [Digital Computers – An Introduction]. Sammlung Göschen (in German). Vol. 1241/1241a (1 ed.). Berlin, Germany: Walter de Gruyter & Co. / G. J. Göschen'sche Verlagsbuchhandlung [de]. p. 35. ISBN 3-11-083160-0. ISBN 978-3-11-083160-3. Archiv-Nr. 7990709. Archived from the original on 2020-04-18. Retrieved 2020-04-13. (205 pages) (NB. A 2019 reprint of the first edition is available under ISBN 3-11002793-3, 978-3-11002793-8. A reworked and expanded 4th edition exists as well.)
  2. ^ Klar, Rainer (1989) [1988-10-01]. "1.9.1 Begriffe aus der Informationstheorie" [1.9.1 Terms used in information theory]. Digitale Rechenautomaten – Eine Einführung in die Struktur von Computerhardware [Digital Computers – An Introduction into the structure of computer hardware]. Sammlung Göschen (in German). Vol. 2050 (4th reworked ed.). Berlin, Germany: Walter de Gruyter & Co. p. 57. ISBN 3-11011700-2. ISBN 978-3-11011700-4. (320 pages)
  3. ^ Lukoff, Herman (1979). From Dits to Bits: A personal history of the electronic computer. Portland, Oregon, USA: Robotics Press. ISBN 0-89661-002-0. LCCN 79-90567.
  4. ^ "IEC 80000-13:2008". International Organization for Standardization (ISO). Retrieved 2013-07-21.
  5. ^ Hartley, Ralph Vinton Lyon (July 1928). "Transmission of Information" (PDF). Bell System Technical Journal. VII (3): 535–563. Retrieved 2008-03-27.
  6. ^ Reza, Fazlollah M. (1994). An Introduction to Information Theory. New York: Dover Publications. ISBN 0-486-68210-2.
  7. ^ a b Good, Irving John (1979). "Studies in the History of Probability and Statistics. XXXVII A. M. Turing's statistical work in World War II". Biometrika. 66 (2): 393–396. doi:10.1093/biomet/66.2.393. MR 0548210.
  8. ^ Gillies, Donald A. (1990). "The Turing-Good Weight of Evidence Function and Popper's Measure of the Severity of a Test". British Journal for the Philosophy of Science. 41 (1): 143–146. doi:10.1093/bjps/41.1.143. JSTOR 688010. MR 0055678.
  9. ^ Good, Irving John (1985). "Weight of Evidence: A Brief Survey" (PDF). Bayesian Statistics. 2: 253. Retrieved 2012-12-13.