Sample Text
While Emily Dickinson wrote that:
- Much madness is divinest Sense
- To the discerning Eye. . .
the problem lies in the discernment. Distinguishing meaningful
utterances from nonsense is not a trivial task. Confronted with a
lengthy text in an unknown script, how does one determine whether those
characters in fact contain a meaningful text, or were simply set using
the equivalent of printer's pi or a lorem ipsum-style filler text?
The problem matters in cryptography and other intelligence fields,
where signal must be distinguished from noise. Cryptanalysts have
devised algorithms to determine whether a given text is nonsense.
These algorithms typically analyse the repetitions and redundancy in a
text; in meaningful texts, certain frequently used words -- for example,
"the", "is", and "and" in an English text -- will occur over and over
again. A random scattering of letters, punctuation marks, and spaces
will not exhibit these regularities. Zipf's law states the pattern
mathematically: a word's frequency is roughly inversely proportional to
its rank in the frequency table. By
contrast, cryptographers typically seek to make their ciphertexts
resemble random distributions, to avoid tell-tale repetitions and
patterns that may give an opening for cryptanalysis.
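
The repetition analysis described above can be illustrated with a
minimal sketch. It is not any particular cryptanalytic algorithm; it
only assumes that a text can be split into words on whitespace, and it
measures what share of the word tokens repeat an earlier word. The
names repeated_word_ratio and random_text, the sample sentence, and
the 27-symbol alphabet are all hypothetical choices made for
illustration.

```python
import random
import string


def repeated_word_ratio(text: str) -> float:
    """Share of word tokens that repeat a word already seen in the text.

    0.0 means every word is distinct; values near 1.0 mean the text is
    dominated by a few recurring words.
    """
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / len(words)


def random_text(length: int, seed: int = 0) -> str:
    """Random scattering of lowercase letters and spaces (illustrative noise)."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "
    return "".join(rng.choice(alphabet) for _ in range(length))


# Hypothetical sample: a short English-like passage and noise of equal length.
english = (
    "the cat sat on the mat and the dog sat on the rug and "
    "the bird is in the tree and the fish is in the pond"
)
noise = random_text(len(english))

if __name__ == "__main__":
    print("english:", round(repeated_word_ratio(english), 3))
    print("noise:  ", round(repeated_word_ratio(noise), 3))
```

On a sample this short the comparison is only suggestive. A practical
test would need far longer texts and would likely work on characters
or character sequences rather than whitespace-delimited words, since
an unknown script may not mark word boundaries at all.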