The field of statistics is often said to have been born in
1662 out of the collaboration between John Graunt and William Petty to develop
human statistical and census methods. However, the earliest writing on
statistics is actually found in a 9th-century book called "Manuscript
on Deciphering Cryptographic Messages," written by Al-Kindi.
Al-Kindi (801–873 AD) was, among other things, an Arab
mathematician during the Abbasid Caliphate. One of his greatest works in the
field of statistics was the world’s first description and use of statistics – applied
to cryptanalysis – which he discussed in the aforementioned book. In his
discussion of the quantitative techniques in cryptanalysis, Al-Kindi explained how to use letter frequency statistics to solve a given cryptogram:
“One way to solve an encrypted message,
if we know its [original] language, is to find a [different clear] text of the same
language long enough to fill one sheet or so and then we count [the occurrences of]
each letter of it. We call the most frequently occurring letter the “first”, the
next most occurring the “second”, the following most occurring the “third” and
so on, until we finish all different letters in the clear text [sample]. Then we
look at the cryptogram we want to solve and we also classify its symbols. We find
the most occurring symbol and change it to the form of the “first” letter [of the
clear text sample], the next most common symbol is changed to the form of the “second”
letter, and the following most common symbol is changed to the form of the “third”
letter and so on, until we account for all symbols of the cryptogram we want to
solve… But if the cryptogram is too short, equivalence does not apply, letter ranks
are not correct and [consequently] a second trick should be used to recover letters…”
There are many other contributions Al-Kindi has made to
statistics. He discusses the principles of cryptanalysis, where he explains four
cryptanalysis methods for normal texts and an additional method for decrypting
poetry. He also discusses cipher types, Arabic phonetics, Arabic syntax, and he
conducted a study on Arabic letter frequency, the first such study ever done.
But rather than expand on those, I would like to present an astonishing example
of frequency analysis. I found this example on the website of the Department of
Mathematics at Aarhus University in Aarhus, Denmark:
PIRIJSGYTDQRKVIPITKGTHMJUQRPDLSQQCKWSCMSSCIKPDBDJKNRPYLSKGKBYMJT
SCIMPMARKJSPDAUSDKJQSKDSMPIKGTIPMJTHKPIIXSIJQDVISCMJLPIVDKUQGYSC
KUBCSSCIWKPTRDLCIPDJIUPKLIMJGMJBUMBIQRKHIQNPKHSCIMPMADRWKPTQDNPS
CIJDJSCRIJSUPYMPMAQRDIJSDQSMGFDJTDDQSCIMUSCKPKNSCIKGTIQSFJKWJAKK
FKJRPYLSKGKBYMJSITMSDJBMJYKSCIPAYHKPISCMJSCPIICUJTPITYIMPQSCDQLM
LIPCDBCGDBCSQSCIQLIRDNDRRKJSPDAUSDKJQKNQKHIMPMARPYLSKGKBDQSQAMQI
TKJSCIJIWGYTDQRKVIPITTKRUHIJSQSCMSDJRGUTIAKKFQKNMGFDJTDDAJMTGMJM JTDAJMTTUPMDCDHNMRSKPQAIBDJTSCIIHIPBIJRIMJTMTVMJRIHIJSKNMPMARPYL
SKGKBYMPITDQRUQQITSCITDQRKVIPDIQPILKPSITDJSCDQLMLIPLUQCSCINPKJSD
IPQKNSCICDQSKPYKNRPYLSKGKBYAMRFAYMAKUSNDVICUJTPITYIMPQ
We know that the language of this message is English. First
by analyzing the frequency of letters on a page of English text we find that “e”
is the most common letter, followed by “t.” In this message above, “I” is the
most common letter followed by “S.” So I = e and S = t.
PeReJtGYTDQRKVePeTKGTHMJUQRPDLtQQCKWtCMttCeKPDBDJKNRPYLtKGKBYMJT tCeMPMARKJtPDAUtDKJQtKDtMPeKGTePMJTHKPeeXteJQDVetCMJLPeVDKUQGYtC
KUBCttCeWKPTRDLCePDJeUPKLeMJGMJBUMBeQRKHeQNPKHtCeMPMADRWKPTQDNPt
CeJDJtCReJtUPYMPMAQRDeJtDQtMGFDJTDDQtCeMUtCKPKNtCeKGTeQtFJKWJAKK FKJRPYLtKGKBYMJteTMtDJBMJYKtCePAYHKPetCMJtCPeeCUJTPeTYeMPQtCDQLM
LePCDBCGDBCtQtCeQLeRDNDRRKJtPDAUtDKJQKNQKHeMPMARPYLtKGKBDQtQAMQe
TKJtCeJeWGYTDQRKVePeTTKRUHeJtQtCMtDJRGUTeAKKFQKNMGFDJTDDAJMTGMJM
JTDAJMTTUPMDCDHNMRtKPQAeBDJTtCeeHePBeJReMJTMTVMJReHeJtKNMPMARPYL
tKGKBYMPeTDQRUQQeTtCeTDQRKVePDeQPeLKPteTDJtCDQLMLePLUQCtCeNPKJtD ePQKNtCeCDQtKPYKNRPYLtKGKBYAMRFAYMAKUtNDVeCUJTPeTYeMPQ
Substituting into the message, we find many instances
popping up of “tCe.” That looks a lot
like “the,” so C = h.
PeReJtGYTDQRKVePeTKGTHMJUQRPDLtQQhKWthMttheKPDBDJKNRPYLtKGKBYMJT
theMPMARKJtPDAUtDKJQtKDtMPeKGTePMJTHKPeeXteJQDVethMJLPeVDKUQGYth
KUBhttheWKPTRDLhePDJeUPKLeMJGMJBUMBeQRKHeQNPKHtheMPMADRWKPTQDNPt
heJDJthReJtUPYMPMAQRDeJtDQtMGFDJTDDQtheMUthKPKNtheKGTeQtFJKWJAKK
FKJRPYLtKGKBYMJteTMtDJBMJYKthePAYHKPethMJthPeehUJTPeTYeMPQthDQLM
LePhDBhGDBhtQtheQLeRDNDRRKJtPDAUtDKJQKNQKHeMPMARPYLtKGKBDQtQAMQe
TKJtheJeWGYTDQRKVePeTTKRUHeJtQthMtDJRGUTeAKKFQKNMGFDJTDDAJMTGMJM
JTDAJMTTUPMDhDHNMRtKPQAeBDJTtheeHePBeJReMJTMTVMJReHeJtKNMPMARPYL
tKGKBYMPeTDQRUQQeTtheTDQRKVePDeQPeLKPteTDJthDQLMLePLUQhtheNPKJtD
ePQKNthehDQtKPYKNRPYLtKGKBYAMRFAYMAKUtNDVehUJTPeTYeMPQ
Now, “H” appears only once in the entire message here. We
find that on the page of English text, c, u, m, w, f, g, y, p, b also only
appear once. So H may be any one of those letters. To find out which one it is,
we isolate the group of letters that surround the only “H” in the message: theeHe. We narrow the possibilities for what “H”
can be down to m, w, g, y based on the fact that they are the only letters that
could possibly make a meaningful word in that position (emerge, ewe, egest,
eye). We think “emerge” is the most likely word, so we go with H = m. We continue doing this for each letter
until we decrypt the message:
recentlydiscoveredoldmanuscriptsshowthattheoriginofcryptologyand
thearabcontributionstoitareolderandmoreextensivethanpreviouslyth
oughtthewordcipherineuropeanlanguagescomesfromthearabicwordsifrt
heninthcenturyarabscientistalkindiistheauthoroftheoldestknownboo
koncryptologyantedatinganyotherbymorethanthreehundredyearsthispa
perhighlightsthespecificcontributionsofsomearabcryptologistsbase
donthenewlydiscovereddocumentsthatincludebooksofalkindiibnadlana
ndibnadduraihimfactorsbegindtheemergenceandadvancementofarabcryp
tologyarediscussedthediscoveriesreportedinthispaperpushthefronti
ersofthehistoryofcryptologybackbyaboutfivehundredyears
References:
Ibrahim A. Al-Kadi (1992): ORIGINS OF CRYPTOLOGY: THE ARAB CONTRIBUTIONS, Cryptologia, 16:2, 97-126.
Kryptologi De arabiske bidrag, af Marc Skov Madsen, Department of Mathematics, Aarhus University, Aarhus, Denmark
No comments:
Post a Comment