On a recent walk I was listening to a podcast that discussed translations of ancient Sumerian tablets and the lore they contain about the Anunnaki. I won’t expound on it much except to say that it conveys an incredible set of creation myths, many elements of which you can trace directly through history to the religious lore, legends and stories we know today. Through a modern-day lens, it speaks of a race of what we would now call aliens from a different planet, descending in craft and creating humans through genetic engineering of primates. Far-fetched or not, I find this 6,000-year-old story fascinating. One item caught my attention on my walk: there were allusions to secrets placed within our genetic code by these creators. Since this was something I could potentially test objectively, I decided to take a look!
I started with the assumption that there are markers to find in our DNA that contain information that is meant to be found and that there is no expectation of a shared language or format context between the author and the reader.
If I were to leave such a message for an alien to discover in a pattern of data, it would seem obvious to encode a universal mathematical constant. I would do so with enough precision that a match could not plausibly be a random coincidence. I would design the DNA to ensure that this segment was resilient to mutation over time, and I would perhaps bookend a payload of data with this constant. I would select an obvious encoding mechanism to ensure identification and would perhaps include an instruction set in the payload that defined a more complex encoding mechanism that could pack data tighter and incorporate aspects of error correction to account for mutations. In any event, finding a universal constant is my principal objective.
Given the four nucleotides (A, C, G and T) that define a strand of DNA, an obvious encoding would be base 4. Given a numeric value, I can convert it to base 4 and associate each of the four nucleotides with 0, 1, 2, or 3. Not knowing the correct mapping, I can easily test each permutation (4! = 24). As a side note, I also started down the path of searching using base 64 encoding based on triplets of nucleotides, which are known to be essential units in many genetic activities. The number of permutations this creates for my search, though (64! ≈ 1.27×10^89), is infeasibly large to test. Worse, with that many permutations any match I did find would be statistically meaningless; it would be a poor choice of encoding unless there was a known key.
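To make the base 4 step concrete, here is a minimal sketch of converting the decimal portion of a constant to base 4 digits and spelling those digits as nucleotides. The function names are mine, not taken from my actual script, and the ordering of the 24 mappings (permutations of "ATCG", numbered from 1) is an assumption that happens to agree with the pattern numbers in the result log at the end of this post.

```python
from fractions import Fraction
from itertools import permutations


def fractional_base4(decimal_text, n_digits):
    """Return the first n_digits base 4 digits after the radix point."""
    frac_text = decimal_text.split(".")[1]
    # Exact rational arithmetic avoids floating point rounding.
    frac = Fraction(int(frac_text), 10 ** len(frac_text))
    digits = []
    for _ in range(n_digits):
        frac *= 4
        digit = int(frac)  # integer part is the next base 4 digit
        digits.append(str(digit))
        frac -= digit
    return "".join(digits)


# All 4! = 24 ways to assign nucleotides to the digits 0..3.
# Assumption: this ordering matches the "Pattern N" numbers in the log.
MAPPINGS = ["".join(p) for p in permutations("ATCG")]


def to_nucleotides(base4_digits, mapping):
    """Spell a base 4 digit string with mapping[0..3] as the nucleotides."""
    return "".join(mapping[int(d)] for d in base4_digits)


pi_text = "3.14159265358979323846264338327950288419716939937510"
pattern = fractional_base4(pi_text, 19)
print(pattern)
print(to_nucleotides(pattern, MAPPINGS[2]))  # mapping 0=A, 1=C, 2=T, 3=G
```

Each constant then yields 24 nucleotide strings to look for, one per mapping.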
The unitless constants of the universe that I decided to search for are as follows. Note that I also search for the reciprocal of each number, because it occurred to me that which of the two representations we use is often an arbitrary choice. I also search for each pattern both forwards and in reverse, as I cannot assume which direction DNA is intended to be read.
- Pi (π): 3.14159265358979323846264338327950288419716939937510…
- Tau: 6.28318530717958647692528676655900576839433879875020…
- Euler’s Number (e): 2.71828182845904523536028747135266249775724709369995…
- Golden Ratio (φ): 1.61803398874989484820458683436563811772030917980576…
- Feigenbaum Constant (δ): 4.66920160910299067185320382046620161725818557747576…
- Feigenbaum Constant (α): 2.50290787509589282228390287321821578638127137672714…
- Apéry’s Constant (ζ(3)): 1.20205690315959428539973816151144999076498629234050…
- Khinchin’s Constant: 2.68545200106530644530971483548179569382038229399446…
- Glaisher-Kinkelin Constant (A): 1.28242712910062263687534256886979172776768892732500…
- Catalan’s Constant (G): 0.91596559417721901505460351493238411077414937428167…
- Brun’s Constant for Twin Primes (B2): 1.90216058310401225355083740436407981021334353001897…
- Plastic Number (ρ): 1.324717957244746025960908854478097340734404056901733…
- Square Root of 2 (√2): 1.41421356237309504880168872420969807856967187537694…
- Euler-Mascheroni Constant (γ): 0.57721566490153286060651209008240243104215933593992…
- Embree-Trefethen Constant: 2.29558714939263807403429804918949039…
- Conway’s Constant (λ): 1.303577269034296391257099112152551890730702504659939…
- Golomb-Dickman Constant (δ): 0.62432998854355087099293638310083724…
- Meissel-Mertens Constant (M): 0.2614972128476427837554268386086958590515666482617…
- Niven’s Constant: 1.705211140105367764288551453434508160477…
- Copeland-Erdős Constant: 0.235711131719232931374143179454494…
- Champernowne Constant: 0.1234567891011121314151617181920…
- Liouville’s Constant: 0.110001000000000000000001000…
- Gauss-Kuzmin-Wirsing Constant: 0.30366300289873265859744812190155623…
- Fransén-Robinson Constant: 2.807770242028519365221501186557772932130349804086…
- Mills’ Constant: 1.30637788386308069046861449260260571…
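The reciprocal and reverse-direction variants described before the list can be sketched roughly as follows. This is an illustration, not my exact script: the helper names are made up, and the use of Python's `decimal` module at its default 28 significant digits is an assumption, although the 28-digit reciprocals in the result log are consistent with it.

```python
from decimal import Decimal, getcontext

# Assumption: 28 significant digits, which is the decimal module's default.
getcontext().prec = 28


def reciprocal_text(decimal_text):
    """Return 1/x as a decimal string, rounded to the context precision."""
    return str(1 / Decimal(decimal_text))


def pattern_variants(base4_pattern):
    """Return the pattern read forwards and backwards, without duplicates."""
    # dict.fromkeys keeps order and drops the copy if the pattern is a palindrome.
    return list(dict.fromkeys((base4_pattern, base4_pattern[::-1])))


khinchin = "2.68545200106530644530971483548179569382038229399446"
print(reciprocal_text(khinchin))
print(pattern_variants("0121212121222321230"))
```

Note that a reciprocal computed this way only carries as many trustworthy digits as the precision allows, which is still comfortably more than the 19 base 4 digits being searched for.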
I next secured a copy of the human genome (~3.5 GB) and wrote the code in Python, and after a bit of experimentation and adjustment I was ready to go. I decided to search only for the decimal portion of the values, as I cannot make an assumption about how the integer and decimal portions might be delimited. I also searched only to 19 digits of precision. While this is not safely beyond statistical coincidence, it weeds out quite a bit and gives me the opportunity to apply some hands-on evaluation of the results, because a pattern is often like porn: you can’t define it, but you know it when you see it!
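For those curious what the scan looks like, here is a stripped-down sketch of the idea rather than my actual script. It assumes the genome is already loaded as one uppercase string (`sequence`), and the function names and the 1-based mapping numbers over permutations of "ATCG" are illustrative assumptions.

```python
from itertools import permutations


def find_all(sequence, pattern):
    """Yield every start index of pattern in sequence, overlaps included."""
    start = sequence.find(pattern)
    while start != -1:
        yield start
        start = sequence.find(pattern, start + 1)


def search_constant(sequence, base4_pattern):
    """Try all 24 digit-to-nucleotide mappings, forwards and reversed.

    Returns a list of (mapping_number, position, nucleotide_probe) tuples.
    """
    hits = []
    # Skip the reversed copy when the pattern reads the same both ways.
    directions = list(dict.fromkeys((base4_pattern, base4_pattern[::-1])))
    for number, bases in enumerate(permutations("ATCG"), start=1):
        for digits in directions:
            probe = "".join(bases[int(d)] for d in digits)
            hits.extend((number, pos, probe) for pos in find_all(sequence, probe))
    return hits


# Toy example: a 4-digit pattern in a 10-letter "genome".
print(search_constant("GGATCAAGTT", "0210"))
```

On the real genome the same loop runs once per constant and per reciprocal, with 19-digit patterns instead of the toy 4-digit one.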
For better context, here is a table of pattern length (in base 4 characters) against the probability that a single search pattern turns up by coincidence in a data set of this size:
Length of Pattern | Probability of Randomly Finding |
14 | 1 in 1.000004 |
15 | 1 in 1.05 |
16 | 1 in 1.9 |
17 | 1 in 5.6 |
18 | 1 in 21 |
19 | 1 in 83 |
20 | 1 in 330 |
21 | 1 in 1317 |
22 | 1 in 5268 |
23 | 1 in 21071 |
24 | 1 in 84281 |
25 | 1 in 337123 |
26 | 1 in 1348489 |
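The numbers in this table can be approximated with a simple model, sketched below. It treats every position in the genome as an independent, uniformly random base and uses a Poisson approximation, which real DNA certainly does not obey exactly. The genome length of 3.34 billion positions is an assumption chosen because it lands close to the table, and the function names are illustrative.

```python
import math

# Assumption: number of searchable positions, picked to land near the table.
GENOME_LENGTH = 3.34e9


def expected_hits(n_searches, pattern_length, genome_length=GENOME_LENGTH):
    """Expected number of chance matches over n_searches distinct patterns."""
    return n_searches * genome_length / 4 ** pattern_length


def chance_of_match(pattern_length, genome_length=GENOME_LENGTH):
    """Probability that one pattern appears at least once by chance."""
    expected = expected_hits(1, pattern_length, genome_length)
    return -math.expm1(-expected)  # 1 - exp(-expected), computed stably


for length in range(14, 27):
    print(length, "1 in", round(1 / chance_of_match(length), 6))

print(expected_hits(1200, 19))  # chance matches expected over 1,200 searches
```

Under this model, 1,200 searches at length 19 give roughly 14.5 expected chance matches. If the reversed patterns are counted as separate searches (2,400 in total), the expectation roughly doubles.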
Searching 25 constants plus their reciprocals with 24 permutations is 1,200 searches. At length 19, all things being equal (which they rarely are), I should statistically expect maybe 14 matches. My scan in fact found 36 matches across only 11 of the 50 values being searched for (25 constants + 25 reciprocals), as shown below. I don’t take this higher find rate to be significant on its face.
- Pi (π): 1 Match
- Tau: 1 Match
- Euler’s Number (e): 1 Match
- Apéry’s Constant (ζ(3)): 1 Match
- Reciprocal of Khinchin’s Constant: 10 Matches
- Brun’s Constant for Twin Primes (B2): 1 Match
- Square Root of 2 (√2): 1 Match
- Reciprocal of Golomb-Dickman Constant (δ): 1 Match
- Reciprocal of Champernowne Constant: 15 Matches
- Gauss-Kuzmin-Wirsing Constant: 2 Matches
- Reciprocal of Mills’ Constant: 2 Matches
Inspecting these matches manually did not reveal anything that jumped out at me. Some matches do in fact match to 20 characters (not just 19), such as Pi, but none match to 21 or more, so this does not exceed the statistical likelihood of a chance find. I did not see any instances of bookending a payload of a reasonably small size. I attribute the high match counts to numeric repetition in the digits, which corresponds to the many repetitive sequences one can find in the genome; for example, the [12] repetition in [0121212121222321230] (Reciprocal of Champernowne Constant in base 4).
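One rough way to flag search patterns that are prone to this kind of repeat-driven matching is to measure the longest back-to-back repetition of a short unit in the digit string. The sketch below is only an illustration of that idea (the function name and the choice of 2-digit units are mine) and was not part of the original scan.

```python
def longest_tandem_run(digits, unit_length=2):
    """Return (unit, repeats) for the longest back-to-back run of one unit."""
    best_unit, best_repeats = "", 0
    for start in range(len(digits) - unit_length + 1):
        unit = digits[start:start + unit_length]
        repeats = 1
        pos = start + unit_length
        # Keep stepping while the next chunk is another copy of the unit.
        while digits[pos:pos + unit_length] == unit:
            repeats += 1
            pos += unit_length
        if repeats > best_repeats:
            best_unit, best_repeats = unit, repeats
    return best_unit, best_repeats


print(longest_tandem_run("0121212121222321230"))  # reciprocal of Champernowne
print(longest_tandem_run("0210033312222020201"))  # Pi
```

The Champernowne reciprocal pattern contains "12" five times in a row, covering more than half of its 19 digits, so any dinucleotide repeat in the genome gets it most of the way to a match.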
Bottom line: nothing screams out to me that a secret message is hidden in the code of the human genome, at least not along my line of inquiry.
I also went pretty far down the road of searching for palindromes in the genome, as well as creating 2D images of various dimensions to look for a visual pattern. There are so many trivial palindromes of material length in the genome that I felt that was a poor path. In the visual patterns I did not identify anything of note.
Below I have stripped the result log down to the details of all the constant matches for those that care to explore these results further:
Base 4 representation of 3.14159265358979323846264338327950288419716939937510: 3.021003331222202020112203002031030103012120220232000313001303 search pattern: 3[0210033312222020201]1220300203
Match found: Pattern 3 ATCAAGGGCTTTTATATAC (Mapping: 0=A, 1=C, 2=T, 3=G) 465138218 to 465138237 AAAAGAAGCA[CATATATTTTCGGGAACTA]GGGGGGTGTG 0000300310[1020202222133300120]3333332323
Base 4 representation of 6.28318530717958647692528676655900576839433879875020: 12.102013323111010100231012010122120212030301101130001232003212 search pattern: 12[1020133231110101002]3101201012
Match found: Pattern 19 AGTGACCTCAAAGAGAGGT (Mapping: 0=G, 1=A, 2=T, 3=C) 1547710374 to 1547710393 GTTTTCTGGC[AGTGACCTCAAAGAGAGGT]AGTATGGCTT 0222232003[1020133231110101002]1021200322
Base 4 representation of 2.71828182845904523536028747135266249775724709369995: 2.231332011101120220223231022212222333130111202000213033103303 search pattern: 2[2313320111011202202]2323102221
Match found: Pattern 13 TGAGGTCAAACAATCTTCT (Mapping: 0=C, 1=A, 2=T, 3=G) 1483253494 to 1483253513 TTCAATCTCC[TGAGGTCAAACAATCTTCT]CACCTCAGCC 2201120200[2313320111011202202]0100201300
Base 4 representation of 1.20205690315959428539973816151144999076498629234050: 1.030323220000103300001202010320031301130111301121321221001333 search pattern: 1[0303232200001033000]0120201032
Match found: Pattern 5 ACACTCTTAAAAGACCAAA (Mapping: 0=A, 1=G, 2=T, 3=C) 2222406228 to 2222406247 ACAGAAATTG[AAACCAGAAAATTCTCACA]GAACAGATGC 0301000221[0003301000022323030]1003010213
Base 4 representation of 0.3723767915432130620414228265: 0.113311100111313113132232230312113022001322331222322213232033 search pattern: 0[1133111001113131131]3223223031
Match found: Pattern 6 GGTTGGGAAGGGTGTGGTG (Mapping: 0=A, 1=G, 2=C, 3=T) 2346461166 to 2346461185 GACCCAGAGG[GGTTGGGAAGGGTGTGGTG]GGCTGACTGG 1022201011[1133111001113131131]1123102311
Match found: Pattern 8 AACCAAATTAAACACAACA (Mapping: 0=T, 1=A, 2=G, 3=C) 1309683221 to 1309683240 AGAGAAGATC[AACCAAATTAAACACAACA]GCAGGGAGGC 1212112103[1133111001113131131]2312221223
Match found: Pattern 13 AAGGAAACCAAAGAGAAGA (Mapping: 0=C, 1=A, 2=T, 3=G) 2732275857 to 2732275876 CTCACTTTAT[AGAAGAGAAACCAAAGGAA]ACAAAGAAAG 0201022212[1311313111001113311]1011131113
Match found: Pattern 13 AAGGAAACCAAAGAGAAGA (Mapping: 0=C, 1=A, 2=T, 3=G) 2780550767 to 2780550786 ATGTCCCCCC[AGAAGAGAAACCAAAGGAA]GAGCGTGGGG 1232000000[1311313111001113311]3130323333
Match found: Pattern 16 TTAATTTCCTTTATATTAT (Mapping: 0=C, 1=T, 2=G, 3=A) 2612833054 to 2612833073 ACCAAGATAA[TATTATATTTCCTTTAATT]AAACTCAAGG 3003323133[1311313111001113311]3330103322
Match found: Pattern 17 GGTTGGGCCGGGTGTGGTG (Mapping: 0=C, 1=G, 2=A, 3=T) 426715349 to 426715368 AAAACATAAG[GGTTGGGCCGGGTGTGGTG]GCTCACGCTT 2222023221[1133111001113131131]1030201033
Match found: Pattern 20 AATTAAAGGAAATATAATA (Mapping: 0=G, 1=A, 2=C, 3=T) 2135692048 to 2135692067 TTAAAGTATA[AATTAAAGGAAATATAATA]TTTGGAGATG 3311103131[1133111001113131131]3330010130
Match found: Pattern 20 AATTAAAGGAAATATAATA (Mapping: 0=G, 1=A, 2=C, 3=T) 1282086015 to 1282086034 CAGACTGCTG[ATAATATAAAGGAAATTAA]ATGCCTGCAG 2101230230[1311313111001113311]1302230210
Match found: Pattern 24 CCAACCCGGCCCACACCAC (Mapping: 0=G, 1=C, 2=T, 3=A) 2252506106 to 2252506125 AGGCATGAGC[CACCACACCCGGCCCAACC]ACACACTTTT 3001320301[1311313111001113311]3131312222
Match found: Pattern 24 CCAACCCGGCCCACACCAC (Mapping: 0=G, 1=C, 2=T, 3=A) 2672006322 to 2672006341 AGGCGTGAGC[CACCACACCCGGCCCAACC]CTGTCTCTTA 3001020301[1311313111001113311]1202121223
Base 4 representation of 1.90216058310401225355083740436407981021334353001897: 1.321233033332332002300021223321201113112311032131303013233210 search pattern: 1[3212330333323320023]0002122332
Match found: Pattern 18 ATGTAACAAAATAATCCTA (Mapping: 0=C, 1=G, 2=T, 3=A) 949142393 to 949142412 TTTTTAAATA[ATGTAACAAAATAATCCTA]ATTCCTTCCT 2222233323[3212330333323320023]3220022002
Base 4 representation of 1.41421356237309504880168872420969807856967187537694: 1.122200213212121333032330302100202302332301031212322221111331 search pattern: 1[1222002132121213330]3233030210
Match found: Pattern 17 GAAACCAGTAGAGAGTTTC (Mapping: 0=C, 1=G, 2=A, 3=T) 669045677 to 669045696 AGACATTATG[GAAACCAGTAGAGAGTTTC]CTAAAAATAT 2120233231[1222002132121213330]0322222323
Base 4 representation of 1.601717070059087553367897578: 1.212200220201100111211011311113101100000300320322113002331013 search pattern: 1[2122002202011001112]1101131111
Match found: Pattern 15 ATAACCAACACTTCCTTTA (Mapping: 0=C, 1=T, 2=A, 3=G) 1622660849 to 1622660868 TGTATAAAAC[ATTTCCTTCACAACCAATA]ACTGTGTATT 1312122220[2111001102022002212]2013131211
Base 4 representation of 8.100000067076033613307319674: 20.012121212122232123001322212112013101302120102110311201122010 search pattern: 20[0121212121222321230]0132221211
Match found: Pattern 2 ATGTGTGTGTGGGCGTGCA (Mapping: 0=A, 1=T, 2=G, 3=C) 1666256002 to 1666256021 AAGGTGTGTG[ATGTGTGTGTGGGCGTGCA]CACGAGTGTG 0022121212[0121212121222321230]3032021212
Match found: Pattern 2 ATGTGTGTGTGGGCGTGCA (Mapping: 0=A, 1=T, 2=G, 3=C) 2878631215 to 2878631234 ACATGCACAT[ATGTGTGTGTGGGCGTGCA]CCGCGCTGTG 0301230301[0121212121222321230]3323231212
Match found: Pattern 2 ATGTGTGTGTGGGCGTGCA (Mapping: 0=A, 1=T, 2=G, 3=C) 378133506 to 378133525 AGATGCACAC[ACGTGCGGGTGTGTGTGTA]TGGGGGACTC 0201230303[0321232221212121210]1222220313
Match found: Pattern 3 ACTCTCTCTCTTTGTCTGA (Mapping: 0=A, 1=C, 2=T, 3=G) 2187099647 to 2187099666 ATGCAAATGT[AGTCTGTTTCTCTCTCTCA]AAATATCTTA 0231000232[0321232221212121210]0002021220
Match found: Pattern 3 ACTCTCTCTCTTTGTCTGA (Mapping: 0=A, 1=C, 2=T, 3=G) 3234406458 to 3234406477 AAAAGATAAC[AGTCTGTTTCTCTCTCTCA]GTTCCAGCTG 0000302001[0321232221212121210]3221103123
Match found: Pattern 5 AGTGTGTGTGTTTCTGTCA (Mapping: 0=A, 1=G, 2=T, 3=C) 1737810067 to 1737810086 TGTGTGTGTG[AGTGTGTGTGTTTCTGTCA]TGATGGCAAC 2121212121[0121212121222321230]2102113003
Match found: Pattern 11 TGAGAGAGAGAAACAGACT (Mapping: 0=T, 1=G, 2=A, 3=C) 1220060902 to 1220060921 ATGGATGGCG[TCAGACAAAGAGAGAGAGT]TTATGCAGGG 2011201131[0321232221212121210]0020132111
Match found: Pattern 18 CGTGTGTGTGTTTATGTAC (Mapping: 0=C, 1=G, 2=T, 3=A) 1212095067 to 1212095086 TATGCGTGTG[CATGTATTTGTGTGTGTGC]ATTTTAGGGG 2321012121[0321232221212121210]3222231111
Match found: Pattern 18 CGTGTGTGTGTTTATGTAC (Mapping: 0=C, 1=G, 2=T, 3=A) 1556512220 to 1556512239 CTACCCCACG[CATGTATTTGTGTGTGTGC]TGAGCTGTGA 0230000301[0321232221212121210]2131021213
Match found: Pattern 18 CGTGTGTGTGTTTATGTAC (Mapping: 0=C, 1=G, 2=T, 3=A) 1631873544 to 1631873563 CCATACAAGG[CATGTATTTGTGTGTGTGC]ATATGTATGT 0032303311[0321232221212121210]3232123212
Match found: Pattern 18 CGTGTGTGTGTTTATGTAC (Mapping: 0=C, 1=G, 2=T, 3=A) 2058895657 to 2058895676 TATATGTGTG[CATGTATTTGTGTGTGTGC]ATATATTTGT 2323212121[0321232221212121210]3232322212
Match found: Pattern 18 CGTGTGTGTGTTTATGTAC (Mapping: 0=C, 1=G, 2=T, 3=A) 2249360950 to 2249360969 TTTCTATATA[CATGTATTTGTGTGTGTGC]ATGTGTATAC 2220232323[0321232221212121210]3212123230
Match found: Pattern 21 GTATATATATAAACATACG (Mapping: 0=G, 1=T, 2=A, 3=C) 253561184 to 253561203 GAATATAAGT[GTATATATATAAACATACG]TATATATAGG 0221212201[0121212121222321230]1212121200
Match found: Pattern 23 GCACACACACAAATACATG (Mapping: 0=G, 1=C, 2=A, 3=T) 322593139 to 322593158 ACACGTGAGT[GCACACACACAAATACATG]CCTGCACATG 2121030203[0121212121222321230]1130121230
Match found: Pattern 23 GCACACACACAAATACATG (Mapping: 0=G, 1=C, 2=A, 3=T) 1352863119 to 1352863138 GTTACTATAC[GTACATAAACACACACACG]TATATACATA 0332132321[0321232221212121210]3232321232
Base 4 representation of 0.30366300289873265859744812190155623: 0.103123303123302213102020220333132132301220233201320131231232 search pattern: 0[1031233031233022131]0202022033
Match found: Pattern 7 ATGACGGTGACGGTCCAGA (Mapping: 0=T, 1=A, 2=C, 3=G) 1947941593 to 1947941612 TTTCACTTTA[AGACCTGGCAGTGGCAGTA]AAAGTGCACA 0002120001[1312203321303321301]1113032121
Match found: Pattern 7 ATGACGGTGACGGTCCAGA (Mapping: 0=T, 1=A, 2=C, 3=G) 3147352655 to 3147352674 TTTCACTTTA[AGACCTGGCAGTGGCAGTA]AAAGTGCACA 0002120001[1312203321303321301]1113032121
Base 4 representation of 0.7654752980377371033867398966: 0.300333120300122233133022200233303312112022213323013122000330 search pattern: 0[3003331203001222331]3302220023
Match found: Pattern 16 ACCAAATGCACCTGGGAAT (Mapping: 0=C, 1=T, 2=G, 3=A) 2373697505 to 2373697524 GTCACTGTAC[ACCAAATGCACCTGGGAAT]CTCAATTCCA 2103012130[3003331203001222331]0103311003
Match found: Pattern 23 TGGTTTCAGTGGCAAATTC (Mapping: 0=G, 1=C, 2=A, 3=T) 1427800378 to 1427800397 ATTTTTTCTC[TGGTTTCAGTGGCAAATTC]AGTTCCCTCC 2333333131[3003331203001222331]2033111311