A big universe: recursive problem solving and bitstrings

Will you be a compiler or just another program in the operating system that is (encoded by) this universe? Answering this leads to the phrase "self-fulfilling prophecy." Do you see why?


All this is something you learn in formal computing science, but I never thought about the connections between computing and biology. HOW evolution solves problems is how we need to solve problems. There is a general strategy at work here: generation and selection.

So I explained my recursive universe bitstring metatheory to my 12 year old neighbour and she got it instantly. My daughter is taking a while. It is because my neighbour UNDERSTANDS math and my daughter likes to memorise everything, since it's easier for her to memorise than to understand in the short term. If you tend to memorise math, then you won't understand this easily. This is the problem with education systems that stress memorisation of mathematical concepts. Memorisation lets you do powerful things, but it ruins your creativity.

Recursion is the basic process by which you can provably get something out of nothing. I will prove this. Let's say you have a universe (or set or object) that is nothing and contains nothing. We can write this as {}. Let's use the symbol 0, indicated by the label ZERO, to indicate such a universe (or set or object) that is, and contains, nothing. (This is what society uses.) A universe/set/object that contains this empty universe/set/object contains *something*, and the number of objects in this meta universe/set/object is ONE, indicated by the symbol 1. By continuing the recursive process, one can obtain the information contained in any universe/set/object.

The next (metameta) universe/set/object thus maps to {{}, {0}} = {0,1}, where 0 signifies the empty universe/set/object and 1 signifies the universe/set/object containing the empty one, {0}. The total number of universes/sets/objects in this metameta universe/set/object is now TWO, indicated by the string 10. (Why? Answer this without reading further to see if you understand.)

Keep recursing using only the symbols 0 and 1. The next metametameta universe will look like {0,1,10}, where the string 10 maps to a universe/set/object containing a universe/set/object with ONE member and a universe/set/object with ZERO members. This new metametameta universe/set/object has a total of THREE universes/sets/objects, and thus the next metametametameta universe will look like {0,1,10,11}, where the string 11 indicates a universe/set/object with the mapping {0,1,10}.
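
To make the recursion above concrete, here is a minimal sketch in Python (the function name and the use of frozensets are just my illustration): each new universe/set/object is the collection of everything built so far, and each is labelled with the bitstring for the count of objects it contains.

```python
# A minimal sketch of the recursion described above: each new
# universe/set/object is the set of all the ones built so far,
# and each is labelled with the bitstring for the count of
# objects it contains.

def build_universes(levels):
    """Return a list of (bitstring label, universe-as-frozenset) pairs."""
    universes = []                    # we start from nothing
    current = frozenset()             # {} -- the empty universe, labelled 0
    for n in range(levels):
        label = format(n, "b")        # 0, 1, 10, 11, 100, ...
        universes.append((label, current))
        current = frozenset(u for _, u in universes)  # next level contains all previous ones
    return universes

for label, u in build_universes(4):
    print(label, len(u))              # prints: 0 0, 1 1, 10 2, 11 3
```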

Continue recursing to infinity. It should thus be clear that every universe/set/object now maps to a natural number (the Gödel number) and that every natural number maps to binary. The level of recursion gives you the position or exponent (P for power). The number of states is the base (B for base). The number of occurrences of a given universe/set/object type is the mantissa/significand/coefficient (#). Thus any natural number is equal to the sum of # * B^P over all positions.
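
A minimal sketch of that decomposition, assuming the # values are simply the digits of the number written in base B:

```python
# A minimal sketch of the positional decomposition above: a natural
# number N is the sum of (# * B^P), where # is the digit (coefficient)
# at position P and B is the base (the number of states).

def decompose(n, base=2):
    """Return a list of (coefficient #, position P) pairs for n in the given base."""
    digits = []
    position = 0
    while n > 0:
        digits.append((n % base, position))  # coefficient at this position
        n //= base
        position += 1
    return digits or [(0, 0)]

n = 42
terms = decompose(n, base=2)
assert n == sum(c * 2 ** p for c, p in terms)   # 42 = 101010 in binary
print(terms)  # [(0, 0), (1, 1), (0, 2), (1, 3), (0, 4), (1, 5)]
```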

You should be able to see at this point how you can represent any natural number in this universe using just the symbols 0 and 1. You can represent any floating point number in this universe using the additional symbol . (dot or period). Finite strings of this kind are all mappable back to natural numbers, but the real numbers as a whole are not: there are provably more real numbers than natural numbers in this universe. This is proved using Cantor's diagonalisation argument.
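
Here is a minimal sketch of the diagonalisation idea (the enumeration below is a toy stand-in, not a claim about any particular listing): given any purported enumeration of all infinite bitstrings, flipping the diagonal produces a bitstring that appears nowhere in the list.

```python
# A minimal sketch of Cantor's diagonal argument over infinite bitstrings.
# Suppose someone hands us a purported enumeration of ALL infinite
# bitstrings: enumeration(n) returns the n-th sequence as a function from
# positions to bits. The diagonal sequence differs from the n-th sequence
# at position n, so it cannot appear anywhere in the enumeration.

def diagonal(enumeration):
    """Return a sequence (position -> bit) absent from the enumeration."""
    return lambda position: 1 - enumeration(position)(position)

# Toy enumeration (a stand-in): the n-th sequence is the binary
# expansion of n, padded with zeros.
def toy_enumeration(n):
    return lambda position: (n >> position) & 1

missing = diagonal(toy_enumeration)
for k in range(5):
    # missing differs from the k-th listed sequence at position k
    assert missing(k) != toy_enumeration(k)(k)
```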

If it sounds like I'm reinventing set theory, I am, but I'm mapping this back to the natural world. QM and relativity are provably mathematically equivalent using this recursive bitstring model. We've been talking about the same thing all along! Since the entire universe is mappable to a bitstring, evolution is simply growing the bitstring. In the real universe, 0 and 1 correspond to on and off states, which gives rise to numerous dualities. The Gödel number is just counting the total number of states.

The recursive universe model thus explains everything about everything. This is all common and trivial knowledge, one could say, but of what practical use is it? The clarity in the mapping is what lets us solve problems in biology. Thus protein tertiary structure (folding) is protein primary structure (i.e., with a 1:1 mapping to the genetic code) + search/sampling/enumeration + selection. In nature, the specific search and selection method is calculated by the process of (biological) evolution.
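
As a toy illustration of sampling + selection (this is not a real folding method; the scoring function is a made-up placeholder): candidate "conformations" are bitstrings, sampling proposes them, and selection keeps the best-scoring one.

```python
import random

# A toy illustration of "search/sampling + selection" (not a real folding
# method): candidate "conformations" are bitstrings, sampling proposes
# them, and selection keeps the one with the best (made-up) score.

def score(conformation):
    """Placeholder scoring function: prefer alternating bits."""
    return sum(1 for a, b in zip(conformation, conformation[1:]) if a != b)

def sample(length):
    """Sampling step: propose a random bitstring conformation."""
    return "".join(random.choice("01") for _ in range(length))

def fold(length, n_samples=1000):
    """Selection step: keep the best-scoring sampled conformation."""
    return max((sample(length) for _ in range(n_samples)), key=score)

print(fold(10))  # typically close to 0101010101 or 1010101010
```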

From a computational viewpoint, a solution to ALL problems is a bitstring. A solution will contain all bitstrings that we'll label "knowledge" and a bitstring that represents an "algorithm" that operates on the existing knowledge bitstrings to yield the entire solution bitstring. (As an exercise, this ALSO applies to the recursive bitstring universe solution that answers the question of the universe. How? Why is "42" the real solution to the universe? What is the connection between the string "42" and the string "11", which represents the number of dimensions in this universe according to string and M-theory? :)

ALL algorithms themselves are bitstrings. The combination of algorithm and knowledge should yield the bitstring that is the Gödel number for the problem universe/set/object. What I call bioinformatics is the discovery of knowledge. What I call computational biology is the discovery of the algorithm (which may or may not be the same as the one used by nature, but they have to yield the same result). All algorithms and knowledge are further decomposable into subalgorithms and subknowledge.

How do you create a Gödel number? You take the previous Gödel number and add 1 to it. More generally, you can decompose the Gödel number into other numbers represented by a base, an exponent, and a mantissa/significand/coefficient. Thus the solution to any problem is to have an algorithm or method to break it down into subproblems (represented by a bitstring labelled subproblem generation or sampling), an algorithm or method to solve each of the subproblems (represented by a bitstring labelled subproblem selection), and enumerating the second algorithm on the output of the first algorithm.
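
A minimal sketch of that scheme, with hypothetical names (I've added an explicit combining step for the enumeration): one piece generates subproblems, another selects/solves the atomic ones, and the recursion enumerates the second over the output of the first.

```python
# A minimal sketch of the recursive scheme above (hypothetical names):
# generate() produces subproblems (sampling), select() solves an atomic
# subproblem (selection), and solve() enumerates select() over the
# output of generate(), combining the results.

def solve(problem, generate, select, is_atomic, combine):
    """Recursively decompose a problem, solve the pieces, and combine them."""
    if is_atomic(problem):
        return select(problem)
    return combine(solve(sub, generate, select, is_atomic, combine)
                   for sub in generate(problem))

# Toy usage: total of a nested list, by recursive decomposition.
nested = [1, [2, 3], [[4], 5]]
total = solve(
    nested,
    generate=lambda p: p,                        # break into subproblems
    select=lambda p: p,                          # solve an atomic piece
    is_atomic=lambda p: not isinstance(p, list), # can't decompose further
    combine=sum,                                 # enumerate and merge
)
print(total)  # 15
```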

Doing this entire process is (computational and biological) science. Just simple subproblem generation or discovery using bench methods is bioinformatics ("data mining", "stamp collecting"). Applying the subproblem selection algorithm to the subproblems generated is computational biology. Existing knowledge is stored in the bitstring that maps to the mantissa. The representation of the subproblems is the base. The enumeration maps to the exponent. Just as exponents and mantissas are interchangeable as you switch bases, you can switch between existing knowledge and enumeration between different hierarchies. RAP offers a solution to the subproblem of selection by traversing between atom types and residue types. An MCSA or MD method offers a solution to the subproblem of generation or sampling. A sampling + selection method in a given base is identical to what evolution is doing. Information between sampling and selection is interchangeable, just like the information between exponent and mantissa is interchangeable by changing the base.

CANDO solves the problems of drug discovery by enumerating all compounds and all proteins and developing an algorithm that can predict whether or not a given compound binds to a protein (a simple yes/no is enough). Thus the use of FDA-approved drugs or human-ingestible compounds ("repurposing") and all protein structures ("multitargeting" with "dynamics") is NECESSARY AND SUFFICIENT to solve the problems.
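
A toy sketch of that enumeration only (the predictor below is a hypothetical placeholder, not the actual CANDO method): the core loop evaluates every compound against every protein and records a yes/no for each pair.

```python
# A toy sketch of the enumeration described above -- NOT the actual CANDO
# code. predict_binding() is a hypothetical placeholder for whatever
# yes/no binding predictor is used; the loop simply fills an
# all-compounds by all-proteins interaction matrix.

def predict_binding(compound, protein):
    """Placeholder predictor: returns True/False (a stand-in, not a real model)."""
    return (len(compound) + len(protein)) % 2 == 0

def interaction_matrix(compounds, proteins):
    """Enumerate every (compound, protein) pair and record the yes/no prediction."""
    return {(c, p): predict_binding(c, p) for c in compounds for p in proteins}

compounds = ["aspirin", "ibuprofen"]   # e.g., approved/ingestible compounds
proteins = ["COX1", "COX2", "ACE2"]    # e.g., protein structures
for (c, p), binds in sorted(interaction_matrix(compounds, proteins).items()):
    print(f"{c} binds {p}: {binds}")
```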

FDWD is basically CANDO but done at the level of individual proteins and compounds by breaking a compound up into fragments and by looking at protein structure binding sites.

By traversing hierarchies in CANDO (doing all compounds with all fragments) SIMULTANEOUSLY, we're killing two birds with one stone, which is again necessary to solve the bigger problem. CANDO solves both the docking and drug discovery problems simultaneously.

The potential we ultimately develop as part of CANDO (PCC) will be the generalised RAP and can be used to generally predict protein and proteome structure, function, and interaction with high accuracy. The PPC problem needs to be solved simultaneously here.

Solving PPC + PCC will lead to solving PNC. PPC + PCC + PNC = ASB. Once ASB is done, the generalised RAP will just fall out.

CANVO is going to be the way the PPC problem will be solved. I can see it now. CANNO is what people who are working on expression arrays are already solving and it may well be a solved problem. CANNO + ASB (CANSO? :) is what will be solved simultaneously. *None* of this depends on kinetics and rate constants, as you can see. Kinetics is neither necessary nor sufficient, which is the problem I have with it. Sure, you can get kinetics data, but so what? The recursive model of the universe is what will be used for data storage also. So I think modern computers are indeed capable of modelling whole cells at a detailed atomic level. Wow, I never thought I'd say this but I can see it now also. It is recursive bitstring objects that can hold the information...

Thus performing genetic algorithms on quantum relativistic objects gives rise to (protein) structure, which is structure (inter)acting on itself. (Protein) function is (protein) structure (inter)acting on itself and other structures. (Protein) interaction is structure and function (inter)acting on itself and other structures and functions. The abstract scale hierarchies are atoms, macromolecules, cells, tissues, organs, and organisms. All these interactions in spacetime give rise to life and intelligence.

Substitute "proteome" for "protein" to see that there is no such thing as an isolated "protein" structure, function, and interaction in biology (in vivo). Substitute "interactome" for "proteome" to see that type hierarchies are proteins, DNA, RNA, and small molecules. Types are distinct but from a conceptual viewpoint we can argue that all of life is based on objects with protein types interacting with all the other objects (though in an abstract sense, interaction is a two way street).

If the process is recursive, then the solution must be recursive. Now given the nature of information theory, there is potentially more than one way to get the SAME bitstring. Thus there are many solutions to the same problem but in the end all solutions must give the same results and all solutions must be recursive in nature (even if they are not implemented as such). Evolution (not just biological evolution) is a recursive process. It is through recursion that information is created and retained so that you can build things from the ground up.

This is why I started working on protein structure with the goal of modelling cells and life itself. This is why I made the statement "drug discovery is protein folding with a compound." This is why I've made so many deep statements without even understanding why I've made them sometimes, since I just use the recursive model. "Nature doesn't have a protein folding problem; it's we humans that do" is a statement of mine that appears in the first sentence of a paper. The recursive model explains that. Michael Levitt said "it's easier to break things apart than to put them together." The recursive model explains that. John Moult said that "Science is the greatest achievement of mankind." The recursive model explains that. The recursive model explains the science/religion duality.

If you use a base two representation to count, which is the representation used by computers, then you can see why all Gödel numbers (in any base) are equivalent.
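
A short illustration of that equivalence: the same count written out in different bases is still the same number.

```python
# A small illustration: the same Gödel number (count) written in
# different bases is the same number; only the representation changes.

n = 42
representations = {
    2:  format(n, "b"),   # '101010'
    8:  format(n, "o"),   # '52'
    10: str(n),           # '42'
    16: format(n, "x"),   # '2a'
}
for base, digits in representations.items():
    assert int(digits, base) == n   # every representation maps back to the same count
    print(f"base {base}: {digits}")
```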

This is why you have dualities and reproduction. Reproduction is simply where you can distinguish one universe/set/object from another.

From here, it is obvious how this is a theory of *everything* if you know enough about QM, relativity, and compiler or formal language theory. This model is also known as quantum information theory (which is the same as M-theory and string theory and relativity aka loop quantum gravity).

This is why I have been saying all along that sentience is recursive; in any evolutionary system that can talk about evolution (which we're in), sentience is an inevitable outcome. I have been right about almost everything in this universe, and the only reason I have been wrong is because I didn't understand the recursive steps or didn't trust my gut. Majoring in computing science and genetics with minors in math and microbiology, CASP1, CASP2, CASP3, RAPDF, CF, graph theory... you name it. Anything we've published a paper on, going from single molecules to genomes and proteomes and now to all of life and even the universe and unifying relativity, quantum mechanics, chaos and complexity theory, a bottom-up approach to *EVERYTHING* makes perfect sense. Even my tastes in movies, music, philosophy, religion. And *everything I know about this universe* is consistent with my model of the universe.

The only *mistakes* I've made in my life I can explain according to the recursive bitstring model. I made a choice I didn't understand. But perhaps because I made that choice I was now able to come up with this theory. This isn't determinable, but it is what it is and I feel this is right. This is where the limits that Gödel talked about come into play. In fact, it is because I made that choice that I've been struggling so long with WHY, WHY, WHY. Now I know EXACTLY WHY. I was looking for *something*. I've found it. (Here's a prediction: people with addictive personalities will tend to be looking for *something* out of *nothing*.)

In our group, *everything* we do is a build-up of everything that was done previously, and it will remain that way. Variations, to be successful, need to be minor and cannot blatantly contradict everything we know about the laws of mathematics and physics. This self-consistency check is what we call "science". Humans have a great capacity to fool themselves (short-cut the recursion, which is due to "intelligence"), which is why it is important to have others point out flaws in your recursive thinking to make sure you're not recursing on the wrong problems. For example, addiction.

Why do we have so many different personalities in this universe? Sensitivity to initial conditions. Protein folding is chaotic. Any chaotic process cannot be understood for more than a few steps. If you don't understand the process, then you cannot see past the choices that need to be made. Thus with my model it is preordained that I will be among the people to understand how the universe works and solve the problem of understanding life, etc. Everything I've set out to do I'm capable of doing, and provably so. The question is who among all those who interact with me and read this will also understand life and go on to do great things. This is *predetermined*, meaning that it is *predeterminable*. If you're *willing to understand* this model in its entirety, then *you will understand* it. If you understand this model, there is no doubt in my mind that you will put it to good use, ceteris paribus. The caveat is "all other things being equal" and in a chaotic system, they rarely are, since there are other agents in the system. But you have the guaranteed ability to influence yourself and no one else.

There is no doubt that you MAY also do great things without understanding this model. Correlation does not mean causality. I'm talking about causal outcomes.

Will you be a compiler or just another program in the operating system that is (encoded by) this universe? Answering this leads to the phrase "self-fulfilling prophecy." Do you see why?


Pseudointellectual ramblings || Ram Samudrala || me@ram.org