Friday 17 February 2017

A PROGRAM FOR LIFE

Dick Pountain/Idealog 266/06 September 2016 12:41

Long-term readers of this column, assuming that any remain, may have become aware of a small number of themes to which I return fairly regularly. I don't count computing itself as one of these as all my columns are supposed to be about that. The top three of those themes are biology-versus-AI, parallel processing and object-oriented programming but recently, for the first time in 20+ years, I've begun to see a way to combine them all into one column. This one...

Last month I described how I plucked up the courage to catch up on Evo Devo (Evolutionary Developmental Biology), the recent science of the way DNA actually gets turned into the multitude of forms of living creatures. To relate Evo Devo to computing I knocked together a rather clunky analogy with 3D Printing, but since then it has occurred to me there's a far stronger potential connection which has yet to be realised. The aspect of modern genetics that lay people are most familiar with is the Human Genome Project, and in particular the claims that were made for it as a route to curing genetic diseases. Those cures have so far been slow in arriving, and Evo Devo tells us why: because the relationship between individual genes and properties of organisms is nowhere near so simple as was believed even 20 years ago. DNA-to-RNA-to-protein-to-physical effect isn't even close to what happens. Instead there's a small group of genes shared by almost all multi-cellular creatures that get used over and again for many different purposes, in many different places, at many different times, controlled by a mind-boggling network of DNA switches contained in that "junk" DNA that doesn't code for proteins. In short, genes are the almost static data inputs to a complex biological computer, contained in the same DNA, which executes programs that can build a mouse or a tiger from mostly the same few genes.

The Human Genome Project of course relied heavily on actual silicon computing power, not merely to store the results for each organism it sequenced - around 200GB per creature - but also to operate the guts of those automated sequencing machines that made it possible at all. However the data structures it worked with were fairly simple, mainly lists of pairs of the bases A, C, G, T. But let's suppose that the next generation project ought to be to simulate Evo Devo, in other words to mimic the way those lists of bases actually get turned into critters. Then you'd need some very fancy data structures indeed, ones that aren't merely static data but include active processes, conditional execution, spatial coordinates and evolutionary hierarchies. And of course all of these components already exist and are well understood in the world of computer science.

The first step would involve object-oriented programming: decompose those long lists (around 3 billion bases in each strand of your DNA) into individual genes and switch sequences, then put each one into an object whose data is the ACGT sequence and whose methods set up links to other genes and switches. You'd have to incorporate embryological findings about where in relative space (measured in cells within the developing embryo) and time (relative to fertilisation) each method is to be executed, and since millions of genes are doing their thing at the same, meticulously choreographed time, the description language would need to support synchronised parallelism. Having built such a description for many creatures, you could then arrange all their object trees into an inheritance hierarchy that accorded with the latest findings of evolutionary biology. If you were feeling mischievous you could call the base class of this vast tree "God".
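A toy sketch in Python may make the shape of that description clearer. It is an illustration only, not a real genomics model: the class names (Gene, Switch, Creature), the made-up sequence and the cell and hour ranges are all assumptions invented for this example, and embryo space is flattened to a single axis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Static data: a named ACGT sequence."""
    name: str
    bases: str

    def __post_init__(self):
        # Reject empty sequences and anything outside the four bases.
        if not self.bases or set(self.bases) - set("ACGT"):
            raise ValueError(f"{self.name}: not an ACGT sequence")


@dataclass(frozen=True)
class Switch:
    """Active part: turns one gene on in a region of the embryo, for a while."""
    target: Gene
    cells: range   # positions along one body axis, counted in cells
    hours: range   # hours since fertilisation

    def is_on(self, cell: int, hour: int) -> bool:
        return cell in self.cells and hour in self.hours


class Creature:
    """Root of the inheritance tree; subclasses differ only in their switches."""
    switches: tuple = ()

    @classmethod
    def expressed(cls, cell: int, hour: int) -> set:
        """Names of the genes switched on at this place and time."""
        return {s.target.name for s in cls.switches if s.is_on(cell, hour)}


LIMB = Gene("limb_toy", "ACGTTGCA")  # hypothetical gene, made-up sequence


class Mouse(Creature):
    switches = (Switch(LIMB, range(10, 20), range(24, 48)),)


class Tiger(Creature):
    # Same gene, wider region and longer window: a different animal.
    switches = (Switch(LIMB, range(10, 60), range(24, 96)),)
```

Here Mouse and Tiger share the same Gene object and differ only in their switches, which is the Evo Devo point in miniature. A real description would need three spatial dimensions, switches that respond to other genes' products, and many thousands of objects.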

If someone made me Director of this project and promised me a few tens of billions of dollars, I'd propose that a new programming language be created as a hybrid of Python - which has excellent sequence handling, in addition to objects - and Occam, whose concept of self-syncing communication channels the IT business is only just catching up with after 30 years (witness Nvidia's Tesla P100 architecture). Naming it would be a problem, as neither Pytham nor Octhon appeals and Darwin (the obvious choice) is already taken. The rest of the dosh would go towards a truly colossal multiprocessor computer, bigger than those used for weather forecasting, simulating nuclear explosions or the EU's Human Brain Project. Distribute the object tree for some creature over its millions of cores, connect it up to a state-of-the-art visualisation system and you'd be in the business of Virtual Creation. And once it could tell mice from tigers, you'd perhaps be in the business of curing human diseases too. Unfortunately the people who have this sort of cash to spend would rather spend it on one-way trips to Mars...
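For a flavour of what the Occam half would contribute, here is a rough approximation of a self-synchronising channel using only Python's standard library. It is a sketch under stated assumptions, not Occam semantics proper: the Channel class and the two process functions are hypothetical names, and the rendezvous only holds for one sender and one receiver per channel.

```python
import queue
import threading


class Channel:
    """Approximate Occam-style channel: send() blocks until the value is received."""

    def __init__(self):
        self._q = queue.Queue(maxsize=1)

    def send(self, value):
        self._q.put(value)
        self._q.join()        # wait until the receiver has taken the value

    def receive(self):
        value = self._q.get()
        self._q.task_done()   # releases the blocked sender
        return value


def switch_process(out, hours):
    """Hypothetical switch: signals its gene once for each hour it is on."""
    for hour in hours:
        out.send(hour)
    out.send(None)            # end-of-stream marker (an assumption of this sketch)


def gene_process(inp, log):
    """Hypothetical gene: records each hour at which it was told to express."""
    while (hour := inp.receive()) is not None:
        log.append(f"expressed at hour {hour}")


def run():
    chan, log = Channel(), []
    workers = [
        threading.Thread(target=switch_process, args=(chan, range(24, 27))),
        threading.Thread(target=gene_process, args=(chan, log)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return log
```

Calling run() starts the two processes in parallel and returns the gene's log once both have finished. Neither process needs a lock or a shared clock, because the channel itself keeps them in step, which is the property the hybrid language would need at a scale of millions of processes.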



