Friday 17 February 2017

A PROGRAM FOR LIFE

Dick Pountain/Idealog 266/06 September 2016 12:41

Long-term readers of this column, assuming that any remain, may have become aware of a small number of themes to which I return fairly regularly. I don't count computing itself as one of these as all my columns are supposed to be about that. The top three of those themes are biology-versus-AI, parallel processing and object-oriented programming but recently, for the first time in 20+ years, I've begun to see a way to combine them all into one column. This one...

Last month I described how I plucked up the courage to catch up on Evo Devo (Evolutionary Developmental Biology), the recent science of the way DNA actually gets turned into the multitude of forms of living creatures. To relate Evo Devo to computing I knocked together a rather clunky analogy with 3D Printing, but since then it has occurred to me there's a far stronger potential connection which has yet to be realised. The aspect of modern genetics that lay people are most familiar with is the Human Genome Project, and in particular the claims that were made for it as a route to curing genetic diseases. Those cures have so far been slow in arriving, and Evo Devo tells us why: because the relationship between individual genes and properties of organisms is nowhere near so simple as was believed even 20 years ago. DNA-to-RNA-to-protein-to-physical effect isn't even close to what happens. Instead there's a small group of genes shared by almost all multi-cellular creatures that get used over and again for many different purposes, in many different places, at many different times, controlled by a mind-boggling network of DNA switches contained in that "junk" DNA that doesn't code for proteins. In short, genes are the almost static data inputs to a complex biological computer, contained in the same DNA, which executes programs that can build a mouse or a tiger from mostly the same few genes.

The Human Genome Project of course relied heavily on actual silicon computing power, not merely to store the results for each organism it sequenced - around 200GB per creature - but also to operate the guts of those automated sequencing machines that made it possible at all. However the data structures it worked with were fairly simple, mainly lists of pairs of the bases A, C, G, T. But let's suppose that the next generation project ought to be to simulate Evo Devo, in other words to mimic the way those lists of bases actually get turned into critters. Then you'd need some very fancy data structures indeed, ones that aren't merely static data but include active processes, conditional execution, spatial coordinates and evolutionary hierarchies. And of course all of these components already exist and are well understood in the world of computer science.

The first step would involve object-oriented programming: decompose those long lists (around 3 billion bases in each strand of your DNA) into individual genes and switch sequences, then put each one into an object whose data is the ACGT sequence and whose methods set up links to other genes and switches. You'd have to incorporate embryological findings about where in relative space (measured in cells within the developing embryo) and time (relative to fertilisation) each method is to be executed, and since millions of genes are doing their thing at the same, meticulously choreographed time, the description language would need to support synchronised parallelism. Having built such a description for many creatures, you could then arrange all their object trees into an inheritance hierarchy that accorded with the latest findings of evolutionary biology. If you were feeling mischievous you could call the base class of this vast tree "God".
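To make that decomposition a little more concrete, here is a minimal Python sketch of it. Everything in it is an assumption made for illustration: the class names, the gene name `limb_builder`, the base sequences, and the use of a single body axis counted in cells with time counted in hours are all invented, and a real description would need far richer coordinates. It shows only the shape of the idea, namely that the two creatures share one gene object and differ in their switches.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    """A protein-coding gene: its data is simply the base sequence."""
    name: str
    sequence: str  # string over the alphabet A, C, G, T


@dataclass
class Switch:
    """A regulatory sequence that turns one gene on within a window of
    embryonic space and time (units here are purely illustrative)."""
    sequence: str
    target: Gene
    cells: range  # positions along one body axis, counted in cells
    hours: range  # hours since fertilisation

    def is_on(self, cell: int, hour: int) -> bool:
        return cell in self.cells and hour in self.hours


@dataclass
class Creature:
    """Hypothetical base class of the inheritance hierarchy."""
    switches: list = field(default_factory=list)

    def expressed(self, cell: int, hour: int) -> set:
        """Names of the genes active at one place and time."""
        return {s.target.name for s in self.switches if s.is_on(cell, hour)}


# One shared gene; the name and sequence are made up for this sketch.
LIMB = Gene("limb_builder", "ACGTTGCA")


class Mouse(Creature):
    def __init__(self):
        super().__init__([Switch("TTAGGC", LIMB, range(10, 14), range(20, 30))])


class Tiger(Creature):
    def __init__(self):
        # Same gene as the mouse, but the switch covers a wider window.
        super().__init__([Switch("TTAGGA", LIMB, range(10, 40), range(20, 60))])
```

Asking `Mouse().expressed(30, 25)` and `Tiger().expressed(30, 25)` then gives different answers from the same gene, which is the point of putting the behaviour in the switches rather than in the gene.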

Imagine that someone has made me Director of this project and promised me a few tens of billions of dollars. I'd propose that a new programming language be created as a hybrid of Python - which has excellent sequence handling, in addition to objects - and Occam, whose concept of self-syncing communication channels the IT business is only just catching up with after 30 years (witness Nvidia's Tesla P100 architecture). Naming it would be a problem as neither Pytham nor Octhon appeals and Darwin (the obvious choice) is already taken. The rest of the dosh would go towards a truly colossal multiprocessor computer, bigger than those used for weather forecasting, simulating nuclear explosions or the EU's Human Brain Project. Distribute the object tree for some creature over its millions of cores, connect up to a state-of-the-art visualisation system and you'd be in the business of Virtual Creation. And once it could tell mice from tigers, you'd perhaps be in the business of curing human diseases too. Unfortunately the people who have this sort of cash to spend would rather spend it on one-way trips to Mars...
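For readers who never met Occam, the flavour of a self-syncing channel can be roughly imitated in present-day Python. The sketch below is an approximation built from the standard `threading` and `queue` modules, not a piece of the proposed language: the `Channel` class, the `gene` and `switch` processes, and the use of `None` as an end-of-stream marker are all conventions invented here. The property it tries to capture is that a send does not complete until the receiver has actually taken the value, so the two processes stay in step without any shared clock.

```python
import queue
import threading


class Channel:
    """Unbuffered channel: send() blocks until a receiver has the value."""

    def __init__(self):
        self._item = queue.Queue(maxsize=1)
        self._ack = queue.Queue(maxsize=1)
        self._send_lock = threading.Lock()

    def send(self, value):
        with self._send_lock:      # only one sender in flight at a time
            self._item.put(value)
            self._ack.get()        # wait for the receiver's acknowledgement

    def receive(self):
        value = self._item.get()   # blocks until a sender arrives
        self._ack.put(None)        # release the waiting sender
        return value


def gene(out: Channel, ticks: int):
    """Toy 'gene' process: emits one protein signal per tick."""
    for tick in range(ticks):
        out.send(("protein", tick))
    out.send(None)                 # end-of-stream marker for this sketch


def switch(inp: Channel, log: list):
    """Toy 'switch' process: records each signal as it arrives."""
    while (msg := inp.receive()) is not None:
        log.append(msg)


def run(ticks: int = 3) -> list:
    chan, log = Channel(), []
    workers = [
        threading.Thread(target=gene, args=(chan, ticks)),
        threading.Thread(target=switch, args=(chan, log)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return log
```

Calling `run(3)` returns the three signals in the order they were sent. Two threads hardly amount to millions of cores, but the synchronisation discipline is the same one that would have to scale.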



CICÁDAMON GO?

Dick Pountain/Idealog 265/08 August 2016 09:54

Just back from a holiday in Southern France where I spent much of my time reclining in a cheap, aluminium-framed lounger under an olive tree in the 39° heat, reading a very good book. One day I went down to my tree and found a bizarre creature sitting on the back of my lounger: it was however not any member of the Pokémon family but "merely" a humble cicada. The contrast between a grid of white polythene strips woven on an aluminium frame and the fantastic detail of the creature's eyes and wings could have been an illustration from the book I was reading, "Endless Forms Most Beautiful" by Sean B. Carroll.

Endless Forms is about Evo Devo (Evolutionary Developmental Biology), written by one of its pioneers, making it the most important popular science book since Dawkins's Blind Watchmaker, and in effect a sequel. Dawkins explained the "Modern Synthesis" of evolutionary theory with molecular biology: the structure of DNA, genes and biological information. Carroll explains how embryology was added to this mix, finally revealing the precise mechanisms via which DNA gets transcribed into the structure of actual plants and animals. It's all quite recent - the Nobel (for Medicine) was awarded only in 1995, to Wieschaus, Lewis and Nüsslein-Volhard - and Carroll's book was published in 2006. That I waited 10 years to read it was sheer cowardice, now bitterly regretted because Carroll makes mind-bendingly complex events marvellously comprehensible. And I'm writing about it here because Evo Devo is all about information, real-time computation and Nature's own 3D-printer.

I've written in previous columns about how DNA stores information, how ribosomes translate some of it (via messenger RNA) into the proteins from which we're built, and how much of it (once thought "junk") is about control rather than substance. Evo Devo reveals just what all that junk really does, and Charles Darwin should have been alive to see it. DNA does indeed encode the information to create our bodies, but genes that encode structural proteins are only a small part of it: most is encoded in switches that get set at "runtime", that is, not at fertilisation of the egg but during the course of embryonic development. These switches get flipped and flopped in real time, by proteins that are expressed only for precise time periods, in precise locations, enabling 3D structures to be built up. Imagine some staggeringly-complex meta-3D printer in which the plastic wire is itself continuously generated by another print-head, whose wire is generated by another, down to ten or more levels: and all this choreographed in real time, so that what gets printed varies from second to second. My cowardice did have some basis.

However the even more stunning conclusion of Evo Devo is that, thanks to such deep indirection and parameterisation, the entire diversity of life on this planet can be generated by a few common genes, shared by everything from bacteria to us. Biologists worried for years that there just isn't enough DNA to encode all the different shapes, sizes, colours of life via different genes, and sure enough there isn't. Instead it's encoded in relatively few common genes that generate proteins that set switches in the DNA during embryonic development, like self-modifying computer code. And evolutionary change mostly comes from mutations in these switches rather than the protein-coding genes. Life is actually like a 3D printer controlled by FPGAs (Field Programmable Gate Arrays) that are being configured by self-modifying code in real time. Those of you with experience of software or hardware design should be boggling uncontrollably by now.
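A toy model may help the boggling along. The Python sketch below is a deliberate caricature under invented assumptions: a single row of cells, a signalling protein that fades linearly from head to tail, one switch that turns a "limb" gene on wherever the signal reaches a threshold, and a second switch that fires only in the last cell where the first gene is on. The function name `develop`, the integer signal levels and the output symbols are all made up for illustration. The gene logic never changes; only the switch's threshold does.

```python
def develop(n_cells: int, threshold: int) -> str:
    """Return a one-line 'body plan' for a row of cells.

    '#' marks cells where the limb gene is on, 'O' marks the tip of the
    limb, and '.' marks cells where nothing was switched on.
    """
    # Signalling protein: strongest at the head end, weakest at the tail.
    signal = [n_cells - i for i in range(n_cells)]

    # First switch: reads the local signal in each cell during development.
    limb = [s >= threshold for s in signal]

    # Second switch: reads the output of the first, firing only at the
    # boundary between limb cells and non-limb cells.
    plan = []
    for i in range(n_cells):
        at_tip = limb[i] and (i + 1 == n_cells or not limb[i + 1])
        plan.append("O" if at_tip else "#" if limb[i] else ".")
    return "".join(plan)


print(develop(10, 5))  # prints "#####O...."
print(develop(10, 8))  # prints "##O......."
```

A "mutation" that changes nothing but the threshold of the first switch yields a shorter limb with its tip still in the right place, because the second switch takes its cue from the first rather than from fixed coordinates.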

Carroll explains, with copious examples from fruit flies to chickens, frogs and humans, how mutations in a single switch can make the difference between legs and wings, mice and chickens, geese and centipedes. He christens the set of a few hundred common body-building genes, preserved for over 500 million years, the Tool Kit; the mutable set of switches that makes up so much of DNA is the operating system for it. I'll not lie to you: his book is more challenging than Blind Watchmaker (most copies of which remain unread) but if you're at all curious about where we come from it's essential.

But please forgive me if I can't raise much excitement for Pokémon Go. Superimposing two-dimensional cartoon figures onto the real world is a bit sad when you're confronted by real cicadas. Some of these species have evolved varying hibernation cycle times to prevent predators relying on them as a food source. To imagine that our technology even approaches the amazing complexity of biological life is purest hubris, an idolatry that mistakes pictures of things for things themselves, and logic for embodied consciousness. If the urge to "collect 'em all" is irresistible, why not become an entomologist?
