Polymath at Large: DNA is part of a feedback loop

kw: book reviews, nonfiction, science, genetics, dna, history, stories

As I mentioned a couple of reviews back, I bought an eBook bundle of three books by Sam Kean. This is the third, The Violinist's Thumb: And Other Tales of Love, War, and Genius, as Written in Our Genetic Code.

Firstly, a remark on the cover art:

The cover designs for the first two books are by Will Staehle. Some might call them "busy", but I really like them. Keith Hayes designed the third cover, and I think it matches the subject very well. Kudos to some folks who are seldom recognized.

The violinist in question is Niccolò Paganini (1782-1840), who was probably the most accomplished virtuoso ever. His astonishing ability is typically attributed to the freakish flexibility of his hand and finger joints (and of all his other joints). It is worth mentioning that he also worked very hard, practicing endlessly. We find that such flexibility is both a blessing and a curse, a product of a handful of rare snippets of DNA, alleles (a better word than "mutations") that together affect the ligaments. The "curse" part is that such loose joints are frequently painful. However, the times being what they were, and Paganini having also contracted both tuberculosis and syphilis, his case is hard to diagnose two centuries after the fact. Furthermore, Paganini's thumb was notable mainly not for its flexibility but for its incredible strength. He could hold a saucer in one hand, press with the thumb, and break it.

Sam Kean likes his books to progress from the micro- to the macro-scale (in Disappearing Spoon, about the elements, each chapter had to have its own structure). The micro-scale of DNA is small indeed, and so many other authors have "gone into the weeds" with ribose sugars, bases, and hydrogen bonds that there is little need to dwell on them yet again. The lower-level material we find in this book is more about transcription, translation, and gene editing (I'd like to have seen more about gene editing, but in 2012 the details of intron removal and the various ways 20,000 "genes" can produce a few million proteins were little known, and much is still opaque on that subject).

As was brought out even more forcibly in Dueling Neurosurgeons, we learn a lot from the ways things go wrong. Paganini was one who turned a genetic handicap into a career. But when we say that a certain disorder is "in the genes", it is not always so clear-cut. Cystic fibrosis results when the CFTR gene doesn't work right, so that salt transport fails and the normally thin mucus in the lungs thickens. Other syndromes such as diabetes and cancer do not result from faulty genes, per se, but from mis-regulation: a genetic sequence may be either over-stimulated (most cancers) or under-stimulated (hypoglycemia or certain kinds of diabetes). One tragic case in Chapter 8, for example, showed that the placenta does not prevent all possible genetic transfer between mother and child.

The actual number of human genes is still disputed. So is the definition of "gene". The prior dogma was DNA→RNA→protein. This is, like, dinosaur-level out of date! One article by M. Pertea and others counts 21,306 protein-coding genes and 21,856 non-coding genes. At one time, only the first group would have been called "genes". The non-coding genes carry on regulatory functions, such as directly triggering or halting the activity of a coding gene or producing RNA that does so less directly; or they affect how introns are removed and the bits of RNA are stitched together. Some proteins can only be produced after more than 100 strings of RNA are connected (and in the right order!) from a transcription that may be much, much larger than the final, edited transcript. There are tons of things going on that we don't yet understand.

Those 43,000+ "genes" still comprise only a few percent of our DNA. About 8% (at least twice as much!) is made up of various broken virus genomes, and perhaps some that aren't so broken. Retroviruses leave workable copies of their genome in every cell they infect. We all carry many such.

How does all this go together to produce a human? or, for that matter, a fruit fly, a tiny worm, or a blue whale? The book takes a step in this direction with Chapter 8: "Love and Atavisms; what makes a mammal a mammal?" Somehow, a line of reptiles developed the placenta, using lots of virus DNA to do so. Many details are found in this chapter. The placenta has a heck of a job. It has to protect a growing baby from the immune system of its mother, and it must also protect the mother from the developing immune system of this "new resident", all the while allowing the baby to conscript a large proportion of the mother's nutrition for its own use. Many viruses are adept at avoiding or even silencing immune system counterattacks, and these capabilities are built into cells that face both ways, outward from the baby/mother boundary. But the placenta is not bullet-proof, to mix a simile. A mother who has many sons may notice that the older ones are typically more "manly" and the younger ones less so, if not effeminate (Bible readers will recall that King David, though called "mighty", was the youngest, and smallest, of eight sons). This may be due to the mother's immune system getting better at influencing the environment of the developing baby within.

And what of our immense brains? In two places, the book discusses two genes, microcephalin and aspm, that are related to brain size. Certain alleles of these genes lead to babies with no brain or a very small one, a tragic circumstance. Still more fascinating, checking the DNA clock on these genes shows that the modern form of microcephalin arose about 37,000 years ago and soon swept through the entire population of humans, and aspm did the same thing about 6,000 years ago (Bible readers who happen to accept evolution will find that intriguing, because the Biblical story of God molding Adam's body and then putting a spirit of life into him is thought to date to just 6,000 years ago).

A big lesson of the book is the author's growing realization that DNA is not destiny. Or, not usually. There is seldom a single-point "thing" that causes a trait. Even blue eyes/brown eyes are more complicated than that, and eye color is a rather simple system. The author had a DNA test done, but initially asked that the gene(s) "for" Parkinson's Disease be hidden from him, because of family history. Months or years later he reports that he realized that DNA is probabilistic, not causative. So he unlocked the locked section, and found that there was apparently no problem, but then a revision a few days later showed a "slight chance" that he might develop Parkinson's. By then he had the mental fortitude to accept the news.

I've had similar worries, because one line of my family carries Alzheimer's Disease, and another line (this is very recent news to us) carries Lewy Body Dementia. Two arrows pointed at my brain. Considering my age and general health, Alzheimer's is a no-show (Mom was afflicted beginning in her fifties), but Lewy Body shows up later, so who knows? Probability is not destiny.

It will take a long time, maybe a really, really long time, to know enough about DNA to begin to "take control". From time to time a sci-fi novel gets into this territory, and posits a future of people "engineered" to live on Mars without a spacesuit, or people with gills who can live under the sea, and so forth. We are a long, long way from learning if this is even possible without screwing up something else that underlies our humanity.

The last chapter introduces DNA as a computing mechanism. The stuff is very, very good at pattern matching. Some tests have been made, and the author describes a DNA algorithm to solve the "traveling salesman" problem, something your GPS unit has to do, and it usually does it pretty well. DNA has the potential to solve huge problems, like the salesman routing problem with 500 stops; we're talking age-of-the-universe time scales for today's supercomputers to deal with that one. However, once the "problem" is solved—it takes about a minute—you have a vat of DNA soup with "the answer" and billions or trillions of partial answers, and you are faced with winnowing out the longest chain in the whole bowl from all the others. That may also be an age-of-the-universe sized problem!
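To give a feel for why the routing problem blows up, here is a minimal Python sketch of the brute-force approach: try every ordering of the stops and keep the shortest round trip. This is my own illustration, not the DNA method the book describes, and the function name `shortest_tour` and the sample coordinates are made up for the example. With n stops there are (n-1)! orderings to check, which is where the age-of-the-universe estimates come from.

```python
from itertools import permutations
from math import dist, factorial


def shortest_tour(points):
    """Brute-force traveling salesman: try every ordering of the stops.

    `points` is a list of (x, y) tuples; the tour starts and ends at the
    first point. Only practical for a handful of stops.
    """
    start, rest = points[0], points[1:]
    best_len, best_tour = float("inf"), None
    for order in permutations(rest):
        tour = (start, *order, start)
        # Sum the straight-line distance of each leg of the round trip.
        length = sum(dist(a, b) for a, b in zip(tour, tour[1:]))
        if length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour


# Four stops at the corners of a 4-by-3 rectangle (hypothetical data).
stops = [(0, 0), (0, 3), (4, 3), (4, 0)]
length, tour = shortest_tour(stops)
print(length, tour)

# Number of orderings a brute-force search would face with 500 stops:
# print the count of decimal digits rather than the number itself.
print(len(str(factorial(499))))
```

Four stops means only six orderings, so this runs instantly. Each added stop multiplies the work by the number of stops already there, which is why nobody solves a 500-stop problem this way.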

A side note: sorting has been studied more than any other kind of computer operation. The most optimized sort method can still take a long time if you need to sort billions of items. By contrast, the "spaghetti sort" is very fast. Just produce strands of spaghetti (or a more robust sort of rigid rod), cut to length, with the key value written on each one. For a modest size sort, a few hundred strands, you can hold it in your hand and stand it on the table. Then remove the strands, longest first, and read off the key numbers on each. Now, making the strands, and reading the results can be time consuming, but the sorting operation takes a fraction of a second. To sort billions or even trillions of "strands" (presumably of very long pieces of welding rod or something), the actual sorting operation would be almost instant, but the construction of the rods, and reading the results, are still incredibly time-consuming! That's my analogy to DNA calculation.
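For the curious, the spaghetti sort can be mimicked in a few lines of Python. This is only a sketch of the idea, with a function name (`spaghetti_sort`) I made up. Note the catch: a computer has to scan all the rods to find the tallest one each time, whereas a flat hand lowered onto real rods finds it in one motion. The simulation therefore loses exactly the physical parallelism that makes the real thing fast.

```python
def spaghetti_sort(keys):
    """Simulate the spaghetti sort: return the keys longest-first.

    Each key stands for a rod cut to that length. Repeatedly pick the
    tallest remaining rod (the one a lowered hand would touch first),
    pull it out, and read off its key.
    """
    rods = list(keys)            # copy, so the caller's list is untouched
    result = []
    while rods:
        tallest = max(rods)      # the software stand-in for the lowered hand
        rods.remove(tallest)     # pull that one rod out of the bundle
        result.append(tallest)   # read off its key
    return result


print(spaghetti_sort([3, 1, 4, 1, 5, 9, 2, 6]))
# prints [9, 6, 5, 4, 3, 2, 1, 1]
```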

The book is incredibly fun to read. I like Sam Kean's writing. After catching up on books for other subjects, I may just snarf up another triple-pack of his books.

Monday, August 10, 2020
