Sunday, October 10, 2021

Give it to me straight, Doc

 kw: book reviews, nonfiction, science, publications, research, fraud, bias, negligence, hype, polemics

When I was a chemistry major, taking organic chemistry, we had a lab exercise called The Martius Yellow Competition. It's a famous experiment, designed at Harvard, in which we had to produce seven compounds, one of which was the famous dye Martius Yellow. The dye is protein-specific, making it useful for staining certain cell preparations for microscopy. That also makes it problematic if you get it on you. It stains your skin bright yellow, and the stained skin takes about a month to grow out. A few of my fellow students finished that lab day with big yellow blotches on their hands and even faces.

What is of interest here is that two of the compounds—all are crystalline solids at room temperature—are hard to crystallize. To get one of them to precipitate out of solution, we had to cool the solution on an ice bath and then scratch the bottom of the flask, carefully, with a metal spatula. The other needed exposure to UV light, which we accomplished by putting the flask on a window sill for an hour or so.

Of course, if you already have a little of the target material on hand in crystalline form, you can get a solution to crystallize quickly by seeding it with a bit of dust from a crushed crystal of the same stuff. And here our professor told us a story. One of his mentors was an elderly chemist who had a beard. Once in a while one of the old chemist's students would have a hard time getting a solution to crystallize and would call him over for advice or help. The old chemist would look at the flask, scratching his beard, and then the desired crystals would begin to form! As we heard the story, we first thought the old professor must have had poor hygiene habits. But no: our professor explained that the old fellow knew what to anticipate, and early in the morning of such a day he would put a bit of the solution on his beard and let it dry. The scratching released a few seed crystals into the air, and when any of them fell into the flask in question, crystallization began!

Such playful tricks aside, during my extended education (14 years at four universities and colleges), I learned that some scientists play fast and loose with their "science." Not all published "science" is genuine, and some of it is downright dangerous. I knew professors who were, quite simply, frauds. I don't wish to mention any names, because others have already exposed the worst of them, or their "work" has been superseded anyway. But I learned that there are several ways to get results into print even if you have no useful results to report. It's the fruit of the perverse motivation system called "Publish or Perish".

Thankfully, I don't need to get into detail, because a real scientist, Dr. Stuart Ritchie, has written a great book about how science goes wrong: Science Fictions: How Fraud, Bias, Negligence and Hype Undermine the Search for Truth. Unlike me, Dr. Ritchie does science for a living: I got the degrees, but spent almost half a century writing software for scientists without doing any science myself. Much of my value was making sure they got their math right, because I remembered all the calculus they had forgotten. I am particularly adept at statistics, so I know firsthand how easily they can be misused to cook up a result almost out of thin air.

As an illustrative aside: I am an amateur radio operator (a Ham). On one occasion, the members of a radio club I belonged to visited an amateur who specialized in moon-bounce communication. He had a steerable antenna the size of a barn door, fed by a thousand-watt transmitter. It was just barely capable of getting a signal to the Moon and hearing it when it came back. The signal was noisy and barely discernible above the background. Sometimes a Morse code "dah" would be broken up and sound like two "dits". (A dah is three times as long as a dit.) Our host told us that sometimes the signal is so buried in the noise, and you listen so hard, that you can imagine an entire conversation out of random noise.

This is relevant. Many, many published results rest on something called "statistical significance", which is judged by the p value against a "significance threshold" of 0.05. As long as the p value is less than 0.05, the result is considered "significant". It requires backward thinking to understand a p value: it is the probability that a result at least as striking as yours could have happened completely at random, if there were no real effect. You will sometimes hear this glossed as "only one chance in twenty, or less, that the conclusion you have drawn is incorrect", but that gloss is not quite what the number means.

That is not a very strict criterion. If you peruse scientific literature that includes statistical analysis, you are likely to find that most of the papers show results with a p value only a little below the threshold: 0.048, 0.04, 0.045, and so forth. Sometimes a "more robust" result will be reported, with a p value of 0.01 or even 0.005. To me, that is more like it. Because if you have thirty or so publications, all touting a p value just below 0.05, you have to say to yourself, "At least one of these is likely to be false. Maybe more than one." Then you should ask, "How can I find out which?"
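To make that one-in-twenty arithmetic concrete, here is a minimal sketch in Python (assuming NumPy and SciPy are available; the setup and numbers are my own illustration, not drawn from the book). It simulates experiments where there is nothing real to find, and then works out the chance that at least one of thirty borderline-significant findings is a fluke.

    # Minimal sketch: how often "significance" appears when nothing is there,
    # and the chance that at least one of thirty borderline results is a fluke.
    # (Illustrative only; assumes NumPy and SciPy are installed.)
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)

    # 1. Compare two groups drawn from the SAME distribution, 10,000 times.
    #    There is no real effect, yet the t-test dips below p = 0.05
    #    about one time in twenty.
    n_experiments = 10_000
    false_positives = 0
    for _ in range(n_experiments):
        group_a = rng.normal(loc=0.0, scale=1.0, size=30)
        group_b = rng.normal(loc=0.0, scale=1.0, size=30)
        _, p = stats.ttest_ind(group_a, group_b)
        if p < 0.05:
            false_positives += 1
    print(f"'Significant' results with no real effect: {false_positives / n_experiments:.1%}")

    # 2. If each of thirty independent findings carried a one-in-twenty chance
    #    of being a fluke, the chance that at least one of them is a fluke:
    print(f"At least one fluke among thirty: {1 - 0.95 ** 30:.0%}")

With these assumptions the simulation lands near 5%, and the second number comes out around 79%, which is why a shelf of papers all hovering just under the threshold deserves a skeptical second look.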

The "how" is to replicate the experiment. Some experiments aren't too hard to replicate. What if the new experiment gets a different result? You can't stop there and say "It was wrong." It requires digging deeper, and nailing it down, then finding a journal to publish your counter-article (which can be remarkably hard; see the author's first story). In Science Fictions you'll read about some of the ways one can follow up.

What is more serious is the practice of hiding results that didn't work out, often called null results. This is the File Drawer Bias. For some kinds of experiments, particularly in psychology and medicine, there may be five or ten times as many "results" sitting in the file drawer as there are published ones. We have to reckon, then, that a quarter to a half of the published reports are probably incorrect. Again, replication might be able to clear the matter up; however, doing experiments takes time and money. Our author reports a partial solution being implemented by many funding bodies, both governmental and private: they will only support an experiment if it is pre-registered and the results are guaranteed to be published. Such a system can still be gamed, but it is harder.
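Here is a rough back-of-the-envelope sketch, again in Python, of how a file drawer full of unpublished nulls can distort the published record. The input numbers (the share of tested hypotheses that are really true, the statistical power, and the 0.05 threshold) are my own illustrative assumptions, not figures from the book.

    # Back-of-the-envelope: if only "significant" results reach print, what
    # fraction of published findings are false? (Illustrative numbers only.)
    true_rate = 0.10   # assumed share of tested hypotheses that are really true
    power = 0.80       # assumed chance of detecting an effect that is really there
    alpha = 0.05       # chance of a "significant" result when nothing is there

    true_positives = true_rate * power            # real effects, found
    false_positives = (1 - true_rate) * alpha     # flukes that look like effects

    false_share = false_positives / (true_positives + false_positives)
    print(f"Share of published 'significant' findings that are false: {false_share:.0%}")

With these particular assumptions the answer comes out around a third, in the same ballpark as the fraction suggested above; change the inputs and the share moves, but the lesson that hidden nulls poison the published record stays put.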

All this shows up close to halfway into the book, where the author tackles the issue of Publication Bias. Earlier, he takes on Fraud as the most dangerous and, paradoxically, often the hardest to deal with in any timely way. A few heartbreaking stories are told, such as that of a surgeon who claimed he had perfected an artificial trachea to replace one damaged by accident or cancer. All of his patients died, usually after only a few months, yet he was protected by the institution where he worked because of his fame. Eventually it all blew up. Far too many charlatans become famous enough (on little substance!) that they are protected this way. And let us not forget the "accepted science" of the late 1700s, under which the physicians attending the retired President George Washington bled him nearly dry over what began as a bad cold, and he died.

Fortunately, we are far more likely to encounter the various kinds of bias. We are all biased. I was lucky enough to take a class in literary discernment (I don't recall its actual title). We read articles from a great many publications; some I remember were The Wall Street Journal, Commonweal, National Review, The New Republic, The London Times, and The New York Times. We learned that every writer is biased, as is every editor. We learned to determine the bias of each writer and, from multiple articles, the likely bias of the editorial board of some of the publications. We also learned how to tell whether a writer or editor is aware of the bias and has tried to mitigate it to any degree. Hint: look at the number of modifiers (adjectives and adverbs) and their "flavor" (for example, "The company reached a compromise with the plaintiff" compared to "…reached a risky compromise…" or "…reached a satisfactory compromise…", and also compared to "…barely reached…"). Honest editors remove as many modifiers as possible, keeping only those that carry their weight in meaning.

I don't know how much Negligence is a problem. The stories didn't stick with me. Hype was of greater interest. Who remembers (from 1989) Cold Fusion? Lots of hype. Eventually, a total fizzle. The first story in the "Hype" chapter of Science Fictions tells of a supposed bacterium that used arsenic instead of phosphorus in its biochemistry. It turned out to be a story of contamination in the lab, not novel biochem in the field. It seems to be accepted today for a scientist with any kind of result to issue a press release long before submitting an article for peer review and publication. Perhaps the Snake Oil guy above would fit better alongside this paragraph!

But, seriously, what, oh what, can be done about it? Dr. Ritchie knows science from the inside. His last two chapters plus the Epilogue have suggestions that look workable to me. They primarily deal with incentives. Some of the current incentives seem designed to reward bad science. The simplest example is the Publish or Perish atmosphere in which tenure is only to be had by publishing at a superhuman level. This rewards "salami slicing", in which work that has several results will be published as several small papers rather than one that links them all together.

A family proverb is the "Moses method": to change the system, get everyone out into the wilderness and wait 40 years for the older generation to die off. I hope Dr. Ritchie's suggestions can make great inroads into the mess we are presently in, and a good deal sooner than 40 years from now.

Special bonus feature:

I sometimes offer scientists I know this recipe for getting lots of good science done:

  • Do a sketchy experiment to test an outlandish hypothesis. Drag it out until you get some kind of publishable result.
  • Publish, with much fanfare.
  • Based on the publication, trawl for funding to do more experiments to "confirm" your finding.
  • Publish again; two papers if possible. Many more, if you can.
  • Produce plenty of fanfare, including "stick in your eye" statements to rile up the establishment.
  • Repeat as much as possible or until you can't get more funding.
  • Some angered scientists will publish rebuttals. Some may even try to replicate your result.
  • Answer every rebuttal, vociferously, in multiple venues if possible.
  • Publish a "synthesis" of the entire matter. Be sure to cite all your prior work. Your "citation index" gets you noticed more.

This will get a lot of scientists to work their butts off to prove you wrong. One side effect is likely to be some unexpected, good science. Then, if you want to retain a shred of reputation, publish again, "de-biasing" your results, with more modest conclusions and a bit of mea culpa about the "little bit of overreach" in which you formerly indulged.

And now, back to our regularly scheduled program. It's a great book!
