Wednesday, April 26, 2023

Mathematical models, useful and otherwise

 kw: book reviews, nonfiction, mathematics, modeling, simulation, cautions, analysis

For a significant part of my career I worked with a group of talented computer programmers in a "skunk works" at an oil company. A colleague and I made up the Modeling and Simulation sub-group among the 20 members of the group. He and I developed software that simulated the production of crude oil and natural gas from kerogen, their migration upward through rock layers, and their accumulation against a trapping layer. No model is useful until it is checked against the real world, what we called "getting ground truth". I visited several exploration offices to show off the software and to use it with data those offices had on hand.

One memorable day in Louisiana, an explorer showed me the 3D seismic survey of one area. He pointed out the most likely source rock and explained the character of other layers, so we entered the appropriate setup parameters, "pointed" the software at the survey data, and let 'er rip. It showed, over time, the filling of trapped pools as growing green blobs on a series of maps. He said, "OK, that blob is 'X' field, that one is 'Y' field,…but what is that?", pointing to a third blob between the other two. I answered, "I don't know, but I suspect it represents a lot of money." As it happened, the company had leases that covered most of the "what is that" area, but a deal had already been made to sell the leases to another company. That company made the money!

I had less exciting encounters with exploration geologists in Europe. The result was the validation of a useful model. Getting "ground truth" turned a simulation program into a tool the geologists could use to rank prospects.

Let me say right now that this tool is a million times less complex than the "general circulation models" (GCMs) used to forecast weather. Crude oil is gummy and moves slowly; air masses in the atmosphere, which are the elements of weather, move rapidly and swirl around on all scales. Oil forecasting is hard, but not as incredibly difficult as weather forecasting. So I was never faced with an irate caller complaining about "shoveling a foot of 'partly cloudy' from [his] @#&% driveway!"

Furthermore, I had the great good fortune to decide early in my career to "let the singers sing and the dancers dance": to turn over to the computer those tasks that are hardest for humans, while retaining tasks for the humans that we do better than computers. This led to very productive synergies. Far too many programmers spend years beating their heads against the wall trying to replace the human element. Futility personified.

I was delighted to read Escape From Model Land: How Mathematical Models Can Lead Us Astray and What We Can Do About It by Erica Thompson. She sets the tone early on by quoting statistician George Box: "All models are wrong, but some are useful." Those who forget to think this way, or never heard this aphorism, get stuck in Model Land.

The author continues with the observation by President Dwight Eisenhower that "Plans are useless, but planning is indispensable." The thinking behind the model, or the plan, is the great value of the exercise. I also recall the old military maxim, usually credited to Helmuth von Moltke rather than Sun Tzu, that "No battle plan survives contact with the enemy." In more peaceable pursuits, contact with "ground truth" exposes the errors of every model. It is our task to determine the tolerable level of error, for we must typically carry on anyway.

A model is a tool. It can help us understand a process, and perhaps inform the solution to a problem. BUT no model solves any problem all by itself. Even better than one model, a suite of models, built with various assumptions and focusing on different sets of driving parameters, can help us set boundaries on the range of outcomes.

It doesn't seem so long ago that the fastest supercomputer needed to run for half a day to produce a 2- or 3-day forecast for a continent-sized area. Now numerous GCMs exist, and the weather forecasters collect the output from all of them. One result is a spaghetti plot of hurricane tracks. In this image, the letter codes such as COTC represent the names of the models used. The characteristics of a spaghetti plot are used to produce the "cone of likelihood" that is often shown.
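For the curious, the cone is essentially a summary of how far apart the spaghetti strands spread at each forecast hour. Here is a little Python sketch of that idea; the model names and positions below are invented purely for illustration, and real forecast centers use far more sophisticated statistics.

# A toy summary of an ensemble of hurricane-track forecasts: the spread of
# the "spaghetti" at each forecast hour is a crude stand-in for the cone.
# All model names and positions are made up for illustration.
import statistics

# Hypothetical forecast positions (latitude, longitude) at 24-hour steps,
# one list per model.
tracks = {
    "MODL1": [(25.0, -80.0), (26.5, -82.0), (28.3, -84.1)],
    "MODL2": [(25.0, -80.0), (26.9, -81.4), (29.0, -82.8)],
    "MODL3": [(25.0, -80.0), (26.2, -82.6), (27.8, -85.0)],
}

for i, hour in enumerate((24, 48, 72)):
    lats = [track[i][0] for track in tracks.values()]
    lons = [track[i][1] for track in tracks.values()]
    print(f"+{hour}h: mean position ({statistics.mean(lats):.1f}, "
          f"{statistics.mean(lons):.1f}), spread {max(lats) - min(lats):.1f} deg "
          f"latitude by {max(lons) - min(lons):.1f} deg longitude")

The wider the spread, the less any single track should be trusted, which is exactly the point of looking at many models rather than one.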

The situation with climate modeling is far different. Escape focuses on two areas, because of current events. One is climate "change" (spoiler: it is always changing, but on a slow time scale), which is all based on modeling because we can't perform physical experiments. The second is the epidemiology of COVID-19, modeled numerous ways, and almost never properly! The disease fooled the "experts" almost daily, and the societal flailing around that resulted seems to have caused more harm than doing nothing. I called the CDC the "strategy of the week club."

Both phenomena became so intensely politicized that no actual science has been possible. A certain spokesman whose name I hate to utter said, "I am the science." Tantamount to blasphemy. Another group of mostly pundits and a few scientists bludgeoned the public with the notion of "settled science." There is no such thing, except perhaps certain portions of mathematical physics. Neither climate and weather nor epidemiology is amenable to a mathematico-physical treatment.

I am an educated layman. I went to school in an era in which we were taught critical thinking, and learned to identify bias. Putting on those hats, I can say the following, first about climate, and then about the pandemic.

1) I learned to apply the mathematics used by Arrhenius to study the Greenhouse Effect before I was in high school. The simplest model of the atmospheric response to sunlight with respect to CO2 has four spectral regions:

  1. The Ultraviolet-Visible-Near Infrared region: wavelengths that are not affected by CO2.
  2. Three narrow bands of Medium Infrared in the range 2.5µ-4.5µ, one of which is fully overlapped by an absorption band of water vapor. These have little warming effect, but they are well positioned for optical CO2 detectors.
  3. A moderately wide band of Longwave Infrared absorption by CO2, centered at about 14µ; the amount of absorption depends on the concentration. This is the "thermal IR" band of interest.
  4. The rest of the Infrared spectrum, Far-Infrared and so on; it is not affected by CO2.

Within region #3 there is a variable level of absorptivity, but once the concentration of CO2 reaches 0.2% (2,000 ppm, comparable to the level during the age of the dinosaurs), the "carbon dioxide window" is effectively "closed". At that point, within that wavelength region, about half of the infrared radiation from the warm ground is absorbed by the atmosphere and is reradiated, half to outer space, and half back down. At a specific temperature a balance is achieved. That temperature is 4°C warmer than the average global temperature in the year 1900. Today, with 400 ppm, we're at 2°C, and it will take much more than another 400 ppm to push into that +4°C region. The relationship is not linear.
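For anyone who wants to see that non-linearity in numbers, a widely used simplification says the radiative forcing grows with the logarithm of concentration, roughly ΔF ≈ 5.35 × ln(C/C0) watts per square meter. The little Python sketch below is only an illustration I am adding, not the source of the temperature figures above; its point is that each doubling of CO2 adds about the same increment, which is why the effect saturates.

# The logarithmic forcing approximation dF = 5.35 * ln(C/C0) W/m^2, shown
# only to illustrate the saturation ("closing window") behavior; it is not
# the calculation behind the temperature figures quoted in the text.
import math

C0 = 280.0  # pre-industrial CO2 concentration, ppm

def forcing(c_ppm):
    """Approximate radiative forcing (W/m^2) relative to the C0 baseline."""
    return 5.35 * math.log(c_ppm / C0)

previous = 0.0
for c in (280, 560, 1120, 2240):
    f = forcing(c)
    print(f"{c:5d} ppm: forcing {f:5.2f} W/m^2 (this doubling added {f - previous:4.2f})")
    previous = f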

What detailed computer modeling can do is to show where the warming is greater, and where it is less. We've been hearing for years that the warming is greater in the polar regions and less in the tropics. The general picture is 6°-7°C warming around the poles and less than 3°C warming in the tropics, when CO2 concentration exceeds 1,000 ppm. At this point the "window" is mostly closed already; extra warming greater than 1°C is unlikely.

The above discussion means that warnings about deadly heat waves in the tropics are overblown. On the other hand, we can expect some frozen polar areas to thaw. One effect I haven't heard the slightest discussion about is that Siberia, northern Canada, and the southern part of South America could be the next breadbaskets. Will the Sahara and Mojave/Sonoran deserts, in Africa and North America, respectively, get even drier and hotter? The computer models are inconsistent. Fretful silence on these questions reflects the uncertainty.

2) The situation of the COVID-19 pandemic, and the incredible array of opinion/ideology presented as "science" is a stunning spectacle. Roughly half the adults in the U.S. think that the crisis was exploited to the hilt for political purposes, partly to remove Donald Trump from office and even more to increase the scope of totalitarian control on the part of the Left. Meanwhile, the other half are thrilled that Trump is out of office, but ambivalent about J.R. Biden's performance.

If there has been any serious modeling of the epidemiology of the C19 virus, I haven't seen it. I've seen numerous toy models presented, followed by lots of screaming to "follow the science". When it became evident that actual science contradicted what the screamers were saying, they took up new mantras about "protecting Democracy" (which really means protecting political power for Democrats). Sadly, I still see people walking alone in near-isolation, wearing a bandanna or cheap mask or, if they have an actual N95 or KN95 mask, wearing it below the nose. Firstly, they are insane to wear the mask at all, and secondly, the "face covering" they are using is not effective. Close to 0%. Nearly everyone who caught C19 after mid-2020 was wearing a mask when they caught it.

There is one and only one valid reason to wear a mask outside, anywhere there is no crowd: To keep the sun off one's face. My wife does this. She wears a mask to keep her cheeks from getting burnt when doing yard work. Never any other time!

There are three simple models that can be used to understand the risks of contracting the COVID-19 virus, SARS-CoV-2, when outside, with or without a mask. Firstly, except in very humid weather, the virus aerosolizes rapidly. The tiny droplets that a mask would stop evaporate completely in just a few minutes. You can look up the formula (the first "model") to calculate how long it takes a droplet of size 1µ or 5µ to evaporate, at different levels of humidity. That means that the virus particles, which have a diameter of about 120nm (0.12µ), are what your mask has to stop. This introduces the second model.
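If you want to play with that first "model" yourself, the usual textbook form is the "d-squared law": the square of the droplet diameter shrinks linearly with time, so the lifetime is d0²/K. The constant K depends strongly on temperature and humidity, so in this Python sketch I leave it as an input rather than guess a value; K is exactly the number you would look up.

# The "d-squared law" for droplet evaporation: d(t)^2 = d0^2 - K*t, so a
# droplet of initial diameter d0 is gone at t = d0^2 / K. K depends on
# temperature and relative humidity and is left as an input here.

def evaporation_time(d0_microns, k_microns2_per_s):
    """Time (seconds) for a droplet of initial diameter d0 to evaporate fully."""
    return d0_microns ** 2 / k_microns2_per_s

# Whatever K turns out to be, a 5-micron droplet lasts 25 times as long as a
# 1-micron droplet, because the lifetime scales with the diameter squared.
ratio = evaporation_time(5.0, 1.0) / evaporation_time(1.0, 1.0)
print(f"A 5 micron droplet outlives a 1 micron droplet by a factor of {ratio:.0f}")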

An N95 mask is called that because it catches 95% of particles (virus or otherwise) in the size range near 300nm, where the mask is least effective. It is very nearly 100% effective for larger particles (which are caught mechanically) and smaller particles (which are caught electrostatically). Particles in the 120nm range are caught electrostatically with an efficiency near 97%. Think a moment. If there are few viruses about, only 3% of them will get through the mask, if you wear it correctly. But suppose you enter a very crowded area that includes perhaps half a dozen folks who are coughing out C19 particles. Then, 3% of that viral load may well be enough for you to be infected. It is a numbers game.
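Here is a toy version of that numbers game in Python. Only the 97% capture figure comes from the discussion above; the exposure numbers are invented round values, meant only to show how the arithmetic scales with the ambient viral load.

# A back-of-the-envelope sketch of the "numbers game": the expected number of
# particles getting through is the ambient load times the 3% penetration that
# remains after 97% electrostatic capture. Exposure figures are illustrative.

PENETRATION = 0.03  # fraction of ~120 nm particles that slip past a properly worn N95

def particles_through_mask(ambient_particles, penetration=PENETRATION):
    """Expected number of virus particles that make it through the mask."""
    return ambient_particles * penetration

for scenario, load in [("nearly empty sidewalk", 100),
                       ("crowded room with several people coughing", 1_000_000)]:
    print(f"{scenario}: roughly {particles_through_mask(load):,.0f} particles get through")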

Thirdly, during the daytime the C19 virus is about twice as susceptible to being disabled by solar ultraviolet as the Ebola virus. I worked out the numbers: Between 10 AM and 2 PM solar time, 90% of virus particles exposed to sunlight are inactivated within about 45 minutes. During the next 45 minutes, 90% of whatever is left is inactivated, and so forth. It's a statistical function of how long it takes before a UV photon strikes a particular virus particle in a vulnerable spot.
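Put as a formula, a tenfold (90%) reduction every 45 minutes means the surviving fraction after t minutes is 0.1^(t/45). A tiny Python sketch, using only the figures worked out above:

# Exponential inactivation in midday sun: 90% of exposed virus is inactivated
# every 45 minutes, so the surviving fraction after t minutes is 0.1**(t/45).

D_MINUTES = 45.0  # time for a tenfold (90%) reduction, from the estimate above

def surviving_fraction(t_minutes, d=D_MINUTES):
    """Fraction of sunlit virus particles still infectious after t minutes."""
    return 0.1 ** (t_minutes / d)

for t in (45, 90, 135, 180):
    print(f"after {t:3d} minutes in midday sun: {surviving_fraction(t):.4%} remain")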

Now we can pull back from my cogitations and look at the book's conclusions. The main problem with any model is the person who uses it. A model will give definite results, but it is easy to forget that those results pertain to the model, not to the system being modeled. They may be close, or they may not. But properly used, a model helps you think about a system of interest. It can't decide for you! Letting models do the deciding is always, always, a travesty.

What does the author suggest? How can we escape from Model Land? Five points form the meat of the last chapter:

  1. Define the purpose. That purpose better not be "decide for me." The purpose has to include capturing the relationship between all the relevant parameters and the results. All...but learn what to neglect.
  2. Don't say, "I don't know". Ask, "What do I know now that I didn't know before?"
  3. Make value judgments. Every model reflects someone's values. Make sure the values behind the structure of a model are the right ones. Remember: genuine science is value-free. YOU supply the values.
  4. Write about the real world. Bring the model's conclusions into the real world, by getting "ground truth", for example.
  5. Use many models. Consider the spaghetti plot of the hurricane, shown above. If the system is "too simple" for multiple models to be generated, perhaps it is simple enough to comprehend without mathematical modeling.

It takes work just to understand this list. It is worth it! It takes much more work to carry out a modeling exercise that doesn't trap you in Model Land without an exit. I used a Jumping Ship metaphor at the start of this piece to show that sometimes we need to get far away, to seek a really different perspective. This is an extension of Principle #5 above.

Finally, consider this: You have two eyes for a reason. It is not just for parallax, to find out the 3D aspect of the situation. It is for completeness. Except in very simple views, your right eye will see something your left eye cannot, and vice versa. Even more, if you have someone standing near you, each of you will see things the other does not, particularly if you're looking in different directions. Using a numerical or mathematical model in too simple a manner, and yielding too much authority to it, is like viewing a complex scene using one eye from one viewpoint. I leave you with this proverb: "If two people have exactly the same opinion about everything, one of them is redundant."

This book is a great read!
