Wednesday, February 19, 2025

Book as Blog

 kw: book reviews, nonfiction, essays, science, blogs

I think mushrooms are lovely. I tried collecting them when I was about ten, and soon found that they quickly rotted. Not having access to formaldehyde or pickling alcohol, nor a supply of large jars (Mom was definitely not keen on letting me use her canning jars!), I gave up the pursuit and collected less perishable items, such as stamps and rocks. Mushrooms can also be dangerous. Although only about one percent of fungus species are toxic, a handful of those, mostly of the genus Amanita, are deadly in even small amounts. 

This picture, generated by Imagen 3 (Gemini), is based on the appearance of Amanita phalloides, the "death cap" mushroom, but in fanciful colors. Gemini wrote the haiku and I chose the typeface.

Toxic mushrooms are included in a sort of rogues' gallery of harmful things in nature, in the essay "But It's Natural!" in the book Superfoods, Silkworms, and Spandex: Science and Pseudoscience in Everyday Life by Dr. Joe Schwarcz. The appellation "natural" is applied to a great many things, particularly foods. As the author points out, the term is meaningless and usually misleading. To quote a few sentences from this essay,

Take pollen from one flower, sprinkle it on a different type of flower, and a new variety of flower emerges. Is that flower natural? It would not have been produced if a human hand had not intervened. But isn't the hand also natural?

"Natural" has become a marketing term, as has the term "Superfoods", the subject of the essay "Superfoods and Superhype", a valuable survey of the ways folks with something to sell attempt to mislead the public. His earlier book, Quack Quack: The Threat of Pseudoscience, digs much deeper into the phenomenon, particularly in medicine and chemistry. Willow bark contains a natural analgesic, but it harms the stomach; Aspirin is a chemical derivative that works better and is less harmful. The "natural" is worse than the "artificial." (Image via Dall-E3)

This book has no chapters. The text consists of 75 essays strung together like posts in a blog, with only a headline/title separating them. In the introductory pages we find a list of 18 other books by Dr. Schwarcz, and I found it interesting that nine of the titles contain the phrase "Science of Everyday Life" (this book makes the tenth), four contain "Chemistry of Everyday Life", six contain the word "Commentaries", and three contain the word "Inquiries". Ten titles also include a number, as in "…62 All-New Commentaries…". I suspect that he has a blog to which he has contributed for one or more decades, in addition to weekly columns in at least two newspapers. Preparing a book, then, is not so much a matter of writing anything new as of compiling from such sources. Just for context: this (my) blog has nearly 3,000 posts written over the past 20 years, and I have other writings, so perhaps I could get a few publications together; it's a thought…

Dr. Schwarcz has a mission: to combat errors in scientific reporting. These days the term "misinformation" is bandied about, but it has unfortunate political overtones. In fact, he has apparently fallen afoul of establishment misinformation regarding treatments for COVID-19. Such treatments became political footballs. The author writes, "Dr. Vladimir Zelenko's claim of having successfully treated thousands of COVID-19 patients using hydroxychloroquine (HCQ), azithromycin, and zinc sulfate has been widely disputed." Yes, it has, primarily by "official" voices that have since been discredited. I must comment further.

HCQ and Ivermectin were both widely disparaged, but only after President Trump mentioned them. Suddenly, two extremely safe medications were declared "risky", "dangerous", even "deadly", and doctors who had the temerity to prescribe either one were threatened with the loss of their licenses. I know a few archaeologists, including a relative of mine. Whenever they go into the jungles of Central America, they take HCQ to prevent malaria. Another close relative of mine contracted worms from eating sushi and was treated with Ivermectin, quite successfully and safely I might add. It isn't just a "horse dewormer", as some declared; it is a "human dewormer" also. We must understand: neither of these medications has any action against the SARS-CoV-2 virus. Rather, they modulate the immune system so as to prevent pneumonia from developing. HCQ works best very early, during the first two to three days after symptoms appear, while Ivermectin works best over a longer term. This is not medical advice; I am repeating observations by honest doctors. If you need to know more, or need treatment, find an honest doctor, if there are any in your area. Secondarily, because the mucus that develops in the lungs during pneumonia is made from glucose, low blood sugar is preventive. Therefore, at the first sign of illness, fast for a couple of days. The old adage "Feed a cold, starve a fever" is about preventing pneumonia caused by respiratory diseases accompanied by fever; it was good advice in Colonial times, and it is still good advice.

Bottom line: Doctor Zelenko was right. Call him Jeremiah: never wrong, never believed.

The author is a chemist, so naturally, a good number of the essays relate to chemistry, usually as brief historical surveys that emphasize the accidental or surprise nature of discoveries. "Graphene!" is one such. Something was observed a few times over many years as various researchers "messed with" graphite, but graphene was not identified until 1962. This single-layer substance made only of carbon atoms has great potential. It is tricky to produce in quantity, very hard to produce in sheets big enough to do anything with, and potentially toxic when nano-sized, as it so often is. The illustration (I used ImageFX) evokes the way graphene sheets jumble atop one another when graphite is disturbed.

The essays range far and wide, however: from the origins of "duck tape" or "duct tape" (whether the "duck" fabric came first, or the use of such tape on ductwork, isn't clear), to the "tin" in "Tin Pan Alley", to a more-balanced-than-usual survey of "forever chemicals", the perfluorocarbon chemicals used to make Teflon® and related products, and the substances that arise as side products of such manufacture. And a whole lot more. Seventy-five great essays!

Friday, February 14, 2025

For AI art, word order matters, but less than I expected

 kw: art generation, ai art, experiments, word order, semantics

Continuing my Troglodyte series of generated images, I began rewriting the long prompts I'd been using, putting the main description of the room first, followed by a phrase about the cave, followed by other items and details. I noticed after a while that it had become harder to elicit images with lots of cave decoration (stalactites, etc.). I began to wonder whether the elements of a prompt were somehow treated like the ingredients list on a cereal box, ordered by quantity (or by importance in this application). For example, here is a prompt I used a year ago for "Cave Dining Room":

A room in a spectacular cave that has many stalactites and stalagmites, with flowstone on the room's walls, fitted out as a grand dining room with a long table and at least twelve chairs, a chandelier over the table, and a buffet stand nearby

This is the prompt I used in the past few days:

A dining room in a spectacular natural cave with stalactites and stalagmites and flowstone, with seating for twenty or more, with buffet to the side and chandeliers from the ceiling, plus a grandfather clock and a large pantry, and the floor is natural stone with a patterned rug under the table

Here are the resulting images, both from Leonardo AI. Note that the settings were not exactly the same, but here my focus is on the difference in the amount of cave decoration.



The upper image is a little more "cavey"; the lower image is the best of a dozen or more attempts to get the feel I wanted. Some of the images had ceilings that looked more like tangled tree roots, others were almost smooth, though rounded and arched.

In some settings Leonardo AI has an option for "AI Enhancement" of the prompt. It also implements the Style and other settings by modifying the prompt internally. I could only get an inkling of the latter phenomenon by saving an image, because about 50 characters of the prompt actually used are included in the file name. I say "an inkling" because an AI-enhanced prompt balloons to a few hundred characters. From what I can see of such prompts, enhancement consists primarily of added adjectives and sometimes adjective-noun groups (such as the phrase "vibrant cinematic photo").

I designed an experiment to see how much word order matters when a prompt consists entirely of nouns:

meadow, mountains, flowers, butterflies, birds

In Leonardo AI's Classic Mode (its other mode is Flow, which I'll say more about in another post), I first used the Flux Schnell model and the Creative style, 16x9 aspect ratio, small image size (1184x672), with a fixed seed of 142857. When I downloaded the image, the file name was

Flux_Schnell_a_surreal_and_vibrant_cinematic_photo_of_meadow_m_0.jpg

Thus, the prompt had been enhanced because of the Creative style. The program also appends a number so it can distinguish repeated uses of a prompt.

To keep to the bare 5-word prompt only I switched the style to None. Here are the two resulting images, full size (1184x672), enhanced prompt above, 5-word prompt below:


The images are very similar, with some interesting differences. The upper (enhanced) one has no birds; of the five birds in the lower (ordinary) one, the two birds at upper left replaced ambiguous-looking butterflies and a butterfly at far left appears to be a bird-butterfly hybrid. The trees are similar, but the mountains have certain differences, and the enhanced image appears more stormy or foggy. Take note of the yellow flower at bottom center. It is one of several persistent elements from image to image, with only one significant variation I'll point out later on.

The next two images have "rotated" prompts, first "mountains, flowers, butterflies, birds, meadow" and then "flowers, butterflies, birds, meadow, mountains".
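Rotations like these are easy to generate mechanically. Here is a short Python sketch (my own, not part of any generator program) that produces all five rotated prompts from the base word list:

```python
# Rotate the five-noun prompt, one word at a time
words = ["meadow", "mountains", "flowers", "butterflies", "birds"]

rotations = []
for i in range(len(words)):
    rotated = words[i:] + words[:i]  # move the first i words to the end
    rotations.append(", ".join(rotated))

for prompt in rotations:
    print(prompt)
```

The second and third entries it prints are exactly the two rotated prompts quoted above.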


In each image there are three birds flying in the distance, and in the upper image in particular, a couple of rather ambiguous flying things. You may have noticed that all the butterflies are Monarchs. A significant change from the upper to the lower image is the bokeh (out-of-focus look) in the distance below, whereas the mountains are sharp above. The next two images are from the next two rotated prompts, "butterflies, birds, meadow, mountains, flowers" and "birds, meadow, mountains, flowers, butterflies".


The upper image has no clearly-defined birds. The five butterflies at the top in a cluster are all distorted, having either birdlike aspects or extra wings. The upper image also has out-of-focus mountains and trees, but in the lower image everything is in focus, plus there are added trees to the left. That is all the prompt rotations.

As a further experiment I added a few words to the prompt, and then as a last experiment, added more words to make it a descriptive phrase. The two prompts are:

birds in the foreground, meadow, mountains, flowers, butterflies

birds in the foreground of a mountain meadow with flowers and butterflies


In each image, a single bird is in the foreground, as requested, but not "birds". Nothing is in the sky; all the butterflies are also in the foreground. The big yellow flower has been replaced in the upper image by three smaller flowers. Both images have significant bokeh, in both the background and the immediate foreground. The trees and mountains also differ a bit more from those in the prior six images, more so than those six differ among themselves.

Note that all of these use the same seed: 142857. It's a favorite number of mine, being the repeating unit of the decimal expansion of 1/7. A couple of times I tested seed consistency by repeating the generation without changing anything. So far as I could tell, the images were pixel-by-pixel identical. So the only thing available to cause differences between the images would be differences in the prompt.
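For anyone curious about 142857, a few lines of Python confirm its charm (this is just my aside, nothing to do with the generator programs):

```python
from decimal import Decimal, getcontext

# Show the repeating decimal expansion of 1/7
getcontext().prec = 30
one_seventh = Decimal(1) / Decimal(7)
print(one_seventh)  # 0.142857142857...

# 142857 is "cyclic": multiplying by 1 through 6 just permutes its digits
for n in range(1, 7):
    print(n, 142857 * n)
```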

I don't know how a prompt is turned into "tokens", which are numbers that represent conceptual entries in a database of "meanings", as "meanings" might be understood in the context of generative AI. The order in which they occur clearly matters, but not by a great deal. Adding directive words such as "in the foreground", and "glue words" (the little articles, conjunctions, and prepositions between the nouns), also made a difference; otherwise the last two images would be identical.

I don't know why some images are in focus throughout, while others have various levels of bokeh in the background and/or immediate foreground.

I have learned a couple of things, and uncovered further mysteries about art generation. The adventure continues!

Wednesday, February 12, 2025

Music for another planet

 kw: book reviews, nonfiction, music, pop music, musical revolutions, biography

You just gotta know your own limitations. I just read Rob Sheffield's new book Heartbreak is the National Anthem: How Taylor Swift Reinvented Pop Music. Ms Swift is a genius, a phenomenon, an incredible performer…with an oeuvre that leaves me unmoved.

Mr. Sheffield's semi-biography, full of inside allusions and abbreviations (most of which went right past me), is written for her fans. About halfway through I realized that he has mastered the art of writing in a female voice. That makes him quite a phenomenon also. It ensures a connection with Taylor Swift's primary fan base.

To be banal about it: The book outlines the career of Taylor Swift, which now spans 19 years, and in particular, all the swerves and redirections she has made to keep it going and rising. It isn't stated in so many words, but she knows keenly just how short the attention span of the music-consuming public is. Apparently, she is delighted to oblige, and she manages to pivot before anyone else notices the need to pivot.

The last time I went to a Pop music concert, it was to see Peter, Paul and Mary in 1971. If I listen to music without just letting it be a background for something else I am doing, it is to learn to sing it. (This doesn't take long: when I was in high school I first heard Bob Dylan sing "The Times They Are a-Changin'" on the radio during a family road trip, with my Dad driving. I grabbed some paper and wrote out the lyrics on the spot.) Thus, reading and hearing about people who spend hours and hours just listening, sometimes over and over, to a favorite album, I feel they and I inhabit different planets.

That's OK. I liked the book. It let me peek in a window to learn about a remarkable person. I think her career will continue to rise (ignore the hype about that little hiccup at S.B. LIX). 

Monday, February 10, 2025

Why we can't see an atomic nucleus

 kw: speculative musings, calculations, photon energies, resolution

Looking at a small, old photograph made with a Brownie box camera in about 1950, I asked my father, "Can we take a picture of this picture and blow it up real big, so it is clearer?" He said, "The grain of the film would be magnified, not the details in the original scene." A few years later, using a slightly-better-than-a-toy microscope, I designed an "expander", a device with a couple of lenses that attached above the eyepiece to enlarge the image even more. I wanted to see things at 2,000X, and the microscope was limited to 400X (which is pretty good, actually). To my surprise, while the image was indeed five times as large, no more detail was visible. It was just fuzzier.

With more time and study I learned that the maximum practical magnification of an optical microscope is 800X, for someone with normal vision. Using higher magnification (up to 1,600X is available with many higher quality microscopes) makes it a little easier to see the details that are barely visible at 800X, but doesn't bring more details into view. Why is this? It comes down to the nature of light. The basic concept is that you need a "probe" smaller than the finest detail you want to see. To see bacteria, light is good enough, but to see smaller details, something smaller is needed. An atom is 10,000 times smaller than a small bacterial cell, and atomic nuclei are 100 million times smaller. We'll go step by step.

The visible spectrum is a narrow range of wavelengths of light, and you may have read that the range is from 400 nanometers (nm) at the blue end to 700 nm at the red end. These numbers are rounded off; for most people, the deepest violet (or "bluest blue") that is visible has a wavelength near 380 nm, and the "reddest red" visible is at about 750 nm. Our color vision is most sensitive to green light with a wavelength near 555 nm. This is used to calculate the highest practical magnification of microscopes. Using a bluish (sky blue, not deep blue) filter with an optical microscope reduces the fuzziness induced by longer wavelength light: red and orange are not eliminated, but reduced, making the image a bit clearer.

Articles abound regarding the ultimate resolution of microscope (and telescope) optics. Without getting into lots of equations, we can jump to the conclusion that a practical limit for most optical systems is about equal to half the wavelength of the illumination. Thus, in visible light, objects smaller than about 277 nm, or 1/3600th of a millimeter, cannot be discerned.
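A quick back-of-the-envelope check in Python (my own figuring, using the 555 nm sensitivity peak from above):

```python
# Practical resolution limit: about half the illumination wavelength
wavelength_nm = 555
limit_nm = wavelength_nm / 2            # 277.5 nm
fraction_of_mm = 1_000_000 / limit_nm   # how many such spans fit in 1 mm
print(limit_nm, round(fraction_of_mm))  # 277.5, about 1/3604 of a mm
```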

A second critical factor is the resolution of the unaided eye ("naked eye"). The definition of 20/20 vision is based on optical resolution of one arcminute, or 1/60th of a degree. This is sometimes stated as the ability to see lines spaced 1/10th mm apart at "reading distance"; if you do some figuring, that means "reading distance" is about 13.5 inches or 340 mm. I just held a book where I normally do for reading and measured 14 inches, so that's about right.
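The "figuring" takes only a couple of lines (again my own sketch):

```python
import math

# One arcminute subtends 0.1 mm at what viewing distance?
arcmin = math.radians(1 / 60)
distance_mm = 0.1 / math.tan(arcmin)
print(distance_mm, distance_mm / 25.4)  # ~343.8 mm, ~13.5 inches
```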

The ratio of 1/3600 to 1/10 is 360. How, then, can a microscope have useful magnification as high as 800X? Most microscope objective lenses are "stand-off" lenses, which sit a little farther from the subject than the diameter of their front lens. Special lens arrangements with a front lens at least twice as large as the lens-to-subject distance, used with oil between the lens and the object, can more than double the resolution, so the effective ratio gets into the range of 720 to 800. This is adequate to see bacteria such as E. coli, which are about 1/1000 mm in diameter and 2-3 thousandths of a mm long. At 800X, they would appear 0.8 mm in diameter and around 2 mm long "at reading distance". This is as far as an optical microscope can take us.
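Checking the apparent size of E. coli at 800X (my own arithmetic, using the dimensions above):

```python
# E. coli dimensions in mm, magnified 800X
diameter_mm = 0.001   # about 1/1000 mm across
length_mm = 0.0025    # 2-3 thousandths of a mm long
magnification = 800

apparent_diameter = diameter_mm * magnification  # 0.8 mm
apparent_length = length_mm * magnification      # 2.0 mm
print(apparent_diameter, apparent_length)
```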

My original question was about seeing an atomic nucleus. First let's take a step in that direction and consider seeing atoms. A typical atom has a size of one or two tenths of a nanometer. That's about 1/10,000th the size of an E. coli bacterial cell. To see atoms we need a probe with a wavelength of a tenth of a nanometer, or less; less is better. What kind of light has such a short wavelength?

To go further, we need an equation that relates photon energy to its wavelength. The standard form of this equation is

E = hc/λ

The symbols are

  • E is photon (or other particle) energy
  • h is Planck's constant (an extremely small number)
  • c is the speed of light (an extremely large number)
  • λ is the wavelength

Combining h and c in units consistent with energy in electron-Volts and wavelength in nm, this equation simplifies to

E = 1239.8/λ, and equivalently, λ = 1239.8/E

For our use, we can round 1239.8 to 1240. The electron-Volt, or eV, is a useful energy unit for dealing with light and with accelerated particles, such as the electrons we will consider momentarily. We find that blue light photons at 400 nm have an energy of 3.1 eV and red light photons at 700 nm have an energy of 1.77 eV. That is enough energy to stimulate the molecules in the retina of the eye without doing damage. Shorter wavelengths, with higher energy, such as ultraviolet, are damaging, which is a good reason to wear sunglasses outside.
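The conversion is easy to script. Here is a small Python sketch of the rounded formula (the helper names are my own):

```python
HC_EV_NM = 1240.0  # h*c, rounded, in eV*nm

def photon_energy_ev(wavelength_nm):
    """Photon energy in eV for a wavelength in nm."""
    return HC_EV_NM / wavelength_nm

def photon_wavelength_nm(energy_ev):
    """Photon wavelength in nm for an energy in eV."""
    return HC_EV_NM / energy_ev

print(photon_energy_ev(400))  # 3.1 eV, blue
print(photon_energy_ev(700))  # ~1.77 eV, red
```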

What is the energy of a photon with a wavelength of 0.1 nm? It is 12,400 eV, often written 12.4 keV. These are X-rays. The trouble with X-rays is that they are very hard to focus. Fortunately, because wave-particle duality applies to matter particles as well as to photons, the electron microscope was invented. Electrons can easily be formed into a beam and focused. Over time, electron microscopes were improved to the point that atoms can now be imaged directly.

This image was made with an electron microscope using an electron energy of 300,000 eV, or a wavelength of about 0.004 nm, or 4 picometers (pm). It shows a grain boundary between two silicon crystals. A technical detail: this is not a scanning electron microscope, which relies on electrons bouncing off the subject. This is transmission electron microscopy, or TEM, with the subject cut and polished to extreme thinness. The electrons that do bounce off scatter in all directions, and also knock electrons loose from the silicon, which makes for a messy image, but electrons that pass through ("transmit") form a good image, so TEM is best for this application.

On your screen the shortest distance between atoms should appear to be a few mm. Now consider: the nuclei of these atoms are 10,000 times smaller. What kind of probe can reach inside an atom to see the nucleus?

The wavelength of the electrons in this instrument, as noted, is 4 pm, while the size of an atomic nucleus is measured in femtometers (fm); 4 pm = 4,000 fm. That's much too big. A particle with a wavelength of 1 fm has to be very energetic indeed: 1,240,000,000 eV, or 1.24 GeV (1 GeV is one billion eV). The electron microscope that made the image above cost a couple of million dollars and is five feet high. Much of that size is needed to keep the very high voltage from sparking over and shorting out, not to mention probably killing the operator. A billion volts is like lightning; it can jump for miles! A different kind of apparatus is needed.
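The 1.24 GeV figure follows directly from the same rounded formula, with 1 fm expressed in nm (my own sketch):

```python
HC_EV_NM = 1240.0   # h*c, rounded, in eV*nm
fm_in_nm = 1e-6     # 1 femtometer = one millionth of a nanometer

energy_ev = HC_EV_NM / fm_in_nm
print(energy_ev / 1e9)  # 1.24 GeV
```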

Scientists probe the nucleus with large particle accelerators that cost many millions, or billions, of dollars, euros, whatever. They seldom use electrons. The electron is a light particle, and boosting it to an energy greater than a billion eV is hard; the maximum achieved per beam is around 100 GeV, at CERN's former LEP collider. Using heavier particles, usually protons, works better. However, as the image at the top of this post implies, smacking a really high-energy particle against a nucleus knocks off all kinds of stuff. That is really not conducive to making a "photograph" of a nucleus. Big accelerators such as the Large Hadron Collider in France and Switzerland have huge detectors to gather the spray of "stuff" that results from smacking nuclei really hard, with particles accelerated to extreme energies: 6.5 TeV (trillion eV) so far. (A detail: both beams are accelerated, in opposite directions, so the combined collision energy is 13 TeV.)

This all illustrates a critical point: the smaller the things you want to see, the more energy is needed to do so, and the more it will cost. To see really small things, right down to the level of the Planck length, which is about 16 trillionths of a trillionth of a trillionth of a meter (1.6×10⁻³⁵ m), would require enough energy to create another Universe. I don't think it is wise to play with such energies!

Sunday, February 09, 2025

Binary clock concept

 kw: ai experiments, binary time, binary clocks, art generation, simulated intelligence

Near the end of 2024 I was thinking about what clock faces would look like if a culture had a binary concept of time. That is, an 8- or 16- or 32-"hour" day, based on, for example, 32 divisions per "hour" and 32 further subdivisions. Everything based on powers of two. I decided to have several art generating programs attempt to draw a clock dial ("face") with sixteen symbols on it…with no success! Every program stuck to 12- or 24-hour format, with one exception.

This image generated by ImageFX (IFX hereafter) shows a dial with 13 items that look like jewels. Their spacing is rather uneven. Other scales around the dial have as many as 21 symbols, also not evenly spaced. If nothing else, IFX is good at producing unique symbols.

I had in mind a culture, perhaps on another planet, that had no contact with our Babylonian 24-60-60 scheme. They developed a number system based on binary digits. Perhaps their "hands" have four or eight "fingers".

I learned to think in sixteens when I had to write a lot of computer code for two different operating systems. One was based on Octal digits (0-7), and another on the much more common Hexadecimal digits (0,1,2…8,9,A,B,C,D,E,F). My colleagues and I were adept at thinking in 8's and 16's. 

Let's consider a 16-hour daytime and 16-hour nighttime. And let's use words from another language to get away from the English terms:

  • The planet rotates in one nichi composed of 32 jikan.
  • One jikan contains 32 bu.
  • One bu contains 32 byoh.

This language has no inflections for pluralization. 32x32x32 means that one nichi is divided into 32,768 byoh. If the inhabitants are of similar size to humans, perhaps their hearts beat at about the same rate, roughly once per second; if one byoh is about one heartbeat, that implies a nichi about 40% as long as an Earth day. If instead this is intended to pertain to a culture on Earth, the byoh would be about 2.64 seconds.
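The arithmetic of the scheme, in a few lines of Python (my own sketch of the made-up units):

```python
# A nichi is divided into 32 jikan, each of 32 bu, each of 32 byoh
jikan_per_nichi = 32
bu_per_jikan = 32
byoh_per_bu = 32

byoh_per_nichi = jikan_per_nichi * bu_per_jikan * byoh_per_bu
print(byoh_per_nichi)  # 32768

# If a nichi were as long as an Earth day:
earth_day_s = 24 * 60 * 60
print(earth_day_s / byoh_per_nichi)  # about 2.64 seconds per byoh
```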

This is another of the images IFX offered up. The outer ring has 32 symbols, spaced evenly, so perhaps this symbol set could serve for a full-nichi dial. However, the next ring in has 19 symbols. Perhaps we can posit that the planet's orbit is divided into month-like periods, let's call them tsuki, of 19 nichi each. That's unlikely in a binary-based culture. Ditto for the outer ring of 28 "petals". And I am not sure what to do with the weird spiral. Clearly, I am not going to get far using SI (Simulated Intelligence), at least not yet.

By the way, look carefully at the two gears below the spiral. The teeth don't mesh. IFX "knows" what gears are, and that they have something to do with clocks, but it doesn't "know" how they work.

I created a set of symbols based on numbers used in Sumerian cuneiform:

The basic set is four. Two fours stacked is 8, 4+8 = 12, and two eights are 16. However, it might be better to devise a symbol for zero (a filled circle will do; it's what the Sumerians used), so the "16" would look a lot like our "10". On the other hand, perhaps such a culture would not be ready for a 2-symbol number until after 31 (the bottom two symbols set next to each other, effectively 24+7). Then the single tall wedge followed by the circle would represent 32.

I wanted a way these could be arranged around a circular dial. When we have numeric digits on a clock, they are usually upright, but we place Roman numerals all pointing outward from the center, like the pseudo-Roman symbols in the image above. The next image is a concept of these wedge-digits arranged that way.

Making this dial with PowerPoint was easier than I thought. Right away I noticed that when you turn a shape by its handle, the angle is shown in a status line on the Format menu. 360°/16 = 22.5°, so that was easy. The lines in the diagram help line up the symbols correctly.
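The rotation angles can be listed in a line of Python (my own sketch; in PowerPoint you just apply the 22.5° increments by hand):

```python
# Angles for sixteen symbols evenly spaced around a dial
symbols = 16
angles = [i * 360 / symbols for i in range(symbols)]
print(angles)  # 0.0, 22.5, 45.0, ... 337.5
```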

I let all this percolate for a month or so. Then I went back to the art generator programs and tried a different tack. After much experimentation I settled on this prompt:

A colorful circular dial with sixteen ordered symbols evenly spaced around it

A few hours of playing around yielded lots of interesting images. I'll present 27 of them, from five programs, with a bit of discussion after each set of nine.


These are three each from three programs: Gemini, Dall-E3, and DreamStudio. The prompt for the first Gemini image left out the words "colorful" and "ordered", and "colorful" was added for the other two. Out of a large number of offerings, only the first two had sixteen items, although they are numbers and have repetition. The third is admirably wonky, but nothing like a clock dial, though the outer ring does have twelve items. All other images were produced using the full prompt.

Dall-E3 produced one dial with 20 symbols in the outer ring, and 12 in the inner rings; a dial with 24 symbols plus four knifelike items intervening and an inner ring of 24; and the third image goes off the rails in a big way. If DE3 were to produce a 16-member ring, it would be purely by chance!

DreamStudio went farther off the rails than that, and stayed there. Some of the rings in these images can be counted, and some, not so much. None is 16, and some have an odd number of symbols. Another set:


These were all produced by Leonardo AI, in various Styles. In the top row we have a dial with several sets of 12 items, primarily Roman numerals, secondly a dial with a main ring of 24 symbols, a mixture of Romanesque and "various", and lastly a dial that indeed has 16 symbols of alternating sizes (I like that!), while its inner ring has eight divisions.

In the middle row we first have a ring of 12 larger symbols, then a ring of 16 (Yay!); then a dial with two rings of 14; and the third dial also has 16, with the whole business offset by a half step, or 11.25°.

The bottom row starts with a dial that looks like needlework, and has 16 items. The second dial has an outer ring of 19 symbols, a narrow ring of 18 digits and digit-like symbols, then a compass-rose-like dial with 12 divisions. The last dial has 16 rather complicated symbols, and all the rings within it are also sets of 16. Now we're getting somewhere! Now the final set:


These were all produced by ImageFX. Most of these are easier to count, and several achieved at least one 16-symbol dial. At upper left, the counts are 16 and 10; next to it, they are 15, 9, and 6; and the "rainbow dial" has 16 and 16 only. Another great result.

In the second row we first have a dial with 14 symbols, a narrow ring of such variety I can't determine how to count it, and an inner ring of 8 symbols; secondly an oblique view of a dial with 15 symbols and an inner ring of 7 or 8, unevenly spaced; and thirdly a dial with 16 symbols, a narrow ring of 16 small symbols (maybe pronunciations?), and a narrower uncountable ring of many symbols.

The bottom row starts with an oblique view of a dial with 19 symbols and an inner ring of 12 divisions with barely visible symbols; then a dial with 16 symbols, a narrow ring of 16 numbers (in no particular order but all have 2 digits), a ring with numerous "words", then a very small ring of alternating colors totaling 16 symbols; and finally a dial with 16 symbols, and several rings with eight members each. This one could also prove useful.

Experiments like this show how the training sets of the SI programs affect the images they produce. Getting away from the notion of "clock" made it possible for a couple of the programs to generate images that could be useful to illustrate a clock for binary timekeeping. We get the most interesting results when we explore the edges of such software's capabilities.

Saturday, February 08, 2025

Techniques of wallpaper generation

 kw: instruction, ai art, upscaling, wallpaper, screen shows

Introduction

I am a collector. One type of collection is sets of images that make good screen wallpaper. My folders of screen wallpaper images include masses of flowers, mineral crystals (particularly in masses), selections from Hubble and JWST astrophotographs, waterfall photos, and landscape paintings. Now that I've been experimenting with art generation using various SI (Simulated Intelligence) programs, I have a growing body of work in several genres.

My computer setup has two screens with resolutions of 1920x1080, or standard HD. Anticipating that sometime in the future I may transition to one or two 4K screens (3840x2160 or UHD), when I encounter an image that size or larger I don't reduce its size, but keep it for the future. However, I usually reshape an image by cropping it to a 16:9 aspect ratio. I use the "screen saver" program gPhotoshow to cycle through one or more folders of images I want to display during idle time.

This post is in two parts. Firstly I discuss the production of an image of appropriate size and aspect ratio, beginning with the seven image sizes produced by the five programs I have been using. Secondly I explain how I sign an image to give credit to myself and the program that I used, especially matching the signature size with the image size.

Image Sizes and How to Treat Them

This is a reduced version of an image I produced in my "Troglodyte" series. A troglodyte is a cave dweller. I call my basement library-and-computer-room my Man Cave, though most men who have such a space use it for watching sports. I decided to play with the idea of having an office in a real cave. Eventually I produced more than ten rooms, as though I owned a cave large enough to convert into a sort of mansion, which was dry enough that my furnishings would not mildew.

I was also experimenting with lengthy prompts for this project, but that is less critical just now. I used Leonardo AI, with the style Leonardo Lightning and substyle Creative, and Aspect Ratio set to 16:9, to produce images sized 1368x768 pixels. It didn't take long to find that the ratio was off a bit: 57:32. What's going on?

All art generating programs produce images made up of square blocks. In this case, the blocks are 24x24 (1368 = 57×24 and 768 = 32×24). There are a few ways to get 16:9, or 1.77777…, from 57:32, or 1.78125. However, I am a purist: I want the vertical dimension to be a multiple of 9, and better yet, I decided to restrict images to heights that are multiples of 18 pixels, with the horizontal dimension the same multiple of 32.
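The multiple-of-18 rule turns the cropping decision into simple arithmetic. This little Python sketch (the function name is my own) finds the largest "purist" 16:9 crop of a given image size; note that it recovers the 2720x1530 crop used later in this post for 2x-upscaled Leonardo images:

```python
# A minimal sketch of the "purist" rule above: the largest crop whose
# height is a multiple of 18 and whose width is the same multiple of 32
# (so the ratio is exactly 32:18 = 16:9).
def best_16x9_crop(width, height):
    k = min(width // 32, height // 18)
    return 32 * k, 18 * k

print(best_16x9_crop(1368, 768))    # Leonardo's native size -> (1344, 756)
print(best_16x9_crop(2736, 1536))   # after a 2x upscale -> (2720, 1530)
```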

Considerations Regarding File Locations and Naming

Whatever program I use, when I download the image, it is put in the Downloads folder. Thereafter I do any work on the file, which may be several steps through intermediate files, in the Downloads area. Sometimes I use folders under Downloads to keep things organized.

Whichever program I use, I rename the file right after downloading it. I have two naming schemes, depending on the length of the prompt.

Short Prompts (less than ~120 characters)

From a recent project: 

IFX 250203-08 Two naturalists and two medium-sized dinosaurs sitting together at a table having tea and croissants, digital art

This image was produced by ImageFX on the date shown (a few days ago), and is the 8th image I made that day using ImageFX. This prompt is 113 characters, a bit longer than usual, but not enough to make the file name exceed the file system's limit on name length.
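The short-prompt scheme is mechanical enough to script. Here is a sketch; the function and abbreviation table are my own illustration (the author renames files by hand):

```python
from datetime import date

# Hypothetical helper reproducing the naming scheme described above.
ABBREV = {"ImageFX": "IFX", "Leonardo AI": "Leo",
          "DreamStudio": "DreamS", "Gemini": "Gem"}

def image_name(program, seq, prompt, when):
    # e.g. "IFX 250203-08 <prompt>": program code, YYMMDD date, 2-digit
    # sequence number for that day, then the prompt itself.
    return f"{ABBREV[program]} {when.strftime('%y%m%d')}-{seq:02d} {prompt}"

print(image_name("ImageFX", 8,
                 "Two naturalists and two medium-sized dinosaurs sitting "
                 "together at a table having tea and croissants, digital art",
                 date(2025, 2, 3)))
```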

Long Prompts

I keep a file of long prompts, dated and labeled. The image above has the file name Leo 241108-03 LLightning Creative Prompt 241108-02, Cave Office

This was produced by Leonardo AI using Leonardo Lightning style and Creative substyle; the rest of the file name identifies the prompt.

If I'm working on just one or two files, I process them individually. Otherwise I use batch mode.

Convert Files as Needed

All but one of the programs I use produce files that are smaller than 1920x1080. Gemini produces only square images, 2048x2048; I'll show how to deal with that later. For all the others I use Upscayl to make the image larger and sharpen it.

One caveat with Upscayl: although it makes a valiant effort with a PNG image, the result comes out monochrome (gray), so I first convert PNG files to JPG using IrfanView, in batch mode if I have more than two images. Furthermore, image files produced by ImageFX have a JPG extension, but they are actually PNG files. I open them one at a time in IrfanView (it can quickly skip from file to file, so this isn't too time consuming); before opening such a file it asks whether I want the extension corrected, and I click "Yes" and continue. Then the files are ready to convert to JPG.
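The check IrfanView performs can also be done directly: PNG and JPG files each begin with fixed signature bytes, so a file's real format shows in its first few bytes regardless of its extension. A standard-library sketch (the function name is mine):

```python
# PNG and JPG "magic numbers" from the respective format specifications.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPG_MAGIC = b"\xff\xd8\xff"

def real_format(first_bytes):
    # Identify a file's actual format from its opening bytes; a file with
    # a .jpg extension that starts with the PNG signature is misnamed.
    if first_bytes.startswith(PNG_MAGIC):
        return "png"
    if first_bytes.startswith(JPG_MAGIC):
        return "jpg"
    return "unknown"

# Usage on a real file: real_format(open(path, "rb").read(8))
print(real_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))   # png
```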

DreamStudio also produces PNG files. They need to be converted. The following image shows the IrfanView batch edit screen:


Here are the steps:

  • Select Batch Conversion at top left
  • Select JPG as the output format
  • Select the output folder at middle left (I already made a folder named OutPix under Downloads). Next column:
  • Select the input folder at top right (I already put all the files to be converted into a folder named PNGPix under Downloads).
  • Select PNG as the input format (file type).
  • Highlight all the files and click "Add".
  • Click Start Batch at lower left.

This will convert the files and put them in the folder OutPix, ready for upscaling.
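For anyone who prefers scripting, the same PNG-to-JPG batch step can be sketched in Python with the Pillow library. This is my own alternative, an assumption rather than the author's workflow (he uses IrfanView), and the folder names mirror the PNGPix/OutPix layout above:

```python
from pathlib import Path
try:
    from PIL import Image          # Pillow: pip install pillow
except ImportError:                # the sketch still loads without it
    Image = None

def jpg_target(png_path, out_dir):
    # Destination: same file stem, .jpg extension, in the output folder.
    return Path(out_dir) / (Path(png_path).stem + ".jpg")

def convert_folder(src="Downloads/PNGPix", dst="Downloads/OutPix"):
    out = Path(dst)
    out.mkdir(parents=True, exist_ok=True)
    for png in sorted(Path(src).glob("*.png")):
        # JPG has no alpha channel, so convert to RGB before saving.
        Image.open(png).convert("RGB").save(jpg_target(png, out), quality=95)
```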

Upscaling

For upscaling an image by 2x, 3x, or 4x I use Upscayl. It has several upscaling models. I usually use FAST REAL-ESRGAN, which takes about ten seconds to convert a ~1K image on my computer. When I want ultimate quality from a highly detailed image, I use REMACRI, which takes about two minutes per image. Upscayl has a batch mode, which works on a folder at a time. 

This is the screen that appears when you set Upscayl to batch mode:


Here are the steps:

  • Step 1: Put the files to upscale into a folder called OutPix in the Downloads folder, as noted above. This includes files from programs that produce JPG natively.
  • Step 2: Select the model and ratio (2X in this case).
  • Step 3: Choose the output folder. I use the Downloads folder because files there don't get backed up to OneDrive; that lets me work and then clean up, putting the files I plan to keep in their final folder(s) later.
  • Step 4: Click Upscayl to start the process.

For the 14 files shown, this model ran a little more than two minutes. Had I used REMACRI, I could go get a snack: it would take half an hour.

Image Sizes in Each Program

IrfanView has a wonderful item in its Edit menu: "Create maximized selection (ratio:)" with a caret off to the right. Click that to see available ratios. You can add any ratio you like by using the prior Edit menu item: "Create custom selection...". 

If you have an image, for example, 1920x1200, that you want to crop to 1920x1080 (and there is headroom or footroom to do so), you can click this option, then the caret, then the 16:9 item, and cut lines will appear. You can use the up or down arrows on the keyboard to slide the selection, then CTRL/Y to crop the image. Save with a new name.
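The crop IrfanView sets up can be expressed as a small calculation. In this sketch (the function and its `offset` parameter, which mirrors sliding the selection with the arrow keys, are my own illustration), the box is the `(left, top, right, bottom)` form many image tools use:

```python
# Sketch: compute the box for trimming an image to 16:9 by removing
# headroom/footroom while keeping the full width.
def crop_box_16x9(width, height, offset=0):
    target_h = width * 9 // 16
    assert target_h <= height, "image is already wider than 16:9"
    top = (height - target_h) // 2 + offset          # centered by default
    top = max(0, min(top, height - target_h))        # clamp to the image
    return (0, top, width, top + target_h)

print(crop_box_16x9(1920, 1200))   # the 1920x1200 example -> (0, 60, 1920, 1140)
print(crop_box_16x9(2048, 2048))   # a square Gemini image -> (0, 448, 2048, 1600)
```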

Dall-E3

1024x1024 → 1792x1024 (1.75:1=7:4). You initially get four square images. Click an image to work with it. There is a button "Resize" with a "4:3" option. It actually produces an image close to 16:9 but not quite. The image is similar to the square one, but not the same, and it can be quite different.

Process: Upscale by 2X to 3584x2048, then crop to 3584x2016.

DreamStudio

There are two Stable Diffusion models. In either case, there is a setting to modify the pixel count. Only DreamStudio can do this at present.

  • Stable Diffusion v1.6 with 16:9 selected sets the size to 910x512 (1.7773:1=455:256); reset to 896x504, which is (16x9) times 56. Then upscale 3x to 2688x1512.
  • Stable Diffusion XL v1.0 with 16:9 selected sets the size to 1344x768 (1.75:1=7:4); reset to 1344x756, which is (16x9) times 84. Then upscale 2x to 2688x1512.

Gemini

2048x2048. Negotiate by asking for "wide format" and possibly tinkering with the prompt until there is sufficient headroom and/or footroom to crop to 2048x1152, which is (16x9) times 128. No upscaling is needed.

ImageFX

1408x768 (1.833:1=33:18). Upscale 2x to 2816x1536, then in IrfanView use the Resize function to get 2805x1530 and (using the Edit subfunction) crop to 2720x1530, which is (16x9) times 170.

Leonardo AI

At least one style mode produces a larger size.

  • Most styles produce 1368x768 (1.78125:1=57:32). Upscale 2x to 2736x1536, resize to 2725x1530, crop to 2720x1530.
  • Leonardo Phoenix produces 1472x832 (1.769:1=23:13). Upscale 2x to 2944x1664, and then crop to 2944x1656, which is (16x9) times 184.

There are some other methods one can use. For example, for a Dall-E3 image that is 1792x1024, you could use the Resize function in IrfanView to produce 1920x1097, and then crop it to 1920x1080. The Resize is only by 7%, so it produces an image with a good appearance. You can also crop out a 1920x1080 image from a Gemini image. A lot depends on the image you are working with.
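The per-program arithmetic above can be sanity-checked in a few lines. This sketch (the table layout is mine) confirms that each final size is exactly 16:9 and a "purist" multiple of 32x18; for ImageFX and Leonardo the small intermediate resize is treated as part of the crop allowance:

```python
# (starting size, upscale factor, final size) for each pipeline above.
PIPELINES = {
    "Dall-E3":     ((1792, 1024), 2, (3584, 2016)),
    "SD v1.6":     ((896, 504),   3, (2688, 1512)),
    "SD XL v1.0":  ((1344, 756),  2, (2688, 1512)),
    "ImageFX":     ((1408, 768),  2, (2720, 1530)),
    "Leonardo":    ((1368, 768),  2, (2720, 1530)),
    "Leo Phoenix": ((1472, 832),  2, (2944, 1656)),
}

for name, ((w, h), scale, (fw, fh)) in PIPELINES.items():
    assert fw <= w * scale and fh <= h * scale   # resize/crop only shrinks
    assert fw * 9 == fh * 16                     # exactly 16:9
    assert fw % 32 == 0 and fh % 18 == 0         # purist multiples
    print(f"{name}: {w}x{h} -> {fw}x{fh}, (16x9) times {fh // 9}")
```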

Signing the Image

Style of Signature that I use

With images of several different sizes, it's a bit tricky to add a signature that stays at a consistent relative size. I like to use a signature in two or three parts. The more usual two-part signature, using my first two initials plus last name, looks like this:

OK Myname / Dall-E3 2025

That is for an image I produced this year using Dall-E3. For the other programs I use abbreviations:

  • DreamS = DreamStudio
  • Gem = Gemini
  • IFX = ImageFX
  • Leo = Leonardo AI

Sometimes an image is part of a series, such as the Cave (Troglodyte) series. Then I'll use a 3-part signature on two lines, such as:

OK Myname / Dall-E3 2025
Cave Master Bedroom

I set the second line to a smaller type size, typically about 0.7 of the first line.

Size and Placement

Most images have an area near lower left or lower right where a signature can be placed. I use the Paint dialog (F12 key) and select the "A" for text input. The cursor becomes a little plus that marks where the lower left corner of the first line will go. Placing a signature on the right is trickier because it's hard to predict how long the text will be.

The greatly differing image sizes mean it is best to use corresponding type sizes. After some experimentation, I settled on using this basis: 14 points for an image 1988 pixels wide, which is suitable for images with horizontal dimensions from 1920 to 2194. This grows to 28 points for the range 3840 to 4114. This table shows the sizes I use:


This is a picture because tables are hard to produce in Blogger. The Paint/text tool's list of type sizes is all even numbers, but you can type in an odd number. All I can say is, experiment with it to find out a procedure that you like, or can live with.
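Lacking the table itself, a proportional rule reproduces the anchors given above (14 points near 1920 pixels wide, doubling to 28 points near 3840). This is my own approximation, not the author's exact table:

```python
# Rough rule of thumb: type size scales linearly with image width,
# anchored at 14 points for a 1920-pixel-wide image.
def signature_points(width_px):
    return round(14 * width_px / 1920)

for w in (1920, 2720, 2944, 3840):
    print(w, signature_points(w))
```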

Color

I looked at paintings by several artists. Some use a black signature, but many use a color from elsewhere in the painting that contrasts with the area of the signature. I like this method. The Paint tool has plenty of flexibility in this regard.

Font

Many years ago I designed a few fonts for a project. One of these is a strongly-weighted Sans Serif font that I like, so that's what I use. The Paint/text menu lets you select any installed font.

This isn't wholly comprehensive, but I hope it is enough to get you started turning generated art projects into wallpaper for your computer screen(s). Also, whatever art you produce, if it's for the public to see, signing it is a good idea.


Friday, February 07, 2025

Complexity of science or science of complexity

 kw: book reviews, nonfiction, science, scientists, scientific method, complex systems

The opening salvo in Dr Giorgio Parisi's book In a Flight of Starlings: The Wonders of Complex Systems is the story of how he and his colleagues studied starling murmurations.

When I lived in Oklahoma I would see these amazing formations in the sky in late summer and fall. A flock like this can have tens of thousands of birds that swoop about like parts of a sky-borne animal. The author's team used multiple synchronized cameras shooting frequent images, which allowed them to identify individual birds and their relationships in three dimensions. As time passed and cameras became better and faster, their work became more and more precise.

Starlings don't flock like other birds such as sparrows, as shown here, nor do they use "V" formations like geese. Analyzing the starlings' flights in detail showed that each bird keeps track of only a few nearest neighbors, with two surprising characteristics: Firstly, the distance between near neighbors fore-and-aft is quite a bit greater than side-to-side, and secondly, a section of the flock is denser near the boundary than in the middle. These attributes seem to be driven by the tension between maintaining sufficient confusion to avoid predators such as falcons, and the need to avoid bumping into one another. Changes of speed and direction seem to ripple through parts of a murmuration like a "wave" through a football stadium, although with less overall coordination.

The author's work is in the general realm of the mathematics of theoretical physics as applied to complex systems. In the book many mathematical concepts and operations are discussed, somehow without the need to print a single equation. His aim is to show aspects of how science is done via stories about his own work, including the work that earned him a Nobel Prize in 2021. Two chapters of the book focus on phase transitions and transitions between order and disorder. If you think about it, these are two ways to understand the same thing. The phase transitions familiar to us, such as water either freezing or boiling, are brought about as, in the case of freezing, disorder gives way to order, and in the case of boiling, one style of disorder gives way to another. A less-recognized but common transition is sublimation, in which ice evaporates directly without passing through a liquid phase. But there are also transitions between one kind of order and another, such as the conversion of graphite to diamond, or the arcane transitions found by laboratory studies of at least ten crystal structures of ice, each of which is stable in a particular range of pressures and temperatures.

While the scientific stories are fascinating, I confess that, though I am a scientist also, and familiar with mathematical physics, Dr. Parisi lives on a very different level, and I struggled to understand the mathematical concepts he discusses. The chapter on spin glasses (a concept relating to the orientation of electrons in a solid material) was a complete mystery to me. However, I could just relax and take in whatever was meaningful, and that was satisfactory. He follows that chapter with one on metaphor and its limits, which is quite comprehensible because we live lives of metaphor. Metaphors and other similitudes let us model a phenomenon, which allows us to gradually build an understanding of what is going on.

The author's aim is to foster a greater understanding of science and how it operates, and to increase public trust in science, which has been so badly eroded in recent years. I hope he understands that the problem is not with science itself, but with dishonest scientists such as the very public figure who proclaimed, "I am Science!", and thus deeply offended the great majority of Americans who could tell he had been lying to us for years. I must also point out that anyone who says something is "settled science" displays a fundamental and deep misunderstanding of science. Perhaps you've heard the saying: "Half of all medical 'knowledge' is shown to be false about every ten years. The trouble is, we never can tell which half beforehand."

This is true. I have a small journal of medical mistakes I have suffered, one of which had me at death's door until I wrested control of my medical care from a mistaken doctor and "jumped over some fences" to get appropriate treatment. Medical science is admittedly at a level of complexity far beyond the "complex systems" studied by a physicist, but the principle is similar. In the next-to-last chapter in particular we read how an incorrect metaphor often leads a scientist to revise and develop a better metaphor, over and over, until the model and the system correspond, and you can say, "Now I understand this system." The stuff that is interesting in science is where we don't yet have full understanding, and still find things that make us say, "That's funny…"

Science is a human enterprise. It is saved from disaster and banality by the self-correcting nature of the process. The backing-and-forthing that goes on can be frustrating, but we haven't produced anything better. It is extremely unlikely that a better process is possible, no matter how much smarter we get, or how smart our so-called-AI (I prefer to say SI for "simulated intelligence") systems become.

Monday, February 03, 2025

AI can't tell time

 kw: ai experiments, prompts, ai art, failures

In various corners of the universe of knowledge, SI (Simulated Intelligence) is manifestly ignorant. This can be seen in certain everyday tasks, such as reading (or drawing) an analog clock. I have heard that most advertising for clocks and watches shows the hands set to 10:10, because ad writers think that has the most attractive appearance. Since SI has no knowledge of what clocks even are, or of how the hands show time, these programs depend on their training image sets, which are useful only when text accompanies the images. I decided to see how various art generators would handle this prompt:

An image of a very decorated mantel clock showing the time as 4:15

I first used Gemini. This is the result, showing my prompt (one has to tell Gemini this is to be an image or picture):

The program did a good job with the decoration. That is its strength. But, sure enough, the clock's hands are pointed at 10:10, or very nearly so. If you look closely, the hour hand is exactly at the 10, where it should be 1/6th of the way to the 11. 

Gemini produces only one image at a time, in contrast to all the other programs at my disposal.

I next tried DreamStudio, the most recent program I use. I set the number of images to make at 2, because I pay for credits, and each image costs something. Using the same prompt:


DreamStudio is playing a trick in the first image. The hands have a "head" at both ends, so the time being indicated is ambiguous, but one interpretation is still 10:10. Though the hands have different shapes, it's also hard to tell hour from minute hand, so eight interpretations are possible! Don't try to teach your kid to read a clock that has such pathological hands!
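The count of eight interpretations follows from simple combinatorics: each double-headed hand can be read either way (2 × 2), and either hand could be the hour hand (× 2). A sketch, where the hand positions 10 and 2 are my guess at the depicted dial:

```python
from itertools import product

def readings(pos_a=10, pos_b=2):
    # A double-headed hand at position p also points at the opposite
    # dial position; collect every (hour-hand, minute-hand) assignment.
    flip = lambda p: (p + 6 - 1) % 12 + 1    # e.g. 10 -> 4, 2 -> 8
    out = set()
    for a, b in product((pos_a, flip(pos_a)), (pos_b, flip(pos_b))):
        out.add((a, b))                      # a read as the hour hand
        out.add((b, a))                      # b read as the hour hand
    return out

print(len(readings()))   # 8
```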

The second image at least has a hand pointed at the 4, but given that the other is pointed at the 8, indicating 4:40, the hour hand should be a bit more than halfway between the 4 and the 5.

Next victim: ImageFX (driving Imagen 3, the same as Gemini). It is free to use, so I let it run four images. I also left the aspect ratio at 16:9, the setting I usually use with this program.


All hands point to 10:10, the expected result. Next, Leonardo, using the "bare bones" Leonardo Lightning style and the default (Dynamic) substyle:


Here we find an interesting variety of responses. 

  • Upper left: The hands are so nearly the same length it's hard to say if this is 10:10 or 2:50, although whichever hand is the hour hand, it's pointed right at the digit, not advanced as it should be.
  • Upper right: This looks the most like a real clock. The hour hand is between the 4 and the 5. It still isn't showing 4:15.
  • Lower left: The hour hand is near the 6, but on the wrong side of it, unless it is just a little too far over and the time should be read as 5:40.
  • Lower right: 4:40, with a misplaced hour hand, as seen before.

Finally, here is the response from Dall-E3 in Bing:


Assuming I've figured out correctly which hand is which in each case, the times shown are 10:07, 2:50, 10:09 and 12:55.

So there you have it. Not one 4:15 in the bunch.

Dinosaurs triumphant!

 kw: book reviews, nonfiction, history, fossils, geologists, victorian era, dinosaurs, evolution

As a fun retirement job I work at a natural history museum. A few years ago the exhibit hall was being redesigned. Several of us were discussing the plans, and the kinds of exhibits to be included. One experienced scientist—like me, retired, and a volunteer in the research division—said primarily one thing, persistently and steadily, every couple of minutes: "Dinosaurs." It was true when I was a child and it remains true that having several dinosaur skeletons on display will bring the crowds.

As told by Edward Dolnick in Dinosaurs at the Dinner Party: How an Eccentric Group of Victorians Discovered Prehistoric Creatures and Accidentally Upended the World, people had found big fossilized bones for millennia, and fitted them into whatever worldview they had, usually as prehistoric, or sometimes contemporary, giants. For example, a fossil elephant skull, with its large central hole for the trunk, lies behind the myth of the Cyclops with a single, giant eye. As the Industrial Revolution cranked up in the late 1700's, with all the digging of canals, tunnels, and deep foundations, many more large bones were unearthed. A few chapters are devoted to Mary Anning, the most productive fossil finder of her generation. Comparative anatomy became a thing, and by the early-middle 1800's a great number of the animals to which these bones had belonged were being likened to reptiles with certain mammalian characteristics, such as upright legs rather than the sprawling legs of a crocodile.

The word "dinosaur" was coined by Richard Owen in 1842. Not long after, marine reptiles such as plesiosaurs and ichthyosaurs, and aerial reptiles such as pterosaurs, were recognized as parallel to dinosaurs, but not included. By this time it was also becoming more and more clear that the Earth was a great deal older than the comfortable assumption of "around 6,000 years", or the more specific "created in 4004 BC", based on Biblical interpretation of the time. Also, the geological principle of superposition—rock layers and their fossils found below more shallow layers and their fossils are older, and a succession of layers represents a succession in time—showed that most creatures from long, long ago had gone extinct, even that there had been entire assemblages of living things that vanished from the scene, to be succeeded by other assemblages that then became extinct, several times.

This upset the simple theology of the time. Then, when Darwin published On the Origin of Species in 1859, the faith of many was shaken. Time was impossibly long (about 750,000 times as long as they had thought); that, together with the vanishing of 90+% of all living species (more than once!) and the sudden insignificance of humans and human history, seemed to relegate God to the status of a minor, and indifferent, demigod.

Side note: How is it that so many people retain Biblical faith to this day? I was taught from a very early age about The Gap: that verse 1 of Genesis 1 describes an original creation, which was damaged and became chaotic, such that verse 2 and onward describe a restorative creation. Hebrew has certain complications in its grammar. One is this: the past tense of the verb "created" in Genesis 1:1 is different from the past tense used for the verbs in the rest of the chapter that describe God's actions on the Six Days. Therefore, whatever one may think of the six days (such as whether they were 24-hour days or eons or something in between), the time of the first verse is not in any way constrained. In the Scofield Reference Bible (1909), the second note on Genesis 1:1 (using "but" to mean "only") is, "But three creative acts of God are recorded in this chapter: (1) the heavens and the earth, v. 1; (2) animal life, v. 21; and (3) human life, v. 26, 27. The first creative act refers to the dateless past, and gives scope for all the geologic ages." This last sentence is the key to understanding that God's focus is on His relationship with humans, and that is the emphasis of the entire Bible. It is not a text of natural history.

In this very entertaining book we learn how the people of the Victorian era (~1837-1901) were practically dragged out of their comfortable, small and short-lived world into a dramatic, vast and eons-long spectacle. Once dinosaurs had been discovered, many thought they and their flying and swimming kin might be found in the unexplored parts of the Earth. Eventually they understood that this was not to be.

In the next-to-last chapter, "Dinner in a Dinosaur," we learn of a fantastical dinner party held on New Year's Eve 1853 inside a life-size model of an iguanodon. By the late 1800's dinosaurs had thunderously earned a seat at the table.

The last sentence of the Epilogue ends, "…with no warning, no foreboding, they vanished." I would add the proviso that today's birds are dinosaurs. If you are ever in the presence of a Cassowary or Ostrich in the wild—and you'd be in mortal danger in either case—you'll understand.