Tuesday, May 16, 2023

Studying tartan designs

 kw: analytical projects, plaids, tartans, statistical distributions, scale free, lognormal

Guess what this is? It isn't quite what it looks like. It's a printed plaid, a plaid-like pattern printed on white flannel, the backing for a comforter we made many years ago. Until I looked at it closely (microscopically), I thought it was a woven plaid.

Close inspection also reveals that the weave is single-over-under, rather than the over-2-under-2 of most plaid fabrics. Nonetheless, it is an attractive pattern, one of my favorites!

Some time ago I began to wonder about the distribution of stripe widths on plaids. Long ago I wrote, in GWBASIC, a "screen saver" program that produced plaid patterns on the screen. I used a scale free distribution because it is easy to program. It would generate a bunch of width values and then scramble them by sorting against a set of random numbers; it would assign colors and generate a plaid pattern.
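A minimal Python sketch of that kind of generator follows. It is not the original GWBASIC program: the Zipf-like width rule (the k-th widest stripe is `largest / k` threads), the color palette, and every function name are assumptions made for illustration. Only the scramble step, sorting the widths against a set of random numbers, is taken directly from the description above.

```python
import random

# Hypothetical palette; the original program's colors are not recorded here.
PALETTE = ["red", "black", "white", "navy", "gray", "green"]

def scale_free_widths(count, largest=32):
    """Assumed Zipf-like rule: the k-th widest stripe is largest / k threads (at least 1)."""
    return [max(1, round(largest / k)) for k in range(1, count + 1)]

def scramble(values, rng):
    """Shuffle by pairing each value with a random number and sorting on that number."""
    keyed = [(rng.random(), v) for v in values]
    keyed.sort(key=lambda pair: pair[0])
    return [v for _, v in keyed]

def make_stripes(count, seed=None):
    """Return a list of (color, width) pairs describing one run of stripes."""
    rng = random.Random(seed)
    widths = scramble(scale_free_widths(count), rng)
    return [(rng.choice(PALETTE), w) for w in widths]

if __name__ == "__main__":
    for color, width in make_stripes(12, seed=1):
        print(f"{color:6s} {width:3d}")
```

Using the same stripe list for both the horizontal and vertical directions would give a plaid-like pattern; drawing it is left out of the sketch.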

I don't know how plaids are designed. The Scottish tartans such as Black Watch or Douglas can be centuries old, and were selected with aesthetics in mind, and an eye for being imposing because they were worn into battle. Today I suppose artistic designers pick the colors and stripe widths in a purely aesthetic way.

I decided to study the statistical distributions found in my own shirts and other fabrics. I figured out how to wrap a shirt around a dictionary to hold it on a scanner, and did so for 17 flannel shirts and two plaid jackets, plus the pattern above which I photographed because the comforter is large and very thick. I have a number of plaid summer shirts, which I may analyze in the future, but they are not included here.

The large variation in stripe widths led me to consider three model distributions: Normal, Lognormal and Scale Free or Log-Log. When graphed with appropriate coordinates, each of these is a straight line, but, for example, a Normal distribution will graph as a curved line on either Log-Log or Lognormal coordinates. First, we need to see the shapes of these distributions:

The Normal distribution is frequently called the Gaussian distribution, after the mathematician Carl F. Gauss, who applied it to measurement errors in the early 1800's. When several random variables are added and measured repeatedly, the distribution of the sum tends toward the center-weighted shape shown in orange. The mathematical statement of this additive tendency is called the Central Limit Theorem.

The Lognormal distribution results when the exponential function is applied to a set of values that have a Normal distribution. The Lognormal shape is shown in green. Also, when several random variables are multiplied and measured repeatedly, the distribution of the product tends toward a Lognormal distribution. The logarithmic form of the Central Limit Theorem describes this tendency. Furthermore, when an area or extended object is fractured or divided into many pieces via a random process (such as dropping a pane of glass), the areas or weights of the pieces closely approximate a Lognormal distribution. I verified this once in the laboratory by breaking a small piece of glass with a light blow of a hammer and then weighing a couple hundred pieces. The mathematical proof of this is called the Theory of Breakage, which was propounded by A.N. Kolmogoroff in 1941.
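The multiplicative tendency can be checked with a small simulation. The sketch below is only an illustration under assumed settings (12 factors per product, each uniform between 0.5 and 2.0, 5000 products, a fixed seed); none of these numbers come from the glass experiment or the plaid data. It uses skewness as a rough symmetry check: the products should have a long right tail, while their logarithms should be nearly symmetric, as a Normal distribution is.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: near 0 for a symmetric set, positive for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def random_products(samples, factors, rng):
    """Each result is the product of `factors` independent uniform random numbers."""
    return [math.prod(rng.uniform(0.5, 2.0) for _ in range(factors))
            for _ in range(samples)]

rng = random.Random(2023)  # fixed seed so the run is repeatable
products = random_products(5000, 12, rng)
logs = [math.log(p) for p in products]

print("skewness of the products:    ", round(skewness(products), 2))
print("skewness of their logarithms:", round(skewness(logs), 2))
```

The logarithm of a product is the sum of the logarithms of its factors, which is why the ordinary Central Limit Theorem applies to the logarithms.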

The Scale Free distribution results when a series of measurements is taken of the reciprocals of a uniform random distribution. This is also called a Fractal distribution, based on the work of Benoit Mandelbrot in the 1980's. A theoretical continuous Scale Free distribution has no limit in either direction; it predicts no largest or smallest member. Discrete sets of values that have a Scale Free distribution, however, do have a largest and smallest member. While the theoretical, continuous Normal and Lognormal distributions also have no limits, the probabilities of extreme values are vanishingly small (for a Lognormal distribution, "extreme" means either a very large positive value, or a value that is positive, but very, very close to zero).
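As a rough illustration of the reciprocal construction, the following sketch draws uniform random numbers, takes their reciprocals, sorts them largest first, and fits a straight line to the logarithm of the value against the logarithm of the rank. The sample size, the seed, and the function names are assumptions for the example. For this construction the fitted slope should come out close to -1, and the finite sample does have a largest and a smallest member.

```python
import math
import random

def reciprocal_sample(count, rng):
    """Reciprocals of uniform random numbers in (0, 1], sorted largest first."""
    return sorted((1.0 / (1.0 - rng.random()) for _ in range(count)), reverse=True)

def loglog_slope(sorted_desc):
    """Least-squares slope of log10(value) against log10(rank), with rank 1 = largest."""
    xs = [math.log10(rank) for rank in range(1, len(sorted_desc) + 1)]
    ys = [math.log10(v) for v in sorted_desc]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

values = reciprocal_sample(2000, random.Random(7))  # fixed seed for repeatability
print("smallest:", round(values[-1], 3), " largest:", round(values[0], 1))
print("log-log slope:", round(loglog_slope(values), 2))
```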

Each distribution can be rectified (made to approximate a straight line) by sorting all the values and graphing them in order in an appropriate coordinate system. Idealized examples of these three distributions are all shown together in the three coordinate systems that are relevant to this discussion:


These charts each rectify one of the distributions. Firstly, for "Probability Coordinates", the horizontal axis has units of standard deviation and the vertical axis is linear. The sorted values in a Normal distribution (orange) follow a straight line here. Secondly, for "Log-Probability Coordinates", the horizontal axis is the same, while the vertical axis is the logarithm of the values, which straightens out the Lognormal distribution (green). Thirdly, for "Log-Log Coordinates", the horizontal axis is the logarithm of the ordinal number of the sorted values and the vertical axis is the logarithm of the values. This rectifies the Scale Free distribution (blue). Note that in each case, the "other two" distributions display a distinct curvature.
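For readers who want to build such charts themselves, here is a hedged sketch of how the three sets of coordinates might be computed from a list of values. Two details are assumptions, since the text does not specify them: the plotting position `(i - 0.5) / n` used to turn an ordinal number into a standard-deviation value, and the convention that rank 1 is the largest value on the Log-Log chart. The function name and the sample widths are hypothetical.

```python
import math
from statistics import NormalDist

def plotting_coordinates(values):
    """Return (x, y) pairs for the three coordinate systems described above."""
    ordered = sorted(values)
    n = len(ordered)
    # Horizontal axis in standard-deviation units (assumed plotting position).
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    descending = ordered[::-1]  # assumed: rank 1 is the largest value
    return {
        "probability": list(zip(z, ordered)),
        "log-probability": [(x, math.log10(v)) for x, v in zip(z, ordered)],
        "log-log": [(math.log10(rank), math.log10(v))
                    for rank, v in enumerate(descending, start=1)],
    }

# Hypothetical stripe widths in threads, not taken from any of the scanned plaids.
sample_widths = [2, 4, 4, 6, 8, 12, 20, 36]
for name, points in plotting_coordinates(sample_widths).items():
    print(name, [(round(x, 2), round(y, 2)) for x, y in points])
```

All values must be positive, because two of the three charts take logarithms.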

Now, for sets of more realistic distributions, created by appropriate random processes, we see the same three graphs:


The three coordinate systems are the same as those above. A straight line has been added to each graph to emphasize which set of values has been rectified.

How does all this apply to a study of plaids? I gathered data from the scans of the 20 plaids, measuring each one in both directions. This is because the warp and woof of the weave have different pitches, so the plaid designers adjust the number of threads of each color so the resulting plaid will not look distorted. Here is an example of a set of data for one of the plaids. I used rather generic color names, because the widths of the stripes were the meaningful parameter, not the color pattern.

Note that, while the order of the colors is the same in both directions, the number of threads is seldom the same in direction 2 as compared to direction 1. This enlargement of the pattern shows the threads; it takes a careful look to see that the spacing is different between horizontal and vertical. Look at the white square. It has 9 horizontal threads but 6 vertical threads, yet the "square" appears pretty close to a square.

One benefit of the over-2-under-2 weave is that it made counting threads in wider bands easier, because I could count by 4.

This is a broader view of the pattern. Although each "unit" of the pattern contains 5 white stripes, 4 black stripes, 2 navy stripes and only 1 gray stripe, gray dominates because its stripe is so wide, with navy blue running a close second.

What did I do with all these numbers? There are a lot of them. A few patterns had 38-40 stripes, and many had quantities in the 20's. Some plaids have mirror symmetry; a smaller number don't.

I copied all the data, sorted each set (each direction for each plaid), and set up both ordinal and probability axes for them all. I charted them in groups to see how they looked. I was looking for rectified distributions. As we see below, with a few of them as an example, the results are not clear-cut. I had been hoping to see a clear indication that the distributions were primarily either Scale Free or (my preference) Lognormal. The reality is a little of both. The graphs that follow pertain to six non-symmetrical patterns.

The overall view is that many of the lines have a downward curvature at the right, but not all. In particular, the yellow line and the gray line mostly hidden behind it (#16), and the lighter blue and lighter green lines in the midst of the scrum (#10), don't curve down.

The downward curvature indicates that most of these are better modeled as Lognormal. The next graph shows that presentation.


Here many of the lines appear straighter, while some either flatten out or curve the opposite way (not really "upward"). The dark red line and the dark blue line that accompanies it also flatten out, even though they have a bit of downward curvature in the other graph.

None of the patterns showed a hint of being closer to Normal than to Lognormal or Scale Free, so I didn't pursue that any further.

"Eyeballing" the charts proved unsatisfactory, so I used a mathematical measure of linearity, relevant to either Log-Log or Lognormal coordinates, to more clearly discern the trends.
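The particular measure is not spelled out here, so the sketch below assumes one common choice: the squared correlation coefficient (R²) between the horizontal and vertical coordinates in each system, with the higher score taken as the better fit. The plotting position `(i - 0.5) / n`, the largest-first ranking for Log-Log, and all function names are likewise assumptions. The two test sets are idealized, made-up values, not measurements from the shirts.

```python
import math
from statistics import NormalDist

def normal_scores(n):
    """Standard-deviation positions for n sorted values (assumed plotting position)."""
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

def r_squared(xs, ys):
    """Squared Pearson correlation; 1.0 means the points lie on a straight line."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)  # requires widths that are not all equal

def linearity_scores(widths):
    """R² of the sorted widths in Log-Probability and in Log-Log coordinates."""
    logs = [math.log10(w) for w in sorted(widths)]
    n = len(logs)
    log_rank = [math.log10(rank) for rank in range(1, n + 1)]
    return {
        "Lognormal": r_squared(normal_scores(n), logs),
        "Scale Free": r_squared(log_rank, logs[::-1]),  # rank 1 = largest
    }

def best_model(widths):
    scores = linearity_scores(widths)
    return max(scores, key=scores.get)

# Idealized, hypothetical sets of 20 values each.
ideal_scale_free = [120 / k for k in range(1, 21)]
ideal_lognormal = [10 ** (1 + 0.4 * z) for z in normal_scores(20)]
for name, data in [("scale free", ideal_scale_free), ("lognormal", ideal_lognormal)]:
    print(name, "->", best_model(data), linearity_scores(data))
```

Applying `best_model` separately to the two directions of one plaid is how a pattern could come out Lognormal one way and Scale Free the other.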

I saw from this that some of the patterns were more Lognormal in one direction and more Scale Free in the other. I found the following:

  • 7 patterns were Lognormal in both directions.
  • 4 patterns were mixed, but leaned Lognormal more than Scale Free.
  • 2 patterns were mixed, but leaned Scale Free.
  • 7 patterns were Scale Free (Log-Log) in both directions.

Here we have, from left to right, #3, which is the most Lognormal of them all, #8, which is the most ambiguous, and #6, which is the most Scale Free of them all.


As it happens, #3 and #8 are favorites of mine, and if the red plaid from our comforter were made into a shirt, as a pattern, it would also be a favorite (although my wife doesn't like me to wear red shirts); it is also a mixed-distribution pattern. I care less for #6; I consider it almost ugly. Just to show that Scale Free patterns are also attractive, another of my favorites is shown here, #10, which is more Scale Free in both directions:

A characteristic of Scale Free distributions is a greater number of narrower stripes, and this one shows that. It illustrates that what we like doesn't have a very strong mathematical basis. I had been thinking just the opposite, but I don't mind being proven wrong.

In the future I may scan my plaid summer shirts and analyze them, to see if these tendencies hold up. This has been an enlightening exercise.



