Thursday, October 27, 2011

Real Estate market distribution analysis

kw: real estate, analysis, statistical distributions, market distributions

Web sites of the "best places to live" kind typically state real estate values only as medians. It is far more useful to know the distribution of prices in an area, particularly to determine whether the market is distorted by excess or insufficient valuation in a portion of the price range. I sometimes perform a detailed analysis for a small market, and I describe that first; we'll then look at a powerful tool for quickly analyzing a market of any size.

1. Detailed Analysis

Keep in mind what any Realtor will tell you: "Location, Location, Location," but also note that every location has a context. This analysis is appropriate in a small market, such as a single Zip Code or small city, that contains no more than 150 homes for sale.

I pre-draw a grid such as that shown here. Number of Bedrooms runs vertically, and number of Bathrooms runs horizontally. A bath-and-a-quarter or bath-and-powder-room is counted as 2. When you actually look at individual homes, you'll pay attention to quarter and half baths, but for this analysis, a room is a room is a room.

These figures are for Zip 74075, the northern half of Stillwater, OK, an area in which I lived for a number of years so I am familiar with the market. The numbers are in thousands, rounded ($198,500 to $199,499 become 199). You can see that 3BR 2Ba is by far the most popular, and I ran out of room and stole some space from the 3+3 category. This is an economical market, with a total range in value of single-family homes from $46,000 to $499,000. I gathered the data October 25, 2011.
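The two bookkeeping conventions used for the grid (any fractional bath counts as a whole room, and prices are written in rounded thousands) are easy to script if you'd rather not do them by hand. This is only a minimal Python sketch; the function name `grid_cell` and the decimal encoding of baths (1.25, 1.5) are assumptions made for illustration, not part of the hand-drawn grid.

```python
import math

def grid_cell(bedrooms, baths, price_dollars):
    """Return (bedrooms, whole_baths, price_in_thousands) for one listing.

    Hypothetical helper: baths are assumed to arrive as decimals
    (1.25 = bath-and-a-quarter) and price as a whole number of dollars.
    """
    # A room is a room: 1.25 or 1.5 baths both count as 2.
    whole_baths = math.ceil(baths)
    # Nearest thousand, halves rounding up: 198,500..199,499 -> 199.
    thousands = (price_dollars + 500) // 1000
    return bedrooms, whole_baths, thousands

print(grid_cell(3, 1.5, 198_500))
print(grid_cell(3, 2, 199_499))
```

Both sample listings land in the 3+2 cell with a value of 199.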

This Zip Code contained 125 single-family homes for sale. The 3+2 category alone contained 76, or about 61% of the entire market. That indicates a homogeneous market. These two data sets (all 125 homes, and the 76 3+2 homes by themselves) are charted along with analyses of two markets closer to home: Ridley Park, PA (54 homes) and Zip Code 19803 in northern Delaware, where I have friends.

I used Minitab to analyze these on Lognormal coordinates. The straight line for 19803 indicates it is a very homogeneous market, a set of similar bedroom communities. The two plot-lines for 74075 have a zig and a zag at the ends, indicating the market is not as homogeneous; there is at least one "ritzy" area that breaks the trend, and I suspect the low-end zag represents trailer houses on land on the edge of town.
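If you don't have Minitab, the coordinates for this kind of plot can be computed directly. The sketch below is an assumption about how such a plot can be built, not a description of what Minitab does internally: it uses the plotting position (rank - ½)/n, which may differ from Minitab's own formula, and the sample prices are made up. Each home gets an x value (log of price) and a y value (standard deviations from the median); if the points fall on a straight line, the prices are approximately lognormal.

```python
import math
from statistics import NormalDist

def lognormal_plot_points(prices):
    """Return (log10_price, z_score) pairs for a lognormal probability plot."""
    ordered = sorted(prices)
    n = len(ordered)
    points = []
    for rank, price in enumerate(ordered, start=1):
        # Assumed plotting position: the middle of each home's slice of the market.
        fraction = (rank - 0.5) / n
        # Convert the cumulative fraction to standard-deviation units.
        z = NormalDist().inv_cdf(fraction)
        points.append((math.log10(price), z))
    return points

# Hypothetical asking prices in thousands; not the 74075 data.
sample = [95, 120, 135, 150, 165, 180, 210, 260, 340]
for x, z in lognormal_plot_points(sample):
    print(f"{x:.3f}  {z:+.2f}")
```

Plot the first column against the second in any charting tool; zigs and zags at the ends show up as points bending away from the line.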

Ridley Park shows up as quite homogeneous, but there is a wiggle in mid-line that indicates a depressed market for the middle-of-the-road homes there, a depressed median. This illustrates why you have to know more than the median to understand a market; if you wanted a home in the top or bottom of the range, you'd find yourself spending more than you expected.

This is a closer look at the four lines. Recall: 74075 = blue triangles, the same Zip with only 3BR+2Ba homes included = black circles, Ridley Park = red squares, and 19803 = green diamonds.

Recall from the handwritten grid that the three highest-priced homes in 74075 are 495, 495 and 499. They are the three triangles surrounding a black dot at the top. The black dot is the single 3+2 house among them, in its own distribution.

Because home prices in a market approximately follow a lognormal trend, the largest number of homes for sale will be found near the median. As you move far from the median, in either direction, there is not going to be much for sale.

2. Quick Analysis (7-Point Charting)

To understand this method, look at the image above. See how the 1 and 99 points are rather far from the 10 and 90 points, and that these distances are quite similar to the distance between 10 and 25 or 75 and 90. Of course, the 50 point is the exact center, the median value. So we want to pick the values from the home listings, sorted by price, that represent 1%, 10%, 25%, 50%, 75%, 90%, and 99%. For a 100-home distribution, these are roughly the first, tenth, 25th and so forth. If a distribution has fewer than 50 homes, don't bother with the 1 and 99 points.

This table shows just these seven values for the three areas outlined above; I didn't bother with the 3+2 homes in 74075. To figure out which homes to use, multiply the total number of homes by each of the seven percentages (0.01, 0.10, 0.25, 0.50, 0.75, 0.90 and 0.99), add ½ to each, and round the result.

For a 125-home market, the key numbers are 1.25, 12.5, 31.25, 62.5, 93.75, 112.5 and 123.75. Add ½: 1.75, 13, 31.75, 63, 94.25, 113 and 124.25. These round to 2, 13, 32, 63, 94, 113 and 124. We will discuss in a moment how to locate these items.
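The same arithmetic can be scripted for any market size. This is a minimal sketch, assuming the listings are sorted by price and that an exact half rounds up (the worked example above never hits an exact half, so that tie rule is my assumption); the name `seven_point_ranks` is made up for illustration.

```python
PERCENT_POINTS = (1, 10, 25, 50, 75, 90, 99)

def seven_point_ranks(total_homes):
    """Return the seven listing ranks (1-based) to read from a price-sorted list."""
    ranks = []
    for pct in PERCENT_POINTS:
        # total * pct/100 + 1/2, rounded with halves going up.
        # Done in whole numbers to avoid floating-point surprises.
        rank = (total_homes * pct + 100) // 100
        # Keep the rank inside the list for very small markets.
        ranks.append(min(max(rank, 1), total_homes))
    return ranks

print(seven_point_ranks(125))  # [2, 13, 32, 63, 94, 113, 124]
```

For markets of fewer than 50 homes, simply ignore the first and last entries.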

The "50" points are in slant type; these are the medians in each market. I also calculate the geometric midpoint, or virtual median, based on the 25 and 75 values, and 10 and 90 values. The calculation is, for example, SQRT(260*400) = 322.49, which rounds to 322, the 25-75 Median for Zip 19803. If both of these are close to the median, the market is balanced. If there is a strong trend, it reveals imbalance. Of course, the charts above show the imbalance in Ridley Park by the bent line, but here the numbers reveal it just as clearly. And, these can be plotted, as shown next.

There is a very subtle scoop in the 19803 line, and the numbers 315, 322, 331 show the same thing. 331/315 = 1.051. A 5% shift is insignificant. More significant is the very visible dip in the middle of Ridley Park, and the numbers show it: 150, 172, 187 and 187/150 = 1.247. A 25% shift is quite large. It indicates that the middle-priced houses are underpriced. A buyer's market!
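The midpoint and ratio arithmetic is simple enough to check in a few lines. This sketch uses only the figures quoted in the text; the function names are hypothetical, and the 10 and 90 values themselves are not repeated here, so the 10-90 midpoints are entered as already-computed numbers.

```python
import math

def geometric_midpoint(low, high):
    """Geometric midpoint ("virtual median") of two price points."""
    return math.sqrt(low * high)

def shift_ratio(median, midpoint_10_90):
    """Ratio of the 10-90 midpoint to the actual median; 1.0 means balanced."""
    return midpoint_10_90 / median

print(round(geometric_midpoint(260, 400)))  # 322, the 25-75 Median for 19803
print(round(shift_ratio(315, 331), 3))      # 1.051 for 19803
print(round(shift_ratio(150, 187), 3))      # 1.247 for Ridley Park
```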

The Stillwater Zip Code shows very good flatness in the middle range, with a zig and a zag, as mentioned, at the extremes of the range. The middle 80% of this market is balanced and homogeneous.

Now let us look at larger markets. This is useful to determine a price range that is appropriate for most "middle class" folks to buy homes, and how balanced the regional prices are. For three of the four analyses below, I selected only single-family homes; for Philadelphia I added condo/townhomes, which comprise 80% of the total market.

This shows a calculation table selecting the listing numbers for each market from the total number of homes for sale. Now, how do you find home #38 or #3,242?

By default, the web site lists ten houses at a time. At the bottom of the page you can click the "Next" button to get the next ten. When you do so, you'll see in the URL field (which starts "http"), near the end, "pg-2" (you may have to scroll within the field). Change the "2" to the number of the page you want to go to. Now think about this: home #40 is at the end of page 4, so home #38 is the eighth listing on page 4. Similarly, to find listing #3,242, go to page 325, where it is the second listing (the page ends with #3,250). You have to count, because the web page doesn't put numbers on the listings.
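The page arithmetic can be written down once and reused. A minimal sketch, assuming ten listings per page as described; the function name `locate_listing` is made up for illustration.

```python
def locate_listing(listing_number, per_page=10):
    """Return (page, position_on_page), both 1-based, for a listing number."""
    page = (listing_number - 1) // per_page + 1
    position = (listing_number - 1) % per_page + 1
    return page, position

print(locate_listing(38))    # (4, 8): eighth listing on page 4
print(locate_listing(3242))  # (325, 2): second listing on page 325
```

If the site lets you show more listings per page, pass that number as `per_page`.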

Here are the 7 numbers collected for each of the four markets. I picked OK City because of familiarity, Columbus at random, San Jose because I knew it would be high, and Philadelphia because that's home these days (actually, I live in a suburb outside city limits).

A quick look shows that all these markets are well balanced. Dividing 517/490 = 1.055 indicates that San Jose may be slightly out of balance. By the way, prior to the 2008 crash, when I found an unbalanced region, it was more likely that the 10-90 median was lower than the actual median; high priced homes were undervalued, or conversely, the middle class, fueled by over-liberal lending policies, was overpricing mid-range homes. Let's look at the chart.

As expected, San Jose is high priced. Surprisingly, though Philadelphia is in the "rich" Northeast, its values are only slightly higher than the West and Midwest.

I was, frankly, quite surprised to find the distribution for Columbus as low as it is, lower than OKC. A number of years ago this was not so.

Seven-point analysis is the quickest way I know to learn so many things about markets of any size. It is the precursor to more specific analyses, helping you narrow your interest to price ranges which will have large numbers of listings.

Note: While I used Minitab for the first chart, there is a way to get a very similar chart using Excel or the Excel clone in the OpenOffice suite (it is called Calc). If a few folks rattle my cage about it, I might post a tutorial on making normal and lognormal plots using Excel. (Just to set expectations, there is no way to get the axis labeling the same. My method just produces a line with the right scaling and labels the probability axis in Standard Deviation units.)

A final tip. Choosing 3+ for the Bathrooms nearly guarantees a recently built home, at least in the Northeast. In most areas of the country, only the largest houses had more than two bathrooms until the building boom of the 1980s and 1990s.
