Tuesday, January 27, 2009

Data miners, light your lamps

kw: book reviews, nonfiction, internet, behavior

PPC means 'pay per click', but to Bill Tancer (I think it rhymes with dancer) it also means 'porn, pills, and casinos', some of the biggest categories of internet searches. Porn searches top the list of all searches…or at least, they used to. This may be changing.

In the UK at least, social networking searches have surpassed them (image from Hitwise.com's blog site). Bill Tancer, the author of Click: Unexpected Insights for Business and Life, is one of seven Hitwise bloggers and the company's lead analyst (and general manager of global analysis).

Hitwise purveys competitive intelligence of all sorts, gleaned from a massive, continually growing database of internet searches, clicks and click-throughs. For most of us, a spate of activity on the Web starts with a Search, or several. Then we Click on some of the results. A Click-through is a trace: for each Click (the URL clicked upon), it records the URL that preceded it and the one that followed. These traces can be chained for more complex analyses. With the right software and the amalgamated searches, clicks and click-throughs of ten million Web users, you can learn a lot.
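To make the chaining idea concrete, here is a minimal Python sketch. It is only my guess at the shape of the data: the `ClickThrough` record, the `chain` function and the example URLs are all hypothetical, and nothing here reflects how Hitwise actually stores or processes its traces.

```python
from collections import namedtuple

# Hypothetical record: the URL clicked, plus the URLs seen just before and after it.
ClickThrough = namedtuple("ClickThrough", ["previous", "clicked", "following"])


def chain(traces):
    """Join overlapping click-through traces into one ordered path of URLs."""
    if not traces:
        return []
    first = traces[0]
    path = [first.previous, first.clicked, first.following]
    for trace in traces[1:]:
        # A trace extends the path only if it overlaps the last two URLs in it.
        if path[-2:] != [trace.previous, trace.clicked]:
            raise ValueError("trace does not continue the path: %r" % (trace,))
        path.append(trace.following)
    return path


# Made-up example: a search, a magazine page, a shop, then the checkout.
traces = [
    ClickThrough("search.example/?q=prom+dress", "magazine.example/prom", "shop.example/dresses"),
    ClickThrough("magazine.example/prom", "shop.example/dresses", "shop.example/checkout"),
]
print(" -> ".join(chain(traces)))
```

Two three-URL traces that overlap on two URLs become one four-URL path. Do that across millions of users and you can begin to see where a given search tends to lead.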

The author takes us down a few of the paths he has followed, to find out why searches containing the words "prom dress" spike in January rather than later in the Spring; why the popularity of a certain lady wrestler didn't translate into a win on Dancing with the Stars; or how the internet has become many people's psychotherapist.

To take up the first item: One large segment of "prom dress" searchers is composed of affluent, fashion-conscious girls, and many fashion magazines (on-line and off) publish their initial prom fashions in mid-December…and these girls don't wait. The girls who are more likely to buy a department store dress tend to begin searching in April, and are less avid searchers.

Yearly trends clearly show what people resolve each New Year's Day, which diets are most popular when we need to get rid of some of that Thanksgiving lard we pack on, and when young minds begin to think of June weddings (it isn't June).

But the data have social correlations also. This chart, of clicks on charitable websites, shows an interesting trend compared to the Dow Jones index. Whether this mainly represents people who think more charity will be needed, or people hoping to find help, is left unstated in the blog post this comes from.

But this is a review of the book, not the blog. The author supports his contention that Click data can reveal trends that polling never will, because we're not afraid of the search engine. It won't pass judgment on us. We ask things we really want to know, from "how to kiss" to "why am I sad" to "where is Tom Cruise today". You may tell a telephone pollster, "I don't gamble," and the number of people who admit to a pollster that they gamble is indeed very small; but a large proportion of us spend more time on casinos.com and similar sites than the polls would predict.

The author often begins a public appearance by saying, "I love data", or, "I've always loved data". The audience may titter, but quite a number of people whisper later, "So do I." Data are the roots of information, and information is power. The author quotes John Battelle, who said it best:
This information represents, in aggregate form, a place holder for the intentions of humankind—a massive database of desires, needs, wants, and likes that can be discovered, subpoenaed, archived, tracked, and exploited to all sorts of ends.
Just by the way, folks, there may be a constitutional right of privacy, but the real thing is now that much harder to attain. I enjoyed the book, but the implications are unsettling.
