Tuesday, September 06, 2016

Don't give up on your neighborhood library just yet

kw: book reviews, nonfiction, libraries, librarians, digital libraries

Clever geeks have been trying to replace people with mechanisms for a long, long time. "Artificial intelligence" didn't begin with ENIAC in the 1940s. It didn't even begin with "The Turk", a chess-playing automaton that was exhibited for more than 80 years, from about 1770 onward. Even after it was revealed that a chess master hidden inside its cabinet was actually moving the hands, it remained a popular attraction. The earliest legend of a manufactured intelligent creature may be that of the Golem; the legends long predate the story of Rabbi Loew in the 1500s.

I have watched the hype intensify for the fifty years I have been involved with electronic computers. Marvin Minsky, in particular, made numerous predictions of things that computers would be able to do "in 20 years", "in 50 years", whatever. He was wildly optimistic. Eventually, though, computer systems did best humans at certain tasks we tend to associate with "intelligence": Chess, Go, Checkers and Jeopardy. That last, by the Watson system, required tens of thousands of hours of laborious work by more than a hundred computer scientists and database experts, and cost at least a couple of billion dollars (IBM has been coy about revealing the amount). Watson did indeed earn the highest score, but made a few silly mistakes.

During my last ten years working at DuPont, I worked with the Indexing group, spending part of my time reading technical reports (usually very quickly) and applying index labels to their metadata, and part of my time improving the computer interface used by the professional indexers. During those years, we were dogged by questions such as, "Why can't something like Google do this instead, for a lot less money?" Fifteen years earlier I had helped the Indexing group test and review software that was intended to produce the indexing labels automatically. There are a lot of "keyword generators" out there, but only one, "Summarizer", seemed to produce any shadow of an appropriate list of key terms. It was by far the best piece of software, but it seldom scored better than 50% when its output was compared to the lists of terms produced by the human indexers. I heard that, a year or so after I retired, most indexing was indeed handed over to software, and though the results are rather poor, they are indeed cost-effective, if one only thinks in the short term. Sigh.

Later, working as an indexer, I experimented with using the Summarizer: after I had gathered my own set of key terms, I would run it, and it occasionally came up with a term or two that made me pause and then add them to my own term list. With practice, and some improvements to the interface, I reckoned that an indexer might be able to save time by having the software display its list automatically after the human list was first created, because it takes no more than a few seconds to discern whether any of the machine-derived terms make sense. We might have been able to save some time and also produce slightly better results. But with software alone, there is no such hope, and there will not be for at least one or two (or three or more) generations.

This effort was part of the Library and Information Sciences division of the Research department. Librarians have long (thousands of years) been the keepers of the flame of knowledge. Now that "the Web" has become a ubiquitous choice for finding stuff, an index into a world-spanning online library, people are wondering, why keep our physical libraries around? Here is why. The PageRank algorithms used by Google, though they are being continually improved, actually leverage human intelligence! The largest factor in the PR ranking is still the number of pages that link to a page of interest, as a measure of how many people have found it useful.
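To make the point concrete, here is a minimal sketch of the idea, not Google's actual code. The four-page "web" and its page names are made up for illustration, and the damping factor of 0.85 is simply the value commonly quoted for the textbook version of PageRank. The first function counts how many other pages link to each page; the second runs the textbook iteration, in which every link acts as a vote cast by the human who wrote it.

```python
from collections import Counter

# Hypothetical toy web: each page maps to the pages it links to.
links = {
    "library-guide": ["catalog"],
    "blog-post": ["catalog", "library-guide"],
    "forum-thread": ["catalog"],
    "catalog": ["library-guide"],
}


def inbound_counts(links):
    """Count how many distinct other pages link to each page."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Textbook power-iteration PageRank over a small link graph.

    Assumes every link target also appears as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for source, targets in links.items():
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links spreads its vote over all pages.
                for page in pages:
                    new_rank[page] += damping * rank[source] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    print(inbound_counts(links))
    ranks = pagerank(links)
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(f"{page}: {ranks[page]:.3f}")
```

In this toy web the page that the most people chose to link to comes out on top. The machine does the arithmetic, but the judgment was supplied by people.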

I built a career in computer programming and information science spanning, so far, 48 years (I still work part time), based primarily on taking proper advantage of both human and machine abilities. I like to call it the Synergy of Mind and Mechanism. Some (benighted) people may think that "the internet" can replace all the functions of the 120,000 libraries across America, and the 350,000 or more worldwide. Actually, with the current flood of new information being generated, mostly by non-professionals and non-experts, the need for librarians is increasing.

All that being said, I have now found a book that reaches a similar conclusion, from someone we might call a "partial insider." Though John Palfrey is not a degreed librarian, he was made head of the Harvard Law School Library, where he updated its workflows, reformed it mightily, and kept the best of its traditional "analog" character while adding great digital resources. His status as a "feral" (the librarians' derogatory term for "layman librarian") has made many librarians look askance at his work, even though it is intended to save their jobs and even increase their numbers! His book is BiblioTech: Why Libraries Matter More Than Ever in the Age of Google (for non-francophones, bibliothèque is French for "library").

The book is not nearly as polemical as I expected. The ten chapters outline nine areas in which this digital age we are entering (we've only just begun) is a complement to the "analog age", not a replacement for it. Just because something is "old" doesn't make it "obsolete" or outdated. We are surrounded by "old" things that continue to function well and may never be replaced.

So it is with libraries. Documents were collected, not because people just like to collect (as indeed they do), but because there was value in having a place where every authorized academic or student could find the key documents needed to do his work. The primary innovation in libraries was not one or another technical improvement, but the opening of formerly closed collections to the public, which began no more than 150 years ago! Libraries and the networked communications of librarians were fundamental to the development of democracy.

I believe it is still true that most of us have a liking for the local library (some 16,700 branch libraries in towns and cities in the U.S. and nearly 100,000 in our schools). But Dr. Palfrey warns that nostalgia can only go so far. To date, nostalgia has not prevented the budgets for public and school libraries, in particular, from being reduced to half of what they were a generation ago, in real terms.

Call me a Luddite: I visit the local branch library at least monthly, and usually more frequently. I still prefer reading books printed on paper. I find that I can read from a computer screen, even the most "retina"-sharp, even the latest paper-white Kindle, no more than 10-15 minutes at a stretch. I can read a paper book for a practically unlimited amount of time. Sure, the under-30 crowd can focus on their screen the whole day long, but I suspect that nearly none of them spends long stretches of time reading an e-book. They scatter their attention, "multitasking". The "eye-unfriendly" nature of every kind of screen so far invented probably has something to do with that!

Libraries and librarians do their best work in a network. This has always been so, although there is a strong tendency toward hoarding within any "collecting" profession, including museum collections as well as libraries. I would never have finished graduate school without free InterLibrary Loan (ILL) facilities. If I could learn the title and author of a book, a reference librarian could find it somewhere and have it shipped to the library, or even to me, within a week. The spread of digital technologies makes networking easier than ever. For those who like e-books, ILL can now take no more than a few minutes to find and deliver a document or book…for those items that are still lendable. Chapter 9, on Law, warns that digital works are frequently hard or impossible for a library to loan out. The licensing contracts, and the "Digital Rights Management" technology that enforces them, trump copyright law's "first sale" doctrine, which favors library lending. Lending a hardcopy book is practically free. Lending an e-book often costs some small licensing fee for each loan, and those can add up in a hurry. So while a $25 hardbound book may seem costly compared to the $12 license fee for an e-book (you never BUY an e-book, you license it), the residual lending-licensing fees can make an e-book much, much more costly than the hardback.
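The arithmetic is easy to sketch. The $25 and $12 figures are the ones above; the 50-cent per-loan fee is purely a made-up number for illustration, since actual fees vary by publisher and contract, and the sketch ignores wear and tear on the physical book.

```python
import math


def ebook_total_cost(license_fee, per_loan_fee, loans):
    """Total cost of an e-book after a given number of loans."""
    return license_fee + per_loan_fee * loans


def break_even_loans(hardcover_price, license_fee, per_loan_fee):
    """Smallest number of loans at which the e-book costs more than the hardcover.

    Returns None if there is no per-loan fee, since the e-book's cost then never grows.
    """
    if per_loan_fee <= 0:
        return None
    loans = math.floor((hardcover_price - license_fee) / per_loan_fee) + 1
    return max(0, loans)


if __name__ == "__main__":
    HARDCOVER_PRICE = 25.00  # from the example above
    LICENSE_FEE = 12.00      # from the example above
    PER_LOAN_FEE = 0.50      # hypothetical; not a figure from the book

    n = break_even_loans(HARDCOVER_PRICE, LICENSE_FEE, PER_LOAN_FEE)
    total = ebook_total_cost(LICENSE_FEE, PER_LOAN_FEE, n)
    print(f"After {n} loans the e-book has cost ${total:.2f}")
```

Under that assumed fee, the e-book overtakes the hardback on the 27th loan, and a popular title keeps costing more with every loan after that.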

Though this book has less of a "go do this" character than I expected, it does provide great resources for librarians and those who love libraries, and great ammunition to use at those public meetings where town and county budgets are discussed. The librarians are our friends, and they need our help. Has a library near you closed recently, or is one under threat? Get two copies of this book, one for yourself and one for your friendly local librarian, and then go together to budget discussions and enter into the fray of democracy, not just as a voter but as a participant. Perhaps a future generation will look back and thank us for keeping "physical libraries" from going the way of the Dodo.

