Tuesday, May 12, 2009

kw: musings, technology, computers

On a colleague's bookshelf I saw a copy of Managing Gigabytes, by Ian H. Witten, Alistair Moffat, and Timothy C. Bell, and I couldn't resist a look inside. Published in 1994, the book is primarily about data compression techniques and the indexing of large data collections. Just fifteen years ago, a few Gbytes was "large". It got me thinking.

In 1994 the PC I was using had a 500 Megabyte hard drive. A few years later I added a second drive with 2.5 Gigabytes (the current version of Windows won't fit on a drive that small). Some time ago I removed the hard drives and consigned the rest of the computer to a recycle bin. The home computers I have used line up like this:
  • 1979-1980 - Tandy TRS-80 that used an audio tape for data storage. It belonged to a student who worked for me. I used it as a virtual terminal, connected to a projection TV, to teach FORTRAN programming. It took about a minute to load the program from the audio tape. I don't know how much data the tape could store. A few dozen Kbytes, most likely. At that time, the mainframe computer that I used at the university had two disk drives the size of washing machines that held 100 Mbytes each. A third drive used removable disk packs that could hold 50 Mbytes. I still have one in a closet.
  • 1981-1987 - The first computer I owned was a Texas Instruments TI-Pro. It had two 5¼" floppy disk drives. Each held 360 Kbytes. A 10-Mbyte hard disk drive would have cost another $1000, so I passed on that option. I would put a disk with WordPerfect in one drive and a data disk in the other to hold my word processing files. I followed a similar procedure for any program I wished to run. I had a large box full of software disks. During this time the university got a new mainframe with five disk drives that held 640 Mbytes each. This was a 3.2 Gbyte data store! In 1987 I gave the TI-Pro to my dad so he could learn to use e-mail and basic word processing. He used it for about five years.
  • 1987-1995 - I got an Acer "PC Compatible" that had a turbo mode of 10 MHz. Its disk drive was 40 Mbytes. This one has also been sent for recycling, after removal of the disk. I never recycle those! At this time, I was working for an oil company as a supercomputer systems analyst, and they had a huge data center because of the data-intensive seismic processing they did: hundreds of tapes and dozens of large disk drives. This was the first multi-Gigabyte datastore that I ever managed, starting eight years before Managing Gigabytes was first published.
  • Shortly after moving East we bought a computer for my wife, an HP with a 30 Gbyte disk.
  • In 1999 I replaced the Acer with a Dell that had a 40 Gbyte disk. The HP fell out of use about 2003, though we still use it on rare occasions. I still use the Dell, though it takes considerable management to keep it running. There are so many patches to that version of Windows XP that it runs quite slowly.
  • In 2007 I bought a 120 Gbyte external disk for the Dell, to which I moved the larger datasets, such as my 6 Gbyte music collection and about 7 Gbytes of photos. I do not use mainframe computers at work any more. My work PC is a laptop with an 80 Gbyte disk, which is now four years old. It accesses about a Tbyte of networked disks, where I keep all the data files for my job (a few hundred of us share the main "personal" drives).
  • I recently bought a Lenovo laptop, on which I'm writing this. Its disk is 160 Gbytes. My son has a laptop with a slightly smaller disk, which we got him for college a couple of years ago. He has bought a 500 Gbyte external drive for it. His music collection is much larger than mine. These days a 1 Tbyte disk costs less than $100…
A quick search online located numerous web sites with articles titled "Managing Terabytes" (mostly dated around 2002) and "Managing Petabytes"; a Pbyte is 1000 (or 1024) Tbytes. The next prefixed level is Exabyte (Ebyte). The Google Earth photo archive is probably a large fraction of an Ebyte. I don't know if it is the largest private data store on Earth, but I think it likely.

Only one entity is likely to have more data on hand than Google. We used to joke that if an infinite-capacity "God-disk" were ever invented, the government would order two of them. These days, many large corporations, Google in the forefront, would also be lining up to buy rooms full of God-disks.
