Monday, September 15, 2008

Making that picture perfect . . . in size

kw: methods, instruction, image optimization

This tip is too good to pass up. There is a little, unobtrusive icon in the Picture toolbar in the Microsoft Office products, which I only noticed recently. Though I've been using Office 2003 for five years, it simply escaped my notice. When your mouse cursor hovers over the icon I have highlighted here (the tenth), a hint pops up: "Compress Pictures".

In the past, based on advice I received a decade ago or more, I'd pre-optimize all the images I was going to use in a PowerPoint or Word document by changing to a resolution consistent with the final purpose (print or projection) and using either GIF or JPG compression to minimize the number of bits needed to store the image once it was embedded.

As a side note for those who need it: GIF uses lossless LZW compression with a palette of at most 256 colors, so it is best for images with large areas of flat color, such as screen shots or graphics produced by Visio or a CAD program. JPG uses a lossy method based on the discrete cosine transform (a relative of the Fourier series), which drops more detail as you choose a smaller "quality" setting. For photographs, a "quality=90%" JPG is usually hard to tell apart from an uncompressed (BMP, for example) image. For most screen shots the GIF version will be smaller, while for most photographs the JPG will be smaller.
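To see why flat graphics shrink so much more than photographic detail under lossless compression, here is a small sketch. It uses Python's zlib (DEFLATE) purely as a stand-in, which is my assumption for illustration: it is not the algorithm GIF itself uses, but it shows the same effect. The two "images" are made-up byte strings, one byte per pixel, not real pictures.

```python
import random
import zlib

# Hypothetical stand-ins for image data: one byte per pixel, 416x63 pixels.
WIDTH, HEIGHT = 416, 63


def flat_image() -> bytes:
    """Screen-shot-like data: long runs of just two gray levels."""
    row = bytes([200] * 300 + [40] * 116)
    return row * HEIGHT


def noisy_image(seed: int = 1) -> bytes:
    """Photo-like data: every pixel is unrelated to its neighbors."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))


def compressed_size(data: bytes) -> int:
    """Size in bytes after lossless DEFLATE compression at the highest level."""
    return len(zlib.compress(data, 9))


flat_size = compressed_size(flat_image())
noisy_size = compressed_size(noisy_image())
print(f"raw: {WIDTH * HEIGHT} bytes, flat: {flat_size}, noisy: {noisy_size}")
```

The flat data collapses to a tiny fraction of its raw size, while the noisy data barely shrinks at all. That is the reason lossy JPG wins for photographs.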

The screen shot above is 416x63 pixels, which would take about 26 Kbytes to store as an 8-bit BMP file and three times that much for a 24-bit BMP. As a JPG file it is 10 Kbytes, and as a GIF (which I use here) it is 7 Kbytes. The GIF file for the image below has even better compression, because there is less detail: an 8-bit BMP would require 145 Kbytes, but the GIF size is 11 Kbytes.
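The BMP figures are simple arithmetic, sketched here for anyone who wants to check their own images. The function name is mine, and it counts only the pixel data, ignoring the small header and color palette a real BMP file adds.

```python
def raw_kbytes(width: int, height: int, bits_per_pixel: int) -> float:
    """Uncompressed pixel data in Kbytes (1 Kbyte = 1000 bytes), headers ignored."""
    return width * height * bits_per_pixel / 8 / 1000


# The 416x63 screen shot at 8 and 24 bits per pixel.
size_8bit = raw_kbytes(416, 63, 8)
size_24bit = raw_kbytes(416, 63, 24)
print(f"8-bit: {size_8bit:.1f} Kbytes, 24-bit: {size_24bit:.1f} Kbytes")
```

That gives roughly 26 Kbytes for the 8-bit version and three times as much for the 24-bit version, so the 7 Kbyte GIF is a bit under a quarter of the 8-bit BMP.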

I couldn't resist trying out the idea with a slide show I was planning. I started with images from my digital camera, some cropped. The uncropped ones were about 1.5 Mbytes each and, at 3000x2000 pixels, much larger than needed for projection or printing once embedded in PPT slides. Even the smallest crops were close to a Mbyte each. The finished file was 27 Mbytes, yet it has only 25 slides! I first made a print version. Clicking on the Compress Pictures icon yielded this dialog box:

The default setting is for Print resolution. I changed only the first item, selecting "All pictures in document", and left the resolution at Print (200 dpi). I saved the result with a modified file name.

An image that fits nicely on a titled PPT slide is about 4x6 inches (around 10x15 cm), so at 200 dpi such a picture needs only 800x1200 pixels. That alone reduces the number of pixels in a 3000x2000 original by a factor of 6.25. The resulting file's size is 4.5 Mbytes, about 1/6th the size of the original file.

Then I redid the compression using Web/Screen resolution, which is 96 dpi. The number of pixels to be stored for a full-size image is now about 1/27th of the original. Saving with yet another file name, I saw that the file size is now 1.3 Mbytes. This file, when projected, looks as good as the Print-resolution version (and the original), though there is a visible difference when pages from the two files are printed.
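The pixel arithmetic for both settings can be checked with a few lines. This is only a sketch: the function name is mine, and it assumes a picture placed at 4x6 inches and an uncropped 3000x2000 original.

```python
def pixels_needed(width_in: float, height_in: float, dpi: int) -> int:
    """Pixels required to fill a width x height area (in inches) at a given dpi."""
    return round(width_in * dpi) * round(height_in * dpi)


ORIGINAL_PIXELS = 3000 * 2000  # an uncropped camera image

# Reduction factor for each Compress Pictures resolution setting.
factors = {}
for label, dpi in (("Print", 200), ("Web/Screen", 96)):
    needed = pixels_needed(4, 6, dpi)
    factors[label] = ORIGINAL_PIXELS / needed
    print(f"{label} ({dpi} dpi): {needed} pixels, reduction factor {factors[label]:.2f}")
```

The file sizes shrink by a bit less than the pixel counts do (27 Mbytes to 4.5 and then 1.3), partly because some of my pictures were already cropped.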

Next I tried this method with a large Word file I've recently been asked to edit at work. It is humongous: 100 Mbytes, yet it is only 54 pages long and has 57 screen shots. Looking at them, I could see that they were pasted in, and had probably not been saved first as a file or otherwise optimized. They are mostly 1024x768, 24-bit screen shots. I tried Compress Pictures with Print resolution (the document is intended for printed use). Guess what? No help whatever. The new file was just as large as before!

I tried another method I'd heard of for finding out which images in a Word file are too large. I saved it as HTML. Amazingly, all the images were modestly sized. None were larger than 150 Kbytes. So I did the dumb-luck thing and re-opened the HTML using Word, and saved the whole mess as a Word file. The new file is 2.6 Mbytes!

When a Word (or other Office) file is saved as HTML, all embedded objects are saved as separate files in a subdirectory with a name derived from the document's name. That is where the images were put. Some were GIF, most were PNG (similar to GIF, but some files compress better that way), and a few were JPG. The HTML conversion does optimization of its own. In this case, the final compression factor is about 40 (100 Mbytes down to 2.6 Mbytes). I don't know why. I don't plan to go to the work to figure it out, either. That'd be looking a gift horse in the mouth.
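If you want to use the HTML trick for its original purpose, finding the oversized images, a short script can list the files in that subdirectory from largest to smallest. This is a minimal sketch: the function name is mine, and you have to pass in whatever folder Word actually created next to the HTML file.

```python
from pathlib import Path

# The image types I found in the exported folder.
IMAGE_SUFFIXES = {".gif", ".png", ".jpg", ".jpeg"}


def images_by_size(folder: str) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for each image in folder, largest first."""
    found = [
        (path.name, path.stat().st_size)
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    ]
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling images_by_size on the exported folder and printing the first few entries shows at a glance whether any single picture is responsible for a bloated document.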

I have since used Compress Pictures to save space in several files, and I'll keep the to-and-from-HTML method in reserve in case I get another pathological file.
