Date: 2004-06-18
From: "Inker, Evan"
Subject: [hangout] Putting together PDF files
Putting together PDF files
Thursday June 17, 2004 (12:00 PM GMT)
Topics: Utilities, Linux, Software, Operating Systems
By: Scott Nesbitt
There are times when you need to combine multiple files from diverse sources into a single PDF file. In Windows or the MacOS it's easy -- use Adobe Acrobat. Sadly, Adobe hasn't deigned to put out a version of Acrobat for Linux, but there are a number of Linux utilities available that enable you to quickly and efficiently combine PDF files. This article looks at three command line utilities: Ghostscript, joinPDF, and pdfmeld. Each does a good job of combining PDF files, and they all pack some interesting features.
Joining PDFs the Ghostscript way
To use Ghostscript to combine PDF files, type something like the following:
gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=finished.pdf file1.pdf file2.pdf
Unless you're very familiar with Ghostscript, that string of commands won't mean much to you. Here's a quick breakdown:
gs -- starts the Ghostscript program
-dBATCH -- once Ghostscript processes the PDF files, it should exit. If you don't include this option, Ghostscript will just keep running
-dNOPAUSE -- forces Ghostscript to process each page without pausing for user interaction
-q -- stops Ghostscript from displaying messages while it works
-sDEVICE=pdfwrite -- tells Ghostscript to use its built-in PDF writer to process the files
-sOutputFile=finished.pdf -- tells Ghostscript to save the combined PDF file with the name that you specified
When using Ghostscript to combine PDF files, you can add any PDF-related option to the command line. For example, you can compress the file, target it to an eBook reader, or encrypt it. See the Ghostscript documentation for more information.
The biggest advantage to Ghostscript is that it's a standard part of many Linux distributions. If you don't have it on your computer, it's easy to download and install it.
Using Ghostscript has its drawbacks, too. Unless you use Ghostscript's PDF options, the utility produces a barebones merged PDF file, and a large one at that, because by default Ghostscript doesn't compress PDF files. On top of that, some people may find typing long strings of options at the command line to be a bit of a chore.
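One way to take the sting out of both drawbacks is to wrap the command in a small shell function. What follows is only a sketch, not part of Ghostscript: the function name gsmerge and the variables GS and GS_PRESET are made up for this example. It reuses the options explained above and adds -dPDFSETTINGS, a pdfwrite option whose presets (screen, ebook, printer, prepress) trade image quality for file size.

```shell
# Merge PDF files with Ghostscript.
# Usage: gsmerge OUTPUT.pdf INPUT.pdf [INPUT.pdf ...]
#
# Assumptions made for this sketch (not Ghostscript settings):
#   GS         program to run, defaults to "gs"
#   GS_PRESET  -dPDFSETTINGS preset name, defaults to "ebook"
gsmerge() {
    if [ "$#" -lt 2 ]; then
        echo "usage: gsmerge OUTPUT.pdf INPUT.pdf [INPUT.pdf ...]" >&2
        return 2
    fi
    out=$1
    shift   # the remaining arguments are the input files
    "${GS:-gs}" -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite \
        -dPDFSETTINGS="/${GS_PRESET:-ebook}" \
        -sOutputFile="$out" "$@"
}
```

With the function loaded into your shell (from ~/.bashrc, for instance), typing gsmerge finished.pdf file1.pdf file2.pdf should do the same job as the long command shown earlier, with the ebook preset applied. Setting GS_PRESET=screen first should give a smaller, lower-quality file.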
joinPDF: Quick and simple
If you want a no-muss, no-fuss way of joining two or more PDF files together, look no further than joinPDF. It's a simple but elegant little utility that consists of a script (named joinPDF) and a compiled Java file. To run it, you only need to specify at the command line the name of the output file and the files that you want to combine. To use joinPDF you type something like this:
joinPDF myFile.pdf file1.pdf file2.pdf ...
Depending on how many PDF files you're combining and how large they are, joinPDF takes only a few seconds to merge them. JoinPDF compresses the output file it generates; while writing this article, I used joinPDF to merge various combinations of files of various sizes, and each time, the resulting PDF file was several kilobytes to several tens of kilobytes smaller than the combined size of the source files.
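If you want to check the size difference on your own files, a couple of shell functions will do the arithmetic. This is a minimal sketch: the names total_bytes and bytes_saved are invented for this example, and it relies only on the standard wc utility.

```shell
# Print the combined size, in bytes, of the files given as arguments.
total_bytes() {
    total=0
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1   # give up if a file is unreadable
        total=$((total + size))
    done
    echo "$total"
}

# Usage: bytes_saved MERGED.pdf SOURCE.pdf [SOURCE.pdf ...]
# Print how many bytes smaller the merged file is than its sources combined.
# A negative number means the merged file came out larger.
bytes_saved() {
    merged=$1
    shift
    before=$(total_bytes "$@") || return 1
    after=$(total_bytes "$merged") || return 1
    echo $((before - after))
}
```

After a merge, bytes_saved myFile.pdf file1.pdf file2.pdf prints the difference. The same check works on the output of any of the tools in this article.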
JoinPDF is a Java utility -- to use it, you need version 1.4 of the Java Runtime Environment installed. It runs on any Linux distribution, or any other operating system that supports Java. In order to use joinPDF out of the box, you have to copy the Java file to the /usr/lib directory -- that's where the joinPDF script expects to find it. If you want to put the Java file somewhere else, like the /usr/local/bin directory, you need to edit the joinPDF script to point to that directory.
The biggest advantage of joinPDF is its simplicity. There are no options to remember. Of course, some users might find joinPDF's simplicity to be a detriment. If you want options, joinPDF isn't for you. Also, joinPDF cannot join PDFs if one or more of them is encrypted.
The joinPDF package comes with another script called splitPDF. As its name implies, splitPDF is used to extract pages from PDF files. A discussion of splitPDF is beyond the scope of this article, but if you need to pull pages out of your PDF files, you'll find splitPDF useful.
Merging PDF files with pdfmeld
Do you need a lot of features in the software that you use to combine your PDF files? Then consider pdfmeld. Of the three applications discussed in this article, pdfmeld is probably the most powerful and flexible.
To use pdfmeld you type something like this at the command line:
pdfmeld file1.pdf,file2.pdf,... result.pdf [options]
pdfmeld has literally dozens of options -- for a full list, check out the documentation. These options include adding bookmarks to a PDF file, encrypting the PDF file, and adding information like title, author name, and subject. While it sounds complex and difficult to use, pdfmeld really isn't. You'll quickly find that you'll only use a handful of the options regularly, and you can forget about the rest.
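Unlike the other two tools, pdfmeld takes its input files as one comma-separated list, so a shell wildcard can't be passed to it directly. A small helper can build the list for you. This is a sketch only: join_commas is a name made up for this example, and it won't cope with file names that themselves contain commas.

```shell
# Join all arguments into a single comma-separated string.
# The function body is wrapped in parentheses so that it runs in a
# subshell and the IFS change does not leak into the calling shell.
join_commas() (
    IFS=,
    printf '%s\n' "$*"
)
```

Assuming a set of files named chapter1.pdf, chapter2.pdf and so on (hypothetical names), pdfmeld "$(join_commas chapter*.pdf)" result.pdf hands them all to pdfmeld in the order the shell sorts them.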
pdfmeld doesn't just combine PDF files. You can use it to extract pages from a PDF file, rearrange the pages in a file, rotate pages, and even touch up text. In fact, pdfmeld packs many of the features of Adobe Acrobat in a package that weighs in at just over 1 MB.
pdfmeld's range of options is its greatest strength. But it comes at a price, albeit a small one -- $9.95. Like joinPDF, pdfmeld automatically compresses the resulting file. It's also very fast: it only took a few seconds to mash three 20-page PDF files together on my old 300MHz Linux box.
I found very little wrong with pdfmeld. One problem that I did encounter, that I didn't see with Ghostscript or joinPDF, was the error message "Page Contents Object has Wrong Type" when I tried to open a merged PDF file in Acrobat Reader. This happens when an empty page contains contents information. This only happened twice, when I added a cover followed by a blank page to a particular document.
Other tools
These three applications aren't your only choices. Some of the other tools available for merging PDF files include pdftk, Multivalent, and pdcat. I briefly looked at pdftk and Multivalent (pdcat is a commercial product), and found them to be solid applications.
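For comparison, pdftk merges files with its cat operation, and it puts the output name last rather than first. The wrapper below is a sketch in the same spirit as the commands above; the function name pdftkmerge and the PDFTK variable are invented for this example.

```shell
# Merge PDF files with pdftk.
# Usage: pdftkmerge OUTPUT.pdf INPUT.pdf [INPUT.pdf ...]
# PDFTK is an assumption of this sketch: the program to run (default "pdftk").
pdftkmerge() {
    if [ "$#" -lt 2 ]; then
        echo "usage: pdftkmerge OUTPUT.pdf INPUT.pdf [INPUT.pdf ...]" >&2
        return 2
    fi
    out=$1
    shift   # the remaining arguments are the input files
    "${PDFTK:-pdftk}" "$@" cat output "$out"
}
```

Typed by hand, the equivalent command is pdftk file1.pdf file2.pdf cat output finished.pdf.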
So, which utility comes out on top? Just for its sheer number of features, you should give pdfmeld a serious look. While some people might balk at dropping $9.95 for software that does pretty much the same thing that Ghostscript does, I think the price is well worth it. Of course, being a long-time Ghostscript user I still have a soft spot for it. But typing those long strings of options really wears me down after a while. And joinPDF is perfect if you want to get the job done quickly and easily.
If you're adamant about using only free software, then go with Ghostscript or joinPDF. But if you can afford to drop 10 bucks, you'll find that pdfmeld is a great little application that can handle all of your PDF merging needs and then some.
Scott Nesbitt is a Toronto, Canada-based writer and the Toronto managing editor for the ScalableAir Network.