EDIT: I’ve been reminded that this only works on Windows (or MS-DOS, anyway) since it uses .bat files. If you’re on another OS, the suggestion is to use PHP (though really any scripting language will do) to automate the command.
I’m sure everyone is familiar with Adobe Acrobat (even if they haven’t actually used it). It’s a nice GUI if you want to edit PDFs, but as far as I know, it doesn’t do any batch or automation work. A digital images project involves a lot of automation. For image-to-image conversion I was using Photoshop, but then I started dealing with PDFs, so it was only natural to turn to Ghostscript.
PDF to Image
I don’t really get any credit for this, because it’s already out there and the variables are well explained. If you want to turn all the pages of your PDF into images, check out this Danzels Internets post. My case was a little different: I only wanted the first page turned into an image, as a thumbnail, first for a single file and then for an entire folder. I also prefer to do any image modification (even batch) in an image program.
FOR %%Z IN (*.pdf) DO gswin32 -sDEVICE=jpeg -dJPEGQ=95 -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -dDOINTERPOLATE -dFirstPage=1 -dLastPage=1 -sOUTPUTFILE="%%Z.jpg" -dSAFER -dBATCH -dNOPAUSE "%%Z"
The major changes here are “gswin32”, because I use the Windows version, and “-dFirstPage=1 -dLastPage=1”, so that the first and last page it processes is page 1. You can change the output file name too; I set it so that it takes the original file name and adds the .jpg extension (so file.pdf becomes file.pdf.jpg).
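For anyone not on Windows, the same first-page loop can be roughly sketched in Python. This is only an approximation under a few assumptions: the Ghostscript executable is called `gs` (on Windows it would be `gswin32`, as above), and the helper names `thumbnail_command` and `make_thumbnails` are made up for illustration.

```python
import subprocess
from pathlib import Path


def thumbnail_command(pdf, gs="gs"):
    """Build the Ghostscript call that renders page 1 of `pdf` as a JPEG.

    `gs` is an assumed executable name; pass "gswin32" on Windows.
    """
    pdf = Path(pdf)
    # Same naming as the .bat version: report.pdf -> report.pdf.jpg
    output = pdf.with_name(pdf.name + ".jpg")
    return [
        gs,
        "-sDEVICE=jpeg", "-dJPEGQ=95",
        "-dGraphicsAlphaBits=4", "-dTextAlphaBits=4",
        "-dDOINTERPOLATE",
        "-dFirstPage=1", "-dLastPage=1",
        f"-sOutputFile={output}",
        "-dSAFER", "-dBATCH", "-dNOPAUSE",
        str(pdf),
    ]


def make_thumbnails(folder=".", gs="gs"):
    """Run the command above for every PDF in `folder` (hypothetical helper)."""
    for pdf in sorted(Path(folder).glob("*.pdf")):
        # Passing a list avoids quoting problems with spaces in file names.
        subprocess.run(thumbnail_command(pdf, gs), check=True)
```

Calling `make_thumbnails()` from the folder that holds the PDFs should behave like the .bat loop, provided Ghostscript is installed and on the path.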
PDF Merge
This is kind of a side note, because I didn’t need it for my project, but I recently downloaded some articles that for some reason had each section in a separate PDF. I get no credit for this one either, as I got it from Real’s How-to on Merging PDFs. I include it here only to suggest possible improvements to what’s presented on that site.
To merge the PDFs in a directory, [merge.bat] is supposed to contain this code:
gswin32 -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=merged.pdf -dBATCH 1.pdf
FOR %%Z IN (*.pdf) DO IF NOT %%Z==1.pdf IF NOT %%Z==merged.pdf IF NOT %%Z==merged2.pdf call merge2.bat %%Z
Maybe it’s clear to other people, but “1.pdf” is the name of the first PDF, so you need to change it (in both lines) to the name of your own first file. I found that the subsequent ones are added in alphanumeric order. If you leave the code unchanged and there is no 1.pdf, it will throw an error and insert a blank page at the beginning.
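If the first file doesn’t have a fixed name, one way around the hard-coded “1.pdf” is to hand every PDF to a single Ghostscript call in sorted order. The Python sketch below does that; it is not the two-.bat approach from Real’s How-to, just an alternative under stated assumptions. The executable name `gs` and the helper names `merge_command`, `pdfs_to_merge`, and `merge_folder` are my own placeholders.

```python
import subprocess
from pathlib import Path


def merge_command(pdfs, output="merged.pdf", gs="gs"):
    """Build one Ghostscript call that concatenates `pdfs` in the given order."""
    return [
        gs,
        "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={output}",
        *[str(p) for p in pdfs],
    ]


def pdfs_to_merge(folder=".", output="merged.pdf"):
    """List the PDFs in `folder` in name order, skipping the output file."""
    skip = Path(output).name.lower()
    return sorted(
        p for p in Path(folder).glob("*.pdf")
        if p.name.lower() != skip
    )


def merge_folder(folder=".", output="merged.pdf", gs="gs"):
    """Merge every PDF in `folder` into `output` (hypothetical helper)."""
    pdfs = pdfs_to_merge(folder, output)
    if pdfs:  # nothing to do for an empty folder
        subprocess.run(merge_command(pdfs, output, gs), check=True)
```

Because the sort is by name, “10.pdf” comes before “2.pdf”; zero-padding the file names (01.pdf, 02.pdf, ...) keeps the sections in the intended order.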
2 thoughts on “PDF Batch Automation (PDF to Image and PDF Merge)”
I receive multiple PDFs daily and save them on my C drive, “C\files\input\pdf\”, and I’m trying to combine them into one PDF file output to “C\files\pdf\output”. I sometimes receive 10 to 20 at a time with different file names. Do you know of a batch file (.bat) that would do that for me? I’m looking to schedule this .bat script to run automatically in the Task Scheduler.
I tried your gs script above but was unable to get it to work.
Thanks for your comment. If you’re looking to merge files, then you’ll need to actually go look at the link I referred to. What I posted is only one file of the code, but there are in fact two files needed. Mostly, I was commenting on what needs to be changed if you’re trying to use the code. For your purposes, since the first file will always have a different name, you’ll probably have to play around and combine the two parts of the code into one .bat file instead.