Changing PDF Titles With pdftk

2014/11/26 at 11:00

Have you ever noticed that many PDF converters, or scanning programs, create PDF titles that are anything but meaningful? I’m using XSane for scanning, and all the PDF files get this title:

XSane scanned image

That’s totally meaningless. Others create PDFs from Microsoft Word, and many of those files have Microsoft Word in the title. In most cases the title even begins with Microsoft Word, which makes it hard to spot the document you’re looking for in the window title bar when you have several of them open.

With pdftk (PDF Toolkit), you can fix this easily. I’ve used it only on Linux, but apparently it’s available for other major platforms, too. Be warned that this is a command-line program.

So here’s what I do to change a PDF title.

Here’s a typical “Microsoft Word” PDF file:

Atlas~/private/stefan> l neugier_handout.pdf
 -rw-r--r-- 1 stefan users 531831 25. Okt 2012  neugier_handout.pdf

1. First step is to dump the PDF metadata to a file which I call report.txt:

Atlas~/private/stefan> pdftk neugier_handout.pdf dump_data output report.txt

Here’s what’s in the PDF metadata:

Atlas~/private/stefan> cat report.txt
 InfoBegin
 InfoKey: ModDate
 InfoValue: D:20081229161229+01'00'
 InfoBegin
 InfoKey: CreationDate
 InfoValue: D:20081229161229+01'00'
 InfoBegin
 InfoKey: Author
 InfoValue: Charakterstärke
 InfoBegin
 InfoKey: Title
 InfoValue: Microsoft Word - Neugier_Handout.doc
 InfoBegin
 InfoKey: Creator
 InfoValue: Word
 InfoBegin
 InfoKey: Producer
 InfoValue: Mac OS X 10.4.11 Quartz PDFContext
 PdfID0: 911d0c6f06613f3690fa270fad39d33b
 PdfID1: 911d0c6f06613f3690fa270fad39d33b
 NumberOfPages: 4
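If you only want to check the current title rather than read the whole dump, a small awk filter will do. This is just a sketch: the function name pdf_info_value is made up for this example, and it assumes the dump looks like the one above, with each InfoKey line followed by its InfoValue line.

```shell
# Hypothetical helper: print the InfoValue that belongs to a given InfoKey
# in a pdftk dump file.
# Usage: pdf_info_value KEY DUMPFILE
pdf_info_value() {
    awk -v key="$1" '
        $0 == "InfoKey: " key   { found = 1; next }                       # remember that we saw the key
        found && /^InfoValue: / { sub(/^InfoValue: /, ""); print; exit }  # print its value and stop
        /^InfoBegin/            { found = 0 }                             # a new record starts
    ' "$2"
}
```

With the report.txt shown above, `pdf_info_value Title report.txt` should print the "Microsoft Word - Neugier_Handout.doc" title.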

2. Second step is to edit the metadata file:

Atlas~/private/stefan> vi report.txt

Here’s what I’ve changed. Note that I’ve used 7-bit ASCII, because pdftk doesn’t seem to handle UTF-8 and friends properly:

Atlas~/private/stefan> grep Neugier report.txt
 InfoValue: Neugier - Staerkentraining

3. Third step is to update the metadata in the PDF file. Note that the output must be written to another file — pdftk refuses to overwrite the original file:

Atlas~/private/stefan> pdftk neugier_handout.pdf update_info report.txt output neugier_handout.pdf.copy

4. Last step is to make the copied PDF file the original PDF file:

Atlas~/private/stefan> mv neugier_handout.pdf.copy neugier_handout.pdf

And done. Verify that the title meets your expectations in the PDF viewer of your choice:

Atlas~/private/stefan> okular neugier_handout.pdf

The steps are easily scriptable if you’re so inclined.
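Here is one way such a script could look. Treat it as a sketch under a few assumptions: it is written for bash, the function name pdf_set_title is made up for this example, awk stands in for the manual vi edit, and the intermediate files live in a temporary directory instead of next to the PDF. A PDF whose dump has no Title entry is rewritten without a title change.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set the Title of a PDF with pdftk, following the
# four steps above.
# Usage: pdf_set_title FILE.pdf "New title"
pdf_set_title() {
    local pdf=$1 new_title=$2 tmpdir status=0
    tmpdir=$(mktemp -d) || return 1

    # Step 1: dump the metadata (without "output", dump_data prints to stdout).
    # Step 2: replace the InfoValue line that follows "InfoKey: Title".
    # Step 3: write the updated PDF to a separate file.
    # Step 4: replace the original, but only if all earlier steps succeeded.
    pdftk "$pdf" dump_data > "$tmpdir/report.txt" &&
    NEW_TITLE=$new_title awk '
        /^InfoKey: Title$/         { in_title = 1 }
        in_title && /^InfoValue: / { print "InfoValue: " ENVIRON["NEW_TITLE"]; in_title = 0; next }
        { print }
    ' "$tmpdir/report.txt" > "$tmpdir/edited.txt" &&
    pdftk "$pdf" update_info "$tmpdir/edited.txt" output "$tmpdir/out.pdf" &&
    mv "$tmpdir/out.pdf" "$pdf" || status=1

    rm -rf "$tmpdir"
    return "$status"
}
```

After sourcing the file, `pdf_set_title neugier_handout.pdf 'Neugier - Staerkentraining'` would do the same as the manual steps. The new title is handed to awk through the environment rather than with -v, so backslashes in the title are not interpreted as escape sequences.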
