In last week's post, I explained how LaTeX is my tool of choice for all forms of writing, and I shared a tip on using it to combine multiple pdf files into one, regardless of whether any of them were originally produced with LaTeX. In this post, I provide another LaTeX-based tip for manipulating pdf files, which likewise works no matter how the pdf files were originally produced. This tip concerns adding or changing the metadata embedded within a pdf, which may also be relevant to web developers who include some of their site's content as pdf files (e.g., search engines do crawl and index the content of pdfs, including any embedded metadata).
So you have a pdf file that you want to add metadata to (e.g., author, title, subject, keywords) and for some reason you don't have an easy way to do so from the original source of the pdf. Here's an easy way to do this with pdfLaTeX. If you don't use LaTeX, don't worry about it. You don't really need to know LaTeX to use this trick, and I have a repository on GitHub with a LaTeX file you can edit with the details of the pdf and the metadata that you want to add to it. It doesn't matter how the original pdf was produced.
Table of Contents:
- How to Add Metadata to a PDF with pdfLaTeX
- Adding Metadata to a Combination of Multiple PDFs
- GitHub Repository with the full example that you can download and edit as needed
- Where You Can Find Me
How to Add Metadata to a PDF with pdfLaTeX
Here are the steps to adding metadata to an existing pdf using pdfLaTeX.
Step 0: Install a LaTeX Distribution
If you don't have LaTeX installed on your system already, then you'll need to begin by installing a LaTeX distribution. For example, TeX Live is a good choice.
Step 1: Create a LaTeX Source File
Create a LaTeX source file with a tex extension, but name it differently than the pdf you are adding metadata to. I'll assume the name metadata.tex in the example.
In that metadata.tex file (or whatever you named it), add the following with your favorite text editor.
\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
\usepackage[pdftex,
pdfauthor={Your name possibly with coauthors goes here},
pdftitle={Your title goes here},
pdfsubject={Anything you want in the subject field goes here},
pdfkeywords={Your keywords go here},
pdfproducer={pdflatex or whatever you want for producer},
pdfcreator={pdflatex or whatever you want for creator}]{hyperref}
\pagestyle{empty}
\begin{document}
\includepdf[pages=-]{originalFile.pdf}
\end{document}
In the above example, we're using LaTeX's hyperref package, which has options for specifying the metadata of the pdf. The pdftex option tells hyperref to use its pdfTeX driver, which is the one that goes with pdfLaTeX (recent versions of hyperref can usually detect this on their own, so stating it explicitly is harmless). After that, we can set any or all of the metadata fields inside the pdf as shown above. In the statement \includepdf[pages=-]{originalFile.pdf}, the pages=- option includes every page of the original, and you need to change originalFile.pdf to the name of your original pdf.
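The same metadata fields can also be set with hyperref's \hypersetup command rather than as package options. The following is a minimal sketch of that variant, not the version in my repository. It assumes a reasonably recent hyperref that detects the pdfTeX driver automatically, which is why the pdftex option is left out. If your installation complains, put the option back as in the example above. The filename originalFile.pdf is the same placeholder as before.

```latex
\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
% Load hyperref without options (assumes the driver is auto-detected),
% then set the metadata fields separately with \hypersetup.
\usepackage{hyperref}
\hypersetup{
  pdfauthor={Your name possibly with coauthors goes here},
  pdftitle={Your title goes here},
  pdfsubject={Anything you want in the subject field goes here},
  pdfkeywords={Your keywords go here}
}
\pagestyle{empty}
\begin{document}
% Placeholder filename: change it to the name of your original pdf.
\includepdf[pages=-]{originalFile.pdf}
\end{document}
```

Keeping the metadata in its own \hypersetup block can make it a little easier to edit for each new pdf without touching the \usepackage line.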
Step 2: Run pdfLaTeX
You can now use pdfLaTeX to create a pdf with the contents of your original pdf but with your additional metadata. At the command line, in the directory containing the LaTeX source file you created above and your existing pdf, run the following (change metadata.tex to whatever filename you used above):
pdflatex metadata.tex
This will produce a pdf named metadata.pdf, which you can easily rename as required. Alternatively, you can name the tex file after your desired target file from the start, so that the output already has the name you want.
Adding Metadata to a Combination of Multiple PDFs
If you want to add metadata while combining multiple pdf files into one, you can combine the above trick for the metadata with the trick from my previous post on using pdfLaTeX to combine multiple pdfs:
Combine Multiple PDF Files Into One Using pdfLaTeX
Vincent A. Cicirello ・ Sep 14 '22
For example, your tex file might look something like the following:
\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
\usepackage[pdftex,
pdfauthor={Your name possibly with coauthors goes here},
pdftitle={Your title goes here},
pdfsubject={Anything you want in the subject field goes here},
pdfkeywords={Your keywords go here},
pdfproducer={pdflatex or whatever you want for producer},
pdfcreator={pdflatex or whatever you want for creator}]{hyperref}
\pagestyle{empty}
\begin{document}
\includepdf[pages=-]{file1.pdf}
\includepdf[pages=-]{file2.pdf}
\includepdf[pages=-]{file3.pdf}
\end{document}
The above assumes you are combining the pdfs in their entirety. You can of course also specify page ranges as needed. Finally, run the following command at the command line to generate your combined pdf with your desired metadata (just change the name of the tex file to whatever you named the file):
pdflatex metadata.tex
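As for the page ranges mentioned above, here is a short sketch of what they can look like with the pages option of pdfpages. The filenames and the specific page numbers are placeholders chosen for illustration, and the hyperref metadata lines are left out to keep the sketch short. Add them to the preamble exactly as in the previous example.

```latex
\documentclass[11pt,letterpaper]{article}
\usepackage[final]{pdfpages}
\pagestyle{empty}
\begin{document}
% Pages 1 through 3 of the first file.
\includepdf[pages={1-3}]{file1.pdf}
% Page 2, followed by pages 5 through 7, of the second file.
\includepdf[pages={2,5-7}]{file2.pdf}
% Page 4 through the last page of the third file.
\includepdf[pages={4-}]{file3.pdf}
\end{document}
```

Wrapping the page list in braces keeps the commas inside it from being read as separators between \includepdf options.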
GitHub Repository
To get you started, I have a GitHub repository with a LaTeX file that you can download and edit with the details of your pdf and the metadata that you want to add to it.
cicirello / add-pdf-metadata
Add metadata to a pdf using pdflatex regardless of how the original pdf was produced. Here are the steps:
- Make sure you have an up to date LaTeX system installed such as TeX Live.
- Read the comments in the file AddMetadataToPdf.tex.
- Edit the line in that file where indicated with the name of the source pdf that you want to add metadata to.
- Run pdflatex AddMetadataToPdf.tex at the command line, which will produce a file named AddMetadataToPdf.pdf with the contents of the original pdf file, but with the addition of your specified metadata.
- Change the name of the original pdf if you want to keep it as a backup, or delete the original if you don't.
- Rename AddMetadataToPdf.pdf to the name of the original pdf file.
- Alternatively, you could rename the original pdf before the above procedure, and then rename AddMetadataToPdf.tex based on how you want the…
Where You Can Find Me
Follow me here on DEV:
Follow me on GitHub:
Vincent A Cicirello
View My Detailed GitHub Activity
If you want to generate the equivalent to the above for your own GitHub profile, check out the cicirello/user-statistician GitHub Action.
Or visit my website:
Top comments (2)
I'm new to LaTeX and running into difficulty getting started with your steps using hyperref with pdfLaTeX. Any advice on what I am missing?
What kind of difficulty? Are you getting an error message? If so, what's the error?