The "Portable Document Format" is almost good, in that it rather neatly achieves its crucial mission of providing a way to produce documents, including text formatting and images, in a manner independent of application software, hardware, and operating systems. It is not perfect but it works well enough. Nowadays, one even generates pdf directly with pdflatex, and things like "dvipdf -sPAPERSIZE="a4"" are bad memories of the past.
Here's an example of a pdf: Self-Interfering Wave Packets. D. Colas and F.P. Laussy in Phys. Rev. Lett. 116:026401 (2016). (using template {{pdf|File:colas16a.pdf}} to produce the on this web.)
pdftk is a great command-line tool to handle pdf.
pdftk figures.pdf burst
pdftk *.pdf cat output combined.pdf
If PDF is electronic paper, then pdftk is an electronic staple-remover, hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a command-line tool for doing everyday things with PDF documents. Keep one in the top drawer of your desktop and use it to:
Merge PDF Documents
Split PDF Pages into a New Document
Decrypt Input as Necessary (Password Required)
Encrypt Output as Desired
Fill PDF Forms with FDF Data and/or Flatten Forms
Apply a Background Watermark
Report on PDF Metrics such as Metadata, Bookmarks, and Page Labels
Update PDF Metadata
Attach Files to PDF Pages or the PDF Document
Unpack PDF Attachments
Burst a PDF Document into Single Pages
Uncompress and Re-Compress Page Streams
Repair Corrupted PDF (Where Possible)
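A few of the operations listed above can be sketched as small shell functions. The function names (pdf_pages, pdf_info, pdf_watermark, pdf_decrypt) are made up for this page and are not part of pdftk; only the operations they wrap (cat, dump_data, background, input_pw) come from the pdftk manual.

```shell
# Hypothetical helper names; only the pdftk operations themselves are standard.

# Copy a page range (e.g. 2-5) of a pdf into a new document.
pdf_pages() { pdftk "$1" cat "$2" output "$3"; }

# Report metadata, bookmarks and page labels on standard output.
pdf_info() { pdftk "$1" dump_data; }

# Put the first page of the pdf $2 behind every page of $1.
pdf_watermark() { pdftk "$1" background "$2" output "$3"; }

# Decrypt a protected pdf; the password ($2) is required.
pdf_decrypt() { pdftk "$1" input_pw "$2" output "$3"; }
```

For instance, pdf_pages figures.pdf 2-5 excerpt.pdf writes pages 2 to 5 of figures.pdf into excerpt.pdf.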
Note that pdfshuffler handles the most common of these operations (merging, splitting, rotating and rearranging pages) with a GUI, which can be much more agreeable and/or convenient to use.
ImageMagick does a poor job of converting images to pdf. Use sam2p (available through apt-get) or img2pdf.
The following creates a single (combined) pdf out of all images in the folder:
img2pdf *.jpg --output combined.pdf
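pdftoppm works well to convert a pdf into images. Options go before the file names, and the second name is the root of the output files (one file per page, with the page number appended to the root):
pdftoppm -png input.pdf output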
So to convert all files in the directory:
for f in *.pdf; do pdftoppm -png "$f" "${f%.*}"; done
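The same loop can be wrapped in a function that also sets the resolution. This is only a sketch: the name pdfs_to_png is made up for this page, while -png and -r (resolution in DPI) are regular pdftoppm options.

```shell
# Hypothetical helper: render each pdf given as argument to PNG files.
# Usage: pdfs_to_png DPI file1.pdf [file2.pdf ...]
pdfs_to_png() {
  dpi=$1; shift
  for f in "$@"; do
    # "${f%.*}" strips only the last extension (my.paper.pdf -> my.paper),
    # whereas "${f%%.*}" would cut at the first dot (my.paper.pdf -> my).
    pdftoppm -png -r "$dpi" "$f" "${f%.*}"
  done
}
```

For instance, pdfs_to_png 150 *.pdf renders all the pdf files of the folder at 150 DPI, one PNG per page.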
ImageMagick's convert works sometimes.
convert file.pdf fig.png
creating one image file per page of the pdf. Quality can be poor and transparency is preserved (which is often unwanted), so these options can be used:
convert -alpha off -density 150 file.pdf -quality 90 fig.png
Still some pages may not be exported well. This online tool works better.
To add a border:
for f in *.png; do convert "$f" -bordercolor black -border 2x2 "${f%.*}-border.png"; done
pdfimages file.pdf fig
will extract images from file.pdf as ppm (or in another format, using an option) with fig as the prefix of the file names. There is one file per image (it possibly does not find all of them).
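Two other pdfimages options are worth a sketch, assuming a reasonably recent poppler: -list reports the images without extracting them, and -all writes each image in its native format rather than ppm. The function names below are made up for this page.

```shell
# Hypothetical helper names; -list and -all are pdfimages options
# (they may be missing from old versions of poppler).

# List the images of a pdf (page, size, encoding...) without extracting them.
pdf_list_images() { pdfimages -list "$1"; }

# Extract all images in their native format, with $2 as file name prefix.
pdf_extract_images() { pdfimages -all "$1" "$2"; }
```

Listing first is a quick way to check whether pdfimages sees the images at all before extracting them.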
A particular problem with pdf is that of "Embedded fonts".
Font embedding refers to the fact that fonts are part of the pdf document, so that no local copy is assumed on the machine where the document is processed (viewed, modified, etc.). In principle, fonts should always be embedded (although they tend to make the document larger). Typically, they are not. And then good luck finding them online...
Subset embedding embeds only the characters that are actually used: this allows one to view the document, but not to change (edit) it.
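Whether fonts are embedded or subset can be checked with pdffonts (from poppler), whose emb and sub columns answer "yes" or "no" for each font. The sketch below counts the fonts that are not embedded; the function names are made up for this page, and the awk line assumes the usual pdffonts layout (two header lines, then emb, sub, uni and the two-number object ID as the last columns). The second function asks Ghostscript to embed all fonts, which can only work for fonts that Ghostscript is able to find or substitute.

```shell
# Hypothetical helper names; pdffonts and gs are the actual tools.

# Count the fonts of a pdf that are NOT embedded.
# Columns are counted from the right because font names and types may
# contain spaces: ... emb sub uni object ID  ->  emb is field NF-4.
pdf_missing_fonts() {
  pdffonts "$1" | awk 'NR > 2 && $(NF-4) == "no" { n++ } END { print n + 0 }'
}

# Rewrite a pdf asking Ghostscript to embed every font it can.
pdf_embed_fonts() {
  gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress \
     -dEmbedAllFonts=true -sOutputFile="$2" "$1"
}
```

A result of 0 from pdf_missing_fonts file.pdf means that every font is embedded (possibly as a subset).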
Old versions of Mathematica exported eps files without the fonts embedded (namely the mathfonts). See [1] for some background. The remedy is to run the program emmathfnt. dvipdf might still complain, but the fonts are apparently embedded.
To change metadata, use exiftool [2]. The title can be changed with our homemade script titlemypdf.
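As an illustration, a minimal sketch of reading and writing the title and author with exiftool. This is not the titlemypdf script; the function names are made up for this page, while -Title and -Author are regular exiftool tag options. By default, exiftool keeps a backup of the original file with the suffix _original.

```shell
# Hypothetical helper names; exiftool is the actual tool.

# Show the title and author of a pdf.
pdf_show_meta() { exiftool -Title -Author "$1"; }

# Set the title ($2) and author ($3) of a pdf ($1).
pdf_set_meta() { exiftool -Title="$2" -Author="$3" "$1"; }
```

For instance, pdf_set_meta colas16a.pdf "Self-Interfering Wave Packets" "D. Colas and F.P. Laussy".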
A possibility to include an animation is to use pdflatex.
The source is (see file:///home/laussy/conf/2008/3--ICSCE4/3--animation):
\documentclass{article}
\usepackage{hyperref}
\usepackage[pdftex]{graphicx}
\begin{document}
\title{Test animation}
\author{\href{http://laussy.org}{F.P. Laussy}}
\maketitle
\href{run:movie.mpeg}{\includegraphics{screenshot}}
\end{document}
Then run (in a console)
pdflatex anim.tex
Clicking on the image will run the animation externally.
There are various tools to reduce the size of a pdf, including online ones. The simplest command-line one is:
ps2pdf input.pdf output.pdf
which apparently doesn't affect the quality but strips out a lot of unnecessary content (forms, redundant material, content that is not displayed, etc.)
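ps2pdf is a front-end to Ghostscript, so stronger compression is available by calling gs directly with one of its quality presets (/screen, /ebook, /printer, /prepress, from smallest to largest). A sketch, where the function name pdf_shrink and its default preset are choices made for this page; unlike plain ps2pdf, the lower presets do degrade the images.

```shell
# Hypothetical helper: rewrite a pdf with a Ghostscript quality preset.
# Usage: pdf_shrink input.pdf output.pdf [screen|ebook|printer|prepress]
pdf_shrink() {
  preset=${3:-ebook}   # assumed default: a compromise between size and quality
  gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS="/$preset" \
     -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$2" "$1"
}
```

For instance, pdf_shrink input.pdf small.pdf screen gives the smallest file, at the price of low-resolution images.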
Embedded fonts are a problem for the size as they can make a small document very bulky. It is very difficult to replace or remove an embedded font, although apparently this is something you can do with acroread (if available). Otherwise, LibreOffice Draw does something similar: opening the pdf and exporting it (as pdf) can reduce the size a lot (in some cases changing the appearance badly by scrambling the fonts, but for exported Gmail pages, for instance, it works well).
Another way is to import the pdf page with Gimp, compress it as jpg and then export the jpg back into pdf. In this way you control the quality and can achieve a drastic reduction. You may have to do this on a page-by-page basis, using the most suitable technique in each case. In this way we managed to reduce a ~46MB file of 383 pages to less than 14MB, as requested for its upload on some governmental server.
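The same idea can be scripted for a whole document, swapping Gimp for pdftoppm and img2pdf: every page is rendered to jpg at a chosen resolution and the images are glued back into a pdf. This is a sketch only, not the procedure used for the file mentioned above; the function name and the default of 100 DPI are assumptions. As with Gimp, the text becomes an image and can no longer be selected or searched.

```shell
# Hypothetical helper: rasterize a pdf to jpg pages and rebuild a pdf.
# Usage: pdf_rasterize input.pdf output.pdf [DPI]
pdf_rasterize() {
  in=$1; out=$2; dpi=${3:-100}          # lower DPI: smaller file, lower quality
  tmp=$(mktemp -d) || return 1          # working folder for the page images
  # pdftoppm pads the page numbers with zeros, so the shell glob below
  # lists the pages in the right order.
  pdftoppm -jpeg -r "$dpi" "$in" "$tmp/page" &&
    img2pdf "$tmp"/page-*.jpg --output "$out"
  status=$?
  rm -rf "$tmp"                         # clean up in all cases
  return "$status"
}
```

For instance, pdf_rasterize big.pdf small.pdf 72 trades quality for size; it is worth comparing the result with the original before uploading it.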