One of the most robust ways of programmatically generating PDFs is using a headless Chrome (or open source Chromium) browser.
HTML and CSS are by far the most widely supported document formatting markup in the modern era, and Chrome is built on a multi-billion dollar rendering engine that renders both well and quickly.
Chrome’s headless option requires no window server, making it suitable to run in the background or in a console-only server environment.
Install Chrome (or Chromium)
Mac
Install Chrome directly from the Chrome website as you would any other desktop app.
Ubuntu
Generate PDFs via Command line
Linux
MacOS
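On macOS, Chrome is installed as an app bundle. With a default install, the executable inside it is `/Applications/Google Chrome.app/Contents/MacOS/Google Chrome`.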
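Because the binary name and location differ between platforms, a script that drives Chrome may want to look it up first. The following is a rough sketch rather than a definitive lookup: the function name `find_chrome`, the list of Linux binary names, and the macOS bundle path are all assumptions about a default install and may need adjusting for your system.

```python
import os
import shutil

# Assumed binary names on Linux; distributions and install methods differ.
LINUX_NAMES = ("google-chrome", "chromium", "chromium-browser")
# Assumed location of the executable inside a default macOS install.
MAC_PATHS = ("/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",)


def find_chrome(names=LINUX_NAMES, extra_paths=MAC_PATHS):
    """Return the path of the first Chrome/Chromium binary found, else None."""
    # First try the names against the directories listed in PATH.
    for name in names:
        path = shutil.which(name)
        if path:
            return path
    # Then fall back to fixed locations such as the macOS app bundle.
    for path in extra_paths:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None
```

`find_chrome()` returns `None` when nothing matches, so the caller can fail with a clear message instead of a confusing "file not found" from the shell.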
Generating with Python
Using Chrome from Python is as easy as launching it as a subprocess:
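A minimal sketch of such a call, assuming the binary is reachable as `chromium` (substitute whatever path applies on your machine). The helper names `build_pdf_command` and `render_pdf` are made up for illustration; only the `--headless`, `--disable-gpu` and `--print-to-pdf` flags come from Chrome itself.

```python
import subprocess


def build_pdf_command(chrome, source, output_pdf):
    """Build the argument list for one headless print-to-PDF run."""
    return [
        chrome,
        "--headless",
        "--disable-gpu",  # only needed by some older versions and platforms
        f"--print-to-pdf={output_pdf}",
        source,  # a URL or the path of a local HTML file
    ]


def render_pdf(chrome, source, output_pdf, timeout=60):
    """Run Chrome; raises CalledProcessError if it exits with a non-zero status."""
    subprocess.run(
        build_pdf_command(chrome, source, output_pdf),
        check=True,
        timeout=timeout,
    )
```

Calling `render_pdf("chromium", "https://example.com", "example.pdf")` would then write `example.pdf` to the current directory. Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.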
Python files
Python's `tempfile` module is very handy here. Chromium needs its input as a file on disk (or a URL), but named temporary files make this easy, and you can keep your content in memory until the very moment of PDF rendering. An example:
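A sketch of that approach, under a few assumptions: the binary is reachable as `chromium`, the function name `html_to_pdf` is made up, and the `run` parameter is only a convenience added here so the subprocess call can be swapped out (for instance in tests).

```python
import os
import subprocess
import tempfile


def html_to_pdf(html, output_pdf, chrome="chromium", run=subprocess.run):
    """Write an in-memory HTML string to a temp file and print it to PDF."""
    # delete=False lets us close (and flush) the file before Chrome opens it;
    # the .html suffix helps Chrome recognise the content type.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".html", delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(html)
        tmp_path = tmp.name
    try:
        run(
            [chrome, "--headless", f"--print-to-pdf={output_pdf}", tmp_path],
            check=True,
        )
    finally:
        # Clean up the temporary input whether or not rendering succeeded.
        os.remove(tmp_path)
```

The HTML only touches the disk for the duration of the render, so the rest of the script can keep working with plain strings, for example the output of a template engine.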
Django
Lofty built django-hardcopy to implement the subprocess handling and to provide some convenience methods for generating Django views that render templates as PDFs:
https://github.com/loftylabs/django-hardcopy
Other Arguments
Chrome accepts quite a few command-line arguments in headless mode that can help with PDF generation. A few are listed below:
- `--window-size` - Set the window size of the “browser” to assist in formatting output.
- `--screenshot=path.png` - Use in lieu of `--print-to-pdf` to generate PNG screenshots rather than PDFs.
- `--disable-gpu` - Required by headless mode in some older versions of Chrome / operating systems.
- `--virtual-time-budget` - Limit the amount of time Chrome will wait for the virtual page to load. Helpful when the page includes errant JavaScript libraries that block the page load event from firing in a headless environment.
- `--no-sandbox` - Usually required when Chrome runs as root inside a Docker container.
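The flags above combine freely, so it can help to assemble them in one place. The sketch below is illustrative only: `headless_args` and its parameter names are hypothetical, and it assumes that `--window-size` takes `width,height` and that `--virtual-time-budget` takes a value in milliseconds.

```python
def headless_args(source, pdf=None, screenshot=None, window_size=None,
                  virtual_time_budget_ms=None, disable_gpu=False,
                  no_sandbox=False):
    """Return Chrome arguments (without the binary itself) for one headless run."""
    # Exactly one output mode must be chosen: a PDF or a PNG screenshot.
    if (pdf is None) == (screenshot is None):
        raise ValueError("pass exactly one of pdf= or screenshot=")
    args = ["--headless"]
    if disable_gpu:
        args.append("--disable-gpu")
    if no_sandbox:
        args.append("--no-sandbox")
    if window_size is not None:
        width, height = window_size
        args.append(f"--window-size={width},{height}")
    if virtual_time_budget_ms is not None:
        args.append(f"--virtual-time-budget={virtual_time_budget_ms}")
    if pdf is not None:
        args.append(f"--print-to-pdf={pdf}")
    else:
        args.append(f"--screenshot={screenshot}")
    args.append(source)  # the URL or file to load goes last
    return args
```

For example, `headless_args("page.html", screenshot="shot.png", window_size=(1280, 720))` yields a list that can be appended to the Chrome binary path and handed to `subprocess.run`.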
You can read about headless mode and its arguments at the following links:
[https://developer.chrome.com/blog/headless-chrome/](https://developer.chrome.com/blog/headless-chrome/)
[https://developer.chrome.com/articles/new-headless/](https://developer.chrome.com/articles/new-headless/)
CSS
Since Chrome is rendering the input as a browser, you can style your document with CSS (and JavaScript, for that matter).
It's important to note that Chrome renders with the `print` CSS media type (`@media print`), just as if you had hit the print button on a webpage, so you can take advantage of the print-specific features of CSS (like page breaks) in your input to style the output PDF.
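To illustrate how print-specific CSS fits into a generated document, here is a small sketch that builds an HTML string in which each chapter ends with a page break. The class names `chapter` and `no-print`, the A4 page size, the margins, and the helper `build_document` are all assumptions chosen for the example, not rules prescribed by Chrome.

```python
from html import escape

# Styles that only matter when the page is printed (or rendered to PDF).
PRINT_CSS = """
@page { size: A4; margin: 2cm; }
@media print {
  .no-print { display: none; }
  /* Break after every chapter except the last one. */
  section.chapter:not(:last-child) { page-break-after: always; }
}
"""


def build_document(chapters):
    """Wrap each (title, body_html) pair in a section with print styling."""
    sections = "\n".join(
        f'<section class="chapter"><h1>{escape(title)}</h1>{body}</section>'
        for title, body in chapters
    )
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        f"<style>{PRINT_CSS}</style></head>\n"
        f"<body>\n{sections}\n</body></html>"
    )
```

The resulting string can be written to a file and passed to Chrome with `--print-to-pdf`. Titles are escaped, while the chapter bodies are assumed to be trusted HTML already.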
The most important rules to consider are:
Some notes on that are here:
https://developer.mozilla.org/en-US/docs/Web/CSS/page-break-after
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_paged_media
https://developer.mozilla.org/en-US/docs/Web/Guide/Printing