Python PDF generation with Snakelets
In the process of converting my web site from a FileMaker hack to a Django hack, I’m going to need dynamic PDF generation. The Dining After Midnight site desperately needs an update, but one of its features is a three-fold PDF of late-night restaurants.
A Google search indicated that the tool I need to use is ReportLab Toolkit. None of the examples looked easy, but they looked easier than anything else.
What I really want to be able to do is generate PDF using a Django template. I broke this task down into two steps. One step was to be able to create a PDF in Python. The other step was to be able to embed Python into Django templates.
This article is about the first step. As a test of being able to just plain create a PDF file and display it dynamically over the web, I decided to remove Django entirely from the problem, and use Snakelets. I decided to add a PDF option to my Quick & Dirty Snakelets “blog”. Snakelets is a great way to test programming ideas, and in this case, once I got past the ReportLab examples, it turned out to be very easy.
The first step was to add a new URL to the list of snakelets in the blog app’s __init__.py:
[toggle code]
-
snakelets= {
- "pdf": blog.PDF,
- "index.sn": blog.Blog,
- "*": blog.Blog,
- }
This is the same as before, except that I’ve added a line telling it to call the PDF class in the blog.py file, when I ask for /blog/pdf/.
The PDF class itself has only two methods on it: the required “serve” method, and a makePDF method that “serve” calls to generate the PDF.
[toggle code]
- #creates a PDF version of the site
-
class PDF(Blog):
-
def serve(self, request, response):
- response.setContentType("application/pdf")
- out=response.getOutput()
- self.makePDF(out)
-
def serve(self, request, response):
This, again, is fairly simple. Except for the “self.makePDF()” method, which I’ll get to in a bit, there isn’t anything really special here. I’ve made PDF be a descendant of my Blog class. And I’ve overriden the Blog class’s “serve” method. It sets the content type to application/pdf so that the Snakelets will tell the browser to expect a PDF document; I get the response output stream so that I can write directly to Snakelet’s version of the page, and then I send that output stream to makePDF.
That’s all standard Snakelets. The second step is to generate a PDF version of the blog. First, I need to import the necessary features from ReportLab. I’m making heavy use of Platypus, which is ReportLab Toolkit’s “easy” version.
- from reportlab.platypus import Paragraph, SimpleDocTemplate, KeepTogether
- from reportlab.lib.styles import getSampleStyleSheet
- from reportlab.lib.units import inch
- from reportlab.lib.pagesizes import letter
The SimpleDocTemplate is for generating very simple PDF documents (it probably won’t suffice for the multi-column format I’m going to need in the end). I’m going to want to keep some paragraphs together, I’ll be working in inches, and generating letter-size pages.
Here is the makePDF() method for the PDF class:
[toggle code]
-
-
def makePDF(self, destination):
- posts = []
- #create the basic page and stylesheet
- style = getSampleStyleSheet()
- pdf = SimpleDocTemplate(destination, pagesize=letter)
- #display the title and description of the blog
- title = self.getTitle()
- description = self.getDescription()
- posts.append(Paragraph(title, style["Heading1"]))
- posts.append(Paragraph(description, style["Normal"]))
- #go through each blog post and display its title and content
-
for postKey in self.app.posts.posts:
- #get the post
- post = self.app.posts.posts[postKey]
- body = post.body
- title = post.title
- #each paragraph should have a little space afterwards
- paraStyle = style["Normal"]
- paraStyle.spaceAfter = inch*.04
- #the parts of the post will go into items list
- items = []
- headline = Paragraph(title, style["Heading2"])
- items.append(headline)
-
for paragraph in body.split("\n"):
- para = paragraph.decode("ascii", "ignore")
- para = Paragraph(para, paraStyle)
- items.append(para)
- #a post should not break across pages
- item = KeepTogether(items)
- posts.append(item)
- pdf.build(posts)
- return
-
def makePDF(self, destination):
It loops through each blog post; for each blog post it adds the title as a headline. Then it loops through each paragraph and adds each paragraph with .04 inches of space between them. It creates the PDF document directly to the output stream that Snakelets gave it, and that’s it. Now when I go to http://george.local:9080/blog/pdf I get a PDF version of the blog.
- San Diego After Midnight
- There is nothing quite like the hunger you get at three in the morning when everyone else has gone to sleep. If you’re hanging with the late crowd in San Diego, come and see where you can time out for a bite after midnight!
- Snakelets
- “Snakelets is a very simple-to-use Python web application server. It provides a threaded web server, Ypages (Python HTML template language) and Snakelets: code-centric page request handlers. Snakelet’s focus is to make the creation of dynamic web sites as quick and easy as possible.”
- ReportLab Toolkit
- “The ReportLab Open Source PDF library is a proven industry-strength PDF generating solution, that you can use for meeting your requirements and deadlines in enterprise reporting systems.”
- Django
- “Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design.” Oh, the sweet smell of pragmatism.
More PDF
- Creating searchable PDFs in Ventura
- My searchablePDF script’s behavior changed strangely after upgrading to Ventura. All of the pages are generated at extremely low quality. This can be fixed by generating a JPEG representation before generating the PDF pages.
- Create searchable PDFs in Swift
- This Swift script will take a series of image scans, OCR them, and turn them into a PDF file with a simple table of contents and searchable content—with the original images as the visually readable content.
- Quality compressed PDFs in Mac OS X Lion
- The instructions for creating a “reduce PDF file size” filter in Lion are the same as for earlier versions of Mac OS X—except that for some reason ColorSync saves the filter in the wrong place (or, I guess, Preview is looking for them in the wrong place).
- Calculating true three-fold PDF in Python
- Calculating a true three-fold PDF requires determining exactly where the folds should occur.
- Adding links to PDF in Python
- It is very easy to add links to PDF documents using reportlab or platypus in Python.
- Four more pages with the topic PDF, and other related pages
More Python
- Quick-and-dirty old-school island script
- Here’s a Python-based island generator using the tables from the Judges Guild Island Book 1.
- Astounding Scripts on Monterey
- Monterey removes Python 2, which means that you’ll need to replace it if you’re still using any Python 2 scripts; there’s also a minor change with Layer Windows and GraphicConverter.
- Goodreads: What books did I read last week and last month?
- I occasionally want to look in Goodreads for what I read last month or last week, and that currently means sorting by date read and counting down to the beginning and end of the period in question. This Python script will do that search on an exported Goodreads csv file.
- Test classes and objects in python
- One of the advantages of object-oriented programming is that objects can masquerade as each other.
- Timeout class with retry in Python
- In Paramiko’s ssh client, timeouts don’t seem to work; a signal can handle this—and then can also perform a retry.
- 30 more pages with the topic Python, and other related pages
More Snakelets
- Multiple column PDF generation in Python
- You can use ReportLab’s Platypus to generate multi-column PDFs in Snakelets, Django, or any Python app.
- Quick & dirty Snakelets “blog”
- This “No Second Chances” blog engine was fun to write during spare time at ETech 2007. Snakelets appears to be a useful Python webapp server if you need a webapp server immediately.