Show HN: Papermill Press – An AI-friendly markup language for PDF generation
Posted by davidpapermill 1 hour ago
If you’ve generated PDFs from HTML, you’ll know the pain: headless Chrome in Docker, CSS hacks, content that spills across page or table boundaries, and other quality issues.
The fundamental problem is that HTML was designed for screens, not print.
We built Press, a markup-based document language where pages, content flows, and assets are first-class concepts. Content can flow across frames, columns, and pages without manual pagination. Pages are created dynamically based on the available content.
Press templates separate layout from content. You can send markdown, Press markup, or a mixture of both to the API. Data can be sent as JSON, CSV, or XML.
Because Press is XML-based it can easily be generated by agents - some of our users are generating complete documents in a single shot, although the language is designed for repeatable automation.
You can also use our MCP server, which enables models to design templates.
A simple API call sends a markdown payload, which is injected into a <flows><body>…</body></flows> element in Press:
curl -X POST "https://api.papermill.io/v2/pdf?template_id=papermill-modern-report" \
-H "Authorization: Bearer $PAPERMILL_API_KEY" \
-H "Content-Type: text/markdown" \
--data-binary @- \
-o report.pdf <<'EOF'
# Q3 Revenue Summary
Quarterly performance across our core product lines.
| Product | Revenue | Growth |
|----------|----------|--------|
| Platform | £482,000 | +18% |
| Add-ons | £124,000 | +42% |
| Services | £67,000 | -3% |
Strong quarter overall, driven by add-on adoption.
EOF
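For anyone calling the API from code rather than the shell, here is a minimal Python sketch of the same request using only the standard library. The endpoint, headers, and template ID come from the curl example above; the function names `build_pdf_request` and `render_pdf` are hypothetical, and it assumes the response body is the PDF itself, as the `-o report.pdf` flag suggests.

```python
import urllib.parse
import urllib.request

API_URL = "https://api.papermill.io/v2/pdf"  # endpoint from the curl example


def build_pdf_request(markdown: str, template_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    query = urllib.parse.urlencode({"template_id": template_id})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=markdown.encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "text/markdown",
        },
        method="POST",
    )


def render_pdf(markdown: str, template_id: str, api_key: str, out_path: str = "report.pdf") -> None:
    """Send the request and save the response body to out_path.

    Assumption: a successful response body is the raw PDF.
    urlopen raises urllib.error.HTTPError on non-2xx responses.
    """
    request = build_pdf_request(markdown, template_id, api_key)
    with urllib.request.urlopen(request) as response:
        pdf_bytes = response.read()
    with open(out_path, "wb") as f:
        f.write(pdf_bytes)


# Example call (needs a real key, so it is left commented out):
# import os
# render_pdf("# Q3 Revenue Summary\n...", "papermill-modern-report",
#            os.environ["PAPERMILL_API_KEY"])
```

Building the request separately from sending it makes it easy to inspect the URL and headers before spending an API call.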
PDF output: https://mill.pm/hn
Pages and frames in Press can declare dependencies on a content flow like the <body> above. By default, if there’s no content in the flow then the frame or page won’t be generated. You can run flows between frames and pages, and combine multiple flows on a page - for example, a sidebar can run across pages until no content is left, then make room for body content. This makes it possible to implement complex layouts.
You can mix markdown and Press:
# Visualisation
Sometimes it's *useful* to mix both markdown and Press:
<visualization>...</visualization>
The typesetter adapts to dynamic content (e.g. LLM output). For example, tables and columns can be automatically sized and Papermill will even auto-rotate a table and its page to fit if needed.
Templates support components, repeating over data, document logic, and conditional styling. We mostly use an inline-styling approach, and provide the concept of a style “alias”, which is a bag of styling properties you can reuse and compose.
Here’s an example template written in Press, our document language. It uses the first page layout until the sidebar flow is exhausted, then switches to the second:
<press>
<document format="A4" page-margin="2cm">
<repeat flow="sidebar">
<page>
<frame direction="row">
<frame padding-left="1cm" padding-right="1cm"><flow name="body" /></frame>
<frame width="20%" background-color="#f5f5f5" padding="0.5cm" font-size="9pt"><flow name="sidebar" /></frame>
</frame>
</page>
</repeat>
<repeat flow="body">
<page><flow name="body" width="fill" /></page>
</repeat>
</document>
<flows>
<body type="markdown"><lipsum paragraphs="10" /></body>
<sidebar type="markdown"><lipsum paragraphs="3" /></sidebar>
</flows>
</press>
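To make the layout switch concrete, here is a toy Python model of the two `<repeat>` blocks above. It is only a sketch of the described behaviour, not how the Press typesetter works: the real engine measures content, whereas this model counts paragraphs, and the per-page capacities and the `paginate` function are made up for illustration.

```python
from typing import List, Tuple

# (layout, body paragraphs placed, sidebar paragraphs placed)
Page = Tuple[str, int, int]


def paginate(body: int, sidebar: int,
             body_per_split_page: int = 2,
             sidebar_per_page: int = 2,
             body_per_full_page: int = 4) -> List[Page]:
    """Toy model: paragraph counts in, list of pages out.

    All capacities are hypothetical; they stand in for real measurement.
    """
    if min(body_per_split_page, sidebar_per_page, body_per_full_page) <= 0:
        raise ValueError("capacities must be positive")

    pages: List[Page] = []

    # Mirrors <repeat flow="sidebar">: two-column pages while sidebar content remains.
    while sidebar > 0:
        placed_sidebar = min(sidebar, sidebar_per_page)
        placed_body = min(body, body_per_split_page)
        sidebar -= placed_sidebar
        body -= placed_body
        pages.append(("body+sidebar", placed_body, placed_sidebar))

    # Mirrors <repeat flow="body">: full-width pages until the body is used up.
    while body > 0:
        placed_body = min(body, body_per_full_page)
        body -= placed_body
        pages.append(("body-only", placed_body, 0))

    return pages


# Same paragraph counts as the <lipsum> elements in the template.
for page in paginate(body=10, sidebar=3):
    print(page)
# ('body+sidebar', 2, 2)
# ('body+sidebar', 2, 1)
# ('body-only', 4, 0)
# ('body-only', 2, 0)
```

With an empty sidebar the first loop never runs, which matches the default that a page depending on an empty flow is not generated.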
Papermill is a paid API with a free tier. Press is the document language.
Try it for free: https://app.papermill.io/signup (no credit card needed)
Docs including MCP setup: https://docs.papermill.io
Data-only sandbox: https://app.papermill.io/demo.html (no email needed)
We're a small team based in Manchester, UK. Tom (CTO) and I are happy to answer questions about the language design, the rendering engine, or anything else!
Comments
Comment by davidpapermill 1 hour ago
Comment by tomfitzsimmons 46 minutes ago
Comment by davidpapermill 42 minutes ago
https://docs.papermill.io/mcp/#using-the-papermill-mcp-serve...
You can ask Claude to generate templates, full documents, ask it about the Press language, save templates, ask it about templates and more. There's a lot more to it - design guidelines for print vs web, recipes in Press etc.
I've tried it with Fable already and it's a noticeable improvement - we support visual feedback through MCP and I think that's helping Fable a lot.
Comment by tompapermill 39 minutes ago
One of the things we've learned is that models often need more guidance when generating documents. We give real-time feedback to models as they're constructing the document, such as "you have content here that you've done nothing with and doesn't appear in the report as it likely overflowed off the page, you should think of a way to handle this" and "this image is sized incorrectly and just will not fit in the area you've specified".
Giving models the tools to create media for print allows both the user and the model to experiment with more wild designs, knowing that things will not randomly break!
Comment by frantzalot 44 minutes ago
Comment by davidpapermill 40 minutes ago
That said, an export from Word to Papermill templates might be a nice addition...
Comment by off_by_two 22 minutes ago
Comment by tompapermill 17 minutes ago
We don't apply hard limits on PDFs. We've handled PDFs of over 1GB and close to 1000 pages.
We also have a large document service that handles extremely large PDFs in the background.
Comment by davidpapermill 7 minutes ago
Comment by tompapermill 1 hour ago
Comment by Ste_CreaTech 1 hour ago
Comment by davidpapermill 48 minutes ago
You can connect an agent to our MCP server, see: https://docs.papermill.io/mcp/#using-the-papermill-mcp-serve...
It's quite amazing to see what Claude can do when it has a tool like Papermill at its disposal - worth connecting and having a play.
In practice, you can either directly connect an LLM to the MCP server, or send via the API after (say) cleaning up LLM output, combining it with RAG and other data etc.
Comment by tompapermill 1 hour ago
Comment by lmartinneuwave 30 minutes ago
Comment by tim_at_ping 1 hour ago
Comment by Ste_CreaTech 1 hour ago
Comment by davidpapermill 1 hour ago