There are a million python libraries and tools to do some overlapping subset of the things you'd want to do with a pdf.
There are no doubt another million in other languages.
These are each basically bundles of some of the transformations you'd want to make to the same underlying data structure.
So, complex pdf scripts often need two or three different libraries to get their thing done, which is wasteful at both a dev effort and computational level.
The ecosystem would be greatly improved if someone made a great (probably rust based) in-memory low level pdf reading and writing data structure.
PDF libraries in any language could switch to using that structure and library internally, with the carrot that the switch would result in needing less code, and likely being some combination of faster and safer.
And then if they just exposed get_structure_pointer() and set_structure_pointer(), they could all interoperate for free. (Another carrot for joining -- small libraries could usefully add features and be adopted without needing to pick an existing popular library to glom onto.)
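A minimal sketch of what that interop could look like, in Python for brevity. Everything here is hypothetical: `PdfStructure`, `PageAdder`, and `PageCounter` are made-up stand-ins for the shared structure and for two small libraries built on it, and a real structure would live in native memory rather than a Python dict.

```python
from dataclasses import dataclass, field


@dataclass
class PdfStructure:
    """Hypothetical shared low-level structure: object number -> raw object."""
    objects: dict = field(default_factory=dict)


class SharesStructure:
    """Hypothetical base: any library exposing the two accessor functions."""

    def __init__(self) -> None:
        self._doc = PdfStructure()

    def get_structure_pointer(self) -> PdfStructure:
        return self._doc

    def set_structure_pointer(self, doc: PdfStructure) -> None:
        self._doc = doc


class PageAdder(SharesStructure):
    """Hypothetical library A: only knows how to append blank pages."""

    def add_blank_page(self, width: float, height: float) -> int:
        obj_num = max(self._doc.objects, default=0) + 1
        self._doc.objects[obj_num] = {"Type": "Page", "MediaBox": [0, 0, width, height]}
        return obj_num


class PageCounter(SharesStructure):
    """Hypothetical library B: only knows how to count pages."""

    def page_count(self) -> int:
        return sum(1 for obj in self._doc.objects.values() if obj.get("Type") == "Page")


adder = PageAdder()
adder.add_blank_page(612, 792)

# Hand the same in-memory structure to the second library: no save/reload step.
counter = PageCounter()
counter.set_structure_pointer(adder.get_structure_pointer())

adder.add_blank_page(612, 792)
print(counter.page_count())
```

The point of the design is that the second library sees changes made by the first without any serialization in between, because both hold a reference to the same structure.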
Not sure what would economically cause this to happen, but it would be great.
[0] https://dev.to/gosukiwi/software-design-deep-modules-2on9
[1] https://developer.adobe.com/document-services/docs/overview/...
> Not sure what would economically cause this to happen, but it would be great.
Writing a library that is better than all the others is difficult to begin with. Continuing to upgrade and maintain it and fix bugs is even more difficult. Even with the right funding, you'd have to find someone who wants to keep at it year after year. When they inevitably lose interest, you'd have to find somebody else to take the reins--and weather the storm of complaints during the down time.
In short, thank you for volunteering to write and maintain this library for the rest of your life! :)
> The ecosystem would be greatly improved if someone made a great (probably rust based) in-memory low level pdf reading and writing data structure.
https://github.com/J-F-Liu/lopdf
Are you suggesting Adobe's Core Object Application Programming Interface (COAPI) for PDF isn't sufficient?
Kidding!
I worked on print production software in the '90s. Stuff like image positioning (eg bookwork), trapping, color separations, etc. Adobe's SDKs, for both PostScript and PDF, were most turrible. For our greenfield product for packaging (printing boxes), I wrote a minimalist PDF library, supporting just the feature set we needed. So simple.
Of course, PDF is now an ever growing katamari style All The Things amalgamation of, oops, sorry I ran out of adjectives.
Back to your point: after URLs and HTTP, the DOM is the 3rd best thing spawned by "the web".
The DOM concept itself. Isomorphism between in-memory and serialized. That it's all just an object graph. Composition over inheritance.
Not the actual DOM API; gods no.
I understand that API design is wicked hard. But how is it that of the Java tools, only JDOM2 (the sequel) managed to get the class hierarchy correct? So that incorrect usage is not permitted?
(I haven't looked at popular libraries for other languages. I assume they all also fell into the trap of transliterating JavaScript's DOM's API. Like dom4j and successors did.)
I'm just repeating your point (I think) that Adobe should have staked a strong starting conceptual position on PDF internals, what a PDF is. Something more WinForms and less Win32.
30+ (?!) years later, I'm still flubbergasted by PDF's success, despite Adobe's stewardship.
PS- And another thing...
For a print description language, I greatly preferred HP's PCL-5. Emotionally, it just feels more honest somehow. Initially, Adobe couldn't decide if PDF was for print control or documents. Customers wanted documents, so Adobe grudgingly complied, haphazardly.
At least "the web" had/has committees.
Apparently people don't understand the history of PDF. PDF was originally a way to encapsulate PostScript so you could display it on a screen. Unlike PCL, Postscript (and PDF) were device-independent, with a WYSIWYG guarantee. Postscript and PDF are literally the history of WYSIWYG on personal computers and computer-based printing/typesetting.
PDF is not "print control" in the sense of a job control language. PDF has always been about documents, and the features of PDF files can be seen as an attempt by Adobe to both drive and follow the market's evolution of document handling.
PDF is complicated because it's used widely for lots of different things, including printing. And if you've never worked in the printing industry you have no idea how much of a PITA it is.
PDF succeeded for a lot of reasons, but probably the easiest explanation is that they were easier to create - you just printed it and the PDF printer driver spat out a PDF file that you could share everywhere.
The company was partially owned and housed primarily in a print shop. We worked above the press floor, and I was sometimes pressed into service helping when we were slow. I had some experience working in a print shop in high school (helping with PageMaker and helping to run the big Heidelberg), and similarly in college.
Nothing like ending your day writing Perl CGI scripts and troubleshooting customers' damn Winsock configurations and then going home and coughing up whatever color was running on the presses that day.
The PDF format is frankly quite horrible, extended over the years by kludges that feel more or less like premature optimizations in some cases and bloated overkill in others.
While theoretically a nice idea, the issue is that there are just so many damn object types with specialized properties inside a PDF that you'd basically end up with all the complications of an FFI for each binding you'd do to expose a sane subset.
Theoretically one could perhaps make a canonical PDF<->JSON or similar mapping from an established library that most PDF data consumers/generators could use if memory usage isn't too constrained (because the underlying object model isn't entirely dissimilar).
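A rough sketch of what such a mapping could look like, assuming a toy subset of the PDF object model. The `Name` and `Ref` classes and the `$name` / `$ref` wrapper keys are invented for illustration and are not the format of any existing tool; streams and binary strings are left out entirely.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """A PDF name object such as /Page (hypothetical toy type)."""
    value: str


@dataclass(frozen=True)
class Ref:
    """An indirect reference such as "2 0 R" (hypothetical toy type)."""
    obj_num: int


def to_json(obj):
    """Map a toy PDF object tree onto JSON-compatible values."""
    if isinstance(obj, Name):
        return {"$name": obj.value}
    if isinstance(obj, Ref):
        return {"$ref": obj.obj_num}
    if isinstance(obj, dict):
        return {key: to_json(value) for key, value in obj.items()}
    if isinstance(obj, list):
        return [to_json(item) for item in obj]
    return obj  # bool, int, float, str and None map directly


def from_json(data):
    """Inverse of to_json; wrapper dicts become Name / Ref again."""
    if isinstance(data, dict):
        if set(data) == {"$name"}:
            return Name(data["$name"])
        if set(data) == {"$ref"}:
            return Ref(data["$ref"])
        return {key: from_json(value) for key, value in data.items()}
    if isinstance(data, list):
        return [from_json(item) for item in data]
    return data


page = {"Type": Name("Page"), "Parent": Ref(2), "MediaBox": [0, 0, 612, 792]}
text = json.dumps(to_json(page))
restored = from_json(json.loads(text))
print(restored == page)
```

JSON has no native way to tell a name from a string or to express an indirect reference, which is why the sketch wraps both in single-key objects. A real mapping would also need an escape for dictionaries that happen to use those keys.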
cpdf -output-json in.pdf -o out.json
(Modify out.json as liked)
cpdf -j out.json -o out.pdf
(Disclaimer, I wrote it.)
I have found them very helpful.
https://en.wikipedia.org/wiki/Poppler_(software)#poppler-uti...
It's Apache-licensed and written in C++.
That said, if you're looking for a GUI app to do simple PDF mutations, it's often hard to find a simple, solid, open source, cross platform app.
At least I haven't found one :)
https://github.com/24eme/signaturepdf?tab=readme-ov-file#sig...
It allows installation for offline use too.
sorry, too spooky even for october. :-)
can't believe I waited so long to try it out
I was unable to find the link for OpenVMS, Apple II, and DEC Alpha binaries, could you show me where to find it?
PDFgear is free of charge, and we don’t generate income through any hidden means. We Do NOT misuse or sell user data and we Do Not display ads. Here’s how we keep operations running:
We’ve secured investment to cover operational costs, including team expenses and technology like the ChatGPT API. We’re also experienced in optimizing technology usage to manage costs more effectively.
In the future, most features will remain free, but there will be a fee for some advanced options. Paid options may include AI-driven tools requiring cloud computing and special PDF conversion features. This balanced approach will allow PDFgear to remain widely accessible while meeting users’ evolving needs with advanced solutions.
The whole purpose of a signature is that a person signed and agreed to something. That cannot be done automatically.
It's no different than the analog ages where a secretary would go through and stamp all the contracts with the CEO's signature.
Signing can be cryptographic.
Like when I hear something is the Swiss army knife of something, my take is that it does a lot of things poorly and there are better specific tools for every need. Like if you need a really terrible knife or bottle opener or screwdriver or saw, a Swiss Army knife has you covered. But it should be a tool of last resort when you have no other options.
They're great for hiking, camping, traveling, and keeping in backpacks and bags.
What's wrong with it as a knife? It's perfectly sharp. Obviously it's not a full-sized chef's knife, but it will cut your apple or twine or packing tape. It's a multitool. It does lots of things. A tool of "last resort" seems to miss the point -- it's not meant for use at home, when you have a full-size screwdriver and bottle opener and corkscrew. It's for traveling with you. And it's great at that.
SAKs are iconic. I don't think your take is a common one.
It isn't as popular as ever, at least not in the Western world. I don't know what your frame of reference is, but it is positively non-existent compared to a couple of decades ago. Approximately zero kids, give or take a few, put one on their Christmas list, whereas when I was a kid it was many kids' dream item. I would say the most common buyers today are middle-aged men who buy one just as a thing to own because they remember how desirable they were back when they were in Scouts in their teens.
>A tool of "last resort" seems to miss the point
It is quite literally a tool of last resort, and in practice people who actually own one (such as myself) have often never, ever actually used any of the options available on it because they're terrible options and we always have something better available.
Like a legitimate folding camping knife, which we all have in our camping supplies. An infinitely better knife. A tiny multi-screwdriver kit. The Leatherman brand went big by making a legitimately good, well constructed pair of pliers to which they add some "in a pinch" options.
Serious campers who portage and go deep country have a proper assortment of gear and never lean on their SAK. The rest of us usually get there in a car and have a...proper assortment of gear.
But again, if you're in a situation where you have to use one of the tools on a SAK, you probably screwed up and it's a serious compromise. It just isn't a compelling metaphor for software tooling.
Your take is idiosyncratic. Using a SAK doesn't mean "you probably screwed up". That's truly a bizarre thing to say.
A SAK is a perfectly fine metaphor. That's why it's a popular one. It's a small tool that does lots of things. I think you're overthinking this.
This doesn't repudiate anything I said, and it's a particularly weird canard.
>That's why it's a popular one
Increasingly the only ones I see leveraging the metaphor are English as a second language writers (note that the idiom originates in English and is a calque in other languages) who perhaps came across it somewhere. I would hardly call it "popular", and I pointed out the reality that many readers, such as myself, find it a negative description, similar to someone calling themselves a "jack of all trades". Your defensiveness of SAK does not change this, and your attempts at invalidating my statement border on bizarre.
Feel free to continue. I'm done here.
Your prejudice is showing. Where would you even get an idea like that?
I hope you understand that people whose first language isn't English also use SAKs. It's not just an English thing. They're not trying to repeat some unknown object they've only encountered in metaphor. The tools are literally Swiss. And popular around the entire world.
Arguing that my observations are invalid because you were in a Victorinox shop in Switzerland is the chef's kiss on this ridiculous discussion.
In the future, just move along. The other argumentative guy had no reason to get defensive about SAK, and this whole worthless discussion, from a basic observation about idioms and ill-suited tools, is a waste of bits.
It does repudiate it, directly. What are you on about?
But secondly, even that site claimed they have what, a 20% marketshare of multitools from once owning the market entirely to themselves? Even if we were so profoundly simple that we believed that being the biggest vendor in a market validates the market, this particular example is hilarious.
> The Swiss Army Knife (multi-tool) market, currently valued at $402 million in 2025
Nearing half a billion dollars doesn't sound like buggy whips to me.
And the bar chart clearly extrapolates the market continuing to grow. Not shrink.
But you still think the #1 brand in a large and growing market is "positively non-existent"...?
Again, for convenience:
https://www.marketreportanalytics.com/reports/swiss-army-kni...
More picnic less camping in the wild.
Obviously it's not the only game in town ever since Leatherman made the pliers-style tool popular as well.
But you can just look up the various brands on Amazon to see that SAKs continue to sell very well, by "x bought in the last month."
It's nowhere near 1%, I don't know where you're getting that.
Edit: according to [1] Victorinox has the #1 spot in market share in multitools. The share is a bit higher than it is for SOG and Leatherman, though they're both close.
[1] https://www.marketreportanalytics.com/reports/swiss-army-kni...
Amazed, but corrected.
After all, I’ve never handled a petard, but I like to deploy the phrase “hoist on his own petard”.
And too quickly smothered in copycats for its name to become the new metaphor.
Ring neck pillows, maybe.
It's not quite fully automatic, but it certainly saves a lot of time over doing it completely by hand.
PDF is absolutely mint for display but it really suffers when parsing is involved
- source file is .md
- file is compiled to .pdf _and_ the .md source file is included as an attachment
- when working with the file beyond viewing as a .pdf the .md is extracted and used instead of the .pdf
The LaTeX folks had a similar system ages ago, where the .tex source would be included in a .pdf made from a .tex file for embedding in documents, so that it could be sent in, say, an e-mail and then edited by the recipient --- absolutely awesome for discussing math via e-mail.
Not sure if this particular library is an improvement, but even if it serves nothing but the author’s enjoyment, or education, it’s a win.
And it’s natural to then build a cli tool on top of the library they already made.
This looks dead simple to use! LOVE IT.
The one feature request I have is for adjusting margins (adding/removing fixed amount of space from every page, optionally adding/removing different amounts from odd numbered pages). Target audience: People who want to read PDFs on small ebook readers.
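For what it's worth, the arithmetic behind that request is small. Here is a sketch of it, not tied to this tool or any PDF library: `adjust_margins` and its parameters are hypothetical names, the box is assumed to be `(left, bottom, right, top)` in points, and the same amount is applied to all four sides for simplicity.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, bottom, right, top) in points


def adjust_margins(
    box: Box,
    page_number: int,
    amount: float,
    odd_amount: Optional[float] = None,
) -> Box:
    """Grow (positive amount) or crop (negative amount) a page box on every side.

    If odd_amount is given, odd-numbered pages use it instead of amount.
    Page numbers are 1-based.
    """
    is_odd = page_number % 2 == 1
    delta = odd_amount if (odd_amount is not None and is_odd) else amount

    left, bottom, right, top = box
    new_box = (left - delta, bottom - delta, right + delta, top + delta)

    # Refuse to crop away the whole page.
    if new_box[2] <= new_box[0] or new_box[3] <= new_box[1]:
        raise ValueError("margin removal is larger than the page")
    return new_box


letter = (0.0, 0.0, 612.0, 792.0)
print(adjust_margins(letter, page_number=2, amount=-36))
print(adjust_margins(letter, page_number=1, amount=-36, odd_amount=-18))
```

The resulting box would then be written back as the page's crop box (or media box) by whatever PDF library is in use. For ebook readers the amounts would normally be negative, trimming white space so the text fills the small screen.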