Overall, as a technical writeup I enjoyed the article; however, I would caution that the author seems to approach publishing from an amateur perspective.
Text is either embedded, in which case it's baked in, or linked, in which case you have to manually tell ID to update the link to reload the text.
But InDesign's EPUB output is horrifically terrible, especially if you're trying to use custom fonts/graphics for page headings. (Basically - no.)
And the CSS is... really not great.
The best fiction off-the-shelf option for EPUB gen is Vellum. It's a one-off payment of around $250 and you can get an EPUB-only version, or EPUB+PDF for print. It's not very customisable, but the presets - there aren't many - all look good.
For anything more sophisticated, options are limited. I spent far too long creating a non-fiction EPUB in ID a couple of years ago. I got there in the end but it was an extremely painful process and I ended up automating a lot of the workflow in JSX.
For fiction I created my own MD -> EPUB pipeline with a custom MD -> HTML parser for custom markup not handled by pure MD. Then a custom EPUB builder which does all the wrapping and general EPUB bureaucracy based on my own CSS.
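To give a feel for what that "EPUB bureaucracy" involves, here is a minimal sketch using only Python's standard library. It is not the builder described above: `build_epub`, the `OEBPS/` layout, the file names, and the fixed `modified` timestamp are all assumptions for illustration, and the output has not been run through epubcheck. It shows the fixed parts of the container: a `mimetype` entry stored first and uncompressed, a `META-INF/container.xml` pointing at the package file, and an OPF with a manifest and spine.

```python
import io
import zipfile
from xml.sax.saxutils import escape

# Fixed pointer file every EPUB needs: it tells the reader where the OPF lives.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Hypothetical XHTML wrapper shared by chapters and the navigation document.
PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>{title}</title><link rel="stylesheet" type="text/css" href="style.css"/></head>
<body>{body}</body>
</html>
"""

OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="bookid">{book_id}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    {manifest}
  </manifest>
  <spine>
    {spine}
  </spine>
</package>
"""


def build_epub(title, book_id, chapters, css="", modified="2024-01-01T00:00:00Z"):
    """Return EPUB bytes. chapters is a list of (chapter_title, xhtml_body) pairs."""
    files = {"style.css": css}
    manifest = [
        '<item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>',
        '<item id="css" href="style.css" media-type="text/css"/>',
    ]
    spine, toc = [], []
    for i, (chapter_title, body) in enumerate(chapters, start=1):
        name = f"chapter{i}.xhtml"
        files[name] = PAGE.format(title=escape(chapter_title), body=body)
        manifest.append(f'<item id="ch{i}" href="{name}" media-type="application/xhtml+xml"/>')
        spine.append(f'<itemref idref="ch{i}"/>')
        toc.append(f'<li><a href="{name}">{escape(chapter_title)}</a></li>')

    # EPUB 3 navigation document: a nav element containing an ordered list of links.
    nav_body = f'<nav epub:type="toc"><h1>Contents</h1><ol>{"".join(toc)}</ol></nav>'
    files["nav.xhtml"] = PAGE.format(title="Contents", body=nav_body)
    files["content.opf"] = OPF.format(
        book_id=escape(book_id),
        title=escape(title),
        modified=modified,
        manifest="\n    ".join(manifest),
        spine="\n    ".join(spine),
    )

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML, compress_type=zipfile.ZIP_DEFLATED)
        for name, text in files.items():
            zf.writestr(f"OEBPS/{name}", text, compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()
```

Calling `build_epub("My Novel", "urn:uuid:...", [("Chapter One", "<p>It began.</p>")], css="p { text-indent: 1em; }")` returns bytes that can be written to a `.epub` file. A real pipeline would also need cover images, fonts, and fuller metadata.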
Python has libraries for Pandoc, native DOCX, and MD (up to a point) so the basics were all there. The rest was glue.
It was a moderately-sized hobby project - would probably go much faster with AI now.
Oh how often I keep saying that these days... "All the parts are there! Why hasn't anyone piped this into that?"
I also worked at a publishing company (for ~6 years) in the early 2000s. While you are right that the pros have some tricks to make the process easier, the fact remains that the process is not easy at all. Unlike in academic publishing, where nothing stands between the author and the reader, at a commercial publishing company (at least one of the majors), there are legions of people working behind the scenes. Editors communicate with authors; editorial assistants help the editors with fact-checking, drafts, basic organization and comprehensibility; copyeditors get all pedantic about formatting and word choice (sometimes resulting in arguments with authors that the editors need to smooth over); production departments that make the books look pretty, contain images whose copyrights are cleared and that can be legibly printed within a reasonable budget; graphic designers who develop house styles or even a custom style for a book and even original cover art; lawyers who negotiate copyrights for excerpts, images, and other ancillary materials; and on and on.
I know all this because I worked on a custom content management system for this company and in so doing I discovered that the process was incredibly complex. One of the major pet peeves of everybody involved was when an author thought they were doing anybody a favor by trying to format things in Microsoft Word. Most of that information was thrown away and the real layout was done by people who thought in terms of widows, orphans, kerning, and leading (and so on). Once you know what all the people in a top publishing company do, the difference between an amateur publication and a professional one becomes immediately apparent. So I don't fault the author for getting a bit technical. The SE approach sounds like an epic attempt to make a complicated subject at least somewhat approachable.
Any advice for developing this sense?
I will never work in a top publishing company but I have been able to approximate good design by first studying the fundamentals, then reproducing the layouts I see in popular media. I can make text into a beautiful book, and I see poor design choices in the corporate communications of billion-dollar companies.
But it feels like there’s a lot more I don’t know, and you never know what you don’t know, and it makes me wish I could absorb more from working under an expert.
For ebook production, you could definitely do worse than follow Standard Ebooks' method. That will get you a decent standards-compliant file with basic accessibility features accounted for.
It does not auto-update. Even if it did, you wouldn't necessarily want it to auto-update, because it's very hard to tell if changing one sentence in your manuscript has borked the layout of dozens of pages. Once you have rules set up around widow and orphan control, it's very easy for even tiny text changes to have large downstream layout effects.
Also, frankly, InDesign is kind of flaky and will sometimes change layout or make other visual changes in response to apparently nothing at all. I ran into a bug where it would just silently drop underlines on some elements and jiggling them a bit would bring them back.
For my two books, I ended up writing a script that would generate a visual diff of the entire book from the PDF export of the InDesign files so that I could tell for certain if InDesign had gotten itself confused. InDesign can produce beautiful output, but like a lot of Adobe software, it's temperamental and opaque.
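The comparison step of such a visual diff can be sketched roughly as follows. This is not the script described above. It assumes each page of the old and new PDF exports has already been rasterized by some external tool (not shown) into a grid of 0-255 grayscale values with identical dimensions, and the names `changed_box` and `diff_book` are made up for illustration.

```python
from itertools import zip_longest


def changed_box(old, new, tolerance=0):
    """Return (left, top, right, bottom) of differing pixels, or None if the pages match.

    old and new are lists of rows; each row is a list of 0-255 grayscale ints.
    """
    xs, ys = [], []
    for y, (row_a, row_b) in enumerate(zip(old, new)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if abs(a - b) > tolerance:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def diff_book(old_pages, new_pages, tolerance=0):
    """Map 1-based page numbers to the region that changed on that page."""
    report = {}
    pairs = zip_longest(old_pages, new_pages)
    for number, (old, new) in enumerate(pairs, start=1):
        if old is None or new is None:
            # One export has more pages than the other.
            report[number] = "page added or removed"
            continue
        box = changed_box(old, new, tolerance)
        if box is not None:
            report[number] = box
    return report
```

An empty report means the two exports rendered identically at the chosen tolerance. Anything else lists the pages worth inspecting by eye, which is the point of the exercise: catching layout changes nobody asked for.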
I've been publishing print and ebooks since 2015, and I can attest to the fact that the Word to PDF/X-1a to EPUB/Kindle pipeline is painful. Making minor edits after publication is also painful, as the author notes, and can be error-prone if you fail to make identical changes to all formats.
The problem was bad enough that I built my own Markdown to HTML to PDF/X-1a processor using Python, WeasyPrint, and Ghostscript. This also allows me to use git for version control, and I can make formatting changes using vanilla CSS. My tools are currently too crude for the average non-tech writer to use, but they save me hours every time I use them.
For any of you hackers out there looking for an untapped market, try making a user-friendly tool that converts Word, PDF and/or similar formats to the print-ready PDF/X-1a, PDF/X-3 and PDF/X-4 formats. At the moment, all the existing tools are proprietary and expensive, and many are difficult to use. This won't be a big money maker, but it will certainly be welcomed by many indie authors.
https://frequal.com/forwriters/
I used it for a recent novel: https://www.amazon.com/dp/B0GYCZJVGX
One of the best examples of this that I've ever seen is The Sourdough Framework [0] -- really impressed with the way that versioning and publishing is integrated in that book.
And yes -- I know it sounds like yet another Javascript library -- but it's actually a book about sourdough bread making. It's been discussed here several times before, but this one from 2023 [1] may have been the most popular (103 comments)
[0] - https://github.com/hendricius/the-sourdough-framework [1] - https://news.ycombinator.com/item?id=35961590
But softwrap definitely has its advantages: no hard line breaks makes copying the text into other mediums easier, git diffs show only which paragraphs you edited and not a bunch of line diff noise no matter which engine you use. Only problem is it breaks my yy, dd, cc muscle memory, as AFAIK you can't force those to work on virtual (vs logical) lines.
D. J. Speckhals is the author of the “Witnesses of the Light” historical fiction trilogy, which transports readers to fifteenth-century Europe to explore the resilient faith of the Waldensians.
I love buying and reading physical books. However, about half of the books (I read mostly programming books) have letters that are printed pixelated. This is infuriating to me. No one bothers to run a trial print and see what comes out?
The root cause of this: the PDF will look fine, but the text color is usually set slightly off black (why!!??). The eye can’t really see the difference and the PDF renders smoothly. However, commercial printers can’t handle that properly.
Solution: set the text color to full black; most of the time you are using a black-and-white printer!
You might need to have two PDF versions: one for printing and one for digital distribution (but why would you have off-black text anyway?).
This can be caused by colour management. If the black is defined in terms of RGB and then converted to CMYK as part of the pre-press workflow, you'll typically have a mix of all four inks, and not necessarily 100% K - it depends on the colour profiles. For a black-only print job the C, M and Y channels will then be discarded, leaving a maybe-not-pure black.
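A rough worked example of that effect, as a sketch only: the function below uses the simple textbook RGB to CMYK formula, not an ICC profile conversion, so a real pre-press workflow will produce different numbers. The names `naive_rgb_to_cmyk` and `black_plate_only` are made up for illustration, and the off-black value is an arbitrary example.

```python
def naive_rgb_to_cmyk(r, g, b):
    """Convert 8-bit RGB to CMYK percentages with the profile-free textbook formula."""
    if (r, g, b) == (0, 0, 0):
        # Pure black: avoid dividing by zero and put everything on the K plate.
        return (0.0, 0.0, 0.0, 100.0)
    rf, gf, bf = r / 255, g / 255, b / 255
    k = 1 - max(rf, gf, bf)
    c = (1 - rf - k) / (1 - k)
    m = (1 - gf - k) / (1 - k)
    y = (1 - bf - k) / (1 - k)
    return tuple(round(v * 100, 1) for v in (c, m, y, k))


def black_plate_only(cmyk):
    """Simulate a black-only print job: discard C, M and Y and keep the K percentage."""
    return cmyk[3]


pure = naive_rgb_to_cmyk(0, 0, 0)
off_black = naive_rgb_to_cmyk(35, 31, 32)  # arbitrary "almost black" text colour

for label, cmyk in (("pure black", pure), ("off-black", off_black)):
    k = black_plate_only(cmyk)
    # Anything below 100% K cannot be printed as solid ink and has to be screened.
    print(f"{label}: CMYK={cmyk}, K only={k}%, solid ink={k == 100.0}")
```

The off-black colour lands below 100% K once the other channels are dropped, so the press has to approximate it with a halftone screen instead of solid ink. That is consistent with the speckled or pixelated look of small text described in this thread.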
Because pure black causes eye strain. Dark gray on white is superior for long reading sessions when your paper is white. The contrast really hurts after a while if you do pure black on pure white. This is a known phenomenon.
In fact, there's experimental evidence (https://www.nature.com/articles/s41598-018-28904-x) that this high contrast plays a hand in the onset of myopia, which in extreme forms is correlated with glaucoma and other vision disorders.
Color is the ink's job. Approximating a lighter shade of black than the ink produces by speckling the output with tiny white pixels is definitely not an improvement in readability.
Most paper in books isn’t pure white. Leave the text completely black.
I guess this is like medical researchers "discovering" basic calculus or an office worker discovering that SFTP, sshfs, and git work fine and they don't need Dropbox after all.
What's common knowledge in one field can apparently still be alien to people outside the field, even in the age of LLMs.
Just wait until the author finds out about Overleaf...
Or what every researcher has been doing for literally decades (except with other versioning systems, but still typesetting without Word or Adobe).
No need for techbros to pat themselves on the back as innovators.
I typeset my novels in LaTeX and use Git. I even just clone a base repo whenever I'm going to release another.