There are some real rough spots. For instance, the Latin texts are generated directly via OCR from scanned documents; they're not drawn from some other scholarly corpus that's been checked. I only looked at a few, but all of them had significant transcription problems. Sources are linked, and those sources seem to be archive.org scans. Of course, AI will happily produce a fluent-sounding translation from a somewhat shitty transcription; it's much harder to get it to tell you where it's gone off the rails.
That's not the main thing that comes to mind, though. What comes to mind is that projects like this are super useful scaffolding, and I hope it's built as such. Transcription will get better; in fact, given the output quality, I'm fairly sure it could be better now. Translations of better transcriptions will be better, and we'll likely have higher-quality translation tech available on top of that.
So I'd like to see a project like this lean into the iterative side of this kind of scholarship/hobby/historical work and make versioning and logging of updates part of the interface. Starting in the late 1990s, many academic projects did this with large corpora of documents (I'm familiar at least with the Yale Jonathan Edwards project) and used crowdsourced support. There's no reason not to include facilities here that interleave the AI with interested Latin/Roman scholars.
With that done, this could, in my mind, turn into a genuinely useful tool. Which would be cool!
> A "sow's matrix" (or vulva in Latin) is a dish from ancient Rome consisting of the uterus of a sow (a female pig), often specifically from one that has never farrowed or that was slaughtered shortly after farrowing. It was considered a delicacy among the wealthy elite and was a common dish served at lavish Roman banquets and dinner parties, often used as a sign of luxury, wealth, and status.
And it is (300,755+ lines from Claude): https://github.com/CraigVG/roman-letters-network
I'm sorry, but I just can't consider this serious or accountable, since I can't trust its data.
If all the information there were valid and verified (every single letter and every author's word, even after the LLM's processing), then the "AI" concern could fade.
But I don't believe it is, knowing how freely every subjective word can shift its context. Can a limited, flattening LLM really handle that?
There's a `?scholarly=true` GET parameter mentioned in `:/CLAUDE.md`, but a quick check showed no change in behavior.
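For anyone who wants to reproduce that check, here's a minimal sketch. The site URL and the `scholarly=true` parameter come from the project itself; the helper names and the simple "do the bodies differ at all?" comparison are mine and purely illustrative.

```python
# Sketch: fetch a page with and without ?scholarly=true and see
# whether the parameter changes the response body at all.
from urllib.parse import urlencode

BASE = "https://romanletters.org/"  # project URL from the thread

def with_param(url: str, **params) -> str:
    """Append query parameters to a URL (assumes no existing query string)."""
    return url + "?" + urlencode(params)

def bodies_differ(plain: str, flagged: str) -> bool:
    """True if the two response bodies differ in any way."""
    return plain != flagged

if __name__ == "__main__":
    import urllib.request
    plain = urllib.request.urlopen(BASE).read().decode()
    flagged = urllib.request.urlopen(with_param(BASE, scholarly="true")).read().decode()
    print("parameter has an effect:", bodies_differ(plain, flagged))
```

A byte-for-byte diff is a crude test (the flag could affect only certain pages, or only API responses), so a null result here doesn't prove the parameter is dead, just that it's not obviously live.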
Regardless, the idea and overall intention, highlighting the impact and importance of history and drawing connections between infinitely unique and miraculous people across the world, where every single word carries a moment of a life, is ineffably magnificent...
Thank you, Craig Vander Galien, for the idea and for the love of history!
---
> Modern English translations were produced using Claude (Anthropic), working from either the Latin/Greek original or an existing 19th-century English version. Translation work was guided by two internal documents: a translation guide covering late antique epistolary conventions, rhetorical register, and how to handle common formulaic phrases; and a modern voice guide specifying tone, vocabulary level, and how to avoid archaism while remaining faithful to the original.
>
> AI-generated translations are clearly marked in the interface. They are provided for accessibility and research convenience, not as authoritative scholarly translations. The original Latin or Greek is preserved alongside every translation, and 19th-century English versions are shown where available. Corrections from domain experts are welcome.
>
> Source: https://romanletters.org/about/

They noticed the design, recognized it as the output of an LLM, and then discovered that an LLM was involved in much of the project's creation. This is an academic project. Whatever the researcher's pedigree, this suggests to the grandparent that the final result may be amateurish or worse, to some extent generated. They're therefore concerned that it calls the legitimacy of the research outcomes into question (e.g. completeness, contents of letters, classification, maybe even hallucinations in the thesis proper).
Preemptive arguments:
1. "The author's a researcher, not a programmer; therefore it's fine to use an LLM. It is preposterous to ask each researcher to learn web development to publish their research." You're right, but given the number of vibe-coded websites we see, all sporting the default (Astro?) style, the grandparent still has the right to associate that style with untrustworthy crap. I'm not saying this academic website is necessarily crap. However, I think it's useful for the grandparent to share their sentiment, because the researcher might not know.
2. "A lot of pages have links to sources; you could verify the legitimacy yourself." Perhaps, but doubting the veracity of research is a bad first impression, isn't it?
It's a bit sad, because the website is non-trivial and would have taken quite a bit of effort without an LLM. But it's difficult to separate the webdev enablement from the rest of the LLM baggage.