Who knew that a tool that relies on probability could make such a mess?
Wow, 25% corrupted seems like a lot. The abstract and the intro of this paper emphasize "documents" and it's Microsoft, so I assumed Word docs, but that's not true. They used a wide variety of things: graphs, text files, possibly images, or some machine-readable description of textile weaving. A proofreader might not catch a 25% corrupted textile description file, or 25% corruption in a graph.
Is this "corruption" what, in text files, we've all been taught to call "hallucinations"?