am i right that the narration is generated with AI? sometimes there might be small pronunciation quirks, but overall the quality already sounds pretty good from what i tried.
one thing that would make this even more useful for learning (at least for me) is word-level explanations. for example, clicking a word could show a simple definition in the same language (german → german, like in learner dictionaries), not just a translation. that really helps build intuition.
the idea of learning languages from AI didn't quite sit right with me. but that might be something to circle back to.
integrating learner dictionaries does sound like a fantastic idea. will definitely explore that!
really nice project overall.
got all of the audio alignment, translation, and asset generation working on my gaming computer. pretty happy with the pipeline, except for the sometimes subpar translations.
if anyone is interested in the details I am happy to write them up!
if you are into language learning, I would love to hear if this could be useful to you!
Just a heads up, though: the text is not displayed in my Firefox (140.8.0esr 64-bit, Win11). In Edge it is displayed correctly.
will make sure it falls back to fully visible text.
https://developer.mozilla.org/en-US/docs/Web/CSS/Reference/P...
are you looking for stories in a specific language?
I've been meaning to learn Spanish, and this looks super useful.
Would love to learn more about your pipeline [selfishly, I was looking to build (free) ebooks -> audio for my own purposes as a side project]
What were the most challenging aspects? What assumptions failed / held true? Any experiences to share? Thx
went through quite a few iterations of aligning text to speech. found that ai transcription was really good most of the time but would hallucinate quite a bit towards the start and end of books, which I think might be related to those models being partially trained on audiobooks while only having the book text itself, without any of the intro or credits.
in the end I landed on extracting text from ebooks, using rule-based, language-specific segmentation, and espeak-based alignment. pretty basic, but it worked wonders in terms of reliability and accuracy.
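to give a rough idea of what rule-based, language-specific segmentation can look like, here is a minimal sketch. it is not the actual pipeline code: the `ABBREVIATIONS` table, the `split_sentences` name, and the tiny abbreviation lists are all made up for illustration, and a real setup would need far more rules per language.

```python
# Hypothetical per-language abbreviation lists (illustrative only, far from complete).
ABBREVIATIONS = {
    "en": {"mr.", "mrs.", "dr.", "e.g.", "i.e."},
    "de": {"z.b.", "dr.", "usw.", "bzw."},
    "es": {"sr.", "sra.", "dr.", "ud."},
}

# Closing quotes/brackets that may trail the sentence-ending punctuation.
CLOSERS = "\"'”’»)"


def split_sentences(text, lang):
    """Split text into sentences on ., ! or ? unless the token is a known abbreviation."""
    abbreviations = ABBREVIATIONS.get(lang, set())
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        # Look at the token without trailing quotes or brackets.
        core = token.rstrip(CLOSERS)
        if core and core[-1] in ".!?" and core.lower() not in abbreviations:
            sentences.append(" ".join(current))
            current = []
    # Keep any trailing text that has no closing punctuation.
    if current:
        sentences.append(" ".join(current))
    return sentences


if __name__ == "__main__":
    for sentence in split_sentences("Das ist z.B. ein Test. Gut.", "de"):
        print(sentence)
```

the appeal of this kind of approach is that it is deterministic: the same text always produces the same segments, and a bad split can be fixed by adding a rule for that language.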
if you are looking to generate audio from ebooks this is probably not too helpful. generated audio is something I tried to avoid. something about learning a language from generated audio didn't sit right with me haha.