I’ve got the easy stuff working perfectly. If the DOM says <p>Hello World</p> and the code contains return <p>Hello World</p>, my AST chunker + vector embeddings find it instantly.
But I’m hitting a wall with dynamic data.
If the user sees "Tuesday, Dec 23" on the screen, the code responsible is usually something like:
{date ? new Date(date).toLocaleDateString(...) : 'TBD'}
My current approach (chunking the file and embedding the chunks) fails hard here.
Embeddings fail: The vector for "Tuesday" is nowhere near the vector for toLocaleDateString, so the right chunk never ranks.
Grep fails: The string "Tuesday" is computed at runtime, so it never appears in the file.
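To make the gap concrete, here is a minimal sketch. The date value, the 'en-US' locale, and the formatting options are assumptions for illustration; the real call's arguments are not shown above.

```javascript
// The source line, held as a plain string so we can search it.
const source = "{date ? new Date(date).toLocaleDateString('en-US', opts) : 'TBD'}";

// Assumed options (hypothetical); timeZone is pinned so the result
// does not depend on the machine's local time zone.
const opts = { weekday: "long", month: "short", day: "numeric", timeZone: "UTC" };

// What ends up in the DOM at runtime.
const rendered = new Date("2025-12-23T00:00:00Z").toLocaleDateString("en-US", opts);

console.log(rendered);                   // the text the user sees
console.log(source.includes("Tuesday")); // false: there is nothing to grep for
```

The rendered text and the source share no tokens at all, which is why both literal search and text-to-code similarity come up empty.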
I really want to avoid instrumenting the build pipeline (e.g., no custom Babel plugins adding data-source attributes) because I want this to work on "plain" repos.
Has anyone successfully bridged this gap? I'm considering using an LLM to "hallucinate" the source code from the UI text and then searching for that instead, but that feels slow and expensive.
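For comparison, a cheaper rule-based variant of the same idea could look like the sketch below: classify the visible text by its shape (date, currency, and so on) and expand it into code-side search terms without calling a model. The RULES table, the regexes, and the expandQuery name are all hypothetical and have not been tried on real repos.

```javascript
// Hypothetical rules: each maps a "shape" of rendered text to identifiers
// that commonly produce it. The lists are illustrative, not exhaustive.
const RULES = [
  {
    kind: "date",
    pattern: /\b(Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day\b|\b(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[a-z]*\.? \d{1,2}\b/,
    terms: ["toLocaleDateString", "Intl.DateTimeFormat", "new Date", "format("],
  },
  {
    kind: "currency",
    pattern: /[$€£]\s?\d[\d,]*(\.\d{2})?/,
    terms: ["toLocaleString", "Intl.NumberFormat", "toFixed", "currency"],
  },
];

// Turn UI text into search terms for the code index.
// Falls back to the literal text when no rule matches (the static case).
function expandQuery(uiText) {
  const matched = RULES.filter((rule) => rule.pattern.test(uiText));
  if (matched.length === 0) {
    return { kinds: ["literal"], terms: [uiText] };
  }
  return {
    kinds: matched.map((rule) => rule.kind),
    terms: matched.flatMap((rule) => rule.terms),
  };
}

console.log(expandQuery("Tuesday, Dec 23")); // date-related search terms
console.log(expandQuery("Hello World"));     // falls back to the literal text
```

The expanded terms could then be fed to grep or embedded in place of the raw UI text. The obvious weakness is coverage: anything the rules don't anticipate falls through to the literal path, which is where an LLM fallback might still earn its cost.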
How would you solve this?