I asked Claude to do some heavy work, and it did it perfectly. But when I asked Claude to write a Python script to do the same task, with proper prompts and output so I don't have to use Claude, it messed up and triggered a chain of failures over multiple attempts.
AI output is non-deterministic. It can give different results for the same prompt, so for different prompts it can certainly produce artefacts of different quality.
I think you just need to tune your prompt for the script, or ask Claude to fix it.