I prefer claude for generation / creativity, codex for bull-headed, accurate complaining and audit. Very rarely claude just doesn't "get it" and it makes sense to have codex direct edit. But generally I think it's happiest and best used complaining.
Even with the same model (--self-review), that makes a huge difference, and immediately highlights how bad the first iterations of an LLM output can be.