What has always surprised me is how small changes in the horizontal offsets can completely change the perceived 3D structure.
Did you generate the depth map first and then derive the repeating pattern, or does the generator work directly from an image?
The generator needs a depth map first, and then derives the repeating pattern from it. Working directly from a normal RGB image would be far too noisy: its fine texture variations would break the repetition the brain needs to fuse the patterns correctly.
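To make the depth-map-to-pattern step concrete, here is a minimal sketch of the classic "copy from one period to the left" approach. It is not the actual generator's code; the function name `make_stereogram`, the parameters `period` and `max_shift`, and the convention that depth values lie in [0.0, 1.0] with 1.0 nearest to the viewer are all assumptions made for illustration.

```python
import random

def make_stereogram(depth, period=8, max_shift=3, seed=0):
    """Sketch: turn a depth map into a random-dot stereogram.

    depth     -- list of rows, each a list of floats in [0.0, 1.0]
                 (assumed convention: 1.0 = nearest to the viewer)
    period    -- repeat distance in pixels for the far background
    max_shift -- how many pixels the repeat distance shrinks at depth 1.0
    """
    if not 0 <= max_shift < period:
        raise ValueError("max_shift must satisfy 0 <= max_shift < period")
    rng = random.Random(seed)
    image = []
    for depth_row in depth:
        row = []
        for x, d in enumerate(depth_row):
            if x < period:
                # Seed the first strip with random dots (0 or 1).
                row.append(rng.randint(0, 1))
            else:
                # Nearer points repeat at a shorter horizontal distance.
                d = min(max(d, 0.0), 1.0)
                separation = period - round(d * max_shift)
                row.append(row[x - separation])
        image.append(row)
    return image

# Usage: a flat background with a raised block on the right.
depth_map = [[0.0] * 8 + [1.0] * 16 for _ in range(4)]
for row in make_stereogram(depth_map):
    print("".join("#" if p else "." for p in row))
```

Each output pixel is an exact copy of a pixel a short distance to its left, and the depth value only changes that distance by a few pixels. That is why small offset changes shift the perceived depth so strongly, and why the input has to be a clean depth map rather than a textured image.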