Ask HN: Anyone doing production image editing with image models? How?
3 points | 6 hours ago | 0 comments
Hey HN — I’m building an app where users upload real-life clothing photos (e.g. a wrinkly shirt folded on the floor). The goal is to transform that single photo into a clean, ecommerce-style image of the garment.

One key UX requirement: the output needs to be a PNG with transparency (alpha) so we can consistently crop/composite the garment into an on-rails UI (cards, outfit layouts, etc.). Think “subject cutout that drops cleanly into templates.”
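
(For context on what "drops cleanly into templates" means here: the compositing itself is trivial once the alpha is clean. A minimal sketch, assuming Pillow; the card template path and offset are made up for illustration.)

    from PIL import Image

    def place_on_card(cutout_path, card_path, out_path, pos=(64, 64)):
        card = Image.open(card_path).convert("RGBA")
        cutout = Image.open(cutout_path).convert("RGBA")
        # Use the cutout's own alpha channel as the paste mask so transparent
        # pixels don't overwrite the card background.
        card.paste(cutout, pos, cutout)
        card.save(out_path, format="PNG")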

My current pipeline looks like:

1. User-uploaded photo (messy background, weird angles)
2. Upload is matched to a "query" image (style target) + prompt
3. Edit/generation step produces the clean, catalog-style image (currently using Nano Banana)
4. Background removal model to get transparency; save as RGBA PNG
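
Rough shape of steps 3–4 today (a sketch, not the real code: the generation call is a placeholder, and rembg is just a stand-in for whichever background removal model you use):

    from PIL import Image
    from rembg import remove

    def to_catalog_asset(upload_path, out_path):
        # Step 3 (placeholder): send the upload + query image + prompt to the
        # edit model and get back a flattened RGB result.
        generated = Image.open(upload_path).convert("RGB")

        # Step 4: background removal -> RGBA, then save as PNG so the alpha
        # channel survives.
        cutout = remove(generated)
        cutout.save(out_path, format="PNG")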

This works, but it feels hacky + occasionally introduces edge artifacts. Also, the generation model sometimes invents shadows/background cues that confuse the background removal step. It feels like the two steps are fighting one another.
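Edge handling is where I’ve burned the most time. To frame the question, these are the kinds of knobs I mean on the removal side (assuming a rembg-style setup; the threshold values are illustrative, not a recommendation):

    from rembg import remove, new_session
    from PIL import Image

    session = new_session("u2net")  # rembg's default model, reused across calls

    def cutout_with_matting(img: Image.Image) -> Image.Image:
        # Alpha matting refines the hard mask around soft edges (fabric,
        # leftover generated shadows), which is where the halos show up.
        return remove(
            img,
            session=session,
            alpha_matting=True,
            alpha_matting_foreground_threshold=240,
            alpha_matting_background_threshold=15,
            alpha_matting_erode_size=10,
        )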

I’m trying to understand what “good” looks like in production for this kind of workflow:

Are people still doing gen/edit → separate background removal as the standard?

Are any of you using alpha-native generation (RGBA outputs) in production? If so, what’s the stack/workflow?

If you’ve done “messy UGC photo → catalog asset” specifically: what broke most often and what fixed it?

I’m not looking for vendor pitches—mostly practical patterns people are using (open source workflows, model classes, ComfyUI/SD pipelines, API-based stacks, etc.). Happy to share more details if helpful.
