Show HN: Fixing Google Nano Banana Pixel Art with Rust
90 points
4 days ago
| 7 comments
| github.com
vunderba
4 hours ago
[-]
Nice. There are a couple of these (unfake, which uses pixel snapping/palette reduction; sd-palettize, which uses k-means to reduce the palette; etc.) that I've used in the past in a Stable Diffusion -> Pixel Art pipeline.

I think it'd be worth calling out the differences.

[1] - https://github.com/jenissimo/unfake.js

[2] - https://github.com/Astropulse/sd-palettize
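For readers unfamiliar with the k-means step those tools use: here's a simplified sketch of palette reduction via k-means (my own toy reimplementation, not code from either repo), where cluster centers become the palette and every pixel snaps to its nearest palette color.

```python
# Toy sketch of k-means palette reduction (not sd-palettize's actual code):
# cluster the image's RGB values into k centers, then quantize every pixel
# to its nearest final center.
import random

def kmeans_palette(pixels, k, iters=10, seed=0):
    """pixels: list of (r, g, b) tuples. Returns pixels quantized to k colors."""
    random.seed(seed)
    centers = random.sample(pixels, k)  # Initialize centers from the image itself.
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            # Assign each pixel to the nearest current center (squared distance).
            i = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[i].append(p)
        for i, c in enumerate(clusters):
            if c:  # Move each center to the mean of its cluster.
                centers[i] = tuple(sum(ch) / len(c) for ch in zip(*c))
    # Final palette: rounded centers; map every pixel to its nearest entry.
    palette = [tuple(round(ch) for ch in c) for c in centers]
    return [min(palette, key=lambda c: sum((a - b) ** 2 for a, b in zip(p, c)))
            for p in pixels]
```

Real tools layer extras on top of this (perceptual color spaces, fixed target palettes, dithering), but the core quantization step is the same idea.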

reply
krisoft
1 hour ago
[-]
It feels weird to me that on the before/after comparison they felt the need to zoom in on the “before” but not on the “after”.

Either both should have the magnifying glass or neither. This just makes it hard to see the difference.

reply
jasonjmcghee
3 hours ago
[-]
I can't explain it, but it's like uncanny valley pixel art. Like the artist hasn't done the final polish pass maybe?

Maybe it's the inconsistent lights/shadows?

Maybe a pixel artist has the proper words to explain the issues.

reply
SXX
3 hours ago
[-]
Not pixel artist, but game dev working with pixel art:

1 - AI just tries to compress too many details into too few pixels.

When artists create pixel art they usually add details along the way, and only the important ones, because otherwise it will look like rubbish on some screens.

Also it's easier to e.g. add different hats or heads or weapons on the same body. AI-generated ones are always too unique.

2 - AI tries to mimic realistic poses that look like the art is supposed to be animated in 3D.

For a real game, if you make, let's say, an isometric tactical game, you'll never make tiles larger than 64x64 because of how much labour they take to animate. Each animation at 8 fps takes hours of work.

So pixel art is usually either high-fidelity and static or low-fi and animated in very basic ways.

reply
smusamashah
3 hours ago
[-]
The skeleton has issues, and the floor tiles are very inconsistent, for example. I haven't looked more carefully. We probably notice something wrong subconsciously, but it takes time to point it out.

Generated pixel art is, for now, in an 80-90% done state. To use it in production, the remaining issues need fixing, which seem to be the palette and some semantic problems. If you only generate small parts of the big picture with AI, it will be perfectly usable.

reply
threeducks
5 hours ago
[-]
Could you explain a bit how the code works? For example, how does it detect the correct pixel size and how does it find out how to color the (potentially misaligned) pixels?
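For what it's worth, my naive guess at the pixel-size part would be something like the sketch below (purely my assumption, the repo may do something else entirely): find every row/column offset where the color changes and take the GCD of those offsets, which recovers the upscale factor for a perfectly grid-aligned image.

```python
# My guess at a pixel-size detection heuristic (not necessarily what this
# project does): the upscale factor must divide the image dimensions and
# every offset where the color changes, so take the GCD of all of them.
from math import gcd

def detect_scale(img):
    """img: 2D list of pixel values (any hashable type). Returns block size."""
    h, w = len(img), len(img[0])
    g = gcd(h, w)  # The scale must divide both image dimensions.
    for y in range(h):
        for x in range(1, w):
            if img[y][x] != img[y][x - 1]:
                g = gcd(g, x)  # Horizontal color edge at column x.
    for x in range(w):
        for y in range(1, h):
            if img[y][x] != img[y - 1][x]:
                g = gcd(g, y)  # Vertical color edge at row y.
    return g
```

This breaks immediately on misaligned or noisy pixels (the interesting part of the question), so presumably the real implementation needs some tolerance, e.g. binning edge positions or scoring candidate grid offsets instead of exact GCDs.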
reply
razster
4 hours ago
[-]
That's actually a nice setup. Have you looked at Z-Image and the Pixel LoRA that was released? I've found it works fairly well at keeping the pixels matched with the grid.
reply
vunderba
3 hours ago
[-]
The Z-Image turbo model is pretty heavily distilled. I can't imagine using it for any marginally complicated prompts.

Are you talking about the LoRA by LuisaP?

Somewhat ironically, that LoRA's showcase images themselves exhibit the exact issues (non-square pixels, much higher color depth than real pixel art, etc.) that tools like this project / unfake.js / etc. are designed to fix.

https://imgur.com/a/vfvARkt

reply
29athrowaway
4 hours ago
[-]
Another annoyance of Nano Banana (and its Pro version) is that it cannot generate transparent pixels. When asked to, it hallucinates a checkerboard background instead, which makes things worse.
reply
vunderba
3 hours ago
[-]
Yep. Your best bet is to ask for a "solid white/black background" and then feed it into something like rembg [1]. It's an extra step, but it'll get you partway there.

On the OpenAI side, the gpt-image-1 model has actually been able to produce true alpha-transparent images for a while now. Too bad that, quality-wise, it lags pretty badly behind other models.

[1] - https://github.com/danielgatis/rembg
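As a crude stand-in for the rembg step (rembg itself uses an ML segmentation model; this sketch only handles the "solid background" trick above): key out any pixel within a tolerance of the background color and give everything else full alpha.

```python
# Crude background keying, a stand-in for rembg's ML-based matting.
# Only works because we asked the model for a solid background color.
def key_out_background(img, bg, tol=10):
    """img: 2D list of (r, g, b). Returns (r, g, b, a) rows with bg transparent."""
    def close(p, q):
        # Per-channel distance check against the background color.
        return all(abs(a - b) <= tol for a, b in zip(p, q))
    return [[p + ((0,) if close(p, bg) else (255,))  # alpha 0 = background
             for p in row] for row in img]
```

Obvious limitation versus rembg: this also punches holes through any interior region that happens to match the background color, which is exactly why the ML approach exists.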

reply
SXX
3 hours ago
[-]
Ask it for just a white background. Works well for both art and to-be-3D models.
reply
cipehr
4 hours ago
[-]
Is it possible that some of the reason pixels are messed up is because of the watermarking? https://deepmind.google/models/synthid/

Or is it purely because the models just don't understand pixel art?

reply
skavi
3 hours ago
[-]
I wonder if this would be a simple (limited) example of defeating the watermarking? Surely there's no way SynthID persists in what is now a handful of pixels.
reply
29athrowaway
4 hours ago
[-]
They also don't understand spritesheets.
reply