It Took Me 30 Years to Solve This VFX Problem – Green Screen Problem [video]
270 points
4 days ago
| 18 comments
| youtube.com
| HN
Springtime
18 hours ago
[-]
In an earlier video they made a couple of years back about Disney's sodium vapor technique, Paul Debevec suggested he was considering creating a dataset using a similar premise: filming enough perfectly masked references to be able to train models to achieve better keying. So it was interesting to see Corridor tackle this by using synthetic data instead.
reply
somat
18 hours ago
[-]
With regards to the sodium vapor process, an idea has been percolating in the back of my head ever since I saw that video. But I don't really have the budget to try it out.

Theory: make the mask out of non-visible light.

Illuminate the backing screen with near-infrared light. (After a bit of thought I chose near-IR as opposed to near-UV, for hopefully obvious reasons.)

Point two cameras at a splitting prism with a near-IR pass filter (I have confirmed that such a thing exists and is commercially available).

Leave the 90-degree (unaltered-path) camera untouched; this is the visible camera.

Remove the IR filter from the 180-degree (filter-path) camera; this is the mask camera.

Now you get a perfect, non-color-shifting mask (in theory). The splitting prism would hurt light intake, so it might be worth trying to put the cameras really close together, pointed in the same direction with no prism, and seeing if that is close enough.

reply
overvale
18 hours ago
[-]
Debevec tried a version of this: https://arxiv.org/abs/2306.13702
reply
dgently7
9 hours ago
[-]
I'm familiar with this work. Specifically, they tried replicating the sodium-vapor-style approach, but what worked at Poppins level isn't actually good enough for today: you still end up with light spill that contaminates the foreground, especially for things like the Fresnel reflections on the side of a face. The magenta idea was to still do what is basically a color-difference key, but increase the color separation between foreground and background by lighting the two with opposite-colored lights, then use an ML model to recover the original foreground color.
reply
wiml
17 hours ago
[-]
This approach was used in the 1950s/60s with ultraviolet light (rather than IR) to create a traveling matte. I'm not sure why visible-light techniques won out. Easier to make sure that the illumination is set up correctly, maybe?
reply
regularfry
6 hours ago
[-]
Or maybe they didn't want to blind the talent. UV isn't something you want to bathe your retinas in.
reply
randyrand
2 hours ago
[-]
Couldn't this be summarized as the Sodium Vapor technique but with near-IR? Or do I misunderstand something?
reply
diacritical
18 hours ago
[-]
Don't humans and other warm objects also radiate IR?
reply
somat
17 hours ago
[-]
That is far-IR, thermal stuff. Near-IR, 700-nanometer-ish, is just past red in human vision.

Camera sensors can pick up a little near-IR, so they have a filter to block it. If that filter were removed and a filter to block visible light used in its place, you would have a camera that can only see non-visible light. It would see poorly, since the camera was not engineered to operate in this band, but it might be good enough for a mask: a mask that does not interfere with any visible colors.

reply
fc417fc802
16 hours ago
[-]
> Poorly, the camera was not engineered to operate in this bandwidth

At least for cheap sensors in phones and security cameras, that engineering consists of installing an IR filter. They pick it up just fine, but we often don't want them to.

Keep in mind that sensors are inherently monochrome. They use multiple input pixels per output pixel with various filters in order to determine information about color.
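
As a toy illustration of that (assuming an RGGB Bayer pattern and even image dimensions; real demosaicing interpolates rather than just binning 2x2 blocks):

    import numpy as np

    def split_rggb(raw):
        # raw: [H, W] monochrome sensor readout behind an RGGB Bayer color filter array.
        # Each 2x2 block of photosites carries one R, two G, and one B sample:
        #   R  G
        #   G  B
        raw = raw.astype(np.float32)
        r = raw[0::2, 0::2]
        g1 = raw[0::2, 1::2]
        g2 = raw[1::2, 0::2]
        b = raw[1::2, 1::2]
        g = 0.5 * (g1 + g2)                    # average the two green photosites
        return np.stack([r, g, b], axis=-1)    # naive half-resolution RGB image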

reply
throwway120385
17 hours ago
[-]
You can actually dimly perceive near-IR LEDs -- they'll glow slightly red in darkness.
reply
adrian_b
4 hours ago
[-]
That depends on how "near" they are.

The sensitivity to red light decreases quickly at wavelengths greater than 650 nm, but light can still be perceived if it is strong enough, up to around 780 nm.

Many so-called near-IR LEDs may actually be somewhere around 750 nm, so they are still visible on a dark background, even if they are perceived as extremely dim.

On the other hand, there are many near infrared LEDs around 900 nm and those are really invisible. Near-infrared LEDs around 1300 nm or around 1550 nm are also completely invisible.

An invisible near-infrared laser beam could become visible due to two-photon absorption, but if a beam intense enough to cause two-photon absorption hits your retina, there are more serious things to worry about.

reply
diacritical
16 hours ago
[-]
I remember reading that some people can perceive some near-IR, but mostly that near-IR LEDs actually leak some red themselves, due to imperfections in manufacturing or something?
reply
actionfromafar
17 hours ago
[-]
I'll do you one better: one that requires no special cameras (most have IR filters), no double cameras, and no prisms.

Shoot the scene in 48 or 96 fps. Sync the set lighting to odd frames. Every odd frame, the set lights are on. Every even frame, set lights are off.

For the backing screen, do the reverse. Even frames, the backing screen is on. Odd frames, backing screen is off.

There you go: mask / normal shot / mask / normal shot / mask ... you get the idea.

Of course, motion will cause the normal image and the mask to go out of sync, but I bet that can be remedied by interpolating a new frame between every mask frame. Plus, when you mix it down to 24 fps you can introduce as much motion blur and shutter-angle "emulation" as you want.
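
Roughly, the demux step would look something like this (a sketch assuming the frames land in a numpy array; a plain average stands in for proper motion-compensated interpolation):

    import numpy as np

    def split_ghost_frames(frames):
        # frames: [N, H, W, 3] float array shot at 48/96 fps with alternating lighting.
        # 1st, 3rd, 5th... frames: set lights on  -> beauty (normal) frames.
        # 2nd, 4th, 6th... frames: backing screen on -> mask frames.
        beauty = frames[0::2]
        mask = frames[1::2]
        # Naive temporal alignment: average each pair of neighbouring mask frames so
        # the result lines up (roughly) with the beauty frame shot between them.
        mask_interp = 0.5 * (mask[:-1] + mask[1:])
        return beauty[1:len(mask_interp) + 1], mask_interp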

reply
ryandamm
17 hours ago
[-]
This is called “ghost frame” and already exists in Red cameras and virtual production wall tools like Disguise.
reply
actionfromafar
15 hours ago
[-]
You basically need to timecode/genlock the greenscreen "illumination LEDs" to the camera so the greenscreen lights up exactly at every other frame. Not sure if there's any off-the-shelf solution that can do that, but if not, it can't be super hard to cobble together.
reply
cma
14 hours ago
[-]
reply
eichin
11 hours ago
[-]
Somebody recently used a variation of this to get good video of welding - basically a camera synced with a very bright (strobe-ish) light, brighter than the weld itself, so you adjust the camera to the ludicrous-but-consistent brightness level and get details of the weld and the surroundings. https://www.youtube.com/watch?v=wSUxK8q4D0Q (Chronos "Helios", from early 2025)
reply
shdudns
14 hours ago
[-]
Two problems:

- It'll bleed on fast motion. Hair in the wind would just not work.

- Incandescent lights are out.

You could solve both by having two ghost frames shot very close to the real frame (no need to space the frames evenly, after all) and strobing a high-powered laser.

You'd need a very fast sensor, or another one optically in the same position.

reply
actionfromafar
5 hours ago
[-]
At some point higher fps solves it. Is 240 fps enough?
reply
amluto
17 hours ago
[-]
Surely this makes your actors feel sick? And wouldn’t it make your motion blur look dashed and also cause artifacts at the edge of the mask if there’s a lot of motion?
reply
throwway120385
17 hours ago
[-]
You could strobe at some multiple of the sensor frame rate as long as your strobes are continuous through the integration period of the sensor and the lighting fades very quickly. This probably wouldn't work with incandescents but people strobe LEDs a lot to boost the instantaneous illumination without going past the continuous power rating in the datasheet.
reply
amluto
15 hours ago
[-]
You mean do strobe, strobe, strobe, strobe, pause, pause, pause, pause? I bet that's at least as bad as holding the source on for the first four intervals and then off for the latter four intervals.

In any case, if you actually have a scene bright for 1/24th of a second and then dark for 1/24th of a second, repeating, you're well within photosensitive epilepsy range. Don't do that to your actors unless you've discussed it with them and with your insurance company first.

reply
actionfromafar
14 hours ago
[-]
So, shoot at 240 fps and strobe set lights for 1/240s and backdrop for 1/240s.
reply
wlesieutre
12 hours ago
[-]
And if you want a shutter speed slower than 1/240th of a second: no, you don't.
reply
actionfromafar
7 hours ago
[-]
Or... you frame-blend in Fusion or go whole hog in Nuke.

( https://www.nukepedia.com/tools/gizmos/time/vectorframeblend... )

reply
kibibu
16 hours ago
[-]
Incandescent and fluorescent lights already flicker at your AC power frequency. Just gotta be higher than that
reply
amluto
15 hours ago
[-]
No.

Incandescent lights flicker at twice your AC power frequency -- to a decent approximation, their power is proportional to V^2. But this is input power -- the cooling of the filament is slowish and the modulation depth is low. Most people aren't bothered by this.

Fluorescent lights with old or very crappy "magnetic" ballasts flicker at twice the mains frequency, with deep modulation. The effect on people varies from moderate to extremely unpleasant, and it's extra bad if anything is moving quickly (gyms, etc). There are even studies showing that office workers perform worse under such lighting even if they don't experience personally perceptible symptoms. The effect is so severe that people invented the "electronic ballast", which flickers at much, much higher frequency and avoids low-frequency components. Phew. (The light might still be a nasty color, but the temporal output is okay.)

"Driverless LEDs" are deeply modulated at twice the mains frequency. These are very nasty.

If you actually have a light that flickers at the AC power frequency (certain LED sources in a two-brightness diode-dimmed kitchen appliance fixture will do this, as will driverless LEDs with certain types of failures), then it's extra nasty.

There are plenty of people around who find (depending on the actual waveform) 60Hz flicker intolerable and 120Hz flicker extremely unpleasant. And there are plenty of people who can often perceive flicker under appropriate circumstances up to at least several hundred Hz and even into the low kHz with certain shapes of light sources. You can read up on IEEE 1789 to find a standard based on actual research on what lighting waveforms should look like.

The effect of 120 Hz flicker is bad enough that energy codes in some places (e.g. California) have started to require that LED sources minimize this flicker, but, sadly, it's poorly enforced.

reply
kibibu
11 hours ago
[-]
Hey thanks for clearing this up. I had no idea that CFLs and fluorescent lights with electronic ballasts now flicker at ~ 20kHz.
reply
SoftTalker
13 hours ago
[-]
The fluorescent light strobing is why you often see fluorescent tubes in pairs. They will be wired in opposite phase to cancel the strobing.
reply
tlb
3 hours ago
[-]
I think the total light output of each bulb in the pair is the same at all points in time, but the orange-blue gradient is reversed. So when one is orange at one end, the bulb beside it is blue at that end.

IIRC, the end that's negative looks orange, because the electrons emitted from the filament haven't gotten up to speed yet and can't ionize the mercury atoms at that end to the highest states.

If you didn't do this, you'd see 60 Hz strobing when you looked at one end.

reply
toss1
12 hours ago
[-]
Also, the human eye sees flicker much better at the periphery than in the central area. The rod receptor cells respond more rapidly than the color-sensitive cone cells, and peripheral vision is also more tuned to quick motions (there is a big advantage in detecting peripheral motion faster, hence positive evolutionary selection pressure).
reply
joecool1029
15 hours ago
[-]
Phosphors and capacitors are a thing that masks that, as is high-frequency switching way above this rate…

Anyway, an old HN submission I still use when buying light bulbs: https://news.ycombinator.com/item?id=14023196

reply
actionfromafar
15 hours ago
[-]
Feel sick? Possibly. People are more or less sensitive to imperceptible flicker.

Artifacts?

I bet that can be remedied by interpolating a new frame between every mask frame. Plus, when you mix it down to 24fps you can introduce as much motion blur and shutter angle "emulation" as you want.

Motion blur can also be very forgiving. You are more likely to notice artifacts in still or slow-moving scenes, and then the problem goes away.

reply
dgently7
9 hours ago
[-]
This is the approach that stop motion uses, except they get to keep the camera in the same place. It's still not perfect, because of spill from the background onto the foreground, and it requires additional masking and cleanup.
reply
ErroneousBosh
15 hours ago
[-]
Corridor Crew cover this in one of their VFX breakdowns. I can't remember the film, but it was supposed to be filmed on a rapidly rotating platform.

There were a large number of lights around it and each one was blinked on for an instant while the camera shot at an insanely high frame rate - something like 288 frames per second with twelve lights.

This meant that after the fact you could pick any one of the twelve frames for that 1/24th of a second, to choose the angle the light was hitting from.

reply
diacritical
18 hours ago
[-]
From ~04:10 to 05:00 they talk about sodium-vapor lights and how Disney has the exclusive rights to use the technique. From what I read, the knowledge of how to make it work is a trade secret, so it's not patented. It seems weird that it would be hard to recreate something from the 1950s.

I also wonder how many hours were wasted by people who had to use inferior technology because Disney kept it secret. Cutting out animals and objects from the background one frame at a time seems so mind-numbingly boring.

reply
meatmanek
17 hours ago
[-]
The lights are relatively easy to get. IIRC (it's been a bit since I watched their full video on the subject [1]) the hard part to find was the splitter that sends the sodium-vapor light to one camera and everything else to another camera.

1. https://www.youtube.com/watch?v=UQuIVsNzqDk

reply
aidenn0
17 hours ago
[-]
It would seem to me to be relatively easy to build something like that if you're okay shooting with effectively a full stop less light (just split the image with a half-silvered reflector and use a dichroic filter to pass the sodium-vapor light on one side).

The splitter would have to be behind the lens, so it would require a custom camera setup (probably a longer lens-to-sensor distance than most lenses are designed for, too), but I can't think of any other issues.

reply
toast0
14 hours ago
[-]
At the end of this video they link to another video from a year ago [1] (this is the same link as the comment you were commenting on, whoops), where they recreate the sodium vapor process with a beam-splitter rig: one side had a filter to reject sodium-vapor light, the other had one to reject everything but sodium-vapor light, and then a camera on each side.

The Disney process had the filter essentially built into the beam splitter, but AFAIK nobody knows how to make that happen again (or nobody who knows how knows it's a desirable thing). Seems like the optics might be cumbersome, but the results seem worthwhile.

Also, you still need careful lighting; you don't want your foreground illuminated by sodium vapor. But I wonder if you could light the background screen from behind (like a rear-projection setup) to reduce the amount of sodium-vapor light that reflects from the foreground to the camera.

[1] https://m.youtube.com/watch?v=UQuIVsNzqDk

reply
aidenn0
14 hours ago
[-]
We know how to make dichroic prisms (Technicolor used them when filming, as did "3 CCD" digital cameras), but I imagine that to have a sufficiently narrow rejection band for the sodium-vapor process, you would need to be smart about where you place the prism, since the stop-band of a dichroic filter changes with angle of incidence.
reply
diacritical
17 hours ago
[-]
Yup, I wanted to say that the prisms are hard to recreate, not the light itself.
reply
somat
14 hours ago
[-]
It is a well-known process. There isn't a lot of general use, so costs are not low, but they're not nearly as high as the original Disney prism; I would guess around 1000 USD for one. As far as I can tell, any well-equipped optics laboratory could make a beam splitter with whatever frequency gate they want.

https://accucoatinc.com/technical-notes/beamsplitter-coating...

I have no idea about that specific company; I just picked it after a search for "beam splitter".

After I saw that video on the sodium-vapor illumination process I was curious whether you could instead use near-IR light as the mask illumination. In theory you would have a perfect mask (as in the Disney process) and no color interference. I found that frequency-gated beam splitters are a fairly common scientific instrument.

reply
diacritical
14 hours ago
[-]
Thanks for the info.

As for the IR idea, I wonder if there's something like a crowdfunding/crowdsourcing site for ideas where the person who had the idea doesn't really want to do it, but leaves it open to others to try. You said you "don't really have the budget to try it out", but let's say even if you had the money, it wouldn't be a priority for you, as you're not an expert or you have better things to do or whatever. Is there a place to just shout ideas into and see if any market-oriented entity would take it upon themselves to try doing it? Besides forums full of ideas like "tinder but for X" and such crap? Because, imagine if your idea really is a great one. A couple hours from now it would be buried in HN.

reply
jasonwatkinspdx
17 hours ago
[-]
Yeah, that's just nonsense. We used sodium vapor monochromatic bulbs in my high school physics class to duplicate the double slit experiment.

I suspect the real reason is that digital green screen in the hands of experienced people is "good enough" vs the complication of needing a double camera and beam splitting prism rig and such.

reply
like_any_other
15 hours ago
[-]
Even if it had been patented, patents from the 1950s would have long expired. In fact, patents from 2005 would have expired - the US patent term is only 20 years.
reply
diacritical
14 hours ago
[-]
Didn't know that, thanks. Although 20 years seems too much for some things, especially computer-related fields that move much more quickly than other fields. And now pretty much everything depends on or runs on computers, so 20 years seems like too much. I don't know if I phrased it correctly, but I mean to say that before computers, things moved much more slowly. Even a century for a patent would've been fine 500 years ago, but now almost every field has been changed by computers in the past few decades and will change even more rapidly. Letting one company have the advantage for 20 whole years now is much more impactful than it must've been before.
reply
jayd16
15 hours ago
[-]
As far as alternatives go, I wonder if anyone has tried a screen that cycles through colors in a known sequence. With a modulating-color screen it might actually be easier to separate the subject, because you get around the "green shirt over green screen" problem. You might even be able to use time sampling to correct the light cast on the subject from the screen, since you would have a full spectrum of response.

I could also imagine using polarized light as the backdrop.

reply
dgently7
10 hours ago
[-]
The general problem with any technique that isn't just "throw some vaguely green thing behind our actors" is that setting up complicated tech like this on an actual film set is extremely expensive, both in the time it would take and the risk of it not working. So you end up with a dedicated permanent stage install, but now you need to get the actors and crew to that place. Better keys aren't a bad enough problem to justify that effort/cost. Even the highly touted "virtual production" Mandalorian stuff, where you just put a big LED wall behind the actors, has been shown to be more expensive than traditional VFX unless you tightly control the creative or the approach.
reply
lynnharry
8 hours ago
[-]
It’s fascinating to see the bridge between academic research and industry application here. While Image Matting is a massive research area in Computer Vision, academia often focuses on solving perfect 'benchmarks.' Corridor Crew effectively took that foundational research, like neural unmixing and synthetic training, and adapted it to solve the 'messy' reality of production, like tracking markers and motion blur. It’s a great example of using open-source deep learning resources to build a tool that prioritized workflow over just a high accuracy score.
reply
jcmoscon
37 minutes ago
[-]
It is refreshing to see problems being solved by AI that are not LLMs. There are so many day-to-day challenges that we could solve using data, machine learning, and some creativity.
reply
mk_stjames
3 hours ago
[-]
In case anyone wanted technical details of the NN, I dug into the repo:

It's a transformer, with a CNN refiner after. Specifically, a ViT using the Hiera architecture (https://github.com/facebookresearch/hiera)

The Hiera ViT has dual decoder heads, one for the alpha and one for the RGB foreground, and then a small CNN refiner network to solve some artifacting in the output from the Hiera model.

I'd be very interested to see a long form tech talk of Niko explaining his process of learning ML ropes and building this model.
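
To give a sense of the overall shape, here is an illustrative PyTorch sketch of a dual-head ViT-plus-refiner setup -- not the repo's actual code, and with a plain conv stack standing in for the Hiera backbone:

    import torch
    import torch.nn as nn

    class MattingNet(nn.Module):
        """Illustrative dual-head matting model: shared encoder, alpha + FG heads, CNN refiner."""
        def __init__(self, feat_ch=256):
            super().__init__()
            # Placeholder encoder standing in for a hierarchical ViT backbone (e.g. Hiera).
            self.encoder = nn.Sequential(
                nn.Conv2d(3, feat_ch, 4, stride=4), nn.GELU(),
                nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.GELU(),
            )
            # Two decoder heads: one predicts the alpha matte, one the RGB foreground.
            self.alpha_head = nn.Sequential(
                nn.ConvTranspose2d(feat_ch, 64, 4, stride=4), nn.GELU(),
                nn.Conv2d(64, 1, 3, padding=1), nn.Sigmoid(),
            )
            self.fg_head = nn.Sequential(
                nn.ConvTranspose2d(feat_ch, 64, 4, stride=4), nn.GELU(),
                nn.Conv2d(64, 3, 3, padding=1), nn.Sigmoid(),
            )
            # Small CNN refiner that cleans up artifacts using the input plus coarse predictions.
            self.refiner = nn.Sequential(
                nn.Conv2d(3 + 1 + 3, 32, 3, padding=1), nn.GELU(),
                nn.Conv2d(32, 1, 3, padding=1), nn.Sigmoid(),
            )

        def forward(self, img):
            feats = self.encoder(img)
            alpha_coarse = self.alpha_head(feats)
            fg = self.fg_head(feats)
            alpha = self.refiner(torch.cat([img, alpha_coarse, fg], dim=1))
            return alpha, fg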

reply
swframe2
12 hours ago
[-]
The model in this repo seems pretty good: https://github.com/xuebinqin/DIS
reply
vsviridov
18 hours ago
[-]
The community has managed to drastically lower hardware requirements, but so far I think only Nvidia cards are supported, so as an AMD owner I'm still missing out :(
reply
mouth
14 hours ago
[-]
This works on macOS, as well, via Apple Silicon.
reply
superjan
18 hours ago
[-]
Watched this a few days ago. The video is light on technical details, except maybe that they used CGI to generate training data.
reply
rhdunn
17 hours ago
[-]
The idea behind a greenscreen is that you can make that green colour transparent in the frames of footage, allowing you to blend them with some other background or layered footage. This has issues like not always having a uniform colour, difficulty with things like hair, and lighting affecting some edges. These have to be manually cleaned up frame by frame, which takes a lot of time that is mostly busy work.

An alternative approach (such as that used by the sodium lighting on Mary Poppins) is to create two images per frame -- the core image and a mask. The mask is a black-and-white image where the white pixels are the pixels to keep and the black pixels are the ones to discard. Shades of gray indicate blended pixels.
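
In code terms, applying that mask is just a standard alpha-over composite (a minimal sketch; the names here are placeholders, not from any particular tool):

    import numpy as np

    def composite(fg, bg, alpha):
        # fg, bg: [H, W, 3] float images in [0, 1]; alpha: [H, W] mask in [0, 1]
        # (white = keep foreground, black = use background, gray = blend the two).
        a = alpha[..., None]
        return a * fg + (1.0 - a) * bg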

For the mask approach you are filming a perfect alpha channel to apply to the footage, an approach that doesn't have the issues of greenscreen. The problem is that this requires specialist, licensed equipment and perfect filming conditions.

The new approach is to take advantage of image/video models to train a model that can produce the alpha channel mask for a given frame (and thus an entire recording) when just given greenscreen footage.

The use of CGI in the training data allows the input image and mask to be perfect without having to spend hundreds of hours creating that data. It's also easier to modify and create variations to test different cases such as reflective or soft edges.

Thus, you have the greenscreen input footage, the expected processed output, and the alpha channel mask. You can then apply traditional neural-net training techniques to the data, using the expected image/alpha channel as the target. For example, you can compute the difference of each alpha-channel output neuron from the expected result, then apply backpropagation to push that error back through the network, and nudge the neuron weights in the computed gradient direction. Repeat that process across a distribution of the training images over multiple passes until the network no longer changes significantly between passes.
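
A minimal sketch of that kind of training loop in PyTorch (the model and data here are tiny placeholders, not anything from the actual project):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Placeholder model: any network mapping an RGB frame to a 1-channel alpha matte.
    model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

    # Stand-in for the synthetic dataset: (green-screen frame, ground-truth alpha) pairs.
    frames = torch.rand(8, 3, 64, 64)
    alphas_gt = torch.rand(8, 1, 64, 64)

    for epoch in range(10):
        alpha_pred = model(frames)               # forward pass
        loss = F.l1_loss(alpha_pred, alphas_gt)  # per-pixel difference on the alpha output
        optimizer.zero_grad()
        loss.backward()                          # backpropagate the error
        optimizer.step()                         # nudge weights along the gradient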

reply
comex
18 hours ago
[-]
See also this video comparing Corridor Key to traditional keyers:

https://www.youtube.com/watch?v=abNygtFqYR8

reply
Sniffnoy
8 hours ago
[-]
Summary: He created four hard-to-key shots, and on each of them tried KeyLight, IBK, and Corridor Key. Overall, on three of them he judged that Corridor Key had done the best job; on one of them he judged that IBK had done the best job. I think on all of them he judged that more work was still necessary; none of them was fully usable as-is.
reply
amelius
16 hours ago
[-]
Is it a coincidence that the result is stable between subsequent frames?
reply
qingcharles
11 hours ago
[-]
I use the Adobe version of this in Photoshop every day, and I assumed that Adobe solved it the same way, except using professionals to cut out the subjects from the backgrounds and then feeding both versions into their AI.

Since they added it a year or so ago it has been game-changing. I'm cutting out portraits every day, and having a magical tool that extracts the subject, with perfect hair, in a single click is sci-fi.

Here's a demo of Photoshop's tool:

https://www.youtube.com/watch?v=SNVJN6PKeGQ

(the other magical Photoshop tool is the one that removes reflections from windows, which is even more insane when you reverse it and tell it you only want the reflection and not what's on the other side of the glass)

reply
IshKebab
18 hours ago
[-]
Pretty impressive results! Seems like someone has even made a GUI for it: https://github.com/edenaion/EZ-CorridorKey

Still Python unfortunately.

reply
BoredPositron
17 hours ago
[-]
Like 90% of the other tooling in VFX...
reply
IshKebab
16 hours ago
[-]
Is it? That's a shame. I assumed this was Python because of PyTorch.
reply
amelius
17 hours ago
[-]
There's still a bug: the glass with water does not distort the checker pattern in the background at 24:12.
reply
nstart
11 hours ago
[-]
Good spot! That is the product working as intended, though. The background doesn't exist except as an asset that replaces the green screen. The tool is meant to replace the green screen without the need for manual rotoscoping. Even in a traditional process, the distortion needs to be done by VFX as a separate step. To do that, though, they still need the green screen keyed out, and this tool does that.
reply
jweir
16 hours ago
[-]
True, but with visual art there is what is correct and what looks correct. When things are moving and the area is small, no one is going to notice.

But now that this problem is solved, a director will come along and say... I want a scene with a big glass of water, and the camera will zoom in on it and see the monster refracted through the glass.

reply
gmueckl
16 hours ago
[-]
At that point it's better to do the glass entirely in post.
reply
nkrisc
2 hours ago
[-]
Not a bug; creating distortion in the comped-in backgrounds is not what this tool does. It creates a transparency mask. How do you propose a transparency mask captures distortion artifacts?

That distortion to the new background would have to be added in by the artist.

reply
catapart
16 hours ago
[-]
I wouldn't call it a bug. This is a first step, not a final step. Maintaining the refraction might be more realistic, but it's not necessarily what the creator wants.
reply
CharlesW
16 hours ago
[-]
When you watch the video it becomes pretty clear why it wouldn't be able to do that, although it's fun to think about how a future iteration or alternative might be able to credibly (if you don't look too hard) mimic that someday.
reply
DrewADesign
16 hours ago
[-]
You’d have to track it, render it, and comp it in. It’s not ridiculously difficult, but there’s no way that’s going to happen automatically.
reply
orbital-decay
16 hours ago
[-]
>there’s no way that’s going to happen automatically

They train their model in a pretty straightforward way; it could also be used to capture the distortion, by using a non-monochrome (possibly moving) background optimized for this. It's a matter of effort and attention to detail during training (uneven green-screen lighting, reflections, etc.), not a fundamental impossibility.

reply
amelius
16 hours ago
[-]
Yes. But the main issue is in the way they formulate the problem. Their output is always a transparency mask, which of course will never handle distortions.
reply
dgently7
9 hours ago
[-]
You'd have to train it to also generate an ST map of the distortions, but creating the ground-truth version of that from the synthetic data would add a lot more to render. Also, it's very easy to plausibly fake; it's not something humans are good at seeing and knowing is wrong. You can tell when it's completely missing, but accurate versus just plausibly distorted is not something most brains are tuned to notice.
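
For anyone unfamiliar, an ST map is just a per-pixel lookup image; applying one to warp the new background is roughly this (a nearest-neighbour sketch, and axis conventions vary between packages):

    import numpy as np

    def apply_st_map(background, st_map):
        # background: [H, W, 3] image; st_map: [H, W, 2] with normalized (s, t)
        # coordinates saying where each output pixel should sample the background.
        h, w = st_map.shape[:2]
        xs = np.clip((st_map[..., 0] * (w - 1)).round().astype(int), 0, w - 1)
        ys = np.clip((st_map[..., 1] * (h - 1)).round().astype(int), 0, h - 1)
        return background[ys, xs]
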
reply
DrewADesign
14 hours ago
[-]
Right. Things like this are why it’s difficult integrating AI into professional movie pipelines: they’re super complex in ways AI cannot (yet) replicate, for very good reasons that seem superfluous or trivially replaceable to people not familiar with them.
reply
orbital-decay
9 hours ago
[-]
People in ML have this kind of belief, rooted in the bitter lesson, that everything will eventually sort itself out given enough scale and data. That often makes them ignore the nuances of particular problem domains. CC is the opposite of that; it's just impossible to do everything at once.
reply
DrewADesign
2 hours ago
[-]
It’s certainly a big part of the ML scene, but to a slightly lesser extent, a cultural facet of development in general. It’s not all bad! Many people have solved problems that nobody in their right mind would have attempted knowing the nitty-gritty details; often the problem they solved wasn’t the one they intended to solve, or they only solved one small subset of it, but these were still valuable advancements. Unfortunately, that also leads to reinforcing some people’s Dunning-Kruger-fueled insistence that they can solve another field’s difficult problems with a few thought experiments, and that the only reason it hasn’t already been solved is because nobody thought to ask a developer as smart as them to momentarily consider the problem. Non-developers in tech often bear the brunt of it: moving into design after a decade of dev work, that irritating mindset was one of the reasons I left tech altogether a couple of years later.
reply
orbital-decay
16 hours ago
[-]
Sure, because they used monochrome backgrounds and never really captured any distortion.
reply
dylan604
16 hours ago
[-]
The sad thing about this is the problems encountered during post that come from the production team saying "fix it in post" during the shoot. I've been on set for green-screen shoots where the lighting was not done properly. I watched the gaffer walk across the set taking readings from his meter before saying the lighting was good. I flipped on the waveform monitor and told him it was not even (which never goes down well when the camera department tells the gaffer it's not right). He put up an argument, went back, and took measurements again before repeating that it was good. I flipped the screen around and showed him where it was obviously not even. A third set of meter readings and he started adjusting lights. Once the footage was in post, the FX team commented on how easy the keys were because of the even lighting.

The problem is that the vast majority of people on set have no clue what is going on in post. To the point that, when the budget is big enough, a post supervisor is present on production days to give input so that "fixing it in post" is minimized. When there is no budget, you'll see situations just like in the first 30 seconds of TFA's video: a single lamp lighting the background, so you can easily see the light falling off and the shadows from wrinkles where the screen was pulled out of the bag 10 minutes before shooting. People just don't realize how much light a green screen takes. They also fail to have enough space to pull the talent far enough off the wall to avoid the green reflecting back onto the talent's skin.

TL;DR: They solved something to make post less expensive because they cut corners during production.

reply
weinzierl
16 hours ago
[-]
I fully agree, but I think for them making it possible to cut corners during production is the whole point. Think about it: the choice is between 5 minutes of work plus a one-time purchase of a decent GPU, and a big room with a complex lighting setup and a post supervisor present. Now, the quality of the end result will not be the same, for sure. You and I would opt for the quality setup whenever we can, but many others won't.
reply
dylan604
16 hours ago
[-]
If you're on such a low production budget that you just physically do not have the lamps to light a screen, then you really have to ask if green screen is the right option. Maybe flip it and shoot black limbo so you do not need lights, and the lights you do have can be better used as key lights for separation. You also don't have to worry about the color cast from your lit screen. Essentially, you just need a garbage matte for the key, and then clean up whatever might be getting keyed that you don't actually need. Detecting the foreground subject against the background is so capable now that a screen isn't necessary, and matte cleanup is pretty much unnecessary. Of course you lose the street cred of being able to say you used a green screen, but who cares as long as the shot works out.
reply
bonoboTP
15 hours ago
[-]
Sometimes the cost is human expertise. If the tech allows you to get stuff done with less competent staff, it's a win.
reply
CharlesW
16 hours ago
[-]
> TL;DR They solved something to make post less expensive because they cut corners during production.

FWIW, having watched the entire thing, they never blamed bad production staff or unavoidable constraints. Those are things that anyone working with others experiences when making anything, whether it's YouTube videos or enterprise software products. My TLDR is: "Chroma keying is a fragile and imperfect art at best, and can become a clusterf#@k for any number of reasons. CorridorKey can automatically create world-class chroma keys even for some of the most traditionally challenging scenarios."

reply
dgently7
9 hours ago
[-]
Didn't Dune win a VFX Oscar, and their screens weren't even green at all? They were tan, like sand.
reply
plastic3169
7 hours ago
[-]
Yes, and it was a massive manual effort. In a way they acknowledged that keying does not really work all the way, and that having that unnatural color everywhere on the set is not worth it. It's a massive production with heavy VFX work, so not something you can apply to your own production. The sand-screen and roto sections of this discussion are interesting.

https://youtu.be/UARrOsNPviA

reply
bbstats
15 hours ago
[-]
Their green screen is good. That has nothing to do with this video.
reply
bbstats
16 hours ago
[-]
... did you watch the video?
reply
MrVitaliy
15 hours ago
[-]
Has anyone tried using lidar and just cutting by measured distance to the object?
reply
Coeur
1 hour ago
[-]
Well there was ZCam, which was a time-of-flight add-on for ENG cameras. It couldn't really handle edges perfectly and existing bluescreen tech was good enough for TV production, so they pivoted into gaming and sold to Microsoft for the Kinect.

https://en.wikipedia.org/wiki/ZCam (Demo: https://www.youtube.com/watch?v=s7Kcmx29RCE )

reply
rcxdude
11 hours ago
[-]
Apparently they used something similar for production on Avatar: stereo cameras for depth estimation, which allowed real-time depth compositing of CG characters onto the shots they were taking. That makes it a lot easier to get everyone on the same page about the scene, especially with characters that are outside normal human proportions. But it wasn't good enough for the final shots.
reply
wizzledonker
15 hours ago
[-]
That would require calibration with the camera, and even then the camera and lidar sensor can’t be in exactly the same place. I doubt results would be better.
reply
summarity
14 hours ago
[-]
Well, sort of; the industry tried to go way beyond that by capturing the entire light field: https://techcrunch.com/2016/04/11/lytro-cinema-is-giving-fil...
reply
dgently7
10 hours ago
[-]
Per-pixel depth does not solve for semi-transparency.
reply
ralusek
18 hours ago
[-]
I'm a software engineer who, like the vast majority of you, uses AI/agents in my workflow every day. That being said, I have to admit that it feels a little weird to hear someone who does not write code say that they built something, without even mentioning that they had an agent build it (unless I missed that).
reply
tekacs
16 hours ago
[-]
Worth bearing in mind that people in VFX are often relatively technical.

From their own 'LLM handover' doc: https://github.com/nikopueringer/CorridorKey/blob/main/docs/...

> Be Proactive: The user is highly technical (a VFX professional/coder). Skip basic tutorials and dive straight into advanced implementation, but be sure to document math thoroughly.

reply
steve_adams_86
6 hours ago
[-]
Back when I played with animation and post pipelines, I was writing a decent amount of Python. It's part of how I got into programming. At the time I would have said I couldn't program, and I suspect this guy is similar.
reply
adamtaylor_13
16 hours ago
[-]
This is interesting. I had the exact opposite reaction.

You don't hear architects get hounded because they say they "built" some building, even though it was definitely the guys swinging hammers who built it. Yet somehow, because he didn't artisanally hand-craft the code, he needs to caveat that he didn't actually build it?

reply
plopz
16 hours ago
[-]
Architects I know say "I designed that", "I worked on that", "I specified that", or "I chose that"; they don't say "I built that".
reply
kalaksi
16 hours ago
[-]
Maybe it's a language thing. Architects saying they built something sounds a bit off to me. In my native language, and in everyday language, I don't think people would use "built" like that. I don't know how architects talk with each other, though.
reply
krackers
8 hours ago
[-]
It's actually a bit refreshing that they didn't brand this with the usual LLM hype. And it's a good example of someone using LLMs to solve a problem by bringing in their domain knowledge. (The solution is surprisingly simple, though; I wonder if other people have done this before but kept it proprietary/in-house.)
reply
F7F7F7
11 hours ago
[-]
Is a carpenter who relied on a CNC to cut all of their pieces a builder?
reply
jrm4
17 hours ago
[-]
I mean, the heading of the video says "he solved the problem," which I think is wise to pay a lot of attention to.
reply
Computer0
18 hours ago
[-]
Looking forward to trying it out -- 8 GB of VRAM or unified memory required!
reply