As such I always use this prompt as a test: "A video of a Jenga brick tower falling over as a brick is removed. The physics of each brick must be realistic."
It gave me a video where bricks suddenly disappear or morph into others[1]. The linked video is after 2-3 iterations of me insisting on realistic physics. If you are just glancing at this, you would believe it is realistic.
That said, this is still very impressive and one more step towards .. IDK what. But I am a bit reassured that at least my job won't be fully replaced with AI :)
I honestly can't comment with certainty that training from videos alone and whatever tokenization scheme they're using will ever get perfect dynamics.
However it is worth noting that transformers can do a pretty good job at learning dynamics with the right pipeline (not video): https://arxiv.org/pdf/2605.15305 https://arxiv.org/pdf/2605.09196
My point here being that representationally, it might be possible to learn good dynamics without a radically different approach/arch. There are already models that extract 3D tracking points from videos, so they could possibly be leveraged for learning dynamics (which on its own gives precedent for end-to-end approaches also possibly working).
I’ve often thought it would be very handy to have a proper simulator for being able to simulate and identify inefficiencies in one’s technique, but no idea whether it would be feasible to do.
Proper simulators for those exist; you essentially need an engine with a compliant contact model. MuJoCo is the go-to here, see:
https://mujoco.readthedocs.io/en/stable/modeling.html#muscle... https://mujoco.readthedocs.io/en/stable/computation/fluid.ht...
These explicitly model biological muscles. IIRC it was originally created to model human hands (I could be misremembering though).
Really depends on the fidelity you want.
Edit: I also work in rigid body simulation for robotics.
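To make "compliant contact model" a bit more concrete, here is a minimal sketch in plain Python. It is not MuJoCo's solver or API; it assumes the simplest possible compliant contact, a one-dimensional spring-damper (penalty) force between a point mass and the ground. The function name and all parameter values are made up for illustration.

```python
def simulate_drop(z0=0.5, mass=1.0, k=1.0e4, c=50.0, g=9.81,
                  dt=1.0e-4, steps=30_000):
    """Drop a point mass onto the plane z = 0 with a spring-damper contact.

    Illustrative assumption: the ground pushes back with
    max(0, -k*z - c*v) while the mass penetrates it (z < 0).
    Returns the final height and velocity.
    """
    z, v = z0, 0.0
    for _ in range(steps):
        if z < 0.0:
            # Compliant contact: the force grows with penetration depth and
            # is clamped so the ground can push but never pull.
            contact = max(0.0, -k * z - c * v)
        else:
            contact = 0.0
        a = contact / mass - g
        # Semi-implicit Euler: update velocity first, then position.
        v += a * dt
        z += v * dt
    return z, v


z_rest, v_rest = simulate_drop()
# At rest the spring force balances gravity, so the mass sinks by m*g/k.
expected_rest = -1.0 * 9.81 / 1.0e4
print(z_rest, expected_rest, v_rest)
```

The point of the sketch is that the body is allowed to sink slightly into the ground and the contact force is a smooth function of that penetration, which is the property that makes soft things like tissue or muscle tractable. Real engines use considerably more careful formulations.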
We were sharing game clips with each other and after a while realised our old clips were just gone, being deleted after 30 or 90 days or something.
The other problem is Seedance is heavily censored because of copyright concerns.
> The other problem is Seedance is heavily censored because of copyright concerns.
Instead of censoring, wouldn't it make sense to simply not train on copyrighted materials?
Which, considering just how pretty and detailed this whole thing looks, imo points at a fundamental issue with how these things are trained - it's as if there's no structure to its knowledge and training, like how an artist trained to draw would first try to understand simple 2d composition, then perspective, then light and shadow, mastering each concept and gradually building up a hierarchical understanding - it seems like it's trying to learn everything at once.
I would rather see an AI model that I could give a floorplan of a building and it would generate an accurate flythrough on any path, even if it looked like butt.
I'm not just talking out of my arse, I did work for a while in data science/engineering, and one of the big lessons people needed to be reminded of is to clean/downsample the data - a dataset consisting of a million samples could very well take 1000x as long to process as if we downsampled the whole thing to just a couple of thousand samples, and we could reach the same conclusions with a fraction of the time/effort.
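A toy illustration of the downsampling point, as a sketch only: it assumes synthetic Gaussian data and that the "conclusion" we care about is just the mean. The helper name, seeds and sizes are made up for the example; real datasets with rare events or heavy tails need more care than a uniform sample.

```python
import random
import statistics


def downsample(data, k, seed=0):
    """Return k items drawn uniformly without replacement from data."""
    rng = random.Random(seed)
    return rng.sample(data, k)


# Synthetic stand-in for "a million samples" (assumed: mean 10, std dev 2).
rng = random.Random(42)
full = [rng.gauss(10.0, 2.0) for _ in range(1_000_000)]

# Keep only a couple of thousand samples.
subset = downsample(full, 2_000)

full_mean = statistics.fmean(full)
subset_mean = statistics.fmean(subset)
print(f"full: {full_mean:.3f}  subset: {subset_mean:.3f}")
```

The subset's mean has a standard error of roughly 2 / sqrt(2000), about 0.045, so for a question like "what is the average?" the extra 998,000 samples buy very little.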
I'm sure there's a similar logic in RL, that if you dump a trillion samples into the datacenter that consumes the same power as a city, what the model learns is what it could've learned with a much more curated training set and directed approaches.
There's got to be a reason this is phrased so insanely, right?
> Prompt: A skeuomorphism stop motion explainer about how the brain hippocampus works with a compelling voiceover. Don’t add seahorses. No voice cuts at the end. Don’t add text
Seahorses???
Oh god...
May as well power off the whole grid now and have the Amish start teaching us how to survive
Video, more than anything else, is the place where I really care if something is AI or not. If I could get a TikTok that had no AI usage -- I'd be in. Which is weird for me, because I'm typically the guy who is all-in on AI.
Funny enough, this is actually one of the few things which has bothered me with the AI boom, and I'm mostly pro-acceleration. A lot of what's happening seems inevitable. But surprisingly, knowing that cat or dog or bird or lizard or butterfly or whatever has a strong chance of being generated really does take something out of it to my mind. And I say that also knowing the extreme amount of staging which has long gone on with traditional nature videography. Somehow, knowing the animal is real means something... I'm still trying to figure out how to better understand and express this.
I eventually picked one and opened the comments and the top comment was something like "This is obviously an AI video. Who watches this?" and the reply was along the lines of "me because I like seeing thieves get what's coming to them".
So you, like me, aren't interested in AI videos but I think there's a lot of people who don't care if it's real or not.
Thankfully, YouTube eventually stopped showing those to me. Now it thinks I'm interested in road rage videos. My YouTube feed outside of the three or four channels I've subscribed to is terrible.
I really wish a subject matter expert would chime in to tell us what this is about.
like a totally made up thing that is fake, somehow gives a sense of justice and satisfaction?
is it something about imagining it happening in reality, or what?
for me, if I see that something is AI, it's like I just feel nothing. because there's nothing in it, it has nothing of real value? like it doesn't evoke anything in me, it doesn't make me think "this was a great find!" or make me want to send a link over to my friends, etc.
Now you can have people producing videos without needing a crew of people.
model card: https://deepmind.google/models/model-cards/gemini-omni-flash...
It could make the comments section even more fun.
I did not create any videos yet.
Google, building great AI that nobody can try out.
But thx for the press release.
This tech won’t change anything.
I didn't see the quote you did but he probably confused the fact that PHM used physical elements in place of some CGI in certain scenes and the separate fact that a realistic physical puppet was used on set for reference. Some parts of that puppet are seen on-screen in some shots but most of the creature in most shots was CGI or CG enhanced (which looked great thanks to the ideal in-camera puppet reference it replaced). I explained more here: https://news.ycombinator.com/item?id=48198851
About the misunderstood puppet: A real Rocky puppet was indeed used on set (actually a few different puppets) and some of the puppet is sometimes seen on camera. But most of the puppet was digitally replaced with CGI or CGI-enhanced in most of the scenes. However, using a much more realistic puppet on set is indeed notable but not because the character wasn't CGI. The puppet is worth talking about because it directly enabled the final mostly-CGI character to be really good CGI. It's good because shooting the physical puppet gave the VFX character animators an ideal reference that's "grounded" in the physical reality of the set, camera and lens. The subtle interplay of light, shadow, texture and specularity in the CGI are all grounded in reality. The puppet also let the actor interact with something closer to reality. It's a wonderful technique and should be celebrated instead of obfuscated to promote a "No CGI!" falsehood that trends well on social media.
Also, PHM did use real sets (like most movies) and they were able to avoid using green screen for some of the ship exteriors but those backgrounds were still digitally replaced with CGI rendered elements, they just didn't use green screen to pull the matte. But on social media, "No green screen" (true) was conflated into "No CGI" (false). Instead of green screen they used a black backdrop with careful lighting and some hand rotoscoping to extract the digital mattes. Doing it this way had the advantage of not needing to digitally remove green spill on reflective surfaces by hand and it saved money over doing a StageCraft virtual volume at that size. Done well, a green screen could have produced the exact same shot but it would have cost more and taken longer.
But influencers and media are unintentionally perpetuating "No CGI" myths instead of focusing on the actually interesting, more nuanced reality. Using more and better physically grounded references on-set IS a breakthrough that helps turn bad CGI into great CGI. Another example is Top Gun where "artfully misleading but technically true wording" in studio press releases grew into outright falsehoods online. Tom Cruise was truthful in saying that he was flown in a jet right alongside other REAL jets doing simulated dog-fighting. The lost nuance is that all the other jets Cruise flew with in those dog fight scenes were old Soviet trainer jets that look quite different and are much smaller than real MIGs. So the trainer jets were entirely replaced by CGI MIGs in post and are never seen in the final film. And we couldn't tell because the digitally removed jets provided ideal grounded reference for the CGI pixels that replaced them. And that's how we ended up with several famous YouTubers proclaiming "These are REAL jets, not CGI!" while showing 100% CGI jets. Same with Wicked and the CGI tulips. The fact that Wicked used thousands of specially grown tulips on-set (true) was confused into proclaiming "ALL these tulips are real, no CGI!" (false) while showing a scene where >90% of the tulips were CGI.
Creators can use these video gen AI tools in various ways. There are some YouTube channels of people using these in creative workflows that are really impressive, from mocap replacement, character insertion, background replacement, changing camera angle in post, animating/inserting characters from character boards, animating between stills generated with traditional methods, etc. It's not just "prompt and generate". It can be, because it's easy, but it also doesn't have to be. It's a tool.
Do you have any examples of those creative workflows that have made it into Hollywood for example?
[0] e.g. Don't Look Up
https://blog.google/innovation-and-ai/products/identifying-a...
(and the previous SynthID: https://deepmind.google/blog/identifying-ai-generated-images...)
But it very much is "close the barn door after the horse has bolted and the barn has otherwise burned down".
Certainly not me - you have to be a great artist/designer to even imagine what to do with it.
I used to joke that was the moment we discovered "for most people that's a pretty big limit."