Sure, it solves for the primary view, but it's being presented as a 3D scene reconstruction/inference technique, and on that claim it only sort of works.
For example: https://i.postimg.cc/43tj36jv/Screenshot-2025-03-20-at-8-52-...
In the long run, yeah, this *exact* application is sort of pointless. I expected to see the lens parameters factored into the process, but they're not. That means everything is not only dimensionally inaccurate (since there's no reference measurement) but also proportionally inaccurate relative to other things in the scene. You can actually see the effect of that on the "flower car" example: the entire shape of the car is warped. Let alone the fact that everything in the scene that can't be seen in the original photo is made up.
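To make the lens point concrete, here's a minimal sketch (nothing from the paper; all numbers hypothetical) of pinhole back-projection: the same pixel and depth land at different world positions depending on the assumed focal length, which is exactly the kind of proportional warping you see on the car.

```python
import numpy as np

def backproject(depth, u, v, f, cx, cy):
    """Back-project pixel (u, v) at a given depth into camera space,
    assuming a simple pinhole model with focal length f (in pixels)."""
    x = (u - cx) * depth / f
    y = (v - cy) * depth / f
    return np.array([x, y, depth])

# Hypothetical example: the real lens had ~1200 px focal length,
# but the reconstruction assumes a generic 800 px.
p_true  = backproject(depth=4.0, u=1500, v=400, f=1200, cx=960, cy=540)
p_guess = backproject(depth=4.0, u=1500, v=400, f=800,  cx=960, cy=540)
print(p_true, p_guess)  # same pixel, ~1.5x wider laterally under the wrong lens
```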
Maybe someone would use this to make game assets? But you'd need to fix them up a ton before using them. Other sibling comments make the point that there are no wireframes... so we can assume the polygon count here is insane.
Either way... it's just neat.
Money.
Every single time a new "Generate 3D" thing appears, they never show the wireframes of the objects/scenes up front; you always need to download and inspect things yourself. How is this not standard practice already?
Not displaying the wireframes at all, or even offering sample files so we could at least see for ourselves, just makes it look like you already know the generated results are unusable...
There are no material channels or wireframes. It's a volumetric 3D representation, like a picture made up of color blobs.
This method uses AI to generate even more unseen structure, so with relatively few images you can still represent a real scene with some level of fidelity. It will never need dynamic lights or animation, because the point is just to look as close as possible to a still image. Splats do that FAR better and more efficiently than you ever could with dynamic lighting, triangulated models, and visual effects.
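If it helps to picture what the "color blobs" actually are: roughly a big list of records like the sketch below, alpha-composited per pixel. The field names and toy compositing loop here are illustrative, not Bolt3D's actual format.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Splat:
    position: np.ndarray   # (3,) center in world space
    scale: np.ndarray      # (3,) per-axis extent of the Gaussian
    rotation: np.ndarray   # (4,) quaternion orienting it
    color: np.ndarray      # (3,) RGB (real systems usually store SH coefficients)
    opacity: float         # alpha in [0, 1]

def composite(samples):
    """Front-to-back alpha compositing of (color, alpha) contributions
    that land on one pixel, sorted near to far."""
    out, transmittance = np.zeros(3), 1.0
    for color, alpha in samples:
        out += transmittance * alpha * np.asarray(color, dtype=float)
        transmittance *= 1.0 - alpha
    return out

# Two hypothetical splats whose projections overlap one pixel
near = Splat(np.array([0., 0., 2.]), np.array([.1, .1, .1]),
             np.array([1., 0., 0., 0.]), np.array([1., 0., 0.]), 0.6)
far  = Splat(np.array([0., 0., 5.]), np.array([.3, .3, .3]),
             np.array([1., 0., 0., 0.]), np.array([0., 0., 1.]), 0.8)
print(composite([(near.color, near.opacity), (far.color, far.opacity)]))
```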
Plus, scaling dynamic lighting up has always been the Big Bad of computer graphics, and precomputation will always give us an amazing heuristic to use against it. Everything else basically tends towards not mattering: we can only absorb a finite number of details, but we live in a world with virtually infinite lights.
I bet in a couple of years it'll be standard for estate agents to show 3D views like this on their web sites, architects converting quick paintovers of existing sites to 3D models, improvements to Street View, and so on. Anywhere where you want a quick 3D view of a space based on a few photos taken on a smartphone and where accuracy isn't 100% important.
For things like games, it still follows the existing photogrammetry workflow (with all of those problems), but it might reduce the number of photos needed to create a point cloud.
I much prefer point clouds and NURBS over meshes
Not everything is gamedev
Agree, I'm not sure why you'd think that's the only use case for 3D, unless I misunderstand your argument here.
How would you handle visual effects with point clouds, for example? There are so many use cases for proper 3D, and all I can think of as a use case for point clouds is environments with static lighting, which seems like a really small part of what people generally consider "3D scenes".
Maybe I missed the mark on “gamedev”, but 3D is larger than just “aesthetically pleasing 3D VFX” for its own sake
Often I’m trying to use something as a reference for a design where a 3D model isn’t the actual end goal, or I’m performing analytics on a 3D object (say in my case for a lot of GIS and simulation work)
The whole "mesh is the be-all and end-all of 3D modelling" attitude irks me: yes, it's a really important way of representing an object (especially under real-time constraints), but it doesn't do justice to the full landscape of techniques and uses for 3D
It would be like 2D sprite artists from the gamedev world saying “what’s the point of all this vector art you illustrators are doing” or “what’s the point of all these wireframe designs you graphic designers are doing” - “these aren’t raster images!”
I suppose my snipe was trying to communicate the idea that 3D is larger than just a vehicle for entertainment production. It intersects many industries that may eschew polygons because real time rendering is irrelevant
3D tooling has uses beyond producing 3D scenes, just as Photoshop is used for more than touching up photographs
Edit: for anyone stuck in a rut with meshes, come join the dark side with NURBS - it makes you think about modelling in a radically different way (unfortunate side effect: it makes working with meshes feel so, so "dirty")
No one said this, it seems like you are making up fake questions and not dealing with the actual questions that the person you replied to asked.
You can view point clouds and you can warp them around, but working with them and tracing rays becomes a different story.
Once you need something as a jumping off point to start working with, point clouds are not going to work out anymore. People use polygons for a reason. They have flexible UVs, they can be traced easily, they can be worked with easily, their data is direct, standard and minimal.
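For instance, "they can be traced easily" boils down to one closed-form test per triangle (Möller–Trumbore). A sketch below, with a made-up ray and triangle; there is no equivalent direct test against a bare set of points.

```python
import numpy as np

def ray_triangle_intersect(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore ray/triangle intersection.
    Returns the distance t along the ray, or None if there is no hit."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direction, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:           # ray parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = origin - v0
    u = np.dot(s, p) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direction, q) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det
    return t if t > eps else None

# Hypothetical ray pointing straight down at a unit triangle in the z=0 plane
hit = ray_triangle_intersect(np.array([0.2, 0.2, 1.0]), np.array([0.0, 0.0, -1.0]),
                             np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))
print(hit)  # 1.0
```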
This is because a point cloud does not represent a surface or a volume until the points are connected to form, well, a surface or a volume.
And physical problems are most often defined over surfaces or volumes. For instance, waves don't propagate over sparse sets of points, but within continuous domains.
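As a rough illustration of that "connecting" step (not related to this paper; just a sketch using Open3D, with a hypothetical input file), Poisson reconstruction is one standard way to turn a point cloud into an actual surface you can simulate or trace against:

```python
import open3d as o3d

# Hypothetical point cloud produced by photogrammetry or a scanner
pcd = o3d.io.read_point_cloud("scan_points.ply")
pcd.estimate_normals()  # Poisson needs oriented normals

# Connect the points into a continuous triangulated surface
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
print(f"{len(pcd.points)} points -> {len(mesh.triangles)} triangles")

o3d.io.write_triangle_mesh("scan_surface.ply", mesh)
```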
However, for applications where geometric accuracy is needed, I think you wouldn't want to use a method based on a minimal number of photographs anyways. For instance, the Lascaux cavern was mapped in 3D a decade ago based on "good old" algorithms (not machine learning) and instruments (more sophisticated than a phone camera). So these critiques are missing the point, in my opinion. These Gaussian Splatting methods are very impressive for the constraints they operate under!
But the meshes produced are not easy to edit.
There is no good method to take a 3D scan and make a sensible mesh out of it. They tend to have far more vertices than necessary and lack structure.
This is a separate problem from triangulation (turning point clouds into meshes) done with entirely different algorithms. It's likely the software you used for this assumes the user will then turn to other software to improve their surface mesh.
Even for operations that are naturally in sequence, you will often find the software to carry out those steps is separated. For instance turning CAD into a surface mesh is one software, turning a surface mesh into a tetrahedral volume mesh another (if those are hexahedra, then yet another), and then optimizing or adapting those meshes is done by yet another piece of software. And yet these steps are carried out each time an engineer goes from CAD to physical simulation. So it's entirely possible the triangulation software you used does not implement any kind of surface optimization and assumes the user will then find something else to deal with that.
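As a sketch of the kind of separate clean-up step being described, here is what quadric decimation looks like with Open3D (file names hypothetical, decimation target arbitrary). It cuts the triangle count of a dense scan mesh, though it still won't give you the hand-authored topology an artist would want:

```python
import open3d as o3d

# Hypothetical path to a dense scanned/triangulated mesh
mesh = o3d.io.read_triangle_mesh("scan_dense.ply")
mesh.compute_vertex_normals()
print(f"before: {len(mesh.triangles)} triangles")

# Quadric edge-collapse decimation down to ~5% of the original triangle budget
simplified = mesh.simplify_quadric_decimation(
    target_number_of_triangles=max(1, len(mesh.triangles) // 20))
simplified.compute_vertex_normals()
print(f"after:  {len(simplified.triangles)} triangles")

o3d.io.write_triangle_mesh("scan_simplified.ply", simplified)
```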
With splats you can have incredibly high fidelity with identical lighting and detail built in already. If you want to make a game or a movie, don't use splats. If you want to recreate a static scene from pictures, splats work very well.
If you're using it to render video you don't need to go into the mesh world.
Apparently you can clone and run the demo locally, but it wasn't clear at a glance how much runs locally and what hardware is required.
This is previous work by the lead author, from a year before they interned at Google Research and produced Bolt3D.
Bolt3D appears to be his intern research project done in conjunction with a bunch of other Google and DeepMind researchers.
I suspect there will never be publicly available code for this.
Given that it's work done at Google, I don't expect them to release source code. But it will be reproduced by someone else soon enough.
> Our method takes 6.25 seconds to reconstruct one scene on a single H100 NVIDIA GPU or 15 seconds on an A100.
How do you know it's the actual implementation?
I agree, this is the way forward: "some photos" as input. Convenient, since a camera is in every pocket (smartphone).
On weekends, I have been trying for years to generate 3D from photos. My tool now works well, but there is still the big problem of the time it takes to "recreate" the 3D mesh from photos. Photos are, I remind you, in... 2D. Not convenient. Here is an example of my tool's output: https://free-visit.net/fr/demo01
Here, Bolt3D turns those 4 hours of cumbersome work into an automatic process. Wahoo!
So bravo to the Bolt3D team of researchers.