1. Have GPT-5 (or really any image model; Midjourney's retexture mode is also good) convert the original image to something closer to a matte rendered mesh, i.e. remove extraneous detail and any transparency or other confusing volumetric effects.
2. Throw it into meshy.ai's image-to-3D mode, pick the best result, or maybe return to step 1 with a different simplified image style if I don't like the results.
3. Pull it into blender and make whatever mods I want in mesh editing mode, eg specific fits and sizing to assemble with other stuff, add some asymmetry to an almost-symmetric thing because the model has strong symmetry priors and turning them off in the UI doesn't realllyyy turn them off, or model on top of the AI'd mesh to get a cleaner one for further processing.
The meshes are fairly OK structure-wise; clearly some sort of marching cubes or perhaps dual contouring approach running on top of a NeRF-ish generator.
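For anyone curious what that extraction step looks like, here's a minimal sketch of marching cubes over a sampled density field (the sphere density and the scikit-image call are illustrative, not what meshy.ai actually runs):

    # Minimal sketch: extract a triangle mesh from a sampled density field,
    # roughly what a marching-cubes pass over a NeRF-like generator's output does.
    # The density function below is a stand-in sphere, not the real generator.
    import numpy as np
    from skimage import measure

    xs = np.linspace(-1, 1, 64)
    x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
    density = 1.0 - np.sqrt(x**2 + y**2 + z**2)  # placeholder: unit sphere

    # Extract the isosurface where the density crosses zero
    verts, faces, normals, _ = measure.marching_cubes(density, level=0.0)
    print(verts.shape, faces.shape)  # (N, 3) vertices, (M, 3) triangle indices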
I'm an extremely fast mechanical CAD user and a mediocre Blender artist, so getting an AI starting point is quite handy to block out the overall shape and let me just do edits. E.g. a friend wants to recreate a particular statue of a human; tweaking some T-posed generic human model into the right pose and proportions would have taken me "more hours than I'm willing to give him for this", i.e. I wouldn't have done it, but with this workflow it was 5 minutes of AI and then an hour of fussing in Blender to go from the solid model to the curvilinear wireframe style of the original statue.
Sounds interesting. Do you have any example images like that you could share? I understand the part about making transparent surfaces not transparent, but I'm not sure what the whole image looks like after this step.
Also, would you be willing to share the prompt you type to achieve this?
This tool is maybe useful if you want to learn Python, in particular Blender Python API basics; I don't really see any other use for it. All the examples given are extremely simple to do. Please don't use a tool like this, because it takes your prompt and generates the most bland version of it possible.

It really takes only about a day of going through some tutorials to learn how to make models like these in Blender, with solid colors or some basic textures. The other thousands of days are what you would spend on creating correct topology, making an armature, animating, making more advanced shaders, creating parametric geometry-nodes setups... But simple models like these you can create effortlessly, and those will be YOUR models, the way (roughly, of course) you imagined them. After a few weeks you'll probably model them faster than the prompt engineering takes. By that time your imagination, your skill in Blender and your understanding of 3D technicalities will have improved, and they will keep improving from there. And what will you learn using this AI?
I think meshy.ai is much more promising, but still I think I'd only consider using it if I wanted to convert photo/render into a mesh with a texture properly positioned onto it, to then refine the mesh by sculpting - and sculpting is one of my weakest skills in Blender. BTW I made a test showcasing how meshy.ai works: https://blender.stackexchange.com/a/319797/60486
I think you might be projecting your abilities a bit too much.
As someone who wants to make and use 3d models, not someone who wants to be a 3d model artist, this tech is insanely useful.
All the examples are really just primitives, either extruded in one step or the same thing with maybe five of them put together.
I don't want to sound mean but these are reachable with just another day at it. They really are.
Semi-related, understanding Sketchup took a couple of false starts for me. The first time I tried it, I could not make heads or tails of what I was doing. I must have spent hours trying to figure out how to model a desk, and I gave up. Tried again a year or two later, and it just didn't click.
The third try, a couple of years later, it suddenly made sense, and what started out as modeling a new desk turned into modeling my room, then modeling the whole house, and now I've got a rough model of my neighborhood. And it's so easy, once you know how - there's obviously a rabbit hole of detail work one can fall down into, but the basics aren't bad.
I'm sure the most difficult part here is just understanding the Blender UI. Clearly more difficult than picking up a pencil. But a tutorial video should suffice.
For the chair example you pick a face on the default cube and then use the extrude tool on the left. Now you have a base.
Add 4 more cubes, and do the same. Now you have legs.
Then boolean them.
For the hat? Use a sphere, go to the sculpt tab and go ham.
There are way better ways to do this, of course. But really, there is not such a high degree of skill involved here, nor that being just a little more patient (one more day of trying) is that much to ask.
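For the curious, here's what those chair steps roughly look like as a bpy script (an untested sketch; the dimensions and object names are just illustrative):

    # Rough sketch of the chair described above: one flattened cube for the seat,
    # four thin cubes for legs, joined into a single object. Run it in Blender's
    # Python console or via `blender --background --python chair.py`.
    import bpy

    def add_box(name, location, scale):
        bpy.ops.mesh.primitive_cube_add(location=location)
        obj = bpy.context.active_object
        obj.name = name
        obj.scale = scale
        return obj

    seat = add_box("Seat", (0, 0, 1.0), (0.5, 0.5, 0.05))
    legs = [
        add_box(f"Leg{i}", (x, y, 0.5), (0.05, 0.05, 0.5))
        for i, (x, y) in enumerate([(0.4, 0.4), (-0.4, 0.4), (0.4, -0.4), (-0.4, -0.4)])
    ]

    # Join everything into one mesh (a boolean union would also work here)
    bpy.ops.object.select_all(action="DESELECT")
    for obj in [seat] + legs:
        obj.select_set(True)
    bpy.context.view_layer.objects.active = seat
    bpy.ops.object.join()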
But that misses the fact that this is only the beginning. These models will soon generate entire worlds. They will eventually surpass human modeller capabilities and they'll deliver stunning results in 1/100,000th the time. From an idea, photo, or video. And easy to mold, like clay. With just a few words, a click, or a tap.
Blender's days are numbered.
I'm short on Blender, Houdini, Unreal Engine, Godot, and the like. That entire industry is going to be reinvented from scratch and look nothing like what exists today.
That said, companies like CSM, Tripo, and Meshy are probably not the right solutions. They feel like steam-powered horses.
Something like Genie, but not from Google.
This is a pretty sweeping and unqualified claim. Are you sure you’re not just trying to sell snake oil?
I claimed three years ago that AI would totally disrupt the porn and film industries and we're practically on the cusp of it.
If you can't see how these models work and can't predict how they can be used to build amazing things, then that's on you. I have no reason to lift up anybody that doubts. More opportunity on the table.
We have been on the cusp of some things for literal decades.
Meh. We were on the cusp 5 years ago. Five years later, we're still on the cusp?
Maybe I'm working with a different meaning of "cusp", but to me "On the cusp of $FOO" means that there is no intervening step between now and $FOO.
The reality is that there are uncountable intervening steps between now and "film industry disrupted".
Have you ever asked yourself why this revolution hasn't come yet? Why we're still "on the cusp" of it all? Because you can't push a button and generate better pornography than what two people can make with a VHS camera and some privacy. The platonic ideal of pornography and music and film and roleplaying video games and podcasting is already occupied by their human equivalent. The benchmark of quality in every artistic application of AI is inherently human, flawed, biased and petty. It isn't possible to commoditize human art with AI art unless there's a human element to it, no matter how good the AI gets.
There's merit to discussing the technical impetus for improvement (which I'm always interested in discussing), but the dependent variables here seem exclusively social; humanity simply might never have a Beatlemania for AI-generated content.
Right now if I go on LinkedIn most header images on people's posts are AI generated. On video posts on LinkedIn that's a lot less, but we are beginning to see it now. The static image transition has taken maybe 3 years? The video transition will probably take about the same.
There's a set of content where people care about the human content of art, but there is a lot of content where people just don't care.
The thing is that there is a lot of money in generating this content. That money drives tool improvement and those improved tools increase accessibility.
> Have you ever asked yourself why this revolution hasn't come yet?
We are in the middle of the revolution which makes it hard to see.
> Why OnlyFans May Sell for 75% Less Than It’s Worth [1, 2]
> Netflix uses AI effects for first time to cut costs [3]
Look at all of the jobs Netflix has posted for AI content production [4].
> Gabe Newell says AI is a 'significant technology transition' on a par with the emergence of computers or the internet, and will be 'a cheat code for people who want to take advantage of it' [5]
Jeffrey Katzenberg, the cofounder of DreamWorks [6]:
> "Well, the good old days when, you know, I made an animated movie, it took 500 artists five years to make a world-class animated movie," he said. "I don't think it will take 10% of that three years out from now," he added.
I can keep finding no shortage of sources, but I don't want to waste my time.
I've brushed shoulders with the C-suite at Disney and Pixar and talked at length about this with them. This world is absolutely changing.
The best evidence is what you can already see.
[1] https://www.theinformation.com/articles/onlyfans-may-sell-75...
[3] https://www.bbc.com/news/articles/c9vr4rymlw9o
[4] https://explore.jobs.netflix.net/careers?query=Machine%20Lea...
[5] https://www.pcgamer.com/software/ai/gabe-newell-says-ai-is-a...
[6] https://www.yahoo.com/entertainment/cofounder-dreamworks-say...
The C-suite who don't realize how wrong they are about AI's potential are going to be facing a harsh reality. And artists will be the first to be hurt by their HYPE TRAIN management style and mindset.
Edit: most of all, the 3D generation in this LL3M model is about the same as the genAI 3D models from a year ago... and two years ago... A good counterpoint would be Tubi's recently released, mostly AI-generated short films. They were garbage and looked like garbage.
Netflix's foray, if memory serves, was a single scene where a building collapses. Hardly industry-shattering. And 3D modeling and genAI images/videos are substantially different.
You'll literally be proven wrong simply because the AI will take time to generate things even if the quality of the output is high.
Two Girls One Cusp.
Hate to disappoint you, but as the models get better, and eventually deliver the results, you won't have to wait a microsecond until the masses roll in to take advantage.
That's like saying we'll add internet browsing and YouTube to Lotus 1-2-3 for DOS.
It's weird, legacy software for a dying modality.
Where we're going, we don't need geometry node editing. The diffusion models understand the physics of optics better than our hand-written math. And they can express it across a wide variety of artistic styles that aren't even photoreal.
The future of 3D probably looks like the video game "Tiny Glade" mixed with Midjourney and Google Genie. And since it'll become so much more accessible, I think we'll probably wind up blending the act of creation with the act of consuming. Nothing like Blender.
How well will they do with something like creating two adjacent mirrors with a 30-degree angle between them, one of them covered with a red-tinted checkerboard pattern of varying polarization?
They do a better job of fluid sim than most human efforts to date. And that's just one of thousands of things they do better than our math.
Now I know you're too far down the hype rabbit hole. Either that, or you lack a cursory understanding of diffusion models.
What's the use case of generating a model if all modelling and game engines are gone?
No. The Roblox of this space is going to make billions of dollars.
There's going to be so much money to make here.
Somewhat related: I'm still amazed that no one has made a Roblox competitor, as in, a vague social building game that tricks children into wasting money on ridiculous MTXs. Maybe you are right, but I think that taking an already sorry state of affairs, and then removing the only imagination or STEM skills required by giving children access to GenAI... is a really depressing thought.
I kinda meandered with my point lol.
> ...with a majority of users under the age of 13? I'm sure that WILL make a lot of money.

> ... will find this comparison impossibly stupid.
I'm ignoring the insinuations here for obvious reasons.
1. Roblox is the newest (note: not necessarily the best) iteration of the genre that Second Life & (to a limited extent) modded Minecraft servers occupy: an interactive 3D platform that permits user-generated content.
2. Generative models just accelerate their development up to the brick wall of complexity much faster.
> Somewhat related: I'm still amazed that no one has made a Roblox competitor
This comment is just the HN Dropbox phenomenon, *again*, only this time from the angle that thinks it's easy to build a "pseudo game-engine" from scratch.
https://news.ycombinator.com/item?id=8863
Few competitors exist because of the moat they have built in making their platform easy to develop on, so much so that kids can use them with little issue.
> , as in, a vague social building game that tricks children into wasting money on ridiculous MTXs.
This part is entirely separate from the technical aspects of the platform. Roblox is a feces-covered silver bar, but the silver bar (their game platform) still exists.
> Maybe you are right, but I think that taking an already sorry state of affairs, and then removing the only imagination or STEM skills required by giving children access to GenAI.... is a really depressing thought.
This is a hyper-nihilistic opinion on children laid bare.
To think that the children (*with the dedication to make a game in the first place*) wouldn't try to learn about debugging the code that the models are spitting out, or that 100% of them would just stop writing their own code entirely, is a cynical viewpoint not worth any attention.
Because using an LL3M-style technique will probably be cheaper and better (fidelity-, consistency- and art-direction-wise) than generating the entire video/game with a video generation model.
They may. It's hard to expect this when we already see LLMs plateauing at their current abilities. Nothing you've said is certain.
Hey, I heard that one before! The entire financial industry was supposed to have been reinvented from scratch by crypto.
I was considering this path a few years ago but all my research pointed to me being taxed for moving my own money from one country to another. Which would’ve cost significantly more than a good ol’ bank transfer. (I needed the fiat on the other end)
My understanding was that as far as the receiving bank is concerned, the converted crypto would’ve appeared out of an investment/trading platform and needed to be taxed
The bank transfer cost like a couple of bucks anyway so it wasn’t worth the risk of trying the crypto route in the end for me.
This – but in real-time – is the future of gaming and all media
https://x.com/elonmusk/status/1954486538630476111

I perfectly understand the time/patience/energy argument and my bias here. But even the Spore (video game) editor, with all its limitations, gives you a similar result to the examples provided, and at least there you are the one giving shape to your work, which gives you more control and your art more soul, and moreover puts you on a creative path where the results keep getting better.
Will the AI soon surpass a human modeller? I don't know... I hear so much hype for AI, and I have fallen victim to it myself: I spent quite some time trying to use AI for some serious work and guess what - it works as a search engine. It will give me an ffmpeg command that I could duckduckgo anyway, it will give me an AutoHotkey script that I could figure out myself after a quick search, etc. The LLM fails even at the tasks that seem optimal for it - I have tried multiple times to translate movie subtitles with it, and while the translation was better than traditional machine translation, at some point the AI goes crazy and decides to change the order of scenes in a movie - something that I couldn't detect until I watched the movie with friends, so it was a critical failure. I described a word, and the AI failed to give me the word I couldn't remember, while a simple search on a thesaurus succeeded. I described what I remembered from a quote, and the AI failed to give me the quote, but my google-fu was enough to find it.
You probably know how to code, and would cringe if someone suggested to you to just ask the AI to write you the code for a video game without you yourself knowing how to code to at least supervise and correct it, and yet you think the 3D modelling will be good enough without intervention of a 3D artist; maybe, but as someone experienced in 3D I just don't see it, just like I don't see AI making Hollywood movies even though a lot of people claim it's a matter of years before that becomes the reality.
Instead what I see is AI slop everywhere and I'm sure video games will be filled with AI crap, just like a lot of places were filled with machine-learning translations because Google seriously suggested on its conferences that the translations are good enough (and if someone speaks only English, the Dunning-Kruger effect kicks in).
Sure, eventually we might have AGI and humanity will be obsolete. But I'm not a fan of extrapolating hyperbolic data; one YouTuber estimated that in a couple of decades Earth will be visited by aliens, because extrapolating his channel's viewership stats, there won't be enough Earthlings to account for the viewers.
"AI allows those excluded from the guild" is total BS.
Gut figures: ~85% of creativity comes from skill itself, ~10% or so comes from prior art. And it's all multiplied by willingness (a value in [0, 1]), which for >99.9999% of the population is << 0.0001. Tools just don't change that; it just weighs down on the creativity part.
There are models in the examples that require e.g. extrusion (which is literally: select faces, press E, drag mouse).
Some shapes are smoothed/subdivided with the Catmull-Clark Subdivision Surface modifier, which you can add simply by pressing CTRL+2 in Object Mode (the digit is the number of subdivisions; basically use 1 to 3, though you may set more for renders).
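The same two operations via the Python API, for reference (a minimal sketch that assumes a mesh object is active with its faces selected):

    # Sketch of the operations above via bpy instead of hotkeys: an extrusion,
    # then a Catmull-Clark Subdivision Surface modifier with 2 viewport levels
    # (the same thing the CTRL+2 shortcut does).
    import bpy

    obj = bpy.context.active_object  # assumes a mesh object is selected

    # Extrude the selected faces and move them up by 0.5 (the E-then-drag equivalent)
    bpy.ops.object.mode_set(mode="EDIT")
    bpy.ops.mesh.extrude_region_move(TRANSFORM_OT_translate={"value": (0, 0, 0.5)})
    bpy.ops.object.mode_set(mode="OBJECT")

    # Add the Catmull-Clark subdivision modifier with 2 levels
    mod = obj.modifiers.new(name="Subdivision", type="SUBSURF")
    mod.levels = 2          # viewport subdivisions
    mod.render_levels = 2   # render subdivisions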
Here's a good, albeit old tutorial: https://www.youtube.com/watch?v=1jHUY3qoBu8
Yes, I made some assumptions when estimating it takes about a day to learn to make models like this: you have a free day to spend in its entirety on learning, and as a Hacker News user your IQ is above average and you're technically savvy. And one last assumption: you learn the required skills evenly, rather than going deep into the rabbit hole of e.g. correct topology; if you go through something like Andrew Price's doughnut tutorial, it may take more than a day, especially if you play around with the various functions of Blender rather than strictly following the videos - but you will end up making significantly better models than the examples presented, e.g. you will know to inset a cylinder's ngons to avoid the Catmull-Clark subdiv artifacts you can see in the 2nd column of hats.
> this tech is insanely useful.
No, it isn't, but you don't see it, because you don't have enough experience to see it (Dunning-Kruger effect) - this is why I mentioned my experience, not to flex but to point out I have the experience required to estimate the value of this tool.
That crosses into personal attack. Please don't do this. You can make your substantive points without it.
I play guitar, it's easy and I enjoy it a lot. I've taught some friends to play it, and some of them just... don't have it in them.
Similarly, I've always liked drawing/painting and 3D modeling. But for some reason, that part of my brain is just not there. I just can't do visualization. I've even tried award-winning books ("Drawing on the Right Side of the Brain") without success.
Way back in the day I tried 3D modeling with AW Maya, 3D Studio Max and then Blender. I WANT to convert a sphere into a nice warrior; I was dying to make 3D games: I had the C/C++ part covered, as well as the OpenGL one. But I couldn't model a trash can, after following all the tutorials and books.
This technology solves that for those of us who don't have that gift. I understand that for people who can "draw the rest of the fking owl" it won't seem like much, but darn, it opens up a world for me.
The thing is, I've actually worked as a 3D artist for a number of years. Some people even tell me I'm good. I suppose if that's true at all, it's because I've learned to use the computer to do the visualizing for me.
For some other artists, their process seems to be that they first picture a 'target' image in their mind, and then take steps towards that target until the target is reached. That seems impossible to me -- supernatural stuff. I almost don't believe they can really do it.
My process is closer to first finding some reference images, then taking a step in a random direction and asking whether I'm closer or further away from those references. I'm not necessarily trying to copy the references exactly, I'm just trying to match their level of quality. Then I take another random step, and check again. If you repeat this process enough times, you'll edge closer and closer to something that looks good. You'll also develop a vague sense of 'taste', around which random movements tend to produce more favourable results, and which random movements tend to produce more ugly results. It's a painful process, but it's doable.
I guess what I'm trying to say is that the ability to visualize isn't a prerequisite for 3D modeling.
I wish I could see you struggling to model a trash can and see if maybe you didn't have too high requirements for the quality of said trash can. After all it's just taking a cylinder, insetting the top face and extruding it down, and the top you can model in the exact same way. The rest is detail that the AI tool in question is terrible at. https://i.imgur.com/xeFrgpP.gif
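For what it's worth, the blocking step described there is only a few lines of bpy (a rough sketch; the dimensions are arbitrary and it assumes a fresh scene):

    # The trash can blocking described above: a cylinder, inset the top face,
    # extrude it downward to hollow it out.
    import bpy
    import bmesh

    bpy.ops.mesh.primitive_cylinder_add(radius=0.3, depth=1.0, location=(0, 0, 0.5))
    obj = bpy.context.active_object

    bpy.ops.object.mode_set(mode="EDIT")
    bpy.ops.mesh.select_all(action="DESELECT")

    # Select only the top face (the one with the highest average Z)
    bm = bmesh.from_edit_mesh(obj.data)
    bm.faces.ensure_lookup_table()
    top = max(bm.faces, key=lambda f: sum(v.co.z for v in f.verts) / len(f.verts))
    top.select = True
    bmesh.update_edit_mesh(obj.data)

    # Inset the rim, then extrude the inner face down
    bpy.ops.mesh.inset(thickness=0.03)
    bpy.ops.mesh.extrude_region_move(TRANSFORM_OT_translate={"value": (0, 0, -0.9)})
    bpy.ops.object.mode_set(mode="OBJECT")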
The teacher then asked me for my pencil, and started doing some adjustments in my drawing. The shitty cup just became alive with some touches here and there. All I could ask was HOW ??? how did she SEE that?
The book "drawing with the right side of the brain" goes over it: A lot of who are strongly (brain) left-sided see a Cup and "abstract" away the forms, we are constatly drawing "lines" (like, drawing a sticky-figure person,a head is a circle, then body is a line, girl skirt is a triangle, etc) and just cannot actually get past that reasoning in our brain.
The way I see, and I think the way most people see, is that I have subpixels, not distributed in a square grid and small enough, too many to be able to count them - but I can see them when I close my eyes, it's somewhat similar to looking at a colored noise - something like this: https://i.imgur.com/1P3n80k.gif except you would have to display it on a ridiculously high resolution display (I don't know, 64k or maybe more) and it would represent just a small fragment of view.
Of course this unordered constellation of cones can be mapped onto a grid of pixels or a space on paper, so the only problem is I can't make a measurement in my head, and I need to calibrate my "eye-balling" measurement to figure out where on the paper I should put what I see. I deal with it typically by imagining vertical and horizontal lines to subdivide my view, and then I likewise subdivide the paper.
So I don't really have a problem drawing what I see, the problem I have is the missing technique of how to use a pencil to draw what I actually want to draw.
I think most people work the same way but apparently you don't?
...And here's where AI comes into play, if AI could be contained into steps:

- Input node: describe where the starting data comes from, and the AI automatically loads a file from the hard drive or the Internet, or generates a primitive.
- Select node: describe a pattern by which to select elements of the geometry.
- Modify Geometry node: perhaps this should be split into multiple nodes, as there are so many ways to modify geometry.
- Sample/connect data node: create an attribute and describe its relation to something else, to create the underlying algorithm populating that attribute.
- Save node: do you want to output the data through the usual pipeline, or maybe export to a file, or save to a simulation cache?
This way AI could do low-level stuff that I think it excels at, because this low-level stuff is so repeatable AI can be well trained on it. While the high-level decision making would be in control of an artist.
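To make the idea concrete, here is a hypothetical sketch of such a pipeline (the node kinds and descriptions are illustrative; the LLM compilation step is only implied):

    # Hypothetical sketch of the node pipeline described above: each node holds a
    # plain-language description; an LLM-backed step would turn each description
    # into the low-level geometry code, while the artist arranges the graph.
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        kind: str          # "input" | "select" | "modify" | "sample" | "save"
        description: str   # what the artist asks for, in plain language
        inputs: list = field(default_factory=list)

    pipeline = [
        Node("input",  "load base_mesh.obj from disk"),
        Node("select", "all faces whose normal points mostly upward"),
        Node("modify", "extrude the selected faces by 0.2 with slight random jitter"),
        Node("sample", "store an ambient-occlusion-like attribute per vertex"),
        Node("save",   "export the result as scene.gltf"),
    ]

    for node in pipeline:
        # In the proposed workflow an LLM would generate bpy / geometry-nodes code
        # for each description; here we just print the plan.
        print(f"[{node.kind}] {node.description}")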
What this means is that making even a 2 minute short animation is out of reach for a solo artist. Your only option today is to go buy an asset pack and do your best. But then of course your art will look like the asset pack.
AI Tools like this reduce one of the 20+ stages down to something reachable by someone working solo.
Is it truly the duration of the result that consumes effort and the number of people required? What is the threshold for a solo artist? Is it expected that a 2 minute short takes half as much effort/people as a 4 minute short? Does the effort/people scale linearly, geometrically, or exponentially with the duration? Does a 2 minute short of a two entity dialog take the same as a 4 minute short of a monologue?
> Your only option today is to go buy an asset pack and do your best. But then of course your art will look like the asset pack.
What's more valuable? That you can create a 2 minute short solo or that all the assets don't look like they came from an asset pack? The examples shown in TFA look like they were procedurally generated, and customizations beyond the simple "add more vertexes" are going to take time to get a truly unique style.
> AI Tools like this reduce one of the 20+ stages down to something reachable by someone working solo.
To what end? Who's the audience for the 2 minute short by a solo developer? Is it meant to show friends? Post to social media as a meme? Add to a portfolio to get a job? Does something created by skipping a large portion of the 20+ steps truly demonstrate the person's ability, skill, or experience?
There is a real possibility the assets generated by these tools will look equally or even more generic, the same generated images today are full of tells.
> What this means is that making even a 2 minute short animation is out of reach for a solo artist.
Flatland was animated and edited by a single person. In 2007. It’s a good movie. Granted, the characters are geometric shapes, but still it’s a 90 minute 3D movie.
https://en.wikipedia.org/wiki/Flatland_(2007_Ehlinger_film)
Puparia is a gorgeous 2D animated film done by a single person in 2020.
https://en.wikipedia.org/wiki/Puparia
These are exceptional cases (by definition, as there aren’t that many of them), but do not underestimate solo artists and the power of passion and resilience.
Other than the obvious (IDEs), I wish there were more tools like Fusion 360's AI auto-constraints. It saves so much time on something that is mostly tedious and uncreative. I could see similar integrations for Blender (honestly the most interesting part of what OP posted is changing the materials... it could save a lot of time spent connecting noodles).
Without question.
The current 3D GenAI isn't that good. And when they eventually become good enough, they won't be very cheap to run locally, at least for quite a while. You just need to wait & hoard spare cash. Learning how to use the current models is like trying to get GPT-1 to write code for you.
What if I don't want to learn guitar? What if I just want to spend a couple of hours and get something that sounds like guitar?
I tend to say in this situation: you can do that. Nobody's stopping you. But you shouldn't expect wider culture to treat you like you've done the work. So what new creative work are you seeking to do with the time you've saved?
This kind of work will only improve, and we're in early days for these kinds of applications of LLM tech.
You don’t know if it will get better. Even if it does, you don’t know by how much or the time frame. You don’t know if it will ever improve enough to overcome the current limitations. You don’t know if it will take years.
In the meantime, while someone is sitting on their ass for years waiting for the uncertain future of the tool getting better, someone else is getting their hands dirty, learning the craft, improving, having fun, collaborating, creating.
There is plenty of garbage out there where we were promised “it will only get better”, “in five years (eternally five years away) it will take over the world”, and now they’re dead. Where’s the metaverse NFT web3 future? Thrown into a trash can and lit on fire, replaced by the next embarrassment of chatting with porn versions of your step mom.
https://old.reddit.com/r/singularity/comments/1mrygl4/this_i...
You are _technically_ correct but if I base my assumptions on the fact that almost all worthwhile software and technology has gotten better over the years, I feel pretty confident in standing behind that assumption.
> In the meantime, while someone is sitting on their ass for years waiting for the uncertain future of the tool getting better, someone else is getting their hands dirty, learning the craft, improving, having fun, collaborating, creating.
This is a pretty cynical take. We all decide where we prioritize our efforts and spend our time in life, and very few of us have the luxury to freely choose where we want to focus our learning.
While I wait for technologies I enjoy but haven't mastered to get better, I am certainly not "sitting on my ass".. I am dedicating my time to other necessary things like making a living or supporting my family.
In this specific case I wish I could spend hours and hours getting good at Blender and 3D modelling or animation. Dog knows I tried when I was younger.. But it wasn't in the cards.
I'm allowed to be excited at the prospect that technology advancements will make this more accessible and interesting for me to explore and enjoy with less time investment. I also want to "get my hands dirty, learn, improve, have fun, create" but on my own terms and in my own time.
Any objection to that is shitty gatekeeping.
No one told you you weren't allowed to be excited, but you took it that way anyway.
You only know if it was worthwhile in hindsight. We aren’t there yet. And “better” is definitely debatable. We certainly do more things with software these days, but it’s a hard sell to unambiguously say it is better. Subscriptions everywhere, required internet access, invasions of privacy left and right, automated rejections, lingering bugs which are never fixed…
Your exact argument was given by everyone selling every tech grift ever. Which is not to say this specific case is another grift, only that you cannot truly judge the long-term impact of something while it is being invented.
> Any objection to that is shitty gatekeeping.
If gatekeeping is what you took from my comment, you haven’t understood it. Which could certainly mean my explanation wasn’t thorough enough. My objection is to the hand-wavy “this will only improve” commentary which doesn’t truly say anything and never advances the discussion, yet is always there. See the “low-hanging fruit” section of my other comment.
Yes that's because, so far, we haven't been able to see the future. Which is why we base predictions and assumptions on past performance and on lived experience. Sometimes we will be wrong.
You're also arguing in the abstract here while I am speaking about this specific topic of using LLMs to improve 3D modelling tooling.
Are you arguing that neither LLMs or 3D modelling tools are "worthwhile"?
Are you suggesting that improvements that make these tools more accessible, even incrementally, are a bad thing?
Or are you just challenging my right to make assumptions and speculate? I realize that we may not be even on the same page here.
You're also cherry-picking a limited number of examples where software isn't better, and I agree with all those (and never said that software universally gets better), but those examples are a tiny subset of what software does in today's world.
It's starting to feel like you just want to argue.
My opinion is that, broadly speaking, software advancements have improved the world, and will continue to improve the world, both in big and small ways. There will of course be exceptions.
These kinds of advancements lower the bar for entry and reduce the effort required to achieve better results, and so make the tools more accessible to more people.
And that is a good thing. The scenario you are implying is laying the choices of people at the feet of the tools, rather than holding the people accountable.
Also: history has, so far, mostly proven that the end result of better tools is better experts, not less experts.
I'm inclined to agree with the general sentiment. However, it is not a given that LLMs are better tools. You don't really get better at them in the same sense as before; you just type something and pray. The exact same thing you typed may produce exactly what you wanted, an aberration, or something close with subtle mistakes that you don't have the expertise to fix. Other tools made you better at the craft in general.
I very much disagree with this. I've spent the last 18 months working with LLMs daily at my work (I'm not exaggerating here) and while the models themselves have certainly gotten better, I have personally learned how to extract better results from the tools and improved my proficiency at applying LLMs as part of my overall toolchain and workflow.
LLMs are very much like many other tools in some ways, that the more you learn how to use them, the better results you will get.
I do agree that their non-deterministic nature makes them less reliable in some contexts as well though, but that's a trade-off you work into your approach, just like other trade-offs.
There is an insane amount of low-hanging fruit right now, and potentially decades or centuries of very important math to be worked out around optimal learning strategies, but it's clear that we do have a very strong likelihood of our ways of life being fundamentally altered by these technologies.
I mean, already, artists are suddenly having to grapple with all sorts of new and forgotten questions around artistic identity and integrity, what qualifies as art, who qualifies as an artist... Generative technology has already made artists begin to question and radically redefine the nature of art, and if it's good enough to do that, I think it's already worth serious consideration. These technologies, even in current form, were considered science fiction or literal magic until very recently.
> Generative technology has already made artists begin to question and radically redefine the nature of art (…)
So which is it? Are artist not understanding the potential of the technology, or are they so flabbergasted they are redefining the nature of art? It can’t be both.
> There is an insane amount of low-hanging fruit right now
Precisely. What I’m criticising is the generic low-effort response of assuming that being able to pick low-hanging fruit now indicates with certainty that the high-hanging fruit will be picked soon. It doesn’t. As an exaggerated analogy, building the first paper airplane or boat might’ve been fun and impressive, but it was in no way an indication we’d be able to construct rockets or submarines. We eventually did, but it took a very long time and completely different technology.
To really drive the point home, my comment wasn’t specifically about art or LLMs or the user I replied to. What I am against is the lazy hand-wavy extrapolation which is used to justify anything. As long as you say it’s five years away, you can shut down any interesting discussion.
> These technologies, even in current form, were considered science fiction or literal magic up until very recently.
I don’t recall science fiction or magical stories—not any that weren’t written as cautionary tales or about revolution, anyway—which had talking robots which were wrong most of the time yet spoke authoritatively; convinced people to inadvertently poison or kill themselves and others; were used for spam and disinformation on a large scale; and accelerated the concentration of power for the very few at the top. Not every science fiction is good. In fact, plenty of it is very very bad. There’s a reason the Torment Nexus is a meme.
Different people have different relationships with generative technologies and have different experiences. It can be the same at both times, because there are a lot of people out there, and there are many artists with both technical and artistic interests.
> What I’m criticising is the generic low-effort response of assuming that being able to pick low-hanging fruit now indicates with certainty that the high-hanging fruit will be picked soon.
I mean if we can have both more compute and more efficient compute due to those low-hanging fruits and Moore's law, it's reasonable to assume that things can get much better to the point of being economical, useful or even essential for a new generation of people, even if we don't see any massive shifts in the current paradigm.
> I don’t recall science fiction or magical stories—not any that weren’t written as cautionary tales or about revolution, anyway—which had talking robots which were wrong most of the time yet spoke authoritatively
Because people had an understandably naive understanding of what AI progress might look like, or the timescales involved.
> Not every science fiction is good. In fact, plenty of it is very very bad
I'm not sure what this has to do with anything. In general, I don't understand or appreciate your hostile tone and would like for this conversation to take a more positive tone or for us to just drop it.
Everyone else still knows what art is. The only reason artists grapple with it is because it's existential for them. Artists think of themselves as the gatekeepers of art but the only thing that qualified them for that (in their minds) was the ability to produce it.
Now that everyone is producing generic entry level art (what most artists do anyway), they are losing their identity. The "What is art?" conversation isn't about art, it's about gatekeeping.
What is art?
> Artists think of themselves as the gatekeepers of art

> The "What is art?" conversation isn't about art, it's about gatekeeping
Can you answer that question without yourself being a gatekeeper? Your post definitely comes off as judgemental and gatekeeping about what art is. But what if you're wrong? The people pondering what it may be certainly seem to have a more open mind, being more considerate of all the ways that expression and significance can be found.
I agree with the parent comment. This might be neat to learn the basics of blender scripting, but it's an incredibly inefficient and clumsy way of making anything worthwhile.
Or maybe applications will develop new interfaces to meet LLMs in the middle, sort of how MCP servers are a very primitive version of that for APIs..
Future improvements don't just have to be a better version of exactly what it is today, it can certainly mean changing or combining approaches.
Leaving AI/LLM aside, 3D modeling and animation tech has drastically evolved over the years, removing the need for lots of manual and complicated work by automating or simplifying the workflow for achieving better results.
This is like training an AI to be an Excel expert, and then asking it to make Doom for you: you're going to get some result, and it will be impressive given the constraints. It's also going to be pure dog shit that will never see the light of day other than as a meme.
True. And if it stops being a path forward because a better approach is found (more likely than not), then this is also the best it will ever be.
And in 3D, you won’t be able to do that without understanding, as an operator, what you’d want to ask for. Do you want this specific part to be made parametrically in a specific way for future flexibility, animation, or rendering? When? Why? And do these understandings of techniques give you creative ideas?
A model trained solely on the visual outcome won’t add these constraints unless you know to ask for them.
Even if future iterations of this technology become more advanced and can generate complex models, you need to develop a skillset to be able to gauge and plan around how they fit into your larger vision. And that skillset requires fundamentals.
My hot take: this is the future of high-fidelity prompt-based image generation, not diffusion models. Cycles (or any other physically based renderer) is superior to diffusion models because it is not probabilistic, so scene generation via LLM before handing off to such a tool leads to superior results, IMO - at least for "realistic" outputs.
1. Generate something that looks good in 2D latent space
2. Generate a 3D representation from the 2D output
3. Next time the same scene is shown on the screen, reuse information from step 2 to guide step 1
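A minimal sketch of the "LLM builds the scene, a physically based renderer draws it" half of that idea, assuming headless Blender/Cycles (the placeholder geometry stands in for whatever the LLM would emit):

    # Run headlessly with: blender --background --python render_scene.py
    import bpy

    scene = bpy.context.scene
    scene.render.engine = "CYCLES"          # physically based path tracer
    scene.render.filepath = "/tmp/out.png"  # illustrative output path
    scene.render.resolution_x = 1024
    scene.render.resolution_y = 1024

    # Placeholder scene content standing in for LLM-generated code
    bpy.ops.mesh.primitive_uv_sphere_add(radius=1.0, location=(0, 0, 1))
    bpy.ops.object.light_add(type="SUN", location=(5, -5, 10))
    bpy.ops.object.camera_add(location=(0, -6, 3), rotation=(1.2, 0, 0))
    scene.camera = bpy.context.active_object

    bpy.ops.render.render(write_still=True)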
Why are those two the only options?
I made no such claim. The only thing I declared is my belief in the superiority of PBR over diffusion models for a specific subset of image-generation tasks.
I also clearly framed this as my opinion, you are free to have yours.
Yes, thank you for your generosity.
You very clearly framed your opinion as a dual choice ("this instead of that", "rather than", "either or"). The most natural way to read your comment is that way: one or the other.
That's the way the English language works. If you meant something else, you should have said something else.
Hot take: a piece of commentary, typically produced quickly in response to a recent event, whose primary purpose is to attract attention
There's something uniquely spirited about narrowly focused experts feeling threatened by "AI" and responding like this, wherein some set of claims is made ("AI will never be able to [insert your specialty]" or "This can't do [specific edge case I'm an expert at]")
except for the root claim:
“This threatens both my economic stability because it will shift my market, and my social status because I have devoted 100,000 hours to practicing and expertise so that I can differentiate myself because it’s part of my identity”
Why the quotes there? I do feel threatened by AI, but not economically, not in the context of this article at least.
> AI will never be able to
I said no such thing.
> This can’t do the
I also didn't say that; I didn't focus on any particular thing it can't do. In general I don't voice my opinion; I think here the urge was just too strong because the examples were just too crude, and I had to say that if you think those are impressive, you can literally do it after a day's worth of learning. I only flexed my expertise to give some strength to that statement and motivate people to really try.
> “This threatens both my economic stability because it will shift my market, and my social status because I have devoted 100,000 hours to practicing and expertise so that I can differentiate myself because it’s part of my identity”
I'm an OSHA inspector/instructor. I don't think what I learned in 3D will go to waste even if AI will actually start creating good 3D models (I wouldn't bet against it).
They have an Aseprite plugin that generates pretty nice-looking sprites from your prompts.
I asked GPT which is the most scriptable CAD software, and its answer was FreeCAD. Blender is not CAD software as far as I understand; the user cannot make measurements like in FreeCAD.
Unfortunately FreeCAD's API is a little scattered and not well organized, so GPT has trouble remembering/searching and retrieving the relevant functions. Blender is a lot more popular, there's more code on the internet, and it performs much better.
I haven't yet managed to write Rust that interoperates with FreeCAD or Blender, so if it is easier to write Rust for OpenSCAD I might settle on OpenSCAD. I have to experiment somewhat to find which of the three is most compatible with Rust.
Blender cannot do that as far as I understand.
Something like that for example: [1]
LLMs alone will never make visual art. They can provide you an interface to other models, but that's not what this is.
Such tasks can be "not making visual art", but that doesn't mean they aren't useful.
Not exactly sure what your point is. If an LLM can take an idea and spit out words, it can spit out instructions (just like we can with code) to generate meshes, or boids, or point clouds or whatever. Secondary stages would refine that into something usable and the artist would come in to refine, texture, bake, possibly animate, and export.
In fact, this paper is exactly that. Words as input, code to use with blender as output. We really just need a headless blender to spit it out as a GLTF and it’s good to go to second stage.
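For reference, the "headless Blender spits out a glTF" step is already just a couple of calls, sketched here with placeholder geometry standing in for the generated code:

    # Run with: blender --background --python export.py
    import bpy

    # ...the LLM-generated mesh construction would run here...
    bpy.ops.mesh.primitive_cone_add(radius1=0.5, depth=1.2)  # placeholder geometry

    # Export the whole scene as a binary glTF for the next pipeline stage
    bpy.ops.export_scene.gltf(filepath="/tmp/asset.glb", export_format="GLB")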
Then you have sub specialties. Rigging, animation, texturing, environments, props, characters, effects.
It’s a fascinating process.
the detour via language here is not needed, these models can speak geometry more and more fluently
Anything that simplifies using advanced tools is useful.
You would basically need a life simulator for that, with rules like fish exist, fish live in water, fish chase other organic things in the water, a worm attracts fish, the hook makes the fish stick if it bites, do you see what I'm saying? You can't just have all this stuff happening organically. It's possible to have systems where the possibilities are so vast that the dev can't consider all of them, like spore or path of exile or whatever, lots of games with lots of options for customization. But you can't just have a real world simulator, the real world is too complex to simulate, let alone in real time.
Are there datasets where an LLM learns to interact with slicers, finite element analysis software, or even real-world 3D-printed objects, to allow the model to do some quality assessment?
We slowly started building more and more agents. Everything we tried just worked (kinda amazing). We first started by trying to incorporate visual understanding via VLMs. Then we slowly added more and more agents, and the BlenderRAG gave a huge boost.
One of my suspicions about these "can we make an LLM do something that isn't text?" projects is that underpinning it is something that isn't to do with AI at all.
Instead it's that a lot of specialist programmers really really loathe GUI paradigms for anything, consider text interfaces inherently superior, think the job of a GUI is only to simplify tasks and hide complexity, and so think all complex GUIs that are not immediately intuitive are categorical failures.
In rejecting learning GUI tools, they rule out the possibility that GUI interfaces support paradigms text cannot, and they rule out the possibility that anyone who has deep skills in a particular GUI knows anything more than what all the switches and buttons do, when a Blender user is very evidently engaging in a kind of abstract thought similar to that involved in programming.
It is much the same with FreeCAD. Does the FreeCAD GUI still suck in several places? Yes. It was confusing and annoying until I learned that it is not trying to hide complexity. The inherent complexity is in the problem domain. But a programmer with a bit of maths and logic knowledge can, IMO, easily learn it. And then you discover that the FreeCAD UI is what it is because it is a set of tools designed by CAD-focussed programmers that attempts little to no magic, and suddenly you are using the tools to solve your own problems.
In short a lot of these projects have a whiff of "but I don't wannnnna learn a GUI". The LLM or AI generator offers a way to defer properly learning the tools or techniques that are not so difficult to learn, and so attracts a lot of attention.
I think it's clear that the human brain is modular, and that at least one module in the human brain shares similarities to the LLM. So the key is really to build the other modules and interconnect everything.
All these kinds of generators are LLM-via-text-proxy in the sense that people are using LLMs' excellent text generation properties to generate through the scripting interfaces of various tools.
I want a Blender plugin that assists me; I don't want to click to generate a model.
But this generation is too young to remember those.