The game Myst is all about this magical writing script that allowed people to write entire worlds in books. That's where it feels like this is all going. Unity/Blender/Photoshop/etc. is ripe for putting an LLM over the entire UI and exposing the APIs to it.
This is probably the first pitch for using AI as leverage that's actually connected with me. I don't want to write my own movie (sounds fucking miserable), but I do want to watch yours!
It is terrifyingly good at writing. I expected freshman college level, but it's actually close to professional in terms of prose.
The plan is maybe to transition into children's books, then children's shows made with AI, catered to a particular child at a particular phase of development (Bluey talks to your kid about making sure to pick up their toys).
Maybe today only a few people can do this, but five years from now? Ten? What sucker would pay for any TV shows or books or video games or anything if there's a comfy UI workflow or whatever I can download for free to make my own?
The bottleneck is no longer the labor to turn ideas into reality; the bottleneck is imagination itself. It's incredible. The cost to produce/consume going down, along with many other facets of the economy, translates into deflation.
If you make less money, or work fewer hours, or only have a single person in your family work - that's ok because money will go further. That's the whole idea behind Star Trek; the first step, though, was intelligent computers, automation and robots, harnessed in a way that doesn't backfire on us.
It's the human competition. Every human in the world is competing - on some level - with every other human.
The widespread "benefits" to society you describe just change the playing field all humans are competing on. The money will go from that to something else - to pick something topical, say, housing.
Precisely. Being able to more easily produce and consume just means more production and consumption. The "rat race" competition will continue. No one will work less.
You don't understand what that means. It means your soul's worth is measured with more precision and accuracy. Combined with a free-market economy, that implies that individuals and groups of people who produce less can be more openly considered objectively less human.
I don't personally care, but that's not... It seems to me that the vast majority of people around here already don't exactly like what the Internet has always rewarded, nor how fast it's evolving, nor where it's headed. I think this only accelerates those trends, and I suspect it might not be something you would reflect on positively later.
I think it's about time the industry faced that risk. They have it coming in spades.
For example, LOST wouldn't have been such a galactic waste of time if I could have asked an AI to rewrite the last half of the series. Current-generation AI is almost sufficient to do a better job than the actual writers, as far as the screenplay itself is concerned, and eventually the technology will be able to render what it writes.
Call it... severance.
Even if AI prose weren’t shockingly dull, these models all go completely insane long before they reach novel length. Anthropic are doing a good job embarrassing themselves at an easy bug-catching game for barely-literate 8-year-olds as we speak, and the model’s grip on reality is basically gone at this point, even with a second LLM trying to keep it on track. And even before they reach the ‘insanity’ stage, their writing inevitably regresses towards the average of all writing styles regardless of the prompt, so there’s not much ‘prompt engineering’ you can do to fix this.
I also think that using AI would only lengthen the learning period. It will get some kind of results faster, though.
This is what Windows Copilot should have been!
“Territory” shall mean the worldwide territory, excluding the territory of the European Union, United Kingdom and South Korea.
You agree not to use Tencent Hunyuan 3D 2.0 or Model Derivatives: 1. Outside the Territory;
Since they don’t fully comply with EU AI regulations, Meta preemptively disallows their use in those regions to avoid legal complications:
“With respect to any multimodal models included in Llama 3.2, the rights granted under Section 1(a) of the Llama 3.2 Community License Agreement are not being granted to you if you are an individual domiciled in, or a company with a principal place of business in, the European Union. This restriction does not apply to end users of a product or service that incorporates any such multimodal models”
https://github.com/meta-llama/llama-models/blob/main/models/...
But even as a strategy, I don't think that would hold up if the Commission decided to fine Tencent for the release in case it violated the regulation.
IMHO it's just the lawyers doing something to please the boss who asked them to “solve the problem” (which they can't, really).
North Korea? Maybe. The UK? Who gives a shit
I am impressed; it runs very fast, far faster than the non-turbo version. But most of the time is being spent on texture generation, not on model generation. As far as I can understand, this speeds up model generation and not texture generation. But impressive nonetheless.
I also took a head shot of my kid and ran it through https://www.adobe.com/express/feature/ai/image/remove-backgr... and cropped the image and resized it to 1024x1024 and it spit out a 3d model with texture of my kid. There are still some small artifacts, but I am impressed. It works very well with the assets/example_images. Very usable.
Good work Hunyuan!
I see plenty of GitHub sites that are barely more than advertising, where some company tries to foss-wash their crapware, or tries to build a little text-colouring library that burrows into big projects as a sleeper dependency. But this isn't that.
What's the long game for these companies?
https://www.joelonsoftware.com/2002/06/12/strategy-letter-v/
Are any of them better or worse with mesh cleanliness? Thinking in terms of 3d printing....
I initially wondered how they managed to produce valid meshes robustly, but the answer is to not produce a mesh directly, which I think is wise!
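If I'm reading it right (this is my assumption, not something spelled out in the repo), the shape model predicts an implicit field and the mesh only gets extracted afterwards with something like marching cubes. A minimal sketch of that extraction step, with a toy sphere SDF standing in for the network output:

```python
import numpy as np
from skimage import measure  # pip install scikit-image

# Toy stand-in for the model's implicit field: a sphere SDF sampled on a grid.
# In the real pipeline this grid would come from the shape network, not a formula.
n = 64
xs = np.linspace(-1.0, 1.0, n)
x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5  # negative inside, positive outside

# Extract the zero level set as a triangle mesh via marching cubes.
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)  # (V, 3) vertex positions, (F, 3) triangle indices
```

That would also explain why the raw output is so triangle-heavy: marching cubes gives you a uniformly dense triangulation rather than artist-style topology.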
1. It does a pretty good job, definitely a steady improvement
2. The demos are quite generous versus my own testing; however, this type of cherry-picking isn't unusual.
3. The mesh is reasonably clean. There are still some areas of total mayhem (but these are easy to fix in clay modelling software).
This looks better than the other one on the front page rn
https://huggingface.co/spaces/tencent/Hunyuan3D-2
The meshes are very face-rich, and unfortunately do not reduce well in any current tool [1]. A skilled Blender user can quickly generate better meshes with a small fraction of the vertices. However, if you don't care about that, or if you're just using it for brainstorming starter models, it can be super useful.
[1] A massive improvement in the space will be AI or algorithmic tools which can decimate models better than the current crop. Often thousands of vertices can be reduced to a fraction with no appreciable impact on quality, but current tools can't do this.
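For reference, this is roughly the kind of quadric decimation the current crop of tools does; a minimal sketch with Open3D (file names are placeholders), which works but tends to smear detail once you push the reduction hard:

```python
import open3d as o3d  # pip install open3d

# Placeholder file name; swap in the OBJ/GLB exported from the generator.
mesh = o3d.io.read_triangle_mesh("hunyuan_output.obj")
print(f"before: {len(mesh.triangles)} triangles")

# Quadric edge-collapse decimation down to ~10% of the original triangle count.
target = max(1, len(mesh.triangles) // 10)
simplified = mesh.simplify_quadric_decimation(target_number_of_triangles=target)
simplified.remove_degenerate_triangles()
simplified.remove_unreferenced_vertices()
simplified.compute_vertex_normals()

print(f"after: {len(simplified.triangles)} triangles")
o3d.io.write_triangle_mesh("hunyuan_output_decimated.obj", simplified)
```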
No AI, just clever algorithms. I'm sure there are people trying to train a model to do the same thing but jankier and more unpredictable, though.
Something about the way this project generates models does not mesh, har har, with the algorithms of Quad Remesher.
There are mesh examples on the GitHub. I'll toy around with it.
-loads the diffusion model to go from text to image and generates a varied series of images based upon my text. One of the most powerful features of this tool, in my opinion, is text to mesh; to do this it uses a variant of Stable Diffusion to create 2D images as a starting point, then feeds them back into the image-to-mesh pipeline. If you already have an image, this part obviously isn't necessary.
-frees the diffusion model from memory.
Then for each image I-
-load the image-to-mesh model, which takes approximately 12GB of VRAM. Generate a mesh
-free the image-to-mesh model
-load the mesh + image to textured mesh model. Texture the mesh
-free the mesh + image to textured mesh model
It adds a lot of I/O between each stage, but with super fast SSDs it just isn't a big problem.
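Roughly, the loop looks like this. This is a sketch of the pattern only; the loader functions are placeholders I made up, not the actual Hunyuan3D-2 API:

```python
import gc
import torch

def release_vram():
    # After dropping the last reference to a stage, force collection and
    # return cached CUDA blocks so the next stage has room to load.
    gc.collect()
    torch.cuda.empty_cache()

def run(prompt, load_text2img, load_img2mesh, load_texturer, n_images=4):
    # Stage 1: text -> a varied batch of candidate images, then unload.
    text2img = load_text2img()            # e.g. a Stable Diffusion variant
    images = [text2img(prompt) for _ in range(n_images)]
    del text2img
    release_vram()

    results = []
    for img in images:
        # Stage 2: image -> untextured mesh (~12 GB VRAM), then unload.
        img2mesh = load_img2mesh()
        mesh = img2mesh(img)
        del img2mesh
        release_vram()

        # Stage 3: mesh + image -> textured mesh, then unload.
        texturer = load_texturer()
        results.append(texturer(mesh, img))
        del texturer
        release_vram()

    return results
```

The repeated load/free is what keeps peak VRAM at the size of a single stage instead of all three; the cost is exactly the extra model I/O mentioned above.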
Positive: "White background, 3D style, best quality"
Negative: "text, closeup, cropped, out of frame, worst quality, low quality, JPEG artifacts, PGLY, duplicate, morbid, mutilated, extra fingers, mutated hands, bad hands, bad face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck"
Thought that was funny.
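For anyone curious, this is roughly how a positive/negative prompt pair gets passed to a Stable Diffusion pipeline with diffusers; the checkpoint name here is a generic placeholder, not necessarily what Hunyuan3D-2 uses internally:

```python
import torch
from diffusers import StableDiffusionPipeline  # pip install diffusers

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # placeholder checkpoint
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a cute corgi, White background, 3D style, best quality",
    negative_prompt="text, closeup, cropped, out of frame, worst quality, ...",
).images[0]
image.save("corgi.png")
```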
So based on this your 4080 can do shape but not texture generation.
Also, in general, why not?
There are various reasons:
- Premature optimization will take away flexibility, and will thus affect your ability to change the code later.
- If you add features later that will affect performance, then since the users are used to the high performance, they might think your code is slow.
- There are always a thousand things to work on, so why spend effort on things that users, at this point, don't care much about?
Also:
> since the users are used to the high performance, they might think your code is slow.
I wouldn't worry about it in general - almost all software is ridiculously slow for the little it can do, and for the performance of machines it runs on, and it still gets used. Users have little choice anyway.
In this specific case, if speed is what makes it a different product, then losing that speed makes the new thing... a different product.
> There are always a thousand things to work on, so why spend effort on things that users, at this point, don't care much about?
It's R&D work, and it's not like they're selling it. Optimizing for speed and low resource usage is actually a good way to stop the big players from building moats around the technology, and to me, that seems like a big win for humanity.
Yes, of course people care about performance. Generating the mesh on a 3060 took 110+ seconds before and now takes about 1 second, and in early tests the quality is largely the same. I'd rather wait 1 second than 110 seconds, wouldn't you? And obviously this has an enormous impact on the financials of operating this as a service.
What makes you think this is true?