FilterHN

Text-to-CAD

80 points

by softservo

3 days ago

| 10 comments

| github.com

▲

david_mchale

3 hours ago

[-]

I've been using Claude Opus 4.7 with OpenSCAD to create hacked connectors for vibrating mesh nebulizers. It still needs heavy manual checking to generate anything usable, but holy COW is it powerful when armed with the right info.

▲

softservo

3 hours ago

[-]

The purpose of this repo (harness and skills) is really just to give the models more direct tools to generate and inspect STEP files. It basically generates a topology sidecar for every STEP file that can be used to quickly read the BREP (faces/edges/vertices) without loading the full STEP.

There's also a bunch of work going into the SKILL.md to plan for more complex parts (this is mostly a stop gap while the models don't have amazing spatial reasoning).
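To make the sidecar idea concrete, here is a minimal sketch using only the Python standard library. It is not the repo's actual sidecar format, which the thread does not describe. It assumes the STEP file uses the common `ADVANCED_FACE`, `EDGE_CURVE`, and `VERTEX_POINT` entities, and the function names and the `.topology.json` suffix are made up for illustration.

```python
import json
import re
from collections import Counter
from pathlib import Path

# STEP (ISO 10303-21) entity names that commonly carry B-rep topology.
# Assumption: the exporter writes these entities; other schemas may differ.
TOPOLOGY_ENTITIES = {
    "ADVANCED_FACE": "faces",
    "EDGE_CURVE": "edges",
    "VERTEX_POINT": "vertices",
}

# Matches simple instance lines such as: #12 = ADVANCED_FACE('',(#13),#14,.T.);
ENTITY_RE = re.compile(r"#\d+\s*=\s*([A-Z0-9_]+)\s*\(")


def summarize_step(step_text: str) -> dict:
    """Count topology entities in the text of a STEP file."""
    counts = Counter(ENTITY_RE.findall(step_text))
    return {label: counts.get(name, 0) for name, label in TOPOLOGY_ENTITIES.items()}


def write_sidecar(step_path: Path) -> Path:
    """Write the summary next to the STEP file (hypothetical naming scheme)."""
    summary = summarize_step(step_path.read_text(errors="replace"))
    sidecar_path = step_path.with_suffix(".topology.json")
    sidecar_path.write_text(json.dumps(summary, indent=2))
    return sidecar_path
```

A real sidecar would presumably also record adjacency and geometry per face. The point of the sketch is only that a model can read a few lines of JSON instead of the whole STEP file.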

▲

david_mchale

1 hour ago

[-]

I appreciate that effort. Seeing Claude start to prototype physical objects that can get mass-produced is unbelievable, but wow, it uses up tokens like crazy.

I'm using Opus 4.7 w/ the 1M context option on the vibrating mesh nebulizer repo and have hit compacting pretty often, which is a restart-the-conversation flag for me, even on relatively small OpenSCAD files like the adapters and enclosures here, which are around 10-40 KB: https://github.com/dmchaledev/VibratingMeshNebulizerControll...

▲

behaviors

5 hours ago

[-]

I've been using an OpenSCAD container with various local models, dumping the render.png straight to the model and allowing it to modify the code and try again. It made some interesting things, but the main purpose was to fix things I've already made that have some weird single issue that cascades into a broken model if I touch it. OpenSCAD is the first step; FreeCAD and similar (now starting to see more CAD LLM work) are still a WIP. Since January we've solved 4 solid issues I'd left on the back burner. I use the Docker container version with some custom wrap/bridge work for the render dumps.
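A render-and-retry loop like the one described might look roughly like this. It is a sketch under assumptions, not the commenter's actual wrapper: it assumes the `openscad` CLI is on the PATH (its `-o` and `--imgsize` options are used to export a PNG), and `revise` is a hypothetical callback standing in for whatever sends the source and image to the model.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional


def render_command(scad: Path, png: Path, size=(800, 600)) -> list[str]:
    """Build the OpenSCAD CLI call that renders a .scad file to a PNG."""
    return ["openscad", "-o", str(png), f"--imgsize={size[0]},{size[1]}", str(scad)]


def iterate(
    scad: Path,
    png: Path,
    revise: Callable[[str, bytes], Optional[str]],  # hypothetical model hook
    max_rounds: int = 5,
    run: Callable[..., object] = subprocess.run,
) -> int:
    """Render, show the image to the model, apply its edit, repeat.

    `revise` receives the current source and the PNG bytes and returns new
    source, or None when it considers the model fixed. Returns the number
    of rounds that were run.
    """
    for round_no in range(1, max_rounds + 1):
        run(render_command(scad, png), check=True)
        new_source = revise(scad.read_text(), png.read_bytes())
        if new_source is None:
            return round_no
        scad.write_text(new_source)
    return max_rounds
```

Capping the rounds matters in practice, since a model that keeps "fixing" one issue can cascade new breakage into the file, which is exactly the failure mode described above.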

▲

SOLAR_FIELDS

4 hours ago

[-]

The problem is that the jump from OpenSCAD to a BRep-based modeler is not quite a jump. It’s more like scaling a 10,000-foot sheer cliff in terms of the difference in difficulty. You’ll be on that WIP for quite a long time.

▲

brookst

6 hours ago

[-]

I built https://github.com/brookstalley/cordyceps to do CAD work using claude code.

It's not perfect by any stretch, but it is surprisingly strong. It was able to create and debug some pretty complicated geometry by iterating with screenshots, adjusting view angle and zoom and rendering mode, updating parametric geometry generation, and working to fairly complex goals.

▲

amelius

4 hours ago

[-]

Without benchmarks and/or a whole suite of non-cherrypicked examples, this means nothing because you can trivially make an AI generate anything from text.

▲

softservo

3 hours ago

[-]

Working on benchmarks at the moment! Always open to feedback / PRs.

▲

carterschonwald

2 hours ago

[-]

im def working on benchmarks for how my own general harness improves task performance vs same model in a commodity setup. its hard to do!

i will say that my current harness: https://github.com/cartazio/oh-punkin-pi is a testbed for a bunch of 2nd gen harness tech, largely optimized for reasoning llms only. the next one after this harness is gonna be epicccc

▲

lsch1033

6 hours ago

[-]

You'll know how incapable it is when it doesn't seem to understand how servo motors work in the Demo Project.

▲

softservo

3 hours ago

[-]

▲

Eisenstein

6 hours ago

[-]

I have been using Claude to generate OpenSCAD for 3D printing. It works decently when the jobs are simple and can be easily described, but the description part really makes it clear how little vocabulary the ordinary person has to compose a good picture of any real item that isn't just a basic shape. It seems that the trick, like most things with getting LLMs to do something complicated and have it work well, is to be an expert in the field already.

▲

emporas

2 hours ago

[-]

The trick might be to have a multimodal AI describe what it sees in an image, and employ another LLM to turn that textual representation into code. Multimodal AIs are good at describing images.

Even a hand-drawn sketch could be a very good starting point for image recognition by an AI.

▲

XiZhao

7 hours ago

[-]

I just posted this somewhere else -- but overall big fan of these text to cad rigs as projects.

Obligatory mention of https://zoo.dev/ who went to extreme lengths on this.

I will say I explored this reasonably deeply and came away with the conclusion that even though we have OpenSCAD and all these examples, LLMs are still very weak at spatial reasoning compared to diffusion models.

You can do all sorts of tricks like have a parts library to get around this and do physics checks but another inconvenient truth is whenever you design a complex assembly, every change to that part needs to be aware of the other parts in the design -- thus you need a global part-aware editing capability from diffusion.

That's getting solved already in china leading labs, and bottlenecked by the lack of good training data, which china is solving with mass labor.

This will be solved overseas before it is solved in the US.

p.s. I am not affiliated with zoo or any of these other things FYI was just very curious about this whole area

▲

btbuildem

1 hour ago

[-]

I've been watching the space as well, waiting for the day I can stop fiddling with widgets and just tell the damn thing about the shapes I want and the ways in which they will move. Alas, we're far from that yet.

> That's getting solved already in china leading labs

Care to drop a bit of info as a follow up to this claim? Curious!

▲

unholiness

6 hours ago

[-]

> That's getting solved already in china leading labs, and bottlenecked by the lack of good training data, which china is solving with mass labor.

What work are you referring to here?

▲

SpyCoder77

7 hours ago

[-]

Zoo doesn't seem to be a great website; on my normally sized display there is a small horizontal scrollbar that moves like 5 pixels.

▲

mploscos

6 hours ago

[-]

overflow-x: hidden; and the pain goes away :-)

▲

ur-whale

6 hours ago

[-]

> LLMs are still very weak at spatial reasoning compared to diffusion models

Don't know what diffusion models can do, but I 100% agree with the "LLMs are very weak at spatial reasoning" comment.

I built a rather complex blueprint-image-to-3D-brep-model a couple of months back using codex ... ugh, the damn thing really has no idea where things are in space, something a 3-year-old figures out instinctively.

It did end up saving some time compared to modeling the object myself in a CAD package, but there were so many completely obvious things I had to explain ... very hard to believe when compared to what codex can pull off with code.

▲

MisterMower

5 hours ago

[-]

This sounds like a cool project, I would love to hear more about it. I am trying to solve a similar problem myself.

▲

carterschonwald

2 hours ago

[-]

i might borrow the skills etc for good ideas sometime. thats a lot of integration surface

▲

amelius

4 hours ago

[-]

The demo should be a pelican on a bicycle of course.

▲

ur-whale

7 hours ago

[-]

Not sure I understand ... no mention of an actual CAD engine backend ... did I miss it?

Or is this capable of generating STEP files directly from an LLM (which I doubt)?

[EDIT]: haha. the answer is hidden in:

.agents/skills/cad/requirements.txt

TL;DR:

    build123d

    ezdxf

    numpy

    trimesh

    vtk

and the engine is build123d, which, from its home page:

Build123d is a Python-based, parametric (BREP) modeling framework for 2D and 3D CAD. Built on the Open Cascade geometric kernel, it provides a clean, fully Pythonic interface for creating precise models suitable for 3D printing, CNC machining, laser cutting, and other manufacturing processes. Models can be exported to popular CAD tools such as FreeCAD and SolidWorks.

Probably worth mentioning in the README; I can't be the only one out there wondering.

Also: these things seem to be sprouting all over the place these days (a good thing!) ... CAD modeling using LLMs is clearly an idea whose time has come.

▲

GorbachevyChase

23 minutes ago

[-]

I don’t think its time has come. I think there are a lot of software folks who don’t understand what the actual pain points of professional engineers and CAD technicians are. I think there is a niche where text-to-CAD is good: hobby users who don’t want to invest in learning a CAD software UI. For professionals, where results have dollar values, there needs to be a much deeper understanding of the problem domain to understand why enterprise CAD software sucks.

▲

akiselev

6 hours ago

[-]

Based on requirements.txt it uses build123d so OpenCascade is the geometric kernel (CAD engine backend)

▲

ur-whale

6 hours ago

[-]

> it uses build123d so OpenCascade is the geometric kernel

yup, found it as you were typing this :D