Common vibe coding artifacts:
• Code duplication (from copy-pasted snippets)
• Dead code from quick iterations
• Over-engineered solutions for simple problems
• Inconsistent patterns across modules
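To make that concrete, a quick iteration often leaves something like this behind (a made-up snippet, not from any real project):

    def normalize_email(email: str) -> str:
        return email.strip().lower()

    def clean_email(email: str) -> str:       # near-duplicate pasted in from another chat
        return email.lower().strip()

    def _old_normalize(email):                # dead code: no remaining callers
        return email.strip()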
pyscn performs structural analysis:
• APTED tree edit distance + LSH
• Control-Flow Graph (CFG) analysis
• Coupling Between Objects (CBO)
• Cyclomatic Complexity
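For reference, cyclomatic complexity is essentially 1 + the number of decision points in a function. Here's a rough stdlib-only sketch of the idea (not pyscn's actual CFG-based implementation; exact counting rules vary by tool):

    import ast

    # Node types treated as decision points in this simplified version.
    _BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

    def cyclomatic_complexity(source: str) -> int:
        """McCabe-style approximation: 1 + number of decision points."""
        tree = ast.parse(source)
        return 1 + sum(isinstance(node, _BRANCH_NODES) for node in ast.walk(tree))

    example = """
    def classify(x):
        if x < 0:
            return "negative"
        for i in range(x):
            if i % 2:
                return "odd seen"
        return "done"
    """
    print(cyclomatic_complexity(example))  # 4 = base 1 + if + for + if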
Try it without installation:
uvx pyscn analyze . # Using uv (fastest)
pipx run pyscn analyze . # Using pipx
(Or install: pip install pyscn)
Built with Go + tree-sitter. Happy to dive into the implementation details!
Since you mentioned the implementation details, a couple of questions come to mind:
1. Are there any research papers you found helpful or influential when building this? For example, I need to read up on using tree edit distance for code duplication.
2. How hard do you think this would be to generalize to support other programming languages?
I see you are using tree-sitter which supports many languages, but I imagine a challenge might be CFGs and dependencies.
I’ll add a Qlty plugin for this (https://github.com/qltysh/qlty) so it can be run with other code quality tools and reported back to GitHub as pass/fail commit statuses and comments. That way, the AI coding agents can take action based on the issues that pyscn finds directly in a cloud dev env.
I focused on Python first because vibe coding with Python tends to accumulate more structural issues. But the same techniques should apply to other languages as well.
Excited about the Qlty integration - that would make pyscn much more accessible and would be amazing!
1) unfamiliar framework 2) just need to build a throwaway utility to help with a main task (and I don't want to split my attention) 3) for fun: I think of it as "code sculpting" rather than writing
So this is absolutely a utility I would use. (Kudos to the OP.)
Remember the second-best advice for internet interactions (after Wheaton's Law): "Ssssshh. Let people enjoy things."
I don't think #1 is a good place to vibe code; if it's code that I'll have to maintain, I want to understand it. In that case I'll sometimes use an LLM to write code incrementally in the new framework, but I'll be reading every line of it and using the LLM's work to help me understand and learn how it works.
A utility like pyscn that determines code quality wouldn't be useful for me with #1: even in an unfamiliar framework, I'm perfectly capable of judging code quality on my own, and I still need and want to examine the generated code anyway.
(I'm assuming we're using what I think is the most reasonable definition of "vibe coding": having an LLM do the work, and -- critically -- not inspecting or reviewing the LLM's output.)
I think of coding agents as “talented junior engineers with no fatigue, but sometimes questionable judgment.”
Vibe coders don't care about quality and wouldn't understand why any of these things are a problem in the first place.
I find for every 5 minutes of Claude writing code, I need to spend about 55 minutes cleaning up the various messes. Removing dead code that Claude left there because it was confused and "trying things". Finding opportunities for code reuse, refactoring, reusing functions. Removing a LOT of scaffolding and unnecessary cruft (e.g. this class with no member variables and no state could have just been a local function). And trivial stylistic things that add up, like variable naming, lint errors, formatting.
It takes 5 minutes to make some ugly thing that works, but an hour to have an actual finished product that's sanded and polished. Would it have taken an hour just to write the code myself without assistance? Maybe? Probably? Jury is still out for me.
It's more useful as a research assistant, documentation search, and writing code a few lines at a time.
Or yesterday for work I had to generate a bunch of json schemas from Python classes. Friggin great for that. Highly structured input, highly structured output, repetitious and boring.
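(For anyone wanting to do the same: with pydantic v2 this is close to a one-liner. The class below is a made-up stand-in, not the commenter's actual code, and the original may well have used a different library:)

    import json
    from pydantic import BaseModel

    class TelemetrySample(BaseModel):          # hypothetical example class
        timestamp: float
        speed_kph: float
        label: str | None = None

    # pydantic v2: emit a JSON Schema dict for the class
    print(json.dumps(TelemetrySample.model_json_schema(), indent=2))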
But in about 45 minutes I got 700 lines of relatively compact web code to use plotly, jszip, and PapaParse to suck in video files, CSV telemetry, and logfiles, help you sync them up, and then show overlays of telemetry on the video. It can also save a package zip file of the whole situation for later use/review. Regex search of logs. Things linked so if you click on a log line, it goes to that part of the video. WASD navigation of the timeline. Templating all the frameworks into the beginning of the zip file so it works offline. etc.
I am not an expert web developer. It would have taken me many hours to do this myself. It looks crisp and professional and has a big feature set.
(Oh, yah, included in the 45 minutes but not the line count: it gave me a ringbuffer for telemetry and a CSV dumper for it and events, too).
The last couple of revisions, it was struggling under the weight of its context window a bit and I ended up making the suggested changes by hand rather than taking a big lump of code from it. So this feels like an approximate upper limit for the complexity of what I can get from ChatGPT5-thinking without using something like Claude Code. Still, a whole lot of projects are this size or smaller.
And even as the tools get better, they'll never get to the point where you don't need experts to utilize them, as long as LLMs are the foundation.
Vibe coders are the new script kiddies.
He literally bucketed an entire group of people by a weak label and made strong claims about competence and conscientiousness.
There was a time when hand-soldered boards were not only seen as superior to machine-soldered ones; machine soldering was outright looked down on. People went gaga over a good hand-soldered board and the craft.
People who are using AI to assist their coding today, the "vibe coders", would, I think, also appreciate tooling that helps maintain code quality across their project.
I think a comparison that fits better is probably PCB/circuit design software. Back in the day, engineering firms had rooms full of people drafting and doing calculations by hand. Today a single engineer can do more in an hour than 50 engineers could in a day back then.
The critical difference is, you still have to know what you are doing. The tool helps, but you still have to have foundational understanding to take advantage of it.
If someone wants to use AI to learn and improve, that's fine. If they want to use it to improve their workflow or speed them up that's fine too. But those aren't "vibe coders".
People who just want the AI to shit something out they can use with absolutely no concern for how or why it works aren't going to be a group who care to use a tool like this. It goes against the whole idea.
But "vibe coding" is this vague term that is used on the entire spectrum, from people that do "build me a billion dollar SAAS now" kind of vibe coders, to the "build this basic boilerplate component" type of vibe coders. The former never really get too far.
The latter have staying power because they're actually able to make progress and build something tangible.
So now I'm assuming you're not against AI generated code, right?
If that's the case then it's clear that this kind of tool can be useful.
I think AI is useful for research and digging through documentation. Also useful for generating small chunks of code at a time, documentation, or repetitive tasks with highly structured inputs and outputs. Anything beyond that, in my opinion, is a waste of time. Especially these crazy ass agent workflows where you write ten pages of spec and hope the thing doesn't go off the rails.
Doesn't matter how nice a house you build if you build it on top of sand.
"... fully give in to the vibes, embrace exponentials, and forgete that the code even exists."
If you're "vibe coding" you don't know and you don't care what the code is doing.
Prescriptive comment: Comment describes exactly what the following code does without adding useful context. (Usually this is for the LLM to direct itself and should be removed).
Inconsistent style: you list this as happening across modules, but I see it even within the same file.
Inconsistent calling style: A function or method should return one kind of thing.
(In the worst case, the LLM has generated a load of special cases in the caller to handle the different styles it made).
Unneeded "Service" class: I saw a few instances where something that should have been simple function calls resulted in a class with Service in the name being added, I'm not sure why, but it did happen adding extra complications.
Those are the ones off the top of my head.
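To illustrate the "Service" class point with a made-up example of the shape I mean:

    # What the LLM tends to emit: a stateless class wrapping a single action.
    class EmailValidationService:
        def validate(self, email: str) -> bool:
            return "@" in email and "." in email.split("@")[-1]

    # What it should have been: a plain function.
    def is_valid_email(email: str) -> bool:
        return "@" in email and "." in email.split("@")[-1]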
As a senior dev, I think use of these tools can be fine, as long as people are happy to go and fix the issues and learn. Anyone can go from vibe coder to coder if you accept the need to learn and improve.
The output of the LLM is a starting point; however much we engineer prompts, we can't know what else we need to say until we see the (somewhat) wrong output and iterate on it.
But specialization restricts the target market and requires time to develop. It's currently far more lucrative to try to make a general-purpose model and attract VC funding for market capture.
Personally, I can deal with quite a lot of jank and a lack of tests or other quality control tools in the early stages, but LLMs get lost so quickly. It’s like onboarding someone new to the codebase every hour or so.
You want to put them into a feedback loop with something or someone that isn’t you.
I'll try hooking it into my refactor/cleanup workflow with copilot and see how it works as grounding.
Great idea using it as grounding for AI-assisted refactoring! Let us know how that workflow goes.
https://aider.chat/docs/recordings/tree-sitter-language-pack...
I have an MCP server that wraps developer tool CLIs (linting, tests, etc), but this would need a textual report instead of HTML.
If cursor and Claude code can already run an executable why do I need to add an MCP server in front of it?
I feel like a lot of times it’s, “Because AI”
- Security/Speed: I leave "approve CLI commands" on in Cursor, which functions as a whitelist of known safe commands. It only needs to ask when running a non-standard command; 99% of the time it can use the tools directly. It will also verify that paths passed by the model are inside the project folder (not letting it execute on external files)
- Discoverability: For agents to work well, you need to explain which commands are available, when to use each, parameters, etc. This is a more formal version than a simple AGENTS.md, with typed parameters, tool descriptions, etc.
- Correctness: I find models mess up command strings or run them in the wrong folders. This is more robust than pure strings, with short tool names, type checking, schemas, etc.
- Parallel execution: MCP tools can run in parallel, CLI tools typically can't
- Sharing across the team: which dev commands to run tends to be spread across AGENTS.md, GitHub workflows, etc. This is one central place for the agent use case.
- Prompts: MCP also supports prompts (a less-known MCP feature). Not really relevant to the "why not CLI" question, but it's a benefit of the tool. It provides a short description of the available prompts, then lets the model load any by name. It requires much less room in context than loading an entire /agents folder.
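For a concrete picture of the "wrap the CLI as an MCP tool" idea from a few comments up, here's a minimal sketch using the official Python MCP SDK's FastMCP helper. The server and tool names are made up, and it assumes pyscn's --json report goes to stdout:

    import subprocess
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("dev-tools")                 # hypothetical server name

    @mcp.tool()
    def pyscn_analyze(path: str = ".") -> str:
        """Run pyscn structural analysis and return the JSON report as text."""
        result = subprocess.run(
            ["pyscn", "analyze", "--json", path],
            capture_output=True, text=True, check=False,
        )
        # Assumes the JSON report is written to stdout; fall back to stderr on failure.
        return result.stdout or result.stderr

    if __name__ == "__main__":
        mcp.run()                              # stdio transport by default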
pyscn analyze --json . # Generate JSON report
But when I try to run analyze or check:
Running quality check...
Complexity analysis failed: [INVALID_INPUT] no Python files found in the specified paths
Dead code analysis failed: [INVALID_INPUT] no Python files found in the specified paths
Clone detection failed: no Python files found in the specified paths
Error: analysis failed with errors
I'm certainly in a folder with Python files.