Benchmarking a Bug Scanner
5 points
by drob
2 days ago
| 2 comments
| blog.detail.dev
lmeyerov
6 minutes ago
The baseline Claude prompt being compared against feels pretty laughable, so I'm not sure what is learned. Maybe compare to a more realistic baseline for the DIY side for more compelling benchmarketing?

We started with a DIY code review skill because we naturally wanted to customize to our codebase and infra before trying solutions that add layers which might get in our way. We have a 1-page skill that does separate passes on security, spec conformance, proper DRY & architectural abstractions, etc., plus adversarial result-quality passes to prune & prioritize. Others do similar.
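
For anyone picturing the shape of that kind of multi-pass setup, here is a rough Python sketch. It is not the actual skill: the pass names, prompts, the `Finding` type, the 1-to-5 severity scale, and the `reviewer` / `critic` callables are all hypothetical stand-ins for whatever model calls you would wire in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    pass_name: str
    message: str
    severity: int  # hypothetical scale: 1 (low) to 5 (high)


# Hypothetical signatures: in practice these would wrap model calls.
Reviewer = Callable[[str, str, str], List[Finding]]  # (pass_name, prompt, diff)
Critic = Callable[[Finding, str], bool]              # True means keep the finding

# Illustrative prompts only; one entry per focused review pass.
PASSES: Dict[str, str] = {
    "security": "Look for injection, authz gaps, and leaked secrets.",
    "spec": "Check the diff against the stated spec.",
    "architecture": "Flag duplicated logic and leaky abstractions.",
}


def run_review(diff: str, reviewer: Reviewer, critic: Critic,
               top_n: int = 5) -> List[Finding]:
    """Run each focused pass, then prune and prioritize the results."""
    findings: List[Finding] = []
    for name, prompt in PASSES.items():
        findings.extend(reviewer(name, prompt, diff))

    # Adversarial quality pass: drop findings the critic rejects.
    kept = [f for f in findings if critic(f, diff)]

    # Prioritize: highest severity first, capped at top_n.
    kept.sort(key=lambda f: f.severity, reverse=True)
    return kept[:top_n]
```

Keeping the passes separate means each prompt stays narrow, and the critic step is where noisy findings get cut before anyone has to read them.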

sachiniyer01
2 days ago
Author here!

Lmk if there are any qs I can answer about Detail or the post.

not_right4r987
1 hour ago
Is this some kind of paid post? Asking someone with high karma to post so it makes the first page of HN?