> LLMs only reliably know what you just told them, don't rely on training data
This depends a lot on the model, but I've been using 4o for all kinds of information retrieval and found it to be generally reliable. At least not much worse than the general internet. You can ask it for sources, and of course you should not treat it as an authority, but it can often be a very good way to quickly find out a fact. You do need to develop a feel for the kind of thing it will know reliably and the kind of thing that will cause it to start hallucinating (a bit like with some of your co-workers).
> LLMs cannot write for you
I disagree; LLMs can write small blocks of text very well. But there is an art to using them. Don't try to create too much at once. I often find it works better when I give less input. If you list a bunch of things the output needs to include, the result tends to read like a student trying to cram in all the buzzwords.
> LLMs can help a human perform tasks, they cannot replace a human
I don't think anybody claims otherwise for current public models.
> Have the LLM do as little as possible
You need to learn what LLMs do well, and then use them for that. The idea that it is most efficient to program everything by hand as much as possible does not match my experience. Writing boilerplate of under 50 lines or so is something current models already do very well and very quickly. I usually just try to generate it, and if it does not work I write it by hand.
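To give a concrete picture of that workflow, here is a minimal sketch using the OpenAI Python client; the model name and the prompt are just placeholders, not a recommendation:

```python
# Minimal sketch: ask a model for a small, self-contained piece of boilerplate,
# then review it; if it doesn't work, fall back to writing it by hand.
# Assumes the official OpenAI Python client; model name and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "Write a Python dataclass `User` with fields id (int), name (str), email (str), "
    "plus a function that loads a list of Users from a CSV file."
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}],
)

generated_code = response.choices[0].message.content
print(generated_code)
```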
Finally, LLMs now take video and audio. We use Gemini to write meeting notes from Google Meet, and they tend to be very high quality, a lot better than what a random person taking notes usually produces. So the models are not text-only.
ChatGPT with search is an example of RAG, which is a pattern this article is promoting: you get better results from the LLM because ChatGPT injects additional search-result context into the model alongside your question.
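In rough terms the pattern looks like this (a minimal sketch; `web_search` and `call_llm` are hypothetical placeholders, not ChatGPT's actual implementation):

```python
# Minimal RAG sketch: retrieve context first, then inject it into the prompt.
# `web_search` and `call_llm` are hypothetical stubs, not a real API.

def web_search(query: str, k: int = 5) -> list[str]:
    """Return the top-k search result snippets for the query (stubbed here)."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Send the prompt to whatever model you use and return its reply (stubbed)."""
    raise NotImplementedError

def answer_with_rag(question: str) -> str:
    snippets = web_search(question)
    context = "\n\n".join(snippets)
    prompt = (
        "Answer the question using only the context below, "
        "and say which snippet you relied on.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)
```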
I can think of contradictory examples where the article's advice doesn't hold: LLM chat products are just the product of training data. Generative coding applications take one prompt and generate a lot of code.
What this article did make me think is that the existing chat UIs in coding apps are too limiting. Some have image attachments, but we need to let users put more detail into their prompt (visually, or by having a pre-generation discussion about specifics). That's why I think product engineers will benefit from AI more than non-technical folk.
Also, do you have resources that align with those opinions?
Example: Should we use "AI" for authentication and authorization?
For logging in: No.
For checking authority for an operation: No.
For determining the likelihood that the IP address a login attempt is coming from is part of an attack pattern: Yes!
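To make that split concrete, here is a sketch; `ml_risk_score` is a hypothetical model-backed scorer, and the point is that the model only contributes a signal and never makes the auth decision itself:

```python
# Sketch of keeping "AI" out of the auth decision itself.
# `ml_risk_score` is a hypothetical model-backed scorer; names are illustrative.

def verify_password(username: str, password: str) -> bool:
    """Deterministic credential check (bcrypt/argon2 lookup, etc.)."""
    raise NotImplementedError

def ml_risk_score(ip_address: str) -> float:
    """Hypothetical model scoring how likely this IP is part of an attack (0..1)."""
    raise NotImplementedError

def handle_login(username: str, password: str, ip_address: str) -> str:
    if not verify_password(username, password):   # no AI here
        return "denied"
    if ml_risk_score(ip_address) > 0.9:           # AI only adds a risk signal
        return "challenge"                        # e.g. require a second factor
    return "ok"
```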
Is what you're doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it's probably going to be great at it. If you're asking it to produce a roughly equal amount of text, it will be so-so. If you're asking it to create more text than you gave it, forget about it.
For me, I think the one exception to that is code: I can often get a few dozen lines of working code from the right single sentence prompt.
I can get better code if I feed in more examples of the libraries I'm using though, which fits this more-input-is-better rule.
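In practice that looks something like this sketch; the file names and prompt wording are illustrative, not a specific tool's API:

```python
# Sketch of the "feed in library examples" approach: paste real usage examples
# from your codebase into the prompt before asking for new code.
# File names and prompt wording are illustrative.
from pathlib import Path

example_files = ["examples/httpx_client.py", "examples/retry_helper.py"]
examples = "\n\n".join(Path(p).read_text() for p in example_files)

prompt = (
    "Here are examples of how we use our HTTP client and retry helper:\n\n"
    f"{examples}\n\n"
    "Using the same style and libraries, write a function that fetches "
    "/api/orders with retries and returns the parsed JSON."
)
# Send `prompt` to whatever model you use; the extra context usually beats
# a bare one-sentence request.
```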
I use ChatGPT like a chat-with-Wikipedia: for general knowledge where I was too lazy to Google, or where the answer is buried in ad-ridden AdSense blogs, like "give me an easy recipe for waffles".
Here it becomes interesting: I have been coding with Aider and Sonnet 3.5 for the past 3-4 weeks. I add the few files I need to change as well as whatever structural information (DB schema or so) and ask it to work on a task at a very high level. Yes, I need to offer architectural guidance or say whether I want library X or Y included, but overall it produces very good (I would say junior-level) code.
The thing is, it does all the long-winded writing I got tired of over the years. The writing I don't want to do anymore. The CRUD methods. The "change the labels and names of this to that". The "write a DB connector that does XYZ". All the stuff I would normally ask a junior to implement.
The output code is generally OK. It's not eye candy, but it generally works, maybe with occasional hiccups.
But the time saved is real: I prompt it with 2-3 lines, and code that would normally take me 30 minutes of convincing myself I really need to do it plus 10 minutes of writing is done in a mere 20 seconds. Without much of a mental context switch, and without it being very taxing on the brain.
I am not sure whether that counts as "take a large amount of text and make it smaller" (based on the extensive code it ingested during training) or falls into the category "create more text than I gave it", but it works for me. I would miss it if it stopped working.
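For a sense of what I mean by that kind of boilerplate, this is the sort of thing I happily delegate (a made-up example; the table and field names are purely illustrative):

```python
# The kind of repetitive CRUD boilerplate I now delegate to the model.
# Table and field names are made up for illustration.
import sqlite3

def create_label(conn: sqlite3.Connection, name: str, color: str) -> int:
    cur = conn.execute(
        "INSERT INTO labels (name, color) VALUES (?, ?)", (name, color)
    )
    conn.commit()
    return cur.lastrowid

def get_label(conn: sqlite3.Connection, label_id: int):
    return conn.execute(
        "SELECT id, name, color FROM labels WHERE id = ?", (label_id,)
    ).fetchone()

def rename_label(conn: sqlite3.Connection, label_id: int, new_name: str) -> None:
    conn.execute("UPDATE labels SET name = ? WHERE id = ?", (new_name, label_id))
    conn.commit()

def delete_label(conn: sqlite3.Connection, label_id: int) -> None:
    conn.execute("DELETE FROM labels WHERE id = ?", (label_id,))
    conn.commit()
```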
"Great article that contains some pithy aphorisms that I expect to see again and again."
"Great article that is approachable enough to share with less technical folk. Thank you!"
"Good article, I would add the good things about working with media different than text. For example, describing images."
They are similar indeed. The users have been around for a while so unless their accounts have been taken over, I wouldn't worry.
Sheesh. I think an AI would write a better article...
I've seen this behaviour in Claude but don't remember 4o doing the same, or at least not as frequently.
You still have four people who have been let go and will need to find a way to earn a living in this competitive market.
Going off on a tangent here, but ChatGPT Search is miles ahead of the, with all due respect, garbage that Google Search spews these days. Honestly, Google Search is now in a very awkward position where using an old-style keyword search doesn't really do what I want and asking it a question ChatGPT-style doesn't do what I want, either. I don't even know how to use it anymore to get good results. And why even bother when ChatGPT Search exists?