Ask HN: What's a good format to submit CSV data for LLMs?
2 points | 8 hours ago | 2 comments | HN
I need to submit about 1,000 rows of data to an LLM so I can ask it for trends within the data. If I use JSON, the GPT tokenizer shows roughly 40 tokens per row (because the headers are repeated in every row, which is inefficient). That means about 40k input tokens, which would definitely put me in context rot (hallucination) territory. And I've heard that using CSV is very inaccurate. Any suggestions?
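To make the header overhead concrete, here is a rough sketch that serializes the same rows as JSON records and as CSV and compares their sizes. The sample rows and field names are made up for illustration, and character count is only a loose proxy for tokens; an actual tokenizer would give different numbers.

```python
import csv
import io
import json

# Hypothetical sample rows; the field names are invented for this sketch.
rows = [
    {"date": "2024-01-01", "region": "north", "units": 12},
    {"date": "2024-01-02", "region": "south", "units": 7},
    {"date": "2024-01-03", "region": "north", "units": 15},
]


def as_json_records(rows):
    # Every row repeats every key, which is where the per-row overhead comes from.
    return json.dumps(rows)


def as_csv(rows):
    # The header appears once; each following line carries only values.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


json_text = as_json_records(rows)
csv_text = as_csv(rows)

# Character counts are a rough stand-in for token counts, not a measurement.
print("JSON chars:", len(json_text))
print("CSV chars:", len(csv_text))
```

This only shows the size difference; it says nothing about how accurately a given model reads either format.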
mierz00
6 hours ago
We analyse thousands of lines from a CSV using an LLM. The only thing that worked for us was to send each line individually and analyse them one by one.

I’m not sure if that would work in your use case, but you could use an LLM to classify each line into a value, then hard-code the trends you are looking for.

For example, if you’re analysing something like support tickets, use an LLM to classify the sentiment of each one; you can then plot the sentiment on a graph and see if it’s trending up or down.
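A minimal sketch of that classify-then-aggregate idea, under a few assumptions: `classify_sentiment` is a hypothetical placeholder for the per-row LLM call (a keyword check stands in so the sketch runs offline), the tickets are made-up data, and the "trend" is just a comparison of the first and last daily averages.

```python
from collections import defaultdict
from statistics import mean


def classify_sentiment(text: str) -> int:
    """Placeholder for a per-row LLM call; returns -1, 0, or 1.

    A keyword check stands in here so the sketch runs without any API.
    """
    lowered = text.lower()
    if any(word in lowered for word in ("broken", "refund", "angry")):
        return -1
    if any(word in lowered for word in ("thanks", "great", "love")):
        return 1
    return 0


# Hypothetical tickets as (ISO date, text) pairs.
tickets = [
    ("2024-01-01", "App is broken again"),
    ("2024-01-01", "I want a refund"),
    ("2024-01-02", "Login page question"),
    ("2024-01-03", "Thanks, great support"),
    ("2024-01-03", "Love the new release"),
]


def daily_sentiment(tickets):
    # Classify each row on its own, then average the scores per day.
    by_day = defaultdict(list)
    for day, text in tickets:
        by_day[day].append(classify_sentiment(text))
    return {day: mean(scores) for day, scores in sorted(by_day.items())}


series = daily_sentiment(tickets)
values = list(series.values())

# The trend logic is ordinary code, not an LLM judgement.
if values[-1] > values[0]:
    trend = "up"
elif values[-1] < values[0]:
    trend = "down"
else:
    trend = "flat"

print(series, trend)
```

The point of the split is that the model only ever sees one row at a time, and the aggregation is deterministic code you can check.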

eimrine
7 hours ago
You can use good old algorithms to search for the trends you care about. Just ask the LLM how to code them. Any algorithm you might need is somewhere in Donald Knuth's books.
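As one example of a "good old algorithm" for trends, here is a sketch of an ordinary least-squares slope over an evenly spaced series. The function name and the sample values are made up for illustration; a positive slope suggests an upward trend, a negative one a downward trend.

```python
def least_squares_slope(ys):
    """Slope of the best-fit line through (0, ys[0]), (1, ys[1]), ...

    Assumes the observations are evenly spaced in time.
    """
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical weekly totals.
weekly_units = [12, 15, 14, 18, 21]
slope = least_squares_slope(weekly_units)
print("units per week:", slope)
```

This runs on all 1,000 rows at once with no context-window concerns, since no LLM is involved at analysis time.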
JimsonYang
7 hours ago
ty