https://sqlite.org/appfunc.html
https://learn.microsoft.com/en-us/dotnet/standard/data/sqlit...
Maybe stick with the aggregate variety of function at first if you don't want any billing explosions. I'd probably begin with something like LLM_Summary() and LLM_Classify(). The summary could be an aggregate, and the classify could be a scalar. Being able to write a query like:
SELECT LLM_Summary(Comment)
FROM Users
WHERE datetime(Updated_At) >= datetime('now', '-1 day');
is more expedient than wiring up the equivalent pile of code each time. The aggregation method's internals could handle hierarchical summarization, chunking, etc. Or it could throw an error back to the user so they are forced to devise a more rational query.
Example usage:
openai-to-sqlite query database.db "
update messages set sentiment = chatgpt(
'Sentiment analysis for this message: ' || message ||
' - ONLY return a lowercase string from: positive, negative, neutral, unknown'
)
where sentiment not in ('positive', 'negative', 'neutral', 'unknown')
or sentiment is null
"
I haven't revisited the idea for fear of how much it could cost if you ran it against a large database, but given the crashing prices of Gemini Flash, GPT-4o mini, etc., maybe it's worth another look!
You could also expose additional functions corresponding to the external tools that you would like the agent to have access to, and pass these as arguments to additional UDFs.
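On the cost worry and the scalar vs. aggregate split from earlier in the thread, here is a rough sketch of how both kinds of function could be registered from Python's standard sqlite3 module. It is not how openai-to-sqlite does it: fake_llm() is a placeholder for a real model call, and MAX_ROWS is a made-up cap that makes the aggregate raise an error instead of sending an unbounded amount of text.

```python
import sqlite3

MAX_ROWS = 1000  # hypothetical cap so one query cannot send unbounded text


def fake_llm(prompt):
    """Placeholder for a real model call (assumption: swap in your own client)."""
    return "stub reply to %d chars" % len(prompt)


def llm_classify(text):
    """Scalar UDF: runs once per row, so cost scales with row count."""
    if text is None:
        return None
    return fake_llm("Classify: " + str(text))


class LLMSummary:
    """Aggregate UDF: gathers rows in step(), makes one call in finalize()."""

    def __init__(self):
        self.parts = []

    def step(self, value):
        if value is None:
            return
        if len(self.parts) >= MAX_ROWS:
            # Surfaces as an sqlite3 error, forcing a narrower query.
            raise ValueError("too many rows for LLM_Summary")
        self.parts.append(str(value))

    def finalize(self):
        if not self.parts:
            return None
        return fake_llm("Summarize:\n" + "\n".join(self.parts))


conn = sqlite3.connect(":memory:")
conn.create_function("LLM_Classify", 1, llm_classify)
conn.create_aggregate("LLM_Summary", 1, LLMSummary)

# Tiny demo table shaped like the Users query earlier in the thread.
conn.execute("CREATE TABLE Users (Comment TEXT, Updated_At TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, datetime('now'))",
    [("great product",), ("shipping was slow",)],
)

summary = conn.execute(
    "SELECT LLM_Summary(Comment) FROM Users "
    "WHERE datetime(Updated_At) >= datetime('now', '-1 day')"
).fetchone()[0]
labels = [row[0] for row in conn.execute("SELECT LLM_Classify(Comment) FROM Users")]
```

The aggregate makes a single call per group in finalize(), while the scalar makes one call per row, which is where the billing risk comes from.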
You could also lean into a data-driven approach and express much of the configuration in tables, then use the enhanced SQL dialect to tie everything together at runtime. In SQLite, arbitrary queries can be run during UDF execution. You could pull tool descriptions, parameter lists, enums, etc. from ordinary SQL tables without having to pass explicit args to the functions.
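A minimal sketch of that table-driven idea, again with Python's sqlite3 module. The table names (llm_tools, llm_tool_params), the get_weather tool, and the LLM_Tool_Prompt function are all invented for illustration. The UDF only assembles a prompt from the config tables and stops short of calling any model, and it assumes nested read queries on the same connection are acceptable for your setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE llm_tools (name TEXT PRIMARY KEY, description TEXT);
    CREATE TABLE llm_tool_params (tool TEXT, param TEXT, type TEXT);
    INSERT INTO llm_tools VALUES
        ('get_weather', 'Look up the current weather for a city');
    INSERT INTO llm_tool_params VALUES ('get_weather', 'city', 'string');
    """
)


def build_tool_prompt(question):
    """Scalar UDF that reads tool config from tables while it executes."""
    lines = []
    tools = conn.execute(
        "SELECT name, description FROM llm_tools ORDER BY name"
    ).fetchall()
    for name, description in tools:
        params = conn.execute(
            "SELECT param, type FROM llm_tool_params "
            "WHERE tool = ? ORDER BY param",
            (name,),
        ).fetchall()
        signature = ", ".join("%s: %s" % (p, t) for p, t in params)
        lines.append("%s(%s) - %s" % (name, signature, description))
    return "Tools:\n" + "\n".join(lines) + "\nQuestion: " + str(question)


# Hypothetical function name; nothing is passed in except the question.
conn.create_function("LLM_Tool_Prompt", 1, build_tool_prompt)

prompt = conn.execute(
    "SELECT LLM_Tool_Prompt('Is it raining in Oslo?')"
).fetchone()[0]
```

Adding or changing a tool then becomes an INSERT or UPDATE on the config tables rather than a code change.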
"""
Apache AGE™ Graph Database for PostgreSQL
Apache AGE™ is a PostgreSQL Graph database compatible with PostgreSQL's distributed assets and leverages graph data structures to analyze and use relationships and patterns in data.
"""
But recently there has been a surge of interest in MCP servers that can query databases, judging by the number of MCP servers popping up. An example: https://www.reddit.com/r/ChatGPTCoding/comments/1jd9lfa/lear...
So I was wondering if things like the Doris blog post, this paper, and sqlcoder are still relevant. What extra does this approach offer vs. trying to build it over MCP?