I built MOL, a domain-specific language for AI pipelines. The main idea: the pipe operator |> automatically generates execution traces — showing timing, types, and data at each step. No logging, no print debugging.
Example:
let index be doc |> chunk(512) |> embed("model-v1") |> store("kb")
This auto-prints a trace table with each step's execution time and output type. Elixir and F# have |> but neither auto-traces.

Other features:

- 12 built-in domain types (Document, Chunk, Embedding, VectorStore, Thought, Memory, Node)
- Guard assertions: `guard answer.confidence > 0.5 : "Too low"`
- 90+ stdlib functions
- Transpiles to Python and JavaScript
- LALR parser using Lark
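To make this concrete, here's a simplified Python sketch of what a traced pipeline run can look like (illustrative only, not MOL's actual interpreter code; `run_traced` and the chunk/embed/store lambdas are stand-ins):

import time

def run_traced(value, stages):
    """Run `value` through (name, fn) stages, recording one trace row per step."""
    trace = []
    for name, fn in stages:
        start = time.perf_counter()
        value = fn(value)
        elapsed_ms = (time.perf_counter() - start) * 1000
        trace.append((name, f"{elapsed_ms:.2f} ms", type(value).__name__))
    # Print a minimal trace table: step | time | output type
    for row in trace:
        print("{:<18} {:>12}  {:<10}".format(*row))
    return value

# Placeholder stages standing in for MOL's chunk/embed/store builtins.
stages = [
    ("chunk(512)", lambda doc: [doc[i:i + 512] for i in range(0, len(doc), 512)]),
    ("embed(model-v1)", lambda chunks: [[0.0] * 3 for _ in chunks]),
    ("store(kb)", lambda vectors: {"kb": vectors}),
]

index = run_traced("some document text " * 200, stages)

In MOL this bookkeeping happens inside the interpreter's handling of |>, so user code never has to touch it.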
The interpreter is written in Python (~3,500 lines). 68 tests passing. On PyPI: `pip install mol-lang`.
Online playground (no install needed): http://135.235.138.217:8000
We're building this as part of IntraMind, a cognitive computing platform at CruxLabx.
Re: pipe tracing, half a decade or so ago I made a little language called OTPCL, which has user-definable pipeline operators; combined with the ability to redefine any command in a given interpreter state, it'd be straightforward for a user to shove something like (pardon the possibly-incorrect syntax; haven't touched Erlang in a while)
'CMD_|'(Args, State) ->
    io:format("something something log something something~n"),
    otpcl_core:'CMD_|'(Args, State).
into an Erlang module, and then by adding that to a custom interpreter state with otpcl:cmd/3 you end up with automatic logging every time a script uses a pipe.

Downside is that you'd have to do this for every command defining a pipe operator (i.e. every command with a name starting with "|"). An alternate user-facing approach would be to get the AST from otpcl:parse/1, inject log/trace commands before or after every command, and pass the modified tree to otpcl:interpret/2 (alongside an interpreter state with those log/trace commands defined). Or do the logging outside of the interpreter between manual calls to otpcl:interpret/2 for each command; something like
trace_and_interpret([], State) ->
    {ok, State};
trace_and_interpret([Cmd|Tree], State) ->
    io:format("something something log something something~n"),
    {_, NewState} = otpcl:interpret([Cmd], State),
    trace_and_interpret(Tree, NewState).
should do the trick, covering all pipes and ordinary commands alike.

https://hackage.haskell.org/package/tracing-0.0.7.4/docs/Con...
Of the languages listed, Elixir, Python and Rust can all achieve this combination. Elixir has a pipe operator built-in, and Python and Rust have operator overloading, so you could overload the bitwise | operator (or any other operator you want) to act as a pipeline operator. And Rust and Elixir have macros, and Python has decorators, which can be used to automatically add logging/tracing to functions.
It's not automatic for all functions, though having to be explicit/selective about what is logged/traced is generally considered a good thing. It's rare that real-world software wants to log/trace literally everything, since it's not only costly (and slow) but also a PII risk.
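A minimal sketch of the overloading approach in Python (the Traced wrapper and stage functions here are hypothetical, not from any particular library); because only values explicitly wrapped in Traced get traced, the logging stays opt-in:

import time

class Traced:
    """Opt-in pipe: `Traced(x) | fn` applies fn and prints one trace line per step."""
    def __init__(self, value):
        self.value = value

    def __or__(self, fn):
        start = time.perf_counter()
        result = fn(self.value)
        elapsed_ms = (time.perf_counter() - start) * 1000
        name = getattr(fn, "__name__", repr(fn))
        print(f"{name:<10} {elapsed_ms:8.2f} ms -> {type(result).__name__}")
        return Traced(result)

def chunk(text):
    return [text[i:i + 512] for i in range(0, len(text), 512)]

def embed(chunks):
    return [[0.0] * 3 for _ in chunks]

# Only the wrapped value gets traced; plain calls stay silent.
result = (Traced("lorem ipsum " * 500) | chunk | embed).value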
Python's scoping and mutability make it an extremely poor language for pipelining.
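One concrete way to read that claim (an illustrative example, not the commenter's): Python's late-binding closures make it easy to build a list of pipeline stages that all capture the same loop variable, so every stage silently ends up with its last value:

# Build "multiply by n" stages for n = 1, 2, 3.
stages = [lambda x: x * n for n in range(1, 4)]

value = 10
for stage in stages:
    value = stage(value)

# Expected 10 * 1 * 2 * 3 = 60, but every lambda sees the final n == 3,
# so this prints 270.
print(value)

# The usual workaround is to bind n at definition time via a default argument:
stages = [lambda x, n=n: x * n for n in range(1, 4)]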
Using dbg/2 [1]:
# In dbg_pipes.exs
__ENV__.file
|> String.split("/", trim: true)
|> List.last()
|> File.exists?()
|> dbg()
This code prints:

[dbg_pipes.exs:5: (file)]
__ENV__.file #=> "/home/myuser/dbg_pipes.exs"
|> String.split("/", trim: true) #=> ["home", "myuser", "dbg_pipes.exs"]
|> List.last() #=> "dbg_pipes.exs"
|> File.exists?() #=> true
---

[1] Debugging - dbg/2
Pipelines are just a description of computation; sometimes it makes sense to optimize for throughput rather than low latency, e.g. by batching. Is execution separate from the pipeline definition?
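One way that separation could look, as a Python sketch (not how MOL actually works): the pipeline is just a list of stage functions, and separate executors decide whether items flow through one at a time or as a batch:

# A pipeline is only data: an ordered list of stage functions.
pipeline = [
    lambda x: x.strip(),
    lambda x: x.lower(),
    lambda x: len(x),
]

def run_one(pipeline, item):
    """Low-latency path: push a single item through every stage."""
    for stage in pipeline:
        item = stage(item)
    return item

def run_batched(pipeline, items):
    """Throughput path: apply each stage to the whole batch before moving on."""
    for stage in pipeline:
        items = [stage(item) for item in items]
    return items

print(run_one(pipeline, "  Hello  "))                # 5
print(run_batched(pipeline, ["  Hello  ", " Hi "]))  # [5, 2]

A batched executor could also hand an entire batch to a vectorized stage (e.g. embedding many chunks in one call), which is where the throughput gain usually comes from.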
I had a custom Lark grammar I thought was cool for doing something similar, but after a while I just discarded it and went back to straight Python, and found it was faster by an order of magnitude.