Queries should start with the `FROM` clause; that way, which entities are involved can be resolved quickly, and a smart editor can help you write a sensible query faster.
The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Never mind me, I'm just an old man with a grudge, I'll go back to my cave...
Check out the DuckDB community extensions:
[0]: https://duckdb.org/community_extensions/extensions/psql.html
[1]: https://duckdb.org/community_extensions/extensions/prql.html
But I haven't found a good editor plugin that is actually able to use that information to do completions :/ If anyone knows of one, I'd be happy to hear about it.
duckdb -ui
More here: https://duckdb.org/2025/03/12/duckdb-ui.html
>The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
Per the SQL standard, you can't use column aliases in WHERE clauses, because the selection (again, relational algebra) occurs before the projection.
> You could even avoid crap like `SELECT * FROM table`, and just write `FROM table` and have the select clause implied.
Tbf, in MySQL 8 you can use `TABLE <table>`, which is an alias for `SELECT * FROM <table>`.
It's inspired by a mish-mash of both relational algebra and relational calculus, but the reason why SELECT comes first is because authors wanted it to read like English (it was originally called Structured English Query Language).
You can write the relational algebra operators in any order you want to get the result you want.
Ultimately, yes, you can express relational algebra in any notation that gets the point across, but the parent is right that

    π₁(R)

is what is commonly used.

    (R)π₁

not so much. Even Codd himself used the former notation style in his papers, even though he settled on putting the relation first in his query language.

The point is about the order of the operations - Π(σ(R)) instead of σ(Π(R)) - and not about whether relational algebra uses prefix or postfix notation:

    Π(σ(R)) vs. R > σ > Π
SQL's WHERE statement (and others) works totally differently from SELECT in that regard, so it doesn't make much sense to say that "SELECT comes first because relational algebra".
I don’t think I see what you see. From 3.3:
    RANGE PART P
    GET W (P.P#,P.PNAME,P.QOH):(P.QOH<25)

    SELECT p.pnum, p.pname, p.qoh FROM part p WHERE p.qoh < 25

But a more direct translation would look something like:

    FROM part p SELECT p.pnum, p.pname, p.qoh WHERE p.qoh < 25

Even closer would be:

    FROM part p; SELECT p.pnum, p.pname, p.qoh WHERE p.qoh < 25
Of course they are different languages so we can only take things so far, but the pertinent bit is that the range specifier is declared first in ALPHA, whereas it is declared later, after the projection is defined, in SQL.

A common misconception (that SQL is a realization of RA instead of being barely based on it). In RA, it is in fact `Relation > Operator`:
https://cs186berkeley.net/notes/note6/
https://web.wlu.ca/science/physcomp/ikotsireas/CP465/W1-Intr...
More likely because this order is closer to typical English sentence structure. SQL was designed to look like English, not relational algebra.
Except this works in most major vendor SQL implementations. And they all support relation aliases in SELECT... Seems the standards have long since fallen behind actual implementations.
    SELECT id AS foo
    FROM MyTable
    WHERE foo = 1;

Similarly, you can’t do this:

    SELECT id
    FROM MyTable
    WHERE id = MAX(id);
Because in both cases, when the WHERE predicate is being executed, the engine doesn’t yet know what you’re asking it to find - for the former, SELECT hasn’t yet been called, so the alias isn’t applied; for the latter, the aggregation happens after filtering, so it can’t know what that maximum value is.

You can of course alias columns for SELECT however you’d like, and can also use those aliases in ORDER BY. You can also trivially refactor the above examples to use subqueries or CTEs to accomplish the goal.
You can also use "correlated column aliases" (I can't recall the proper name) i.e.
    SELECT
        id AS foo,
        foo || '_1' as foo_n,
        right(foo_n, 1) as foo_ordinal
    FROM MyTable
    WHERE foo = 1;
Again, if this isn't all part of SQL standards, the reality is that a lot of engines have semi-standard (sometimes very proprietary too) ways of handling these now common patterns. For real-world use cases, the standards are unfortunately becoming increasingly irrelevant. I think it would be better in the long term to use standards, but if they can't keep up with actual usage then they will just get ignored.

    SELECT LOWER(Name) as LName FROM Product where LName like '%frame%';
Don't blame the math for a sloppy implementation. Nothing in the math suggests where aliases should be specified. Don't get me wrong, it is a convenient, logical place to put them, but as you say it rather limits their use, and could have been done better.
Okay, so what?
We're not obligated to emulate the notational norms of our source material, and it is often bad to do so when context changes.
[0]: https://github.com/postgres/postgres/tree/master/src/backend...
Lots of people “like” things because they are familiar with them. And that’s a fine enough reason. But if you step out of your zone of familiarity, can you find improvements? Are you willing to forgo any prejudice you may possess to evaluate other suggestions?
Just a little willingness to see another perspective is all anyone asks.
No one was confused what you were doing.
I'm saying the historical reason for why it is the way it is, doesn't matter. I would hope we design our languages to be maximally clear and useful, not to be maximally full of historical cruft.
There's no accounting for taste, so you're welcome to like whatever you like, but I don't think you liking a backward syntax is particularly persuasive. It sounds more like you're just used to it than that you see any actual benefits to it.
Funnily enough, if you pull up a version of Postgres' parser prior to the 1995 release, you'll find that it puts the relation first.
https://learn.microsoft.com/en-us/kusto/query/?view=microsof...
Also the LINQ approach in .NET.
I do agree that it is about time SQL had a variant starting with FROM, and it shouldn't be that hard to support; it feels like unwillingness to improve the experience.
As a self-taught developer, I didn't know what I was missing, but now the mechanics seem clear, and if somebody really needs to handle SELECT with given names, then they should probably use a CTE:

    WITH src AS (SELECT * FROM sales),
         proj AS (SELECT customer_id, total_price AS total FROM src),
         filt AS (SELECT * FROM proj WHERE total > 100)
    SELECT * FROM filt;
It would be equally declarative if FROM came first.
What do you mean? Both ALPHA (the godfather declarative database language created by Codd himself) and QUEL, both of which inspired SQL, put "FROM" first. Fun fact: SQL was originally known as SEQUEL, which was intended to be a wordplay on being a followup to QUEL.
Another commenter makes a good case that SQL ended up that way because it was trying to capture relational algebra notation (i.e. π₁(R), where π~select, ₁~column, R~table). Yet relational algebra is procedural, not declarative. Relational calculus is the declarative branch.
Although the most likely explanation remains simply that SEQUEL, in addition to being fun wordplay, also stood for Structured English Query Language. In English, we're more likely to say "select the bottle from the fridge", rather than "from the fridge, select the bottle". Neither form precludes declarative use.
C++ has this issue too due to the split between header declarations and implementations. Change a function name? You're updating it in the implementation file, and the header file, and then you can start wondering if there are callers that need to be updated also. Then you add in templates and the situation becomes even more fun (does this code live in a .cc file? An .h file? Oh, is your firm one of the ones that does .hh files and/or .hpp files also? Have fun with that).
> The order should be FROM -> SELECT -> WHERE, since SELECT commonly gives names to columns, which WHERE will reference.
Internally, most SQL engines actually process the clauses in the order FROM -> WHERE -> SELECT. This is why column aliases (defined in SELECT) work in the GROUP BY, HAVING and ORDER BY clauses, but not in the WHERE clause.

    FROM table -- equivalent to today's select * from table
    SELECT a, 1 as b, c, d -- equivalent to select ... from table
    WHERE a in (1, 2, 3) -- the above with the where
    GROUP BY c -- the above with the group by
    WHERE sum(d) > 100 -- the above with having sum(d) > 100
    SELECT count(a distinct) qt_a, sum(b) as count, sum(d) total_d -- the above being a sub-query this selects from
SQL has a conceptual issue with repeating group by clauses, so maybe not that one (or maybe we should fix the conceptual issues). But any other, including the limiting and offset ones.
WHERE sum(d) > 100 -- the above with having sum(d) > 100
0 - https://www.w3schools.com/sql/sql_having.asp

You want the DB to first run SELECT * FROM <table>, and then start operating on that?
It's about how humans think about it, not about how the computer executes it.
I usually start with:

    select * from <table> as <alias> limit 5
BigQuery SQL and Spark SQL (and probably some others) have adopted pipelined syntax, DuckDB SQL simply allows you to write the query FROM-first.
Now we need to get the ANSI SQL committee to standardize it, in ANSI SQL 2027 or some such.
I will also hazard a guess that the total number of columns most people would need autocomplete for is rather limited? Such that you can almost certainly just tab complete for all columns, if that is what you really want/need. The few of us that are working with large databases probably have a set of views that should encompass most of what we would reasonably be able to get from the database in a query.
Even C when `property` is computed rather than materialized. And CLOS even uses that syntax for method invocation.
The only language I can think of that uses field(record) syntax for record field access is AT&T-syntax assembly.
name(object) makes as much sense as object.name
I don't think it is tough to describe many many reports of data that you would want in this way. Is it enough for you to flat out get the answer? No, of course not. But nor is it enough to just start all queries at what possible tables you could use as a starting point.
Your example is also not a complete select statement, you would need to go back and add the actual aggregate functions, and oops the zip_code column was actually called zip, so we need to remap that as well. You can almost never finish a select statement before you have inspected the tables, so why not just start there immediately?
And to be clear, I'm all for complaining about the order of a select statement, to a very large degree. It is done in such a way that you have to jump back and forth as you are constructing the full query. That can feel a bit painful.
However, I don't know that it is just the SELECT before FROM that causes this jumping back and forth and fully expect that you would jump around a fair bit even with the FROM first. More, if I am ever reworking a query that the system is running, I treat the SELECT as the contract to the application, not the FROM.
There is a bit of "you should know the database before you can expect to send off a good query", but that really cuts to any side of this debate? How do you know the tables you want so well, but you don't know the columns you are going to ask for?
Simply because there are a lot more columns than there are tables.
Of course, I sometimes forget the exact table name as well. However, this is mostly not an issue as the IDE knows all table names before anything has been entered. By simply entering `FROM user`, the autocomplete will list all tables that contain `user`, and I can complete it from there. I cannot do the same with the column selection unless I first write the FROM part. And even if I do know exactly which columns and tables I want I would still want autocomplete to work when creating the SELECT statement. Rather than typing `zip_code`, I can most likely just type `z<TAB>`.
That is why 99% of my select queries start as `SELECT * FROM mytable`, and then I go back to fill in the select statement. And it's not just me; all colleagues I've worked with do the exact same thing.
For larger, more complicated queries, you'll have to go back and forth a lot as you join tables together, that is unavoidable, but 80-90% of my queries could be finished in one go if the FROM part came first.
I actually seem to recall this was done in some of the early dedicated SQL tools. It is rather amusing how much we handicap ourselves by not using some things older tools built.
Maybe in this case you know the same naming conventions are enforced across all tables. But in general it’s difficult to know the exact column name without looking it up first.
I think you're missing the point; you start off with your goal, regardless of the column name. No one has the goal "we want columns from this specific table", it's always "we want these columns in the final output".
No. The goal is to have an output column $FOO, where $FOO is meaningful and might not even be in the database in the first place.
> my goal is not to select created_at, my goal is to select Foo's created_at.
Then your goal is to get a value out of a specific table, namely `Foo`, presumably because the end-user wants to see some value in the results.
The end-user getting the value has neither an interest in nor knowledge of your schema. `Foo.created_at` is no more meaningful in the result sets than an unadorned `created_at`.
For this specific example, the end-user might want a column in the output called `age` (if it came from an inventory table), or `duration` (if it came from a metrics table), or perhaps `expiry` (if it is a table containing perishable stock). These are all a column in a Foo table, but the request that caused you to write the query in the first place does not mention the table at all.
You get their requirement as either `age`, `duration`, `expiry` or similar. That is what you are starting from: "a need for a specific piece of data". You are not starting from "a need from a specific table", because the information needed is coming from someone who neither knows nor cares what your schema looks like.
    select name, min(price)
    from product
    join product_price on product.id = product_price.product_id
    group by name

where you start with "I need a product name and minimum price" before thinking about where they come from?

The more one uses SQL, the more you'll think about what you want to achieve vs. how (which comes after): in a sense, the SELECT is your function return type declaration, and I frequently start my functions with the declaration.
Sure, and pretty much every time the names I wrote up were not the ones in the table so that was a complete waste of time.
> the SELECT is your function return type declaration
That might be true if “select” only contained aliases, but that’s not the case at all, so what it is is complete nonsense.
My argument would be largely that you are doing a search of all of the tables for the columns that you want. With having to know the way to join the necessary tables along the way.
And I want to be clear, I don't think this is necessarily "the way." I more think that it is almost certainly an iterative process that can be started at either place, and will require bouncing between the two quite often.
For instance, yes, you can autocomplete column names from tables that you have put in the from. What if you can't find the column you are anticipating there? Go back and rescan the table names hoping you can guess which table should have the column you want? Then go back to autocomplete for columns to see if expected value appears? Or go to a bit more of a global search for columns?
Either way should be fine.
Back when I worked to support a data science team, I actually remember taking some of their queries and stripping everything but the select so that I could see what they were trying to do and I could add in the correct parts of the rest.
Think: I have 20 tables with the column `id`
"I want 'id,id,id'"
is bad UX, which is what is being argued here; when the syntax guides you, "I want 'FROM a: id'" is better.
It's like saying "to control mutation I only need to append 'mut' to the name!"
Even in your example, first and last could refer to student or teacher. But presumably you know you're looking for student data before knowing the exact columns.
Like, how would you send this question to someone? Or how would you expect it to be sent to you? If your boss doesn't tell you from what table they want some data, do you just not answer?
And sure, there could be ambiguities here. But these are not really fixed by listing the sources first? You would almost certainly need to augment the select to give the columns names that disambiguate for their uses. For example, when you want the teacher and the student names, both.
And this is ignoring more complications you get from normalized data that is just flat out hard to deal with. I want a teacher/student combination, but only for this specific class and term, as an example. Whether you start the from at the student, the class, or the term rosters feels somewhat immaterial to how you want to pull all of that data together for why ever you are querying it.
If they ask for "address for a customer" I can go to the customer table and look up what FKs are relevant and collect all possible data and then narrow down from there.
Realistically, I strongly suspect you could take this argument either direction. If you have someone making a query where they are having to ask "what all tables could I start from?" you are in for some pain. Often the same data is reachable from many tables, and you almost certainly have reasons for taking certain paths to get to it. Similarly, if they should want "person_name", heaven help them.
Such that, can you contrive scenarios where it makes sense to start the query from the from clause? Sure. My point is more that you almost certainly have the entire query conceptualized in your mind as you start. You might not remember all of the details on how some things are named, but the overall picture is there. Question then comes down to if one way is more efficient than the other? I have some caveats that this is really a thing hindered by the order of the query. We don't have data, of course, and are arguing based on some ideas that we have brought with us.
So, would I be upset if the order was reversed? Not at all. I just don't expect that would actually help much. My memory is using query builders in the past where I would search for "all tables that have customer_id and order_id in them" and then, "which table has customer_id and customer_address" and then... It was rarely (ever?) the name of the table that helped me know which one to use. Rather, I needed the ones with the columns I was interested in.
SELECT statements don't just use table names, they can use aliases for those table names, views, subqueries, etc.
The FROM / JOIN blocks are where the structure of the data you are selecting from is defined. You should not assume you understand what a SELECT statement means until you have read those blocks.
I can define the return format of my query in the SELECT statement, then adapt the data structure in the FROM block using subselects, aliases etc — all to give me the shape desired for the query.
If you've ever done complex querying with SQL, you'd know that you'd go back and forth on all parts of the query to get it right unless you knew the relations by heart, regardless of the order (sometimes you'll have to rework the FROM because you changed the SELECT is the point).
You can, but one of those ways is objectively worse for the reasons explained in this thread and in the article.
When you have to read a query out of order to understand it then something is wrong with the structure of the query language.
from(l in Blinq.Reservations.OrderCycleLinks)
|> join(:left, [l], r in Blinq.Reservations.Reservation, on: l.reservation_id == r.id)
|> select([l, r], %{
order_cycle_id: l.order_cycle_id,
customer_id: r.customer_id
})
|> where([l, r], l.order_cycle_id in ^order_cycle_ids and r.assignment_type == :TABLE)
|> Repo.all()
FWIW that is the approach used by LINQ.
FROM ... SELECT ... WHERE ...: Q S L
Python's list/dict/set comprehensions are equivalent to typed for loops: where everyone complains about Python being lax with types, it's weird that the one statement that guarantees a return type is now the target.
Yet most other languages don't have the "properly ordered" for loop, Rust included (it's not "from iter as var" there either).
It's even funnier when function calling in one language is compared to syntax in another (you can do function calling for everything in most languages, a la Lisp). Esp in the given example for Python: there is the built-in map, after all.
As for comprehensions themselves, ignoring that problem I find them a powerful and concise way to express a collection of computed values when that's possible. And I'm particularly fond of generator expressions (which you didn't mention) ... they often avoid unnecessary auxiliary memory, and cannot be replaced with an inline for loop--only with a generator function with a yield statement.
BTW, I don't understand your comment about types. What's the type of (x for x in foo()) ?
I get that it's about how it's structured and ordered, but that is true for the "for..in" loops in every language as well: you first set the variable, and only then get the context — this just follows the same model as the for loops in the language, and it would be weird if it didn't.
No, not at all. The output expression is arbitrary ... it might be f(x, y, z) where all of those are set later. You're confusing the output expression with the loop variable, which is also stated in the comprehension and may or may not be the same as the output expression or part of it. "The same model as the for loops in the language", where the language includes Python, is the comprehension with the output expression moved from the beginning to the end. e.g., (bar(x) for x in foo()) is `for x in foo(): bar(x)`.

More concretely:

    lst = [bar(x) for x in foo()]

is functionally equivalent to

    lst = []
    for x in foo():
        lst.append(bar(x))
Again, "the output expression comes before the context that establishes its meaning." ... I thought that would be clear as day to anyone who is actually familiar with Python comprehensions.
P.S. I'm not going to respond to goalpost moving.
It's like discussion of RPN or infix for calculations: both do the job, one is more rigorous and clear with no grouping/parentheses, yet we manage just fine with infix operators in our programming languages (or maybe not, perhaps all bugs are due to this? :)).
Just like you state a variable and some operations on it early in a comprehension, you do the same in a for loop: you don't know the type of it.
As you are typing the for loop in, your IDE does not know what is coming in as a context being iterated over to auto-complete, for instance (eg. imagine iterating over tuples with "for k, v in some_pairs:" — your editor does not even know if unpacking is possible).
Basically, what I am saying is that comprehensions are similarly "bad" as for loops, except they are more powerful and allow more expression types early.
C/C++ allow even crazier stuff in the "variable's" place in a for loop. Rust allows patterns, etc.
Typing is mostly a nice addendum I mentioned, that's not the core of my point.
The relative order of the variable being iterated on and the loop variable name is not relevant to OP's complaint. OP only requires that the expression which uses the loop variable comes after both, which is the case in Rust.
Reasons, sure, but whether those reasons correlate with things that matter is a different question altogether.
Python has a really strong on-ramp and, these days, lots of network effects that make it a common default choice, like Java but for individual projects. The rub is that those same properties—ones that make a language or codebase friendly to beginners—also add friction to expert usage and work.
> Failure to understand something is not a virtue.
This is what I want to say to all the Bash-haters that knee-jerk a "you should use a real language like Python" in the comments to every article that shows any Bash at all. It's a pet peeve of mine.
Pros: easy to interact with different tools, included in many Unix OSs, battle tested.
Cons: complex stuff gets messy, with weird syntax, hard to do logic, hard to escape everything correctly
Shell operates quite naturally under stream-oriented, data-centric design patterns. But that's an architecture familiar to almost nobody in our OOP- and FP-happy industry these days. Try using Python like a relational database language (e.g. SQL), and complex stuff gets messy, the syntax starts feeling hostile, logic becomes obscure, etc. The purported ickyness of shell stems not from shell per se but from trying to force it into a foreign design space.
Principles that work well with shell: Heavily normalize your data; Organize it into line-oriented tables with whitespace-separated fields; Use the filesystem as a blob store; Filenames can be structured data as well; positional parameters form a bona fide list.
> hard to escape everything correctly
IME, all of the really painful escape patterns can always be avoided. You see these a lot when people try to build commands and pass them off to exec. That's mostly unnecessary when you can just shove the arguments in the positional parameters via the set builtin: e.g.
set -- "$@" --file 'filename with spaces.pdf'
set -- "$@" 'data|blob with "dangerous" characters'
set -- "$@" "$etc"
some_command "$@"
which is functionally equivalent to some_command "$@" --file 'filename with spaces.pdf' \
'data|blob with "dangerous" characters' \
"$etc"
It also helps to rtfm and internalize the parsing steps that happen between a line of input and the eventual execve call.

This is the biggest reason to use shell - if your task is shaped like a command line session with a bit of looping or a few conditionals, shell is perfect.
If you start nesting loops or conditionals in shell, then start considering another language.
Thus my word "suggests". Thus the comprehensive pro/con list.
I'm not here to defend tiresome strawmen.
> I'm not here to defend tiresome strawmen.
I won't point out that you already tried to (contradiction intended). Perhaps a more interesting discussion would result if we defaulted to a more collaborative[0] stance here?
An example I like to give is the wood chipper. Pros it can do a lot of useful things around the yard (that you can list individually), cons it can chop your arm off. How many pros do you need to overcome that con? Though I'll admit there's a difference between "can" and "will", the latter hinging on improper use.
tl;dr I'm a bit tired with people glorifying a semi-useful cognitive device.
Comparisons won't tell us anything. If Python were the only programming language in existence, that doesn't imply that it would be loved. Or, if we could establish that Python is the technically worst programming language to ever be created, that doesn't imply that it wouldn't be loved. Look at how many people in the world love other people who are by all reasonable measures bad for them (e.g. abusive). Love isn't rational. It is unlikely that it is possible for us to truly understand.
If you insist.
> If Python were the only programming language in existence, that doesn't imply that it would be loved.
I'm not here to defend bizarre strawmen.
> It is unlikely that it is possible for us to truly understand.
If you insist.
Python with strict type checking and its huge stdlib is my favourite scripting language now.
That's quite a lot of ifs though. Tbh I haven't found anything significantly better for scripting. Deno is pretty nice, but TypeScript really has just as many warts as Python. At least it isn't so dog slow.
But when I say "scripting" I don't mean shell scripting. I mean stuff like complex custom build systems or data processing pipelines. I would never do those in any shell language.
It's just object-y enough to be excellent at filtering and shunting and translating records from API endpoints to CSV files to SQL data rows, etc. I'm not sure I'd recommend it to anybody to pick up because of all the sharp ends (eg it uses Javascript falseyness except SQL NULL has the ADO.Net "DBNull" which isn't falsey) but because I'm so familiar with it I find it quite good at that stuff.
As much as people bag on the syntax (like the operators all starting with hyphen, and all the .NET functions speak with an accent), the real problem is how many edge-cases are in there, like strange behaviors of the error stream, overcomplicated scope rules, FileSystem providers for the registry and SQL server and other objects no reasonable person would ever want to use as a "file", empty-array objects getting silently converted into null, etc.
It's time to try Scala 3 with Java libs' inbound interop: https://docs.scala-lang.org/scala3/book/scala-features.html
For a language where there is supposed to be only one way to do things, there are an awful lot of ways to do things.
Don’t get me wrong, writing a list comprehension can be very satisfying and golf-y. But if there should be one way to do things, they do not belong.
I would say unless you have a good reason to do so, features such as meta classes or monkey patching would be top of list to avoid in shared codebases.
I find them easier to understand and explain, too.
That's not what the Zen says, it says that there should be one -- and preferably only one -- obvious way to do it.
That is, for any given task, it is most important that there is at least one obvious way to do it, and also desirable that there be only one obvious way. But there are necessarily going to be multiple ways to do most things, because if there were only one way to do each thing, for non-trivial tasks that way would often be non-obvious.
The goal of Python was never to be the smallest Turing-complete language, and have no redundancy.
There are real concerns about tying the language's future to a VC-backed project, but at the same time, it's just such an improvement on the state of things that I find it hard not to use.
I'm working on a python codebase for 15 years in a row that's nearing 1 million lines of code. Each year with it is better than the last, to the extent that it's painful to write code in a fresh project without all the libraries and dev tools.
Your experience with Python is valid and I've heard it echoed enough times, and I'd believe it in any language, but my experience encourages me to recommend it. The advice I'd give is to care a lot, review code, and keep investing in improvements and dev tools. Git pre commit hooks (just on changed modules) with ruff, pylint, pyright, isort, unit test execution help a lot for keeping quality up and saving time in code review.
On the other hand, if you are going to be building something that is going to be long lived, with multiple different teams supporting it over time, and/or larger programs where it all doesn't fit in (human) memory, well then Python is going to bite you in the ass.
There isn't a one size fits all programming language, you need at least two. A "soft" language that stays out of your way and lets you figure things out, and a "hard" language that forces the details to be right for long term stability and support.
I've had the displeasure of working in codebases using the style of programming op says is great. It's pretty neat. Until you get a chain 40 deep and you have to debug it. You either have to use language features, like show in pyspark, which don't scale when you need to trace a dozen transformations, or you get back to imperative style loops so you can log what's happening where.
Lists comprehensions were added to the language after it was already established and popular and imho was the first sign that the emperor might be naked.
Python 3 was the death of it, imho, since it showed that improving the language was just too difficult.
People complain that we can’t have nice things. But even when we do, enough developers will be lazy enough not to learn them anyway.
And one of the, admittedly many, reasons why web technologies like Electron and React Native exist is because it’s easier to find JavaScript developers vs Kotin, Qt or whatever.
So youre not wrong but you’re also downplaying the network effect that led to the web becoming dominant. And a part of that was because developers didn’t want to learn something new.
I don't think this holds.
JavaScript was created as a frontend language specifically for web browsers.
It wasn't until 2009 with the introduction of Node.js that JavaScript became a viable option for backend development.
The web was already the dominant platform by then.
Before then it was Flash and native apps. Before then, smart phones weren’t common (eg the iPhone was just over a year old at that point) and the common idiom was to write native apps for Windows Mobile.
IE was still the dominant browser too, Firefox and WebKit were only starting to take over but we are still a long way from IE being displaced.
Electron didn’t exist so desktop app were still native and thus Linux was still a second class citizen.
It took until roughly 2015 for the web to become the dominant platform. But by 2005 you could already see the times were changing. It just took ages for the technology to catch up: the advent of V8, and thus Node and Electron, for that transition to “complete”.
JavaScript was used for backend development since the late 1990s via the Rhino engine (the backends wouldn't be pure JS generally but a mix of JS and Java; Rhino was a JS engine for the JVM with Java interop as a key feature.)
That would be nice if devs always wrote code sequentially, i.e. left to right, one character at a time, one line at a time. But the reality is that we often jump around, filling in some things while leaving other things unfinished until we get back to them. Sometimes I'll write code that operates on a variable, then a minute later go back and declare that variable (perhaps assigning it a test value).
If I decide to add a new field to some class, I won't necessarily go to the class definition first; I'll probably write the code using that field, because that's where the IDE was when I got the idea.

If I want to enhance some condition checking, I'll go through a phase where the piece of code isn't valid while I'm rearranging ifs and elses.
Often, not even then.
But some languages just won't let you do that, because they put in errors for missing returns or unused variables.
But you could have a compiled language where errors were limited to the function when possible, like by emitting asserts.
BABLR takes Hazel's idea and pushes it about 18 steps further, potentially making embedding gaps a feature of every editor and programming language.
As long as solutions don't have a way to scale economically they'll be academic, but BABLR makes this not academic anymore.
Gaps are present in the embedded code representation, e.g. if · is a gap we would represent the result of parsing `console.log(·)` using a "gap tag" (written `<//>`) in a CSTML document that will look like this: https://gist.github.com/conartist6/75dc969b685bbf69c9fa9800c...
Syntactically invalid.
I hope someone makes a fork that lets you disable those silly errors in debug mode
Reading through the article, the author makes the argument for the philosophy of progressive disclosure. The last paragraph brings it together and it's a reasonable take:
> When you’ve typed text, the program is valid. When you’ve typed text.split(" "), the program is valid. When you’ve typed text.split(" ").map(word => word.length), the program is valid. Since the program is valid as you build it up, your editor is able to help you out. If you had a REPL, you could even see the result as you type your program out.
In the age of CoPilot and agent coders I'm not so sure how important the ergonomics still are, though I dare say coding an LSP would certainly make one happy with the argument.
{3} for {2} in {1}
which would give you code completion for {3} based on the {1} and {2} that would be filled in first.

There is generally a trade-off between syntax that is nice to read vs. nice to type, and I’m a fan of having nice-to-read syntax out of the box (i.e. not requiring tool support) at the cost of having to use tooling to also make it nice to type.
This is not meant as an argument for the above for-in syntax, but as an argument that left-to-right typing isn’t a strict necessity.
I know that Python is used for many more things than just data science, so I'd love to hear if in these other contexts, a pipe would also make sense. Just trying to understand why the pipe hasn't made it into Python already.
I find myself increasingly frustrated at seeing code like 'let foo = many lines of code'. Let me write something like 'many lines of code =: foo'.
Interesting idea! However, I'm not sure I would prefer
"Mix water, flour [...] and finally you'll get a pie"
to
"To make a pie: mix water, flour [...]"
Its use is discouraged in most style guides. I do not use it in scripts, but I use it heavily in console/terminal workflows where I'm experimenting.
df |> filter() |> summarise() -> x
x |> mutate() -> y
plot(y)
bake(divide(add(knead(mix(flour, water, sugar, butter)),eggs),12),450,12)
versus
mix(flour, water, sugar, butter) %>% knead() %>% add(eggs) %>% divide(12) %>% bake(temp=450, minutes=12)
So much easier!
    dough = mix(flour, water, sugar, butter)
    dough.knead()
    dough = dough.add(eggs)
    cookies = dough.divide(12)
    cookies = cookies.bake(temp=450, minutes=12)
Might be more verbose, but definitely readable.

    result = (df
        .pipe(fun1, arg1=1)
        .pipe(fun2, arg2=2)
    )

is much less readable than

    result <- df |>
        fun1(., arg1=1) |>
        fun2(., arg2=2)

but I guess the R thing also works beyond dataframes, which is pretty cool.

    result <- df
        |> fun1(arg1=1)
        |> fun2(arg2=2)

Python doesn't have a pipe operator, but if it did it would have similar syntax:

    result = df
        |> fun1(arg1=1)
        |> fun2(arg2=2)

In existing Python, this might look something like:

    result = pipe(df, [
        (fun1, 1),
        (fun2, 2)
    ])

(Implementing `pipe` would be fun, but I'll leave it as an exercise for the reader.)

Edit: Realized my last example won't work with named arguments like you've given. You'd need a function for that, which starts looking awfully similar to what you've written:

    result = pipe(df, [
        step(fun1, arg1=1),
        step(fun2, arg2=2)
    ])
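Since the `pipe` above is left as an exercise, here is one minimal sketch of what such hypothetical `pipe`/`step` helpers could look like (the names and the `fun1`/`fun2` usage come from the comments above, not from any real library):

    # Minimal sketch of hypothetical pipe/step helpers (not a real library API).
    from functools import reduce

    def step(fn, *args, **kwargs):
        # Bind the extra arguments now; the piped value is supplied first, later.
        return lambda value: fn(value, *args, **kwargs)

    def pipe(value, steps):
        # Thread `value` through each step, left to right.
        return reduce(lambda acc, fn: fn(acc), steps, value)

    # Usage: result = pipe(df, [step(fun1, arg1=1), step(fun2, arg2=2)])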
I like the exercise:
https://gist.github.com/stuarteberg/6bcbe3feb7fba4dc2574a989...
The functions with extra arguments could be curried, or done ad-hoc like lambda v: fun1(v, arg1=1)
result = df
|> fun1(arg1=1)
|> fun2(arg2=2)
If you design something to "read like English", you'll likely get verb-first structure - as embodied in Lisp/Scheme. Other languages like German, Tamil use verbs at the end, which aligns well with OOP-like "noun first" syntax. (It is "water drink" word for word in Tamil but "drink water" in English.) So Forth reads better than Scheme if you tend to verbalize in Tamil. Perhaps why I feel comfy using vim than emacs.
Neither is particularly better or worse than the other and tools can be built appropriately. More so with language models these days.
If its imperative, sure. If its declarative and designed to read like English it will be subject first.
Doesn't Kakoune reverse that? And it makes so much more sense.
Doesn't German have the main verb on the second position? (For a simple example, "I drink water" would be "Ich trinke Wasser")
import * as someLibrary from "some-library"
someLibrary.someFunction()
Which works pretty well with IDE autocomplete in my experience.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
https://esbuild.github.io/api/#tree-shaking
Although there are associated issues but they may be specific to esbuild.
The issue you linked to is referring to the case in which you import a namespace object and then re-export it. Bundlers like webpack and rollup (which vite uses in production) can tree shake this pattern as well, but esbuild struggles with it.
If you're using esbuild, instead of this:

    import * as someLibrary from "some-library"
    someLibrary.someFunction()
    export { someLibrary }
You can still do this:

    import * as someLibrary from "some-library"
    someLibrary.someFunction()
    export * from "some-library"
    export { default as someLibraryDefault } from "some-library"
Tree shaking works as expected for downstream packages using esbuild in the second case, which someone else in the linked issue pointed out: https://github.com/evanw/esbuild/issues/1420#issuecomment-96...

Just do

    import SomeLibrary {
        asYetUnknownModule
    }
Essentially, what this combinator does is allow expressing a nested invocation such as:

    f(g(h(x)))

to be instead:

    h(x) |> g |> f

for languages which support defining infix operators.

EDIT: For languages which do not support defining infix operators, there is often a functor method named `andThen` which serves the same purpose. For example:

    h(x).andThen(g).andThen(f)

0 - https://leanpub.com/combinators/read#leanpub-auto-the-thrush

    map : (a -> b) -> a list -> b list

instead of

    map : a list -> (a -> b) -> b list
The main argument (data to be operated on) is positioned last, after the others which are more like parameters that tune the function.
It's to allow chaining these things up left to right like Unix pipes: map f l |> filter g |> ...
The technique of currying method parameters such that the last one is the input to the operation also makes chaining Kleislis[0] quite nice, albeit using a bind operator (such as `>>=`) instead of the Thrush combinator operator (`|>`).
0 - https://bartoszmilewski.com/2014/12/23/kleisli-categories/
https://github.com/tc39/proposal-pipeline-operator
It would make it possible to have far more code written in the way you’d want to write it
I've seen some SQL-derived things that let you switch it. They should all let you switch it.
https://en.wikipedia.org/wiki/Non-English-based_programming_...
I prefer list/set/dict comprehensions any day. They're more general, don't require knowing a myriad of different methods (which might not exist for all collections; PHP and JS are especially bad with this) and are easily extendable to nested loops.
Yes, it could be `[for line in text.splitlines() if line: for word in line.split(): word.upper()]`. But it is what it is. BTW I bet the Rust variant would be quite elaborate.
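For reference, a runnable version of what that hypothetical left-to-right syntax corresponds to in today's Python (the sample `text` is made up):

    # Today's ordering of the same nested comprehension.
    text = "the quick brown\nfox jumps"
    words = [word.upper() for line in text.splitlines() if line for word in line.split()]
    # ['THE', 'QUICK', 'BROWN', 'FOX', 'JUMPS']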
And these are flexible tools that you can take with you across projects.
It's the opposite, your knowledge of the standard set of folding algorithms (maps, filters, folds, traversals) is transferable almost verbatim across a wide range of languages: https://hoogletranslate.com/?q=map&type=by-algo
It's not. The author gives objective reasons why Python's syntax is inferior – namely, that it makes IDE support in the form of discoverability and auto-completion more difficult.
I use Neovim and tmux because I value snappy performance, and never having to leave the keyboard.
The reason I don’t like auto-complete is that it interrupts my thoughts. Once I’m typing code, I know what I want to do, and having things pop up is distracting. Before I’m typing, I’ll think about the problem, look at other code in the codebase, and/or read docs. None of that requires autocomplete, nor would it help me.
If you like autocomplete, great, use it, but don’t assume that it’s a binary choice between plaintext editing and Copilot.
This bit is an aside in the article but I agree so much! List comprehensions in python are great for the simple and awful for the complex. I love map/reduce/filter because they can scale up in complexity without becoming an unreadable mess!
Not possible. There are more keystrokes that result in invalid programs (you are still writing the code!!) than keystrokes that result in a valid program.
More seriously, I do think that one consideration is that code is read more often than written, so fluidity in reading and comprehension seem more important to me than “a program should be valid after each keystroke”.
This line, though, seems like it's using the wrong tools for the job:
len(list(filter(lambda line: all([abs(x) >= 1 and abs(x) <= 3 for x in line]) and (all([x > 0 for x in line]) or all([x < 0 for x in line])), diffs)))
To me it's crying out for the lines to be NumPy arrays:

    sum(1 for line in diffs
        if ((np.abs(line) >= 1) & (np.abs(line) <= 3)).all()
        and ((line > 0).all() or (line < 0).all()))
There's no need to construct the list in memory if you're just counting, and dealing with whole lines at once is much nicer than going element by element. On top of that, this version is much more left-to-right.

    diffs.count { line ->
        line.all { abs(it) in 1..3 } and (
            line.all { it > 0 } or
            line.all { it < 0 }
        )
    }
The pipe "|" as an analog to the cascade ";", but sending to the result, like a normal pipe would. This avoids having to go back and add parentheses when you have a longer expression involving keyword sends.
For example navigating a nested set of dictionaries
    self classDefs at:className.
    (self classDefs at:className) at:which.
    ((self classDefs at:className) at:which) at:methodName.

vs.

    self classDefs at:className.
    self classDefs at:className | at:which.
    self classDefs at:className | at:which | at:methodName.
[1] https://objecttive.st

It is sometimes still a problem when you define a function without a reference to it (because the types are unknown, of course). You will have to add types to it or call the function before implementing it.
There was also a great blogpost about it: https://www.javierchavarri.com/data-first-and-data-last-a-co...
It's true that Python didn't cater to static analysis at all.
    max(map(sum, input_list.split(None)))

To decipher this the eye has to jump to the middle of the line, move rightwards, then to the left to see the "map", then move right again to see what we are mapping, and then all the way to the beginning to find the "max".

The author would probably suggest rust's syntax* of

    values.iter().split(None).map(Iterator::sum).max().unwrap_or(0)

but I was learning q at the time, so came up with the much clearer /s, right to left

    max((0^+)\)l
*: Though neither Python nor Rust have such a nice `.split(None)` built in.

Sorry, I'm not sure I understand what `.split(None)` would do? My initial instinct is that it would return each character, i.e. `.chars()` in Rust or `list(s)` in Python.

Reading the docs [0] it seems `.split(None)` returns an array of the individual characters without whitespace - so something like [c for c in list(s) if not whitespace(c)]

[0] https://docs.python.org/3.3/library/stdtypes.html?highlight=...
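For what the footnote above is wishing for - splitting a sequence into groups at `None` separators - a minimal sketch (the helper name `split_on` is made up, not a built-in):

    # Hypothetical helper: split a sequence into groups at None separators.
    def split_on(items, sep=None):
        group = []
        for item in items:
            if item == sep:
                if group:
                    yield group
                group = []
            else:
                group.append(item)
        if group:
            yield group

    # max(map(sum, split_on(input_list)))  # the intent of the original one-liner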
Say what you will about clarity, but my mind sort of glossed over the intention in the python and rust code, focusing instead on the syntax, while the q code made me consider what was actually happening.
Perhaps q did both force you to consider what was happening, and hide it from you...
Though hopefully, once the idiom is learned, I'll be able to remember it :)
Curious as to why a scan \ rather than a reduction /?
This may eventually be upstreamed: https://github.com/rust-itertools/itertools/issues/1026
input_groups = input_list.split(None)
group_sums = map(sum, input_groups)
max_sum = max(group_sums)
This also gives you a slightly higher-level view of how the algorithm proceeds just by reading the variable names on the LHS.

I prefer Python comprehensions for the same reason. I can tell that they are about creating a list or a dict, and I don't need to parse a bunch of operations in a loop to come to the same conclusion.
    [for b in c
     let d = f(b)
     for e in g if h else m
     where p(d, e): b, d, e]

as alternative syntax to Python's

    ((b, d, e)
     for b in c
     for d in [f(b)]
     for e in (g if h else m)
     if p(d, e))

This solves five problems in Python listcomp syntax:

1. The one this article is about, which is also a problem in SQL, as juancn points out.
2. The discontinuous scope problem: in Python

    [Ξ for x in Γ for y in Λ]

x is in scope in Ξ and Λ but obviously not Γ. This is confusing and inconsistent.

3. The ambiguity between conditional-expression ifs and listcomp-filtering trailing ifs, which Python solves by outlawing the former (unless you add extra parens). This is confusing when you get a syntax error on the else, but there is no non-confusing solution except using non-conflicting syntax.
4. let. In Python you can write `for d in [f(b)]` but this is inefficient and borders on obfuscated code.
5. Tuple parenthesization. If the elements generated by your iteration are tuples, as they very often are, Python needs parentheses: [(i, c) for i, c in enumerate(s) if c in s]. That's because `[i, c` looks like the beginning of a list whose first two items are i and c. Again, you could resolve these conflicting partial parses in different ways, but all of them are confusing.
I don’t know all of its API, but I do read the docs periodically (not all of them, but I try to re-read modules I use a lot, and at least one that I don’t).
To me, that’s learning the language. Learning its library would be more like knowing that the module `string` contains the function `capwords`, which can be used to - as the name suggests - capitalize (to title case) all words in a string. Hopefully, one would also know that the `str` class contains the method `upper`, so as not to confuse it with `string.capwords`.
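A quick runnable illustration of the two stdlib spellings mentioned above:

    import string

    # Title-cases each word vs. uppercasing the whole string.
    print(string.capwords("hello world"))  # Hello World
    print("hello world".upper())           # HELLO WORLD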
I'd much rather optimize for understanding code. Give me the freedom to order such that the most important ideas are up front, whatever the important details are. I'd much rather spend 3x the time writing code of it means I spend half the time understanding it every time I return to it in the future.
Your editor can’t help you out as you write it.
You shouldn't need handholding when you're writing code. It seems like the whole premise of the author's argument is that you shouldn't learn anything about the language and programming should be reduced to choosing from an autocomplete menu and never thinking more than that. I've seen developers who (try to) work like this, and the quality of their work left much to be desired, to put it lightly.
From there you can eventually find fread, but you have no confidence that it was the best choice.
In C, you have to know ahead of time that fclose is a function that you’ll need to call once you’re done with the file.
It's called knowledge. With that sort of attitude, you're practically begging for AI to replace you.
No wonder people claim typing speed doesn't matter - they can barely think ahead one token, nevermind a statement or function, much less the whole design! Ideally your typing speed should become the bottleneck and you should be able to code "blind", without looking at the screen but merely outputting the code in your mind into the machine as fast as humanly possible. Instead we have barely-"developers" constantly chasing that next tiny dopamine hit of picking from an autocomplete menu. WTF!?
When this descent into mediocrity gets applauded, it's no surprise that so much "modern" software is the way it is.
Jokes aside, I think you're being too drastic. A good auto-complete is a nice feature, just like auto-indent, tab-complete, etc. Can it be abused? Sure. So what? Should we stop making it better for fear of abuse?
The reference to AI is far-fetched, too. We're talking about tools to help you with the syntax, not the semantic. I may forget if the function is called read, fread, or file_read, but I know what its effect is.
And finally, consider that if something is easier to parse for an editor, it most probably is for a human too. Not a rule, not working in 100% of cases, but usually exposing the user to the local context before the concept itself helps understanding.
If writing code is an automated process for you, you are also begging for AI to replace you. Just a more advanced one than the code-monkey in the OP.
No, you are bad because you use an editor with autocomplete.
And it's not even debatable, it's like playing bowling with rails, or riding a bicycle with training wheels. Sure you can argue that you are more efficient at bowling and riding a bike with those, but you are going to be arguing alone, and it's much better to realize that Python is one of the best languages at the moment and therefore one of the best languages ever, instead of being a nobody and complaining about a language because you are too encumbered by your own ego to realize that you are not as good a programmer as you thought.
Nothing wrong with being an amateur programmer or vibecoding or whatever, but if you come for the king you best not miss
import { MyClass } from './lib.ts'
First you need to type what to import and then from where. There is no linear way of discovering import options from the source of the imports without extra jumping around in the code.
Alternatively linear completion, or (TIL https://en.wikipedia.org/wiki/Progressive_disclosure) would be possible for imports of shape like:
from './lib.ts' import { MyClass }
Of course, the authors of the import syntax had good reasons (which I don't know) to build it the way they've built it.
len(list(filter(lambda line: all([abs(x) >= 1 and abs(x) <= 3 for x in line]) and (all([x > 0 for x in line]) or all([x < 0 for x in line])), diffs)))
Well, point taken about the ordering, but there are many more legible ways to have written that code. Not everything has to be a one-liner. Even:

    def f(diffs):
        cond = lambda line: all([1 <= abs(x) <= 3 for x in line]) and (
            all([x > 0 for x in line]) or all([x < 0 for x in line])
        )
        items = filter(cond, diffs)
        return len(list(items))
Similarly, Python list/dict/set comprehensions are a form of for-loop syntax sugar for easily creating a particular structure. One can use the built-in map to get exactly the same behavior as the Rust example.
If this was an all important text input microoptimization, we'd all be doing everything with pure functions like Lisp: yet somehow, functional languages are not the most popular even if they provide the highest syntax consistency.
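A small illustration of the equivalence claimed above (a comprehension vs. the built-in map/filter), using made-up data:

    # The comprehension and the map/filter spelling produce the same list.
    words = ["foo", "", "bar"]
    upper_a = [w.upper() for w in words if w]
    upper_b = list(map(str.upper, filter(None, words)))
    assert upper_a == upper_b  # ['FOO', 'BAR']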
It doesn't look anything like Lisp, though.
Although the subtitle was “programs should be valid as they are typed”, it’s weakened to “somewhat valid” at this point. And yes, it is valid enough that tooling can help, a lot of the time (but not all) at full capability. But there’s also interesting discussion to be had about environments where programs are valid as they are typed. Syntactically, especially, which requires (necessary but not sufficient) either eschewing delimition, or only inserting opening and closing delimiters together.
> Instead, you must know that functions releated to FILE tend to start with f, and when you type f the best your editor can do is show you all functions ever written that start with an f
Why do you think that this is a problem of C? No one is stopping your tools from searching for `fclose` by first parameter type when you write `file.`. Moreover, I know that CLion already does this.
To make a long story short, we added features for "incomplete" programs in the language and tools, so that your program was always valid and could not be invalid. It was a reasonable concept, and I think could have been a game changer if AI didn't first change the game.
from line in text.Split('\n') select line.Split(null)
For example, using "rm" on the command line, or an SQL "delete". I would very much like those short programs to be invalid, until someone provides more detail about what should be destroyed in a way that is accident-resistant.
If I had my 'druthers, the left-to-right prefix of "delete from table" would be invalid, and it would require "where true" as a safety mechanism.
Python offers an "extended" form of list comprehensions that lets you combine iteration over nested data structures.
The irony is that the extensions have a left-to-right order again, but because you have to awkwardly combine them with the rest of the clause that is still right-to-left, those comprehensions become completely unreadable unless you know exactly how they work.
E.g., consider a list of objects that themselves contain lists:
toolboxes = [
Box(tools=["hammer"]),
Box(tools=["wrench", "screwdriver"])
]
To get a list of lists of tools, you can use the normal comprehension: toolsets = [b.tools for b in toolboxes]
But to get a single flattened list, you'd have to do: tools = [t for b in toolboxes for t in b.tools]
Where the "t for b" looks utterly mystifying until you realize the "for" clauses are parsed left-to-right as [t (for b in toolboxes) (for t in b.tools)]
while the "t" at the beginning is parsed right-to-left and is evaluated last.for a in b:
for c in a:
use(a,b,c)
This would be[use(a,b,c) for a in b for c in a]
Everything stays the same except the "use" part that goes in the front (the rule also includes the filters - if).
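For instance, a filter keeps its loop position too; with the same placeholder names and an illustrative `c > 0` condition, the nested loops
for a in b:
    for c in a:
        if c > 0:
            use(a, b, c)
become [use(a, b, c) for a in b for c in a if c > 0].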
A syntax construct that requires you to think in a different syntax construct to understand it is not a good syntax construct.
> Everything stays the same except the "use" part that goes in the front (the rule also includes the filters - if).
Yeah, you consistently have to write the end first followed by the start and then the middle, it's just in most cases there's no middle so you naturally think you only have to flip the syntax rather than having to write it middle-ended like a US date.
len(list(filter(lambda line: all([abs(x) >= 1 and abs(x) <= 3 for x in line]) and (all([x > 0 for x in line]) or all([x < 0 for x in line])), diffs)))
This really isn’t fair on Python. Python very much isn’t designed for this style of functional programming. Plus you haven’t broken lines where you could. Rewrite it as a list comprehension and add line breaks, turn the inner list comprehensions into generator expressions (`all([…])` → `all(…)`), and change `abs(x) >= 1 and abs(x) <= 3` to `1 <= abs(x) <= 3` (thanks, Jtsummers), and it’s much better, though it still has the jumping around noted, and I do prefer the functional programming approach. I’m just saying the presentation isn’t fair on Python.
len([line for line in diffs
     if all(1 <= abs(x) <= 3 for x in line)
     and (all(x > 0 for x in line) or all(x < 0 for x in line))])
(Aside: change the first line to `sum(1 for line in diffs` and drop the final `]`, and it will probably perform better.)
I also want to note, in the JS… Math.abs(x) instead of x.abs() (as seen in Rust).
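Spelled out, the counting variant from the aside above would be something like this (same hypothetical `diffs`):
sum(1 for line in diffs
    if all(1 <= abs(x) <= 3 for x in line)
    and (all(x > 0 for x in line) or all(x < 0 for x in line)))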
And, because nerd sniping, two Rust implementations, one a direct port of the JS:
diffs.iter().filter(|line| {
    line.iter().all(|x| x.abs() >= 1 && x.abs() <= 3) &&
    (line.iter().all(|&x| x > 0) || line.iter().all(|&x| x < 0))
}).count()
(`x.abs() >= 1 && x.abs() <= 3` would be better as `(1..=3).contains(&x.abs())` or `matches!(x.abs(), 1..=3)`.)
And one optimised to only do a single pass:
diffs.iter().filter(|line| {
    let mut iter = line.iter();
    let range = match iter.next() {
        Some(-3..=-1) => -3..=-1,
        Some(1..=3) => 1..=3,
        Some(_) => return false,
        None => return true,
    };
    iter.all(|x| range.contains(x))
}).count()
abs(x) >= 1 and abs(x) <= 3
Is also unidiomatic in Python.
1 <= abs(x) <= 3
Means the same thing and tightens it up a bit, and reads better since it indicates more clearly that you're testing whether something is in a range.
EDIT: To add:
The filter, list construction, and len aren't needed either. It's just:
sum(map(predicate, diffs)) # this counts the number of elements in diffs which satisfy predicate, map is lazy so no big memory overhead
Or alternatively: sum(predicate(diff) for diff in diffs)
The predicate is complex enough and used twice, so it warrants extraction to its own named function (or lambda assigned to a variable), but even if it were still embedded this form would be slightly clearer (along with adding the line breaks and removing the extra list generations):
sum(map(lambda line: all(1 <= abs(x) <= 3 for x in line)
                     and (all(x > 0 for x in line) or all(x < 0 for x in line)),
        diffs))
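A minimal sketch of the extraction suggested above (the name `is_safe` is purely illustrative):
def is_safe(line):
    # every step between 1 and 3 in magnitude, and all steps sharing one sign
    return (all(1 <= abs(x) <= 3 for x in line)
            and (all(x > 0 for x in line) or all(x < 0 for x in line)))

result = sum(map(is_safe, diffs))  # True counts as 1, False as 0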
Verb-final languages like PostScript and Forth are oddballs.
Embedded verbs are a thing of course, with the largest contribution coming from arithmetic expressions with infix operators, followed by certain common syntax like "else" being in the middle of an "if" statement, followed by things like these Python comprehensions and <action> if <condition> syntactic experiments and whatnot.
table.where(...).select(...)
that is typical in many OO interfaces, though you sometimes see a pipe syntax which would be autocomplete friendly in languages like Clojure
https://clojuredocs.org/clojure.core/-%3E
or F#
https://stackoverflow.com/questions/12921197/why-does-the-pi...
Yes, I agree with the author: list comprehensions are readable, and, I'd add, practical.
> it gets worse as the complexity of the logic increases
len(list(filter(lambda line: all([abs(x) >= 1 and abs(x) <= 3 for x in line]) and (all([x > 0 for x in line]) or all([x < 0 for x in line])), diffs)))
Ok, well this is something that someone would be unlikely to write... unless they wanted to make a contrived example to prove a point. It would be written more like:
result = sum(my_contrived_condition(line) for line in diffs)
See also the Google Python style guide, which says not to do the kind of thing in the contrived example above: https://google.github.io/styleguide/pyguide.html
(Surely in any language it's possible to write very bad confusing code, using some feature of the language...)
And note:
x = [line.split() for line in text.splitlines()]
^- list comprehension is just a convenient shorthand for a `for loop`, i.e.:
x = []
for line in text.splitlines():
    x.append(line.split())
Just moving the `line.split()` to the front and removing the empty list creation and append. But you still have to define line without autocomplete.
That would suggest the file object needs to be refactored and split.
> What's next, a fuzzy finding search box for all possible functions? Contextually relevant ones based on the code you've already written?
IDEs already provide both those options.
However, while I agree, I do want to challenge you to explain/show if and why this is the case (I know that you wrote "suggest"):
> That would suggest the file object needs to be refactored and split.
If you have dozens of methods then that might be an indicator that your object is too large in size and thus consumes too much memory even for simple tasks.
It also might be a symptom that you’re trying to do too much with that object so perhaps some methods should be generalised, if possible.
Or maybe the object is already too generalised and what you actually need is a more specific object that inherits some of the initial object's design?
It might be that some of those methods should have been private, or even functions.
Ultimately, it could be an indicator that the object is either badly designed or has evolved to the point that it’s now ready for a refactor.
But it could also be the optimal way to approach that problem in some situations. However, it’s a good point to pause and reflect on whether the pain/reward threshold has been crossed.
Sometimes it is called a fluent-interface in other languages.
Where've you heard it called that? I've normally heard tacit programming
Could you elaborate? AFAIK tacit programming tends to involve juggling composition, parentheses, and arguments, which makes left-to-right reading significantly harder for functions with arity greater than 2.
I find Java's method reference or Rust's namespace resolution + function as an argument much better than Haskell tacit-style for left-to-right reading.
When it's OO, it's a virtue that everyone loves - a "fluent interface".
When it's FP - oh it's unreadable! Why don't they just break every line out with an intermediate variable so I know what's going on!
If/else if fails the relocation principle across many languages, since the first branch must be `if`, and the middle ones `else if`. Switch tends to pass.
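A tiny sketch of that failure in Python (the branches are hypothetical): swapping the order of the branches also forces the keywords to change, because the first one must be `if`.
if kind == "a":     # first branch: must be "if"
    handle_a()
elif kind == "b":   # middle branches: must be "elif"
    handle_b()
# moving the "b" branch to the top means rewriting it as "if" and demoting the other to "elif"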
Languages that don't allow trailing commas also fail
For the record though, "it's more readable" is a much better argument than "LSP mad"
I'm not really a fan of list comprehensions; I usually just use for loops. It does seem consistent with Python's syntax though. A for loop is `for item in items` and comprehensions have 'item for item in items'.
let primeSquares = collect:
  for n in 1..100:
    if n.isPrime():
      n * n
In Clojure I love the threading macro which accomplishes the same: (-> (h) (g) (f))
https://hackage-content.haskell.org/package/flow-2.0.0.9/doc...
arg
& f1
& f2
& f3
(f . g) x
is equivalent to f (g x)
uint16_t (MyStruct* s) some_func() { .. }
uint16_t MyStruct_some_func(MyStruct* s) { .. }
(I guess I should just use C++... but C++ is overwhelming in many ways)
"Count length of string s" -> LLM -> correct syntax for string-count for any programming language. This is the perfect context-length for an LLM. But note that you don't "complete the line", you tell the LLM what you want to have done in full (very isolated) context, instead of having it guessing.
If you already know how to compute `len` on some arbitrary syntax soup then the difference is just a minor annoyance where you have to jump back in your editor, add a function call and some punctuation, and jump back to where you were to add some closing punctuation. It's so fast you'd never bother with an LLM, so despite real and meaningful differences existing the LLM discussion point isn't relevant.
If you don't know how to compute `len` on some arbitrary syntax soup, I don't see how crafting an ideal prompt in a "full (very isolated) context" is ever faster than tab-completing things which look like "count" or "len."
Here's an example I found https://github.com/gleam-lang/example-todomvc/blob/main/src/...
> Here, your program is constructed left to right. The first time you type line is the declaration of the variable. As soon as you type line., your editor is able to suggest available methods.
Yeah, having LSP autocomplete here does feel nice.
But it also makes the code harder to scan than Python. Quick readability at a glance seems like the bigger win than just better autocomplete.
It depends a lot on what you’re accustomed to. You get used to whichever style. Just like different languages use different sentence order: subject, object and verb appear in all possible orders in different languages, and their speakers get along just fine. There are some situations where one is clearly superior to the other, and vice versa.
`text.lines().map(|line| line.split_whitespace())` can be read loosely as “take text; take its lines; map each line, split it on whitespace”. Straightforward and matching execution flow.
`[line.split() for line in text.splitlines()]` doesn’t read so elegantly left-to-right, but so long as it’s small enough you spot the `for` token, realise you’re dealing with a list comprehension, and read it loosely from left to right as “we have a list made up of splitting each line, where lines come from text, split”. Execution-wise, you execute `text.splitlines()`, then `for line in`, then `line.split()`. It’s a bunch of left-to-rights embedded in a right-to-left. This has long been noted as a hazard of list comprehensions, especially the confusion you end up with with nested ones. Now you could quibble over my division of `for line in text.splitlines()` into two runs; but I think it’s fair. Consider how in Rust you get both `for line in text.split_lines() { … }` and `text.split_lines().for_each(|line| { … })`. Sometimes the for block reads better, sometimes .for_each() or .map() or whatever does. (But map(lambda …: …, …) never really does.)
Python was my preferred language from 2009–2013 and I still use it not infrequently, but Rust has been my preferred language ever since. I can say: I find the Rust version significantly easier to read, in this particular case. I think the fact there are two levels of split contributes to this.
.unwrap().unwrap().into::<&&&Vec<Box<&&&&&mut String>>>()
which kind of ruins the elegance :P
The Rust and Python code are not equivalent: the Python code eagerly produces the nested list. Rust's map does not iterate over the list given to it; it only produces an iterator that you then have to drain. To make them equivalent, you need to add collect calls.
... which adds more typing, because then collect needs to know the types you want to collect into. To make the Rust code fully equivalent with the Python version, you need to do:
let words_on_lines: Vec<Vec<&str>> = text.lines().map(|line| line.split_whitespace().collect()).collect();
// alternatively, you can put the type arguments into the .collect() calls:
let words_on_lines = text.lines().map(|line| line.split_whitespace().collect::<Vec<_>>()).collect::<Vec<_>>();
// which approaches the level of noise you expect from rust,
// but stylistically I much prefer having the type declaration near the definition.
But of course this example also highlights Rust's power. Often (usually) you don't need to do that, and you can instead use the iterator directly, saving all the intermediate allocations. With Rust, I can often write the kind of code I do in Python, but without a single heap allocation. Also, it doesn't bind you to a single collection type that ships with the language; if you want to, you can easily use something like a vec with a small-vector optimization instead.
But it definitely does inherit Perl's mantle as executable line noise.
Instead of writing out a turbofish both times, I’d probably leave the first unannotated and put `::<Vec<Vec<_>>>` on the second one.
Must not be a vim user.
sometimes you want to think inside out
it's the difference between imperative thinking and declarative
both are good, use each when applicable
“Shut up old man” yes yes okay.
Stair steps could be 450 mm high and work, but building codes make them 200 mm for a reason. And you are not "better" by saying that "I am fit enough to climb 450 mm steps, and you are all lazy for wanting stairs built to ergonomic standards".
This way, as soon as you type "1", it is truly one, you add a "2" and you know that's adding twenty...
IIRC Arabic gets this right!
/s
The idea is that we write `foo.bar`, where `foo` is in scope, exactly because `foo` is in scope. We don't write `bar from foo` or whatever, because it would be hard to reverse search the things that have `bar` in them.
But which is more likely: will `foo` be Liskov-substitutable for `bar`, or will other things that contain a `bar` have a `bar` of a Liskov-substitutable type?
Which is to say, the author is depending on a particularly idiosyncratic meaning of "valid".
That said, the problem the author describes at the beginning can be easily worked around, if you like this kind of workflow:
words_on_lines = [st # 'str' is a plausible autocompletion
words_on_lines = [str.sp # must be either 'split' or 'splitlines', unless 'str' is shadowed
words_on_lines = [str.split(line) for # 'line in' can be suggested
words_on_lines = [str.split(line) for line in text.splitlines()] # IDE can automatically check type and refactor the idiom
And it also isn't at all true that IDEs work left to right. They're constantly auto-typing the balancing close parentheses/braces/brackets for me and it's never clear to me what the intended flow is for moving past what was automatically typed, or whether manually typing that bracket explicitly (since I'm in my own "automatic typing") will double it up or not. There's nothing preventing the IDE from expecting you to type the clauses of the comprehension in a different order and I wouldn't at all be surprised to hear that someone has already implemented this.
I believe there are some strongly typed stack-based languages where you really always do have something very close to a syntactically correct program as you type. But now that LLMs exist to paper over our awful intuitions, we're stuck with bad syntax like Python forever.
On one level, I do prefer my code to be readable left to right and top to bottom. This, typically, means the big "narrative" functions up top and any supporting functions will come after they were used. Ideally, you could read those as "details" after you have understood the overall flow.
On another level, though, it isn't like this is how most things are done. Yes, you want a general flow that makes sense in one direction through text. But this often has major compromises and is not the norm. Directly to programming, trying to make things context free is just not something that works in life.
Directly to this discussion, I'm just not sure how much I care about small context-free parts of the code?
Tangential to this discussion, I oddly hate comprehensions in Python. I have yet to get to where I can type those directly. And, though I'm asking if LLM tools are making this OBE, I don't use those myself. :(
I've had some minor success with claude, but enabling the AI plugin in intellij has literally made my experience worse, even without using any AI interactions.