I also implemented a spreadsheet last year [0] in pure TypeScript, with the fun twist that formulas also update backwards. While the backwards root finding algorithm was challenging, I also found it incredibly humbling to discover how much complexity there is in the UX of the simple spreadsheet interface. Handling selection states, reactive updates, detecting cycles of dependency and gracefully recovering from them is a massive state machine programming challenge! Very fun project with a lot of depth!
I myself didn't hand roll my own parser but used Ohm-js [1] which I highly recommend if you want to parse a custom language in Javascript or TypeScript.
> One way of doing this is to keep track of all dependencies between the cells and trigger updates when necessary. Maintaining a dependency graph would give us the most efficient updates, but it’s often an overkill for a spreadsheet.
On that subject, figuring out the efficient way to do it is also a large engineering challenge, and is definitely not overkill but absolutely required for a modern spreadsheet implementation. There is a good description of how Excel does it in this famous paper "Build systems a la carte" paper, which interestingly takes on a spreadsheet as a build system [2].
[0] https://victorpoughon.github.io/bidicalc/
[2] https://www.microsoft.com/en-us/research/wp-content/uploads/...
But like I said I am not sure that I know what I am talking about and I may be confusing backwards calculation with algebraic engines. I would love for algebra solvers to be a first class object in more languages.
Take, for example, backprop in machine learning. The model operates forwards. Then you solve backwards to figure out how to update the terms.
You'd be surprised. It really depends on how you define the problem and what your goal is. My goal with bidicalc what to find ONE solution. This makes the problem somewhat possible since when there are an infinity of solution, the goal is just to converge to one. For example solving 100 = X + Y with both X and Y unknown sounds impossible in general, but finding one solution is not so difficult. The idea is that any further constraint that would help choose between the many solutions should be expressed by the user in the spreadsheet itself, rather than hardcoded in the backwards solver.
> What kind of problems can you solve backwardly?
This is the weakness of the project honestly! I made it because I was obsessed with the idea and wanted it to exist, not because I was driven by any use case. You can load some premade examples in the app, but I haven't found any killer use case for it yet. I'm just glad it exists now. You can enter any arbitrary DAG of formulas, update any value, input or output, and everything will update upstream and downstream from your edit and remain valid. That's just extremely satisfying to me.
#define MAXIN 128 // max cell input length
enum { EMPTY, NUM, LABEL, FORMULA }; // cell types
struct cell {
int type;
float val;
char text[MAXIN]; // raw user input
};
#define NCOL 26 // max number of columns (A..Z)
#define NROW 50 // max number of rows
struct grid {
struct cell cells[NCOL][NROW];
};
I doubt that 171 KB of static allocation would fly on an Apple II! I do wonder how they did memory allocation, it must have been tricky with all the fragmentation.> The basic approach was to allocate memory into fixed chunks so that we wouldn't have a problem with the kind of breakage that occurs with irregular allocation. Deallocating a cell freed up 100% of its storage. Thus a given spreadsheet would take up the same amount of space no matter how it was created. I presumed that the spreadsheet would normally be compact and in the upper left (low number rows and cells) so used a vector of rows vectors. The chunks were also called cells so I had to be careful about terminology to avoid confusion. Internally the term "cell" always meant storage cell. These cells were allocated from one direction and the vectors from the other. When they collided the program reorganized the storage. It had to do this in place since there was no room left at that point -- after all that's why we had to do the reorganization.
> The actual representation was variable length with each element prefixed by a varying length type indicator. In order to avoid having most code parse the formula the last by was marked $ff (or 0xff in today's representation). It turned out that valid cell references at the edges of the sheet looked like this and created some interesting bugs.
It leaves out a lot of details - if you're skimping enough you could allocate variable length row vectors, but it seems they wanted to avoid variable length allocations, in which case you could start with a 255 byte array pointing to which subsequent equal-sized chunk represents each in-use row. You'd need at most 126 bytes per row in actual use to point into the chunks representing the cell contents. But this is just guesses.
[1] https://www.landley.net/history/mirror/apple2/implementingvi... and https://news.ycombinator.com/item?id=34303825
It's not overkill at all. In fact, it's absolutely necessary for all but the simplest toy examples.
It would seem that the creators of VisiCalc regarded this is a choice that made sense in the context of the limitations of the Apple ][, but agree that a dependency graph would have been better.
https://www.landley.net/history/mirror/apple2/implementingvi...
Edit: It's also interesting that the tradeoff here is put in terms of correctness, not performance as in the posted article. And that makes sense: Consider a spreadsheet with =B2 in A1 and =B1 in B2. Now change the value of B1. If you recalc the sheet in row-column OR column-row order, B2 will update to match B1, but A1 will now be incorrect! You need to evaluate twice to fully resolve the dependency graph.
There are many common spreadsheet use cases that don't involve complicated dependency trees.
Though I think the definition of the parser struct should be
struct parser {
const char* s;
const char* p;
struct grid* g;
};
based on the rest of the code.You can also literally run Lotus 123 if you want. Someone has binaries to make it work on linux. or under dosemu
Edit: Quite painless! Opened some test xlsx files without issue. Did get a stack trace on a very complicated one, so when I have time I'll try and dig in deeper. Added a doc to the wiki in case it's helpful to other: https://github.com/andmarti1424/sc-im/wiki/Building-sc%E2%80...
Pretty much anything that you used to do on paper with a columnar notebook or worksheet and a calculator, or anything that could be represented in tabular form could probably be implemented in VisiCalc, Lotus 123, and others. Spreadsheets are probably the most successful software application that was ever invented. Certainly one of the most.
After paying by writing out a check, I helped load everything into his car and he drove off into the sunset --- I was then allowed to choose a reformatted disk from the box as a reward and chose _The Softporn Adventure_ (which I then stupidly removed the label from, but it wasn't something I wanted to explain to my parents...).
Back then it was common for people to buy a whole system for their requirements. Hardware and software.
* Person who deals with numbers all day goes to a computer store to browse.
* He sees VisiCalc, and immediately understands what it can do. It *blows his mind*.
* He wants to buy it right away. Pays for $2000 Apple II computer with disk drives to run $100 software; price is no object.
* Shows friends and colleagues.
* They rush to computer store. Repeat.
https://github.com/drclcomputers/GoSheet
https://github.com/xi/spreadsheet/
https://github.com/andmarti1424/sc-im
https://github.com/saulpw/visidata
https://github.com/bgreenwell/xleak
https://github.com/SamuelSchlesinger/tshts
https://github.com/CodeOne45/vex-tuiWith MS Edit resurrected similarly, I wonder how hard it would be to get a flushed out text based spreadsheet closer in function to MS Excel or Lotus 123 versions for DOS, but cross platform. Maybe even able to load/save a few different formats from CSV/TSV to XLSX (without OLE/COM embeds).