I've found the book "Foundations of Multidimensional and Metric Data Structures" by Hanan Samet to be an excellent resource when looking for a slightly deeper dive than a more introductory algorithms course. It goes in depth on the nuances of these approaches, many of which are highly similar at a cursory glance.
With fixed point recursion, this allows for very tiny definitions of IFS fractals. For example, fractals like the Sierpinski triangle/carpet only require ~50 bit of binary lambda calculus [1] [2]!
[0]: https://text.marvinborner.de/2024-03-25-02.html
[1]: https://lambda-screen.marvinborner.de/?term=ERoc0CrYLYA%3D
[2]: https://lambda-screen.marvinborner.de/?term=QcCqqttsFtsI0OaA
Aren't Quadtrees covered by almost all basic data-structure books? It is the most simple form of taking the binary tree into the next (2D) dimension.
In short, it seems relatively few people have the skills to implement them and even fewer have the skills to come up with functional versions of ordinary data structures. They are almost completely absent from university lectures as well, as far as I am aware. For example for AVL trees I could only find a document from ETH from a lecture, that no longer exists or is taught. The language is Isabel and I need to understand its syntax first, before being able to translate it to Scheme.
If anyone has an obscure source for implementations one can learn from and implement oneself in another language, please share.
Edit: lol, downvoted for this post. Never change, HN.
And for a z-curve, the order is basically a depth-first traversal of a quadtree.
https://forceflow.be/2013/10/07/morton-encodingdecoding-thro...
Here's an example from AWS, where lat/long pairs are put into a Z-index, which is used as a DynamoDB sort key, letting you efficiently query for items near a point.
https://aws.amazon.com/blogs/database/z-order-indexing-for-m...
points.sort(key=lambda p: sum(((p[0]>>i&1)<<(2*i))|((p[1]>>i&1)<<(2*i+1)) for i in range(16)))https://proc0.github.io/HackerZen (it's also open source)
https://www.lindelystables.dk/en/posts/functional-quadtree-c...
Got very confused! I challenge the HN hivemind to figure out what's going on.
Well, I guess it was the first AI-first browser, hence all this bs. I uninstalled it months ago...
Quad trees are abstract until you see what they look like. It’s a clever method to partition 2D points.
(Kd trees are even better)
What does that mean?
And then maybe even juxtapose that with a linear search example, which also numbers every step. I bet this would make it really click for some people. And for free the user can also play with how a linear search can sometimes be faster when they just want the first element!
As a bonus: allow the user to change the cell count so they can really feel just how each method scales!
I do have a semi-unrelated question though: does using the recursive approach prevent it from being calculated efficiently on the GPU/compute shaders? Not that it matters; plenty of value in a CPU-bound version of a solution and especially one that is easy to understand when recursive. I was just wondering why the prominent examples used a non-recursive approach, but then I was like "oh, because they expect you to use them on the GPU". ...and then I was like "wait, is that why?"
Most code that I saw that used quadtrees were treating things as points and storing them only at the lowest level.
I also made mine auto-divide by counting items that are entirely in a quadrant as they are added to the node, with allocate and split triggered if a count went above a certain threshold.
Anything novel or oopsie?
Quadtrees might look like natural generalization of binary trees, but some things that work very efficiently in binary trees don't work for naive generalization of binary trees into quadtrees. For a full binary tree with W leaves any segment can be described using log(W) nodes. Almost every use of binary tree more interesting than maintaining sorted list of numbers depends on this property. What happens in 2d with Quadtree, how many of quadtree nodes are necessary to exactly describe arbitrary rectangular area? Turn's out you get O(W+H) nodes, not log(N), not log(W)*log(H).
If you stored a rectangle in smallest node that completely contains that avoids the W+H problem during insertion operations, but during read operations you may end up with situation where all the information is stored at the root. This is no better having no quadtree at all and storing everything in plain list that has no special order. Such worse case scenario can be created very easily if every rectangle contains centre of area described by quadtree, then all rectangles will be placed at root.
Dynamically subdividing and or allocating tree nodes on demand isn't anything particularity novel. If anything the cases where you can use fixed sized static trees outside textbook examples is somewhat a minority. Not saying there are no such cases there are enough of them, but large fraction of practical systems need to be capable of scaling for arbitrary amount of data which get's added and removed over the time and total size is not known ahead of time and the systems need to be able to deal arbitrary user data that has maliciously crafted worst case distribution. Every B-tree will split leaves into smaller nodes when necessary. Almost every self balancing binary tree can be considered of doing automatic division just with threshold of no more than 1 item per node. For 2d many uses cases of k-d tree will also subdivide on demand.
Similar to yours I split the leaf nodes when they became too full, with some simple fixed threshold defining "full". Only leaf nodes contained references to the indexed objects, and all overlapping leaf nodes would reference the objects they overlapped. Search queries were done also using an AABB, and would iteratively invoke a callback for all overlapping object AABBs found in the overlapping leaf nodes. IIRC the object references hanging off the leaf nodes had a fixed number of linked list slots to be put on a results list during a search, to deduplicate the results before iterating that list with the provided candidate-found callback. Since any given indexed object could be on many leaf nodes in the index, if they spanned a large area shared with a high density of other indexed objects for instance.
It was up to the callback to do the narrow-phase collision detection / control the results list iterating stop vs. continue via return value.
I recall one of the annoyances of sticking the search results linked list entry in the object references hanging off the leaf nodes was it set a limit to the number of simultaneous searches one could perform against the index. Basically the game would initialize the index with a fixed maximum concurrent number of searches to handle, and that set the number of results-linked-list slots the object references would be allocated to accommodate. As long as that was never exceeded it worked great.
It's been a while so I may have gotten it wrong, but that sounds right to me.
Not sure what the most common approaches are...
The C source is @ https://git.pengaru.com/cgit/libix2/.git/tree/src/ix2.c
That being said most ray tracing seems to do bounding volume hierarchies now, so maybe a bvh is the best way to deal with things that have volume.
This is the most pedantic way of saying "binary 2-tuples" I've ever seen. Also for quadtrees this is inferior to base 4 because you can assume clockwise (or counter) ordering.
I don't think this is something the article cares about, though.