FilterHN

Functional Quadtrees

113 points

by lbj

10 hours ago

| past

| 11 comments

| lbjgruppen.com

| HN

▲

lemonwaterlime

8 hours ago

[-]

I like to do data-oriented programming, and was just thinking about how I want to organize (and search through) the primary data structures/concepts for a project I'm working on. Part of that involved thinking about things like what information I might cache and what representations data might take. That lead me to looking into the nuances of things like B-Trees, AVL Trees, Quadtrees, k-d trees and so forth.

I've found the book "Foundations of Multidimensional and Metric Data Structures" by Hanan Samet to be an excellent resource when looking for a slightly deeper dive than a more introductory algorithms course. It goes in depth on the nuances of these approaches, many of which are highly similar at a cursory glance.

▲

andoando

6 hours ago

[-]

Dammit why these books have to be $60

▲

incognito124

5 hours ago

[-]

Just ask anna if she has it in her archive

▲

dswalter

3 hours ago

[-]

I see USD $36 on ebay, used. It's a smaller barrier to entry.

▲

marvinborner

9 hours ago

[-]

Quadtrees are also quite useful for generating fractals. A very related project of mine, Lambda Screen [0], explores this by encoding these functional quadtrees directly in lambda calculus and rendering the structure based on Church booleans being true (white) or false (black).

With fixed point recursion, this allows for very tiny definitions of IFS fractals. For example, fractals like the Sierpinski triangle/carpet only require ~50 bit of binary lambda calculus [1] [2]!

[0]: https://text.marvinborner.de/2024-03-25-02.html

[1]: https://lambda-screen.marvinborner.de/?term=ERoc0CrYLYA%3D

[2]: https://lambda-screen.marvinborner.de/?term=QcCqqttsFtsI0OaA

▲

torusle

9 hours ago

[-]

"I could only find a couple tutorials/guides and both were imperative"

Aren't Quadtrees covered by almost all basic data-structure books? It is the most simple form of taking the binary tree into the next (2D) dimension.

▲

zelphirkalt

5 hours ago

[-]

The problem is rather, that most data structure tutorials and books don't even get the idea, to introduce a purely functional version, but merely state the imperative versions. Coming up with the functional versions of data structures can be difficult. Papers about it can be hard to understand and often require one to already know some niche language, that a researcher used for the paper. Even if you can find a functional implementation of a data structure, there is often not a good explanation and you need to, sort of, reverse engineer it and translate it to the language you are using.

In short, it seems relatively few people have the skills to implement them and even fewer have the skills to come up with functional versions of ordinary data structures. They are almost completely absent from university lectures as well, as far as I am aware. For example for AVL trees I could only find a document from ETH from a lecture, that no longer exists or is taught. The language is Isabel and I need to understand its syntax first, before being able to translate it to Scheme.

If anyone has an obscure source for implementations one can learn from and implement oneself in another language, please share.

▲

pixelpoet

9 hours ago

[-]

You can even build them with basically one line of code by sorting points using Morton / Z-curve order. It's linear time if you use a counting/radix sort.

Edit: lol, downvoted for this post. Never change, HN.

▲

rdtsc

4 hours ago

[-]

Yeah good point, they are downsides for sure but it's a simple enough approach and most of all it can be shoved in a database (or b-tree or any 1d-sorted data structure).

And for a z-curve, the order is basically a depth-first traversal of a quadtree.

▲

quibono

9 hours ago

[-]

I'd love to see that. Could you link me to an implementation or explain this in more detail please?

▲

_jackdk_

2 hours ago

[-]

Here's a 3D version used in the creation of sparse voxel octrees:

https://forceflow.be/2013/10/07/morton-encodingdecoding-thro...

Here's an example from AWS, where lat/long pairs are put into a Z-index, which is used as a DynamoDB sort key, letting you efficiently query for items near a point.

https://aws.amazon.com/blogs/database/z-order-indexing-for-m...

▲

johnisgood

7 hours ago

[-]

I would like to know about this more, too. Is there a code anywhere, ideally with comments? But I am fine without comments, too, I would just like to see the code and possibly with an example usage.

▲

quibono

4 hours ago

[-]

Okay, I was intrigued and I did some digging. Morton / Z-order is all about interleaving the individual bits of the x and y coordinates. You end up grouping by quadrants. Python one liner:

    points.sort(key=lambda p: sum(((p[0]>>i&1)<<(2*i))|((p[1]>>i&1)<<(2*i+1)) for i in range(16)))

▲

pavlov

9 hours ago

[-]

It’s super easy to click the downvote button by accident on mobile when you meant to upvote. And this UI will never be fixed because this is HN after all.

▲

proc0

6 hours ago

[-]

I just implemented an HN UI. This is good feedback as I'm aiming to have a mobile friendly web version.

https://proc0.github.io/HackerZen (it's also open source)

▲

craftkiller

9 hours ago

[-]

This is precisely the reason that I do not log in to HN on my phone. My phone is read-only and if I want to upvote or comment then I have to switch to my laptop. Pretty easy with firefox because I can send tabs to other devices.

▲

jasonjmcghee

8 hours ago

[-]

Zoom in or unvote/revote

▲

acters

7 hours ago

[-]

That is why I like harmonic app, there is an invite button separating the upvote and downvote. Never going to have this kind of issue

▲

OisinMoran

10 hours ago

[-]

Neat! Weirdly sending this article from my phone (Pixel 8) to my browser (Arc) via Pushbullet resulted in an incredibly strange bug that it loads this site instead:

https://www.lindelystables.dk/en/posts/functional-quadtree-c...

Got very confused! I challenge the HN hivemind to figure out what's going on.

▲

mutkach

10 hours ago

[-]

I remember Arc randomly rewriting my bookmarks using some kind of summarization model or something like that, it also sometimes changed the name of downloaded files reinterpreting their names. Maybe it is related somehow.

Well, I guess it was the first AI-first browser, hence all this bs. I uninstalled it months ago...

▲

OisinMoran

9 hours ago

[-]

Yeah that feature was nice at the start then got very annoying. Still like the browser though. Weirdly enough chrome on my phone has been reporting the wrong URLs, usually one I've just been on.

▲

lbj

9 hours ago

[-]

Apologies! I think I might have a found an eager redirect on the server. I haven't been able to reproduce, but you're not the first to report it. I hope it's fixed now.

▲

runemadsen

10 hours ago

[-]

We just did a whole visual identity around the quadtree concept. Take a scroll on this one! https://trace.systems/

▲

wenc

6 hours ago

[-]

That is a cool visualization of a quad tree. I use quadtrees in geospatial applications (to partition lat longs) but this is the time I’ve seen it used to render a photo.

Quad trees are abstract until you see what they look like. It’s a clever method to partition 2D points.

(Kd trees are even better)

▲

CyberDildonics

6 hours ago

[-]

a whole visual identity around the quadtree concept

What does that mean?

▲

dwb

9 hours ago

[-]

That all sounds incredibly dystopian, ugh.

▲

Waterluvian

10 hours ago

[-]

I love the visualization, which gave me an idea: what if we numbered every "Looking at" step in the visualization? Then it's obvious just how many search steps it takes.

And then maybe even juxtapose that with a linear search example, which also numbers every step. I bet this would make it really click for some people. And for free the user can also play with how a linear search can sometimes be faster when they just want the first element!

As a bonus: allow the user to change the cell count so they can really feel just how each method scales!

▲

catapart

3 hours ago

[-]

I appreciate that this writeup takes care to call out use cases when they help with understanding!

I do have a semi-unrelated question though: does using the recursive approach prevent it from being calculated efficiently on the GPU/compute shaders? Not that it matters; plenty of value in a CPU-bound version of a solution and especially one that is easy to understand when recursive. I was just wondering why the prominent examples used a non-recursive approach, but then I was like "oh, because they expect you to use them on the GPU". ...and then I was like "wait, is that why?"

▲

djmips

2 hours ago

[-]

Non recursive approaches can help on the CPU as well. It's just easy and elegant to do function recusion but not necessarily faster.

▲

willvarfar

9 hours ago

[-]

A general quadtree implementation question that puzzled me when I was implementing it myself for hobby games was: do you store a rectangle in the smallest node that completely contains it?

Most code that I saw that used quadtrees were treating things as points and storing them only at the lowest level.

I also made mine auto-divide by counting items that are entirely in a quadrant as they are added to the node, with allocate and split triggered if a count went above a certain threshold.

Anything novel or oopsie?

▲

Karliss

5 hours ago

[-]

One of the reasons you mostly saw point operations and very few rectangle operations is because quadtrees aren't great for range operations.

Quadtrees might look like natural generalization of binary trees, but some things that work very efficiently in binary trees don't work for naive generalization of binary trees into quadtrees. For a full binary tree with W leaves any segment can be described using log(W) nodes. Almost every use of binary tree more interesting than maintaining sorted list of numbers depends on this property. What happens in 2d with Quadtree, how many of quadtree nodes are necessary to exactly describe arbitrary rectangular area? Turn's out you get O(W+H) nodes, not log(N), not log(W)*log(H).

If you stored a rectangle in smallest node that completely contains that avoids the W+H problem during insertion operations, but during read operations you may end up with situation where all the information is stored at the root. This is no better having no quadtree at all and storing everything in plain list that has no special order. Such worse case scenario can be created very easily if every rectangle contains centre of area described by quadtree, then all rectangles will be placed at root.

Dynamically subdividing and or allocating tree nodes on demand isn't anything particularity novel. If anything the cases where you can use fixed sized static trees outside textbook examples is somewhat a minority. Not saying there are no such cases there are enough of them, but large fraction of practical systems need to be capable of scaling for arbitrary amount of data which get's added and removed over the time and total size is not known ahead of time and the systems need to be able to deal arbitrary user data that has maliciously crafted worst case distribution. Every B-tree will split leaves into smaller nodes when necessary. Almost every self balancing binary tree can be considered of doing automatic division just with threshold of no more than 1 item per node. For 2d many uses cases of k-d tree will also subdivide on demand.

▲

pengaru

5 hours ago

[-]

When I made a quadtree for some simple 2d games I handled AABB areas, since it was aimed at broad-phase collision detection of what were essentially sprites just rendered w/GL.

Similar to yours I split the leaf nodes when they became too full, with some simple fixed threshold defining "full". Only leaf nodes contained references to the indexed objects, and all overlapping leaf nodes would reference the objects they overlapped. Search queries were done also using an AABB, and would iteratively invoke a callback for all overlapping object AABBs found in the overlapping leaf nodes. IIRC the object references hanging off the leaf nodes had a fixed number of linked list slots to be put on a results list during a search, to deduplicate the results before iterating that list with the provided candidate-found callback. Since any given indexed object could be on many leaf nodes in the index, if they spanned a large area shared with a high density of other indexed objects for instance.

It was up to the callback to do the narrow-phase collision detection / control the results list iterating stop vs. continue via return value.

I recall one of the annoyances of sticking the search results linked list entry in the object references hanging off the leaf nodes was it set a limit to the number of simultaneous searches one could perform against the index. Basically the game would initialize the index with a fixed maximum concurrent number of searches to handle, and that set the number of results-linked-list slots the object references would be allocated to accommodate. As long as that was never exceeded it worked great.

It's been a while so I may have gotten it wrong, but that sounds right to me.

Not sure what the most common approaches are...

The C source is @ https://git.pengaru.com/cgit/libix2/.git/tree/src/ix2.c

▲

CyberDildonics

6 hours ago

[-]

In 3D this has to be dealt with in the form of polygons and I think it was common when people were using acceleration grids and kd-trees to split the polygons so they fit neatly.

That being said most ray tracing seems to do bounding volume hierarchies now, so maybe a bvh is the best way to deal with things that have volume.

▲

wiz21c

10 hours ago

[-]

I think it is weird to have two cells divided downto their smallest size when my cursor clearly occupies only one of them, not two.

▲

senderista

8 hours ago

[-]

I wish the article had made it clearer that quadtree positions are encoded as strings over the alphabet on 2 bits (similarly, octrees use the alphabet over 3 bits). This makes storing keys and lexicographically comparing them very simple.

▲

almostgotcaught

7 hours ago

[-]

> positions are encoded as strings over the alphabet on 2 bits

This is the most pedantic way of saying "binary 2-tuples" I've ever seen. Also for quadtrees this is inferior to base 4 because you can assume clockwise (or counter) ordering.

▲

mkehrt

4 hours ago

[-]

I don't think that's what they meant. It's the case you can use literal strings of bits to encode a (2^n)-tree node, so you use actual bitstring comparisons and operations to manipulate them. Rightshift gives you the parent and things like that.

I don't think this is something the article cares about, though.

▲

enigma101

5 hours ago

[-]

The language hurts the eyes