Show HN: An extension to track your Wikipedia adventures
178 points | 16 days ago | 22 comments | chromewebstore.google.com | HN
Wiki Journey tracks your daily Wikipedia rabbit holes in a tree format.

Available on Firefox and Chrome: https://addons.mozilla.org/en-US/firefox/addon/wiki-journey/ https://chromewebstore.google.com/detail/wiki-journey/lehenb...

It's open source, feel free to contribute! https://github.com/demegire/wiki-journey

IncreasePosts
16 days ago
[-]
I wrote a plugin just like this, and every day, I have it present me with a quiz based on summaries of the first paragraph of the pages I read over the day.

Basically, I was reading way too much Wikipedia and not actually storing much information, so I have the extension shame me if I don't remember what I read.
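
A minimal sketch of how such a daily quiz could be generated, not the commenter's actual plugin: it assumes the first-paragraph summaries have already been collected into a dictionary (the `pages_read_today` data and the `make_cloze_card` helper are hypothetical), and it simply blanks out the page title in its own summary.

```python
import re

def make_cloze_card(title: str, summary: str) -> dict:
    """Blank out the page title inside its own summary to form a question."""
    # Case-insensitive match so "Directed acyclic graph" also hits "directed acyclic graph".
    pattern = re.compile(re.escape(title), re.IGNORECASE)
    question = pattern.sub("_____", summary)
    return {"question": question, "answer": title}

# Hypothetical data: page titles mapped to first-paragraph summaries read today.
pages_read_today = {
    "Directed acyclic graph": "A directed acyclic graph is a directed graph with no directed cycles.",
}

quiz = [make_cloze_card(title, summary) for title, summary in pages_read_today.items()]

for card in quiz:
    print(card["question"], "->", card["answer"])
```

A real version would need to fetch the summaries and handle titles that never appear verbatim in the text; this only shows the card-building step.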

reply
cooper_ganglia
16 days ago
[-]
That's genius. Have you published this as an extension? I'd love automatically-written flashcards to quiz myself on what I've read that day...
reply
timcobb
16 days ago
[-]
reply
nullindividual
15 days ago
[-]
We absorb the words in front of our eyes even if we're not conscious of it. A topic that you glossed over may come up in another context and remind you of that wiki article.

It shapes who we are.

And sometimes knowledge of the existence of a topic is valuable.

reply
dr_kiszonka
15 days ago
[-]
I remember seeing an article about it on HN.
reply
graypegg
15 days ago
[-]
I would love to mess around with this if you've published it somewhere! Folks would love it as a Show HN I bet too!
reply
phailhaus
16 days ago
[-]
Have you tried out a tangled-tree visualization? [1] I've found it to be super useful when visualizing these sorts of relationships in a compact way, and it naturally sorts the data topologically.

[1] https://observablehq.com/@nitaku/tangled-tree-visualization-...

reply
throwaway444441
16 days ago
[-]
Very cool! One small point of pedantry:

> A tree with multiple inheritance (sometimes called tangled tree) cannot be represented by using a classic tree visualization. It is technically a directed acyclic graph (DAG) with one (or more) nodes identified as root.

What is the difference between a DAG and a tangled tree? Isn't any DAG a tangled tree? I don't see immediately why a new definition is required.

reply
S33V
16 days ago
[-]
I'm not entirely familiar with tangled trees, but it seems like one of the larger differences is that a tangled tree isn't necessarily acyclic. For this example, someone could navigate away from one page, but potentially be linked back to it later down the adventure.
reply
throwaway444441
15 days ago
[-]
> A tree with multiple inheritance (sometimes called tangled tree)

By the author's definition, multiple inheritance prohibits cycles. DAGs can be modeled as a tree with back edges to non-ancestors. So I'm pretty sure tangled tree = DAG.

> For this example, someone could navigate away from one page, but potentially be linked back to it later down the adventure.

Good point, maybe "tangled tree with back edges to ancestors" is the really correct model for what the author wants. The key point of the visualization is to highlight the deviation from a standard DAG or tree.
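
To make the distinction concrete, here is a small sketch under stated assumptions: a browsing session is given as a list of (from, to) link clicks, and the helper names `find_roots` and `has_cycle` are made up for illustration. A session with no cycle and at least one root fits the "tangled tree" description quoted above; a link back to an ancestor introduces a cycle, so the graph is no longer a DAG.

```python
from collections import defaultdict

def find_roots(edges):
    """Return nodes that no edge points to (candidate roots)."""
    nodes = {n for edge in edges for n in edge}
    targets = {dst for _, dst in edges}
    return sorted(nodes - targets)

def has_cycle(edges):
    """Depth-first search with three colors; reaching a gray node means a cycle."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE (0)

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                return True  # edge back to a node still on the current path
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    nodes = {n for edge in edges for n in edge}
    return any(color[n] == WHITE and visit(n) for n in nodes)

# Hypothetical session: two routes reach the same page, but nothing links back.
tangled = [("Graph", "Tree"), ("Graph", "DAG"),
           ("Tree", "Topological sort"), ("DAG", "Topological sort")]
# Same session plus a link back to the starting page.
with_loop = tangled + [("Topological sort", "Graph")]

print(find_roots(tangled), has_cycle(tangled))
print(find_roots(with_loop), has_cycle(with_loop))
```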

reply
phailhaus
14 days ago
[-]
The author already says that:

> It is technically a directed acyclic graph (DAG)

But DAGs don't have 'roots', they just have nodes. The concept of roots makes it a tangled tree.

reply
wiseowise
15 days ago
[-]
Is there source code for the visualization?
reply
phailhaus
15 days ago
[-]
That's a live notebook! If you click on the cells, you can see the code that was used to create it, like a Jupyter notebook.
reply
wiseowise
14 days ago
[-]
Ah, thanks! Wasn't that obvious on mobile.
reply
jack_riminton
16 days ago
[-]
This looks really neat
reply
bloopernova
16 days ago
[-]
I feel like there's a lot of knowledge or information that we're "leaving on the plate". For instance, the sites we visit, the files we edit, the branches and PRs we create, etc etc. All of that is related, but it feels like that context is being lost or discarded.

An example might be: I have to include new AWS resources in a deployment, so I look up information about them, find examples and read about potential problems, security information, etc etc. That then becomes edits in a terraform file somewhere, with a Jira ticket, my own knowledge database (Emacs org-roam files in my case, Obsidian etc for other people). Then the feature branch gets a PR to dev, we might discuss changes in Teams (ugh) or a meeting. All of that seems ripe to be linked together conceptually, but the computer has no way to do that.

It makes me wonder if that could be fed into the right machine learning thing to at least start tracking this sort of work stuff. Heck just synchronizing my Firefox bookmarks (ff lets you tag your bookmarks) with my org-roam instance's tags would be useful. Tagged files in my knowledge base could be automatically linked to similarly tagged bookmarks.
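
The tag-matching part of this needs no machine learning at all. As a rough sketch, assuming the bookmark tags and note tags have already been exported into plain dictionaries (the file names, URLs, and the `link_by_tags` helper below are all hypothetical, and no real Firefox or org-roam API is used):

```python
from collections import defaultdict

def link_by_tags(notes, bookmarks):
    """Map each note to the bookmarks that share at least one tag with it."""
    # Invert the bookmarks: tag -> set of URLs carrying that tag.
    urls_by_tag = defaultdict(set)
    for url, tags in bookmarks.items():
        for tag in tags:
            urls_by_tag[tag].add(url)

    links = {}
    for note, tags in notes.items():
        related = set()
        for tag in tags:
            related |= urls_by_tag.get(tag, set())
        links[note] = sorted(related)
    return links

# Hypothetical exports of note tags and bookmark tags.
notes = {
    "aws-deployment.org": {"aws", "terraform"},
    "reading-list.org": {"books"},
}
bookmarks = {
    "https://example.com/terraform-guide": {"terraform"},
    "https://example.com/aws-security": {"aws", "security"},
    "https://example.com/recipes": {"cooking"},
}

links = link_by_tags(notes, bookmarks)
print(links)
```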

reply
surfingdino
16 days ago
[-]
I like these pieces of my digital footprint to not be connected. There is no need to track everything.
reply
idle_zealot
16 days ago
[-]
Do you not want them connected, or do you not want the connections shared and potentially used against you?
reply
surfingdino
15 days ago
[-]
I like them not to be collected or connected. I don't trust those collecting such data.
reply
ranger207
16 days ago
[-]
I typically write up all of that in my documentation somewhere. Stuff like "first thoughts are this approach might work, talked to person who had this idea, looked at this link and found this info, decided to go with this approach because of factors x, y, z". This isn't the primary user-facing documentation but a subpage or something that's helpful a couple of years down the line

It's like a book titled "A History of [Object]" that traces what solved problems before the object, issues with old solutions, the emotional, financial, etc state of the inventor, why they chose this solution over that one, how the object was adopted and improved afterwards, other inventions spawned off the object, etc. Capturing the history of the object requires capturing the context around the object too

reply
joshuahutt
15 days ago
[-]
My thoughts on this are to slow down and document and explore that knowledge and information. If it is really valuable, the "loss" in efficiency from slowing down will be offset by the gain in skill/utility from really grokking the stuff.

If it's not...then there's really nothing "left" on the table; if it ever turns out to be valuable, you'll probably come across it again, when needed.

I constantly get a similar feeling. I'm speeding around from task to task, just grasping enough to get the current task done so I can get to the next one and the next one...

And somehow this is value-creating? Apparently it is, but it seems almost accidental, at that rate.

I'd rather slow down and appreciate the value as it moves through me, into whatever I'm doing.

I usually get more from the process, at the same time.

reply
joshuahutt
15 days ago
[-]
It's like...if "less is more," then "more is less."

Reminds me of a floating point number. The bigger or smaller they get, the less accurate they become.

If you're chunking on a ton of data and tasks, you're getting less out of it. At a certain point, none of it even seems to enter your brain at all.

reply
sslayer
15 days ago
[-]
Basically, this is what college should be teaching you - how to research. What good are useless facts? I don't want to walk around cluttered with a dictionary - I want to know where to look in that dictionary. Obviously in the sciences there are facts that you should know, but even with math, it's more about how to derive the formula than actually memorizing it. I mean, they're called "Research Papers", right?
reply
joshuahutt
10 days ago
[-]
Totally agree. I remember the phrase “learning how to think” being thrown around.

I also remember not being explicitly taught that.

It sort of seems like trying to find enlightenment by chopping wood and carrying water at a monastery.

If critical thinking is something that spontaneously emerges in a learning environment, maybe we shouldn’t sell it as a benefit. “Some students experience deep insight into the nature of the mind. Results not typical.”

reply
steezeburger
16 days ago
[-]
I've been thinking of something like this since LLMs became popular. I've toyed around with some proof of concepts, but haven't had the time or motivation to work on it lately. I love the idea of tagging everything and showing connections when you're searching for things. Also semantic search would be great, like "blue website with information about databases I read last week" would be super powerful in my opinion.

I really love the idea of digital knowledge bases, but as you said, I think we're leaving a lot on the table. I need to get back to my project of a user-owned-data knowledge base.

reply
jskherman
16 days ago
[-]
What kind of approach did you take? I was thinking along the lines of requiring something like rewind.ai or some program that autoscreenshots your screen at a set interval (or originally a recorded video split into several images later) and having a vision-capable model (particularly specialized in UIs) describe this set of images in order to build a dataset of images-tags-description and the like.
reply
jskherman
16 days ago
[-]
There are also libraries like trafilatura in Python, featured here on HN some time ago, that could extract content from websites to help augment the data.
reply
bongodongobob
16 days ago
[-]
I've had similar thoughts but over time you'd just end up with a private copy of the internet. You'll still have to search for the information anyway, so I'm not sure what the benefit is. Searching your knowledge base for "the thing I did yesterday" vs "how to sync Azure to AD" seems basically equivalent to me. You're just creating yet another thing to search.
reply
bloopernova
16 days ago
[-]
That's a good point, you'd absolutely want to get away from adding another burden to the human.

Seeing relevant bookmarks when I'm viewing a specific note in my database could be useful though. And finding pull requests related to a subject might also be useful.

So the idea would be to reduce the number of searches performed by the human. Automate and enhance rather than dump and forget.

reply
eichin
16 days ago
[-]
Yeah, but your private copy would be more like "The internet: The Good Parts" (assuming you had a way to not store what you immediately dismissed as garbage; maybe only include pages with a dwell time of 15-30s or more.) That's enormously valuable (and why I've implemented it before - but in conkeror, which didn't survive the death of xulrunner - so now I use pinboard and text files and logseq, which are pretty good but a lot more work.)
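
A minimal sketch of the dwell-time filter described above, assuming visits have already been logged with a measured dwell time. The `Visit` record and `worth_keeping` helper are hypothetical, and the 15-second default is just the lower bound mentioned in the comment.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    url: str
    dwell_seconds: float  # how long the page stayed in focus

def worth_keeping(visits, min_dwell=15.0):
    """Keep only URLs of pages that were viewed for at least min_dwell seconds."""
    return [v.url for v in visits if v.dwell_seconds >= min_dwell]

# Hypothetical browsing log.
visits = [
    Visit("https://en.wikipedia.org/wiki/Tree_(graph_theory)", 42.0),
    Visit("https://example.com/popup", 2.5),
]

kept = worth_keeping(visits)
print(kept)
```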
reply
happypumpkin
16 days ago
[-]
To whatever extent something like this can be done locally, I'd probably pay a monthly sub for it if it's good enough. But I wouldn't want any of that leaving my machine, we get tracked and profiled enough as-is imo.
reply
bloopernova
16 days ago
[-]
Yeah, this is worth at least as much as Kagi or Copilot is to me right now.
reply
_boffin_
16 days ago
[-]
Working on something like that, but there’s still a good amount of work to do
reply
bawolff
16 days ago
[-]
That's cool.

I do find it ironic though that wikipedia is one of the major sites with the least amount of user tracking, and then users decide to implement the tracking themselves.

reply
nullhole
16 days ago
[-]
That is funny, though this is more tracking-by-users than tracking-of-users
reply
BlueTemplar
16 days ago
[-]
reply
eichin
16 days ago
[-]
tracking for-the-benefit-of users, which only has to be done by the users because no services can be trusted :-)
reply
non-
16 days ago
[-]
This is cool, I love how it shows you all the branches you've followed in an actual tree diagram.

The concept reminds me of https://browser.horse/ a bit, which has the concept of "trails" that track any links you visit. Great for research projects.

reply
BlairCurrey
16 days ago
[-]
Cool tool. Might be cool to make something wikipedia agnostic. Sometimes I manually create such a thing via obsidian but it's kind of tedious. It's interesting how sometimes different starting sources read far apart in time lead to rabbitholes which cross paths.

This reminds me of a python scraper I wrote a while back when I was learning to program - Youtube rabbithole: https://github.com/BlairCurrey/youtube-rabbithole

It basically just follows the next recommended video, recording the path along the way. More about tracing the youtube algorithm than tracking your own journey.

reply
starkparker
16 days ago
[-]
Looking at https://github.com/demegire/wiki-journey/blob/main/firefox/c...

It seems likely that the extension could be customized to any Mediawiki instance? As an admin I'd love to be able to use it elsewhere. This looks like it could be a great tool working with test users on stuff like information architecture, to see the path of how they found information. (I know there are better tools for that, but something that focuses tightly on wiki interactions would be useful to me.)

reply
KaiMagnus
16 days ago
[-]
That’s a very cool project and I wish something like this would exist for all websites.

A few years ago I did a university project where we looked into (internet) research and how information discovery and gathering could be improved. (https://www.kaimagnus.de/projects/halo)

There we had the concept of a similar looking tree. Users could then come back to their exploration and take notes, prioritize and sort.

It was only a concept back then, so it’s nice to see it in action.

reply
jskherman
16 days ago
[-]
Similarly, is there perchance also a browser extension that shows a tree graph or a directed node graph, like in Obsidian, for the sequence of websites in your browser history, so you can see your whole rabbit hole on the Internet? I'm pretty sure the tech is already used by the advertising industry.
reply
CalRobert
15 days ago
[-]
Suddenly I am reminded, for the first time in maybe 2 decades, that "surfing the internet" was once a term used specifically for this kind of rabbit-holing
reply
steezeburger
16 days ago
[-]
This is really cool! It would be super neat if the nodes were more interconnected, forming a fully connected graph rather than just a tree.
reply
sixo
16 days ago
[-]
This is tracking the user's trajectory through the site, necessarily a tree, not the network structure of W itself.
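
One way to see why a visit log can be drawn as a tree even when pages repeat: if every visit becomes its own node with exactly one parent (the page whose link was clicked), a return to an earlier page shows up as a new node further down rather than as a loop. The sketch below illustrates that idea only; it is not taken from the Wiki Journey source, and `VisitNode`, `build_journey`, and `render` are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class VisitNode:
    title: str
    children: list = field(default_factory=list)

def build_journey(clicks):
    """Build a forest from ordered (from_title, to_title) clicks.

    A from_title of None, or one not seen before, starts a new root.
    """
    roots = []
    latest = {}  # title -> most recent VisitNode for that title
    for src, dst in clicks:
        node = VisitNode(dst)  # every visit gets a fresh node
        parent = latest.get(src)
        if parent is None:
            roots.append(node)
        else:
            parent.children.append(node)
        latest[dst] = node
    return roots

def render(node, depth=0):
    """Return the subtree as indented lines of text."""
    lines = ["  " * depth + node.title]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines

# Hypothetical session that returns to "Tree" after visiting "Graph".
clicks = [(None, "Tree"), ("Tree", "Graph"), ("Graph", "Tree"), ("Tree", "Forest")]
roots = build_journey(clicks)
print("\n".join(render(roots[0])))
```

The second "Tree" appears as a child of "Graph", so the structure stays a tree; drawing the revisit as an edge back to the first "Tree" node instead would give the cyclic graph discussed in the replies.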
reply
random3
16 days ago
[-]
How is the user journey through the site necessarily a tree? What prevents the user from creating loops through their journey?
reply
lyk2005
15 days ago
[-]
Not an absolute statement, just that it resembles a tree more closely as you branch off clicking on hyperlinks.
reply
steezeburger
16 days ago
[-]
Oo, yeah, that's a good point! I totally see why it was done this way now.

Though I do still think it would be cool to have a toggleable overlay or something that shows the cyclic connections!

reply
phailhaus
16 days ago
[-]
That would be technically more "accurate", but it doesn't yield more useful information and ends up being harder to read.
reply
ldayley
16 days ago
[-]
Interesting. I’ve been using the Wikipedia iOS app (which saves history by the day) to keep track of my personal rabbit hole journeys…
reply
russdpale
16 days ago
[-]
This is one of those ideas where you think "why the hell didn't I think of this?"
reply
RockRobotRock
15 days ago
[-]
POV: It's 4 AM and I still can't fall asleep
reply
jsunderland323
16 days ago
[-]
This is great! Will give it a try later
reply
kcarter80
16 days ago
[-]
Narrator: they didn't.
reply
KaiMagnus
16 days ago
[-]
Have to admit I'm slightly disappointed that the FF version only shows two users still and one of them is me.
reply
krylon
15 days ago
[-]
I didn't know I needed this.
reply
noashavit
16 days ago
[-]
A graph of Wikipedia rabbit holes
reply
serenayakgun
15 days ago
[-]
wow, this is great
reply
nathell
16 days ago
[-]
Obligatory xkcd: https://xkcd.com/214/
reply
AdmiralAsshat
16 days ago
[-]
My Wikipedia searches are like my porn searches: no one needs to know about them, least of all myself. They bring only shame and remorse.
reply
xcdzvyn
16 days ago
[-]
This is fantastic. Great idea!
reply
nirmel
16 days ago
[-]
I'll mention that I made what could be described as an AI-generated Wikipedia alternative, where you can generate articles on anything with text links on terms that link to new articles that get generated considering the context of the article path that got you there. I reckon Wiki-enthusiasts won't be disappointed: https://anylearn.ai
reply
dcsan
15 days ago
[-]
Awesome!
reply