Show HN: An extension to track your Wikipedia adventures

178 points

by demegire

16 days ago

| 22 comments

| chromewebstore.google.com

Wiki Journey tracks your daily Wikipedia rabbit holes in a tree format.

Available on Firefox and Chrome: https://addons.mozilla.org/en-US/firefox/addon/wiki-journey/ https://chromewebstore.google.com/detail/wiki-journey/lehenb...

It's open source, feel free to contribute! https://github.com/demegire/wiki-journey
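For readers curious how a browsing session can be stored "in a tree format", here is a minimal sketch. It is not the extension's code: it assumes each visit is recorded as a (parent page, page) pair, and `build_tree` and `render` are hypothetical helper names used only for illustration.

```python
from collections import defaultdict

def build_tree(visits):
    """Group visited pages under the page they were reached from.

    `visits` is a list of (parent, page) pairs; parent is None for a
    page opened directly (a root of the day's journey).
    """
    children = defaultdict(list)
    seen = set()
    for parent, page in visits:
        if page in seen:
            # Keep only the first path to a page so the result stays a tree.
            continue
        seen.add(page)
        children[parent].append(page)
    return children

def render(children, node=None, depth=0):
    """Return the tree as indented lines, one page per line."""
    lines = []
    for child in children.get(node, []):
        lines.append("  " * depth + child)
        lines.extend(render(children, child, depth + 1))
    return lines

# Made-up sample session.
visits = [
    (None, "Tree (graph theory)"),
    ("Tree (graph theory)", "Directed acyclic graph"),
    ("Directed acyclic graph", "Topological sorting"),
    ("Tree (graph theory)", "Forest (graph theory)"),
    ("Topological sorting", "Tree (graph theory)"),  # revisit, ignored
]
print("\n".join(render(build_tree(visits))))
```

Dropping revisits is one possible design choice; it keeps the structure a strict tree at the cost of hiding loops in the journey.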

▲

IncreasePosts

16 days ago

[-]

I wrote a plugin just like this, and every day I have it present me with a quiz based on summaries of the first paragraphs of the pages I read that day.

Basically, I was reading way too much Wikipedia and not actually storing much information, so I have the extension shame me if I don't remember what I read.
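The plugin described here isn't published in this thread, so the following is only a guess at one simple way such a quiz could work: blank out the article title inside its own summary and ask for it back. The names `make_cloze` and `check` and the sample summary text are assumptions made up for this sketch.

```python
import re

def make_cloze(title, summary):
    """Blank out the article title in its own summary to form a question."""
    pattern = re.compile(re.escape(title), flags=re.IGNORECASE)
    question, count = pattern.subn("_____", summary)
    if count == 0:
        return None  # the title never appears verbatim; skip this page
    return {"question": question, "answer": title}

def check(card, guess):
    """Case-insensitive comparison of a guess against the stored answer."""
    return guess.strip().lower() == card["answer"].lower()

# Made-up sample data standing in for a page read today.
sample_title = "Topological sorting"
sample_summary = (
    "In graph theory, a topological sorting of a directed graph "
    "is a linear ordering of its vertices."
)

card = make_cloze(sample_title, sample_summary)
print(card["question"])
print(check(card, "topological sorting"))
```

A real version would need a source for the summaries and a smarter question format; this only shows the shape of the idea.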

▲

cooper_ganglia

16 days ago

[-]

That's genius. Have you published this as an extension? I'd love automatically-written flashcards to quiz myself on what I've read that day...

▲

timcobb

16 days ago

[-]

https://news.ycombinator.com/item?id=40151952 ;)

▲

nullindividual

15 days ago

[-]

We absorb the words in front of our eyes even if we're not conscious of it. A topic that you glossed over may come up in another context and remind you of that wiki article.

It shapes who we are.

And sometimes knowledge of the existence of a topic is valuable.

▲

dr_kiszonka

15 days ago

[-]

I remember seeing an article about it on HN.

▲

graypegg

15 days ago

[-]

I would love to mess around with this if you've published it somewhere! Folks would love it as a Show HN I bet too!

▲

phailhaus

16 days ago

[-]

Have you tried out a tangled-tree visualization? [1] I've found it to be super useful when visualizing these sorts of relationships in a compact way, and it naturally sorts the data topologically.

[1] https://observablehq.com/@nitaku/tangled-tree-visualization-...
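To make "sorts the data topologically" concrete, here is a small sketch of layering a DAG with Kahn's algorithm, so that every edge points from an earlier layer to a later one. This is not the notebook's code; `topological_layers` is a hypothetical name and the edge list is made up.

```python
def topological_layers(edges):
    """Split a DAG's nodes into layers; each edge goes from an earlier layer to a later one."""
    indegree = {}
    children = {}
    for src, dst in edges:
        for node in (src, dst):
            indegree.setdefault(node, 0)
            children.setdefault(node, [])
        children[src].append(dst)
        indegree[dst] += 1

    layer = sorted(n for n, d in indegree.items() if d == 0)
    layers = []
    placed = 0
    while layer:
        layers.append(layer)
        placed += len(layer)
        next_layer = []
        for node in layer:
            for child in children[node]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_layer.append(child)
        layer = sorted(next_layer)

    if placed != len(indegree):
        raise ValueError("graph has a cycle, so no topological order exists")
    return layers

# "D" has two parents, which a classic tree layout cannot show.
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]
print(topological_layers(edges))
```

Each inner list could be drawn as one column of the visualization, with shared children placed once and linked from all of their parents.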

▲

throwaway444441

16 days ago

[-]

Very cool! One small point of pedantry:

> A tree with multiple inheritance (sometimes called tangled tree) cannot be represented by using a classic tree visualization. It is technically a directed acyclic graph (DAG) with one (or more) nodes identified as root.

What is the difference between a DAG and a tangled tree? Isn't any DAG a tangled tree? I don't see immediately why a new definition is required.

▲

S33V

16 days ago

[-]

I'm not entirely familiar with tangled trees, but it seems like one of the larger differences is that a tangled tree isn't necessarily acyclic. For this example, someone could navigate away from one page, but potentially be linked back to it later down the adventure.

▲

throwaway444441

15 days ago

[-]

> A tree with multiple inheritance (sometimes called tangled tree)

By the author's definition, multiple inheritance prohibits cycles. DAGs can be modeled as a tree with back edges to non-ancestors. So I'm pretty sure tangled tree = DAG.

> For this example, someone could navigate away from one page, but potentially be linked back to it later down the adventure.

Good point, maybe "tangled tree with back edges to ancestors" is the really correct model for what the author wants. The key point of the visualization is to highlight the deviation from a standard DAG or tree.

▲

phailhaus

14 days ago

[-]

The author already says that:

> It is technically a directed acyclic graph (DAG)

But DAGs don't have 'roots', they just have nodes. The concept of roots makes it a tangled tree.
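The distinctions argued over in this subthread (roots, multiple parents, edges back to an ancestor) can be checked mechanically. A minimal sketch, assuming the graph is a dict that maps every node to its list of successors; `roots` and `back_edges` are illustrative names and the page graphs are made up.

```python
def roots(graph):
    """Nodes that no edge points to."""
    targets = {dst for dsts in graph.values() for dst in dsts}
    return sorted(node for node in graph if node not in targets)

def back_edges(graph):
    """Edges that point back to an ancestor on the current DFS path (each one closes a cycle)."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {node: WHITE for node in graph}
    found = []

    def visit(node):
        color[node] = GREY            # node is on the current path
        for nxt in graph[node]:
            if color[nxt] == GREY:    # nxt is an ancestor of node
                found.append((node, nxt))
            elif color[nxt] == WHITE:
                visit(nxt)
        color[node] = BLACK           # fully explored

    for node in graph:
        if color[node] == WHITE:
            visit(node)
    return found

# "Forest" has two parents but nothing links back: a rooted DAG.
tangled = {"Graph": ["Tree", "DAG"], "Tree": ["Forest"], "DAG": ["Forest"], "Forest": []}
# Same pages, but "Forest" links back to the starting page.
looped = {"Graph": ["Tree", "DAG"], "Tree": ["Forest"], "DAG": ["Forest"], "Forest": ["Graph"]}

print(roots(tangled), back_edges(tangled))
print(roots(looped), back_edges(looped))
```

An empty `back_edges` result means the graph is acyclic; a browsing journey that loops back to an earlier page shows up as at least one back edge.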

▲

wiseowise

15 days ago

[-]

Is there source code for the visualization?

▲

phailhaus

15 days ago

[-]

That's a live notebook! If you click on the cells, you can see the code that was used to create it, like a Jupyter notebook.

▲

wiseowise

14 days ago

[-]

Ah, thanks! Wasn't that obvious on mobile.

▲

jack_riminton

16 days ago

[-]

This looks really neat

▲

bloopernova

16 days ago

[-]

I feel like there's a lot of knowledge or information that we're "leaving on the plate". For instance, the sites we visit, the files we edit, the branches and PRs we create, etc etc. All of that is related, but it feels like that context is being lost or discarded.

An example might be: I have to include new AWS resources in a deployment, so I look up information about them, find examples and read about potential problems, security information, etc etc. That then becomes edits in a Terraform file somewhere, with a Jira ticket, my own knowledge database (Emacs org-roam files in my case, Obsidian etc for other people). Then the feature branch gets a PR to dev, we might discuss changes in Teams (ugh) or a meeting. All of that seems ripe to be linked together conceptually, but the computer has no way to do that.

It makes me wonder if that could be fed into the right machine learning thing to at least start tracking this sort of work stuff. Heck just synchronizing my Firefox bookmarks (ff lets you tag your bookmarks) with my org-roam instance's tags would be useful. Tagged files in my knowledge base could be automatically linked to similarly tagged bookmarks.
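The tag-matching part of this idea is simple once both sides are exported. A rough sketch, assuming bookmarks and notes have already been read into plain dicts of tags; it does not touch Firefox or org-roam data, and `link_by_tags`, the URLs, and the file names are made up for illustration.

```python
def link_by_tags(bookmarks, notes):
    """Map each note to the bookmarks that share at least one tag with it."""
    links = {}
    for note, note_tags in notes.items():
        matches = [
            url for url, tags in bookmarks.items()
            if set(tags) & set(note_tags)
        ]
        links[note] = sorted(matches)
    return links

# Hypothetical exported data.
bookmarks = {
    "https://example.com/s3-docs": ["aws", "s3"],
    "https://example.com/emacs-tips": ["emacs"],
}
notes = {
    "deploy-s3.org": ["aws", "terraform"],
    "editor-setup.org": ["emacs"],
    "meeting.org": ["misc"],
}

for note, urls in link_by_tags(bookmarks, notes).items():
    print(note, "->", urls)
```

The hard part in practice is the export and keeping tag vocabularies consistent, which this sketch takes as given.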

▲

surfingdino

16 days ago

[-]

I like these pieces of my digital footprint to not be connected. There is no need to track everything.

▲

idle_zealot

16 days ago

[-]

Do you not want them connected, or do you not want the connections shared and potentially used against you?

▲

surfingdino

15 days ago

[-]

I like them not to be collected or connected. I don't trust those collecting such data.

▲

ranger207

16 days ago

[-]

I typically write up all of that in my documentation somewhere. Stuff like "first thoughts are this approach might work, talked to person who had this idea, looked at this link and found this info, decided to go with this approach because of factors x, y, z". This isn't the primary user-facing documentation but a subpage or something that's helpful a couple of years down the line

It's like a book titled "A History of [Object]" that traces what solved problems before the object, issues with old solutions, the emotional, financial, etc state of the inventor, why they chose this solution over that one, how the object was adopted and improved afterwards, other inventions spawned off the object, etc. Capturing the history of the object requires capturing the context around the object too

▲

joshuahutt

15 days ago

[-]

My thoughts on this are to slow down and document and explore that knowledge and information. If it is really valuable, the "loss" in efficiency from slowing down will be offset by the gain in skill/utility from really grokking the stuff.

If it's not... then there's really nothing "left" on the table; if it ever turns out to be valuable, you'll probably come across it again, when needed.

I constantly get a similar feeling. I'm speeding around from task to task, just grasping enough to get the current task done so I can get to the next one and the next one...

And somehow this is value-creating? Apparently it is, but it seems almost accidental, at that rate.

I'd rather slow down and appreciate the value as it moves through me, into whatever I'm doing.

I usually get more from the process, at the same time.

▲

joshuahutt

15 days ago

[-]

It's like...if "less is more," then "more is less."

Reminds me of a floating point number. The bigger or smaller they get, the less accurate they become.

If you're chunking on a ton of data and tasks, you're getting less out of it. At a certain point, none of it even seems to enter your brain at all.

▲

sslayer

15 days ago

[-]

Basically, this is what college should be teaching you - how to research. What good are useless facts? I don't want to walk around cluttered with a dictionary - I want to know where to look in that dictionary. Obviously in the sciences there are facts that you should know, but even with math, it's more about how to derive the formula than actually memorizing it. I mean, they're called "Research Papers", right?

▲

joshuahutt

10 days ago

[-]

Totally agree. I remember the phrase “learning how to think” being thrown around.

I also remember not being explicitly taught that.

It sort of seems like trying to find enlightenment by chopping wood and carrying water at a monastery.

If critical thinking is something that spontaneously emerges in a learning environment, maybe we shouldn’t sell it as a benefit. “Some students experience deep insight into the nature of the mind. Results not typical.”

▲

steezeburger

16 days ago

[-]

I've been thinking of something like this since LLMs became popular. I've toyed around with some proof of concepts, but haven't had the time or motivation to work on it lately. I love the idea of tagging everything and showing connections when you're searching for things. Also semantic search would be great, like "blue website with information about databases I read last week" would be super powerful in my opinion.

I really love the idea of digital knowledge bases, but as you said, I think we're leaving a lot on the table. I need to get back to my project of a user-owned-data knowledge base.

▲

jskherman

16 days ago

[-]

What kind of approach did you take? I was thinking along the lines of requiring something like rewind.ai or some program that autoscreenshots your screen at a set interval (or originally a recorded video split into several images later) and having a vision-capable model (particularly one specialized in UIs) describe this set of images in order to build a dataset of images, tags, descriptions and the like.

▲

jskherman

16 days ago

[-]

There are also libraries like trafilatura in Python, featured here on HN some time ago, that could extract content from websites to help augment the data.

▲

bongodongobob

16 days ago

[-]

I've had similar thoughts but over time you'd just end up with a private copy of the internet. You'll still have to search for the information anyway, so I'm not sure what the benefit is. Searching your knowledge base for "the thing I did yesterday" vs "how to sync Azure to AD" seems basically equivalent to me. You're just creating yet another thing to search.

▲

bloopernova

16 days ago

[-]

That's a good point, you'd absolutely want to get away from adding another burden to the human.

Seeing relevant bookmarks when I'm viewing a specific note in my database could be useful though. And finding pull requests related to a subject might also be useful.

So the idea would be to reduce the number of searches performed by the human. Automate and enhance rather than dump and forget.

▲

eichin

16 days ago

[-]

Yeah, but your private copy would be more like "The internet: The Good Parts" (assuming you had a way to not store what you immediately dismissed as garbage; maybe only include pages with a dwell time of 15-30s or more.) That's enormously valuable (and why I've implemented it before - but in conkeror, which didn't survive the death of xulrunner - so now I use pinboard and text files and logseq, which are pretty good but a lot more work.)
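The dwell-time filter mentioned above is easy to sketch. This assumes each visit is logged with open and close timestamps; the `Visit` record, `worth_keeping`, and the sample URLs are hypothetical, and the 15-second default is just the lower bound suggested in the comment.

```python
from dataclasses import dataclass

@dataclass
class Visit:
    url: str
    opened_at: float  # seconds since some reference point
    closed_at: float

def worth_keeping(visits, min_dwell_seconds=15.0):
    """Keep only pages the reader stayed on for at least the threshold."""
    return [
        v.url for v in visits
        if v.closed_at - v.opened_at >= min_dwell_seconds
    ]

# Made-up sample log.
log = [
    Visit("https://example.com/clickbait", 0.0, 4.0),
    Visit("https://example.com/good-article", 10.0, 190.0),
    Visit("https://example.com/borderline", 200.0, 215.0),
]
print(worth_keeping(log))
```

Raising `min_dwell_seconds` toward 30 trades recall for a cleaner archive.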

▲

happypumpkin

16 days ago

[-]

To whatever extent something like this can be done locally, I'd probably pay a monthly sub for it if it's good enough. But I wouldn't want any of that leaving my machine; we get tracked and profiled enough as-is imo.

▲

bloopernova

16 days ago

[-]

Yeah, this is worth at least as much as Kagi or Copilot is to me right now.

▲

_boffin_

16 days ago

[-]

Working on something like that, but there’s still a good amount of work to do

▲

bawolff

16 days ago

[-]

That's cool.

I do find it ironic though that Wikipedia is one of the major sites with the least amount of user tracking, and then users decide to implement the tracking themselves.

▲

nullhole

16 days ago

[-]

That is funny, though this is more tracking-by-users than tracking-of-users

▲

BlueTemplar

16 days ago

[-]

reminded me of :

https://news.ycombinator.com/item?id=40191075

▲

eichin

16 days ago

[-]

tracking for-the-benefit-of users, which only has to be done by the users because no services can be trusted :-)

▲

non-

16 days ago

[-]

This is cool, I love how it shows you all the branches you've followed in an actual tree diagram.

The concept reminds me of https://browser.horse/ a bit, which has the concept of "trails" that track any links you visit. Great for research projects.

▲

BlairCurrey

16 days ago

[-]

Cool tool. Might be cool to make something Wikipedia-agnostic. Sometimes I manually create such a thing via Obsidian but it's kind of tedious. It's interesting how sometimes different starting sources read far apart in time lead to rabbitholes which cross paths.

This reminds me of a python scraper I wrote a while back when I was learning to program - Youtube rabbithole: https://github.com/BlairCurrey/youtube-rabbithole

It basically just follows the next recommended video, recording the path along the way. More about tracing the youtube algorithm than tracking your own journey.
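The "follow the next recommendation and record the path" loop can be shown without any scraping. This is a sketch of the general idea, not the code in the linked repository; `follow_recommendations` is a hypothetical name, and the lookup table stands in for whatever would fetch the real next video.

```python
def follow_recommendations(start, next_of, max_steps=10):
    """Follow the top recommendation repeatedly, recording the path taken.

    `next_of` is any callable returning the next item, or None if there is none.
    Stops on a dead end, a repeat, or after `max_steps` hops.
    """
    path = [start]
    seen = {start}
    current = start
    for _ in range(max_steps):
        nxt = next_of(current)
        if nxt is None or nxt in seen:
            break
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path

# Made-up recommendation table that eventually loops back to the start.
recommendations = {"video-a": "video-b", "video-b": "video-c", "video-c": "video-a"}
print(follow_recommendations("video-a", recommendations.get))
```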

▲

starkparker

16 days ago

[-]

Looking at https://github.com/demegire/wiki-journey/blob/main/firefox/c...

It seems likely that the extension could be customized to any Mediawiki instance? As an admin I'd love to be able to use it elsewhere. This looks like it could be a great tool working with test users on stuff like information architecture, to see the path of how they found information. (I know there are better tools for that, but something that focuses tightly on wiki interactions would be useful to me.)
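As a rough illustration of what "customized to any MediaWiki instance" would involve, the sketch below extracts a page title from a URL for a configurable host and article path. It is not taken from the extension. It assumes the common MediaWiki URL shapes (`/wiki/Title` and `index.php?title=Title`), and `page_title` plus the `wiki.example.org` host are made up.

```python
from urllib.parse import urlparse, parse_qs, unquote

def page_title(url, host, article_path="/wiki/"):
    """Return the page title for a URL on the given wiki host, or None."""
    parts = urlparse(url)
    if parts.hostname != host:
        return None
    if parts.path.startswith(article_path):
        # Short URL form: /wiki/Some_Title
        title = unquote(parts.path[len(article_path):])
    else:
        # Long form: /w/index.php?title=Some_Title
        title = parse_qs(parts.query).get("title", [""])[0]
    # MediaWiki uses underscores in URLs where titles have spaces.
    return title.replace("_", " ") or None

print(page_title("https://en.wikipedia.org/wiki/Directed_acyclic_graph", "en.wikipedia.org"))
print(page_title("https://wiki.example.org/w/index.php?title=Main_Page", "wiki.example.org"))
```

Since the article path is configurable per wiki, a real port would likely need `host` and `article_path` exposed as user settings.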

▲

KaiMagnus

16 days ago

[-]

That’s a very cool project and I wish something like this would exist for all websites.

A few years ago I did a university project where we looked into (internet) research and how information discovery and gathering could be improved. (https://www.kaimagnus.de/projects/halo)

There we had the concept of a similar looking tree. Users could then come back to their exploration and take notes, prioritize and sort.

It was only a concept back then, so it’s nice to see it in action.

▲

jskherman

16 days ago

[-]

Similarly, is there perchance also a browser extension that shows a tree graph or a directional node graph, like in Obsidian, for the sequence of websites in your browser history, so you can see your whole rabbit hole on the Internet? I'm pretty sure the tech is already used by the advertising industry.

▲

CalRobert

15 days ago

[-]

Suddenly I am reminded, for the first time in maybe 2 decades, that "surfing the internet" was once a term used specifically for this kind of rabbit-holing

▲

steezeburger

16 days ago

[-]

This is really cool! It would be super neat if the nodes were more interconnected, forming a fully connected graph rather than just a tree.

▲

sixo

16 days ago

[-]

This is tracking the user's trajectory through the site, necessarily a tree, not the network structure of Wikipedia itself.

▲

random3

16 days ago

[-]

How is the user journey through the site necessarily a tree? What prevents the user from creating loops in their journey?

▲

lyk2005

15 days ago

[-]

Not an absolute statement, just that it resembles a tree more closely as you branch off clicking on hyperlinks.

▲

steezeburger

16 days ago

[-]

Oo, yeah, that's a good point! I totally see why it was done this way now.

Though I do still think it would be cool to have a toggleable overlay or something that shows the cyclic connections!

▲

phailhaus

16 days ago

[-]

That would be technically more "accurate", but it doesn't yield more useful information and ends up being harder to read.

▲

ldayley

16 days ago

[-]

Interesting. I’ve been using the Wikipedia iOS app (which saves history by the day) to keep track of my personal rabbit hole journeys…

▲

russdpale

16 days ago

[-]

This is one of those ideas where you think "why the hell didn't I think of this?"

▲

RockRobotRock

15 days ago

[-]

POV: It's 4 AM and I still can't fall asleep

▲

jsunderland323

16 days ago

[-]

This is great! Will try to give it a try later.

▲

kcarter80

16 days ago

[-]

Narrator: they didn't.

▲

KaiMagnus

16 days ago

[-]

Have to admit I'm slightly disappointed that the FF version only shows two users still and one of them is me.

▲

krylon

15 days ago

[-]

I didn't know I needed this.

▲

noashavit

16 days ago

[-]

A graph of Wikipedia rabbit holes

▲

serenayakgun

15 days ago

[-]

wow, this is great

▲

nathell

16 days ago

[-]

Obligatory xkcd: https://xkcd.com/214/

▲

AdmiralAsshat

16 days ago

[-]

My Wikipedia searches are like my porn searches: no one needs to know about them, least of all myself. They bring only shame and remorse.

▲

xcdzvyn

16 days ago

[-]

This is fantastic. Great idea!

▲

nirmel

16 days ago

[-]

I'll mention that I made what could be described as an AI-generated Wikipedia alternative, where you can generate articles on anything, with text links on terms that lead to new articles generated in the context of the article path that got you there. I reckon Wiki-enthusiasts won't be disappointed: https://anylearn.ai

▲

dcsan

15 days ago

[-]

Awesome!