FilterHN

Show HN: Grab a Random ArXiv Paper

13 points

by jegp

6 months ago

| 2 comments

| jepedersen.dk

I needed a way to grab a random paper from arXiv, so I built one and wanted to share it with you.

It 1) picks a random topic (from all the cs.*, econ.*, math.*, etc. topics), 2) finds the total number of papers in that topic, and 3) queries for a random paper in that topic.

Note that this skews the distribution heavily in favor of topics that are less common, but it should get the job done. Suggestions for improvements are welcome.
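For illustration, the three steps could be sketched roughly as below. This is only a minimal sketch, not the site's actual code: the topic names and paper counts are made-up placeholders, and `pick_paper_two_stage` and `per_paper_probability` are hypothetical helper names. It returns a paper index within a topic instead of querying arXiv.

```python
import random
from fractions import Fraction

# Hypothetical per-topic paper counts; real values would come from arXiv.
PAPER_COUNTS = {"cs.LG": 1000, "math.AG": 100, "econ.TH": 10}

def pick_paper_two_stage(counts, rng=random):
    """Pick a topic uniformly, then a paper index uniformly within it."""
    topic = rng.choice(sorted(counts))   # step 1: random topic
    max_papers = counts[topic]           # step 2: number of papers in that topic
    index = rng.randrange(max_papers)    # step 3: random paper within the topic
    return topic, index

def per_paper_probability(counts, topic):
    """Chance that one specific paper from `topic` is the one returned."""
    return Fraction(1, len(counts)) * Fraction(1, counts[topic])
```

With these placeholder counts, a given econ.TH paper is drawn with probability 1/30 and a given cs.LG paper with probability 1/3000, which is the skew toward less common topics mentioned above.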

dginev

6 months ago

One can also grab a random arXiv paper in HTML (via ar5iv), if that was desired.

Just visit:

https://ar5iv.labs.arxiv.org/feeling_lucky

jegp

6 months ago

Is the code for the "feeling lucky" selection mechanism open? Or, do you know how they select papers at random?

dginev

6 months ago

Sure, the repository is open source (linked from the front page).

The selection did not get much thought at all: it is just a Rust rand shuffle over all IDs, performed at the first page visit and then cached: https://docs.rs/rand/latest/rand/seq/trait.SliceRandom.html#...

I had all IDs already computed for the previous/next article navigation feature, so it seemed fun to reuse them.
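As a rough Python analogue of the shuffle-once-then-cache idea (the real code is Rust and is not reproduced here), something like the following could work. The names `shuffled_ids` and `feeling_lucky` are hypothetical, and stepping through the cached order by a visit counter is an assumption made only for this sketch.

```python
import random

_cached_order = None  # filled on the first call, reused afterwards

def shuffled_ids(all_ids, rng=random):
    """Shuffle the full ID list once and cache the resulting order."""
    global _cached_order
    if _cached_order is None:
        order = list(all_ids)   # copy so the caller's list is left untouched
        rng.shuffle(order)      # in-place shuffle of the copy
        _cached_order = order
    return _cached_order

def feeling_lucky(all_ids, visit_number, rng=random):
    """Assumed usage: walk through the cached order, wrapping around at the end."""
    order = shuffled_ids(all_ids, rng)
    return order[visit_number % len(order)]
```

Because every ID appears exactly once in the shuffled list, each paper is equally likely to land in any position, with no per-topic bias.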

jegp

6 months ago

Thanks for this! If I had known this existed, I wouldn't have built the page myself.

vhantz

6 months ago

> Note that this skews the distribution heavily in favor of topics that are less common, but it should get the job done. Suggestions for improvements are welcome.

You could weight each topic by its number of papers when picking the topic, which would make the distribution over papers uniform.
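A minimal sketch of that weighting, assuming the per-topic counts are available (the counts and the names `pick_paper_weighted` and `per_paper_probability_weighted` are placeholders, not taken from the actual site):

```python
import random
from fractions import Fraction

# Hypothetical per-topic paper counts; real values would come from arXiv.
PAPER_COUNTS = {"cs.LG": 1000, "math.AG": 100, "econ.TH": 10}

def pick_paper_weighted(counts, rng=random):
    """Pick a topic with probability proportional to its size, then a paper in it."""
    topics = sorted(counts)
    weights = [counts[t] for t in topics]
    topic = rng.choices(topics, weights=weights, k=1)[0]
    index = rng.randrange(counts[topic])
    return topic, index

def per_paper_probability_weighted(counts, topic):
    """Chance that one specific paper from `topic` is the one returned."""
    total = sum(counts.values())
    return Fraction(counts[topic], total) * Fraction(1, counts[topic])
```

The topic weight count/total cancels against the 1/count chance within the topic, so every paper ends up with probability 1/total regardless of its topic.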

jegp

6 months ago

That's a great point. I thought about something similar, but I also realized the arXiv numbers are growing like crazy, so I wonder how long it would take for the (hardcoded) numbers to go out of date. One could of course add some kind of cronjob to update the numbers, but that sounds like a lot of work...