Ask HN: What are your worst pain points when dealing with scientific literature?
7 points | 2 months ago | 5 comments | HN
My background is mainly in computer science / software engineering, but I've been working with scientists (mainly biologists) for most of my career. One of the biggest areas of friction and frustration I've encountered seems to be around how to effectively extract value from the literature -- whether it be raw data in a form that can be reused, or higher-level knowledge (e.g., how do you consume multiple studies about a given topic and synthesize them into a coherent mental model to inform your own research?).

I'm interested in building tools to improve this, so I'd really appreciate hearing what you all think are the biggest challenges in this area. And if there are existing tools that you consider "secret weapons" for doing better at science, I'd love to hear about those as well.

kingkongjaffa
2 months ago
Some kind of academic CRM where you can start a new 'project' with some keywords and it assembles highly cited works and the authors of the works.

From there you can recursively search through the bibliography of the seminal works and the works that cite the seminal work to build a research map.
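
A minimal sketch of that recursive search, assuming the citation data is already available as plain dictionaries. The paper IDs, the `REFERENCES` map, and the `build_research_map` helper are all hypothetical and not taken from any real service; the walk is a plain breadth-first search that follows both bibliography and "cited by" links out to a fixed depth.

```python
from collections import deque

# Hypothetical toy data: paper ID -> IDs of the papers it cites.
REFERENCES = {
    "seminal": ["origin"],
    "followup_a": ["seminal"],
    "followup_b": ["seminal", "followup_a"],
    "origin": [],
}


def invert(references):
    """Build the reverse map: paper ID -> IDs of the papers that cite it."""
    cited_by = {paper: [] for paper in references}
    for paper, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, []).append(paper)
    return cited_by


def build_research_map(seed, references, max_depth=2):
    """Breadth-first walk over bibliography and cited-by links from a seed.

    Returns {paper_id: number of hops from the seed}, capped at max_depth.
    """
    cited_by = invert(references)
    depth = {seed: 0}
    queue = deque([seed])
    while queue:
        paper = queue.popleft()
        if depth[paper] == max_depth:
            continue  # do not expand beyond the requested depth
        neighbours = references.get(paper, []) + cited_by.get(paper, [])
        for other in neighbours:
            if other not in depth:
                depth[other] = depth[paper] + 1
                queue.append(other)
    return depth


research_map = build_research_map("seminal", REFERENCES, max_depth=1)
print(research_map)
```

Capping the depth matters in practice, since following citations in both directions grows the map very quickly.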

When researching different fields you often end up finding a) who the top researchers are, whose work you then want to read in full, and b) who is currently working on the thing in $current_year, whom you might want to contact and talk to.

For example, when it comes to internal combustion engine research, Heywood is the man: https://scholar.google.co.uk/scholar?hl=en&as_sdt=0%2C5&q=JB...

(Sadly, most cutting-edge research is locked away inside the automotive companies.)

Or in computational fluid dynamics the 'entry point' to the field is basically JD Anderson.

In both cases you're like 6 degrees of separation away from the cutting edge in several micro topics of active research.

> synthesize them into a coherent mental model to inform your own research

For the mental model there's no real way around sitting and reading a bunch of papers. I basically taught myself how to read papers efficiently and then read papers every day (often dead ends, which can be quickly discounted).

kingkongjaffa
2 months ago
The worst pain point, however, will always be lack of access to research unless you are at a university that pays for all of the hundreds of journals.
rkwz
2 months ago
> Some kind of academic CRM where you can start a new 'project' with some keywords and it assembles highly cited works and the authors of the works. From there you can recursively search through the bibliography of the seminal works and the works that cite the seminal work to build a research map. When researching different fields you often end up finding a) who the top researchers are and then you want to go read all of their stuff. b) who is currently working on the thing in $current_year who you might want to contact and talk to.

Yes, this would be great. I would also like a way to trace back to the "origin" papers on which other papers build, and, if you don't understand a term or concept, a way to search for papers that explain it better.

solardev
2 months ago
Not a professional, but as someone with a science degree and a casual interest in reading papers now and then, I wish there were:

1) A better (cheaper) way to access them. It doesn't necessarily have to be free as in SciHub, but there's no way I'm going to pay $80 as an individual to read one paper.

2) An easy way to summarize them, ask questions about them, etc. Google's NotebookLM (https://notebooklm.google.com/) is actually decent at this... upload a PDF and you can ask it questions about that content with minimal hallucination and citations back to the source. However, it's buggy (some files just never finish loading, others won't accept any prompt at all). And it's probably another short-lived experiment soon to meet the Google Graveyard :(

I would be willing to pay maybe $10-$20/mo for a service that can do both (provide Netflix-like access to papers, and also use an LLM to summarize them and answer questions). Bonus points if it can do its own meta-analysis of multiple related papers and easily summarize them.

I suspect journal publishers would be heavily resistant to any of that. Probably a more technical workaround would be a web browser extension that uses public/school library logins to fetch papers on the client side and then mirror them into the service. There is something like this in the legal world, https://free.law/recap, to bypass access fees. But there are no copyright concerns there (since the documents themselves are public domain works of the federal gov, unlike scientific papers).

enceladus06
2 months ago
(2) Consensus might work for asking an LLM to summarize papers: https://consensus.app/search
phewson
2 months ago
The fact that the entire model is based on articles in a printed journal. I really like the Cochrane Collaboration systematic reviews. As part of the article, interested authorities can ask pertinent questions and receive responses. The best we get in most journals is "cited by" links, but that's it. Is it being cited because a point is contested? If so, what point is contested? Does the citing paper make a good case? Is it being cited as an inspiration for some derivative research with new applications in a different context? If so, does that reinforce the methodology in the paper you are looking at? Why not have something that quickly helps you determine whether the paper has been reproduced, and maybe even uprate it if so? And so on.
lbhdc
2 months ago
The biggest pain points for me are discovery and access. It can be really difficult to find papers on the topic I am researching, and getting access to the papers I do find is often hard.
noncovalence
2 months ago
A better way to organise and find papers I've looked at before. For example, being able to ask an LLM "what was that paper again that tried using X to solve Y but ran into some issue" which I vaguely remember skimming a month ago but only just realised that it might actually be useful to me, and it will find the right one from my reference manager and/or subset of my browser history.
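
As a rough illustration of the retrieval half of this idea, here is a sketch that ranks saved papers by plain word overlap with a vague query. This is a deliberately simple stand-in for the LLM search described above, not an LLM; the `LIBRARY` entries and the `rank` helper are made-up placeholders, and a real tool would presumably pull notes from a reference manager or browser history.

```python
import re

# Hypothetical reading log: title -> a short note or abstract snippet.
LIBRARY = {
    "Paper A": "Uses graph neural networks to predict protein binding, but training was unstable.",
    "Paper B": "Survey of combustion models for spark ignition engines.",
    "Paper C": "Applies transformers to weather forecasting with good results.",
}


def tokens(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank(query, library):
    """Return titles sorted by how many query words appear in title or note."""
    query_words = tokens(query)
    scores = {
        title: len(query_words & tokens(title + " " + note))
        for title, note in library.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


best = rank("graph networks for protein binding", LIBRARY)[0]
print(best)
```

Word overlap only works when the query happens to reuse the note's vocabulary, which is exactly the gap an LLM or embedding-based search would be expected to close.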