Literature searches

One skill required during my last degree was being able to review the available literature on a topic efficiently. The process itself is interesting. The initial stage is as you would expect - list all the keywords or terms that you think are relevant, then run those searches on PubMed, Scopus and Google Scholar.
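As a rough sketch of that first step, the keyword list can be turned into a Boolean query string, with synonyms OR-ed within a concept and the concepts AND-ed together. The function name `build_query` and the example terms are illustrative assumptions, and the exact query syntax each search site accepts should be checked against its own help pages.

```python
def build_query(concepts):
    """OR the synonyms within each concept, then AND the concepts together."""
    groups = []
    for synonyms in concepts:
        # Quote multi-word terms so they are searched as phrases.
        quoted = [f'"{term}"' if " " in term else term for term in synonyms]
        groups.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(groups)


# Hypothetical keyword lists, one inner list per concept.
concepts = [
    ["synaptic vesicle", "vesicle"],
    ["recycling", "endocytosis"],
]

print(build_query(concepts))
```

Keeping the keywords as data rather than as a hand-typed string makes it easier to add a synonym later and re-run the same search on each site.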

You need a decent reference manager. I use LaTeX and BibTeX, so tools like KBibTeX and JabRef are very useful. KBibTeX in particular allows interesting indexing of the growing reference database - by journal, author, etc.

If you’re searching in a specific field (mine was vesicle recycling in synaptic cell biology), then other patterns become apparent. This field has a long history of research, but comparatively few labs involved. It’s then interesting to track the publications of both the labs (in biomedical research, a lab is basically its principal investigator) and the individual authors, as both labs and researchers tend to follow the same topic for years - this kind of science requires significant investment of time and resources. If you collate the literature in this way, you can see the stories and theories proposed by individual labs evolve over time. As researchers move from one lab to another over the years, you can see their theories evolve as knowledge of both the theory and the practical experimental protocols spreads. It’s pretty cool when you see it.

Some of the indexing sites let you track these connections between authors. It makes for interesting graphs. I had a shot at building my own version, with the intention of showing the evolution of theories, the collaboration between researchers and labs towards a common goal, and the way conflicting theories changed over time as each lab sought stronger evidence (or disproving evidence - it’s science) for its theories. I dropped the project though, as it became time-consuming for not a lot of benefit beyond satisfying curiosity.
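To give a flavour of what such a graph involves, here is a minimal sketch using only the Python standard library. It is not the version I built; the record layout (year, lab, author list) and all the names in it are hypothetical, with the lab label standing in for the principal investigator.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: (year, lab, authors).
papers = [
    (2001, "Lab A", ["Smith", "Jones"]),
    (2003, "Lab A", ["Smith", "Jones", "Patel"]),
    (2006, "Lab B", ["Patel", "Nguyen"]),
]


def coauthor_graph(papers):
    """Map each author pair to the sorted years in which they published together."""
    edges = defaultdict(list)
    for year, _lab, authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)].append(year)
    return {pair: sorted(years) for pair, years in edges.items()}


def lab_history(papers):
    """Map each author to their (year, lab) records in chronological order."""
    history = defaultdict(set)
    for year, lab, authors in papers:
        for author in authors:
            history[author].add((year, lab))
    return {author: sorted(records) for author, records in history.items()}


graph = coauthor_graph(papers)
moves = lab_history(papers)
print(graph[("Jones", "Smith")])
print(moves["Patel"])
```

The edge list shows who collaborated and when, while the per-author history shows moves between labs, which is where the spread of protocols and ideas becomes visible.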

One difficulty I had to deal with (and previously discussed here) was that different labs may have very different views on the same topic, each supported by their published literature. It means you can’t work from a single source of truth. Each of the conclusions put forward in the published literature has to be weighed on the strength of the published evidence. Conflicts are difficult or impossible to resolve; it’s just something you have to work around in your own research.

I had also attempted to handle these conflicts by grouping the research under a host of experimental parameters - there were generally enough differences in the protocols used to possibly explain the differences in results and the conclusions drawn. The idea was to have a web of evidence leading from the conclusions drawn, back through the analysis and observations of individual papers, back to the result datasets those observations were inferred from, and finally back to the experimental protocols and environment used to gather each result dataset.

The intention was to have a navigable path from the conclusions put forward by the various labs back to the raw data supporting them, along with confidence measures for each step of the way. I had hoped that such a graph, suitably annotated, would provide a confidence measure for each conclusion, as individual studies either reinforced or weakened each other’s conclusions. It was a fun idea, but it rapidly became a massive undertaking, beyond the resources available.
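For illustration only, one way such confidence measures might be combined is sketched below. The scoring rules are assumptions made for this example, not something I settled on: each chain from protocol to conclusion is scored as the product of its per-step confidences, independent supporting chains are combined so that each one adds to the total, and opposing chains then scale the result down. The numbers are made up.

```python
from math import prod


def chain_confidence(steps):
    """Confidence of one chain (protocol -> dataset -> analysis -> conclusion).

    Assumption: the steps are independent, so their confidences multiply.
    """
    return prod(steps)


def combined_confidence(supporting, opposing=()):
    """Combine several evidence chains for a single conclusion.

    Assumption: chains are independent. Supporting chains reinforce each
    other (1 minus the chance that all of them fail), and opposing chains
    are combined the same way and used to scale the support down.
    """
    support = 1 - prod(1 - chain_confidence(c) for c in supporting)
    oppose = 1 - prod(1 - chain_confidence(c) for c in opposing)
    return support * (1 - oppose)


# Hypothetical per-step confidences for two supporting studies and one opposing study.
study_a = [0.9, 0.8, 0.9]
study_b = [0.7, 0.9, 0.8]
study_c = [0.6, 0.5, 0.9]

print(combined_confidence([study_a, study_b]))
print(combined_confidence([study_a, study_b], opposing=[study_c]))
```

Even this toy version shows why the full idea grew so quickly: every step in every paper needs its own score, and the independence assumption rarely holds when labs share protocols and people.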

Shared at https://www.linkedin.com/pulse/literature-searches-donal-stewart