I love the simplicity of R Markdown for producing documents, and I maintain my own library in a BibTeX (*.bib) file. I'm using these instructions to cite in the document (the BibTeX key prefixed with the "@" symbol).
My question is: is there a way to scan the R Markdown document (*.Rmd) and extract a list of the BibTeX keys cited in it? This would be useful for producing a subset of my library to attach to the project, instead of all of the ca. 6000 references accumulated in my library.
After exploring several alternatives, I arrived at the function str_extract() from the package stringr. Here I am assuming you have a BibTeX library that includes all cited references (usually more). I also combined the example of Oto Kaláb with one of my own, because of the different BibTeX key styles. First, the Rmd document.
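A minimal sketch of this approach, with a few lines of a hypothetical Rmd document embedded as a character vector so that the example is self-contained. The sample text, the variable names and the regular expression are assumptions for illustration, not a definitive pattern for every BibTeX key style. It uses str_extract_all(), the sibling of str_extract(), so that lines with several citations return every key.

```r
library(stringr)

# Hypothetical Rmd content; in practice use readLines("my_document.Rmd")
rmd_lines <- c(
  "---",
  "title: \"Example\"",
  "bibliography: library.bib",
  "---",
  "",
  "Vegetation data are stored as in @Alvarez2012.",
  "Other sources disagree [@Smith_2004; @doe:1999a, p. 12].",
  "See again @Alvarez2012, or write to someone@example.org."
)

# Assumed key pattern: an "@" that is not preceded by a word character
# (this skips e-mail addresses), followed by word characters, optionally
# joined by single ".", ":" or "-" characters inside the key.
key_pattern <- "(?<!\\w)@\\w+(?:[.:\\-]\\w+)*"

# str_extract_all() returns one character vector per line; flatten them
cited <- unlist(str_extract_all(rmd_lines, key_pattern))
cited <- str_remove(cited, "^@")  # drop the leading "@"

cited          # every citation, in order of appearance
unique(cited)  # each key once
```

If your keys contain other punctuation, the character class in key_pattern would need to be extended accordingly.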
The next code block is commented. At the end we obtain a vector with all cited references, which can be reduced to distinct keys with unique(). The resulting vector can then be aggregated to get the frequency of each key in the text (which I think is also useful for detecting rare citations). I'm also thinking of extracting the line number in the output.
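The frequency count and the line numbers could be obtained roughly as follows. This is only a sketch under the same assumptions as any regex-based approach: the sample lines, the pattern and the names citations and key_counts are hypothetical, and the line numbers refer to positions in the vector returned by readLines().

```r
library(stringr)

# Hypothetical document lines; in practice use readLines("my_document.Rmd")
rmd_lines <- c(
  "Plots were surveyed following @Alvarez2012.",
  "",
  "Two sources disagree [@Smith_2004; @doe:1999a].",
  "The approach of @Alvarez2012 was applied again."
)

# Assumed key pattern; adjust it to your own BibTeX key style
key_pattern <- "(?<!\\w)@\\w+(?:[.:\\-]\\w+)*"

# One character vector of matches per line (empty where nothing is cited)
matches <- str_extract_all(rmd_lines, key_pattern)

# One row per citation, repeating each line number once per match on it
citations <- data.frame(
  line = rep(seq_along(rmd_lines), lengths(matches)),
  key  = str_remove(unlist(matches), "^@"),
  stringsAsFactors = FALSE
)

# Frequency of each key, rarest first
key_counts <- sort(table(citations$key))

citations
key_counts
```

Keys with a count of 1 in key_counts are the rare citations, and citations$line shows where each one occurs.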