Roadblock to full OpenURLness [Updated]


This week I encountered a significant roadblock when trying to use OpenURL in a situation where it is a natural fit. Let me explain the scenario. A scientific researcher at the company where I work built an extensive bibliography of journal articles on a particular subject, and wants to publish that bibliography on the company intranet, complete with hypertext links to the full text. This person initially thought it’d be OK to mount the full-text articles that he had downloaded in the same webspace as the bibliography and simply link to the files. Of course, that idea was quickly shot down. Instead, we thought, why can’t we take this bibliography, check it against our SFX KnowledgeBase to see which articles we have available in full text, and then output the complete OpenURL for each of those articles for this researcher to use when marking up and publishing his bibliography?

The use case sounds straightforward, right? Turns out that it is anything but. I was provided with a text file of citations and was asked to come up with appropriate SFX links for each. Of course I could have manually rekeyed the citations one by one into a search form querying our SFX KB, but that would take quite a long time and quite a bit of effort. I tried to think of how this whole process could be automated.

On the advice of Dan Chudnov I downloaded an open source application written in Perl called Biblio-Citation-Parser, which on the face of it seemed to be exactly what I needed. I needed a way to automatically parse the whole list of citations into the necessary chunks of metadata, and then automatically generate an OpenURL for each citation. After trying unsuccessfully to get Biblio-Citation-Parser to work (this isn’t a limitation of the software but of my Perl expertise), I sent queries out to other SFX users as well as to the Code4Lib discussion list. There were several responses from members of the Code4Lib discussion list, some of whom mentioned the application that I already knew about. But it turns out that pretty much nobody in that community (at least among those who responded) had ever used it, and that nobody in that community had come up with a good solution to this parsing problem themselves.
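For what it's worth, the second half of that wish (generating the OpenURL once the metadata has been split into fields) is the easy part. Below is a minimal Python sketch of that step only. It assumes a citation has already been parsed into a dictionary; the field names on the left of FIELD_TO_KEV, the build_openurl function, and the sfx.example.com base URL are all made up for illustration, while the keys on the right are the standard OpenURL 1.0 key/encoded-value names for journal articles. It does nothing about the hard part, which is parsing free-text citations.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; substitute your institution's SFX base URL.
SFX_BASE_URL = "https://sfx.example.com/sfx_local"

# Map from our own (assumed) field names to OpenURL 1.0 KEV journal keys.
FIELD_TO_KEV = {
    "article_title": "rft.atitle",
    "journal_title": "rft.jtitle",
    "author_last": "rft.aulast",
    "year": "rft.date",
    "volume": "rft.volume",
    "issue": "rft.issue",
    "start_page": "rft.spage",
    "issn": "rft.issn",
}


def build_openurl(citation, base_url=SFX_BASE_URL):
    """Return an OpenURL 1.0 (KEV) link for one already-parsed journal article."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
    ]
    for field, kev_key in FIELD_TO_KEV.items():
        value = citation.get(field)
        if value:  # skip fields the parser could not fill in
            params.append((kev_key, str(value)))
    # urlencode percent-encodes the values so the link is safe to paste into HTML.
    return base_url + "?" + urlencode(params)
```

Calling build_openurl once per parsed citation would give one resolver link per reference. Whether the resolver actually finds full text behind each link is a separate question that this sketch does not answer.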

Since the original citations were stored in Reference Manager, one of the more common citation management software applications, I wrote back to the colleague who first asked me to help with this situation, asking him if he could provide me with the Reference Manager files. He did, and I downloaded a free trial version of the software, imported the references, then exported them in RIS format. Next, I imported the RIS output file into Zotero, and then exported the whole bibliography from Zotero into a readymade HTML bibliography. Because of Zotero’s built-in COinS functionality, the readymade HTML bibliography is automatically populated with OpenURLs. But I wasn’t done yet. I had to go through each citation by hand and test whether we did indeed have the article in full text, and also, to edit the HTML coding to substitute our company’s specific SFX base URL in each link.
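The last hand-editing step, attaching our own SFX base URL to each reference, looks like something a short script could take over. Here is a rough Python sketch, assuming the exported HTML follows the usual COinS convention of a span with class "Z3988" whose title attribute holds the OpenURL key/value string; the exact markup of a given export may differ. The CoinsExtractor class, the coins_to_links function, and the base URL are hypothetical names of my own, and only the standard library is used.

```python
from html.parser import HTMLParser


class CoinsExtractor(HTMLParser):
    """Collect the title attribute of every COinS span (class "Z3988")."""

    def __init__(self):
        super().__init__()
        self.context_objects = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "span" and "Z3988" in classes and attr.get("title"):
            # HTMLParser has already turned &amp; back into & at this point.
            self.context_objects.append(attr["title"])


def coins_to_links(html_text, base_url):
    """Return one resolver link per COinS span found in html_text."""
    parser = CoinsExtractor()
    parser.feed(html_text)
    parser.close()
    return [base_url + "?" + ctx for ctx in parser.context_objects]


if __name__ == "__main__":
    # Hypothetical input: a one-entry bibliography with a single COinS span.
    sample = (
        '<div class="csl-entry">Some citation text.'
        '<span class="Z3988" title="url_ver=Z39.88-2004&amp;rft.genre=article">'
        "</span></div>"
    )
    for link in coins_to_links(sample, "https://sfx.example.com/sfx_local"):
        print(link)
```

This only produces the links. Checking whether each article is actually available in full text would still mean following every link or querying the resolver some other way.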

In the end, I achieved what the user wanted: a list of bibliographic references with SFX links as the hypertext links. But it was a huge amount of work, and I kept asking myself, surely there is a better, easier way to do this?! Surely someone, somewhere has already solved this problem of how to readily parse bibliographic citations in a text file and run them through a process to check which articles are available in full text?

Maybe there is a much simpler solution; if you know of one, please comment on this post to let me know. I’m left thinking that OpenURL still has a ways to go in terms of ease of implementation for situations like the one I described.
