Bingo! Towards a description of a low-overhead academic Firefox plugin for the discovery and sharing of free scholarly articles:
1. At the click of a toolbar button, a web-browser plugin ‘reads’ the text of the online academic paper you’re browsing (including a PDF, if it’s opened within the browser). It then locates the references / bibliography section.
2. The plugin automatically detects any complete, well-formed academic references in the standard formats, extracts and de-duplicates them, and then uses JavaScript to create simple new links. Each new link is built on the fly and embeds the exact title of the article alongside the surname of the author. These links could be presented in a sidebar, or even as a page overlay. The plugin doesn’t try to find or add any direct URLs for the article.
e.g. it would search for: Craig “Werewolf Cinema of the 1930s”
3. The user can set the plugin to feed the overlay link to the main Google index or some other suitably deep search engine.
4. If the user clicks on such a “fuzzy” overlay link, they can hope to find a direct link to the free full-text of the article near the top of the search results.
5. If they do find free full-text, the user can click a simple feedback button on the browser toolbar. This passes the surname and article title back to a public database of open academic article titles.
6. If it’s an article title the database hasn’t seen before, it will flag that unique combination of author and article title as having a reasonable probability of leading to open full-text. The database could also do automatic tagging and sorting by academic discipline, judged by detecting common keywords and phrases in the titles. At no time does the process involve any URLs, or any downloading of pages or PDFs by the plugin’s database.
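Steps 1 to 3 might be sketched roughly as follows. This is only an illustration in Python (a real browser plugin would be written in JavaScript), and it makes some loud assumptions: that references follow one APA-like shape, "Surname, I. (Year). Title. Journal ...", and that the search engine accepts a plain `q=` query. The names `REFERENCE_PATTERN`, `extract_references` and `build_search_link` are invented for the sketch, and the bibliography entries are made-up sample data.

```python
import re
from urllib.parse import quote_plus

# ASSUMPTION: references look like "Surname, I. (Year). Title. Journal ..."
# Real bibliographies use many formats; this pattern handles only this one.
REFERENCE_PATTERN = re.compile(
    r"^(?P<surname>[A-Z][\w'-]+),\s+(?:[A-Z]\.\s*)+.*?"
    r"\((?P<year>\d{4})\)\.\s+(?P<title>[^.]+)\."
)

# Step 3: the search engine is a user setting; Google is just the default here.
SEARCH_URL = "https://www.google.com/search?q="


def extract_references(lines):
    """Return de-duplicated (surname, title) pairs in order of first appearance."""
    seen = set()
    pairs = []
    for line in lines:
        match = REFERENCE_PATTERN.match(line.strip())
        if match is None:
            continue  # not a well-formed reference in the assumed format
        surname = match.group("surname")
        title = match.group("title").strip()
        key = (surname.lower(), title.lower())
        if key not in seen:
            seen.add(key)
            pairs.append((surname, title))
    return pairs


def build_search_link(surname, title, base_url=SEARCH_URL):
    """Build a search link for the query: Surname "Exact title"."""
    return base_url + quote_plus(f'{surname} "{title}"')


# Invented sample bibliography: one duplicate entry and one non-reference line.
bibliography = [
    "Craig, J. (2009). Werewolf Cinema of the 1930s. Journal of Film, 4(2), 10-20.",
    "Craig, J. (2009). Werewolf Cinema of the 1930s. Journal of Film, 4(2), 10-20.",
    "See also the appendix for further reading.",
]
links = [build_search_link(s, t) for s, t in extract_references(bibliography)]
```

Here `links` ends up holding a single search link for Craig “Werewolf Cinema of the 1930s”: the duplicate is dropped and the stray line is ignored. No article URL is ever stored or guessed; only the search query is built.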
Once the database is established, version 2.0 of the plugin might create a feedback loop. The database would now indicate, via a three-star rating alongside the link, whether the title had a high probability of being freely available somewhere. This might not need a dedicated server; the plugin might instead locally install a huge list of free article titles as detected by v1.0. For user convenience this list could be split up by discipline.
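As a rough illustration of the v2.0 idea, here is a minimal Python sketch of a local lookup against lists split by discipline. Everything specific in it is an assumption: the `LOCAL_LISTS` structure, the idea of counting user confirmations per title, and the thresholds that turn a count into stars. A real plugin would do this in JavaScript.

```python
# HYPOTHETICAL local data, split by discipline:
# discipline -> {(surname, title): number of user confirmations}
LOCAL_LISTS = {
    "film studies": {("craig", "werewolf cinema of the 1930s"): 7},
    "history": {},
}


def star_rating(confirmations):
    """Map a confirmation count to 0-3 stars. The thresholds are assumptions."""
    if confirmations >= 5:
        return 3
    if confirmations >= 2:
        return 2
    if confirmations >= 1:
        return 1
    return 0


def rate_link(surname, title, lists=LOCAL_LISTS):
    """Look the title up in every discipline list and rate the best match."""
    key = (surname.lower(), title.lower())
    best = max((entries.get(key, 0) for entries in lists.values()), default=0)
    return star_rating(best)
```

With this sample data, `rate_link("Craig", "Werewolf Cinema of the 1930s")` gives three stars, while a title that appears in no list gets none. Splitting the lists by discipline would let a user install only the ones they care about.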
The plugin thus uses Surname “Article title” as the implicit unique identifier for the article, which is good enough for a search engine even if it probably causes librarians to shiver in horror.
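To make the identifier idea concrete, here is a small Python sketch of how a surname and title might be normalised into one key, and how the database might flag a key it hasn't seen before. The normalisation rules (lowercasing, stripping accents and punctuation), the `|` separator and the `OpenTitleRegistry` class are all assumptions made for illustration, not a description of any existing service.

```python
import re
import unicodedata


def make_identifier(surname, title):
    """Combine surname and title into one normalised key (assumed rules)."""

    def normalise(text):
        # Strip accents, lowercase, replace punctuation, collapse whitespace.
        text = unicodedata.normalize("NFKD", text)
        text = "".join(ch for ch in text if not unicodedata.combining(ch))
        text = re.sub(r"[^\w\s]", " ", text.lower())
        return " ".join(text.split())

    return f"{normalise(surname)}|{normalise(title)}"


class OpenTitleRegistry:
    """HYPOTHETICAL in-memory stand-in for the public database of open titles."""

    def __init__(self):
        self._counts = {}

    def record_feedback(self, surname, title):
        """Record one 'found free full-text' click; return True if the key is new."""
        key = make_identifier(surname, title)
        is_new = key not in self._counts
        self._counts[key] = self._counts.get(key, 0) + 1
        return is_new


registry = OpenTitleRegistry()
first = registry.record_feedback("Craig", "Werewolf Cinema of the 1930s")
second = registry.record_feedback("CRAIG", "Werewolf  cinema of the 1930s.")
```

The two feedback clicks differ in capitalisation, spacing and punctuation but normalise to the same key, so only the first is treated as a new title. Note that the registry stores nothing except surnames and titles.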
The database can’t use journal titles to determine whether an article should be flagged as “likely to be free”, since search engines can rarely extract or detect journal titles in individual open-access articles. Nor can it use base URLs, since the plugin aims to completely bypass the need for direct-to-article URLs. The public database of open academic article titles would need to be hidden from search engines, so as not to contaminate search results.