News from JURN

~ search tool for open access content

Category Archives: How to improve academic search

UK National Bibliographic Knowledgebase

06 Monday Feb 2017

Posted by futurilla in How to improve academic search, Spotted in the news

Newly announced for the UK…

“Today Jisc announced that OCLC, the global library cooperative, has been awarded the contract to develop a new national bibliographic knowledgebase (NBK).”

Judging by the initial press-release, the focus seems likely to rest first on cohering UK academia’s metadata management for digital book collections. This will in time…

“enable shared bibliographic metadata to flow into … global search engines”

Hopefully that means Google Search, as well as Google Scholar (which are two separate systems and databases).

DOI availability levels

21 Saturday Jan 2017

Posted by futurilla in How to improve academic search, Spotted in the news

“Availability of digital object identifiers in publications archived by PubMed”, 3rd January 2017. For…

“the period 1966–2015 (50 years). Of the 496,665 articles studied over this period, 201,055 have DOIs (40.48%).”

So just under 60% are without DOIs, and that’s for biomedical in PubMed — albeit when including thirty years of pre-1995 (pre the mass Internet) coverage. More recently, for 2015, the study found that 13.5% of new content was still without a DOI.
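As a quick check on the arithmetic, here is a minimal sketch that recomputes the shares from the two counts quoted above. The variable names are my own, chosen for illustration.

```python
# Counts quoted from the PubMed study above (1966-2015).
total_articles = 496_665
with_doi = 201_055

# Share of articles that carry a DOI, and the remainder that do not.
share_with_doi = with_doi / total_articles * 100
share_without_doi = 100 - share_with_doi

print(f"with DOI:    {share_with_doi:.2f}%")
print(f"without DOI: {share_without_doi:.2f}%")
```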

The DOI-free figures for the humanities will be far higher, according to “Availability of digital object identifiers (DOIs) in Web of Science and Scopus”, February 2016…

“Many journals related to the Natural Sciences and Medicine with considerable impact have no DOI. Arts & Humanities WoS [Web of Science] categories have the highest percentage of documents without DOI.” … “exceeding 50% only since 2013. The observed values for Books and Proceedings are even lower despite the importance of these document types …”

As for DOI availability within articles in repositories, IRUS-UK provides a “DOI Summary” field giving “the numbers and percentages that have DOIs available” in UK repositories, although access to their datasets is controlled. I could find no IRUS-UK summary infographics relevant to DOI availability. But it would be interesting to determine what proportion of free/open journal articles in UK repositories have DOIs.

“Is the Open Access discoverability problem solvable?”

30 Saturday Apr 2016

Posted by futurilla in How to improve academic search, Spotted in the news

A new Medium article, from the head of Ingenta Connect, “Is the Open Access discoverability problem solvable? And whose problem is it?”. It’s a cursory look at the problem, but even then it’s interesting for what it doesn’t say…

* For “institutional librarians” the author seems to imply that their future role is only to be in one-to-one “mentoring and facilitation” of researchers. No mention of anything else, like the big publishers working with librarians to craft and adopt universal OA-status tagging code for discoverability.

* For “scholarly authors” he only suggests academics might become marketeers for their own papers. Frankly, this seems like a waste of their valuable time. Given the salaries that full-time research academics get, they can afford to hire a virtual assistant. To promote four or five papers a year outside of one’s own disciplinary niche, simply go to UpWork (or similar) and hire your personal marketeer at $180 a paper (to get someone of quality, for a day and a half of work). One could probably find a way to write the $900 bill off against tax each year. Of course that assumes one is publishing something worth reading, rather than academic shovel-ware intended to tick boxes inside one’s own institution.

* For the big “publishers” the article vaguely suggests they need to embrace openness. Though perhaps only in order to capture it for their own purposes, via a… “drawing-together of all the dispersed OA content silos into one place”. Well, for their own limited set of OA content, the big publishers can solve that on Monday morning if they really want it. They just have to allow the seemingly-stalled Paperity to import the OA-only article feeds of Elsevier, Brill, De Gruyter, Wiley and others, so that Paperity has full coverage of all OA articles from the big publishers.

doai.io

05 Saturday Mar 2016

Posted by futurilla in How to improve academic search, Spotted in the news

A new OA tool from the French, doai.io. If you’ve found a live DOI Web link that can only take you to a paywall article, then replacing http://dx.doi.org/ with http://doai.io/ will get a URL that tries to find a free version via BASE.
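The prefix swap described above is easy to script. Here is a minimal sketch: the function name to_doai is my own, and the DOI in the usage line is a made-up placeholder rather than a real article. It only rewrites the URL; whether doai.io then finds a free copy is up to the service.

```python
DOI_PREFIX = "http://dx.doi.org/"
DOAI_PREFIX = "http://doai.io/"


def to_doai(url: str) -> str:
    """Swap the DOI resolver prefix for the doai.io one, leaving the DOI itself untouched."""
    if not url.startswith(DOI_PREFIX):
        raise ValueError(f"not a {DOI_PREFIX} link: {url}")
    return DOAI_PREFIX + url[len(DOI_PREFIX):]


# Placeholder DOI, for illustration only.
print(to_doai("http://dx.doi.org/10.1234/example"))
```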

BASE is only middling for finding open access articles. It currently has 3.1m OA journal articles in English, with those being overwhelmingly in science, technology and medicine…

… but it’s reported that doai.io now also looks for the article posted on ResearchGate.

The doai.io coding was completed back in November and it’s only just gone public, so it’s early days. They don’t yet have a Web browser add-on that will automate the fallback from dx.doi.org to doai.io. One has to wonder if the same add-on, which would presumably be open sourced, would be quickly forked to also serve Sci-Hub (which at present only has a Chrome add-on, and no Firefox add-on).

Making Sense of the Flood

02 Wednesday Mar 2016

Posted by futurilla in How to improve academic search

An interesting discussion on UI and navigation for academic content discovery, from the recent “The Researcher to Reader” conference (London, Feb 2016): “Making Sense of the Flood: ways to curate content and adapt search to deliver serendipity in discovery” and the following Q&A.

That Google moment…

18 Monday Jan 2016

Posted by futurilla in How to improve academic search, Spotted in the news

Byron Russell, manager of Ingentaconnect, wants to search only for freely re-usable Open Access articles, but finds that ‘the Google moment’ for such a search hasn’t arrived yet…

Run a Google search on “Mendelian dominance open access” and the first two hits are for one publisher – the OMICS Group.

Judging from my own attempt to recreate his search in Google, what he actually searched for was: Mendelian dominance open access (without the quote marks). It’s difficult to see how such a loose search would find something worth having. But even if he’d then gone on to say… ‘so, we need to teach students how to search Google properly…’, his article’s point would have been much the same. Even using sophisticated Google search methods, one still gets mired amid a swamp of PowerPoints, K-12 lesson plans, student quizzes, wikis, high-ranking predatory journal articles and other junk.

JURN does a fairly good job with…

     Mendel “dominance” “Commons Attribution” -noncommercial

Having Mendel without quote marks in that way, catches Mendel | Mendel’s | Mendelian | since Google automatically expands the name.

The target CC content, as currently found on OA journals via JURN, seems to reside almost entirely in PLOS, PubMed, Springer and a few others.

But there’s more in the hybrid journals. So one can also approximate a main Google Search across the large publishers, Elsevier for instance, via something like…

     site:www.sciencedirect.com/science/article/ “Commons Attribution” -noncommercial -“non-commercial”

For Oxford Journals it’s slightly different…

     inurl:oxfordjournals.org “Commons Attribution” -“non-commercial”

(If you’ve worked through the examples down to this point, Google will probably flash up an annoying “captcha” to make sure you’re not a robot).
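The three searches above follow the same pattern, so working through a list of publishers could be scripted. Here is a rough sketch that only assembles the query strings, using straight quote marks as one would type them into the search box. The function name cc_by_query and its parameters are my own invention, not part of any Google or JURN tool.

```python
def cc_by_query(terms="", site=None, inurl=None,
                exclude=("noncommercial", "non-commercial")):
    """Assemble a search string that targets CC-BY text and drops non-commercial licences."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if inurl:
        parts.append(f"inurl:{inurl}")
    if terms:
        parts.append(terms)
    parts.append('"Commons Attribution"')
    for word in exclude:
        # Hyphenated words are quoted so that they are excluded as a single phrase.
        parts.append(f'-"{word}"' if "-" in word else f"-{word}")
    return " ".join(parts)


# The three searches from the post, rebuilt.
jurn = cc_by_query('Mendel "dominance"', exclude=("noncommercial",))
elsevier = cc_by_query(site="www.sciencedirect.com/science/article/")
oxford = cc_by_query(inurl="oxfordjournals.org", exclude=("non-commercial",))

for query in (jurn, elsevier, oxford):
    print(query)
```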

And so on… one could just work through the larger publishers that way. For Springer most of the work has already been done by Paperity, although Paperity still lacks coverage of a couple of OA Springer titles.

It’s certainly not ideal, as Russell suggests. On the other hand, one might ask why someone needs to find just the CC-BY content on a topic. Perhaps it’s actually quite useful that a big publisher would find it difficult to automatically siphon all known CC-BY articles and books into its own giant repository, slap on some search, mining, overlay journal and themed book-compiling tools, and then sell access to it.

Hypothes.is

05 Saturday Dec 2015

Posted by futurilla in How to improve academic search, New media journal articles, Spotted in the news

Hypothes.is lets visitors annotate your Web pages, via a pop-out sidebar filled with a Twitter-like stream of visitor comments/links.

It’s the perennial idea of re-inventing the classic footer comments box as a uniform annotation layer, something that has been tried many times over the past 20 years. Google ran such a tool for three years before closing it down. Such services tend to end up as dank wastelands filled with Viagra ads, troll spoor and link-rot.

But this time might be different. There’s a couple of somewhat workable-looking early W3C standards (more are on the way), new options for moderation and closed group working, and an impressive range of publishers and universities are now planning to discuss how social annotation might proceed for scholarship…

“Our goal is that within three years, annotation can be deployed across much of scholarship.”

The ‘can’, not ‘will’, is probably because the big publishers like Elsevier et al are noticeably absent from the list of Hypothes.is academic supporters. I can’t see them liking the idea that an open commenting system is being laid over/into their content. The sidebar’s content seems to be outside the control of the page owner, so I could theoretically pitch up at an Elsevier $66 article paywall and say “there’s a free PDF of this article over at Site XYZ…”

So how does it work, at present? Imagine that someone took a Web page’s comments section from the bottom of the page, and instead put it into a standalone and uniform sidebar. Someone adding a comment also has the option to highlight a bit of text on the page, automatically hyperlinking their comment to it. Other visitors see the comments and the highlighted text. Obviously various Twitter-ish and Wiki-ish features could be added, but that’s the basic functionality.
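To make that basic functionality concrete, here is a toy sketch of a comment that may or may not be tied to a highlighted passage. It is purely illustrative: the Annotation class, its fields and the locate_highlight function are my own assumptions, and this is not how Hypothes.is itself stores or anchors annotations.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Annotation:
    page_url: str
    author: str
    comment: str
    quote: Optional[str] = None  # the highlighted text, if the commenter selected any


def locate_highlight(page_text: str, note: Annotation) -> Optional[Tuple[int, int]]:
    """Return the (start, end) character span of the highlighted text, or None."""
    if note.quote is None:
        return None  # a plain comment on the page, with no highlight
    start = page_text.find(note.quote)
    if start == -1:
        return None  # the page text has changed, so the highlight is orphaned
    return (start, start + len(note.quote))


# Hypothetical page and visitor comment.
page = "Mendel studied dominance in pea plants."
note = Annotation("http://example.org/article", "visitor1",
                  "See also the later literature.", quote="dominance")
span = locate_highlight(page, note)
print(span, page[span[0]:span[1]])
```

Even this toy version shows why stable content matters: if the page text is edited, the quoted passage may no longer be found and the comment loses its anchor.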

A pop-out sidebar means that Hypothes.is can work with PDFs, and the Hypothes.is roadmap suggests that annotation of data / images / videos / ePubs could be on the way soon. So it seems Hypothes.is needs fixed browser-displayed content, located on a URL that’s never going to break, which makes it a natural fit with things like PDFs in repositories and digital libraries. But even in that relatively limited arena, who will do all the hand annotation, moderation, linkrot checking and repair needed to keep such a service usable across a billion or more pages and documents? I somehow doubt that overworked and underpaid repository staff will be skipping through the library stacks with joy, at being told they must also become the herders of social media cats and the tamers of trolls.

Institutional Repositories and ‘dark deposit’

03 Thursday Dec 2015

Posted by futurilla in How to improve academic search, Open Access publishing

OA expert Richard Poynder has a new PDF paper on his blog, “Open Access, Almost-OA, OA Policies, and Institutional Repositories”. In it he looks at how many fulltext papers are in various repositories, and explores the trend toward the dark side: records which say of the PDF only that ‘this item is embargoed until…’.

Poynder’s article also has details and very extensive analysis of the “Button”, sometimes seen on repository record pages, which allows one to request a fulltext copy of an embargoed repository item.

As an aside, he notes…

“a suspicion I have long had that repository managers are depositing a lot of historical data.”

Yup, I can confirm that feeling. Not all, of course, but a few do have a lot of historical material jumbled in. I guess that may be because they only have funds and staff to run one repository, which then has to hold everything. Only a few large universities sensibly split their repositories into separate servers/URLs, thus:

* a slimline one for public access to theses and masters dissertations.

* one to capture the flow of all the public-access scholarly OA items, sometimes even with filters that can knock out preprints, conference papers, or which can focus only on papers from individual journal titles.

* plus a more conventionally rambling repository to hold the digitisation of pre-1960s content, image collections, university ephemera, and the ‘local interest’ collections such as newspapers and trade magazines. Sometimes this has a slick public-friendly ‘showcase’ front-end, sometimes it’s just a list browse.

* big U.S. law schools increasingly have their own separate repositories, and their own OJS server for their journals.

* and running alongside all those, an OJS installation to run the university’s current journals (some universities even split their mainstream academic journals from the graduate school / undergraduate / creative writing / alumni magazine titles, having the latter on a second OJS installation).

It’s my feeling that even smaller universities may soon have to adopt such splitting strategies, given the tidal wave of OA content that’s looming on the horizon.

When servers do get split up like this there’s often no public interlinking between them, even in terms of using the front page of each as a platform to publicise the existence of the others.

Dissemin

15 Sunday Nov 2015

Posted by futurilla in How to improve academic search, Spotted in the news

Have you spotted an academic who is supposed to have made their work OA, but who hasn’t done so? Dissemin checks their OA status, and provides a way to upload their papers to the Zenodo repository (CERN’s data repository).

Placing text

29 Tuesday Sep 2015

Posted by futurilla in How to improve academic search

A fascinating and very clearly written April 2015 article about automatically mining geolocation points out of plain text: “Mapping Words: Lessons Learned From a Decade of Exploring the Geography of Text”…

“In Fall 2014 I collaborated with the US Army to create the first large-scale map of the geography of academic literature and the open web, geocoding more than 21 billion words of academic literature spanning the entire contents of JSTOR, DTIC, CORE, CiteSeerX, and the Internet Archive’s 1.6 billion PDFs relating to Africa and the Middle East, as well as a second project creating the first large-scale map of human rights reports. A key focus of this project was the ability to infuse geographic search into academic literature…”

We probably need a name for such activities, and also for mining eco/geo data out of old paintings and photographs of landscapes. Geo-mining is too 20th century and eco-unfriendly. Geo-gleaning and Geo-gleaner are terms that have a certain poetry about them, while also suggesting both the curatorial and the imprecise nature of the techniques.
