{"id":20014,"date":"2017-10-22T11:48:45","date_gmt":"2017-10-22T10:48:45","guid":{"rendered":"https:\/\/jurnsearch.wordpress.com\/?p=20014"},"modified":"2017-10-22T11:48:45","modified_gmt":"2017-10-22T10:48:45","slug":"sci-hub-provides-access-to-nearly-all-scholarly-literature-er-nope","status":"publish","type":"post","link":"https:\/\/jurn.link\/jurnsearch\/index.php\/2017\/10\/22\/sci-hub-provides-access-to-nearly-all-scholarly-literature-er-nope\/","title":{"rendered":"&#8220;Sci-Hub provides access to nearly all scholarly literature&#8221;&#8230; er, nope."},"content":{"rendered":"<p><a href=\"https:\/\/peerj.com\/preprints\/3100v1\/\">&#8220;Sci-Hub provides access to nearly all scholarly literature&#8221;<\/a> is a new, if misleadingly titled, pre-print at <em>PeerJ Preprints<\/em>.  Mis-titled because it seems to imply to the world that <em>all<\/em> scholarly literature is behind paywalls, or that all the stuff <em>that matters<\/em> is behind paywalls.  It isn&#8217;t, as there&#8217;s also Open Access.  On this point the body of the article contradicts its own title, in terms of the OA coverage&#8230;<\/p>\n<blockquote><p>&#8220;Strikingly, coverage [at Sci-Hub] was substantially higher for articles from closed rather than open-access journals (85.2% versus 49.1%).&#8221;<\/p><\/blockquote>\n<p>So only 49.1% for OA. I&#8217;d guess that&#8217;s mainly because fewer people need to plug a DOI into Sci-Hub to get an OA article.<\/p>\n<p>However, the idea that 49.1% of &#8220;all&#8221; OA articles are in Sci-Hub turns out to be very questionable. Because that 49.1% amounts, according to the article, to a piffling &#8220;1.4m&#8221; articles from 2,650 OA journals. <\/p>\n<p>The whole of OA journal output to date cannot possibly fit into a mere 2.8m articles (the total implied if 1.4m is 49.1%).  
For instance, CORE alone has 5m full-text OA papers, according to their February 2017 blog post&#8230;<\/p>\n<blockquote><p>&#8220;CORE is thrilled to announce that it currently provides 5 million open access full-text papers.&#8221;<\/p><\/blockquote>\n<p>And that&#8217;s after CORE&#8217;s great difficulties in successfully finding and harvesting full-text (running at around 30%, last I heard) from cantankerous repositories.  (<em>Update: As of Dec 2017 CORE is claiming to locally host 9m in &#8220;research outputs&#8221; and &#8220;full text articles&#8221;, including 1.8m articles extracted from Elsevier, Springer, Frontiers and PLoS<\/em>).<\/p>\n<p>Consider also that the DOAJ currently lists just over 10,000 OA journals, even after its recent\/ongoing clean-up of titles. DOAJ made 2.5m articles searchable in full-text as of 2016, and its full-text holdings hardly even scratch the surface of the contents of DOAJ journals.<\/p>\n<p>Given numbers like these and <a href=\"https:\/\/peerj.com\/preprints\/3119v1\/\">others<\/a>, bloggers and journalists should be wary of glancing at this new <em>PeerJ Preprints<\/em> article and making claims such as: &#8216;Sci-Hub shown to provide access to nearly half of all OA articles!&#8217;<\/p>\n<p>How to explain the mis-match?  It appears to be a result of the article&#8217;s authors using a database which is very partial in its OA coverage&#8230;<\/p>\n<blockquote><p>&#8220;To define the extent of the scholarly literature, we relied on DOIs from the Crossref database&#8221;.<\/p><\/blockquote>\n<p>After cleaning that haul&#8230; <\/p>\n<blockquote><p>&#8220;our catalog consisted of 22,193 journals encompassing 57,074,208 articles. Of these journals, 4,345 (19.6%) were inactive (i.e.\u00a0no longer publishing articles), and 2,650 were open access (11.9%). Only two journals were inactive and also open access.&#8221;<\/p><\/blockquote>\n<p>Well now, that last point is interesting in its own right. 
Is CrossRef throwing out all inactive OA journals? It looks like it. If so, then that seems a bit unfair on OA, but perhaps it&#8217;s happening because a CrossRef bot is just automatically tracking the journals in the DOAJ.  It&#8217;s well known that the DOAJ removes a journal as soon as it ceases or takes a break from publishing, and that would seem to neatly explain the apparent lack of inactive OA journals in CrossRef.<\/p>\n<p>(If that&#8217;s the case then I&#8217;d also suspect CrossRef may not even be tracking <em>all<\/em> of the DOAJ, since the journals of &#8216;the top 10 publishers&#8217; in the DOAJ currently stand at 2,282 OA journal titles. Add a few worthy niche publishers and &#8216;learned association&#8217; titles, and I&#8217;d be willing to bet that CrossRef&#8217;s 2,650 OA total would be matched fairly neatly.  CrossRef&#8217;s title .XLS is <a href=\"https:\/\/www.crossref.org\/titleList\/\">here<\/a>, if anyone cares to do a more precise tally against the DOAJ&#8217;s .XLS and then sort the results by publisher).<\/p>\n<p>Which means Sci-Hub is still a long way, probably a <em>very<\/em> long way (15%?), from useful coverage of all OA journal articles.  And it may never offer the claimed&#8230; &#8220;access to nearly all scholarly literature&#8221;.  Partly because pirates have little or no interest in pirating &#8216;free&#8217;, and indeed usually take professional pride in shunning &#8216;free&#8217;.  Even if they were aiming to pro-actively include OA, neither Sci-Hub nor LibGen would be able to provide the public with Google&#8217;s speed, relevancy ranking, up-time, traffic management etc. Nor could they remove dead links in the same speedy way as Google does, and I doubt they want to try to mirror the entire OA corpus locally (although they might harvest and ingest things like the CORE full-text, fairly easily). 
More likely that they will start to detect a DOI request as being OA, and then bounce the user to the public full-text without re-hosting it themselves. In which case I don&#8217;t see a Sci-Hub\/LibGen combo becoming &#8220;the one box to rule them all&#8221;.  <\/p>\n<p>Which isn&#8217;t to say that there won&#8217;t one day be some whizzy Web browser addon that provides all sorts of sophisticated automated overlays and injections into your Google Search results, far beyond a basic &#8220;article has DOI, look it up on Sci-Hub&#8221; button for those too lazy to do a manual copy\/paste lookup. In which case it might be possible to approximate a melding of Google Search and Sci-Hub on a single page of results.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;Sci-Hub provides access to nearly all scholarly literature&#8221; is a new, if misleadingly titled, pre-print at PeerJ Preprints. Mis-titled because &hellip;<\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/index.php\/2017\/10\/22\/sci-hub-provides-access-to-nearly-all-scholarly-literature-er-nope\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[],"class_list":["post-20014","post","type-post","status-publish","format-standard","hentry","category-spotted-in-the-news"],"_links":{"self":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/20014","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/comments?post=20014"}],"version-history":[{"count":0,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/20014\/revisions"}],"wp:attachment":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/media?parent=20014"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/categories?post=20014"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/tags?post=20014"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}