{"id":9756,"date":"2014-03-21T08:51:20","date_gmt":"2014-03-21T08:51:20","guid":{"rendered":"http:\/\/jurnsearch.wordpress.com\/?p=9756"},"modified":"2014-03-21T08:51:20","modified_gmt":"2014-03-21T08:51:20","slug":"on-clearing-the-debris","status":"publish","type":"post","link":"https:\/\/jurn.link\/jurnsearch\/index.php\/2014\/03\/21\/on-clearing-the-debris\/","title":{"rendered":"Google Scholar&#8217;s debris"},"content":{"rendered":"<p>I found a 2013 article from geoscientists who had tested Google Scholar: <a href=\"http:\/\/www.geosociety.org\/gsatoday\/archive\/23\/10\/article\/\">&#8220;Literature searches with Google Scholar: Knowing what you are and are not getting&#8221;<\/a>. Although the body of the paper states that their test phrase was <em>&#8220;wildfire-related debris flows&#8221;<\/em>, the data shows they actually tested Scholar with the unquoted keywords <em>wildfire-related debris flows<\/em>.  They reportedly found that&#8230;<\/p>\n<blockquote><p>&#8220;free articles were available in PDF format for 88% of citations returned by Google Scholar. They were available from open-access journals or via links to organizational sites where authors had posted their publications.&#8221;<\/p><\/blockquote>\n<p>However, if you actually look at <a href=\"ftp:\/\/rock.geosociety.org\/pub\/reposit\/2013\/2013316.pdf\">their linked search-results data file<\/a>, the above statement needs clarification, since it&#8217;s clear that paywalled articles from Elsevier, Springer and the like, appearing in their Scholar results, were being counted toward those &#8220;free articles&#8221;.  It turns out that many of these were &#8220;free&#8221; only via a DigiTop proxy overlay for Scholar that is, <a href=\"http:\/\/digitop.nal.usda.gov\/digitop_interim\/proxy_stop403.html\">in the words of DigiTop<\/a>, &#8220;available to USDA employees only&#8221;.  Nice if you work under the U.S. 
Department of Agriculture umbrella, but it seems that those outside have to pay.<\/p>\n<p>Does Google Scholar perhaps need to add some kind of &#8220;paywall box detector&#8221; to its scraper bots?  Then perhaps something like &nbsp;<strong>[PDF] [-||-]<\/strong> &nbsp;could be added on the right-hand column of the Scholar results, to indicate a PDF that&#8217;s &#8220;available maybe&#8221; &mdash; but which will prove to have a paywall that needs to be either backed out from or negotiated?  And perhaps &nbsp;<strong>[PDF] [-~-]<\/strong> &nbsp;could indicate a genuine direct link to a <em>bona fide<\/em> PDF file?<\/p>\n<p>Anyway&#8230; this is what geoscientists are talking about when they refer to <em>wildfire-related debris flows<\/em>.  Seems like it <em>might<\/em> be a geological process that intelligent farmers, hiker-campers, and treeline homesteaders around the world would like to learn some precise details about&#8230;<\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/2014\/03\/33-debav.gif\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/jurn.link\/jurnsearch\/2014\/03\/33-debav.gif\" alt=\"33-debav\" width=\"422\" height=\"288\" class=\"alignnone size-full wp-image-9795\" \/><\/a><\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/2014\/03\/3607389_orig.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/jurn.link\/jurnsearch\/2014\/03\/3607389_orig.jpg\" alt=\"3607389_orig\" width=\"529\" height=\"396\" class=\"alignnone size-large wp-image-9796\" \/><\/a><\/p>\n<p>Giant mudslides, basically.<\/p>\n<p>Incidentally, the same <em>wildfire-related debris flows<\/em> search <a href=\"http:\/\/www.jurn.org\/\">in JURN<\/a> needs to be tightened up just a little for strong results. Using <em>wildfire-related &#8220;debris flows&#8221;<\/em> works better, though the first six pages of good results do stray just a little (to pick up what seem to be three articles about prehistoric &#8216;dinosaur-era&#8217; debris flow events).  
Yet even on this test JURN appears to be doing about twice as well as Google Scholar in terms of getting open articles, once Scholar&#8217;s &#8216;false-positive&#8217; paywall PDFs from Elsevier &amp; co. are subtracted from Scholar&#8217;s results.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I found a 2013 article from geoscientists who had tested Google Scholar: &#8220;Literature searches with Google Scholar: Knowing what you &hellip;<\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/index.php\/2014\/03\/21\/on-clearing-the-debris\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3,5,9,16],"tags":[],"class_list":["post-9756","post","type-post","status-publish","format-standard","hentry","category-ecology-additions","category-how-to-improve-academic-search","category-jurns-google-watch","category-spotted-in-the-news"],"_links":{"self":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/9756","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/comments?post=9756"}],"version-history":[{"count":0,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/9756\/revisions"}],"wp:attachment":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/media?parent=9756"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/categories?post=9756"},{"taxonomy":"post_tag
","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/tags?post=9756"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}