{"id":1584,"date":"2009-06-12T22:57:33","date_gmt":"2009-06-12T22:57:33","guid":{"rendered":"http:\/\/jurnsearch.wordpress.com\/?p=1584"},"modified":"2009-06-12T22:57:33","modified_gmt":"2009-06-12T22:57:33","slug":"open-access-search","status":"publish","type":"post","link":"https:\/\/jurn.link\/jurnsearch\/index.php\/2009\/06\/12\/open-access-search\/","title":{"rendered":"Open access search?"},"content":{"rendered":"<p>Following on from <a href=\"https:\/\/jurn.link\/jurnsearch\/2009\/06\/12\/a-sea-of-cses\/\">my previous post<\/a>&#8230; a search for <em>&#8220;open access&#8221; site:www.google.com\/coop\/<\/em> was discouraging.  There are about twenty &#8220;living-dead&#8221; Custom Search Engines from 2006, but no large ones updated after 2006 (so far as I could tell from a quick visit).<\/p>\n<p>Pouring out all this open access content is all very well, but where&#8217;s the competition and development in open access <em>search<\/em>? <\/p>\n<p>And where are the simple common standards for flagging open content for search-engine discovery and sorting, for that matter?   Judging by the structure and look of most academic repositories, internet search-engines are the last things on their minds.<\/p>\n<p>Now of course I&#8217;m viewing things from the outside,  as an independent curator and social entrepreneur, not a librarian or OA evangelist. But it seems to me that burying your PhD thesis deep in a repository cattle-car  &mdash; seemingly with only a few keywords, an ugly template and an impenetrable URL for company  &mdash; isn&#8217;t serving it or the author very well. Especially in terms of metadata and tagging leading to full-text search discovery.  As the authors of &#8220;Experiences in Deploying Metadata Analysis Tools for Institutional Repositories&#8221; recently wrote in <em><a href=\"http:\/\/catalogingandclassificationquarterly.com\/ccq47nr3-4.html\">Cataloging &amp; Classification Quarterly<\/a><\/em> (No. 
3\/4, 2009)&#8230;<\/p>\n<blockquote><p>&#8220;Current institutional repository software provides few tools to help metadata librarians understand and analyse their collections.&#8221;<\/p><\/blockquote>\n<p>Which doesn&#8217;t bode well for search-engines aiming to hook into and sort the same metadata. That sort of statement might have been acceptable in 1999, but it&#8217;s a damning statement to hear from librarians in 2009. And another paper in the same issue concludes that there is&#8230;<\/p>\n<blockquote><p>&#8220;a pressing need for the building of a common data model that is interoperable across digital repositories&#8221;.<\/p><\/blockquote>\n<p>Now I wouldn&#8217;t know a Dublin Core from a Dublin Pint, but how difficult would it have been to build a search-engine friendly tag that allows a repository to tell the world &#8220;this is a root free-to-all full-text file&#8221; and &#8220;you&#8217;re not going to get any full-text for this title&#8221;?  Or to allow the &#8220;one-click&#8221; filtering out of science and medical-related OA material across search results from a thousand repositories?<\/p>\n<p>This could be done at the URL level.  For example by using a standard universal URL structure that could be read by machines and humans alike.  For a journal it might run something like:<\/p>\n<p>&nbsp;&nbsp;&nbsp;www.technology-history.org\/journal-issue-004\/free-full-text\/2009_adams_preindustrial_water_mills.html<\/p>\n<p>Where preindustrial_water_mills are the first three words of the article title.  
<\/p>\n<p>Without even accessing the document, a human can now glance at the URL in search results and read off:<\/p>\n<p>&nbsp;&nbsp;&nbsp;Journal name (<em>Technology History<\/em>)<br \/>\n&nbsp;&nbsp;&nbsp;Issue number (<em>Number 4<\/em>)<br \/>\n&nbsp;&nbsp;&nbsp;It&#8217;s from a journal<br \/>\n&nbsp;&nbsp;&nbsp;It&#8217;s free full-text<br \/>\n&nbsp;&nbsp;&nbsp;The year published (<em>2009<\/em>)<br \/>\n&nbsp;&nbsp;&nbsp;The author surname (<em>Adams<\/em>)<br \/>\n&nbsp;&nbsp;&nbsp;The first three words of the article title (&#8220;<em>preindustrial water mills<\/em>&#8221;)<\/p>\n<p>For a repository it could look something like:<\/p>\n<p>&nbsp;&nbsp;&nbsp;www.uni.edu\/oa-repository\/free-full-text\/theses\/history\/history-of-technology\/2009_adams_preindustrial_water_mills.html<\/p>\n<p>And with a uniform standard for URL structures, university IT techies would not be allowed to fiddle with the directory structure and thus break the URL.  <em>All<\/em> full-text files in U.S. repositories could then be searched simply by indexing <em>one<\/em> line:<\/p>\n<blockquote><p>http:\/\/www.*.edu\/oa-repository\/free-full-text\/<\/p><\/blockquote>\n<p>Anyway, rant over.  I did find <a href=\"http:\/\/www.google.com\/coop\/cse?cx=018123512344280340302:tqa4kiqukzs\">a large Google CSE for Economics<\/a>.  Not much use for the arts and humanities you might think, and last updated in 2006, but due to its sheer size (23,613 sites from apparently reputable sources) searches for&#8230;<\/p>\n<blockquote><p>&#8220;creative economy&#8221; keyword<\/p>\n<p>&#8220;creative industries&#8221; keyword<\/p>\n<p>&#8220;art market&#8221; keyword<\/p><\/blockquote>\n<p>&#8230; all seem to show it still has some use as a discovery tool.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Following on from my previous post&#8230; a search for &#8220;open access&#8221; site:www.google.com\/coop\/ was discouraging. 
There are about twenty &#8220;living-dead&#8221; Custom &hellip;<\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/index.php\/2009\/06\/12\/open-access-search\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,5,10],"tags":[],"class_list":["post-1584","post","type-post","status-publish","format-standard","hentry","category-academic-search","category-how-to-improve-academic-search","category-my-general-observations"],"_links":{"self":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/1584","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/comments?post=1584"}],"version-history":[{"count":0,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/1584\/revisions"}],"wp:attachment":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/media?parent=1584"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/categories?post=1584"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/tags?post=1584"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}