{"id":21272,"date":"2018-07-14T08:59:54","date_gmt":"2018-07-14T07:59:54","guid":{"rendered":"https:\/\/jurnsearch.wordpress.com\/?p=21272"},"modified":"2018-07-14T08:59:54","modified_gmt":"2018-07-14T07:59:54","slug":"20m-open-library-books-full-text-deep-search","status":"publish","type":"post","link":"https:\/\/jurn.link\/jurnsearch\/index.php\/2018\/07\/14\/20m-open-library-books-full-text-deep-search\/","title":{"rendered":"4m Open Library books, full-text, deep search"},"content":{"rendered":"<p>You can now <a href=\"https:\/\/dev.openlibrary.org\/search\/inside?q=sir+gawain+and+the+green+knight&amp;mode=ebooks&amp;m=edit&amp;has_fulltext=true\">&#8216;search inside&#8217;<\/a> all 4m Open Library books held at Archive.org, with your search seemingly constrained to just those books (and not the jumble that Archive.org also hosts).  Nice results, with multi-snippets from deep inside the full-text of the books, plus phrase highlighting.  This looks like excellent work, and it takes advantage of new tweaks by Archive.org&#8217;s search leader Giovanni Damiola.<\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/2018\/07\/snippets.jpg\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/jurn.link\/jurnsearch\/2018\/07\/snippets.jpg\" alt=\"\" width=\"529\" height=\"424\" class=\"alignnone size-large wp-image-21273\" \/><\/a><\/p>\n<p>A serious history researcher is still going to need to pound Archive.org itself and go through everything, but at first glance this seems to be a useful time-saver for those who only need to search the upper layers of the service.<\/p>\n<p>The ultimate goal of the <a href=\"https:\/\/openlibrary.org\/\">Open Library<\/a> is &#8220;One Web page for every book ever published&#8221;.  Think of it as one of those annoying university repositories where 95% of the full-text is not available <em>yet<\/em>, but will be one day&#8230; so &#8220;here&#8217;s a record page instead&#8221;.  But in this case it&#8217;s for <em>all<\/em> books, and already has a substantial amount of full-text for free.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You can now &#8216;search inside&#8217; all 4m Open Library books held at Archive.org, with your search seemingly constrained to just &hellip;<\/p>\n<p><a href=\"https:\/\/jurn.link\/jurnsearch\/index.php\/2018\/07\/14\/20m-open-library-books-full-text-deep-search\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2,16],"tags":[],"class_list":["post-21272","post","type-post","status-publish","format-standard","hentry","category-academic-search","category-spotted-in-the-news"],"_links":{"self":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/21272","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/comments?post=21272"}],"version-history":[{"count":0,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/posts\/21272\/revisions"}],"wp:attachment":[{"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/media?parent=21272"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/categories?post=21272"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jurn.link\/jurnsearch\/index.php\/wp-json\/wp\/v2\/tags?post=21272"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}