New York Public Library collections

The new New York Public Library public domain scans website seems to have sorted out its launch difficulties. Visitors can now sort-of enjoy everything from enormous numbers of complete sets of old cigarette cards to old photography of New York City.

I say ‘sort-of’ because I found that every test item I tried was capped at 720px, and the hi-res versions are only obtainable on payment. The site’s image URL path is also hidden from Google, so one can’t use Google Images to find just the free hi-res versions, if there are any.

Overall, despite its very large scope and quality selection, the huge paywall means that it’s hardly the most exemplary presentation of public domain material.


That Google moment…

Byron Russell, manager of Ingentaconnect, wants to search only for freely re-usable Open Access articles, but finds that ‘the Google moment’ for such a search hasn’t arrived yet…

Run a Google search on “Mendelian dominance open access” and the first two hits are for one publisher – the OMICS Group.

Judging from the results I got when recreating his search, what he actually searched for was: Mendelian dominance open access, without the quote marks. It is difficult to see how such a loose search would find anything worth having. But even if he’d then gone on to say… ‘so, we need to teach students how to search Google properly…’, his article’s point would have been much the same. Even using sophisticated Google search methods, one still gets mired in a swamp of Powerpoints, K-12 lesson plans, student quizzes, wikis, high-ranking predatory journal articles and other junk.

JURN does a fairly good job with…

     Mendel “dominance” “Commons Attribution” -noncommercial

Leaving Mendel without quote marks in that way catches Mendel | Mendel’s | Mendelian, since Google automatically expands the name.
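For anyone who wants to build such searches repeatedly, here is a minimal sketch of the same query assembled in Python. The helper name build_query and its three parameters are my own invention for illustration, not part of JURN or Google. It only produces the search string, which would then be pasted into the search box by hand.

```python
def build_query(loose=(), exact=(), exclude=()):
    """Compose a search string (hypothetical helper, not a JURN or Google API).

    loose   -- words left unquoted, so the search engine may expand them
    exact   -- words or phrases wrapped in quote marks
    exclude -- words or phrases prefixed with a minus sign
    """
    parts = list(loose)
    parts += [f'"{phrase}"' for phrase in exact]
    for term in exclude:
        # A single plain word can be excluded as-is; anything containing
        # a hyphen or a space is quoted first, e.g. -"non-commercial".
        parts.append(f"-{term}" if term.isalnum() else f'-"{term}"')
    return " ".join(parts)


query = build_query(
    loose=["Mendel"],
    exact=["dominance", "Commons Attribution"],
    exclude=["noncommercial"],
)
print(query)
```

The point of keeping the three kinds of term apart is that the unquoted name stays free to be expanded, while the licence phrase stays exact.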

The target CC content, as currently found in OA journals via JURN, seems to reside almost entirely in PLOS, PubMed, Springer and a few others.

But there’s more in the hybrid journals. So one can also approximate a main Google Search across the large publishers, Elsevier for instance, via something like…

     site:www.sciencedirect.com/science/article/ “Commons Attribution” -noncommercial -“non-commercial”

For Oxford Journals it’s slightly different…

     inurl:oxfordjournals.org “Commons Attribution” -“non-commercial”

(If you’ve worked through the examples to this point, Google will probably flash up an annoying “captcha” to make sure you’re not a robot.)

And so on… one could just work through the larger publishers that way. For Springer most of the work has already been done by Paperity, although Paperity still lacks coverage of a couple of OA Springer titles.
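Working through the publishers one by one could be roughly sketched as below. This is only an illustration: the PUBLISHERS table holds just the two searches given above, the function names are hypothetical, and any further publisher would need its own site: or inurl: pattern worked out by hand. The code only builds ordinary Google search links to open in a browser; it deliberately fetches nothing, since automated querying is exactly what brings on the captcha.

```python
from urllib.parse import urlencode

# The two publisher searches from the post, stored as-is.
# Other publishers would need their own patterns (not guessed at here).
PUBLISHERS = {
    "Elsevier": 'site:www.sciencedirect.com/science/article/ '
                '"Commons Attribution" -noncommercial -"non-commercial"',
    "Oxford Journals": 'inurl:oxfordjournals.org '
                       '"Commons Attribution" -"non-commercial"',
}


def publisher_queries(topic=""):
    """Return {publisher: query}, with an optional topic word appended."""
    return {
        name: f"{base} {topic}".strip()
        for name, base in PUBLISHERS.items()
    }


def search_url(query):
    """Turn a query string into a Google search link for manual use."""
    return "https://www.google.com/search?" + urlencode({"q": query})


for name, query in publisher_queries("Mendel").items():
    print(name)
    print("  " + search_url(query))
```

Each printed link can be opened in turn, which at least saves retyping the licence phrases for every publisher.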

It’s certainly not ideal, as Russell suggests. On the other hand, one might ask why someone needs to find just the CC-BY content on a topic. Perhaps it’s actually quite useful that a big publisher would find it difficult to automatically siphon all known CC-BY articles and books into its own giant repository, slap on some search, mining, overlay journal and themed book-compiling tools, and then sell access to it.