News from JURN

~ search tool for open access content

Monthly Archives: July 2009

Free OCR for Google Book Search pages

05 Sunday Jul 2009

Posted by futurilla in JURN tips and tricks, JURN's Google watch

≈ 11 Comments

Ever wanted to take the hassle out of re-typing a short quote, found on Google Books? Free OCR is a simple online OCR application that might help.

To test it, I gave it a very unpromising bit of text captured from Google Books using a standard screen-capture utility — slightly skewed, slightly fuzzy, in a non-standard typeface I’m willing to bet no-one has on their system, captured as a JPG at a mere 72 dpi, and just 500 pixels wide…

[Image: ocr-test, the low-resolution test capture]

A few seconds after uploading, it gave me this…

ADVERTISEMENT.
Tms publication of the Works of Jomv KNOx, it is
supposed, will extend to F’ive Volumes. It was thought
advisable to commence the series with his History of
the Reformation in Scotland, as the work of greatest
importance. The next voliune will thus contain the
Third and Fourth Books, which continue the History to
the year 1564; at which period his historical labeurs
maybeconsideredtoterminate. ButtheFi&hBook,
forming a sequel to the History, and published under
his name in 1644, will also be included. His Letters
and Miscellaneous Writings will be arranged in the
subsequent volumes, as nearly as possible in chronolo-
gical order; each portion being introduced by a separate
notice, respecting the manuscript or printed copies from
which they have been taken.
It may perhaps be expected that a Life of the Author
should have been prefixed to this volume. The Life of
Knox., by Ds. M‘Cms, is however a work so universally
, known, and of so much historical value, as to supersede
l any attempt that might be made for a detailed bio-

Not perfect, but not bad for such a poor-quality capture. Stand-alone OCR software usually demands a much better quality source.
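The raw output above still carries the page's line breaks and end-of-line hyphens (for example "chronolo-" / "gical"), so a quote usually needs a little tidying before it can be pasted into a document. Here is a minimal sketch of that clean-up in Python, using only the standard library. The function name `tidy_ocr_text` is my own invention for illustration and has nothing to do with Free OCR itself.

```python
import re


def tidy_ocr_text(raw: str) -> str:
    """Rejoin words split by end-of-line hyphens, then unwrap the lines.

    Hypothetical helper for illustration; not part of any OCR tool.
    """
    # "chronolo-\ngical" becomes "chronological"
    joined = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Any remaining line break becomes a single space
    unwrapped = re.sub(r"\s*\n\s*", " ", joined)
    return unwrapped.strip()


# Four lines taken from the OCR output shown above
sample = (
    "subsequent volumes, as nearly as possible in chronolo-\n"
    "gical order; each portion being introduced by a separate\n"
    "notice, respecting the manuscript or printed copies from\n"
    "which they have been taken."
)

print(tidy_ocr_text(sample))
```

One caveat: a genuinely hyphenated compound that happens to break at the end of a line would lose its hyphen, so the result still needs a quick read-through. It will not fix misread characters either.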

The popular screenshot software HyperSnap v6 promises to do the same with its TextSnap feature, but for some unknown reason this feature just doesn’t work with Google Books or the captured image above. I suspect it can only handle text that uses system fonts.

So until we get a neat free OCR add-on for Firefox (a direction I would urge the makers of Free OCR to take), there is a viable and speedy workflow for OCR-ing fair-use quotes: take a screenshot, save the image, then upload the image to Free OCR. This works for Google Book Search and for other places that only offer plain page-scans.

Oh, and don’t bother doing this for books that are already in the public domain — since last month Google provides the full-text of these for download, and also serves it up via Google Book Search Mobile.

   ** Update: If you have Microsoft Office 2007 or higher, then I find that the included Microsoft OneNote works just as well for OCR on low-res images such as the one above. It also works well on most PDFs that don’t allow copy/paste. See the comments to this post for details.

intitle: works with Google Blog Search

05 Sunday Jul 2009

Posted by futurilla in JURN tips and tricks, JURN's Google watch

Here’s a useful tip for those who want better precision while wading through the Google Blog Search “blog bog”. The search modifier intitle: works with Google Blog Search.
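As a rough illustration of the modifier, here is a small Python sketch that assembles an intitle: query string. The helper name `intitle_query` and the example search terms are assumptions of mine, and the sketch only builds the text you would type into the search box; it does not call any Google service.

```python
from urllib.parse import quote_plus


def intitle_query(title_phrase: str, *other_terms: str) -> str:
    """Build a query that requires title_phrase in the post title.

    Hypothetical helper for illustration only.
    """
    # Wrap a multi-word phrase in quotes so it is matched as a phrase
    if " " in title_phrase:
        title_phrase = f'"{title_phrase}"'
    return " ".join([f"intitle:{title_phrase}", *other_terms])


query = intitle_query("open access", "repository")
print(query)              # intitle:"open access" repository
print(quote_plus(query))  # the same query, URL-encoded
```

The first line printed is what goes in the search box. The URL-encoded form is only needed if you paste the query into a search address by hand.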

Non non-destructive scanning

05 Sunday Jul 2009

Posted by futurilla in Ooops!

The man who ripped books…

“I have a sheet-fed scanner — a Fujitsu Scan Snap S510M [$350] — which works quickly. It handles about 20 sheets per minute, scanning both sides. A 200 page book takes about 5 minutes to scan. The problem is turning a bound book into sheets. I’ve been using a utility knife to cut the pages […] the knife only takes a few minutes. In less than 10 minutes I can reduce a bulky 2-3 pound book to a weightless file with all the typography, graphics and even the paper’s color preserved in a PDF.”

A better option, which means you can still sell the books afterwards. Or donate them to a library.

Or you could just run a $65 barcode scanner over the back of each book, keep them (“Books furnish a room”, etc) and then search a lot of your library via Google Books. Plus you get a record of your library for insurance purposes, in case the house burns down. Which in large sections of the American desert/scrubland is apparently a real possibility. I’d imagine it might be quite useful for scholars in repressive countries too, where one might suddenly have to flee the country without a personal library.
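The scanning time in the quote above is easy to check. At two pages per sheet, a 200-page book is 100 sheets, and at 20 sheets per minute that is 5 minutes. Here is the same arithmetic as a small Python sketch. The function name is hypothetical, and it ignores the time spent cutting the book and loading the feeder.

```python
def scan_minutes(pages: int, sheets_per_minute: float, duplex: bool = True) -> float:
    """Estimate feed time for a cut-up book on a sheet-fed scanner."""
    # A duplex scanner captures both sides, so each sheet covers two pages
    pages_per_sheet = 2 if duplex else 1
    # Round up: an odd final page still needs its own sheet
    sheets = -(-pages // pages_per_sheet)
    return sheets / sheets_per_minute


print(scan_minutes(200, 20))                # 5.0
print(scan_minutes(200, 20, duplex=False))  # 10.0
```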

Big apple

03 Friday Jul 2009

Posted by futurilla in JURN blogged

JURN is reference site of the day at the Brooklyn Public Library in New York City, the fifth largest public library in the United States.

The hidden economics of Open Access

03 Friday Jul 2009

Posted by futurilla in Economics of Open Access

Joseph Gelfer criticises aspects of the paper “But what have you done for me lately? Commercial Publishing, Scholarly Communication, and Open-Access” (2009) by John P. Conley and Myrna Wooders, with special focus on the value that paid editors can bring in terms of polishing manuscripts.

In the second half of the post, Gelfer also points out that…

“the volunteer labor on which many OA journals … are based hides the true cost of doing business. One would expect an economist to make more of this analysis, but the fact that $0 is spent on editing an OA journal does not result in zero cost. Costs come in many shapes and forms: that hour of volunteer copyediting from our editorially skilled and willing academic comes at the cost of their employer, or family, or an hour of leisure activity. … when such [OA] mandates rely on unpaid labor, they also have the potential to erase the skills of academics and publishing professionals who may otherwise reasonably demand an honest day’s pay for an honest day’s work … the glossing over of economic realities does no service to OA’s moral high-ground”

The other hidden long-term cost factor here is training. Professionals may have invested years of their life in training courses and self-learning, whereas volunteer OA editors are seemingly expected to “just know how to do it”. Not only are volunteer editors not paid (even in terms of workload allowances), they’re not paid to train for their role either.

Self-archiving after publication

02 Thursday Jul 2009

Posted by futurilla in Economics of Open Access

The Occasional Pamphlet (a law blog at Harvard) has a long and detailed posting on the issues around the public self-archiving of academic articles, after publication in an academic journal.

A tax on research

02 Thursday Jul 2009

Posted by futurilla in Economics of Open Access

Amazing. Apparently the Treasury grabs 17.5% of the cost of all online academic journals, via charging VAT (a UK sales tax) on sales to university libraries.
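To put a number on it, here is a rough Python sketch of what 17.5% VAT adds to a subscription. The 1,000.00 net price is a made-up figure for illustration, not a real journal price, and the sketch says nothing about whether a given library can reclaim any of the tax.

```python
from decimal import Decimal

VAT_RATE = Decimal("0.175")  # the 17.5% rate mentioned above


def vat_breakdown(net_price: Decimal) -> tuple[Decimal, Decimal]:
    """Return (VAT amount, gross price) for a net price."""
    # Round the tax to whole pence
    vat = (net_price * VAT_RATE).quantize(Decimal("0.01"))
    return vat, net_price + vat


# Hypothetical net subscription price
vat, gross = vat_breakdown(Decimal("1000.00"))
print(vat, gross)  # 175.00 1175.00
```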

Open Access: What are the economic benefits?

01 Wednesday Jul 2009

Posted by futurilla in Economics of Open Access

Yet another new report for your holiday deckchair reading. Open Access: What are the economic benefits? A comparison of the United Kingdom, Netherlands and Denmark is by John Houghton of the Australian Centre for Strategic Economic Studies, and is published by the Danish Knowledge Exchange…

“Open access or ‘author-pays’ publishing for journal articles (i.e. ‘Gold OA’) might bring net system savings of around […] EUR 480 million in the UK (at 2007 prices and levels of publishing activity) […] a repositories and overlay-services model may well produce similar cost savings to open access publishing.”
