News from JURN

~ search tool for open access content

Monthly Archives: May 2018

The REF’s decision

11 Friday May 2018

Posted by futurilla in Spotted in the news

“How to use Web of Science in order to measure Open Access publications and compliance with Open Science policies to support REF claims”.

What could possibly go wrong?

Free OCR for German blackletter text

10 Thursday May 2018

Posted by futurilla in JURN tips and tricks, Spotted in the news

A free, open-source build of Tesseract OCR 4.0 for Windows (beta, 64-bit) was released on 14th April 2018.

“The Mannheim University Library uses Tesseract to perform OCR of historical German newspapers. Normally we run Tesseract on Debian GNU Linux, but there was also the need for a Windows version. That’s why we have built a Tesseract installer for Windows.”

The Tesseract engine was originally developed at Hewlett-Packard, which made it open source in 2005. Google then sponsored its development from 2006, and it was apparently in use there at Google Books.

Tesseract 4.0 supports OCR in a range of old and ancient letterforms including German blackletter (aka Fraktur, in popular parlance ‘Gothic’), but these need to be selectively enabled at install…

Once installed, there are a few Windows GUI front-ends with which to operate Tesseract. gImageReader is available for 64-bit Windows and is current. On its forums I found a gImageReader beta version that is newly compiled for the Tesseract 4.0 beta. It needs to be launched in Windows Administrator mode, and it then also seems to require a Fraktur download in order to handle OCR of German blackletter letterforms…

I’m assuming that gImageReader ‘knows’ where Tesseract 4.0 is and hooks into it automatically, because I didn’t need to set any file-paths to it in gImageReader.

Once gImageReader is set up and the Fraktur toggle/icon is switched on, the OCR results were pretty good, even when working from a screenshot…

It can also handle complete PDFs, and seems to go at about 15 pages per minute on a modern desktop PC. Nice to have and, in combination with Google Translate, useful if your research takes you back to the German literature of pre-1938 but you can’t read German, and certainly not in blackletter.
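For those who would rather skip the GUI, Tesseract can also be driven from the command line. Below is a minimal Python sketch of that, not something I tested for this post. It assumes that the `tesseract` program is on the system PATH, that the Fraktur data was selected at install, and that it goes by the language code `frk`. The function names and the `page.png` file name are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="frk"):
    """Return the argument list for one Tesseract command-line OCR run.

    Tesseract adds the ".txt" extension to output_base by itself.
    The "frk" (Fraktur) language code is an assumption about what is installed.
    """
    return ["tesseract", str(image_path), str(output_base), "-l", lang]


def ocr_image(image_path, lang="frk"):
    """OCR one image and return the text, or None if Tesseract is not on the PATH."""
    if shutil.which("tesseract") is None:
        return None
    image_path = Path(image_path)
    # Write the text file next to the image, e.g. page.png -> page.txt
    output_base = image_path.with_suffix("")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return Path(str(output_base) + ".txt").read_text(encoding="utf-8")


if __name__ == "__main__":
    # "page.png" is a hypothetical scan of a blackletter page.
    print(build_tesseract_command("page.png", "page"))
```

Splitting out the command-building step means the command can be inspected before anything is actually run.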

There are probably online sign-up services that can do the same, these days, where you do a sluggish upload and have to deal with time-outs and usage-quotas etc. But I prefer the ease of having one’s own Windows desktop software.

Google Translate does PDFs

10 Thursday May 2018

Posted by futurilla in JURN tips and tricks, My general observations

New to me: Google Translate now works on foreign-language PDFs. Perhaps it’s been available for a while, but I’ve seen no-one blogging about it.

It doesn’t work if you just right-click on the Web link to the PDF in, say, Google Scholar or JURN search results, and then select “Translate this page…”.

Instead you have to:

1) Right-click, and copy to the clipboard the direct PDF link.
2) Visit Google Translate, manually paste in the URL you just copied.
3) Click on the URL that appears over in the facing box.
4) The PDF text appears extracted, in the form of a Web page, and translated.

Very useful, and I had excellent results with a Polish article I tested. I had the whole article translated, too, not just the first few paragraphs. Longer items such as a PhD thesis will be refused as “too long”.

Note that a ‘redirect URL’, which gives the PDF but hides the direct URL link to the PDF, is of no use in the above workflow.
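The paste-and-click steps above might also be scripted. The following is only a sketch. The `translate.google.com/translate?sl=…&tl=…&u=…` address format is my assumption rather than anything the steps above confirm, and Google may change it. The `.pdf` check is just a rough heuristic for spotting a direct link as opposed to a redirect URL. The function names and the example.org address are hypothetical.

```python
from urllib.parse import urlencode, urlparse


def looks_like_direct_pdf(url):
    """Rough heuristic: the path part of the URL ends in .pdf."""
    return urlparse(url).path.lower().endswith(".pdf")


def translate_url(pdf_url, target="en", source="auto"):
    """Build a Google Translate address for a direct PDF link.

    The address format used here is an assumption, not a documented API.
    """
    query = urlencode({"sl": source, "tl": target, "u": pdf_url})
    return "https://translate.google.com/translate?" + query


if __name__ == "__main__":
    link = "https://example.org/paper.pdf"  # hypothetical direct PDF link
    if looks_like_direct_pdf(link):
        print(translate_url(link))
    else:
        print("Probably a redirect URL, so find the direct PDF link first.")
```

A link that fails the check is not necessarily useless, but it is a hint to look for the direct PDF address before pasting.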

Sadly I guess it’s also a route to plagiarism for students. I’d suggest that the anti-plagiarism detector-bot services might usefully build a bank of Google-translated theses and dissertations, to add to their phrase-detection sources. Teachers who mark suspiciously-excellent final dissertations, and who are then inclined ‘to go on the hunt’, should also be aware of the possibility that the lacklustre student may have run a foreign dissertation through Google Translate and then lightly re-written it for clarity in English.

“Something’s wrong in the Library”

05 Saturday May 2018

Posted by futurilla in Spotted in the news

Old-school point-n’-click videogame fun with the new free The Librarian, for Windows or Mac. The graphics style is deliberately retro (it’s a hipster thing).

Video: https://www.youtube.com/watch?v=W81wa0VYlpI

WordPress User Jargon Glossary

05 Saturday May 2018

Posted by futurilla in Spotted in the news

A new WordPress User Jargon Glossary, offering useful brain-jangling reminders in Plain English. Or, in WordPress-speak: ‘Post-Slug Pingbacks for your Metabox’.

Dialling it back

04 Friday May 2018

Posted by futurilla in Spotted in the news

A preprint, just arrived on SocArXiv: “Digital blackout of Spanish scientific production in Google Scholar”…

“An abrupt drop in the number of Spanish scientific journals covered in [since] the last edition of Google Scholar Metrics (2012-2016) has been detected. […] After considering several hypothesis to explain this phenomenon, we conclude that the main cause was the sudden disappearance of the Spanish bibliographic database Dialnet from Google Scholar.”

I’d add that parts of Ex Libris also summarily removed Dialnet in July 2017…

“all titles will be removed from Dialnet database in the Knowledgebase on July 20, 2017. The database will become a zero-titles database.”

This might suggest that the Google Scholar cut-out, apparently of some 2m Dialnet items, was just ‘an up-stream -> down-stream thing’ that flowed into Google Scholar, perhaps due to the way its automated inputs from partners are set up. Just my guess.

More GRAFT-ing

04 Friday May 2018

Posted by futurilla in My general observations, New titles added to JURN

GRAFT has just had another tranche of new URLs added to its index. It now searches across 4,640 university repositories, full-text and records alike.

Newberry Library makes 1.7m images free to use

03 Thursday May 2018

Posted by futurilla in Spotted in the news

The Newberry Library has made its 1.7m images free to re-use, including for commercial purposes…

“users can share and re-use images derived from the library’s collection for any purpose without having to pay licensing or permissions fees to the Newberry. There are currently over 1.7 million Newberry digital images freely accessible online.”

Picture: Norman Rockwell, “Rosie the Riveter”, 1943. Not sure that Norman Rockwell is really public domain, but it’s nice to have in high-res.

‘Discoverability of award-winning undergraduate research in history’

03 Thursday May 2018

Posted by futurilla in Academic search, Spotted in the news

New paper: “The discoverability of award-winning undergraduate research in history: Implications for academic libraries”, College & Undergraduate Libraries, April 2018…

“eight of the fifteen papers could be found in full text. If full text was available somewhere, Google always found it. Google Scholar only found four of the eight full-text papers […] Microsoft Academic found two of the full-text papers”

BBC releases 16,000 sound effects in .WAV

02 Wednesday May 2018

Posted by futurilla in Spotted in the news

The BBC has released 16,000 sound effects for “personal, educational, or research purposes”, at BBC Sound Effects (beta). Format is .WAV. Search results were so fast that I didn’t initially realise that they’d been returned.
