News from JURN

~ search tool for open access content

Category Archives: Spotted in the news

128Tb SD cards

17 Tuesday Jul 2018

Posted by futurilla in Spotted in the news

128TB can now fit on a retail SD card. That’s 134 million megabytes. At five x 200KB .PDF papers per megabyte, such a postage-stamp sized SD card can now hold 600 million academic papers. It’s actually around 670 million, but I leave plenty of space for the indexing software and its index.
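For anyone who wants to check the sums, here is a rough back-of-envelope sketch in Python. It assumes binary units (1,024 × 1,024 megabytes per terabyte) and takes the "five papers per megabyte" figure at face value. The function names and the size of the reserve are my own illustration, not anything from an SD card spec.

```python
# Back-of-envelope check of the SD card sums.
# Assumptions: binary units, and five ~200 KB PDFs per megabyte as in the post.
MB_PER_TB = 1024 * 1024      # binary megabytes in one terabyte
PAPERS_PER_MB = 5            # five ~200 KB papers per megabyte


def card_megabytes(terabytes: int) -> int:
    """Capacity of the card in megabytes."""
    return terabytes * MB_PER_TB


def paper_capacity(terabytes: int, reserved_percent: int = 0) -> int:
    """Papers that fit once a percentage is set aside for index software.

    reserved_percent is a hypothetical knob; the post only says 'plenty of space'.
    """
    usable_mb = card_megabytes(terabytes) * (100 - reserved_percent) // 100
    return usable_mb * PAPERS_PER_MB


if __name__ == "__main__":
    print(f"{card_megabytes(128):,} MB on the card")
    print(f"{paper_capacity(128):,} papers with nothing reserved")
    print(f"{paper_capacity(128, 10):,} papers with 10% reserved")
```

On these assumptions, holding back roughly a tenth of the card for the index lands close to the 600 million figure.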

New type of Custom Search Engine

17 Tuesday Jul 2018

Posted by futurilla in JURN's Google watch, Spotted in the news

Google Custom Search has slightly expanded its range of services.

The Standard and Non-profit CSE services are unchanged.

They also offer a CSE via a JSON API: there’s no Google branding on that, but you pay $5 per thousand queries, and are limited to 10,000 search queries per day.

The new and fourth offering is a “Site Restricted JSON API”: it also requires the same “$5 per thousand search queries” payment. But if you search across no more than 10 URLs, then there’s no daily traffic limit.

I guess a use-case for this would be a huge corporation with a very heavily-used website, such as Boeing, which wants to offer its clients the quickest and most accurate way to search across all its technical reports, papers and manuals, provided these are spread across no more than 10 different URLs. That use-case would likely need some guarantees from Google, though, on the spread and depth of the indexing.
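To get a feel for what the two paid options might cost, here is a minimal Python sketch using only the numbers quoted above: $5 per thousand queries, a 10,000-per-day cap on the ordinary JSON API, and a 10-URL limit on the site-restricted one. It assumes billing is pro rata per query with no free allowance, which is my simplification. The function and constant names are hypothetical, and nothing here calls Google.

```python
# Rough daily cost estimate for the two paid JSON API options.
# All figures come from the post; pro-rata billing is an assumption.
PRICE_PER_THOUSAND_USD = 5.0
STANDARD_DAILY_CAP = 10_000        # ordinary JSON API limit
SITE_RESTRICTED_MAX_URLS = 10      # site-restricted JSON API limit


def daily_cost_usd(queries: int, site_restricted: bool = False, urls: int = 1) -> float:
    """Estimated cost in USD of one day's queries, or ValueError if not allowed."""
    if queries < 0:
        raise ValueError("queries must be non-negative")
    if site_restricted:
        if urls > SITE_RESTRICTED_MAX_URLS:
            raise ValueError("site-restricted searches cover at most 10 URLs")
    elif queries > STANDARD_DAILY_CAP:
        raise ValueError("ordinary JSON API is capped at 10,000 queries per day")
    return queries * PRICE_PER_THOUSAND_USD / 1000


if __name__ == "__main__":
    print(daily_cost_usd(10_000))                                  # ordinary API at its cap
    print(daily_cost_usd(250_000, site_restricted=True, urls=10))  # busy corporate site
```

So a heavily-used corporate search running a quarter of a million queries a day would only be possible on the site-restricted option, and would cost well over a thousand dollars a day at the quoted rate.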

4m Open Library books, full-text, deep search

14 Saturday Jul 2018

Posted by futurilla in Academic search, Spotted in the news

You can now ‘search inside’ all 4m Open Library books held at Archive.org, with your search seemingly constrained to just those books (and not the jumble that Archive.org also hosts). Nice results, with multi-snippets from deep inside the full-text of the books, plus phrase highlighting. This looks like excellent work, and it takes advantage of new tweaks by Archive.org’s search leader Giovanni Damiola.

A serious history researcher is still going to need to pound Archive.org itself and go through everything, but at first glance this seems to be a useful time-saver for those who only need to search the upper layers of the service.

The ultimate goal of the Open Library is “One Web page for every book ever published”. Think of it as one of those annoying university repositories where 95% of the full-text is not available yet, but will be one day… so “here’s a record page instead”. But in this case it’s for all books, and already has a substantial amount of full-text for free.

Birmingham Museums Trust images are CC0

13 Friday Jul 2018

Posted by futurilla in Spotted in the news

All public domain Birmingham Museums Trust images are now CC0. Birmingham UK, not USA. The Trust will still make a charge for use of files larger than 3MB at 300dpi. Currently the site is still offering small, low-res Web pictures, but the Trust states they… “will introduce a new Digital Asset Management System in late 2019”, when presumably we’ll get new buttons to access large files. Until then, I’m guessing that one has to manually email them to ask for a larger CC0 version. The 3MB barrier is a nice model, but AI upscaling of images will probably ensure that it only lasts a few years.

Burne-Jones, sketch for Theseus and the Minotaur in the Labyrinth.

itty.bitty

07 Saturday Jul 2018

Posted by futurilla in Open Access publishing, Spotted in the news

itty.bitty, new from the design leader at Dropbox. Itty.bitty uses the URL to contain the text of a Web page. The page can have 2,000 bytes, or about 170-200 words, if you’re going to support legacy Web browsers such as Internet Explorer.

No hosting server is required, as the data sits after the # symbol. What comes after the # is meant to be page-position related, and as such it never gets sent to the server.

The base64 link code is not pretty…

But the same link/page displays as…

How it works

This link is the page.

Scripting and hyper-linking are enabled in such pages, so long as it all fits in the URL length. The code can’t do images, but you can do old-school ASCII-art.
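The basic trick can be sketched in a few lines of Python. This is only an illustration of the idea of carrying a page after the # as base64, not itty.bitty's actual encoding: the real tool may well compress the text first, which this sketch does not. The host name is a placeholder and the function names are my own.

```python
import base64
from urllib.parse import urlsplit

BASE_URL = "https://example.invalid/"   # placeholder host, not the real service
MAX_URL_BYTES = 2000                    # legacy-browser limit quoted in the post


def page_to_link(html: str) -> str:
    """Pack a small page into the part of the URL after the # symbol."""
    payload = base64.urlsafe_b64encode(html.encode("utf-8")).decode("ascii")
    link = f"{BASE_URL}#{payload}"
    if len(link.encode("ascii")) > MAX_URL_BYTES:
        raise ValueError("page is too large for a legacy-safe URL")
    return link


def link_to_page(link: str) -> str:
    """Recover the page text from the fragment of the link."""
    fragment = urlsplit(link).fragment
    return base64.urlsafe_b64decode(fragment).decode("utf-8")


def server_request_target(link: str) -> str:
    """The path and query a browser would request; the fragment is left out."""
    parts = urlsplit(link)
    return parts.path + (f"?{parts.query}" if parts.query else "")


if __name__ == "__main__":
    link = page_to_link("<h1>How it works</h1><p>This link is the page.</p>")
    print(link)
    print(link_to_page(link))
    print(server_request_target(link))
```

The last function shows why no hosting is needed for the content: the server only ever sees the path, and the page itself stays in the reader's browser.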

The main drawback seems to be that you’re going to have to be 1000% sure that your text is exactly as you want it before you make the link… because there’s no after-post editing for errors or updating of dead hyperlinks in the page.

In which case you’d ideally consistently version and date-stamp the

How it works

bit, as…

How it works (v.0.1 | 14/07/2018)

…so that people and search-tools can discover later updated versions of the same content. Otherwise the itty.bitty system risks becoming an intertwingled mess of half-baked and old/broken stuff that you (and probably Google) won’t want in search results.
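As a rough sketch of how such a stamp could be made machine-readable, the Python below writes and reads titles in the "(v.0.1 | 14/07/2018)" pattern. It assumes day/month/year dates and dotted numeric versions. This is purely my own convention for illustration, and nothing in itty.bitty itself defines or checks it.

```python
import re
from datetime import date

# Matches e.g. "How it works (v.0.1 | 14/07/2018)"; day/month/year is assumed.
STAMP = re.compile(
    r"^(?P<title>.+?) \(v\.(?P<version>\d+(?:\.\d+)*) \| "
    r"(?P<day>\d{2})/(?P<month>\d{2})/(?P<year>\d{4})\)$"
)


def stamp_title(title: str, version: str, when: date) -> str:
    """Append a version and date stamp to a page title."""
    return f"{title} (v.{version} | {when:%d/%m/%Y})"


def parse_title(stamped: str):
    """Split a stamped title into (title, version tuple, date)."""
    m = STAMP.match(stamped)
    if m is None:
        raise ValueError("title carries no recognisable stamp")
    version = tuple(int(part) for part in m["version"].split("."))
    when = date(int(m["year"]), int(m["month"]), int(m["day"]))
    return m["title"], version, when


def is_newer(candidate: str, known: str) -> bool:
    """True if candidate is a later version of the same titled page as known."""
    c_title, c_version, c_date = parse_title(candidate)
    k_title, k_version, k_date = parse_title(known)
    return c_title == k_title and (c_version, c_date) > (k_version, k_date)


if __name__ == "__main__":
    old = stamp_title("How it works", "0.1", date(2018, 7, 14))
    new = stamp_title("How it works", "0.2", date(2018, 8, 1))
    print(old)
    print(is_newer(new, old))
```

A search tool could then group pages by title and keep only the latest stamp, though that only helps if authors stick to the pattern.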

I’m guessing that advanced Web browsers such as Brave will soon ‘add a feature’ in relation to this, by enabling much longer data-carrier code to be read from URLs. Perhaps also some simple automatic “…and can we find a later version of this itty.bitty.site?” query, done inside the browser. There would, however, also have to be some sort of dynamic ownership hash embedded in the page, to protect against impersonation of the page-author. Perhaps the system of authoring an ownership-hash and datestamp could be combined into a simple ‘one-click operation’ in a desktop authoring tool.

Anyway, it’s one example of the coming uncensorable Decentralized Web.

500px – Creative Commons close-down and a Getty-grab

01 Sunday Jul 2018

Posted by futurilla in JURN tips and tricks, Spotted in the news

Flickr-alternative 500px has announced it is set to close down the sharing of images under Creative Commons. The new owners have partnered with evil megacorp Getty and as a consequence are…

“disabling the ability for people to upload or download photos shared under Creative Commons licenses.”

So far as I can tell from tests, the CC options and search have not yet been disabled.

But it’s not that desperate in terms of effects on serious picture researchers — I mean, when did you last find a print-sized commercial-use CC picture at 500px, via Google Images? Never, in my experience. It’s probably because the 500px user-base tends strongly toward makers of naff me-too ‘stock’ and ‘tourist’ images, which are of no use to academics and historians (and of little use to discriminating stock-hunters, either). But the decision is annoying for creatives who have a 500px subscription. Which includes me, after the once-great Flickr was crashed and burned by Yahoo.

I think the way for active makers to get around the new block may be just to tag with the phrase “Creative Commons” in the keywords, and also add a ‘please freely use this image’ comment as the creator. But not to explicitly place the picture under a CC license (which it seems won’t even be an option, soon). Let’s hope the new owners of 500px are not so crass as to also go in and delete all their users’ “Creative Commons” keyword tags.

More importantly, for 500px users….

“If you’re a contributing photographer who has not opted out of distribution, your images may be selected for inclusion on Getty Images”.

I find that also applies to people who have not chosen to actively try to sell stock on the 500px site. Here’s how to prevent Getty from grabbing all your pictures, in the next day or so…

1. Go to “Settings”…

2. Find “Distribution”, tick the check-box and save.

Presumably the plan is that all the commercial-use CC 500px images show up for sale at Getty next week, and that the 500px users then have no way to pull them back and/or delete them?

WordPress at 15

29 Tuesday May 2018

Posted by futurilla in Spotted in the news

Wow, 15 years of WordPress, and still the best and most reliable and generous social-media company! It probably helps that the software itself is GPL and run by a Foundation, and that the .com side is still basically (as far as I can tell) run by the founder Matt Mullenweg. Thanks Matt!

On ResearchGate

22 Tuesday May 2018

Posted by futurilla in Academic search, Official and think-tank reports, Spotted in the news

What publishers can take away from the latest early career researcher research ($), a five-page “Industry Update” for the journal Learned Publishing, 28th April 2018…

“ResearchGate is unquestionably the scholarly elephant in the room, which despite being just 10 years old boasts 15 million research members and is still growing at a rate of knots. … publisher offerings can look monastic and parochial by comparison. […] It looks rather like the new scholarly world order.” […] “Much depends on whether ECRs [early-career-researchers] take their millennial beliefs in sharing, openness, and transparency into leadership positions. [and if] publishers [start] feeding ResearchGate rather than competing with it – [making it] a publishing Amazon”.

The Update is by the team doing an industry-supported three-year cohort study of search and similar practices. Their first two reports are Early Career Researchers: the harbingers of change? Year One 2016 and now also the Year Two 2017 report, both free and public at the same website. Apparently the cohort of 100+ is drawn entirely from science and social studies.

Also fairly new, and related, “ResearchGate and Academia.edu as networked socio-technical systems for scholarly communication: a literature review” (OA), in the Research in Learning Technology journal, 20th February 2018…

“a thorough understanding is still lacking of how these sites operate as networked socio-technical systems reshaping scholarly practices and academic identity. This article analyses 39 empirical studies published in peer-reviewed journals with a specific focus on ResearchGate and Academia.edu.”

Google Search currently suggests circa 72-million full-text PDFs at ResearchGate, although given the above Industry Update statement on ‘the 15m members’ we can probably assume some 10m of those PDFs are just CVs (which are nearly all excluded from JURN, by the way). Remove other fluff and I guess there might be circa 50m proper papers there. It would then be interesting to work out what “the uniques” are, by removing the papers freely available elsewhere in repositories and OA journals and suchlike. I’d very roughly guess that including ResearchGate PDFs in JURN may bring in some 5m to 8m papers not found elsewhere.
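The chain of guesses above can be laid out as a quick Python sketch. Every figure is one of my rough estimates from the paragraph above, in millions, and none is measured. The variable and function names are just for illustration.

```python
# The ResearchGate guesswork, in millions of PDFs. All inputs are rough
# estimates from the post, not measured figures.
TOTAL_PDFS_M = 72          # what Google Search currently suggests
CV_PDFS_M = 10             # assumed to be CVs
PROPER_PAPERS_M = 50       # guess at proper papers once other fluff is removed
UNIQUE_RANGE_M = (5, 8)    # guess at papers not freely available elsewhere


def other_fluff_m() -> int:
    """Millions of non-CV PDFs implied to be fluff by the guesses above."""
    return TOTAL_PDFS_M - CV_PDFS_M - PROPER_PAPERS_M


def unique_share() -> tuple:
    """Low and high share of proper papers that would be new to JURN."""
    low, high = UNIQUE_RANGE_M
    return low / PROPER_PAPERS_M, high / PROPER_PAPERS_M


if __name__ == "__main__":
    low, high = unique_share()
    print(f"Implied other fluff: {other_fluff_m()}m PDFs")
    print(f"Uniques: {low:.0%} to {high:.0%} of proper papers")
```

In other words, the guess amounts to saying that only about a tenth to a sixth of the proper papers at ResearchGate could not be found free somewhere else.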

New book: Shadow Libraries

21 Monday May 2018

Posted by futurilla in How to improve academic search, Spotted in the news

New from MIT Press and under CC, Shadow Libraries: Access to Educational Materials in Global Higher Education (PDF). Also available in paperback via Amazon etc. Surveys the evolution of the trend that has today become Sci-Hub, Libgen.io etc.

India culls 4,305 dubious journals

20 Sunday May 2018

Posted by futurilla in Spotted in the news

Nature India, May 2018: “India culls 4,305 dubious journals from approved list”…

“The University Grants Commission (UGC), which funds and oversees higher-education in India, has removed 4,305 spurious journals from a list of some 30,000 publications used for weighing academic performance.”

The Delhi Declaration on Open Access recently stated that there are “20,000+ journals being published from India” alone.

Please become my patron at www.patreon.com/davehaden to help JURN survive and thrive.
