Meta.com purchased, to be made free

In the news this week, Priscilla Chan and Mark Zuckerberg (Facebook) have purchased the academic search engine Meta, and are set to… “offer Meta’s tools free to all researchers” at some point in the future. Very nice of them.

Currently meta.com’s search is shuttered to the public, but the site is inviting sign-ups. Meta.com is not a name that’s been on the tip of my tongue, or covered here. I don’t recall if public access to it was ever available, but possibly not. Apparently the pre-Zuckerberg Meta was one of a clutch of startups trying to apply AI to a limited set of the academic literature — often in the relatively tame-but-lucrative biomedical field. I had a glancing post here on the apparently-similar Iris AI 2.0 back in November. At the search-tool level Iris AI seems to propose much the same search capabilities as Meta — but via a demo of 30m+ records harvested from repositories by CORE. In contrast the pre-Zuckerberg Meta.com covered PubMed, according to a November 2015 press-release, combining that with metadata input from “dozens of publishers”. Another November 2015 press release rather ambitiously claimed that Meta.com enabled a user to…

“navigate the entirety of scientific information (25 million papers with 4,000 new ones published daily)”.

“Ambitiously” because there’s no way that the “entirety of scientific information” in journal article form = 25m papers.

After the Zuckerberg-boosted relaunch the stated aim is to expand the functionality via third-party access…

“we will enable developers to build on it or integrate it into third party platforms and services … will embrace the ideas and efforts of researchers in the diverse fields that Meta intersects with – including machine learning, network science, ontologies, science metrics, and data visualization”.

Hopefully that opening up will also include open public access to the most juicy commercial bits of Meta.com, like the ‘early awareness’ Horizon Scanning module. This claimed to be able to descry a predictive map of future research agendas and trends…

“will enable academics and industries to maintain early awareness of emergent scientific and technical advances at a speed, scale and comprehensiveness far beyond human capacity, and years in advance”

Assuming that works as intended (I haven’t encountered any gushing reviews), I’m still not sure I’d want to rely absolutely on a predictive tool that only sees a fraction of the picture, since a mere “25 million papers” seems a little lightweight against a claim to index “the entirety of scientific information”. On the other hand, if it covers all of the output in one’s tight little niche, and has semantic links out into a spread of related and similarly delimited fields, then it could be quite useful for some people.

New Google CSE behaviour

There’s new behaviour from Google Custom Search, of relevance to Google CSE curators. In the dashboard, one can no longer edit a URL in place (for instance, make a simple update of a URL from http:// to https://) and then save it to update it in the index. Doing this now deletes the URL from the index without any warning. If you didn’t keep a backup copy of that URL pattern, you’ve lost it.

The behaviour is so remarkable and abrupt that I think perhaps it’s a temporary glitch. But for now, a wary CSE curator needs to:

1) open the indexed URL in the CSE then copy / paste it to Notepad
2) manually delete it from the URL base
3) make the corrections to the URL in Notepad
4) then paste the URL back into the CSE index, as if it were a fresh URL.

Backing up one’s CSE index (aka ‘annotations’) as .xml is probably also advisable, if glitches are indeed getting into the system.
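For those unfamiliar with the format, a backed-up annotations file is a small XML document along these lines. (This is only an illustrative sketch: the URL patterns are invented, and the `_cse_` label is a placeholder for the one shown in your own CSE’s dashboard.)

```xml
<!-- Sketch of a CSE annotations backup. URL patterns and the
     _cse_ label below are illustrative examples only. -->
<Annotations>
  <!-- Includes a whole journal site in the index -->
  <Annotation about="www.example-journal.org/*" score="1">
    <Label name="_cse_exampleid"/>
  </Annotation>
  <!-- An http-to-https correction can be made safely here,
       in the file itself, before re-uploading -->
  <Annotation about="https://press.example.ac.uk/articles/*" score="1">
    <Label name="_cse_exampleid"/>
  </Annotation>
</Annotations>
```

With a backup like this to hand, a URL pattern lost to the delete-on-edit glitch can simply be pasted back in.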

Knowledge Unlatched – 2017 round

Publishers have until 10th February 2017 to submit suggested humanities book titles to Knowledge Unlatched. Selected books are made Open Access in perpetuity, albeit usually minus the cover art/design as part of the Creative Commons PDF. Losses are defrayed by a consortium of libraries.


106 Knowledge Unlatched titles currently show up in OAPEN and thus in JURN, though 343 titles were unlatched for 2016 — which means that a lot more are coming soon.

Persistent Identifiers for the Humanities

The Victoria & Albert Museum “Persistent Identifiers for the Humanities (workshop report)”, 20th January 2017…

“… the British Library and the DataCite organisation (as part of the THOR project) organised a workshop before Christmas on this issue of ‘Persistent Identifier Services for the Humanities’.

It was apparent from the discussions in the workshop that the implementation of this infrastructure in the humanities is still very much in its infancy in all institutions. Some of the basic concepts inherited from scientific research do not seem to map directly across. For example, do humanities researchers consider their source material ‘data’? Or should we even be referring to ‘data’ as a ‘dataset’? It is not immediately obvious what the distinction between the two terms is. Is an individual museum object a dataset, or is a set of museum objects a dataset in the same way as a set of data points in scientific research can be?

A separate point of discussion is how to distinguish between the physical object, its digitised version, its associated catalogue record, and different versions of this record (as knowledge is accumulated/revised), as this is not currently clear in DataCite. A similar situation was mentioned in the sciences with ice-core samples, where different digital datasets continue to be published from the same physical ice-core samples.”