The Economic Impact of UK Arts and Humanities Research

A new report titled Leading the World: The Economic Impact of UK Arts and Humanities Research (PDF link), from the Arts & Humanities Research Council…

“it appears that the UK arts and humanities community is producing nearly as many articles as their US colleagues (over three years, the UK produced 33% and the USA 37%), even though the USA has five times our population.”

Impressive productivity, which also seems to be reflected in citations. Let’s hope it convinces; it’s the sort of report that appears before an axe-wielding government Comprehensive Spending Review stomps onto the scene.
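
To put a rough number on that productivity, here is a back-of-envelope sketch. It assumes the two percentages in the quote are directly comparable shares of the same pool of articles, and it takes the "five times our population" figure at face value; the variable names are mine.

```python
# Shares of articles over three years, as given in the quoted report.
uk_share = 33
us_share = 37

# USA population divided by UK population, as stated in the quote.
population_ratio = 5

# Output per head, UK relative to USA: (33 / 1) divided by (37 / 5).
per_capita_ratio = (uk_share / 1) / (us_share / population_ratio)

print(round(per_capita_ratio, 2))
```

On those assumptions, the UK comes out at roughly four and a half times the USA's output per head.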

Installing and using the JURN Firefox addon

You can find the free JURN add-on for the popular Firefox web browser on this page.

Simply scroll down the page until you see JURN in the list. Handily, it’s located next to JSTOR UK…

After you install it, locate the small search box alongside your browser’s address bar. You should now be able to easily toggle between JURN and other engines, via a simple drop-down menu…

Choosing JURN in this way allows you to highlight and right-click on any phrase in any web page, then have JURN do a search…

The set of results opens in a new window.

That’s it! Thanks to Pattrice Jones for making the add-on.

Fab new additions today to Google Book Search

Some fab new additions to Google Book Search:

   * A drop-down menu to navigate directly to a chapter.

   * A YouTube-like “embed this book” code snippet.

   * Sort search results by “relevance”, as well as page order.

   * Expanded Book Overview page, with reviews and more keywords.

There are a few more additions, only applying to public-domain books.

Interestingly the new contents listing doesn’t seem to wholly rely on a table-of-contents, since Google apparently has a new “structure extraction technology” which is being added to the mix.

Indexing obituary articles from selected newspapers

Now indexing the obituary articles (and only the obituary articles) from the following newspapers. In the British Isles: The Times (London), The Telegraph, The Independent, The Irish Independent, and Irish Times. In the U.S.: the L.A. Times, Bay Area Reporter, Sacramento Bee, Boston Globe, and Chicago Sun-Times.

The Guardian and the New York Times don’t provide suitable indexable URLs for their obituary pages.

The Dumbest Generation?

A blog post in the U.S. Chronicle of Higher Education quickly rounds up a rash of apparently polemical books condemning a new “Twitter generation” of “dumb” young people (and, by implication, undergraduates). Others with equally shiny new books to sell claim the brains of students have been fundamentally re-wired by ‘growing up digital’, and the new breed of uber-student thus requires totally new forms of teaching by a new breed of teachers.

I’m suspicious of either extreme, and strongly doubt the existence of either a wholly slack-brained or a silicon-brained “Twitter generation”. I can’t help thinking that both these notions can serve as a handy scapegoat — able to neatly divert the blame for lowering standards in education from swivel-eyed socialist educational theorists and poor quality managers, to a tale in which impressionable students are seduced en masse by libertine online companies.

I suspect that we’re all the Google generation, in that our superficial large-volume information-handling skills (i.e., ‘skimming’) have increased somewhat across the board (at least among those who are literate and took an interest in their education), but that almost no-one (young or old) is really excellent at searching or finding online, or at critically selecting and weaving what they find into something new.

One of the more interesting (but rarely discussed) aspects of this strong but patchy and often ramshackle shift from books / paper journals to digital learning and research, is the change in how serendipity (finding something useful that you’re not looking for) and misunderstanding happen — given that serendipity and misunderstanding seem to have been small but important elements in the ‘motor’ that drove chains of cultural production during the 20th century.

Google is in some ways a ‘serendipity engine’ (if you’re not searching it correctly, which few people seem to be able to do), while StumbleUpon is also a crude approximation of one — but I wonder if we might design far more streamlined and reliable ways of unearthing chance discoveries that will have meaning for cultural producers, while retaining some of their mystery and the potential for ‘creatively misunderstanding’ them. Perhaps not, perhaps it’s impossible in a world where everything now seems to be always-already discoverable — but it might be interesting to try.

Where Google Scholar stands on art history

Hannah Noll’s paper for her M.S. in Library Science degree, Where Google Scholar Stands on Art: An Evaluation of Content Coverage in Online Databases (PDF link, 300kb)…

“This [ 2008 ] study evaluates the content coverage of Google Scholar and three commercial databases (Arts & Humanities Citation Index, Bibliography of the History of Art, and Art Full Text/Art Index Retrospective) on the subject of art history. Each database is tested using a bibliography method and evaluated based on Peter Jacso’s scope criteria for online databases. Of the 472 articles tested [ * ] , Google Scholar indexed the smallest number of citations (35%), outshone by the Arts & Humanities Citation Index which covered 73% of the test set. This content evaluation also examines specific aspects of coverage to find that in comparison to the other databases, Google Scholar provides consistent coverage over the time range tested (1975-2008) and considerable access to article abstracts (56%). Google Scholar failed, however, to fully index the most frequently cited art periodical in the test set, Artforum International. Finally, Google Scholar’s total citation count is inflated by a significant percentage (23%) of articles which include duplicate, triplicate or multiple versions of the same record.”

* tested with a set of “article citations authored by a pre-selected set of art historians” via 12 names “culled from the Dictionary of Art Historians”, according to the paper. Authors had to be British or American, and born after 1925.

It’s interesting that Noll rejects keyword searches as a test measure…

“Searching by a compiled list of subject terms did not seem appropriate for testing Google Scholar. Google Scholar lacks a system of controlled vocabulary and search results reflect in many cases a full-text search of the document, whereas traditional databases only search the title and abstract keywords of a record.”

… yet Noll might have easily used intitle:"title of the article" with Google Scholar, to find specific articles. The intitle: search modifier is not mentioned in the paper. Instead Noll used a wider author search, then trawled the results for the target titles, but admits of this method of using Google Scholar that…

“some articles may have been impossible to find by using the author search.”
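
For anyone wanting to repeat the test title-by-title, here is a minimal sketch of how such intitle: queries could be built in bulk. It assumes Google Scholar accepts the query in a `q` parameter at scholar.google.com/scholar; the function name and the sample title are made up for illustration.

```python
from urllib.parse import urlencode


def scholar_title_query(title):
    """Build a Google Scholar search URL for one exact article title."""
    # Drop any double quotes inside the title so they cannot break the phrase.
    cleaned = title.replace('"', "")
    query = 'intitle:"{}"'.format(cleaned)
    # Assumed endpoint and parameter name; urlencode handles the escaping.
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


# Hypothetical title, purely for illustration.
print(scholar_title_query("Water Mills"))
```

Each URL would still need opening and checking by hand, but it avoids trawling through a long author search for every target title.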

Seven things any new ejournal should consider

Having recently got up close and personal with thousands of ejournal URLs, here are seven suggestions for those who are considering launching an independent open ejournal in the arts and humanities.

1.   Register your own domain name. Try to make it human-readable and meaningful — e.g.: www.fabric-artists.org rather than using initials or shortened forms such as www.f-art.com. Pay for the domain and all hosted server costs up-front, for at least ten years, with a reliable commercial web hosting provider. This should not cost you more than about £600. Expensive, but it means that the university IT techies can’t capriciously juggle the root URL and thus break all your inbound links. Store all parts of the journal at your domain, calling no core content in from off-site, or from “slightly-different” URLs.

Problems solved: a) countless dead “404” links in ejournal lists and directories just a few years old, and a circa 80% attrition rate on those more than five years old; b) a niche academic search-engine indexes your home page URL, but doesn’t also index the articles because you’ve stored them at a different URL.

2.   Consider using the URL and file name as a carrier for some basic metadata, including clearly indicating if the content is free or pay. For instance…

   www.technology-history.org/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html

Where preindustrial_water_mills are the first three words of the article title.

Without even accessing the document, a human can now glance at the URL in search results and read off:

   Journal name (Technology History)
   Issue number (Number 4)
   It’s from a journal
   It’s free full-text
   The year published (2009)
   The author surname (Adams)
   The first three words of the article title (“preindustrial water mills”)

As you can see, that’s much more useful than having something impenetrable such as:   www.hupt.stetford.edu/caij/admin/contentimages/38-02-106_h894.html   and far better than having a huge database-driven scripted URL. You’ll exclude common words such as ‘the’ from the article title, obviously.

Problems solved: a) a useful range of basic metadata is not automatically displayed alongside a link to the journal article, other than the title (if you’re lucky) and an often-misleading text snippet; b) users accessing via a standard public search-engine have to download and manually open your article file to find out simple things like when it was published and if it’s really free full-text.
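
A side benefit is that software can read the same URL as easily as a human can. The following is only a sketch, assuming every article URL follows exactly the three-part layout of the example in point 2 (issue folder, access folder, then year_author_title file name); the function name and dictionary keys are my own invention.

```python
from urllib.parse import urlparse


def parse_article_url(url):
    """Read basic metadata off a URL that follows the naming scheme above."""
    # urlparse only recognises the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)

    host = parsed.netloc
    if host.startswith("www."):
        host = host[len("www."):]
    # "technology-history.org" -> "Technology History"
    journal = host.rsplit(".", 1)[0].replace("-", " ").title()

    # Assumed layout: /journal-issue-NNN/<access>/<year>_<author>_<title>.html
    issue_dir, access_dir, filename = parsed.path.strip("/").split("/")
    stem = filename.rsplit(".", 1)[0]
    year, author, *title_words = stem.split("_")

    return {
        "journal": journal,
        "issue": int(issue_dir.rsplit("-", 1)[1]),
        "is_journal": issue_dir.startswith("journal"),
        "free_full_text": access_dir == "free-full-text",
        "year": int(year),
        "author": author.capitalize(),
        "title_start": " ".join(title_words),
    }


info = parse_article_url(
    "www.technology-history.org/journal-issue-004/free-full-text/"
    "2009_adams_preindustrial_water_mills.html"
)
print(info)
```

A URL that departs from the layout will raise an error rather than return a guess, which seems the safer behaviour for a sketch like this.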

3.   Don’t hang admin pages directly off the main URL. Put them in their own folder, e.g.: www.full-journal-name.org/editorial-files/our_editorial_board.html

Problem solved: Indexing the main domain also brings in all sorts of administrative fluff, old conference flyers, etc.

4.   Publish in HTML, as well as in PDF.

Problem solved: PDF is print-oriented (so consider linking each issue to a POD book publisher such as Lulu), but with HTML people can do more interesting things (like using browser addons that auto-detect and auto-link citations on a page).

5.   Make sure all your articles contain basic information like: the journal title, issue number, and ideally your home-page URL in clickable form. Put this in the body text of the article. Also make sure your PDF file properties are all filled out correctly, as are your HTML headers. It’s just basic marketing really, but also useful for those who would organise knowledge.

Problem solved: A downloaded article from an open access ejournal very often has no embedded data giving the full journal title and issue number. Future generations won’t thank a researcher for telling them, “um yeah, but I once had that stuff via my personal copy of Zotero”.
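
For the HTML headers, one option is to generate the tags from the same record you use for the article page. This is a hedged sketch: it assumes you choose the `citation_*` meta tag names that Google Scholar's inclusion guidelines describe, and the article record below is invented for illustration.

```python
from html import escape


def meta_tags(article):
    """Return HTML <meta> lines describing one journal article."""
    fields = [
        ("citation_title", article["title"]),
        ("citation_author", article["author"]),
        ("citation_journal_title", article["journal"]),
        ("citation_issue", article["issue"]),
        ("citation_publication_date", article["year"]),
    ]
    # Escape every value so stray quotes or ampersands cannot break the tag.
    return "\n".join(
        '<meta name="{}" content="{}">'.format(name, escape(str(value), quote=True))
        for name, value in fields
    )


# Hypothetical article record, for illustration only.
example = {
    "title": "Preindustrial Water Mills & Their Uses",
    "author": "Adams",
    "journal": "Technology History",
    "issue": 4,
    "year": 2009,
}
print(meta_tags(example))
```

The same details still belong in the visible body text of the article, since header tags do not survive printing or copy-and-paste.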

6.   Zero tolerance for broken URLs and 404 errors. Never ever let your IT techies or web designers change your directory structure once it’s set. If they really have to for some world-shattering technical reason, then make sure you force them to set up durable (five-year minimum) working redirects for every article, or use some server magic to make the new structure look like the old structure to the outside world.

Problems solved: a) too many dead “404” links in ejournals directories just a few years old; b) blogs and discussion forums have many broken direct links to journal articles they’re discussing; c) there are even sometimes broken links on the journal website itself(!) caused by directory-juggling.
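
If a restructure really cannot be avoided, the redirect table itself can be kept and tested as plain data. Below is a minimal sketch of the idea, not a server configuration: the paths are hypothetical, and a real site would hand the same mapping to whatever redirect mechanism its web server provides.

```python
# Hypothetical mapping from retired article paths to their current locations.
REDIRECTS = {
    "/old-issues/004/adams.html":
        "/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html",
}


def resolve(path, live_paths):
    """Return (status, location) for a requested path."""
    if path in live_paths:
        return 200, path
    if path in REDIRECTS:
        # 301 tells browsers and search engines the move is permanent.
        return 301, REDIRECTS[path]
    return 404, None


def broken_redirects(live_paths):
    """List old paths whose redirect target no longer exists."""
    return [old for old, new in REDIRECTS.items() if new not in live_paths]


live = {
    "/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html",
}
print(resolve("/old-issues/004/adams.html", live))
print(broken_redirects(live))
```

Running a check like `broken_redirects` after every site change is one cheap way to hold to the zero-tolerance rule.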

7.   Publicise. There’s nothing more disheartening than doing a Google search for link:www.your-established-ejournal.org — and finding that the only people who link to it are your university and a lone blog post from 2006. Being a journal on an obscure topic doesn’t mean you should be invisible. Google will bury you if you don’t have any inbound links, and (I would imagine) your authors may drift away if no-one links to or reads their articles. There’s also a whole planet out there, and the next expert in hyperkinetic light-art might be a kid sitting in a bush college in Uganda. She needs to find your excellent new article giving an overview of hyperkinetic light-art.

FireCite

Andy Hong already has a page for his 2009 undergraduate dissertation, titled “FireCite: A Browser Extension for Citation Recognition and Management”. Not yet online, it seems, but there’s an abstract…

“This dissertation describes FireCite, a Mozilla Firefox extension that incorporates a citation parser and citation recognition. The citation parser is fast, lightweight, and can parse citations from HTML web pages with an overall F-measure of 0.878. Yet it can also parse plain-text citations with an overall F-measure of 0.97, comparable to larger and more complex parsers. The citation recognizer is also fast and lightweight with a high recall of 96%. FireCite proves that it is possible to perform citation recognition and parsing with real-time response and satisfactory accuracy. FireCite itself is able to recognise citations from any web page and extract basic metadata from them.”

Minutes of the development process + background papers | Latest version (0.501 with source code, 7th June 2009 — adds .ac to automatically processed domains)…

“As you surf, this extension detects citations on the webpage. You then have the option to save the information to a reading list, along with any attached PDF file.”

It seems to get confused (reads too many non-citations as if they were citations) on some types of pages, and it’s very basic. But it’s an interesting proof-of-concept for automatic finding/reading of citations on Web pages that are “in the wild” — compared to the popular Firefox addon Zotero which needs to find a “Zotero-friendly” website such as Google Scholar or Amazon in order to do something similar.

On the cards

An interesting if journalistic report in The Hindu (14th June 09), giving some insight into the state of online knowledge access at Indian universities…

very few libraries have an online catalogue available. “Even the University of Madras does not have information on its database online,” he [G.Sundar, director of the Roja Muthiah Research Library] says. … Many universities and colleges, have however access to online archives such as JSTOR…

“very few libraries have an online catalogue available”. So, presumably, most universities in India are still using card indexes?

A Manifesto for Scholarly Publishing

“A Manifesto for Scholarly Publishing”, over at the U.S. Chronicle of Higher Education (12th Jun 09)…

“the first key to a stronger and more vital university press is in the embrace of a broader array of fields, notably the professions, including medicine; engineering; computer, environmental, information, and management sciences; graphic design; and finance. [ all of which ] are often seen as peripheral to the humanities-centered core mission of universities, and to the heavily humanities-oriented program of university presses.”

“While naysayers may argue that publishing more books on the professions subverts the university press’s historical commitment to the humanities and culture, one could counter that those professional fields are themselves coming to define culture. Think of the growing influence on society of fields such as telecommunications, financial engineering, and cognitive science, as well as the increasingly ubiquitous influence of statistics and applied mathematics in everyday communications.”