Heads up
25 Sunday Oct 2009
Posted in My general observations
Thanks to Pnoeric for the stylish new header for the JURN blog. You can see his full-size picture of the Copenhagen Old University Library at Flickr.
28 Friday Aug 2009
Posted in My general observations
I made some small tweaks to the colours, the font size and the typeface on JURN’s front page. If you’re not a designer you probably won’t notice the difference, but it should be a little easier on the eye now.
28 Sunday Jun 2009
With the release of the supposedly whippet-fast Firefox 3.5 just two days away, I’m wondering why browsers don’t do a short ‘search profile interview’ when they install. Rather like online dating ‘interviews’, I suppose, but with Google as the object of your affection rather than a Gordon/Gloria.
Then, on certain types of searches (i.e.: the vague ones) your browser would ping Google your carefully-considered ‘search profile’, and presto! — better search-results.
For example, an art historian doing a vague search for samuel palmer shoreham would never have to see results from dodgy poster websites, because the browser profile would say “my user is interested in art history and books and articles containing references”, and Google might also say “samuel palmer was a notable artist whose work is out-of-copyright”, and thus the modifiers -posters -framing -delivery would automatically be added to such a search, and pages with proper academic references would get a boost in the results.
Whereas the person whose browser profile said “frequently spends on home furnishings, subscribes to Homes & Gardens” will get the poster and prints websites pushed to the top, and the 50,000-word thesis on Christian visionary symbolism pushed to the bottom.
Yes, you could have removed those results manually (*) if you’re logged into Google, but you can only do that after the search. And most ‘vague’ searches are one-off queries that don’t tend to repeat, so manual pruning rarely gets the chance to help.
( * I had about four poster-sales sites in the first two pages of that search, yet I’m logged into Google and have been searching for academic stuff for months. Google seems to have learned little about what I want.)
Privacy issues? Well, yes. But what if the browser could seamlessly re-configure a user’s vague search terms, based on their personal profile and known interests, before the query is sent to the engine? Think “search suggestions” on steroids, and without any annoyingly dumb flickery drop-down boxes that don’t have a clue about my interests.
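As a rough illustration of the idea, here is a hypothetical sketch (in TypeScript) of how a browser might rewrite a vague query from a stored search profile before sending it to the engine. Everything in it (the profile fields, the ‘vagueness’ test, the modifier list) is invented for illustration; no real browser exposes such an API.

```typescript
// Hypothetical sketch only: a stored "search profile" used to append
// negative modifiers to vague queries. All names here are invented.

interface SearchProfile {
  interests: string[];    // e.g. ["art history", "academic articles"]
  excludeTerms: string[]; // commercial noise this user never wants
}

const artHistorian: SearchProfile = {
  interests: ["art history", "academic articles"],
  excludeTerms: ["posters", "framing", "delivery"],
};

// Crude stand-in for "is this a vague search?": short and unquoted.
function isVague(query: string): boolean {
  return !query.includes('"') && query.trim().split(/\s+/).length <= 3;
}

function rewriteQuery(query: string, profile: SearchProfile): string {
  if (!isVague(query)) return query;
  const negatives = profile.excludeTerms.map((t) => `-${t}`).join(" ");
  return `${query} ${negatives}`;
}

console.log(rewriteQuery("samuel palmer shoreham", artHistorian));
// => "samuel palmer shoreham -posters -framing -delivery"
```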
23 Tuesday Jun 2009
Posted in My general observations
A report from the recent Conference of American University Presses barely hints at the unthinkable, made thinkable by looming cuts of billions of dollars to libraries and turbulent times at unprofitable presses… the university press will have to merge with the university library.
21 Sunday Jun 2009
If there were a Firefox addon that converted URLs to icons…
There isn’t, sadly. 🙂 < of course, as you can see from that smiley, WordPress already has similar functionality, replacing a text-smiley with an icon.
18 Thursday Jun 2009
Posted in My general observations, Spotted in the news
A blog post in the U.S. Chronicle of Higher Education quickly rounds up a rash of apparently polemical books condemning a new “Twitter generation” of “dumb” young people (and, by implication, undergraduates). Others with equally shiny new books to sell claim the brains of students have been fundamentally re-wired by ‘growing up digital’, and the new breed of uber-student thus requires totally new forms of teaching by a new breed of teachers.
I’m suspicious of either extreme, and strongly doubt the existence of either a wholly slack-brained or a silicon-brained “Twitter generation”. I can’t help thinking that both these notions serve as a handy scapegoat, neatly diverting the blame for declining standards in education away from swivel-eyed socialist educational theorists and poor-quality managers, and onto a tale in which impressionable students are seduced en masse by libertine online companies.
I suspect that we’re all the Google generation — in that our superficial large-volume information-handling skills (i.e., ‘skimming’) have increased somewhat across the board (at least among those who are literate and took an interest in their education), but that almost no-one (young or old) is really excellent at searching or finding online, or at critically selecting and weaving what they find into something new.
One of the more interesting (but rarely discussed) aspects of this strong but patchy and often ramshackle shift from books and paper journals to digital learning and research is the change in how serendipity (finding something useful that you’re not looking for) and misunderstanding happen — given that serendipity and misunderstanding seem to have been small but important elements in the ‘motor’ that drove chains of cultural production during the 20th century.
Google is in some ways a ‘serendipity engine’ (if you’re not searching it correctly, which few people seem to be able to do), while StumbleUpon is also a crude approximation of one — but I wonder if we might design far more streamlined and reliable ways of unearthing chance discoveries that will have meaning for cultural producers, while retaining some of their mystery and the potential for ‘creatively misunderstanding’ them. Perhaps not, perhaps it’s impossible in a world where everything now seems to be always-already discoverable — but it might be interesting to try.
16 Tuesday Jun 2009
Having recently got up close and personal with thousands of ejournal URLs, I offer seven suggestions for those considering launching an independent open ejournal in the arts and humanities.
1. Register your own domain name. Try to make it human-readable and meaningful — e.g.: www.fabric-artists.org rather than using initials or shortened forms such as www.f-art.com. Pay for the domain and all hosted server costs up-front, for at least ten years, with a reliable commercial web hosting provider. This should not cost you more than about £600. Expensive, but it means that the university IT techies can’t capriciously juggle the root URL and thus break all your inbound links. Store all parts of the journal at your domain, calling no core content in from off-site, or from “slightly-different” URLs.
Problems solved: a) countless dead “404” links in ejournal lists and directories just a few years old, and a circa 80% attrition rate in those more than five years old; b) a niche academic search-engine indexes your home-page URL, but doesn’t also index the articles, because you’ve stored them at a different URL.
2. Consider using the URL and file name as a carrier for some basic metadata, including clearly indicating if the content is free or pay. For instance…
www.technology-history.org/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html
Here preindustrial_water_mills is formed from the first three words of the article title.
Without even accessing the document, a human can now glance at the URL in search results and read off:
Journal name (Technology History)
Issue number (Number 4)
It’s from a journal
It’s free full-text
The year published (2009)
The author surname (Adams)
The first three words of the article title (“preindustrial water mills”)
As you can see, that’s much more useful than having something impenetrable such as: www.hupt.stetford.edu/caij/admin/contentimages/38-02-106_h894.html and far better than having a huge database-driven scripted URL. You’ll exclude common words such as ‘the’ from the article title, obviously.
Problems solved: a) a useful range of basic metadata is not automatically displayed alongside a link to the journal article, other than the title (if you’re lucky) and an often-misleading text snippet; b) users accessing via a standard public search-engine have to download and manually open your article file to find out simple things like when it was published and if it’s really free full-text.
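To show how machine-readable such a scheme would be, here is a small TypeScript sketch that reads the metadata straight back out of a URL of exactly the shape suggested above. The regular expression and field names are my own, not any established standard.

```typescript
// Sketch: recover the basic metadata from the suggested URL scheme.
// Assumes exactly the shape:
//   www.name.org/journal-issue-NNN/access/year_surname_title.html

interface UrlMetadata {
  journal: string;
  issue: number;
  access: string;
  year: number;
  surname: string;
  titleWords: string;
}

function parseArticleUrl(url: string): UrlMetadata | null {
  const m = url.match(
    /^www\.([a-z-]+)\.org\/journal-issue-(\d+)\/([a-z-]+)\/(\d{4})_([a-z]+)_(.+)\.html$/
  );
  if (!m) return null;
  return {
    journal: m[1].replace(/-/g, " "),    // "technology history"
    issue: Number(m[2]),                 // 4
    access: m[3].replace(/-/g, " "),     // "free full text"
    year: Number(m[4]),                  // 2009
    surname: m[5],                       // "adams"
    titleWords: m[6].replace(/_/g, " "), // "preindustrial water mills"
  };
}

console.log(
  parseArticleUrl(
    "www.technology-history.org/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html"
  )
);
```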
3. Don’t hang admin pages directly off the main URL. Put them in their own folder, e.g.: www.full-journal-name.org/editorial-files/our_editorial_board.html
Problem solved: indexing the main domain also brings in all sorts of administrative fluff, old conference flyers, etc.
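With the admin pages gathered into one folder, excluding them takes a single rule. For instance, a conventional robots.txt entry (assuming you want the folder hidden from all well-behaved crawlers, not just one engine) might be:

```
# robots.txt at www.full-journal-name.org
User-agent: *
Disallow: /editorial-files/
```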
4. Publish in HTML, as well as in PDF.
Problem solved: PDF is print-oriented (so consider linking each issue to a POD book publisher such as Lulu), whereas people can do more interesting things with HTML (like browser addons that auto-detect and auto-link citations on a page).
5. Make sure all your articles contain basic information like: the journal title, issue number, and ideally your home-page URL in clickable form. Put this in the body text of the article. Also make sure your PDF file properties are all filled out correctly, as are your HTML headers. It’s just basic marketing really, but also useful for those who would organise knowledge.
Problem solved: A downloaded article from an open access ejournal very often has no embedded data giving the full journal title and issue number. Future generations won’t thank a researcher for telling them, “um yeah, but I once had that stuff via my personal copy of Zotero”.
6. Zero tolerance for broken URLs and 404 errors. Never ever let your IT techies or web designers change your directory structure once it’s set. If they really have to for some world-shattering technical reason, then make sure you force them to set up durable (five-year minimum) working redirects for every article, or use some server magic to make the new structure look like the old structure to the outside world.
Problems solved: a) too many dead “404” links in ejournal directories just a few years old; b) blogs and discussion forums have many broken direct links to the journal articles they’re discussing; c) there are even sometimes broken links on the journal website itself (!), caused by directory-juggling.
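The “server magic” can be as simple as a permanent redirect rule. Here is a hypothetical Apache example, assuming old folders named /issue-004/ and so on were renamed to /journal-issue-004/:

```
# Map every article under the old /issue-NNN/ paths to its new home.
RedirectMatch 301 ^/issue-(\d+)/(.*)$ http://www.full-journal-name.org/journal-issue-$1/$2
```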
7. Publicise. There’s nothing more disheartening than doing a Google search for link:www.your-established-ejournal.org — and finding that the only people who link to it are your university and a lone blog post from 2006. Being a journal on an obscure topic doesn’t mean you should be invisible. Google will bury you if you don’t have any inbound links, and (I would imagine) your authors may drift away if no-one links to or reads their articles. There’s also a whole planet out there, and the next expert in hyperkinetic light-art might be a kid sitting in a bush college in Uganda. She needs to find your excellent new article giving an overview of hyperkinetic light-art.
12 Friday Jun 2009
Following on from my previous post… a search for “open access” site:www.google.com/coop/ was discouraging. There are about twenty “living-dead” Custom Search Engines from 2006, but no large ones updated after 2006 (so far as I could tell from a quick visit).
Pouring out all this open access content is all very well, but where’s the competition and development in open access search?
And where are the simple common standards for flagging open content for search-engine discovery and sorting, for that matter? Judging by the structure and look of most academic repositories, internet search-engines are the last things on their minds.
Now of course I’m viewing things from the outside, as an independent curator and social entrepreneur, not a librarian or OA evangelist. But it seems to me that burying your PhD thesis deep in a repository cattle-car — seemingly with only a few keywords, an ugly template and an impenetrable URL for company — isn’t serving it or the author very well. Especially in terms of metadata and tagging leading to full-text search discovery. As the authors of “Experiences in Deploying Metadata Analysis Tools for Institutional Repositories” recently wrote in Cataloging & Classification Quarterly (No. 3/4, 2009)…
“Current institutional repository software provides few tools to help metadata librarians understand and analyse their collections.”
Which doesn’t bode well for search-engines aiming to hook into and sort the same metadata. That might have been acceptable in 1999, but it’s a damning statement to hear from librarians in 2009. And another paper in the same issue concludes that there is…
“a pressing need for the building of a common data model that is interoperable across digital repositories”.
Now I wouldn’t know a Dublin Core from a Dublin Pint, but how difficult would it have been to build a search-engine friendly tag that allows a repository to tell the world “this is a root free-to-all full-text file” and “you’re not going to get any full-text for this title”? Or to allow the “one-click” filtering out of science and medical-related OA material across search results from a thousand repositories?
This could be done at the URL level, for example by using a standard universal URL structure that could be read by machines and humans alike. For a journal it might run something like:
www.technology-history.org/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html
Here preindustrial_water_mills is formed from the first three words of the article title.
Without even accessing the document, a human can now glance at the URL in search results and read off:
Journal name (Technology History)
Issue number (Number 4)
It’s from a journal
It’s free full-text
The year published (2009)
The author surname (Adams)
The first three words of the article title (“preindustrial water mills”)
For a repository it could look something like:
www.uni.edu/oa-repository/free-full-text/theses/history/history-of-technology/2009_adams_preindustrial_water_mills.html
And with a uniform standard for URL structures, university IT techies would not be allowed to fiddle with the directory structure and thus break the URL. All full-text files in U.S. repositories could then be searched simply by indexing one line:
http://www.*.edu/oa-repository/free-full-text/
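As a sketch of what that one line implies for a crawler, here is a hypothetical TypeScript whitelist test; the regular expression is simply my translation of the wildcard above.

```typescript
// Sketch: a crawler whitelist test for the proposed uniform URL structure.
const oaFullText = /^https?:\/\/www\.[a-z0-9-]+\.edu\/oa-repository\/free-full-text\//;

function shouldIndex(url: string): boolean {
  return oaFullText.test(url);
}

console.log(shouldIndex(
  "http://www.uni.edu/oa-repository/free-full-text/theses/history/history-of-technology/2009_adams_preindustrial_water_mills.html"
)); // true
console.log(shouldIndex("http://www.uni.edu/admin/minutes.html")); // false
```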
Anyway, rant over. I did find a large Google CSE for Economics. Not much use for the arts and humanities you might think, and last updated in 2006, but due to its sheer size (23,613 sites from apparently reputable sources) searches for…
“creative economy” keyword
“creative industries” keyword
“art market” keyword
… all seem to show it still has some use as a discovery tool.
06 Saturday Jun 2009
Bingo! Towards a description of a low-overhead academic Firefox plugin for the discovery and sharing of free scholarly articles:
1. At the click of a toolbar button, a web-browser plugin ‘reads’ the text of an online academic paper you’re browsing (inc. a PDF if it’s opened within a browser). It seeks and finds the references / bibliography section.
The plugin automatically detects any full well-formed academic references in the standard formats, extracts and de-duplicates these, and then uses JavaScript to create simple new links. Each of these new links is built on-the-fly and embeds the exact title of the article, alongside the surname of the author. Possibly these links could be presented in a sidebar, or even as a page overlay. The plugin doesn’t try to seek or add any direct URLs for the article. (A rough sketch of this step appears at the end of this post.)
e.g.: it would search for… Craig “Werewolf Cinema of the 1930s”
3. The user can set the plugin to feed the overlay link to the main Google index or some other suitably deep search engine.
4. If the user clicks on such a “fuzzy” overlay link, they will hope to discover a direct link to the free full-text of the article near the top of the search results.
5. If they do find free full-text, then the user has the ability to click a simple feedback button on the browser toolbar. This passes the surname and article title information back to a public database of open academic article titles.
6. If it’s an article name/title the database hasn’t seen before, it will flag that unique combination of author and article title as having a reasonable probability of leading to open full-text. The database could also do automatic tagging and sorting by academic discipline, as judged by detecting common keywords and phrases in the title(s). At no time are any URLs (or downloading of pages/PDFs by the plugin database) involved in the process.
Once the database is established, in version 2.0 of the plugin a feedback loop might be created. The database would now indicate, via a three-star rating alongside the link, if the title had a high probability of being freely available somewhere. This might not need a dedicated server — the plugin might instead locally install a huge list of free article titles as detected by v1.0. For user convenience this list could be split up by discipline.
The plugin is thus using Surname “Article title” as the implicit unique identifier for the article, which is good enough for a search-engine even if it probably causes librarians to shiver in horror.
The database can’t use journal titles to determine if an article should be flagged as “likely to be free”, since journal titles are rarely extractable or detectable by search-engines in individual open-access articles. Nor can it use base URLs, since the plugin aims to completely bypass the need for direct-to-article URLs. The public database of open academic article titles would need to be hidden from search-engines, so as not to contaminate search results.
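To make steps 2 and 3 concrete, here is a minimal TypeScript sketch of the reference-detection and link-building core. Real references are far messier than the single pattern handled here, and the regex, names and example reference are all invented for illustration.

```typescript
// Minimal sketch of steps 2-3: spot a reference line, pull out the author
// surname and the quoted article title, and build a "fuzzy" search link.

interface FuzzyLink {
  query: string; // e.g. Craig "Werewolf Cinema of the 1930s"
  url: string;
}

function buildFuzzyLink(reference: string): FuzzyLink | null {
  // Handles only references shaped like:
  //   Craig, J. (1938). "Werewolf Cinema of the 1930s". Journal of ...
  const m = reference.match(/^([A-Z][a-z]+),.*?[“"]([^”"]+)[”"]/);
  if (!m) return null;
  const query = `${m[1]} "${m[2]}"`;
  return {
    query,
    url: "http://www.google.com/search?q=" + encodeURIComponent(query),
  };
}

const link = buildFuzzyLink(
  'Craig, J. (1938). "Werewolf Cinema of the 1930s". Journal of Gothic Film 2(1).'
);
console.log(link?.query); // Craig "Werewolf Cinema of the 1930s"
console.log(link?.url);
```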
26 Tuesday May 2009
Posted in My general observations
12 new titles added today, thus breaking the 2,500 mark for the number of ejournals indexed by JURN.