
News from JURN

~ search tool for open access content

Author Archives: futurilla

Ancient Athens

15 Monday Jun 2009

Posted by futurilla in Academic search

≈ Leave a comment

From the blogs. ‘Why I made JURN’, part 199…

“…it’s so difficult to find a lot of academic articles online unless you subscribe to a service like JSTOR. It took me a good couple of hours to unearth a research article the other day using Athens/JSTOR (it was so much hassle that it would almost have been easier to go to the library).”

…and this from a savvy Web 2.0 staff-development trainer, blogging from a major British university. How must some of the less able undergraduates fare?

JURN Directory checked and refreshed

15 Monday Jun 2009

Posted by futurilla in JURN metrics

≈ Leave a comment

I ran the excellent Linkbot v5 link-checking software over the JURN Directory. The results are: five dead links removed (also removed one open journal that has sold out and is now located behind a paywall at SAGE); 15 moved links re-located and repaired; and I also removed a couple of links to journals that had only marginal free content.

If you are keeping a local saved copy of the JURN Directory page on your desktop (possibly to avoid the “bouncy puppy” concertina effects, which vanish if the page is saved locally), please refresh it.

Two new full-text research assistance services

13 Saturday Jun 2009

Posted by futurilla in Academic search, Spotted in the news

≈ Leave a comment

A couple of new commercial start-ups in medical/scientific full-text research assistance services, offering to outsource some of the heavy-lifting for librarians — Pubget and the rather clunkily-named Mighty Linkout Machine. Amazingly, given the seemingly enormous resources poured into science journals and elite universities, these services are said to be needed because scientists and doctors are…

“frustrated by the challenge of getting full-text PDF access to science journal articles — even while working inside well-endowed institutions like Harvard and Oxford”

Giving free JSTOR access to alumni

13 Saturday Jun 2009

Posted by futurilla in Academic search, Spotted in the news

≈ Leave a comment

Now here’s a nice move. Southern Illinois University is giving free JSTOR access to its alumni…

“SIU alums can access JSTOR anywhere in the country after registering on the Alumni Association Web site.”

If there was one thing that would get me back in touch with my old alma mater, after having lost touch with the alumni magazine during a few house moves, that would be it.

Workload allowances for journal editing

13 Saturday Jun 2009

Posted by futurilla in Economics of Open Access, Open Access publishing

≈ 1 Comment

An interesting point from the publisher of an independent commercial academic journal…

The [ universities and their various Research Assessment Exercises ] have created, and they sustain, an academic assessment system that is very heavily dependent on academic journals, but which gives no credit whatever for the editing of such journals. The universities offer precious little encouragement (read “no material support” and “no workload allowance”) for the editing or publication of academic journals.

Common Tag and Search BOSS

13 Saturday Jun 2009

Posted by futurilla in Academic search, How to improve academic search

≈ Leave a comment

This looks somewhat interesting. Just launched, Common Tag…

“is an open tagging format developed to make [ Web ] content more connected, discoverable and engaging. Unlike free-text tags, Common Tags are references to unique, well-defined concepts, complete with metadata and their own URLs.”

From what I read, it sounds a bit like herding cats — attempting to persuade (firstly) bloggers and social bookmarkers to use standardised vocabularies and terminology for content tagging. I suspect it’ll find difficulties in gaining traction, simply due to the sheer size of the Web. Nice logo, though…

It would be interesting to see an academic version, which could auto-read a document and suggest and automatically embed (microformat or RDFa?) tags using the A&AT terms.

And I just found out about the Yahoo Search BOSS, which seems to have been around in mature form since late 08. It’s Yahoo’s competitor to Google CSE. It seems to have appeared during their recent takeover troubles, which doesn’t inspire confidence. However, it’s getting new features and appears to be under active development. New sorting functions have apparently been added to BOSS, offering sorting by date and/or a specified time range (although it seems that may be limited to custom News search?). There’s also a Python-driven mashup feature, although at present people seem to be using this to add rather naff-looking context-aware sidebars alongside search-results. There’s also a kicker in the small print…

In the near future, we will be introducing a fee structure for BOSS

If sorting by date was a feature that could be added to Google CSE results, and a keyword-targeted RSS feed was then allowed to run from that sorting, JURN could feed you a usable approximation of a rolling keyword-specific table-of-contents alert from 3,000+ ejournals. Does the current standard open access ejournal publishing software allow that sort of cross-journal alerting service, I wonder?

Getting only the free articles into JURN

13 Saturday Jun 2009

Posted by futurilla in JURN metrics

≈ Leave a comment

Someone asked about what comes into the JURN index, when a title is indexed but only offers a limited amount of free full-text or “free-sample” articles. Does the rest of the online material (link-less tables-of-contents, abstracts with no full-text links etc) from the journal also enter JURN? The answer is: no, not usually. It’s usually possible to filter at the URL level so that only the free content enters JURN. For example, by only indexing URLs such as:

http://www.journal.com/journal/sample/*.pdf

http://www.journal.edu/journalABC/documents/*.pdf

A real-world example is:

http://www.egyptpro.sci.waseda.ac.jp/pdf*/*/*.pdf

Where “*” is the Google CSE wildcard. Of course if some dimwit IT techie then decides to juggle the directory structure, it will erase the journal from JURN. But that’s a risk any directory or search-engine takes.
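
To make the URL-level filtering concrete, here is a minimal sketch in Python. It is not how Google CSE works internally; it simply assumes that “*” stands for any run of characters (slashes included), which may differ from the real matching rules. The function names and the sample URLs are made up for illustration.

```python
import re

def pattern_to_regex(pattern):
    # Assumption: "*" matches any run of characters, including "/".
    # Everything else in the pattern is treated as literal text.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + ".*".join(literal_parts) + "$")

def is_indexed(url, patterns):
    # A URL enters the index if it matches at least one pattern.
    return any(pattern_to_regex(p).match(url) for p in patterns)

# The two illustrative patterns from the post.
PATTERNS = [
    "http://www.journal.com/journal/sample/*.pdf",
    "http://www.journal.edu/journalABC/documents/*.pdf",
]

# Hypothetical URLs from the same two sites.
urls = [
    "http://www.journal.com/journal/sample/vol1-article3.pdf",
    "http://www.journal.com/journal/abstracts/vol1-article4.html",
    "http://www.journal.edu/journalABC/documents/2009/essay.pdf",
]

# Only the free-sample PDFs survive; the abstract page is left out.
kept = [u for u in urls if is_indexed(u, PATTERNS)]
```

The same sketch also shows the risk mentioned above: if the directory is renamed, no URL matches any more and the journal silently drops out.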

Sometimes a few PDFs to do with society or journal administration matters can be called into search along with the articles, if all the PDFs sit indiscriminately in a single URL path. A search for:

site:http://www.scholarly-society-journal.info/ filetype:pdf

… will usually show if there are too many of these. Google tends to bunch that sort of material at the top of site: search results. Usually there are only a dozen or so.

It’s different with the few ejournals that cheekily use standard ‘open access’ publishing software, but which actually keep recent articles locked away behind a one-year or even three-year rolling paywall. The software is not intelligent enough to place paywall article abstract pages on a different and distinctive URL path, and then to automatically transfer and bounce these when the article becomes free. But indexing only the .pdf path in such cases will usually call only full-text articles into JURN.

Open access search?

12 Friday Jun 2009

Posted by futurilla in Academic search, How to improve academic search, My general observations

≈ 1 Comment

Following on from my previous post… a search for “open access” site:www.google.com/coop/ was discouraging. There are about twenty “living-dead” Custom Search Engines from 2006, but no large ones updated after 2006 (so far as I could tell from a quick visit).

Pouring out all this open access content is all very well, but where’s the competition and development in open access search?

And where are the simple common standards for flagging open content for search-engine discovery and sorting, for that matter? Judging by the structure and look of most academic repositories, internet search-engines are the last things on their minds.

Now of course I’m viewing things from the outside, as an independent curator and social entrepreneur, not a librarian or OA evangelist. But it seems to me that burying your PhD thesis deep in a repository cattle-car (seemingly with only a few keywords, an ugly template and an impenetrable URL for company) isn’t serving it or the author very well. Especially in terms of metadata and tagging leading to full-text search discovery. As the authors of “Experiences in Deploying Metadata Analysis Tools for Institutional Repositories” recently wrote in Cataloging & Classification Quarterly (No. 3/4, 2009)…

“Current institutional repository software provides few tools to help metadata librarians understand and analyse their collections.”

Which doesn’t bode well for search-engines aiming to hook into and sort the same metadata. That sort of statement might have been acceptable in 1999, but it’s a damning statement to hear from librarians in 2009. And another paper in the same issue concludes that there is…

“a pressing need for the building of a common data model that is interoperable across digital repositories”.

Now I wouldn’t know a Dublin Core from a Dublin Pint, but how difficult would it have been to build a search-engine friendly tag that allows a repository to tell the world “this is a root free-to-all full-text file” and “you’re not going to get any full-text for this title”? Or to allow the “one-click” filtering out of science and medical-related OA material across search results from a thousand repositories?

This could be done at the URL level. For example by using a standard universal URL structure that could be read by machines and humans alike. For a journal it might run something like:

   www.technology-history.org/journal-issue-004/free-full-text/2009_adams_preindustrial_water_mills.html

Where preindustrial_water_mills are the first three words of the article title.

Without even accessing the document, a human can now glance at the URL in search results and read off:

   Journal name (Technology History)
   Issue number (Number 4)
   It’s from a journal
   It’s free full-text
   The year published (2009)
   The author surname (Adams)
   The first three words of the article title (“preindustrial water mills“)
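
A machine can read off the same fields just as easily. The following is only a sketch under the assumptions of the made-up URL scheme above (four slash-separated parts, underscores in the filename); the function name and the field names are my own invention, not part of any standard.

```python
def parse_journal_url(url):
    # Assumes exactly: host / journal-issue-NNN / access flag / filename
    host, issue_part, access, filename = url.split("/")
    # "www.technology-history.org" -> "technology-history"
    journal = host.split(".")[1]
    # "journal-issue-004" -> ("journal", "004")
    kind, _, number = issue_part.partition("-issue-")
    # Drop the ".html" extension, then split on underscores.
    stem = filename.rsplit(".", 1)[0]
    year, author, *title_words = stem.split("_")
    return {
        "journal": journal.replace("-", " ").title(),
        "kind": kind,
        "issue": int(number),
        "free_full_text": access == "free-full-text",
        "year": int(year),
        "author": author.capitalize(),
        "title_start": " ".join(title_words),
    }

record = parse_journal_url(
    "www.technology-history.org/journal-issue-004/free-full-text/"
    "2009_adams_preindustrial_water_mills.html"
)
# record["journal"] is "Technology History", record["issue"] is 4,
# record["author"] is "Adams", record["year"] is 2009.
```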

For a repository it could look something like:

   www.uni.edu/oa-repository/free-full-text/theses/history/history-of-technology/2009_adams_preindustrial_water_mills.html

And with a uniform standard for URL structures, university IT techies would not be allowed to fiddle with the directory structure and thus break the URL. All full-text files in U.S. repositories could then be searched simply by indexing one line:

http://www.*.edu/oa-repository/free-full-text/
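
As a rough sketch of what that one line would select, the check below tests the host and the path separately, so that “.edu” appearing somewhere in the path of a non-university site does not count. The function name is hypothetical, and the rule it encodes is just the imagined URL standard above, not anything a real repository currently follows.

```python
from urllib.parse import urlsplit

FULL_TEXT_PREFIX = "/oa-repository/free-full-text/"

def is_us_repository_full_text(url):
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Host must look like www.<something>.edu
    host_ok = host.startswith("www.") and host.endswith(".edu")
    # Path must sit under the agreed free-full-text directory.
    return host_ok and parts.path.startswith(FULL_TEXT_PREFIX)

thesis = (
    "http://www.uni.edu/oa-repository/free-full-text/theses/history/"
    "history-of-technology/2009_adams_preindustrial_water_mills.html"
)
is_us_repository_full_text(thesis)  # True
is_us_repository_full_text(
    "http://www.uni.edu/oa-repository/abstracts/2009_adams.html"
)  # False: not under free-full-text
```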

Anyway, rant over. I did find a large Google CSE for Economics. Not much use for the arts and humanities you might think, and last updated in 2006, but due to its sheer size (23,613 sites from apparently reputable sources) searches for…

“creative economy” keyword

“creative industries” keyword

“art market” keyword

… all seem to show it still has some use as a discovery tool.

A sea of CSEs

12 Friday Jun 2009

Posted by futurilla in Academic search, How to improve academic search

≈ 2 Comments

I had a quick look around for other Google Custom Search Engines, via a simple search for:

keyword site:www.google.com/coop/

Living-dead CSEs from circa-2006 litter the results, of course. Probably made in 30 minutes during the first flush of public interest in Google’s new toy, usually indexing less than 30 items, and then seemingly forgotten about within 30 days.

I guess that’s one of the main reasons why people don’t seem to hold specialist Google CSEs in high regard. Which probably helps to explain why a search for 2009 site:www.google.com/coop/ seems to show that a mere 39 public CSEs have either been built or updated in the last six months. It seems a shame that the academic community is fiddling with often-unlovable and quickly-stale niche wikis, while such a powerful tool is all-but unused except for an occasional private one-site index. It’s not as if CSEs don’t have tools for collaborative index-building and weeding.

With a few months of careful work by a professional or subject-specialist, there’s no reason why a CSE can’t hold its head up alongside funded/commercial services, as I hope I’ve shown with JURN. And if a developer plans ahead and uses some common tools, basic maintenance of a large curated engine, once complete, shouldn’t take more than a couple of days of work per year.

I did find a few CSEs in the humanities still showing some stamina…

Theological journal search (340+ titles inc. findarticles.com, last updated Jan 2009).

Online Biblical Studies journals (123 titles, the titles freely listed, last updated 2008).

Judaic Studies in English (278 sites, last updated Sept 2007).

Alcuin Society (139 sites on bibliophilia and book arts, last updated Oct 2008).

AuseSearch (All open access academic repositories in Australia that are listed in Kennan & Kingsley at Feb 2009).

Film Blogs (139 titles, the titles freely listed, last updated June 2009. Looks like a strong tool for quickly finding genuine reviews from film-buffs, as opposed to marketing pseudo-reviews).

Busador Cultural (a large academic-cultural-arts search-engine for Spanish-language material).

So where might there be scope for a strong new curated CSE, with a nice balance of focus and scope? It might be useful to have an engine for “books still of scholarly worth, and other useful non-fiction” which selects from the ebooks that are flooding out from the out-of-copyright book digitisation projects, indexing the full-text. Books such as Tom Wedgwood, the first photographer and Kitecraft and Kite Tournaments. There has to be a more enticing way to access this stuff than getting your keywords tangled in creaky Victorian potboilers and agricultural pamphlets from 1932, or ploughing through a daily list seemingly endlessly populated by thousands of 1920s pulp novels and Victorian romances. But I’m willing to bet that there’s no flag in the metadata which says “non-fiction / just the cool stuff”, so it might take a lot of work.

Blind Search

11 Thursday Jun 2009

Posted by futurilla in Academic search, How to improve academic search, Spotted in the news

≈ Leave a comment

The academic blog Walt at Random tries out a new search tool, Blind Search…

“You type in a search. You get back the first 10 results for each of three search engines, displayed in three parallel columns. You click on one of three ‘vote for this search engine’ buttons, based on the column of results that seem to match your query best. Then, and only then, Blind Search shows you the engine used for each column.”

Sure to be a fun ice-breaker in the hotel lobby at the First Conference on Open Access Scholarly Publishing, 14th – 16th Sept 09, Sweden.
