News from JURN

~ search tool for open access content

Category Archives: How to improve academic search

GeoDeepDive

04 Wednesday Jun 2014

Posted by futurilla in How to improve academic search, Spotted in the news

GeoDeepDive is software that helps…

geo-scientists extract data that is buried in the text, tables, and figures of journal articles and web sites […] As of today, GeoDeepDive has processed over 36K research papers and 134K web pages

Jisc: Action on discoverability

03 Tuesday Jun 2014

Posted by futurilla in How to improve academic search, Official and think-tank reports, Spotted in the news

David Prosser at Jisc blogs on the need for action on discoverability…

… 40% of researchers kicked off their project with a trawl through the Internet for material, while only 2% preferred to make a visit to a physical library space. [yet] nearly half of all items within digitised collections are not discoverable via major search engines by their name or title [and, even worse] digitised collections become harder and harder to find over time, for a variety of complex reasons.

Oaddo

18 Sunday May 2014

Posted by futurilla in Academic search, How to improve academic search, Spotted in the news

Oaddo is an early alpha of a cool new search tool. Imagine that Wikipedia and Pinterest combined to give autocomplete a usability makeover, with Trello acting as the makeup girl. The aim is to help you do deep ‘research search’ when you don’t really know what you’re searching for.

It has an interesting way of allowing your search terms to interact with clustered semantic tags, for drilling down to the best search result. Sort of like a Google autocomplete / autosuggest that’s slowed way down and is largely under your control, and is curated by humans — and as a consequence is not dumb.

Oaddo has a nice clean interface too, which is neatly poised between power and simplicity. The developer Tim Borny has obviously been looking at Trello and Pinterest for inspiration. At the moment, though, discarding a search modifier tag takes two clicks, instead of a fun one-click “fling it to the discard tray” movement.

The other innovation is that it aims to have a democratic user-driven model. That aspect might take Oaddo a long way, provided there’s a critical mass of people, and provided a mechanism can be found to rein in the inevitable SEO spivs, ideological censors, and WikiPolice types.

* Users will ‘vote’ on content, curate content and the database of related terms.

* The community will drive the addition of new features.

So, very interesting. Amid the sea of recent search launches, this is actually one to watch. Here’s Tim Borny’s full explanation…

Video: https://www.youtube.com/watch?v=vCGeIV9NctA

New interview on COAR and repositories

04 Sunday May 2014

Posted by futurilla in How to improve academic search, Open Access publishing, Spotted in the news

New long interview with Kathleen Shearer, Executive Director of COAR, on repositories, with a strong focus on discoverability as seen from a broad strategic perspective. From the intro and questions…

“locating and accessing content in OA repositories remains a hit and miss affair, and while many researchers now turn to Google and Google Scholar when looking for research papers, Google Scholar has not been as receptive to indexing repository collections as OA advocates had hoped. … 15 years after the Santa Fe meeting they [researchers] still find it extremely difficult, if not impossible, to search effectively in and across OA repositories”

From the interview…

“… ‘mega-journals’ are essentially repositories with overlay services. We should be participating in projects that demonstrate the added value of repositories and repository networks across the research life cycle.” (Kathleen Shearer)

Bing Predict

23 Wednesday Apr 2014

Posted by futurilla in How to improve academic search, My general observations, Spotted in the news

The Bing search engine is now offering predictions…

“… teams within Bing have been experimenting with useful ways that we can harness the power of Bing to model outcomes of events. … Today we are bringing these insights directly to our search results pages. Based on a variety of different signals including search queries and social input from Facebook and Twitter, we are unveiling an experiment we’ve built to give you our prediction of the outcome of a given event.”

The front cover of the latest Smithsonian magazine also heralds the Future Studies meme…

[Image: front cover of the latest Smithsonian magazine]

Schema.org

27 Thursday Mar 2014

Posted by futurilla in How to improve academic search, Open Access publishing, Spotted in the news

I had a quick look at the full list of Schema.org tags, which are now available in Google CSEs. They can be used to filter the CSE’s site list, serving to “Restrict pages from the above site list to only those that contain [chosen] Schema.org types”. Handy if you have a huge single site of HTML/CSS/XML that you can grep, and you want to prepare it for selective CSE search without having to juggle directories and file names.

It looks to me like those tagging open access scholarly articles would need to be able to chain Schema.org tags into something like…

CreativeWork: ScholarlyArticle: TransferAction: DownloadAction: GiveAction:

Whereas paywall publishers might need something like:

CreativeWork: ScholarlyArticle: TransferAction: DownloadAction: SellAction:

But at present there seems to be only the basic undifferentiated…

CreativeWork: ScholarlyArticle:

Even if there were workable OA additions to Schema.org, there would still be the huge problems of: i) persuading people to add the tags to all their ongoing content at the article level, and to do so correctly and consistently; and ii) having them go back and accurately tag perhaps two decades or more of existing open access articles.
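
For anyone wanting to check their own pages before relying on such a CSE filter, here is a rough sketch in Python of the “grep” idea mentioned above. It assumes the pages declare their Schema.org types via microdata `itemtype` attributes (they might equally use JSON-LD or RDFa, which this does not handle). The names `SchemaTypeCollector`, `schema_types`, `has_schema_type` and `SAMPLE_PAGE` are my own illustrative inventions, and this is not how Google’s CSE does its filtering internally.

```python
from html.parser import HTMLParser


class SchemaTypeCollector(HTMLParser):
    """Collect Schema.org type names found in microdata itemtype attributes."""

    def __init__(self):
        super().__init__()
        self.types = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Valueless attributes such as itemscope arrive with value None.
            if name == "itemtype" and value:
                # itemtype may hold several space-separated type URLs.
                for url in value.split():
                    if "schema.org/" in url:
                        # Keep the last path segment, e.g. "ScholarlyArticle".
                        self.types.add(url.rstrip("/").rsplit("/", 1)[-1])


def schema_types(html):
    """Return the set of Schema.org type names declared in an HTML string."""
    collector = SchemaTypeCollector()
    collector.feed(html)
    return collector.types


def has_schema_type(html, wanted):
    """True if the page declares the wanted Schema.org type."""
    return wanted in schema_types(html)


# Hypothetical page, for illustration only.
SAMPLE_PAGE = """
<html><body>
<div itemscope itemtype="http://schema.org/ScholarlyArticle">
  <h1 itemprop="name">An example article</h1>
</div>
</body></html>
"""

if __name__ == "__main__":
    print(has_schema_type(SAMPLE_PAGE, "ScholarlyArticle"))
```

Run over a folder of HTML files, something like this would give a quick list of which pages carry the ScholarlyArticle type and which would be dropped by the filter.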

Google Scholar’s debris

21 Friday Mar 2014

Posted by futurilla in Ecology additions, How to improve academic search, JURN's Google watch, Spotted in the news

I found a 2013 article from geoscientists who had tested Google Scholar: “Literature searches with Google Scholar: Knowing what you are and are not getting”. Although the body of the paper states that their test phrase was “wildfire-related debris flows”, the data shows they actually tested Scholar with the keywords wildfire-related debris flows. They reportedly found that…

“free articles were available in PDF format for 88% of citations returned by Google Scholar. They were available from open-access journals or via links to organizational sites where authors had posted their publications.”

However, if you actually look at their linked search-results data file, then the above statement needs additional clarification, since it’s clear that paywall articles from Elsevier, Springer and the like, appearing in their Scholar results, were being counted toward those “free articles”. It turns out that many of these were “free” only via a DigiTop proxy overlay for Scholar that is, in the words of DigiTop, “available to USDA employees only”. Nice if you work under the U.S. Department of Agriculture umbrella, but it seems that those outside have to pay.

Does Google Scholar perhaps need to add some kind of “paywall box detector” to its scraper bots? Then perhaps something like  [PDF] [-||-]  could be added on the right-hand column of the Scholar results, to indicate a PDF that’s “available maybe” — but which will prove to have a paywall that needs to be either backed out from or negotiated? And perhaps  [PDF] [-~-]  could indicate a genuine direct link to a bona fide PDF file?
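
As a very rough sketch of what such a detector might check, here is one possible test in Python. It relies only on the fact that a genuine PDF file begins with the bytes `%PDF-`, whereas a paywall or login page is normally HTML. The function names `is_real_pdf` and `badge` are hypothetical, the two badge strings are simply the ones suggested above, and I have no idea whether Scholar’s bots could or would work like this. Fetching the first bytes of each link is left out, so the example needs no network access.

```python
PDF_MAGIC = b"%PDF-"  # every PDF file starts with this header


def is_real_pdf(first_bytes):
    """True if the start of a response body looks like a genuine PDF file."""
    # Tolerate stray leading whitespace before the header.
    return first_bytes.lstrip().startswith(PDF_MAGIC)


def badge(first_bytes):
    """Pick the badge to show beside a Scholar-style '[PDF]' link."""
    if is_real_pdf(first_bytes):
        return "[PDF] [-~-]"   # direct link to a bona fide PDF
    return "[PDF] [-||-]"      # 'available maybe': probably a paywall page


if __name__ == "__main__":
    # Hypothetical response bodies, for illustration only.
    print(badge(b"%PDF-1.4\n..."))
    print(badge(b"<!DOCTYPE html><html><title>Purchase this article</title>"))
```

A real detector would also need to cope with redirects, cookies and proxy overlays of the DigiTop kind, which is where it would get difficult.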

Anyway… this is what geoscientists are talking about when they refer to wildfire-related debris flows. Seems like it might be a geological process that intelligent farmers, hiker-campers, and treeline homesteaders around the world would like to learn some precise details about…

[Images: two photographs of wildfire-related debris flows]

Giant mudslides, basically.

Incidentally, the same wildfire-related debris flows search in JURN needs to be tightened up just a little for strong results. Using wildfire-related “debris flows” works better, though the first six pages of good results do stray just a little (to pick up what seem to be three articles about prehistoric ‘dinosaur-era’ debris flow events). Yet even on this test JURN appears to be doing about twice as well as Google Scholar in terms of getting open articles, once Scholar’s ‘false-positive’ paywall PDFs from Elsevier & co. are subtracted from Scholar’s results.

‘A long time ago in a galaxy far, far away…’

10 Monday Mar 2014

Posted by futurilla in How to improve academic search

Ten years ago, today…

JISC ITT commission: A study to forecast a delivery, management & access model for eprints & open access journals within Further and Higher Education. … Access should be streamlined and free at the point of use, irrespective of the source of content.

Submission deadline:
   10th March 2004 12:00
Funding:
   £30,000

Global Social Science & Humanities Publishing 2013-2014

03 Monday Mar 2014

Posted by futurilla in How to improve academic search, Official and think-tank reports, Spotted in the news

Joseph Esposito has usefully had a peek inside a very expensive commercial market report titled Global Social Science & Humanities Publishing 2013-2014.

Social/Humanities publishing is found to be perhaps 25% of the size of Science/Technology/Medicine, at around $5bn. That actually strikes me as something of an achievement, when you consider that we have far smaller research funding inputs and a smaller technical/training infrastructure to call on. But perhaps the $5bn figure is given a strong boost by teacher training textbooks, social work manuals and the like?

Joseph highlights the report’s finding of a highly fragmented market. This market fragmentation is one of the reasons I’m skeptical about the success of a ‘one metadata to rule them all’ solution to OA indexing and discovery. It seems that DOAJ-listed OA journal titles can’t even find their way in full-text into the largest of commercial databases (such as EBSCO Complete) at a rate higher than just over 20%. When last heard of, the Web of Science / Scopus seemed to be barely scraping 1,000 OA titles indexed. One art history study found that Google Scholar could index only half the DOAJ’s OA art history titles. A dastardly conspiracy to keep OA titles out of these big indexes seems unlikely. So I suspect it’s largely due to many OA editors in the arts and humanities not giving a fig about providing the means to automatically index their content. Their widespread lack of something as basic as RSS feeds seems to confirm that. Add to that the fact that only 56% of DOAJ journals can supply the DOAJ with article metadata. Persuading non-librarian types to do something as simple as tagging all their back-issue content with some simple new machine-readable OA tag thus seems rather a long shot. Persuading mainstream publishers to do the same? Well… maybe, but what’s their incentive for that? Even if they do, will they allow mass harvesting of the OA articles? Nor are librarians likely to be of much use after the fact of publication, since they seem to have mostly failed to apply even their own metadata standards to open content, and open repository metadata quality is reported to be dire.

Towards a Google Scholar API

27 Thursday Feb 2014

Posted by futurilla in How to improve academic search, JURN's Google watch

Wouter has hacked out a Google Scholar API workflow today, sort of. I suspect the reason Scholar has never offered an API is the agreements Google has with the large commercial journal publishers and citation database providers.
