Google Translate page embed

07 Monday Dec 2015

Posted by futurilla in JURN tips and tricks, JURN's Google watch

Google Translate option embedded on a front page, as a top menu-bar. Nice. First time I’ve seen this, and I visit a lot of Brazilian and Spanish front-pages…

Seems to be done fairly simply, by adding…

… to the foot of the page’s HTML.

Some intitle: changes in JURN

04 Friday Dec 2015

Posted by futurilla in JURN tips and tricks, JURN's Google watch

≈ Leave a comment

JURN search results seem to have changed a bit recently. Specifically, those that are obtained with the popular intitle: search modifier. It seems Google is now running intitle: against the actual document title, rather than against the text that forms the hyperlink.

For instance, search via JURN for…

intitle:turtles Hawaii “longline fishing” bycatch

… and some of the ’10 blue links’ titles returned will lack the word ‘turtles’ in them. I don’t remember that happening before. Did Google break? It seems not — loading up such links shows that the fulltext does indeed have ‘turtles’ in the article title (the title that heads the actual document).

This means that users of JURN should not overlook intitle: results links that seem to lack their desired keyword or phase.

My guess is that Google Search’s document title identification and extraction is improving, behind the scenes. But that the results server is told not to waste good computational time, and so is free not to plug each and every article title into the results links. Maybe the Googleplex figures that anyone smart enough to use intitle: will pretty soon figure out that all the search results need to be considered when using intitle:, whether or not the desired intitle: keyword appears in a blue link or not.

I also noticed that Google may even be truncating the oh-so-hip preambles that are common on academic article titles in the arts and humanities. For instance, the results link that in 2009 appeared in JURN worded as…

“Home on the Range: Space, Nation, and Mobility in John Ford’s The …”

… now appears in JURN simply as…

“Space, Nation, and Mobility in John Ford’s The Searchers”

I haven’t tested this very extensively, but if I’m correct it may be more evidence of Google getting better at article title identification and manipulation, and/or at weighing a long article title against the article’s abstract. Something like…

   SPLIT document title on “:”
   MATCH both sides of “:” against the article abstract
   IF the words that occur before a “:” DO NOT MATCH words in the abstract
   THEN truncate the document title before “:”

The snippet below a JURN results link is also starting to be a citation of sorts, in certain circumstances…

seachers2015

Author surname in capitals, even. Very nice, even if it is taken from the document’s own formatting rather than from some new gee-whizz improvement in Google. Journal editors, and those slapping generic cover pages on repository PDFs, might do well to check out that particular PDF article’s front page and seek to duplicate its simplicity. Since it obviously plays so well with Google.

Google Scholar banned in China

28 Saturday Nov 2015

Posted by futurilla in JURN's Google watch, Spotted in the news

≈ 1 Comment

Paul Stapleton, associate professor at the Hong Kong Institute of Education, writes in the South China Morning Post today that… “China must unblock Google Scholar”…

… it is curious to note what has happened recently on the mainland [China]. Google Scholar is no longer available there.”

It’s a weak article but at least it’s made me aware that Scholar, as well as the main Google Search, had been blocked in mainland China. Looking back through the surprisingly sparse western news reports, I see that the Chinese national block reportedly began in earnest on 29th May 2014. The New York Times reported in September 2014 “China Clamps Down on Web, Pinching Companies Like Google”…

… blocking virtually all access to Google websites [from 29th May 2014 onwards, and … ] the block has largely remained in place ever since. […] Jin Hetian, an archaeologist in Beijing […] said. “When in China, I’m almost never able to access Google Scholar, so I’m left badly informed of the latest findings.”

Back in January 2015 The New York Times reported “China Further Tightens Grip on the Internet”…

In recent weeks, a number of Chinese academics have gone online to express their frustrations, particularly over their inability to reach Google Scholar, a search engine that provides links to millions of scholarly papers from around the world. [there is now an energy-sapping] unending scramble to find ways around website blockages… “

An April 2015 Forbes article “How The Great Firewall Prevents China From Becoming A World Education Power” failed to mention Scholar, but the journalist (visiting Shanghai at the time) opened by reminding readers that…

all things Google are all blocked [in China]”

The reason for the ban appears to be ideological. The respected Index on Censorship had an article “Return of the Red Guards: the risks faced by students and teachers criticising the government line in China” in their June 2015 issue, that opened…

Since Xi Jinping came to power nearly three years ago, China has witnessed an intense campaign against anyone who criticises the party. Recently this campaign has moved into universities and sought to muffle both teachers and students alike. […] In January 2015, the Chinese leadership released guidelines that said universities must prioritise ideological loyalty to the party, the teaching of Marxism and Xi Jinping’s ideas. In the days following this announcement, education minister Yuan Guiren announced to a room of leaders from several prominent universities that the use of Western textbooks would be restricted and any that promote “Western values” would be banned. […] “By no means allow teaching materials that disseminate Western values in our classrooms,” Yuan told the gathering. “Never allow statements that attack and slander party leaders and malign socialism to be heard in classrooms.”

Google New

26 Thursday Nov 2015

Posted by futurilla in JURN's Google watch, Spotted in the news

≈ Leave a comment

A long article in TIME magazine this week on Google’s roadmap for voice recognition / voice-controlled services on everyday platforms — such as phones, wrist-bands, smartcars, and perhaps even dolls and fridges. Robot cats, even. Sadly, TIME remarks that…

At the product meetings where Google plans out the future of its search products, the desktop is rarely discussed.”

Let’s hope that’s because their ‘new product’ marketeers are confident that desktop keyword search is still being steadily advanced, and by the world’s best techies, somewhere far below them in the Google-bunkers.

Kittenbots unleashed!

28 Wednesday Oct 2015

Posted by futurilla in JURN's Google watch, Spotted in the news

≈ Leave a comment

It seems that JURN is now partly powered by kitten-trained sort-of semantic artificial intelligence. At least when one searches for novel new academic words, or when mis-typing complex academic terms…

For the past few months, a “very large fraction” of the millions of queries a second that people type into [Google, and hence JURN] have been interpreted by an artificial intelligence system […] said Greg Corrado, a senior research scientist with the company, outlining for the first time the emerging role of AI in search. [If it]… sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.

Google Scholar and grey literature

28 Monday Sep 2015

Posted by futurilla in Academic search, JURN's Google watch, Spotted in the news

≈ Leave a comment

Interesting new paper at PLOS One, “The Role of Google Scholar in Evidence Reviews and Its Applicability to Grey Literature Searching”.

Test searches were drawn from review papers…

“…chosen as they covered a diverse range of topics in environmental management and conservation, and included interdisciplinary elements relevant to public health, social sciences and molecular biology.”

… and compared alongside Web of Science results…

Surprisingly, we found relatively little overlap between Google Scholar and Web of Science (10–67% of WoS results were returned using searches in Google Scholar using title searches).

Unsurprisingly, Google Scholar wasn’t found to be the one-stop shop many assume it to be…

… some important evidence was not identified at all by Google Scholar … [so it] should not be used as a standalone resource in evidence-gathering exercises such as systematic [literature] reviews.”

Interesting finding also that…

“Peak” grey literature content (i.e. the point at which the volume of grey literature per page of search results was at its highest and where the bulk of grey literature is found) occurred [in Google Scholar] on average at page 80 (±15 (SD)) for full text results … page 35 (± 25 (SD)) for title [search] results.”

So this suggests that one might usefully flick through to result 700 (of 1000) and work a few hundred results starting from there, if seeking grey literature with a very well-formed topic search? By well-formed I mean the sort of sophisticated literature-review style of search term chaining being used in this study, for example…

“oil palm” AND tropic* AND (diversity OR richness OR abundance OR similarity OR composition OR community OR deforestation OR “land use change” OR fragmentation OR “habitat loss” OR connectivity OR “functional diversity” OR ecosystem OR displacement)

It appears that the researchers only auto-extracted “citation records” from the search results, and then classified into broad categories based on those alone. There appears to have been no checking as to the validity of the link, and/or downloading and scrutiny of PDFs. So there are no measurements of how many of Google Scholar’s links work or lead to free no-paywall fulltext articles.

Lastly, I noted…

Google Scholar has a low threshold for repetitive activity that triggers an automated block to a user’s IP address (in our experience the export of approximately 180 citations or 180 individual searches). Thankfully this can be readily circumvented with the use of IP-mirroring software such as Hola (https://hola.org/)”

Google Books corpora

05 Saturday Sep 2015

Posted by futurilla in JURN's Google watch

≈ Leave a comment

Google Books corpora, an alternative search interface for Google Books…

This new interface for Google Books allows you to search more than 200 billion words [though it is] not an official product of Google or Google Books. Rather it was created by Mark Davies, Professor of Linguistics at Brigham Young University…”

A Google link: search suggests it’s had some notice from the field of linguistics, but not from outside. For non-linguists the tool seems to serve as a rather useful way of quickly bouncing a Google Books search through to a specific decade, without spending a minute fiddling around with the custom range date fields in the small Google Books date drop-down box. The tool may be especially useful for those who need to do this sort of Google Books search many times for many different decades.

As a test I looked for the phrase “a whit” (as in “not a whit of it” and “He had not changed a whit”), and then clicked on the link to occurrences from the period 1900-1910. I was taken to the book results on Google Books, and saw that the custom date range was automatically constrained to 1 Jan 1900 – 31 Dec 1909. There was some confusion by Google with “a Whit-sunday”, but finessing the search terms would have probably fixed that.

Google Scholar’s advantages

29 Wednesday Jul 2015

Posted by futurilla in Academic search, JURN's Google watch

≈ Leave a comment

A new blog post from Aaron Tay, “5 things Google Scholar does better than your library discovery service”, looking at the huge market advantages enjoyed by Google Scholar. The main points in summary:

* Intake and update: Google intakes, refreshes and updates very quickly.

* Automated detection: The Google bot spots and indexes academic articles wherever those are located.

* Relevancy ranking: It’s certainly not perfect, but is vastly better than anyone else’s.

* Clear and fast: Simple interface, a few useful widgets and filters. Additional features are accessed only via typed-in search modifiers or the well-hidden “Advanced” form.

* Cross-platform: Scholar can be tweaked to become a seamless gateway into paid subscription services.

I would also add…

* De-duplication in results. Not always perfect, not always even seen by the end user, but pretty intelligent.

GoogleMonkeyR fix – July 2015

23 Thursday Jul 2015

Posted by futurilla in JURN tips and tricks, JURN's Google watch

≈ Leave a comment

Changes at Google Search have broken GoogleMonkeyR in Firefox. Working fix here.

This GoogleMonkeyR fix will break an old version of Google Hit Hider By Domain, but upgrading Hit Hider to the latest 1.6.6 version will fix that too. Install 1.6.6, copy the blocklist from the old 1.6.x, import it over into 1.6.6 then disable 1.6.

The result is perfect for desktop-based ‘power searchers’ — an elegant no-scrolling at-a-glance display of Google Search results, suitable for a widescreen PC monitor and with unwanted results/domains hidden.

DOAJ to reinstate article pages

10 Friday Apr 2015

Posted by futurilla in JURN's Google watch, Spotted in the news

≈ Leave a comment

The DOAJ has announced that, sometime in 2015, “every single article entry in DOAJ will have, once again, its own landing page”, and each page will also have enhanced metadata to make it more Google Scholar friendly.

News from JURN

~ search tool for open access content

Category Archives: JURN's Google watch

Google Translate page embed

Some intitle: changes in JURN

Google Scholar banned in China

Google New

Kittenbots unleashed!

Google Scholar and grey literature

Google Books corpora

Google Scholar’s advantages

GoogleMonkeyR fix – July 2015

DOAJ to reinstate article pages