Found an additional way to auto-check JURN

I’m pleased to say that I’ve found a robust way to auto-check if Google is still “seeing” content at the article-level URLs indexed by JURN. It’s a software-based solution: basically ‘dark side’ SEO software that I’ve turned to the good side. It auto-prepends the site: modifier to each of the URLs contained in the JURN index, checks whether each URL is actually indexed by Google, and then logs any wholly un-indexed URLs. It just chugs away in the background, deliberately very slowly, so as not to trigger Google’s flood-control blocking measures. But it’s certainly better than doing the checking by hand.
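For anyone curious about the general shape of such a checker, here is a minimal Python sketch of the loop described above. It is not the SEO software I actually use. The function names, the 60-second delay, and the stripping of http:// or https:// before adding site: are all my own assumptions for illustration. The is_indexed callable is a hypothetical placeholder for whatever mechanism actually asks Google about a query; none is supplied here.

```python
import time
from typing import Callable, Iterable, List


def build_site_query(url: str) -> str:
    """Prepend the site: modifier to a URL, dropping any scheme (an assumption)."""
    for scheme in ("https://", "http://"):
        if url.startswith(scheme):
            url = url[len(scheme):]
            break
    return "site:" + url


def find_unindexed(
    urls: Iterable[str],
    is_indexed: Callable[[str], bool],  # hypothetical checker, supplied by the caller
    delay_seconds: float = 60.0,        # assumed pause; slow on purpose
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Return the URLs whose site: query the checker reports as not indexed."""
    unindexed = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # wait between checks to avoid flood-control blocking
        if not is_indexed(build_site_query(url)):
            unindexed.append(url)  # log the wholly un-indexed URL
    return unindexed
```

The checker and the sleep function are passed in so that the loop can be tried out with a stand-in before pointing it at anything real.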

If you have such a list you want to check, it’s probably best to remove or cut back any URLs containing multiple wildcards such as /*/*/. Google has also been known to choke on URLs containing question marks (it can see them as evidence of someone trying a scripting exploit on Google), although I haven’t seen this happen during the checking. But if you’re doing the checking in blocks of 200, it’s not difficult to correct those sorts of URLs first.
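The tidying-up can be scripted too. The following is only a rough Python sketch under my own assumptions: "cutting back" is taken to mean trimming a pattern to just after its first wildcard, URLs with question marks are merely set aside for manual correction, and the function names are invented for the example.

```python
from typing import List, Tuple


def trim_wildcards(url: str) -> str:
    """Cut a pattern with more than one * back to just after its first *."""
    if url.count("*") > 1:
        return url[: url.index("*") + 1]
    return url


def prepare(urls: List[str]) -> Tuple[List[str], List[str]]:
    """Split a list into (ready to check, needs a manual look)."""
    ready, needs_review = [], []
    for url in urls:
        if "?" in url:
            needs_review.append(url)  # question marks may upset Google
        else:
            ready.append(trim_wildcards(url))
    return ready, needs_review


def blocks(items: List[str], size: int = 200) -> List[List[str]]:
    """Break a list into consecutive blocks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

For example, a pattern such as example.org/journal/*/*/ would be cut back to example.org/journal/*, and a list of 450 URLs would come out as three blocks of 200, 200 and 50.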

Spamming Google Scholar

Spamming Google Scholar. Very possible, or so it seems…

“…we conducted several tests on Google Scholar. The results show that academic search engine spam is indeed – and with little effort – possible: We increased rankings of academic articles on Google Scholar by manipulating their citation counts; Google Scholar indexed invisible text we added to some articles, making papers appear for keyword searches the articles were not relevant for; Google Scholar indexed some nonsensical articles we randomly created with the paper generator SciGen; and Google Scholar linked to manipulated versions of research papers that contained a Viagra advertisement.”

Beel, J. (2010)
Academic Search Engine Spam and Google Scholar’s Resilience Against it.
Journal of Electronic Publishing 13 (3), December 2010.

Four more ejournals added today

Added to the JURN site-index today:—

Journal of the Oxford University History Society

Chronicles of Oklahoma (Oklahoma history – full-text 1923 – 1962, thereafter TOCs only)

Trabalhos de Arqueologia (Portuguese, some articles in English – e.g.: “Portuguese-derived ship design methods in southern India?”)

Societas Magica Newsletter (Scholarly study of historical magic, with academic contributors and substantial articles)

Chinese journals

I seem to have neglected to mention a couple of recent articles on the state of ejournals in China:

1. An article from Nature, on China’s severe problems with academic journals

“in a Correspondence to Nature last week, Yuehong Zhang of the Journal of Zhejiang University–Science reported that a staggering 31% of the papers submitted to that campus journal contained plagiarized material (Nature 467, 153; 2010).”

2. And a long article in the New York Times

“The Lancet, the British medical journal, warned that faked or plagiarized research posed a threat to President Hu Jintao’s vow to make China a “research superpower” by 2020.”

“a recent government study in which a third of the 6,000 scientists at six of the nation’s top institutions admitted they had engaged in plagiarism or the outright fabrication of research data.”

As far as I know, no mainland Chinese journals are accessible via JURN, since the Chinese state requires them all to be kept on a central server in page-scanned image form only (i.e.: no Googleable text).