Saturday, December 24, 2005

Special Issue on Searching and Mining Literature Digital Libraries

Must read: Special Issue on Searching and Mining Literature Digital Libraries of the "Bulletin of the Technical Committee on Data Engineering".

Most interesting perhaps is a new study on the IMPACT OF OPEN ACCESS:

Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How it Increases Research Citation Impact
by Chawki Hajjem, Stevan Harnad, Yves Gingras

See the BOLD spots for remarkable info in the abstract!

Abstract

In 2001, Lawrence found that articles in computer science that were openly accessible (OA) on the Web were cited substantially more than those that were not. We have since replicated this effect in physics. To further test its cross-disciplinary generality, we used 1,307,038 articles published across 12 years (1992-2003) in 10 disciplines (Biology, Psychology, Sociology, Health, Political Science, Economics, Education, Law, Business, Management). We designed a robot that trawls the Web for full-texts using reference metadata (author, title, journal, etc.) and citation data from the Institute for Scientific Information (ISI) database. A preliminary signal-detection analysis of the robot's accuracy yielded a signal detectability d'=2.45 and bias β = 0.52. The overall percentage of OA (relative to total OA + NOA) articles varies from 5%-16% (depending on discipline, year and country) and is slowly climbing annually (correlation r=.76, sample size N=12, probability p < [...] r > .90, N=12, p < .0005) and the effect is greater with the more highly cited articles (r = .98, N=6, p < .005). Causality cannot be determined from these data, but our prior finding of a similar pattern in physics, where percent OA is much higher (and even approaches 100% in some subfields), makes it unlikely that the OA citation advantage is merely or mostly a self-selection bias (for making only one's better articles OA).
Further research will analyze the effect's timing, causal components and relation to other variables, such as download counts, journal citation averages, article quality, co-citation measures, hub/authority ranks, growth rate, longevity, and other new impact measures generated by the growing OA database.
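
For readers unfamiliar with the signal-detection figures quoted in the abstract (d' and β), here is a minimal sketch of how such values are commonly computed under the standard equal-variance Gaussian model. The abstract does not say how the authors computed theirs, so this is only an illustration: the function name and the counts in the usage note are hypothetical and are not data from the study.

```python
from math import exp
from statistics import NormalDist


def signal_detection(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, beta) under the equal-variance Gaussian model.

    Hypothetical helper for illustration; not the authors' code.
    Rates of exactly 0 or 1 are not handled (inv_cdf raises an error).
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF

    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)

    z_hit, z_fa = z(hit_rate), z(fa_rate)

    # Sensitivity: distance between the signal and noise distributions.
    d_prime = z_hit - z_fa
    # Bias: likelihood ratio at the decision criterion.
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)
    return d_prime, beta


if __name__ == "__main__":
    # Made-up counts: the robot says "OA" for 80 of 100 truly OA articles
    # and for 20 of 100 articles that are not OA.
    d_prime, beta = signal_detection(80, 20, 20, 80)
    print(f"d' = {d_prime:.2f}, beta = {beta:.2f}")
```

In this model a larger d' means the robot separates OA from non-OA articles more cleanly, while β below 1 indicates a lean toward answering "OA" and β above 1 a lean toward answering "not OA".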
