Research entity information and coverage in eight free access scholarly databases
Abstract
The main objective of this study is to evaluate the coverage and information quality of research entities (authors, organizations, venues, and disciplines) in eight new academic databases. Design/methodology/approach: A random Crossref sample of over 115k DOIs was chosen and subsequently searched across seven databases. Dimensions and OpenAlex are the best products at processing authors because they have the lowest percentage of authors with one publication (Dimensions, 88.1%; OpenAlex, 89.9%) and the lowest slope coefficient (old OpenAlex, α=3.25; Dimensions, α=3.46). They also show low average author variation (Dimensions, 0.12; OpenAlex, 0.17). Microsoft Academic is the database that detects the most affiliations (87%) and organizations (71.2%). Crossref-based products such as Dimensions (98.1%), Scilit (99.3%), and The Lens (96.4%) identify more venues and publishers than other products. Semantic Scholar stands out as the database that thematically classifies the most publications (94.1%). Regarding document types, the study also identifies transversal problems in the extraction and identification of entities in books and book chapters. This is the first study that compares the largest number of free-access scholarly databases, exploring the degree of completeness and the quality of the research entity information (authors, organizations, disciplines, and venues). The results of this study have important implications for selecting databases when searching the literature for reviews, meta-analyses, and other studies.