Survey on Information Requirements on the Google Books Ngram Corpus
Abstract
The development of word frequencies over time is the subject of research in different branches of the humanities. Large temporal n-gram corpora have been created for this purpose, most notably the Google Books Ngram Corpus. While the concrete research questions vary between individual studies, there are similarities in the more abstract underlying information requirements, i.e., the structure of queries against a potential database system. Based on a systematic literature review, we extract these information requirements and use them to categorize existing articles into macro-areas of information requirements. Furthermore, we collect existing query systems for temporal n-gram corpora and evaluate their expressiveness with respect to the information requirements we found.