Survey on Information Requirements on the Google Books Ngram Corpus

Abstract

The development of word frequencies over time is the subject of research in different branches of the humanities. Large temporal n-gram corpora have been created for this purpose, most notably the Google Books Ngram Corpus. While the concrete research questions vary between individual studies, there are similarities in the more abstract underlying information requirements, i.e., the structure of queries against a potential database system. Based on a systematic literature review, we extract these information requirements and categorize existing articles into macro-areas of information requirements. Furthermore, we collect existing query systems for temporal n-gram corpora and evaluate their expressiveness with respect to the information requirements we found.
