Bag-of-Frames: Improving Bag-of-Words for a better similarity measure
Abstract
This paper introduces the Bag-of-Frames (BoF) model, a novel approach to textual document representation that extends and improves the classical Bag-of-Words (BoW) model by using VerbNet frames instead of words. While BoW treats a document as a collection of word frequencies, BoF captures the semantic content conveyed by verbal frames rather than by lexical items. Our experiments suggest that BoF can improve performance in various natural language processing tasks. The dimensionality of BoF is considerably lower than that of BoW, which reduces complexity and improves performance. BoF discovers frame-level similarity where BoW finds none, because BoW operates only at the word level. We implement BoF and present empirical results for estimating similarity between sets of sentences. A sentence, like a document, is represented by a vector, so sentences and documents can be interpreted and viewed as points in a multidimensional vector space.
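To make the contrast between word-level and frame-level similarity concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written verb-to-frame table (`TOY_FRAMES`, with made-up frame labels standing in for a real VerbNet lookup), whitespace tokenization, and cosine similarity as the vector comparison; the abstract does not specify any of these choices.

```python
import math
from collections import Counter

# Hypothetical verb-to-frame table with invented labels.
# A real system would look verbs up in VerbNet instead.
TOY_FRAMES = {
    "bought": "FRAME_OBTAIN",
    "purchased": "FRAME_OBTAIN",
    "gave": "FRAME_TRANSFER",
    "handed": "FRAME_TRANSFER",
}


def bow_vector(sentence):
    """Bag-of-Words: count every token in the sentence."""
    return Counter(sentence.lower().split())


def bof_vector(sentence, frames=TOY_FRAMES):
    """Bag-of-Frames (sketch): count the frames evoked by known verbs."""
    tokens = sentence.lower().split()
    return Counter(frames[t] for t in tokens if t in frames)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


s1 = "mary bought apples"
s2 = "john purchased fruit"

bow_sim = cosine(bow_vector(s1), bow_vector(s2))  # no shared words
bof_sim = cosine(bof_vector(s1), bof_vector(s2))  # same toy frame
print(bow_sim, bof_sim)
```

In this toy case the two sentences share no words, so their BoW vectors are orthogonal, while both verbs map to the same illustrative frame, so their BoF vectors coincide. The BoF vectors also have fewer non-zero dimensions than the BoW vectors, which mirrors the dimensionality claim above.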