Classification of Bio-Data with Interval Dissimilarities: A Multidimensional Scaling Framework
Abstract
Interval-valued data (IVD) is a form of symbolic data in which observations are represented by intervals rather than single numerical values. Such data often arises when large databases are aggregated into a manageable format. Many real-world scenarios involve information that is inaccurate, unclear, or variable; in such cases, interval-valued data expresses variability and uncertainty better than point data. Multidimensional Scaling (MDS) examines the similarities and differences in data by smoothing out noise and preserving important information. Vertices Principal Component Analysis (VPCA) extends classical PCA by reconstructing IVD as classical data via a vertex representation and then applying PCA; however, for large datasets VPCA becomes computationally impractical because the vertex representation expands the data size exponentially. This limitation underscores the need for methods that operate directly on interval-valued data without transformation into classical form. One such approach is the recently proposed I-Scal method, which uses a linear transformation to unfold the structure of the data. In this paper, we introduce IMDS, a supervised Multidimensional Scaling framework for interval-valued data. IMDS reconstructs IVD while incorporating class separability directly into the stress function. This ensures that the projected data preserves within-class compactness and between-class separation, thereby enhancing discriminative visualization and analysis. For classification, we integrate the generalization of Fisher’s Linear Discriminant Analysis (FLDA) for IVD so that it operates on the transformed interval data. We evaluate the proposed method numerically on several datasets from the UCI repository in a two-step experimental design.
Performance is compared with state-of-the-art methods, including VPCA and the I-Scal algorithm, to demonstrate the effectiveness and scalability of IMDS for interval-valued data classification. Experimental results show that IMDS outperforms both baselines in classification accuracy and computational efficiency across multiple interval-valued datasets.
MSC Classification: 68T10, 62H30, 05C12, 90C90, 92-05
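To make the scalability remark about VPCA concrete, the following is a minimal sketch of a vertex representation, assuming each observation is given as a list of (lower, upper) pairs. The function names `vertex_expand` and `vpca_matrix` are hypothetical and are not taken from the paper; the sketch only illustrates why an observation with p interval variables yields 2^p classical rows.

```python
from itertools import product


def vertex_expand(intervals):
    """Return every vertex of the hyperrectangle defined by one observation.

    `intervals` is a list of (lower, upper) pairs, one per variable
    (an assumed input layout). With p variables this yields 2**p points.
    """
    return [list(vertex) for vertex in product(*intervals)]


def vpca_matrix(observations):
    """Stack the vertices of all observations into one classical data matrix."""
    rows = []
    for obs in observations:
        rows.extend(vertex_expand(obs))
    return rows


# Two observations with three interval-valued variables each.
observations = [
    [(1.0, 2.0), (10.0, 12.0), (0.5, 0.7)],
    [(3.0, 4.0), (11.0, 15.0), (0.1, 0.4)],
]
matrix = vpca_matrix(observations)
print(len(matrix), len(matrix[0]))  # number of rows, number of columns
```

With n observations and p variables the stacked matrix has n * 2^p rows, so even 20 variables already give more than a million rows per observation. This is the growth that motivates methods operating on the intervals directly.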