Detection of Speech Steganography for VoIP Stream Based on Deep Learning Approach in G.729 Codec


Abstract

In recent years, Voice over Internet Protocol (VoIP) has been widely used in real-time communication and social networks, which has made it a practical carrier for steganography and covert communication. To counter this security threat, numerous steganalysis approaches have been developed; among them, combining signal processing with machine learning has produced highly accurate steganalyzers. This paper proposes a hybrid approach that combines speech signal processing with Artificial Intelligence (AI). Audio signals compressed with the G.729 codec are first preprocessed to extract intra-frame features and inter-frame correlations. The resulting data are fed into a deep learning network, which is trained to distinguish cover data from stego data. The evaluation shows significant improvements in both detection accuracy and computational efficiency. The proposed technique is evaluated on two major steganography families, Quantization Index Modulation (QIM) and Pitch Modulation Steganography (PMS), as well as their combined application, Heterogeneous Parallel Steganography (HPS). The method is tested at embedding rates from 10% to 100% and audio segment lengths from 100 ms to 1000 ms. For all three techniques (QIM, PMS, and HPS), detection accuracy improves by 1–11% over conventional methods, depending on segment length and embedding rate. In the steganalysis testing phase, the response time for 1000 ms audio files was less than 5 ms, highlighting the speed of the proposed model at test time. In addition, the deep learning model's architecture runs faster than similar approaches in both the training step (in most cases, a time improvement of at least ¼) and the testing step (in most cases, a time improvement of at least ⅓).
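
To make the segment lengths and embedding rates concrete, the following minimal sketch converts them into frame counts. It relies on the fact that G.729 encodes speech in 10 ms frames of 80 bits (8 kbit/s). The function names are hypothetical, and reading the embedding rate as the percentage of frames that carry hidden data is an assumption for illustration; the abstract does not define it.

```python
FRAME_MS = 10    # G.729 frame duration in milliseconds
FRAME_BITS = 80  # bits per G.729 frame at 8 kbit/s


def frames_per_segment(segment_ms: int) -> int:
    """Number of G.729 frames in a segment of the given length."""
    if segment_ms <= 0 or segment_ms % FRAME_MS != 0:
        raise ValueError("segment length must be a positive multiple of 10 ms")
    return segment_ms // FRAME_MS


def segment_bits(segment_ms: int) -> int:
    """Total coded bits in a segment of the given length."""
    return frames_per_segment(segment_ms) * FRAME_BITS


def embedded_frames(segment_ms: int, embedding_rate_pct: int) -> int:
    """Frames carrying hidden data in a segment.

    Assumption: the embedding rate is the percentage of frames used
    for embedding (rounded down); the source does not define it.
    """
    if not 0 <= embedding_rate_pct <= 100:
        raise ValueError("embedding rate must be between 0 and 100")
    return frames_per_segment(segment_ms) * embedding_rate_pct // 100


if __name__ == "__main__":
    for ms in (100, 500, 1000):
        for rate in (10, 50, 100):
            print(ms, rate, frames_per_segment(ms), embedded_frames(ms, rate))
```

Under this reading, the shortest setting (100 ms at a 10% embedding rate) leaves a single modified frame out of ten for the detector to find, which suggests why accuracy depends on both parameters.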
