FF3F: Feature-Fused 3 Frame Hybrid Neural Network Framework for 3D Tracking of Fast-Moving Small Objects

Abstract

Detecting and tracking fast-moving small objects with LiDAR remains a challenge. Conventional computer vision approaches rely heavily on contour segmentation, which degrades sharply when the accessible features of the targets are limited. To address this limitation, we propose the Feature-Fused Three-Frame (FF3F) detection algorithm, which integrates lightweight low-level neural networks with conditional convolution layers to balance accuracy and efficiency. FF3F fuses pixel drifts, centroid displacements, and velocity estimates within a confidence-aware framework, enabling robust motion estimation even under sparse signals or in noisy environments. The "three-frame" design refers to integrating temporal features across three consecutive frames, which strengthens recognition consistency. The approach is evaluated against an end-to-end three-frame neural-network-based detection (EE3F) model and a traditional feature-based optical flow (FBOF) algorithm, with all experiments using solid-state LiDAR sensors. Performance was measured by precision, recall, F1 score (the harmonic mean of precision and recall), and IoU (Intersection over Union). For targets moving at 9–12 m/s and covering approximately 128 pixels, the FF3F model achieved an average recall of 0.89, outperforming both baselines (EE3F: 0.78, FBOF: 0.59). FF3F's latency ranged from 27.31 ms to 92.35 ms; in comparison, EE3F exhibited higher latency (59.42–171.64 ms), while FBOF was faster (17.13–55.43 ms) but substantially less accurate. These results confirm that FF3F effectively balances accuracy and computational efficiency under visibility constraints.
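
The confidence-aware fusion described in the abstract can be illustrated with a minimal sketch; the function names, weighting scheme, and frame interval below are hypothetical illustrations and are not taken from the paper. The idea is that each motion cue (pixel drift, centroid displacement, and a velocity estimate converted to a displacement) is weighted by a per-cue confidence score, and the fused estimates from three consecutive frames are averaged for temporal consistency.

```python
import numpy as np

def fuse_motion_cues(pixel_drift, centroid_disp, velocity_est, confidences, dt=0.05):
    """Confidence-weighted fusion of three motion cues for one frame pair.

    pixel_drift, centroid_disp : (2,) displacements in pixels
    velocity_est               : (2,) velocity in pixels per second
    confidences                : (3,) non-negative confidence per cue
    dt                         : inter-frame interval in seconds (assumed value)
    """
    cues = np.stack([
        np.asarray(pixel_drift, dtype=float),
        np.asarray(centroid_disp, dtype=float),
        np.asarray(velocity_est, dtype=float) * dt,  # velocity -> displacement
    ])
    w = np.asarray(confidences, dtype=float)
    w = w / (w.sum() + 1e-8)                         # normalize; guard against zero sum
    return (w[:, None] * cues).sum(axis=0)           # fused (2,) displacement estimate

def fuse_three_frames(per_frame_cues):
    """Average fused displacements over three consecutive frames,
    mirroring the three-frame temporal integration described above."""
    fused = [fuse_motion_cues(*frame) for frame in per_frame_cues]
    return np.mean(fused, axis=0)
```

In practice the per-cue confidences would come from the low-level networks themselves (for example, softmax scores or learned uncertainty heads); the simple normalized weighting here is only one plausible realization of a confidence-aware combination.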
