Coarse2Fine-ResNet
A robust and high-precision generalized deep learning framework for time delay estimation
Time delay estimation remains an active research area with broad applications across multiple fields. Conventional approaches based on Generalized Cross-Correlation (GCC) focus primarily on errors that follow a normal distribution, and they struggle with large estimation errors that deviate from a normal distribution because of significant signal shifts. This paper presents Coarse2Fine-ResNet, a robust deep learning framework that handles both types of error by leveraging 2D GCC patterns and a ResNet architecture. Evaluated on multiple datasets, including acoustic drone flight, speaker localization, and optical fiber sensing, our method outperforms existing GCC methods and state-of-the-art deep learning models in both estimation accuracy and reduction of large errors.
As illustrated in Figure 1, the Coarse2Fine-ResNet framework transforms raw 1D signals into 2D correlation patterns using GCC-PHAT with three β values (0, 1/2, 1) to capture both phase and magnitude information. The Coarse Estimator analyzes these patterns to overcome large estimation errors by identifying the true peak among ghost peaks, producing an initial delay estimate. This estimate guides the selection of a Region of Interest (ROI), within which the Fine Estimator performs the precise delay calculation. This two-stage approach both mitigates large estimation errors and improves overall accuracy.
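The pipeline described above can be sketched under stated assumptions as follows. This is not the authors' implementation. The GCC-PHAT-β weighting is the standard one (β = 0 gives plain cross-correlation, β = 1 gives full PHAT whitening), but stacking the three β curves as rows of a 2D array is only one assumed way of building the pattern. The two ResNet estimators are replaced by simple stand-ins: a global argmax for the coarse stage and parabolic peak interpolation inside the ROI for the fine stage. All names and parameters (`gcc_phat_beta`, `gcc_pattern`, `coarse_then_fine`, `max_lag`, `half_width`) are hypothetical.

```python
import numpy as np


def gcc_phat_beta(x, y, beta, max_lag):
    """GCC with PHAT-beta weighting, returned for lags -max_lag..max_lag.

    A positive peak lag means x is a delayed copy of y.
    """
    n = len(x) + len(y)                      # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)                   # cross-power spectrum
    weight = np.abs(cross) ** beta + 1e-12   # beta=0: no whitening, beta=1: PHAT
    cc = np.fft.irfft(cross / weight, n=n)
    # Reorder so that index 0 corresponds to lag -max_lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


def gcc_pattern(x, y, max_lag, betas=(0.0, 0.5, 1.0)):
    """Assumed 2D pattern: one normalized GCC curve per beta, stacked as rows."""
    rows = [gcc_phat_beta(x, y, b, max_lag) for b in betas]
    rows = [r / (np.max(np.abs(r)) + 1e-12) for r in rows]
    return np.stack(rows)                    # shape (len(betas), 2 * max_lag + 1)


def coarse_then_fine(pattern, max_lag, half_width=4):
    """Stand-in for the two estimators (the paper uses ResNets for both)."""
    lags = np.arange(-max_lag, max_lag + 1)
    score = pattern.sum(axis=0)

    # Coarse stage (stand-in): integer lag of the global peak.
    coarse_idx = int(np.argmax(score))

    # ROI around the coarse estimate, clipped so both neighbours exist.
    lo = max(coarse_idx - half_width, 1)
    hi = min(coarse_idx + half_width, len(lags) - 2)
    k = lo + int(np.argmax(score[lo:hi + 1]))

    # Fine stage (stand-in): parabolic interpolation for a sub-sample estimate.
    a, b, c = score[k - 1], score[k], score[k + 1]
    denom = a - 2.0 * b + c
    offset = 0.0 if denom == 0 else 0.5 * (a - c) / denom
    return int(lags[coarse_idx]), float(lags[k] + offset)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s = rng.standard_normal(2048)
    delay = 17                                               # synthetic delay in samples
    x = np.concatenate((np.zeros(delay), s))[: len(s)]       # x lags y by `delay`
    pattern = gcc_pattern(x, s, max_lag=64)
    print(pattern.shape, coarse_then_fine(pattern, max_lag=64))
```

Restricting the fine stage to a window around the coarse estimate is what keeps a ghost peak elsewhere in the lag range from being refined by mistake. In the framework itself that selection is learned rather than taken from a plain argmax.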