HRSpecNet: Deep Learning-Based High-Resolution Radar Micro-Doppler Signature Reconstruction

HRSpecNet is a deep learning architecture designed to overcome the limitations of traditional short-time Fourier transform (STFT) methods for generating micro-Doppler signatures (µ-DS) in radar-based human activity recognition (HAR). The STFT forces a trade-off between time and frequency resolution, requires careful parameter tuning, and is sensitive to noise. HRSpecNet instead produces high-resolution, noise-robust time–frequency (TF) representations directly from 1-D complex radar signals.
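To make the STFT resolution trade-off concrete: for an unpadded window of N samples at sampling rate fs, the frequency bin spacing is fs/N while the window spans N/fs seconds, so their product is always 1. The short sketch below tabulates this for a few window lengths. The 1 kHz sampling rate and the window lengths are illustrative assumptions, not values from the paper.

```python
def stft_resolution(fs_hz: float, window_len: int) -> tuple[float, float]:
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    freq_res_hz = fs_hz / window_len   # finer with longer windows
    time_res_s = window_len / fs_hz    # coarser with longer windows
    return freq_res_hz, time_res_s


FS_HZ = 1000.0  # assumed sampling rate, for illustration only

for n in (32, 128, 512):
    df, dt = stft_resolution(FS_HZ, n)
    print(f"N={n:4d}  df={df:7.3f} Hz  dt={dt:6.3f} s  df*dt={df * dt:.1f}")
```

Lengthening the window sharpens frequency resolution and blurs timing by the same factor, which is the limitation a learned representation tries to sidestep.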
Methodology
HRSpecNet consists of three key modules:
- 1D Convolutional Autoencoder (AE) – Enhances signal-to-noise ratio (SNR) by removing noise from raw radar signals.
- Learned Convolutional STFT Block – Learns adaptive Fourier-like transformations to create proxy TF feature maps without fixed-window constraints.
- U-Net Reconstruction Module – Combines multi-scale TF features to generate high-resolution, sparser, and cleaner spectrograms.
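One way to picture the convolutional STFT block is as a bank of strided 1-D filters, one per frequency bin. The sketch below is a minimal pure-Python illustration, not the paper's implementation: it fixes the kernels to a Hann-windowed Fourier basis, which reproduces a classical STFT. In a learned block these kernels would be trainable weights; starting them at the Fourier basis, the window length, the hop size, and all function names here are assumptions made for the example.

```python
import cmath
import math


def hann(n: int) -> list[float]:
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def fourier_kernels(win_len: int) -> list[list[complex]]:
    """One windowed complex-exponential kernel per frequency bin.

    A learned block would treat these values as trainable weights;
    the Fourier initialisation is an assumption of this sketch.
    """
    w = hann(win_len)
    return [
        [w[n] * cmath.exp(-2j * math.pi * k * n / win_len) for n in range(win_len)]
        for k in range(win_len)
    ]


def conv_stft(signal: list[complex], kernels: list[list[complex]], hop: int) -> list[list[complex]]:
    """Strided 1-D correlation of the signal with every kernel.

    Returns a [num_bins][num_frames] time-frequency map.
    """
    win_len = len(kernels[0])
    starts = range(0, len(signal) - win_len + 1, hop)
    return [
        [sum(kern[n] * signal[s + n] for n in range(win_len)) for s in starts]
        for kern in kernels
    ]


# Complex tone centred on bin 4 of a 16-point window (illustrative values).
WIN, HOP, BIN = 16, 8, 4
tone = [cmath.exp(2j * math.pi * BIN * n / WIN) for n in range(64)]
tf_map = conv_stft(tone, fourier_kernels(WIN), HOP)

# Strongest bin in each frame; every frame should point at the tone's bin.
peak_bins = [
    max(range(WIN), key=lambda k: abs(tf_map[k][t])) for t in range(len(tf_map[0]))
]
print(peak_bins)
```

Because the transform is expressed as a convolution, its kernels can be updated by gradient descent together with the autoencoder and the U-Net, rather than being tied to one fixed window.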
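The framework is trained with a weighted multi-stage loss made up of three terms, labeled L1 to L3 by stage (these are stage indices, not norm orders):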
- L1: AE output vs. clean signal (noise suppression).
- L2: STFT block output vs. ground truth TF (proxy representation accuracy).
- L3: U-Net output vs. ground truth TF (final reconstruction quality).
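The three terms above can be combined as a weighted sum, total = w1·L1 + w2·L2 + w3·L3. The sketch below shows that combination on flattened toy values. The per-term distance (mean squared error here), the weight values, and the function names are assumptions for illustration; the summary does not state which distance or weights the paper uses.

```python
def mse(pred, target):
    """Mean squared error between two equally sized flat sequences."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_stage_loss(ae_out, clean_signal, stft_out, unet_out, tf_truth,
                     weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three stage losses (weights are hypothetical)."""
    l1 = mse(ae_out, clean_signal)   # stage 1: denoised signal vs. clean signal
    l2 = mse(stft_out, tf_truth)     # stage 2: proxy TF map vs. ground truth TF
    l3 = mse(unet_out, tf_truth)     # stage 3: final TF map vs. ground truth TF
    w1, w2, w3 = weights
    return w1 * l1 + w2 * l2 + w3 * l3


# Toy flattened outputs, purely to show the arithmetic.
total = multi_stage_loss(
    ae_out=[1.0, 2.0], clean_signal=[1.0, 0.0],
    stft_out=[0.0, 0.0], unet_out=[1.0, 0.0], tf_truth=[1.0, 1.0],
    weights=(0.5, 1.0, 2.0),
)
print(total)
```

Supervising the intermediate outputs as well as the final one gives each module its own training signal instead of relying only on the end-to-end reconstruction error.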
Ground truth TF labels are generated with a novel process that maintains STFT-like dimensions while improving resolution.
Evaluation
Simulation Tests:
- More robust than STFT to window size changes.
- Maintains resolution across varying sampling rates without parameter retuning.
- Superior ability to distinguish closely spaced frequencies.
- Preserves amplitude and detects intersection points of instantaneous frequencies.
Real-World Tests:
- Evaluated on a challenging 100-class American Sign Language (ASL) radar dataset (3000 samples).
- µ-DS generated by HRSpecNet improved classification accuracy by 3.14% over STFT and outperformed TFA-Net, Deepfreq, and Cresfreq across multiple CNN classifiers.
- Outperformed the compared methods in NMSE, SSIM, and PSNR at SNR levels from -5 to 20 dB.
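For readers unfamiliar with the reconstruction metrics, the sketch below computes two of them, NMSE (lower is better) and PSNR (higher is better), on flattened spectrogram magnitudes. These are common textbook definitions; the exact normalisation and peak value used in the paper are not given in this summary, so treat the choices here (reference energy for NMSE, reference maximum as the PSNR peak) as assumptions. SSIM is omitted because it needs a windowed image comparison.

```python
import math


def nmse(estimate, reference):
    """Normalized MSE: error energy divided by reference energy."""
    err = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    ref = sum(r ** 2 for r in reference)
    return err / ref


def psnr_db(estimate, reference):
    """PSNR in dB, taking the reference maximum as the peak value (assumed)."""
    mse = sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no error to measure
    peak = max(reference)
    return 10 * math.log10(peak ** 2 / mse)


# Toy flattened magnitudes, for illustration only.
reference = [0.0, 1.0, 2.0, 1.0]
estimate = [0.0, 1.0, 1.0, 1.0]
print(nmse(estimate, reference), psnr_db(estimate, reference))
```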
Computational Efficiency:
- Most efficient among the compared deep learning-based TF reconstruction methods while also delivering the highest accuracy.
- Avoids the high latency and complexity of complex-valued neural networks.
Impact
HRSpecNet enables:
- High-fidelity radar spectrograms for HAR.
- Robust performance in low-SNR and noisy environments.
- Broad applicability in security, health monitoring, gesture recognition, and autonomous systems.
Read the full paper: IEEE Xplore – HRSpecNet
