HRSpecNet: Deep Learning-Based High-Resolution Radar Micro-Doppler Signature Reconstruction

HRSpecNet Architecture

HRSpecNet is a deep learning architecture designed to overcome the limitations of the traditional short-time Fourier transform (STFT) for generating micro-Doppler signatures (µ-DS) in radar-based human activity recognition (HAR). The STFT forces a trade-off between time and frequency resolution, requires careful parameter tuning, and is sensitive to noise. HRSpecNet instead produces high-resolution, noise-robust time–frequency (TF) representations directly from 1-D complex radar signals.
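
The time-frequency trade-off comes from the fixed analysis window: a window of N samples at sampling rate fs spans N/fs seconds and, without zero padding, gives DFT bins spaced fs/N Hz apart, so the product of the two is always 1. A minimal sketch of that relationship follows; the 1 kHz sampling rate and the window lengths are illustrative assumptions, not settings from the paper, and the effective resolution also depends on the window shape.

```python
def stft_resolution(fs_hz, win_len):
    """Return (window duration in s, DFT bin spacing in Hz) for one STFT window."""
    time_span_s = win_len / fs_hz
    bin_spacing_hz = fs_hz / win_len
    return time_span_s, bin_spacing_hz


if __name__ == "__main__":
    fs = 1000.0  # assumed sampling rate, for illustration only
    for n in (32, 128, 512):
        dt, df = stft_resolution(fs, n)
        print(f"N={n:4d}  window={dt * 1e3:6.1f} ms  bin spacing={df:6.2f} Hz")
```

Lengthening the window narrows the bin spacing but smears events in time, and shortening it does the reverse, which is why a single window size rarely suits every activity.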

Methodology

HRSpecNet consists of three key modules:

  • 1D Convolutional Autoencoder (AE) – Enhances signal-to-noise ratio (SNR) by removing noise from raw radar signals.
  • Learned Convolutional STFT Block – Learns adaptive Fourier-like transformations to create proxy TF feature maps without fixed-window constraints.
  • U-Net Reconstruction Module – Combines multi-scale TF features to generate high-resolution, sparser, and cleaner spectrograms.
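
The idea behind a convolutional STFT block can be illustrated without a deep learning framework. An STFT is a strided 1-D convolution (in the deep-learning sense, without kernel flipping) whose kernels are windowed complex exponentials, so a convolutional layer initialised with such kernels reproduces the STFT and can then be adapted by training. The sketch below shows only that fixed starting point in plain Python. The function names, the Hann window, and all sizes are illustrative assumptions, and this is not the HRSpecNet layer itself.

```python
import cmath
import math


def fourier_kernels(win_len):
    """Windowed complex-exponential kernels, one per frequency bin.

    These play the role of untrained convolution filters.
    """
    hann = [0.5 - 0.5 * math.cos(2 * math.pi * n / win_len) for n in range(win_len)]
    return [
        [hann[n] * cmath.exp(-2j * math.pi * k * n / win_len) for n in range(win_len)]
        for k in range(win_len)
    ]


def strided_conv1d(signal, kernels, hop):
    """Slide every kernel over the signal with stride `hop` (no padding).

    Returns a list of frames; each frame holds one complex value per kernel.
    """
    win_len = len(kernels[0])
    frames = []
    for start in range(0, len(signal) - win_len + 1, hop):
        segment = signal[start:start + win_len]
        frames.append([sum(w * x for w, x in zip(kernel, segment)) for kernel in kernels])
    return frames


# Complex test tone that sits exactly on bin 4 of a 16-point window.
win_len, hop, tone_bin = 16, 8, 4
signal = [cmath.exp(2j * math.pi * tone_bin * n / win_len) for n in range(64)]

tf_map = strided_conv1d(signal, fourier_kernels(win_len), hop)

# Strongest bin in every frame; for this tone it should be bin 4 throughout.
peak_bins = [max(range(win_len), key=lambda k: abs(frame[k])) for frame in tf_map]
```

In a trainable version the kernel values would become learnable parameters, which is what lets the block move away from a single fixed window.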

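The framework is trained with a weighted multi-stage loss. Its three terms are indexed by stage (L1, L2, and L3 here are loss labels, not the L1 and L2 norms):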

  • L1: AE output vs. clean signal (noise suppression).
  • L2: STFT block output vs. ground truth TF (proxy representation accuracy).
  • L3: U-Net output vs. ground truth TF (final reconstruction quality).
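
A weighted multi-stage loss of this kind is a weighted sum of the three per-stage errors. The sketch below uses mean squared error for every stage and placeholder weights. Both are assumptions made for illustration, since this summary does not state the error measure or the weight values, and the inputs are treated as flattened real-valued sequences (for example stacked real and imaginary parts).

```python
def mse(pred, target):
    """Mean squared error between two equal-length sequences of real numbers."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def multi_stage_loss(ae_out, clean_signal, stft_out, unet_out, tf_truth,
                     weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three stage losses (L1, L2, L3 in the list above).

    `weights` are hypothetical placeholders, not values from the paper.
    TF maps are passed as flattened sequences for simplicity.
    """
    w1, w2, w3 = weights
    loss_1 = mse(ae_out, clean_signal)   # denoising stage
    loss_2 = mse(stft_out, tf_truth)     # proxy TF map from the STFT block
    loss_3 = mse(unet_out, tf_truth)     # final reconstructed spectrogram
    return w1 * loss_1 + w2 * loss_2 + w3 * loss_3
```

Supervising the intermediate outputs as well as the final one gives each module its own training signal instead of relying only on the end-to-end error.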

Ground-truth TF labels are generated with a novel process that keeps the dimensions of an STFT spectrogram while providing higher resolution.

Evaluation

Simulation Tests:

  • More robust than STFT to window size changes.
  • Maintains resolution across varying sampling rates without parameter retuning.
  • Superior ability to distinguish closely spaced frequencies.
  • Preserves amplitude and detects intersection points of instantaneous frequencies.

Real-World Tests:

  • Evaluated on a challenging 100-class American Sign Language (ASL) radar dataset (3000 samples).
  • µ-DS generated by HRSpecNet yielded a 3.14% improvement in classification accuracy over STFT and outperformed TFA-Net, DeepFreq, and cResFreq across multiple CNN classifiers.
  • Outperformed the compared methods in NMSE, SSIM, and PSNR across SNR levels from -5 to 20 dB.
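
For reference, NMSE and PSNR are commonly defined as in the sketch below. SSIM is omitted because it needs a windowed implementation. These are the usual textbook forms; the exact normalisation used in the paper (for example the peak value for PSNR) is not stated in this summary, so the default `peak=1.0` and the toy data are assumptions.

```python
import math


def nmse(estimate, reference):
    """Normalised mean squared error: ||est - ref||^2 / ||ref||^2 (lower is better)."""
    err = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    ref_energy = sum(r ** 2 for r in reference)
    return err / ref_energy


def psnr_db(estimate, reference, peak=1.0):
    """Peak signal-to-noise ratio in dB (higher is better).

    `peak` is the maximum possible pixel value; 1.0 assumes spectrograms
    normalised to [0, 1].
    """
    mse = sum((e - r) ** 2 for e, r in zip(estimate, reference)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)


reference = [0.0, 0.5, 1.0, 0.5]   # flattened toy spectrogram
estimate = [0.1, 0.5, 0.9, 0.5]    # toy reconstruction

print(f"NMSE = {nmse(estimate, reference):.4f}")
print(f"PSNR = {psnr_db(estimate, reference):.2f} dB")
```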

Computational Efficiency:

  • Most computationally efficient of the deep learning-based TF reconstruction methods compared, while also achieving the highest accuracy.
  • Avoids the high latency and complexity of complex-valued neural networks.

Impact

HRSpecNet enables:

  • High-fidelity radar spectrograms for HAR.
  • Robust performance in low-SNR and noisy environments.
  • Broad applicability in security, health monitoring, gesture recognition, and autonomous systems.

Read the full paper: IEEE Xplore – HRSpecNet