GCC-PHAT Re-Imagined - A U-Net Filter for Audio TDOA Peak-Selection
Time-difference-of-arrival (TDOA) estimation from GCC-PHAT is not always as straightforward as finding the maximum peak. This work views the GCC output as an image, with time on the vertical axis and TDOA on the horizontal axis, to explore whether image-to-image machine learning methods can make a more robust filter. The Structure from Sound Database provides audio recorded with a distributed microphone setup a
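
To make the "GCC output as an image" view concrete, the following is a minimal sketch, not the authors' implementation. It computes GCC-PHAT frame by frame for one microphone pair and stacks the results so that rows are time frames and columns are candidate TDOAs in samples. The function names `gcc_phat` and `gcc_image`, the Hann window, the frame length, the hop size, and the synthetic test signal are all assumptions chosen for illustration. The per-row argmax at the end is the naive maximum-peak baseline mentioned above; the learned filter itself is not shown.

```python
import numpy as np


def gcc_phat(x, y, max_lag, eps=1e-12):
    """GCC-PHAT of two equal-length frames for lags -max_lag..max_lag (samples)."""
    n = len(x) + len(y)                   # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = X * np.conj(Y)                # cross-power spectrum
    cross /= np.abs(cross) + eps          # PHAT weighting: keep only the phase
    cc = np.fft.irfft(cross, n=n)
    # Reorder so that column 0 corresponds to lag -max_lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


def gcc_image(x, y, frame_len, hop, max_lag):
    """Stack per-frame GCC-PHAT curves: rows are time frames, columns are lags."""
    window = np.hanning(frame_len)        # assumed window, for illustration only
    last_start = min(len(x), len(y)) - frame_len
    rows = [
        gcc_phat(x[s:s + frame_len] * window, y[s:s + frame_len] * window, max_lag)
        for s in range(0, last_start + 1, hop)
    ]
    return np.array(rows)


# Synthetic example (assumption): white noise, with x delayed by 5 samples relative to y.
rng = np.random.default_rng(0)
source = rng.standard_normal(16000)
delay = 5
y = source
x = np.concatenate((np.zeros(delay), source[:-delay]))

max_lag = 20
image = gcc_image(x, y, frame_len=1024, hop=512, max_lag=max_lag)

# Naive peak selection: take the strongest lag in every row.
naive_tdoa = image.argmax(axis=1) - max_lag
```

With this sign convention, a positive lag means the first signal arrives later than the second. On clean synthetic data the per-row maximum is a reasonable estimate; with real recordings, reverberation and noise can produce spurious peaks, which is the case the image-to-image filtering is meant to address.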
