The geometry of room acoustics is such that the reverberant signal can be seen as the same waveform emitted from multiple locations. In analogy with the rake receiver from wireless communications, we propose several beamforming strategies that exploit, rather than suppress, this additional spatio-temporal diversity. Unlike earlier work in the frequency domain, time domain designs allow to shape the impulse response of the beamformer. In particular, we can control perceptually relevant parameters, such as the amount of early echoes or the length of the beamformer response. Relying on the knowledge of the image sources position, we derive different optimal beamformers. Leveraging perceptual cues, we show how to improve interference and noise reduction without degrading the perceptual quality. The designs are validated through simulation. Using early echoes is shown to strictly improve the signal to interference and noise ratio. Code and speech samples are available online at http:// lcav.epfl.ch/Robin_Scheibler.
This repository contains all the code to reproduce the results of the paper Raking Echoes in the Time Domain.
Using the simple python room acoustics framework created for the Acoustic Rake Receiver, we demonstrate two time domain beamformer designs that use echoes constructively to improve the signal to interference and noise ratio. These designs are explained in details in the paper Raking Echoes in the Time Domain. All the figures of the paper can be recreated by calling simple scripts leveraging this framework. In addition to the results of the paper, we include spectrograms and samples of speech samples. We strongly hope that this code will be useful beyond the scope of this paper and plan to develop it into a standalone python package in the future.
We are available for any question or request relating to either the code or the theory behind it. Just ask!
In a UNIX terminal, run the following script.
Alternatively, type in the following commands in an ipython shell.
run figure_beampatterns.py run figure_SINR_sim.py run figure_SINR_plot.py
The figures and sound samples generated are collected in
The SINR simulation can be very long. For that reason, we split
the simulation loop and the plotting routine into two files.
We recommend to first modify the value of the
figure_SINR_sim.py to 10 and time the simulation to have an
estimate of the total time required to run the simulatioin.
The output of the simulation is saved in
The data can be plotted using
The simulation data from the paper is available in
and can be processed with
While it was omitted in the paper for space reasons, we provide here simulation results for speech samples.
The spectrogram of all samples as well as the desired sound are provided.
The samples and the spectrograms are created by the script
Beyond the simulation results presented in the paper, the raking beamformers
were validated with room impulse responses (RIR) measured in a classroom at
EPFL. The RIR are stored as wav files in
BC329_RIR_8kHz/RIRs and the
methodology for the measurements is outlined in
The RIR were measured for a static linear array of 8 microphones spaced by 8 cm. RIRs were measured for 12 different source positions. In the validation all 132 combinations of desired/interfering sources are tested. Sound propagation is simulated by convolving speech samples with the RIR.
Using the measured room dimension and locations of sources, as well as approximate reflection coefficients for the walls, both Rake MVDR and Rake Perceptual beamformers are computed. The output of the beamformers is evaluated with respect to the clean desired speech signal using the perceptual evaluation of speech quality (PESQ) metric.
Even though the calibration of room characteristics, microphones, and speakers locations was only approximate, we observe the same kind of thread than in the pure SINR simulation. That is speech quality is improved by using extra reflections in the room.
The simulation can be run as
ipython figure_Experiment.py 132 and the output of the
simulation can be plotted using
ipython figure_Experiment_plot.py (with optional
simulation output filename argument).
We also include extra scripts that let us play with the different beamformers. The script names should be self-explanatory.
RakeLstSqFromFD.py RakeMVDR.py RakeMaxSINR.py RakeMaxUDR.py RakeOF.py RakePerceptual.py
figure_Experiment.py. In a pinch, the reference implementation of the ITU can be used. To compile it follow the instructions.
Download the source files
of the ITU P.862 compliance tool from the ITU website. Follow the instructions
to compile the tool and copy the executable produced in the
Execute the following sequence of commands to get to the source code.
mkdir PESQ cd PESQ wget 'https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-P.862-200511-I!Amd2!SOFT-ZST-E&type=items' unzip dologin_pub.asp\?lang\=e\&id\=T-REC-P.862-200511-I\!Amd2\!SOFT-ZST-E\&type\=items cd Software unzip 'P862_annex_A_2005_CD wav final.zip' cd P862_annex_A_2005_CD/source/
Software/P862_annex_A_2005_CD/source/ directory, create a file called
Makefile and copy
the following into it.
CC=gcc CFLAGS=-O2 OBJS=dsp.o pesqdsp.o pesqio.o pesqmod.o pesqmain.o DEPS=dsp.h pesq.h pesqpar.h %.o: %.c $(DEPS) $(CC) -c -o $@ $< $(CFLAGS) pesq: $(OBJS) $(CC) -o $@ $^ $(CFLAGS) .PHONY : clean clean : -rm pesq $(OBJS)
Execute compilation by typing this.
Finally move the
pesq binary to
* The files input to the pesq utility must be 16 bit PCM wav files.
* File names longer than 14 characters (suffix included) cause the utility to
crash with the message
Abort trap(6) or similar.
Open visual studio, create a new project from existing files and select the directory
containing the source code of PESQ (
FILE -> New -> Project From Existing Code...
Visual C++ from the dropdown menu, then next.
BUILD -> Configuration Manager...
and change active solution configuration from
Release. Then Close.
BUILD -> Build Solution
Copy the executable
Release\pesq.exe to the bin folder.
(tested with Microsoft Windows Server 2012)
| Machine | IBM System X iDataPlex dx360 M3 | |---------|---------------------------------| | System | Ubuntu 12.04.5 | | CPU | 2x X5675 | | RAM | 96 GB |
System Info: Linux 3.11.0-26-generic #45~precise1-Ubuntu SMP Tue Jul 15 04:02:35 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux Python Info: Python 2.7.8 :: Anaconda 2.0.1 (64-bit) Python Packages Info: accelerate 1.7.0 np19py27_p0 anaconda 2.0.1 np18py27_0 ipython 2.1.0 py27_2 ipython-notebook 2.1.0 py27_0 ipython-qtconsole 2.1.0 py27_0 matplotlib 1.3.1 np18py27_1 mkl 11.1 np19py27_p3 mkl-rt 11.1 p0 mkl-service 1.0.0 py27_p1 mklfft 1.0 np19py27_p0 numpy 1.9.0 py27_p0 [mkl] scipy 0.14.0 np19py27_p0 [mkl]
| Machine | MacBook Pro Retina 15-inch, Early 2013 | |---------|----------------------------------------| | System | OS X Maverick 10.9.4 | | CPU | Intel Core i7 | | RAM | 16 GB |
System Info: Darwin 13.3.0 Darwin Kernel Version 13.3.0: Tue Jun 3 21:27:35 PDT 2014; root:xnu-2422.110.17~1/RELEASE_X86_64 x86_64 Python Info: Python 2.7.8 :: Anaconda 2.0.1 (x86_64) Python Packages Info: accelerate 1.7.0 np19py27_p0 anaconda 2.0.1 np18py27_0 ipython 2.1.0 py27_2 ipython-notebook 2.1.0 py27_0 ipython-qtconsole 2.1.0 py27_0 matplotlib 1.3.1 np18py27_1 mkl 11.1 np19py27_p3 mkl-rt 11.1 p0 mkl-service 1.0.0 py27_p1 mklfft 1.0 np19py27_p0 numpy 1.9.0 py27_p0 [mkl] scipy 0.14.0 np19py27_p0 [mkl]
Copyright (c) 2014, Robin Scheibler, Ivan Dokmanić, Martin Vetterli
This code is free to reuse for non-commercial purpose such as academic or educational. For any other use, please contact the authors.
Time Domain Acoustic Rake Receiver by Robin Scheibler, Ivan Dokmanić, Martin Vetterli is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://github.com/LCAV/AcousticRakeReceiver.