Re: Newbie question: window

Date	1998-06-15 16:18
From	Mike Berry
Subject	Re: Newbie question: window
The tradeoff in an FFT-based analysis system is between temporal accuracy and frequency accuracy. An FFT is an "analysis" of a chunk of samples in time. If your window is 1024 samples, then you get a single set of FFT data which represents these 1024 samples. You would get 512 different amplitudes and phases of frequency bands, the bands equally spaced across the spectrum. At 44.1 kHz, this means a band every 43 Hz (22050 / 512).

However, your temporal information is completely hidden inside the transform. If a note were to start in the middle of a window, you would not know exactly when it started. In effect, the temporal data is "quantized" (I say quantized but really it is a kind of indeterminacy) to 23 ms in this case (1024 samples / 44.1 kHz).

So here's where the tradeoff begins. Make the window 512 samples. Then you quantize to 11.6 ms, but you only get bands every 86 Hz. With 2048 samples you get bands every 22 Hz, but the window spans 46 ms.

The phase vocoding process assumes that the data in the FFT analysis can be stretched or compressed temporally. This assumption is more or less valid depending on the original material and how it was analyzed. As a general rule, continuously pitched material is going to sound better if a large window is used, because the frequencies are going to be more accurate. Music with many "events" will sound better with a small window. Unfortunately, most music has both elements, so you need to reach a compromise that suits your purposes. That's why the window size is not defined for you.

> I realise that is a simplistic explanation, but..... the larger the
> window the faster the PV, but the more smearing there is as it does
> not track changes so well. Small windows are slow, and not so
> accurate as there is insufficient data to get accurate estimates.

-- 
Mike Berry
mikeb@nmol.com
http://www.nmol.com/users/mikeb
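
For anyone who wants to check the window-size figures quoted in the post, here is a minimal Python sketch of the arithmetic. The function name `window_tradeoff` and the default 44.1 kHz sample rate are assumptions made for illustration; nothing here comes from a particular phase vocoder implementation.

```python
def window_tradeoff(window_size, sample_rate=44100):
    """Return (band spacing in Hz, window duration in ms) for one FFT window.

    Hypothetical helper for illustration only.
    """
    # Nyquist (sample_rate / 2) divided into window_size / 2 bands
    # is the same as sample_rate / window_size.
    band_spacing_hz = sample_rate / window_size
    # Length of time covered by one window, in milliseconds.
    window_ms = 1000.0 * window_size / sample_rate
    return band_spacing_hz, window_ms


if __name__ == "__main__":
    for n in (512, 1024, 2048):
        hz, ms = window_tradeoff(n)
        print(f"{n:5d} samples: band every {hz:5.1f} Hz, window spans {ms:4.1f} ms")
```

Doubling the window halves the band spacing and doubles the time span, so the product of the two stays fixed. That is the tradeoff in numeric form: you cannot shrink both at once by changing the window size alone.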