[Csnd] info about pvsanal opcode

Date	2008-01-13 20:47
From	Uğur Güney
Subject	[Csnd] info about pvsanal opcode
Attachments	None

Date	2008-01-14 11:11
From	Richard Dobson
Subject	[Csnd] Re: info about pvsanal opcode
Uğur Güney wrote:
..
> # Why there are 513 bins, not 512? Is bin number 0 for DC component or
> fundamental component? Is Nth_Bin for the frequency
> (SampleRate/1024)(N-1)?
> # So should my function table, which will contain the transfer
> function of the filter, has 512 or 513 points?

513. The bottom and top bins cover DC and Nyquist. The exact mathematical details are rather complex (pun intended); simplistically it is much like saying that in counting from 0 to 10 there are 11 numbers. A more mathematical way of looking at it is that a purely "real" signal expressed in the form of a complex (real+imaginary) spectrum is exactly symmetrical around Nyquist, so the upper values are redundant; but we keep DC and Nyquist as per the 0-10 idea.

> # In manual it says also:
>
> Currently only one format is implemented by this opcode: 0 = amplitude
> + frequency
>
> # What does this exactly mean? I understand something like this:
> # fsig is an array. Its elements are pair of numbers, one for
> amplitude and one for frequency.

Other formats are possible, such as amplitude + phase (and even a raw complex real+imag form), but I never got around to implementing the other forms in the opcodes. The provision is there though for when I or someone gets around to it (I haven't even looked at the code for ages, perhaps someone has already done it!). The expensive step is the conversion to amp/phase; moving that to amp/freq is a simple arithmetic step. So using amp/phase would not noticeably save processing time. The associated PVOCEX file format ~does~ support all three formats though.

> Because pvsanal does not take FFT but
> makes a phase-vocoder analysis,

Taking the FFT (of a windowed block of samples) is the first stage in making a phase vocoder analysis. So the FFT usage is pretty heavy! In turn, the phase vocoder is the first step in other techniques such as partial tracking.
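
To make the 513 figure concrete, here is a minimal Python sketch (not Csound code, and not how pvsanal is implemented). The helper names, the 44100 Hz sample rate and the tiny 16-point test signal are assumptions chosen for illustration. It counts the bins from DC to Nyquist inclusive and checks, with a naive DFT, that the upper half of the spectrum of a real signal mirrors the lower half.

```python
import cmath
import math

def bin_count(fft_size):
    """Bins kept for a real input: DC up to and including Nyquist."""
    return fft_size // 2 + 1

def bin_centre_hz(k, sample_rate, fft_size):
    """Nominal centre frequency of bin k (bin 0 is DC)."""
    return k * sample_rate / fft_size

def dft(x):
    """Naive O(N^2) DFT, adequate for a tiny demonstration."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

# A purely real test signal (hypothetical values, 16 points).
N = 16
signal = [math.sin(2 * math.pi * 3 * t / N) + 0.5 * math.cos(2 * math.pi * 5 * t / N)
          for t in range(N)]
spectrum = dft(signal)

# Bins above Nyquist are complex conjugates of those below: X[N-k] == conj(X[k]).
mirrored = all(abs(spectrum[N - k] - spectrum[k].conjugate()) < 1e-9
               for k in range(1, N // 2))

print(bin_count(1024))                  # 513
print(bin_centre_hz(512, 44100, 1024))  # 22050.0, i.e. Nyquist at 44100 Hz
print(mirrored)                         # True
```

On this reading, bin k sits at k * sr / 1024 Hz with k counted from 0, so a filter table covering every bin needs 513 points.
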
> frequency values are not exactly
> integer multiples of some fundamental, SR/1024, but they deviate from
> their corresponding bin value. How are these deviations stored?

This involves the "phase" part of the phase vocoder. Each bin has a nominal fixed centre frequency of sr/N Hz * bin number, with the actual frequency free to deviate either side of it (so frequencies in the lowest bins can even be negative - the DC bin might range between +/- 43 Hz, for example), but a relatively limited bandwidth dependent on the amount of overlap between frames. A common alternative view is to see each bin as a very simple bandpass filter where the filters all overlap somewhat.

Frequency is defined as "the rate of change of phase", and thus the differences in phase between successive windows (amplitude and phase in turn are obtained from the raw real/imaginary values emerging from the FFT, using good old Pythagoras's theorem and the arctangent) can get converted into a true frequency value. But note the plain phase vocoder cannot in itself track moving frequency components; at some point the information moves into higher or lower bins - a bit like the image of a football moving between multiple TV screens in a mega-display, where the images overlap a bit but the cameras are fixed.

In the limit of single-sample overlap (the "Sliding Phase Vocoder" or SPV which has recently been incorporated), the bandwidth of each bin is DC to Nyquist, such that those filters now do not overlap but fully stack on top of each other! We can understand this intuitively by considering what we might be able to deduce about frequency changes between widely-spaced frames. In this case, rapid deviations are simply missed, such that the content of each bin is more like a crude average of the start and end values from the FFT. We will be missing important information (transients, moving pitches generally), and the effective bandwidth of each bin becomes very narrow as little deviation is measurable.
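
The phase-difference step described above can be sketched in Python. This is a textbook-style illustration under stated assumptions, not the pvsanal source: the function names, the Hann window, the hop of 256 samples and the 1000 Hz test tone are all hypothetical choices. It evaluates a single bin for two overlapping frames, then turns the phase advance between them into a frequency in Hz.

```python
import cmath
import math

def windowed_bin(samples, start, fft_size, k):
    """Complex value of bin k for one Hann-windowed frame starting at `start`."""
    total = 0j
    for n in range(fft_size):
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / fft_size)
        total += samples[start + n] * window * cmath.exp(-2j * math.pi * k * n / fft_size)
    return total

def wrap_phase(x):
    """Wrap an angle into the range [-pi, pi]."""
    return x - 2 * math.pi * round(x / (2 * math.pi))

def bin_frequency(prev_bin, curr_bin, k, sample_rate, fft_size, hop):
    """Frequency in Hz for bin k from the phase advance between two frames."""
    # Phase advance a component exactly at the bin centre would show per hop.
    expected = 2 * math.pi * k * hop / fft_size
    # cmath.phase is atan2(imag, real); abs() would give the amplitude.
    measured = cmath.phase(curr_bin) - cmath.phase(prev_bin)
    deviation = wrap_phase(measured - expected)
    centre_hz = k * sample_rate / fft_size
    return centre_hz + deviation * sample_rate / (2 * math.pi * hop)

SAMPLE_RATE = 44100   # assumed
FFT_SIZE = 1024       # window size, as in the thread
HOP = 256             # assumed overlap of 4
TRUE_HZ = 1000.0      # hypothetical test tone

samples = [math.sin(2 * math.pi * TRUE_HZ * n / SAMPLE_RATE)
           for n in range(FFT_SIZE + HOP)]
k = round(TRUE_HZ * FFT_SIZE / SAMPLE_RATE)   # nearest bin to the tone
frame_a = windowed_bin(samples, 0, FFT_SIZE, k)
frame_b = windowed_bin(samples, HOP, FFT_SIZE, k)
estimate = bin_frequency(frame_a, frame_b, k, SAMPLE_RATE, FFT_SIZE, HOP)

print("bin", k, "centre", k * SAMPLE_RATE / FFT_SIZE, "Hz, estimate", estimate, "Hz")
```

For this steady tone the estimate should land close to 1000 Hz even though the bin centre is about 990.5 Hz. The wrap step is also where the bandwidth limit shows up: with a larger hop, a smaller deviation from the bin centre is enough to push the phase difference past +/- pi.
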
Conversely, with maximum possible overlap, we track frequency changes at single sample resolution, over the whole range. How accurate the result is (in terms of frequency resolution within a frame) is still dependent on the size of the window (fftsize).

[To continue the tv screen analogy: widely-spaced frames are like having one tv showing David Beckham kicking the football from the centre of the field; the next tv shows the ball entering the goal. We ~assume~ it is the ball Beckham kicked, and may even extrapolate the path it took; but we can't be absolutely sure someone else didn't kick it in between, or even replace it with another one. Or use two balls!]

There is no escaping the maths jargon when it comes to explaining how the phase vocoder works, sorry! Or, just accept what is contained in a frame, call it magic, and just use it.

The SPV has some truly weird aspects which still need further investigation. We can no longer make any assumptions at all about what frequency might be in what bin - a high bin might well contain a very low frequency, especially if the source is something simple with very few components. See http://dream.cs.bath.ac.uk/SDFT/index.html for much more information and wacky sound examples.

> Are the freq. values absolute frequencies in Hz, or are they deviations
> (delta f) from the frequencies corresponding each bin, or something
> like these?

See above - true frequencies in Hz (albeit sampled, as this is a sampled system - we might have to calculate some amplitude interpolation through adjacent bins to find the "true" frequency of a source partial at that position); representing delta ~phase~ between frames.

Richard Dobson

Date	2008-01-17 19:02
From	Uğur Güney
Subject	[Csnd] Re: Re: info about pvsanal opcode
Attachments	None