FFT opcodes pvs2array rfft mags phs etc..
Date | 2016-11-30 14:43 |
From | Ed Costello |
Subject | FFT opcodes pvs2array rfft mags phs etc.. |
Hi,
I was wondering whether it is possible to use the mags opcode on the result of pvs2array and, if not, whether it is possible to get the magnitudes (or indeed the frequencies) from the result of pvs2array without using a loop in Csound.
Thanks
Ed
Csound mailing list
Csound@listserv.heanet.ie
https://listserv.heanet.ie/cgi-bin/wa?A0=CSOUND
Send bugs reports to
https://github.com/csound/csound/issues
Discussions of bugs and features can be posted here |
Date | 2016-11-30 22:05 |
From | Oeyvind Brandtsegg |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Victor would know this better, so let's hope he chimes in. Here are my 2 cents, hopefully with correct change:
I was able to use c2r to extract the amplitudes from the pvs2array output. I know this is not the intended use for c2r, but since it simply discards every second value in the array, it seems to do the job OK. I think that mags and phs will not work, since the pvsanal output is in AMP+FREQ format (mags and phs expect real+imaginary format). I am not sure how to extract the frequencies, except to use pvsftw to write the amps and freqs to separate tables, then use copyf2array to get each table into an array.
best
Oeyvind |
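To make the layout concrete: according to this thread the array holds one amplitude/frequency pair per bin, amplitude first, which is why dropping every second value leaves only the amplitudes. The following is a minimal sketch in Python (not Csound) of that de-interleaving; the function name and the sample frame are invented for illustration and are not part of any Csound API.

```python
def split_amp_freq(frame):
    """Split an interleaved [amp0, freq0, amp1, freq1, ...] frame into two lists."""
    if len(frame) % 2 != 0:
        raise ValueError("expected an even number of values (amp/freq pairs)")
    amps = frame[0::2]   # even indices: amplitudes
    freqs = frame[1::2]  # odd indices: frequencies in Hz
    return amps, freqs


# Hypothetical 3-bin frame; the values are made up for illustration only.
frame = [0.5, 0.0, 0.25, 21.5, 0.125, 43.0]
amps, freqs = split_amp_freq(frame)
print(amps)
print(freqs)
```

Slicing with a stride of two is the loop-free equivalent of the "discard every second value" trick; starting the slice at index 1 instead of 0 gives the frequencies.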
Date | 2016-11-30 23:19 |
From | Justin Smith |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Out of curiosity, why even store the freqs? The bins are guaranteed to be multiples of a single analysis window length, right? |
Date | 2016-11-30 23:27 |
From | Oeyvind Brandtsegg |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Unless I have completely missed the point this is not necessarily so, since it is a phase vocoder analysis, and the phase vocoder tracks are what we here refer to as "bins", and the frequency for each track may slide according to content. For a simple single sine tone for example, you may have several tracks lumped around the fundamental. For a pitch sweep, the frequency of tracks may follow the sweep, instead of the sweep switching from bin to bin. Those who know better correct me if I'm way off the charts here. |
Date | 2016-11-30 23:42 |
From | Justin Smith |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
My understanding was that nothing about phase vocoding allows selecting arbitrary frequencies for the bins - it's still fundamentally based on a transform that uses a fixed frequency spacing for the output bins that does not adapt dynamically to the input. It's possible to remap the bins during resynthesis of course, but there is no intermediate representation where the bins change in pitch, or even explicitly carry frequency information aside from position in the vector of values (unless I'm severely mistaken). |
Date | 2016-11-30 23:54 |
From | Justin Smith |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
To cite a quote from this paper: http://www.panix.com/~jens/pvoc-dolson.par
"these center frequencies are equally spaced across the entire spectrum from 0 Hz to half the sampling rate"
Here "equally spaced" means each hop is the same separation in Hz (e.g. bins of 15, 30, 45, 60, 75 ... Hz) and the size of the hops is dependent on the reciprocal of the window size used during analysis. So, back to my original point, if you know the spacing of the bins, knowing the frequency of each bin is a single multiplication of an index by a constant, and storing the frequencies directly in the signal would seem to be a waste of RAM (though I don't know Csound's internal structure, hopefully someone else can weigh in on this). |
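The "index times a constant" arithmetic can be sketched as follows, in Python rather than Csound. The sample rate of 44100 Hz and FFT size of 2048 are assumptions chosen for illustration; they give a spacing of about 21.53 Hz, which is the spacing visible in the values printed later in this thread. The function name is hypothetical.

```python
def bin_centre_frequencies(sample_rate, fft_size):
    """Centre frequency in Hz of each bin from 0 Hz up to half the sampling rate."""
    spacing = sample_rate / fft_size          # constant distance between bins
    return [k * spacing for k in range(fft_size // 2 + 1)]


# Assumed settings, for illustration only.
centres = bin_centre_frequencies(44100, 2048)
print(len(centres))
print(centres[:4])
```

This only gives the nominal grid; whether the analysis signal also needs to carry a separate frequency value per bin is the question the rest of the thread addresses.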
Date | 2016-12-01 00:20 |
From | Steven Yi |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Perhaps section 3 from: http://blogs.zynaptiq.com/bernsee/pitch-shifting-using-the-ft/ will help to explain what the frequency values are in the analysis signal. They are derived from the bin frequency + phase. |
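One way to see how a per-bin frequency can come from "bin frequency + phase" is the textbook phase-vocoder estimate: compare the phase of the same bin in two frames one hop apart, subtract the advance expected at the bin centre, wrap the remainder, and convert it to Hz. The sketch below is in Python rather than Csound and is not taken from the pvsanal source; the function names, the 440 Hz test tone and the settings (44100 Hz, FFT size 1024, hop 256) are assumptions for illustration.

```python
import cmath
import math


def dft_bin(signal, start, fft_size, k):
    """Hann-windowed DFT of one frame starting at `start`, evaluated at bin k only."""
    acc = 0j
    for n in range(fft_size):
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / fft_size)
        acc += window * signal[start + n] * cmath.exp(-2j * math.pi * k * n / fft_size)
    return acc


def estimate_bin_frequency(signal, k, fft_size, hop, sample_rate):
    """Estimate the frequency seen by bin k from two frames one hop apart."""
    prev = dft_bin(signal, 0, fft_size, k)
    curr = dft_bin(signal, hop, fft_size, k)
    # Phase advance a component exactly at the bin centre would show over one hop.
    expected = 2 * math.pi * k * hop / fft_size
    deviation = cmath.phase(curr) - cmath.phase(prev) - expected
    # Wrap the leftover phase into [-pi, pi).
    deviation = (deviation + math.pi) % (2 * math.pi) - math.pi
    centre = k * sample_rate / fft_size
    return centre + deviation * sample_rate / (2 * math.pi * hop)


# Assumed test settings, for illustration only.
sample_rate, fft_size, hop = 44100, 1024, 256
tone = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(fft_size + hop)]
for k in (9, 10, 11):
    print(k, estimate_bin_frequency(tone, k, fft_size, hop, sample_rate))
```

With these settings bins 9, 10 and 11 should all report a value close to 440 Hz even though their centres are about 43 Hz apart, so the frequency value carries information that the bin index alone does not.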
Date | 2016-12-01 00:27 |
From | Justin Smith |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
To be clear, does this mean the PVOC signal contains the phase-corrected frequency values? |
Date | 2016-12-01 00:33 |
From | Oeyvind Brandtsegg |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Try this to see the frequency values of the first few bins when analyzing white noise: |
Date | 2016-12-01 06:36 |
From | Victor Lazzarini |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
They contain a detected frequency in Hz for every bin, which may vary from frame to frame. The format is based on an equally spaced binwise pair of amp-freq values.
Victor Lazzarini Dean of Arts, Celtic Studies, and Philosophy
Maynooth University
Ireland
|
Date | 2016-12-01 07:39 |
From | Oeyvind Brandtsegg |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Hi Victor, thanks for chiming in.
I wonder about what it means when it is "based on a equally-spaced binwise pair of amp-freq values." When I print the frequency of the first few bins (as in the recently pasted example in this thread), I get:
first few bin freqs: 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
first few bin freqs: 0.000000 21.533203 43.066406 86.132812 107.666016 129.199219
first few bin freqs: 21.533203 4.657142 22.820997 101.494026 89.441116 149.400269
first few bin freqs: -21.533203 11.820784 50.410923 75.383156 120.206375 137.014877
first few bin freqs: 0.000000 5.535223 43.823414 73.622749 103.943481 138.594788
The printed frequencies are not equally spaced, and not even always in increasing order, so what is equally spaced? |
Date | 2016-12-01 08:16 |
From | Richard Dobson |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
It can be helpful to inject a simple sinusoid into the fft/pvoc process, and remember the "filter bank" model of the FFT. Each bin has not only a nominal centre frequency, it also has a bandwidth (they overlap). Given a strong component such as a fixed sinusoid, adjacent bin frequencies can, so to speak, bunch together around that frequency, up to the limits of the bandwidth of the bin (related to 2πf and the frame overlap amount - too early in the morning for me to be more mathematical than that...). There will be one (or maybe two) bins with a peak amplitude, and neighbouring amplitudes monotonically decreasing. Around DC you can even get negative frequency values. When pitch shifting, you want to preserve this local bunching - this is in principle what the technique of "phase locking" does. This is why, for example, IRCAM's "FFT-1" synthesis technique requires anything up to 9 sequential bins to define a component (of arbitrary frequency) fully.
Frequency is the rate of change of phase, from frame to frame - in effect the accumulated phase differences. So it is always possible to derive the one from the other (it is how FFT-based resynthesis is performed, indeed).
An interesting aspect of the Sliding Pvoc (single sample update) is that the frame overlap is single-sample, and the bandwidth of each bin then extends over the whole frequency range. Thus when performing pitch shifting, using a full resynthesis oscillator bank, there is no need to fiddle with things to get the new frequency into the "right" bin.
Richard Dobson |
Date | 2016-12-01 09:04 |
From | Victor Lazzarini |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
What I meant was that the centre frequencies of the bins are equally spaced and for each bin you get a pair of amp-freq values. Richard Dobson’s reply complements this point by commenting on bandwidths and what happens if two components are found in the same bin. |
Date | 2016-12-02 02:24 |
From | Oeyvind Brandtsegg |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Oh, wow. Thanks. I did not know that the maximum deviation of the frequency for each bin relates to the overlap, nor did I know that there was a maximum deviation (from the equally spaced grid). I tried it out with the printing example csd (above) and indeed saw that doubling the overlap also doubles the range of possible frequencies for each bin. And the bin bandwidth also accounts for the possibility of negative frequencies around DC (?).
Thanks again
Oeyvind |
Date | 2016-12-02 02:29 |
From | Steven Yi |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Thanks Richard for this explanation! |
Date | 2016-12-02 08:33 |
From | Victor Lazzarini |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
My idea is that, at the analysis stage, bin bandwidth is determined by the window type and size. So the hopping does not actually affect it directly. I looked at this in detail and could never find a relationship between hop and bandwidth, but I might have missed something. It makes sense that however much we are hopping, the DFT will still select components around its centre frequency (i.e. a sinusoid will show up in the bins around its frequency depending also on the window shape). But if there is an expression that shows a relationship between hopsize and bin bandwidth, I have not seen it or managed to work it out.
At the synthesis stage, however, I see that we can be liberal with bin freq values if we hop by one, and it will work up to the Nyquist. Of course, if we use additive synthesis, the issue also disappears. |
Date | 2016-12-02 10:03 |
From | Richard Dobson |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
I am probably using the term bandwidth too loosely, but I am not sure what the alternative is. Of course for a single FFT taken in isolation, the bin bandwidth is just that of the classical model, determined by length of frame and sample rate. Accumulating phase over multiple frames, as we do in amp/freq pvoc, introduces this additional dimension where there is an "analysis rate", setting limits on the range and rate of phase changes that can be tracked. We have to (or at least prefer to) wrap the accumulated phase via 2πf, and it would ~appear~ that this "wrapping bandwidth" expands as the hop size reduces (that is, as the overlap increases) until in the SPV limit I found (much to my surprise at the time!) that applying the process to a single input sinusoid resulted in almost all the bin frequencies converging to the frequency of the signal. So I have used the term bandwidth to describe that effect. There are figures in one of the published research papers that illustrate the effect graphically. I have generally assumed it is a consequence of spectral leakage spreading over all bins, and all those bins being able to be "tracked" in sync.
PV frame overlap has historically been described very much in relation to windowing, defining the minimum overlap required by this or that window for proper reconstruction and the usual concerns over capture of transients. Could it be that there is still some useful research to be done on this frequency mapping in the special cases where overlap increases all the way to single-sample?
Richard Dobson |
Date | 2016-12-02 11:03 |
From | Victor Lazzarini |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
I think so, there is a missing/ignored piece. I have been trying to arrive at an expression that captures the idea but still have not found the route. We don’t seem to get there by looking at the analysis part, because whatever we do, the result will be limited by the window bandwidth, so there is nothing to see there. But I think this appears in the synthesis in a number of related questions:
1) the one you mentioned, where the hop-1 PV can have any frequency in any bin.
2) the fact that the freqs will alias beyond this ‘hopsize-related’ bin bandwidth: if you glide the freq of a given bin, it will disappear at one end of the bandwidth (window bw) and reappear at the other end (wrapping around). Reducing the hopsize will make it disappear but take longer to reappear, which makes me conclude the hop-size-related width is getting bigger (as it is taking longer to wrap around). But because of the windowing, we can’t tell how high it is going as the sound disappears. If, however, we double the window size, we will hear the glide continuing for longer. So the window size is still the limiting factor here.
I suspect this is slightly different in the sliding PV because the reconstruction is different. Here’s the example to test this idea:
; assumes function tables 1 and 2 exist and are large enough to hold one value per bin
instr 1
  ihops = 256                            ; hop size
  iwins = 1024                           ; window size
  a1 = 0                                 ; silent input, the frame is filled from the tables
  fs1 pvsanal a1, 1024, ihops, iwins, 1  ; FFT size 1024, von Hann window
  kline line 440, 1, 880                 ; glide from 440 to 880 Hz over 1 second
  pvsftr fs1, 1, 2                       ; read amps from table 1 and freqs from table 2
  tablew 0.75, 10, 1                     ; amplitude of bin 10
  tablew kline, 10, 2                    ; gliding frequency of bin 10
  asig pvsynth fs1
  outs asig, asig
endin
Changing the hopsize ihops from 256 to 128 to 64 will introduce gaps between the successive glides. Changing the window size iwins will allow us to reduce the gaps and hear the continuing glissando. |
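The wrap-around described in point 2 can be sketched numerically. The assumption here is that resynthesis turns each bin frequency into a phase increment per hop, so that frequencies which differ by a multiple of sr/hop give the same increment. This is a Python sketch of that arithmetic only, not of what pvsynth does internally; the function name and the 44100 Hz sample rate are assumptions for illustration.

```python
def wrapped_frequency(freq, k, fft_size, hop, sample_rate):
    """Frequency implied for bin k once the per-hop phase increment is wrapped."""
    centre = k * sample_rate / fft_size        # nominal bin centre in Hz
    half_range = sample_rate / (2 * hop)       # largest deviation before wrapping
    deviation = (freq - centre + half_range) % (2 * half_range) - half_range
    return centre + deviation


# Bin 10 of a 1024-point analysis at an assumed 44100 Hz sample rate.
for hop in (256, 128, 64):
    glide = [wrapped_frequency(f, 10, 1024, hop, 44100) for f in (440, 520, 600, 700, 880)]
    print(hop, [round(f, 1) for f in glide])
```

With a hop of 256 a requested 520 Hz already lands on the other side of the bin centre, while with a hop of 128 it is still represented as 520 Hz, which fits the observation that a smaller hop takes longer to wrap around. The sketch says nothing about audibility, which the thread attributes to the window.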
Date | 2016-12-02 11:29 |
From | Victor Lazzarini |
Subject | Re: FFT opcodes pvs2array rfft mags phs etc.. |
Actually, it's probably more trivial than we have assumed. The width is related to the frame rate: a hop size of 256 gives cf ± 86 Hz, and this happens to be the von Hann width as well. So the two things go together. |
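The two figures can be checked with a few lines of arithmetic. The sketch below is in Python and assumes a sample rate of 44100 Hz (the thread does not state it explicitly), a wrap limit of half the frame rate, and the standard null-to-null main-lobe width of four bins for a von Hann window; the function names are hypothetical.

```python
def wrap_range_hz(sample_rate, hop):
    """Largest deviation from the bin centre before the per-hop phase wraps."""
    return sample_rate / (2 * hop)


def hann_half_mainlobe_hz(sample_rate, window_size):
    """Half the null-to-null main-lobe width of a von Hann window (2 bins)."""
    return 2 * sample_rate / window_size


# Assumed sample rate of 44100 Hz, window size 1024 as in the example above.
for hop in (256, 128, 64):
    print(hop, wrap_range_hz(44100, hop))
print(hann_half_mainlobe_hz(44100, 1024))
```

Under these assumptions the two values coincide at about 86 Hz when the hop is a quarter of the window size, and halving the hop doubles the wrap range while leaving the window width unchanged.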