[Csnd] pvs, pitching, pvstanal
Date | 2011-10-28 10:53 |
From | Oeyvind Brandtsegg |
Subject | [Csnd] pvs, pitching, pvstanal |
Hello

I'm trying to do a rather drastic pitch shift of some material containing very low frequencies (<5Hz). I'm comparing the results from three pvs based methods.
1: pvsanal and pvscale, pvsynth
2: pvsifd, partials and resyn
3: pvstanal (and pvslock), pvsynth
They sound quite different, and the pvsifd/partials method definitely gives more detail. However, as I'm running an fftsize of 32768 to get enough frequency resolution, pvsifd/partials is quite expensive. In the final implementation, I want to build something that runs in realtime, so performance is a consideration, but for the first experiments I do it in deferred time.

My first question is:
If I do not do time modification, and (for the time being) discard transient detection, what is the technical difference between using pvstanal (and its pitch shifting method) and using pvsanal for analysis and pvscale for pitch shift? They do sound different, with pvstanal having much less of the high frequency fft "shimmer" artifacts.

And:
Would it be possible to have a "live input" version of pvstanal (without time modification), so that it does not need to work on the sound in a table but could use audio input, like pvsanal?

And:
Any tips or tricks to get better frequency resolution are greatly appreciated. I realize I'm at the wrong end of the spectrum for fine frequency resolution, but almost all of my signal is < 200Hz and the most significant parts are < 20Hz. I thought about doing the frequency analysis at a lower sampling rate, but I need to import the data into an orchestra running at normal sr (44.1 or thereabouts) to use it for musical applications.

best
Oeyvind

Send bugs reports to the Sourceforge bug tracker https://sourceforge.net/tracker/?group_id=81968&atid=564599
Discussions of bugs and features can be posted here
To unsubscribe, send email sympa@lists.bath.ac.uk with body "unsubscribe csound" |
Date | 2011-10-28 11:16 |
From | peiman khosravi |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
> And:
> Would it be possible to have a "live input" version of pvstanal
> (without time modification)?, so that it does not need to work on the
> sound in a table but could use audio input, like pvsanal.

+1

P |
Date | 2011-10-28 11:21 |
From | peiman khosravi |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
Oeyvind,

This is for Victor to say, but I suspect that the only difference between pvsanal and pvstanal is the transient detection. The problem with pvscale is that when you start changing the frequency scaling, the bin phases are all messed up and the transients become blurred. So I suspect that without the transient detection pvstanal would give a result similar to pvscale.

P

On 28 October 2011 10:53, Oeyvind Brandtsegg |
Date | 2011-10-28 12:52 |
From | Richard Dobson |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
I think you really do need to resample the source file to a much lower sampling rate first, as an intermediate pre-processing step. That will move the 5Hz signal higher up the range, relative to Nyquist, which in turn will give the pitch shift an easier time (and a smaller frame size). The important calculation is the ratio of resolution to Nyquist. 5/1000 is a much bigger number than 5/22050. Leave room in the bandwidth for at least the first stage of pitch shift. So maybe in practice something like sr=4kHz.

Otherwise, at say sr=44kHz, 5Hz is not that far off DC, and any fft system will struggle, the more so in 32 bits as it converts to a very small "normalised" number. To resolve 5Hz as a peak, your FFT resolution needs to be a fraction of that (~at least~ a factor of 4, preferably more), suggesting a frame size more like 64K or even 128K. Ordinary filters will have similar difficulties, for the same reason.

Are you able to do an offline pre-processing step of this kind? If you are trying to do it in real time together with other more orthodox full-bandwidth processing, you have no choice but to use large framesizes (need doubles build of Csound), and accept both the CPU cost and the latency.

Richard Dobson |
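The rule of thumb in this message can be turned into a quick calculation. What follows is only a sketch in Python, assuming that bin spacing is sr / N, that frame sizes are powers of two, and that the "factor of 4" is taken literally. The function name `min_fft_size` is made up for illustration and is not part of Csound.

```python
import math

def min_fft_size(sr, lowest_hz, factor=4):
    """Smallest power-of-two frame size whose bin spacing (sr / N)
    is no wider than lowest_hz / factor."""
    target_spacing = lowest_hz / factor   # e.g. 5 Hz / 4 = 1.25 Hz
    exact = sr / target_spacing           # frame size if any length were allowed
    return 2 ** math.ceil(math.log2(exact))

for sr in (44100, 4000):
    n = min_fft_size(sr, 5.0)
    print(sr, n, sr / n)                  # sample rate, frame size, bin spacing in Hz
```

Under these assumptions a 5Hz component needs a 65536-point frame at sr=44100 but only 4096 points at sr=4000, which is the saving the resampling step is meant to buy.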
Date | 2011-10-28 18:37 |
From | Oeyvind Brandtsegg |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
Thanks for the explanation Richard,

I guess I'll have to figure out how to do some preprocessing, but I'd ideally like to get it done as "streaming" audio, continuous realtime audio in and out... hehe, I've got time to think about how to do it, the RT version will not happen for a year at least. But I'm doing pre-production now and will compose with offline rendered sounds, so I'll experiment with lower sampling rates. Thanks again for confirming.

One last question about the frequency resolution, though:
In the "pvsbin.csd" included below, I generate a 3Hz sine wave with an oscil and then analyze it with pvsanal. Inspecting the first few bins shows that there is indeed a bin with frequency 3Hz that has significant amplitude.
How do the pvs opcodes tackle frequencies that deviate from the center frequency of each bin (as in linearly spaced fft bins)?
I would expect the lowest frequency to be at about 43Hz in the example, but there are several bins (16 of them) that report frequencies below that.

best
Oeyvind

;**********************
Date | 2011-10-28 18:48 |
From | Victor Lazzarini |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
> My first question is:
> If I do not do time modification, and (for the time being) discard
> transient detection, what is the technical difference between using
> pvstanal (and its pitch shifting method), compared to using pvsanal
> for analysis and pvscale for pitch shift ? They do sound different,
> with pvstanal having much less of the high frequency fft "shimmer"
> artifacts.

The difference between pvstanal and pvsanal + pvscale is this:

1) pvstanal shifts the pitch in the time domain (resampling) and then does the pv analysis (timescaling), keeping time intact. There is no anti-aliasing filtering, so sounds with very high frequency content (or with a dramatic pitch shift) will alias. In practice, the result appears to be OK with regards to this.

2) pvscale scales the frequencies found in the analysis bins, so the operation is totally spectral domain, no aliasing. Because of the bin shifting there is a certain loss of phase coherency, which I guess is the cause of the shimmer.

> And:
> Would it be possible to have a "live input" version of pvstanal
> (without time modification)?, so that it does not need to work on the
> sound in a table but could use audio input, like pvsanal.

Well, as I said, there's inherent time modification in pvstanal, so any 'live input' would require a certain amount of buffering. I could create an opcode, but this in fact can be achieved by combining pvstanal and tabw.

> And:
> Any tips or tricks to get better frequency resolution is greatly
> appreciated. I realize I'm at the wrong end of the spectrum for fine
> frequency resolution, but almost all of my signal is < 200Hz and the
> most significant parts are < 20Hz. I thought about doing the frequency
> analysis at a lower sampling rate, but I need to import the data into
> an orchestra running at normal sr (44.1 or thereabouts) to use it for
> musical applications.

Nothing beyond the usual things occurs to me at the moment.

Dr Victor Lazzarini
Senior Lecturer
Dept. of Music
NUI Maynooth Ireland
tel.: +353 1 708 3545
Victor dot Lazzarini AT nuim dot ie |
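To illustrate the aliasing point in Victor's reply: reading a table faster than normal multiplies every frequency in it, and without an anti-aliasing filter anything pushed past Nyquist folds back down. The Python sketch below shows only that folding arithmetic for a single sinusoid. It is not how pvstanal is implemented, and the name `folded_frequency` is hypothetical.

```python
def folded_frequency(freq_hz, pitch_ratio, sr):
    """Frequency that results when a sampled sinusoid is read back at
    pitch_ratio times normal speed with no anti-aliasing filter."""
    nyquist = sr / 2
    shifted = (freq_hz * pitch_ratio) % sr    # wrap around the sample rate
    # Anything above Nyquist is mirrored back into the 0..Nyquist range.
    return shifted if shifted <= nyquist else sr - shifted

print(folded_frequency(15000, 2, 44100))  # 30 kHz target folds back below Nyquist
print(folded_frequency(5, 8, 44100))      # low material stays where expected
```

For material concentrated below 200Hz, as described at the start of the thread, even a large shift ratio leaves the shifted frequencies far below Nyquist, which fits the observation that the result is OK in practice.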
Date | 2011-10-28 19:06 |
From | Oeyvind Brandtsegg |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
Ah good. That explains perfectly why the two methods sound so different.
And no need for a new opcode, it's quite ok to do it with pvstanal and tabw.
So I would need a buffer long enough to accommodate pitch shifting of one fft window?
e.g. a 2048 samples buffer to transpose up one octave when using a window size of 1024?

best
Oeyvind

2011/10/28 Victor Lazzarini |
Date | 2011-10-28 19:20 |
From | Victor Lazzarini |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
Well, I'd say you will need a table long enough to accommodate (one window + hopsize) * pitchshift samples (read every hopsize). This is because two windows separated by a hopsize are needed for each analysis, and the data going into the window will be resampled from the table. Say you are shifting one octave: then the analysis will jump every other sample on the table to fill one window, which means a length of N*2 samples. You will also need the adjacent frame, which adds hopsize*2 samples. I think that should do.

Victor |
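Victor's estimate, (window + hopsize) * pitchshift, is easy to check numerically. The Python sketch below applies that formula as stated. The hop size of 256 (window / 4) is an assumption for the example, and `min_table_length` is a made-up helper name.

```python
import math

def min_table_length(window, hop, pitch_ratio):
    """Table length in samples following the estimate
    (window + hop) * pitch_ratio, rounded up to a whole sample."""
    return math.ceil((window + hop) * pitch_ratio)

# One octave up with a 1024-sample window and an assumed hop of 256:
print(min_table_length(1024, 256, 2))
```

With these numbers the estimate comes to 2560 samples, so the 2048-sample buffer suggested in the question would fall short by hopsize*2 = 512 samples.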
Date | 2011-10-28 19:21 |
From | Victor Lazzarini |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
By the way, for vocal pitch shifting, pvscale is much better because it preserves formants (quite well with the latest code). |
Date | 2011-10-29 00:25 |
From | Oeyvind Brandtsegg |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
thanks. Oeyvind 2011/10/28 Victor Lazzarini |
Date | 2011-10-29 01:20 |
From | Richard Dobson |
Subject | Re: [Csnd] pvs, pitching, pvstanal |
Unfortunately that is a short question that really requires a long answer, but the short answer is "spectral leakage". The "ideal" situation is for a source sinusoid to register in exactly one bin, but that only happens if (a) the frequency matches the centre frequency of the bin exactly and (b) you use a plain rectangular window. But for most purposes that is a bad idea. Frequencies generally do not fit FFT bins exactly, and the result is that even though the input is a sinusoid, literally ~all~ the bins get varying amounts of energy: the above-mentioned spectral leakage. We can mitigate the effects of this by using a window such as Hamming (it's really a sort of filter), but we can't eliminate it entirely.

The added problem in the phase vocoder, which tracks phase from one frame to the next in order to get the instantaneous frequency, or rate of change of phase, is that it does not simply track the phase of the input sinusoid as we would like. It tracks the phase of everything else too, which leads to all sorts of mayhem and complications. It remains a miracle the darned thing works at all, really! Each bin has a bandwidth (a fraction of +- Nyquist; so you can even get some negative frequencies in the low bins), and the best we can hope for is that the computed frequencies in adjacent bins sort of bunch up close together around a peak. But defining a peak that low down is going to be a bit, um, hit and miss.

That phase tracking depends on the overlap factor, and the rule of thumb is that the greater the overlap, the better the tracking. So try ifftsize/8, or even more; it might (but I am guessing here, too late to test anything) help sort out those low frequencies a bit. But you are asking quite a lot of pvoc down at that range! And you probably still need an fftsize equal to the sample rate, i.e. covering a whole second of audio.

Richard Dobson

On 28/10/2011 18:37, Oeyvind Brandtsegg wrote: ..
> One last question about the frequency resolution, though:
> In the "pvsbin.csd" included below, I generate a 3Hz sine wave with an
> oscil and then analyze it with pvsanal.
> Inspecting the first few bins shows that there is indeed a bin with
> frequency 3Hz that has significant amplitude.
> How do the pvs opcodes tackle frequencies that deviate from the
> center frequency of each bin (as in linearly spaced fft bins)?
> I would expect the lowest frequency to be at about 43Hz in the
> example, but there are several bins (16 of them) that report
> frequencies below that. |
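The leakage Richard describes can be seen with a very small experiment. The Python sketch below uses a plain DFT with a rectangular window on 64 samples. It says nothing about how the pvs opcodes compute their bin frequencies, and all names in it are made up for illustration. One sinusoid sits exactly on bin 8, the other halfway between bins 8 and 9.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Normalised magnitudes of bins 0..N/2 of a plain DFT
    (rectangular window, no overlap, no phase tracking)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2 + 1)]

def sine(cycles_per_frame, n):
    """n samples of a sinusoid completing cycles_per_frame cycles."""
    return [math.sin(2 * math.pi * cycles_per_frame * t / n) for t in range(n)]

def bins_above(magnitudes, threshold):
    """Indices of the bins whose magnitude exceeds threshold."""
    return [k for k, m in enumerate(magnitudes) if m > threshold]

N = 64
on_bin = dft_magnitudes(sine(8.0, N))    # exactly on the centre of bin 8
off_bin = dft_magnitudes(sine(8.5, N))   # halfway between bins 8 and 9

print(bins_above(on_bin, 1e-6))          # bins holding energy, on-centre case
print(len(bins_above(off_bin, 1e-3)))    # how many bins pick up energy off-centre
```

In the on-centre case only bin 8 registers, while the off-centre sinusoid spreads measurable energy over many bins. In a phase vocoder each of those bins also reports a frequency estimate, which would be consistent with several low bins reporting values near the true 3Hz in the question above.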
Date | 2011-11-01 15:16 |
From | andreas russo |
Subject | [Csnd] pvstanal not installed? [fix] |
I apologize, I pasted the wrong example.

turnon 1
; load the sound file into a deferred-size table (GEN01)
gifil ftgen 0, 0, 0, 1, "become.wav", 0, 0, 1

instr 1
  anoise gauss 10
  fsrc   pvsanal anoise, 1024, 1024/4, 1024, 0
  ; table-based analysis: time scale 1, amplitude 1, pitch 1
  fdest1 pvstanal 1, 1, 1, gifil
  fsig   pvscross fsrc, fdest1, 0, 1
  ares   pvsadsyn fsig, 512, .25
  out ares
endin