
[Csnd] How Does FFT Overlap Add Analysis Work?

Date: 2017-10-29 20:44
From: Emmett Palaima
Subject: [Csnd] How Does FFT Overlap Add Analysis Work?
Hi, I have been doing some research on spectral processing, including a C++ implementation using the FFT class in JUCE. The analysis method for this class is fairly barebones, analyzing one block of samples at a time. As such it sounds a little different from the Csound pvsanal / pvsynth opcodes, which use overlap-add analysis.

I am wondering how FFT analysis methods use overlapping windows. Are the windows averaged at some point during analysis, or simply windowed and summed at the output?

Can anyone give an explanation of how this works or perhaps point to some literature describing the topic?

Thanks,
Emmett
Csound mailing list Csound@listserv.heanet.ie https://listserv.heanet.ie/cgi-bin/wa?A0=CSOUND Send bugs reports to https://github.com/csound/csound/issues Discussions of bugs and features can be posted here

Date: 2017-10-30 10:06
From: Richard Dobson
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?
It's not necessarily "simple", but the resynthesised frames (Inverse 
FFT) are summed to produce the output. The idea of windowing is to act 
as a sort of low pass filter, smoothing the ends of each FFT-sized block 
of samples (as one must expect such an arbitrary block to have abrupt 
mid-cycle terminations). The window has certain properties (defined 
mathematically) such that the overlapped output "sums to 1" - so no net 
amplitude modulation in the output. Not all window shapes (e.g. used 
purely for analysis) offer this property.  The simplest such method just 
overlaps by 50% - almost literally one frame is dovetailed with the 
next. But for other reasons, including the need to capture more or less 
sudden transients in the source, a phase vocoder may overlap as many as 
eight frames...and in the recent "sliding phase vocoder" approach the 
overlap is effectively sample by sample. The result of all this is that 
in the absence of any frame modifications, the resynthesised signal is 
identical to the source. Of course, for us, it is those changes that 
make it all very "interesting".


The original, classic, paper by Mark Dolson is available online here:

https://www.eumus.edu.uy/eme/ensenanza/electivas/dsp/presentaciones/PhaseVocoderTutorial.pdf

and of course the more recent books on Csound (and Audio Programming) 
also cover it in detail. That reading will be essential if you want to 
get into the mathematics of it all!

Richard Dobson



Date: 2017-11-05 03:42
From: Emmett Palaima
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?
Okay, I looked at the audio programming book example for the phase vocoder and implemented it in JUCE. After I had the basic phase vocoder conversion working well, I started experimenting with spectral effects, specifically remaking the pvsblur opcode. 

I've gotten this to work fairly well, but am experiencing a couple of issues. With an fftsize of 1024 and an overlap of 512 the effect sounds fine, but still a little dirtier than the Csound version, which is really clean. With an fftsize of 1024 and an overlap of 256 the effect starts getting really noisy, a problem which persists even after I turn blur time to zero, leading me to believe it's an issue with the phase averaging. I've confirmed this by setting the phase to zero for all pvoc windows, which gets rid of the noise (though at the cost of creating the robotic sound associated with deleting phase information).

I've been checking through the Csound source code and can't find anything I am doing differently. The effect simply does pvoc analysis, averages frames for an amount of time set by a blur time slider (and potentially modulated by an envelope follower, though I mostly turn this off for testing), then resynthesizes.

Was wondering if anyone else might be able to help me out at this point. I've made a git repo which y'all can check out here (all the DSP-relevant stuff is in PluginProcessor.cpp):


Thanks!




Date: 2017-11-05 22:30
From: Andrea Crespi <4ndr34cr35p1@GMAIL.COM>
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?

Hi, not sure if I can help here (and I have not had time to look into the code yet), but it sounds strange to me when you write that you are averaging the phase components over time: I believe that the original opcode runs the lowpass filter on the frequency components instead.

 


 


Date: 2017-11-06 00:45
From: Emmett Palaima
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?
The opcode is described as follows: "Average the amp/freq time functions of each analysis channel for a specified time."

In the implementation one can see both components of the pvoc buffer being averaged here (pvsbasic.c lines 2121-2128):

        for (j = first; j != countr; j = (j + framesize) % mdel) {
          amp += delay[j + i];
          freq += delay[j + i + 1];
        }

        fout[i] = (float) (amp / delayframes);
        fout[i + 1] = (float) (freq / delayframes);
        amp = freq = 0.;
 
Is there something I am missing here?
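
For comparison, the averaging step can be written as a small standalone function. This is a simplified sketch, not the Csound implementation: it assumes the stored frames hold interleaved amp/freq pairs as in the excerpt above, and because a mean does not depend on order it walks the frames linearly instead of following the circular-buffer indices. The names average_frames, frame_size and num_frames are hypothetical.

```c
#include <stddef.h>

/* Average num_frames stored frames of interleaved amp/freq pairs.
   delay holds the frames back to back (frame_size floats each);
   out receives one averaged frame. Assumes num_frames > 0. */
void average_frames(const float *delay, size_t frame_size,
                    size_t num_frames, float *out)
{
    size_t i, f;
    for (i = 0; i + 1 < frame_size; i += 2) {
        double amp = 0.0, freq = 0.0;
        /* Accumulate this bin's amp and freq over every stored frame. */
        for (f = 0; f < num_frames; f++) {
            amp  += delay[f * frame_size + i];
            freq += delay[f * frame_size + i + 1];
        }
        out[i]     = (float) (amp / num_frames);
        out[i + 1] = (float) (freq / num_frames);
    }
}
```

Note that the value being averaged in the second slot is a frequency, not a raw FFT phase; averaging wrapped phases directly would not be equivalent.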


Date: 2017-11-06 00:49
From: Emmett Palaima
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?
Wait, sorry, I think I now understand what you are saying. pvsblur does indeed average frequency data, which controls phase data down the line when it is converted back into FFT format.


Date: 2017-11-06 07:48
From: Andrea Crespi <4ndr34cr35p1@GMAIL.COM>
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?

Yes, not sure what would be the effect of averaging phase. Did this solve your issues with noise in the resynthesised signal?

 


 


Date: 2017-11-06 15:13
From: Emmett Palaima
Subject: Re: [Csnd] How Does FFT Overlap Add Analysis Work?
Oh, I was averaging frequency already. Just phrased it that way since it ultimately showed up as a phase issue. 


Date2017-11-07 23:22
FromEmmett Palaima
SubjectRe: [Csnd] How Does FFT Overlap Add Analysis Work?
One thing I am noticing which might be causing the discrepancy in effects is this: 

In The Audio Programming Book, the FFT function appears to take and return an array of the same size: you pass in 1024 samples and get 512 real+imaginary pairs as output. With the JUCE functions you pass in an array of length FFTSIZE * 2, the first half of which should be filled with real samples, meaning that you pass in 1024 samples and get 1024 real+imaginary pairs as output.

Am I understanding this correctly? What is the reason for this difference?


Date2017-11-08 07:47
FromAndrea Crespi <4ndr34cr35p1@GMAIL.COM>
SubjectRe: [Csnd] How Does FFT Overlap Add Analysis Work?

I am not familiar with JUCE, but it seems reasonable to me. If I understand correctly, the Csound function assumes that you are computing the DFT of a real signal and simply ignores all the information above the Nyquist frequency, as it is redundant in the case of real input signals. Apparently the FFT function you are using in JUCE is more general and leaves to the user the freedom to keep that information or neglect it. Of course, in the Csound version, the user also needs to take care that the correct IFFT algorithm is used when resynthesising. Please someone correct me if this is not true or contains mistakes.
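
The redundancy mentioned above can be checked numerically. The sketch below is an assumption-laden illustration, not Csound or JUCE code: it uses a naive O(N^2) DFT in place of a real FFT, and the names `naiveDft` and `maxSymmetryError` are made up for the example. For a real input of length N, it measures how far the bins stray from X[N-k] = conj(X[k]).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) DFT of a real signal, for illustration only.
std::vector<Complex> naiveDft(const std::vector<double>& x)
{
    assert(!x.empty());
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);

    std::vector<Complex> X(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t t = 0; t < n; ++t)
            X[k] += x[t] * std::polar(1.0, -2.0 * pi * double(k * t) / double(n));
    return X;
}

// Largest deviation from X[N-k] == conj(X[k]) over k = 1..N-1.
double maxSymmetryError(const std::vector<Complex>& X)
{
    const std::size_t n = X.size();
    double worst = 0.0;
    for (std::size_t k = 1; k < n; ++k)
        worst = std::max(worst, std::abs(X[n - k] - std::conj(X[k])));
    return worst;
}
```

With real input the error is at rounding level, which is why an N-point real transform only needs bins 0 to N/2 to describe the signal.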

Cheers

 


Date2017-11-08 21:28
FromRichard Dobson
SubjectRe: [Csnd] How Does FFT Overlap Add Analysis Work?
I can't answer specifically for JUCE as I have not used that, but this 
is called "zero padding" of the time domain signal, which results in 
interpolation in the frequency domain. Useful for e.g. displaying the 
spectrum of a signal (which might be the assumed use case in JUCE). You 
can have quite large "zero padding ratios", e.g. when obtaining the 
spectrum of a filter's impulse response.
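
As a general illustration of zero padding (it says nothing about what JUCE actually does internally), the sketch below pads N samples to 2N and compares bins. It relies on a naive single-bin DFT, and `paddedDftBin` and `maxEvenBinMismatch` are hypothetical names for this example. Every even bin 2k of the padded transform should match bin k of the unpadded one, while the odd bins are the interpolated in-between values.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Bin k of the fftSize-point DFT of x, with x treated as zero-padded up to fftSize.
// The padded zeros contribute nothing, so the sum only runs over x.size() samples.
Complex paddedDftBin(const std::vector<double>& x, std::size_t k, std::size_t fftSize)
{
    assert(fftSize >= x.size() && fftSize > 0);
    const double pi = std::acos(-1.0);
    Complex sum;
    for (std::size_t t = 0; t < x.size(); ++t)
        sum += x[t] * std::polar(1.0, -2.0 * pi * double(k * t) / double(fftSize));
    return sum;
}

// Compare bin 2k of the 2N-point (padded) DFT with bin k of the N-point DFT.
double maxEvenBinMismatch(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    double worst = 0.0;
    for (std::size_t k = 0; k < n; ++k)
        worst = std::max(worst,
                         std::abs(paddedDftBin(x, 2 * k, 2 * n) - paddedDftBin(x, k, n)));
    return worst;
}
```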

Richard Dobson


Date2017-11-08 21:47
FromEmmett Palaima
SubjectRe: [Csnd] How Does FFT Overlap Add Analysis Work?
I believe the answer suggested by Andrea is correct. The JUCE FFT class is designed to be a minimal, basic FFT, so I doubt it would have a built-in and unavoidable amount of zero padding equal to double the size of the input. I've also managed to get a phase vocoder working and sounding clean for realtime input, which I think would be impossible if that amount of padding was being added.

At this point I've successfully optimized by only processing the bottom half of the fft in pvoc format, with no sacrifice in sound quality.
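
A rough sketch of why dropping the upper half loses nothing for real input: keep bins 0 to N/2, rebuild the rest as complex conjugates, and invert. This is an illustration under stated assumptions, not the plugin code; it uses naive O(N^2) transforms instead of the JUCE FFT, and `halfSpectrum` and `resynthesise` are made-up names.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive forward DFT that only computes bins 0..N/2 (N must be even).
std::vector<Complex> halfSpectrum(const std::vector<double>& x)
{
    const std::size_t n = x.size();
    assert(n >= 2 && n % 2 == 0);
    const double pi = std::acos(-1.0);

    std::vector<Complex> half(n / 2 + 1);
    for (std::size_t k = 0; k <= n / 2; ++k)
        for (std::size_t t = 0; t < n; ++t)
            half[k] += x[t] * std::polar(1.0, -2.0 * pi * double(k * t) / double(n));
    return half;
}

// Rebuild bins N/2+1..N-1 by conjugate symmetry, then run a naive inverse DFT.
std::vector<double> resynthesise(const std::vector<Complex>& half)
{
    assert(half.size() >= 2);
    const std::size_t n = 2 * (half.size() - 1);
    const double pi = std::acos(-1.0);

    std::vector<Complex> full(n);
    for (std::size_t k = 0; k <= n / 2; ++k)
        full[k] = half[k];
    for (std::size_t k = n / 2 + 1; k < n; ++k)
        full[k] = std::conj(half[n - k]);

    std::vector<double> x(n);
    for (std::size_t t = 0; t < n; ++t) {
        Complex sum;
        for (std::size_t k = 0; k < n; ++k)
            sum += full[k] * std::polar(1.0, 2.0 * pi * double(k * t) / double(n));
        x[t] = sum.real() / double(n);
    }
    return x;
}
```

Any spectral processing would sit between the two calls and only touch the N/2 + 1 stored bins.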

One thing I've noticed is that a lot of the Csound opcodes skip processing on the first bin. Is that just due to the way the Csound pvoc data is formatted, or is there another reason?


Date2017-11-09 11:52
FromAndrea Crespi <4ndr34cr35p1@GMAIL.COM>
SubjectRe: [Csnd] How Does FFT Overlap Add Analysis Work?
If my math is not too rusty, the frequency related to the first bin is fixed at 0 Hz. This is because the complex sinusoid in the Fourier summation degenerates to a real constant (of value 1) when the bin index k is set to 0, so for a real input that channel is purely real and its phase can only be 0 radians (or π when the DC term is negative). So when you compute the time derivative of the phase in the first channel you get 0 radians (and thus 0 Hz) as the frequency deviation from 0 Hz (which is the first bin's centre frequency), except at a frame where the DC term changes sign. I guess that in most cases it does not really make sense to run a target PV processing algorithm on a channel which is fixed at DC.

However, I am no real expert so I suggest we wait for more experienced fellows to confirm or correct these statements.
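
The statement about bin 0 can be checked with a few lines. The sketch below is only an illustration with a naive single-bin DFT (`dftBin` is a hypothetical helper, not a Csound function): for k = 0 every term of the sum is just the sample itself, so the bin is the plain sum of the frame and has no imaginary part. Its phase is therefore 0 for a positive sum and has magnitude π for a negative one.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Bin k of the naive DFT of a real frame x.
Complex dftBin(const std::vector<double>& x, std::size_t k)
{
    assert(!x.empty());
    const std::size_t n = x.size();
    const double pi = std::acos(-1.0);
    Complex sum;
    for (std::size_t t = 0; t < n; ++t)
        sum += x[t] * std::polar(1.0, -2.0 * pi * double(k * t) / double(n));
    return sum;
}

// Phase of bin 0 for a real frame; std::arg gives the angle in radians.
double dcPhase(const std::vector<double>& x)
{
    return std::arg(dftBin(x, 0));
}
```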
