| Tim Mortimer wrote:
> OK, the saga continues...
>
> i used an existing .pvx file i had to run through
> -U pv_export
> to get some sort of idea of pertinent header info
>
> I then created the attached testout.txt file & used
> -U pv_import
> to create the attached testout.pvx file
>
> & then i play it back using readcreated.csd - also attached
>
> but it sounds "broken"
>
> how can i improve / fix it?
>
I get clean sine tones (albeit with hard on/off transients) with
pvocadditivetest.csd, but clearly not with the frequencies you
intend...so I assume that is what you mean by "broken". With
readcreated.csd (and setting aside the opening frame which is DC) there
are slight discontinuities which I am guessing arise from the use of the
Hamming window. I have kind of gone off Hamming simply because the
non-zero ends so often lead to reconstruction problems of this kind,
and now would use Hann, or Kaiser when available.
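
As a rough illustration of the "non-zero ends" point, the sketch below evaluates the textbook symmetric forms of the two windows at their end points. The formulas are the standard ones; whether Csound's analysis code uses exactly these forms is an assumption here.

```python
import math

def hamming(n, size):
    """Textbook symmetric Hamming window sample (non-zero at both ends)."""
    return 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))

def hann(n, size):
    """Textbook symmetric Hann window sample (falls to zero at both ends)."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (size - 1))

N = 2048  # frame size, matching the FFT size discussed in this thread
hamming_ends = (hamming(0, N), hamming(N - 1, N))
hann_ends = (hann(0, N), hann(N - 1, N))

print("Hamming ends:", hamming_ends)  # both close to 0.08
print("Hann ends:   ", hann_ends)     # both close to 0.0
```

The Hamming window never drops below about 0.08, so each frame starts and stops with a small step, which is one plausible source of the discontinuities described above.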
> also i note that despite defining the freq of the test bin as 500hz, the
> freq seems to change depending on what bin i put the amp & freq data in, &
> yet by my understanding (& in fact i did perform some alternate tests at one
> stage to back this up - but of course now failing to be able to
> reproduce...) you can pretty much arbitrarily define any freq in any bin?
>
For standard FFT this is not the case. Each bin has a specific centre
frequency (bin number * sr/FFTsize) and for overlapped frames a more or
less narrow bandwidth (+- sr/2/overlap) (if I remember the maths
offhand!). A single
FFT frame has bins with essentially no bandwidth at all (i.e. fixed
frequencies which are harmonics of the FFT fundamental)- a source
partial is represented by ~all~ the bins, to a greater or lesser extent
(the one exception is when a cycle fits ~exactly~ in an FFT frame, with
no windowing other than rectangular). With standard pvoc it is always
important to ensure the frequency value is put in a bin that can
accommodate it. In this case (N=2048, sr=48000) the bin spacing is
48000/2048 = 23.4375Hz, so the target bin for 500Hz is bin 21 (bin 23
would be right for sr=44100). Or rather, since 500Hz does not sit
~exactly~ on a bin centre, you will probably want to put that same
frequency in adjacent bins too (so at least 20,21,22), with an
amplitude shape like the
central lobe of a sinc function. It is all pretty complicated! Some
years ago IRCAM patented an "FFT-1" algorithm for doing fast
additive synthesis via FFT by creating the correct (if truncated)
bunches of bins for each partial (using anything up to 9 bins for each
partial IIRC). Getting this right is admittedly probably like pulling
teeth. And rumour has it it didn't end up being all that fast anyway.
But it could still be very interesting to come up with a version of
FFT-1 for Csound. Might raise a few hackles in Paris.
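
The bin arithmetic can be checked with a few lines of Python. This is only a sketch: the helper names (`bin_spacing`, `nearest_bin`, `bin_bunch`) are made up for illustration, and since the answer depends on the sample rate, both common rates are evaluated.

```python
def bin_spacing(sr, fft_size):
    """Distance in Hz between adjacent FFT bin centres."""
    return sr / fft_size

def nearest_bin(freq_hz, sr, fft_size):
    """Index of the bin whose centre lies closest to freq_hz."""
    return int(round(freq_hz / bin_spacing(sr, fft_size)))

def bin_bunch(freq_hz, sr, fft_size, halfwidth=1):
    """Nearest bin plus `halfwidth` neighbours each side, clipped to 0..N/2."""
    k = nearest_bin(freq_hz, sr, fft_size)
    lo = max(0, k - halfwidth)
    hi = min(fft_size // 2, k + halfwidth)
    return list(range(lo, hi + 1))

for sr in (48000, 44100):
    k = nearest_bin(500.0, sr, 2048)
    centre = k * bin_spacing(sr, 2048)
    print(f"sr={sr}: bin {k}, centre {centre:.4f} Hz, "
          f"bunch {bin_bunch(500.0, sr, 2048)}")
```

With N=2048 this gives bin 21 at sr=48000 and bin 23 at sr=44100, so it is worth confirming which sample rate the file actually uses before choosing bins by hand.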
One way to see what they are trying to do is just to look at the output
FFT of an input sinusoid of arbitrary frequency. There will be a peak
bin of course, plus bins either side at reducing amplitudes (the shape
and levels depending on the window used). BUT: the left-of-centre bin
may be larger or smaller than the right-of-centre bin, depending on
which side of the bin centre the frequency falls. The
FFT-1 algorithm works out just what each amplitude should be, so that
the output frequency is nailed exactly.
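
The asymmetry is easy to see numerically. The sketch below uses a deliberately tiny frame (N=64 at a hypothetical sr of 6400, so bins are 100Hz apart), a Hann window, and a direct DFT of a few bins. It only illustrates the analysis side; it is not an implementation of the IRCAM algorithm.

```python
import cmath
import math

def windowed_bin_magnitudes(freq_hz, sr, fft_size, bins):
    """Magnitudes of selected DFT bins for one Hann-windowed sine frame."""
    frame = [
        (0.5 - 0.5 * math.cos(2.0 * math.pi * n / fft_size))
        * math.sin(2.0 * math.pi * freq_hz * n / sr)
        for n in range(fft_size)
    ]
    mags = {}
    for k in bins:
        # Direct DFT of bin k (slow, but fine for a 64-point frame).
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / fft_size)
                  for n, x in enumerate(frame))
        mags[k] = abs(acc)
    return mags

SR, N = 6400, 64  # hypothetical values giving 100 Hz bin spacing

# 530 Hz sits just above the centre of bin 5; 570 Hz just below bin 6.
mags_low = windowed_bin_magnitudes(530.0, SR, N, range(3, 9))
mags_high = windowed_bin_magnitudes(570.0, SR, N, range(3, 9))
peak_low = max(mags_low, key=mags_low.get)
peak_high = max(mags_high, key=mags_high.get)

for label, mags in (("530 Hz", mags_low), ("570 Hz", mags_high)):
    print(label, {k: round(v, 3) for k, v in mags.items()})
```

For 530Hz the peak is bin 5 and the right-hand neighbour is the stronger one; for 570Hz the peak is bin 6 and the left-hand neighbour is stronger.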
Use oscillator-bank resynthesis and you can then have any frequency in
any bin. With the FFT, the greater the overlap (i.e. the smaller the hop
between frames), the wider the bandwidth of a bin, until in the limit
where overlap = 1 sample you have the Sliding DFT and hence the Sliding
Phase Vocoder (SPV) and each bin enjoys the full audio bandwidth. SPV is
especially expensive as it is a "no-compromise" algorithm (need
something like a 50GFlop machine to do it in real time), so you will
more likely want to use pvsadsyn which tries to do fast oscillators and
some interpolation across frames.
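
To make the "any frequency in any bin" point concrete, here is a bare-bones oscillator bank in Python. It is a sketch under simplifying assumptions (one static frame, no phase continuity or interpolation between frames) and is not how pvsadsyn is implemented; `oscbank_frame` is a made-up name.

```python
import math

def oscbank_frame(amps, freqs, sr, nsamples):
    """Sum one sine oscillator per bin; each bin may carry any frequency."""
    out = [0.0] * nsamples
    for amp, freq in zip(amps, freqs):
        if amp == 0.0:
            continue  # skip silent bins
        step = 2.0 * math.pi * freq / sr  # phase increment per sample
        for n in range(nsamples):
            out[n] += amp * math.sin(step * n)
    return out

num_bins = 1025            # FFTsize/2 + 1 for FFTsize = 2048
amps = [0.0] * num_bins
freqs = [0.0] * num_bins
amps[3] = 0.5              # an arbitrary low bin...
freqs[3] = 500.0           # ...carrying 500 Hz regardless of its centre

out = oscbank_frame(amps, freqs, 48000, 96)  # one 500 Hz cycle at 48 kHz
print(out[24], out[48])    # quarter-cycle peak, half-cycle zero crossing
```

The bin index plays no part in the output frequency here, which is exactly why the cost is higher: every active bin needs its own oscillator.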
And of course, by specifying a target bin by number in your score,
rather than computing it relative to the FFTsize, the output will not be
consistent (or even particularly predictable) if/when you change the
fftsize in pvsinit. You would really want to put the desired frequency
in your score file, and compute the required bin(s) in the orch.
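
A small sketch of why a hard-coded bin number is fragile: the same bin index lands on a different frequency each time the FFT size changes, whereas a bin computed from the desired frequency follows it. The bin number 21 and the helper names are illustrative assumptions, and the arithmetic would live in the orchestra rather than in Python.

```python
def bin_centre_hz(k, sr, fft_size):
    """Centre frequency of bin k."""
    return k * sr / fft_size

def bin_for_freq(freq_hz, sr, fft_size):
    """Bin whose centre is nearest to freq_hz."""
    return int(round(freq_hz * fft_size / sr))

SR = 48000
FIXED_BIN = 21      # a bin number typed straight into a score
TARGET_HZ = 500.0   # the frequency actually wanted

for fft_size in (1024, 2048, 4096):
    fixed = bin_centre_hz(FIXED_BIN, SR, fft_size)
    computed = bin_for_freq(TARGET_HZ, SR, fft_size)
    print(f"N={fft_size}: bin {FIXED_BIN} is centred on {fixed} Hz; "
          f"{TARGET_HZ} Hz belongs in bin {computed}")
```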
Note that (with respect to your example csd) if you specify amp and freq
ftables of size 1025 (i.e. FFTsize/2 + 1 bins), the net FFTsize in
pvsinit should be 2048, not 1024. For a low bin number it has no obvious
negative effect (you are just wasting ftable real estate), except that
things are probably going into incorrect bins.
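
The relation between table size and FFT size is simple enough to check mechanically. A minimal sketch, with made-up helper names:

```python
def fft_size_for_table(table_len):
    """FFT size implied by an amp/freq table holding N/2 + 1 bins."""
    if table_len < 2:
        raise ValueError("table must hold at least two bins")
    return 2 * (table_len - 1)

def table_len_for_fft(fft_size):
    """Number of bins (0..N/2 inclusive) for a given FFT size."""
    return fft_size // 2 + 1

print(fft_size_for_table(1025))  # FFT size matching 1025-point tables
print(table_len_for_fft(1024))   # table size an FFT size of 1024 expects
```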
One of the things one rapidly discovers with all this pvoc/FFT stuff is
that there really is no free lunch. To get any degree of control and
predictability takes a lot of computation, whether by creating the best
bin bunches per partial (and dealing with enveloping of the sounds as
well!), or by using oscillator-bank resynthesis. There are (we hope!)
"optimum" realisations in everything, but that does depend on what you
are trying to do, what control over detail you want, and what
compromises are acceptable, etc. Just as large FFT windows have
problems tracking transients (and non-static frequencies within a
frame), you will have a job synthesising such things too.

Sorry I cannot devote more time to all these interesting issues - just
too much other work on at the moment!

Richard Dobson
|