
[PBATISTA@colep.mailpac.pt: Re: neural nets]

Date: 1999-04-01 06:31
From: jpff@maths.bath.ac.uk
Subject: [PBATISTA@colep.mailpac.pt: Re: neural nets]
Message written at 31 Mar 1999 16:26:28 +0530

------- Start of forwarded message -------
Date: 29 Mar 99 08:56 GMT
Priority: normal
P1-Message-ID: pt*mailpac*gtw-ms;0922697806/0077436395/1
Original-Encoded-Information-Types: IA5-Text
From: Pedro Batista 
To: j.p.ffitch@maths.bath.ac.uk
Subject: Re: neural nets


__________________

>>do you have any ideas of what exactly you'll use the network for?
>
>Good point and maybe it isn't worth the effort although there are some
>people who claim to be doing some interesting things with them.  Check out

Surely there are a number of interesting projects around! I was concerned
exclusively with practical uses, particularly within a csound environment.
Some interesting projects involve physical modelling, analogue-systems
modelling, timbre recognition, all sorts of speech synthesis/resynthesis,
etc.; there's this guy using genetic algorithms to evolve VCO+VCF patches to
suit a particular target sound (not a bad idea, indeed); but all this work
requires huge amounts of processing power, I mean who has a spare server to
run this stuff? (I at least don't...)
That makes it kind of absurd (and since I, perhaps naively, have started
making these attempts, all I get asked is 'what the hell for?') to use a
neural model for any kind of synthesis process, when we already have
specific, efficient models that do the same thing!

Where can we use neural models in the sonic domain, then? If it's obviously
inefficient (to me, anyway) to use NNs to do what synthesis algorithms
already do well, then I'll try to use NNs to do what I can't do with a
deterministic algorithm.

I believe there are basically three applications worth a shot, corresponding
to three different neural architectures: feature detection and unsupervised
learning, which leads into the field of ICA; feed-forward pattern associators
in all flavours, which can be used for sound morphing and 'sound learning'
(I'll explain this shortly), provided some clever symbolic coding is used;
and finally recurrent nets (explicit, or implicit as in FIR synapses), whose
field of excellence is time-series prediction and whose most obvious use
would be sound modelling.

Now, the feature detectors are easily implemented and fast, but relatively
uninteresting, at least for what I'd like to do; there are other, more
efficient stochastic methods for ICA anyway.

The recurrent models are very cool :) but bear in mind that a recurrent net
computationally unfolds into several cascaded feed-forward nets, with the
corresponding blow-up in processing time; nevertheless (one never knows
unless one tries) I've implemented a FIR network in csound: you don't wanna
run it! I never got the chance to test it properly, because it needs tens of
thousands of iterations to get close to the target, and I really didn't have
the time.
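
For reference, here is a minimal C sketch of what a FIR synapse boils down
to (just an illustration of the idea, not the code I mentioned; the tap
count, input count and tanh nonlinearity are arbitrary choices): every
connection is a little FIR filter over that input's recent history, so the
neuron's output depends on past samples without any explicit recurrence.

#include <math.h>

#define NTAPS 8   /* assumed filter order per synapse */
#define NIN   4   /* assumed number of input connections */

typedef struct {
    double w[NIN][NTAPS];      /* one FIR filter per input connection */
    double delay[NIN][NTAPS];  /* delayed input samples (the neuron's memory) */
} FIRNeuron;

/* push the new inputs into the delay lines and compute the activation */
double fir_neuron_tick(FIRNeuron *n, const double in[NIN])
{
    double sum = 0.0;
    for (int i = 0; i < NIN; i++) {
        for (int t = NTAPS - 1; t > 0; t--)   /* shift the delay line */
            n->delay[i][t] = n->delay[i][t - 1];
        n->delay[i][0] = in[i];
        for (int t = 0; t < NTAPS; t++)       /* convolve history with the filter */
            sum += n->w[i][t] * n->delay[i][t];
    }
    return tanh(sum);   /* squashing nonlinearity */
}

You can see where the cost goes: every connection turns into NTAPS
multiply-adds per sample, and training has to propagate errors back through
all those taps as well.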

The feed-forward nets are the most viable application; I have been coding
several back-prop nets, and have achieved good results with rprop (one of
the optimized versions of backprop); I have it coded in C as well as in
csound, and I can help you code it, if you want.
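
If it helps, this is roughly what the rprop weight update looks like,
sketched in C (the constants are the usual textbook values and the variant
shown is the plain 'Rprop-' one; my own code differs in the details): only
the sign of the gradient is used, and each weight keeps its own step size
that grows while the sign stays stable and shrinks when it flips.

#include <math.h>

#define ETA_PLUS   1.2
#define ETA_MINUS  0.5
#define DELTA_MAX 50.0
#define DELTA_MIN  1e-6

/* one weight update; grad is dE/dw accumulated over the epoch, while
   prev_grad and delta (start it at, say, 0.1) carry state between epochs */
void rprop_update(double *w, double grad, double *prev_grad, double *delta)
{
    double s = grad * (*prev_grad);
    if (s > 0.0) {                       /* same sign: accelerate */
        *delta = fmin(*delta * ETA_PLUS, DELTA_MAX);
        *w -= copysign(*delta, grad);
        *prev_grad = grad;
    } else if (s < 0.0) {                /* sign flip: we overshot, back off */
        *delta = fmax(*delta * ETA_MINUS, DELTA_MIN);
        *prev_grad = 0.0;                /* and skip this update (Rprop-) */
    } else {                             /* first step, or right after a flip */
        if (grad != 0.0)
            *w -= copysign(*delta, grad);
        *prev_grad = grad;
    }
}

The nice thing is that it only ever looks at signs, so it doesn't care how
badly scaled the error surface is.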
But what can be done with it? (again, keeping fast sound applications
targeted at csound in mind)

Well, 90% of what can be done results from the symbolic coding we use; it
is more or less obvious to me that some sort of time/frequency mapping must
be established first, like a multiresolution parsing of the sound, which
would break it into manageable 'symbols', leaving to the net the task of
uncovering the 'grammar' behind those symbols.
We have a number of such methods, from vocoders and filterbanks to wavelets
and FFTs; it just remains to figure out the best one.
The major limitation I come across is the need to have the network see the
sound in its globality (something that would be appropriate for a
time-dependent net) without resorting to recurrent methods.
What I mean is having the output of the net depend not only on the present
input, but also on the past history of inputs; now, this is what recurrent
or time-delayed nets are used for in the first place, but as I said, I don't
have a server on me right now.
So to overcome it, I want to place more emphasis on the symbolic coding
(that is, have the coding itself reflect the evolution of the sound over
time) and stick with a static neural architecture; perhaps the most obvious
way of doing this is by training the net with spectral data.

An example: a neural vocoder.
We separate the sound into several bands, have a network learn each band's
behaviour, and then run each band through a resynthesis process controlled
by the network's output; unless we use frequency values as the network's
material, we won't take advantage of the multiband analysis, since we won't
be able to make assumptions about the range of values each band will take.
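
To make that concrete, here is one possible (and deliberately crude) way of
getting per-band training material in C: a two-pole resonator per band
followed by an envelope follower, whose slowly varying output is what the
net would learn. The filter design, band layout and smoothing constant are
just placeholders, of course.

#include <math.h>

static const double PI = 3.141592653589793;

typedef struct {
    double b1, b2;   /* resonator coefficients */
    double y1, y2;   /* filter state */
    double env;      /* envelope follower state */
} Band;

/* set up one band: a two-pole resonator at centre_hz with bandwidth bw_hz */
void band_init(Band *b, double centre_hz, double bw_hz, double sr)
{
    double r = exp(-PI * bw_hz / sr);               /* pole radius */
    b->b1 = 2.0 * r * cos(2.0 * PI * centre_hz / sr);
    b->b2 = -r * r;
    b->y1 = b->y2 = b->env = 0.0;
}

/* feed one sample, get back the band's slowly varying amplitude;
   'smooth' is a small constant (say 0.001) setting the follower speed */
double band_envelope(Band *b, double x, double smooth)
{
    double y = x + b->b1 * b->y1 + b->b2 * b->y2;   /* resonator */
    b->y2 = b->y1;
    b->y1 = y;
    b->env += smooth * (fabs(y) - b->env);          /* rectify and average */
    return b->env;
}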

Another example: learning a modular patch.
Inspired by the paper at narx, I started to contemplate the idea of making a
neural-based patch learner in csound; we would have a symbolic coding which
describes one possible architecture; the net would, for instance, learn
combinations of VCOs and VCFs, together with their connections and the gains
between units, to reach a specified target sound. What is needed here is a
measure of how well the network is performing, something like a way of
mathematically comparing two spectra and having some quantity measure how
much they sound alike! (the GA paper uses the euclidean distance between
both FFTs)
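
That kind of error measure is at least trivial to write down; a minimal C
sketch, assuming both sounds have already been reduced to magnitude spectra
of the same length, would be:

#include <math.h>
#include <stddef.h>

/* a and b are magnitude spectra of equal length nbins,
   e.g. |FFT| frames of the target sound and the candidate patch */
double spectral_distance(const double *a, const double *b, size_t nbins)
{
    double sum = 0.0;
    for (size_t k = 0; k < nbins; k++) {
        double d = a[k] - b[k];
        sum += d * d;
    }
    return sqrt(sum);
}

Whether a small euclidean distance really means the two spectra sound alike
is the real question, of course.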
With that solved, and with an appropriate mapping, we could use a neural
ugen to learn, say, fof settings within a fof bank to suit a particular
sound.
I have this notion (I need to think it through more thoroughly) that we
could have a generic neural opcode in csound, which would learn suitable
parameters to synthesize some kind of target sound.
Actually that's the beauty of distributed processing: you can have it learn
things you don't know in the first place!

>I got the Stuttgart package working and trained their demo network with it
>now, so I understand a little more about what the network is doing.  I had
>forgotten that neurons are binary so as far as I can tell it must take a
>bunch of neurons to do anything useful.

Man, if that simulator only works with binary neurons then its use is even
more limited than I thought! The rprop algorithms I was talking about work
with a continuous input and output range (limited: the input should stay in
the -5.0 to +5.0 range, for instance, and the output is -1.0 to +1.0, but
it's just a simple scaling process).
Binary neurons have their own problems: how do you code a continuous value
into a binary pattern? You can just binary-code it, but then you have the
problem of binary 1000 being numerically very far from binary 0000, while
for the network they are very similar (they differ in a single bit); you can
use Gray codes, but that won't help much, and you can have 'thermometer'
scales, which solve the mismatch between numeric and pattern proximity, but
require small input ranges or huge input dimensions.
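
Just to show what I mean by those three codings, here is a little C toy (the
16-level range and the hex printout are arbitrary) that prints the same
quantised value under each scheme:

#include <stdio.h>

/* plain binary: 7 (0111) and 8 (1000) are numeric neighbours
   but differ in all four bits */
unsigned binary_code(unsigned v)      { return v; }

/* Gray code: numeric neighbours always differ in exactly one bit */
unsigned gray_code(unsigned v)        { return v ^ (v >> 1); }

/* thermometer code: level n sets the n lowest bits, so Hamming distance
   tracks numeric distance, at the price of one unit per level */
unsigned thermometer_code(unsigned v) { return (1u << v) - 1u; }

int main(void)
{
    for (unsigned v = 0; v < 16; v++)
        printf("%2u  binary %04x  gray %04x  thermometer %04x\n",
               v, binary_code(v), gray_code(v), thermometer_code(v));
    return 0;
}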

>Still learning,

Same here... actually, I hope I can say that till the day I die!
Then again, even longer :)

pedro