| Message written at 31 Mar 1999 16:26:28 +0530
------- Start of forwarded message -------
Date: 29 Mar 99 08:56 GMT
Priority: normal
P1-Message-ID: pt*mailpac*gtw-ms;0922697806/0077436395/1
Original-Encoded-Information-Types: IA5-Text
From: Pedro Batista
To: j.p.ffitch@maths.bath.ac.uk
Subject: Re: neural nets
__________________
>>do you have any ideas of what exactly you'll use the network for?
>
>Good point and maybe it isn't worth the effort although there are some
>people who claim to be doing some interesting things with them. Check out
Surely there are a number of interesting projects around! I was concerned
exclusively with practical uses, particularly within a csound environment.
Some interesting projects involve physical modelling, analogue systems
modelling, timbre recognition, all sorts of speech synthesis/resynthesis, etc;
there's this guy using genetic algorithms to evolve vco+vcf patches to suit a
particular target sound (not a bad idea, indeed); but all this work requires
huge amounts of processing power, I mean, who has a spare server to run this
stuff? (I at least don't...)
which makes it kind of absurd (and since I have, perhaps naively, started
making these attempts, all I get asked is 'what the hell for?') to use a neural
model for any kind of synthesis process, when we already have specific,
efficient models that do the same thing!
So where can we use neural models in the sonic domain, then? If it's obviously
inefficient (to me, anyway) to use nn's to do what synthesis algorithms
already do well, then I'll try to use nn's to do what I can't do with a
deterministic algorithm.
I believe there are basically three applications worth a shot, corresponding
to three different neural architectures: (1) feature detection and unsupervised
learning, which leads into the field of ICA; (2) feed-forward pattern
associators in all flavours, which can be used for sound morphing and 'sound
learning' (I'll explain this shortly), provided some clever symbolic coding
is used; and (3) recurrent nets (explicit, or implicit as in FIR
synapses), whose field of excellence is time series prediction,
and whose most obvious use would be sound modelling.
Now, the feature detectors are easily implemented and fast, but relatively
uninteresting, at least for what I'd like to do; there are other, more
efficient stochastic methods for ICA anyway.
The recurrent models are very cool :) but bear in mind that a recurrent net
computationally unfolds into several cascaded feed-forward nets, with the
corresponding blow-up in processing time; nevertheless (one never knows unless
one tries) I've implemented an FIR network in csound: you don't want to run it!
I never got the chance to test it properly, because it needs tens of
thousands of iterations to get close to the target, and I really didn't have
the time.
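Just to give an idea of what an FIR synapse looks like, here is a rough C
sketch (this is not my actual csound code; the tap count and names are
arbitrary): instead of a single weight, each connection carries a short FIR
filter over the sending neuron's past activations.

    #define TAPS 4   /* delay taps per synapse (arbitrary choice) */

    /* One FIR synapse: the connection holds a small FIR filter over the
       last TAPS activations of the sending neuron, so the receiving
       neuron sees a little window of the sender's history. */
    typedef struct {
        double w[TAPS];      /* filter coefficients (the learnable weights) */
        double delay[TAPS];  /* delay line of past activations */
    } fir_synapse;

    /* Push the current activation into the delay line and return this
       synapse's contribution to the receiving neuron's net input. */
    double fir_synapse_output(fir_synapse *s, double activation)
    {
        double out = 0.0;
        for (int d = TAPS - 1; d > 0; d--)   /* shift the history */
            s->delay[d] = s->delay[d - 1];
        s->delay[0] = activation;
        for (int d = 0; d < TAPS; d++)       /* FIR convolution */
            out += s->w[d] * s->delay[d];
        return out;
    }

Training it means unfolding that history in time, which is exactly where the
processing cost comes from.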
The feed-forward nets are the most viable application; I have been coding
several back-prop nets, and have achieved good results with rprop (one of
the optimised variants of backprop); I have it coded in C as well as csound,
and I can help you code it, if you want.
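If it helps, the heart of rprop is tiny; a stripped-down C sketch of the
per-weight update (assuming the gradients dE/dw have already been computed by
ordinary backprop; the constants are the usual ones from the rprop paper):

    #include <math.h>

    #define ETA_PLUS  1.2
    #define ETA_MINUS 0.5
    #define DELTA_MAX 50.0
    #define DELTA_MIN 1e-6

    static double sgn(double x) { return (x > 0.0) - (x < 0.0); }

    /* One rprop epoch over n weights.  grad[] holds dE/dw from backprop,
       prev_grad[] the gradients of the previous epoch, step[] the
       per-weight step sizes (initialise them to something like 0.1). */
    void rprop_update(double *w, double *grad, double *prev_grad,
                      double *step, int n)
    {
        for (int i = 0; i < n; i++) {
            double p = prev_grad[i] * grad[i];
            if (p > 0.0) {                     /* same sign: speed up */
                step[i] = fmin(step[i] * ETA_PLUS, DELTA_MAX);
                w[i] -= sgn(grad[i]) * step[i];
                prev_grad[i] = grad[i];
            } else if (p < 0.0) {              /* sign flip: back off */
                step[i] = fmax(step[i] * ETA_MINUS, DELTA_MIN);
                prev_grad[i] = 0.0;            /* skip the update this epoch */
            } else {                           /* one of them was zero */
                w[i] -= sgn(grad[i]) * step[i];
                prev_grad[i] = grad[i];
            }
        }
    }

Only the sign of the gradient is used, which is why it copes so much better
than plain backprop when the error surface is badly scaled.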
But what can be done with it? (again, with fast sound applications targeted at
csound in mind)
Well, 90% of what can be done results from the symbolic coding we use; it
is more or less obvious to me that some sort of time/frequency mapping must be
established first, like a multiresolution parsing of the sound, that would
break it into manageable 'symbols', leaving to the net the task of uncovering
the 'grammar' behind those symbols.
We have a number of such methods, from vocoders and filterbanks to wavelets
and FFTs; it just remains to figure out which is best.
The major limitation I come across is the need to have the network see the
sound in its globality (something that would be appropriate for a
time-dependent net) without resorting to recurrent methods.
What I mean is having the output of the net depend not only on the present
input, but also on the past history of inputs; now, this is what recurrent
or time-delayed nets are used for in the first place, but as I said, I don't
have a spare server on me right now.
So to overcome it, I want to place more emphasis on the symbolic coding
(that is, have this coding already reflect the evolution of the sound over
time) and stick with a static neural architecture; perhaps the most
obvious way of doing this is by training the net with spectral data.
An example: a neural vocoder.
We separate the sound into several bands, have a network learn each band's
behaviour, and then run each band through a resynthesis process controlled by
the network's output; unless we use frequency values as the network's material,
we won't take advantage of the multiband analysis, since we won't be able to
make assumptions about the range of values each band will have.
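Just to sketch what the per-band material might look like (a toy version, not
my actual code; the filter is a plain second-order bandpass and the frame size
is arbitrary), this is roughly the data I would feed the net:

    #include <math.h>

    #define PI       3.141592653589793
    #define NBANDS   8
    #define FRAMELEN 256                 /* analysis frame size (arbitrary) */

    /* Per-band amplitude envelopes: bandpass the signal around each band's
       centre frequency, then take the RMS of every frame.  env must hold
       NBANDS * (len / FRAMELEN) values; this is the training material. */
    void band_envelopes(const double *x, long len, double sr,
                        const double *centre_hz, double *env)
    {
        long frames = len / FRAMELEN;
        for (int b = 0; b < NBANDS; b++) {
            double w0 = 2.0 * PI * centre_hz[b] / sr;
            double alpha = sin(w0) / (2.0 * 2.0);        /* Q of 2 */
            double b0 = alpha, b2 = -alpha;              /* bandpass biquad */
            double a0 = 1.0 + alpha, a1 = -2.0 * cos(w0), a2 = 1.0 - alpha;
            double x1 = 0, x2 = 0, y1 = 0, y2 = 0, acc = 0;
            long f = 0;
            for (long n = 0; n < len && f < frames; n++) {
                double y = (b0 * x[n] + b2 * x2 - a1 * y1 - a2 * y2) / a0;
                x2 = x1; x1 = x[n]; y2 = y1; y1 = y;
                acc += y * y;
                if ((n + 1) % FRAMELEN == 0) {           /* end of frame */
                    env[b * frames + f++] = sqrt(acc / FRAMELEN);
                    acc = 0.0;
                }
            }
        }
    }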
Another example: learning a modular patch.
Inspired by the paper at narx, I started to contemplate the idea of making
a neural-based patch learner in csound; we would have a symbolic coding
which would describe one possible architecture; the net would, for instance,
learn combinations of vco's and vcf's, together with their connections and
gains between units, to reach a specified target sound; what is needed here
is a measure of how well the network is performing, something like a way of
mathematically comparing two spectra and having some quantity measure how much
they sound alike! (the GA paper uses the Euclidean distance between the two
FFTs)
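A plain Euclidean distance between two magnitude spectra would look something
like this (a sketch; it assumes both spectra have the same number of bins and
are already magnitude values):

    #include <math.h>

    /* Euclidean distance between two magnitude spectra of n bins;
       smaller means the two sounds are, spectrally, more alike. */
    double spectral_distance(const double *mag_a, const double *mag_b, int n)
    {
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            double d = mag_a[k] - mag_b[k];
            sum += d * d;
        }
        return sqrt(sum);
    }

Whether that actually correlates with how alike two sounds are to the ear is
another matter, of course.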
That solved, and with an appropriate mapping, we could use a neural ugen to
learn, say, fof settings within a fof bank to suit a particular sound.
I have this notion (I need to think more thoroughly about it) that we could
have a generic neural opcode in csound, which would learn suitable parameters
to synthesize some kind of target sound.
Actually, that's the beauty of distributed processing: you can have it learn
things you don't know in the first place!
>I got the Stuttgart package working and trained their demo network with it
>now, so I understand a little more about what the network is doing. I had
>forgotten that neurons are binary so as far as I can tell it must take a
>bunch of neurons to do anything useful.
Man, if that simulator only works with binary neurons then its use is even
more limited than I thought! The rprop algorithm I was talking about works with
a continuous input and output range (limited: the input should stay in the
-5.0 to +5.0 range, for instance, and the output is -1.0 to +1.0, but it's
just a simple scaling process).
Binary neurons have their own problems: how do you code a continuous value
into a binary pattern? You can just binary-code it, but then you have the
problem of binary 1000 being numerically very far from binary 0000, while to
the network they look almost identical (they differ in a single bit); you can
use Gray codes, but that won't help much; and you can have 'thermometer'
scales, which do preserve the proximity between input values, but require
either small input ranges or huge input dimensions.
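To make the coding issue concrete, a quick sketch of those two encodings
(sizes arbitrary):

    /* Plain binary -> Gray code: adjacent integers differ in exactly one
       bit, but numeric proximity is still not preserved in general. */
    unsigned gray_encode(unsigned v)
    {
        return v ^ (v >> 1);
    }

    /* Thermometer coding: value k maps to k ones followed by zeros, so
       Hamming distance equals numeric distance, at the cost of one
       input neuron per quantisation step. */
    void thermometer_encode(int k, int nbits, double *bits)
    {
        for (int i = 0; i < nbits; i++)
            bits[i] = (i < k) ? 1.0 : 0.0;
    }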
>Still learning,
same here... actually, I hope I can say that till the day I die!
then again, even longer :)
pedro |