| Message written at 31 Mar 1999 16:26:28 +0530
------- Start of forwarded message -------
Date: 29 Mar 99 08:56 GMT
Priority: normal
P1-Message-ID: pt*mailpac*gtw-ms;0922697806/0077436395/1
Original-Encoded-Information-Types: IA5-Text
From: Pedro Batista
To: j.p.ffitch@maths.bath.ac.uk
Subject: Re: neural nets
__________________
>>do you have any ideas of what exactly you'll use the network for?
>
>Good point and maybe it isn't worth the effort although there are some
>people who claim to be doing some interesting things with them. Check out
Surely there are a number of interesting projects around! I was concerned
exclusively with practical uses, particularly within a csound environment.
Some interesting projects involve physical modelling, analogue systems
modelling, timbre recognition, all sorts of speech synthesis/resynthesis, etc;
there's this guy using genetic algorithms to evolve vco+vcf patches to suit a
particular target sound (not a bad idea, indeed); but all this work requires
huge amounts of processing power, I mean, who has a spare server to run this
stuff? (I at least don't...)
which makes it kind of absurd (and since I have, perhaps naively, started
making these attempts, all I get asked is 'what the hell for?') to use a neural
model for any kind of synthesis process, when we already have specific,
efficient models that do the same thing!
So where can we use neural models in the sonic domain, then? If it's obviously
inefficient (to me, anyway) to use nn's to do what synthesis algorithms
already do well, then I'll try to use nn's to do what I can't do with a
deterministic algorithm.
I believe there are basically three applications worth a shot, corresponding
to three different neural architectures: (1) feature detection and unsupervised
learning, which leads into the field of ICA; (2) feed-forward pattern
associators in all flavours, which can be used for sound morphing and 'sound
learning' (I'll explain this shortly), provided some clever symbolic coding
is used; and (3) recurrent nets (explicit, or implicit as in FIR
synapses), whose field of excellence is time series prediction,
and whose most obvious use would be sound modelling.
Now, the feature detectors are easily implemented and fast, but relatively
uninteresting, at least for what I'd like to do; there are other, more
efficient stochastic methods for ICA anyway.
The recurrent models are very cool :) but bear in mind that a recurrent net
computationally unfolds into several cascaded feed-forward nets, with the
corresponding blow-up in processing time; nevertheless (one never knows unless
one tries) I've implemented an FIR network in csound: you don't want to run it!
I never got the chance to test it properly, because it needs tens of
thousands of iterations to get close to the target, and I really didn't have
the time.
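Just to give an idea of what an FIR synapse looks like, here is a rough C
sketch (this is not my actual csound code; the tap count and names are
arbitrary): instead of a single weight, each connection carries a short FIR
filter over the sending neuron's past activations.

    #define TAPS 4   /* delay taps per synapse (arbitrary choice) */

    /* One FIR synapse: the connection holds a small FIR filter over the
       last TAPS activations of the sending neuron, so the receiving
       neuron sees a little window of the sender's history. */
    typedef struct {
        double w[TAPS];      /* filter coefficients (the learnable weights) */
        double delay[TAPS];  /* delay line of past activations */
    } fir_synapse;

    /* Push the current activation into the delay line and return this
       synapse's contribution to the receiving neuron's net input. */
    double fir_synapse_output(fir_synapse *s, double activation)
    {
        double out = 0.0;
        for (int d = TAPS - 1; d > 0; d--)   /* shift the history */
            s->delay[d] = s->delay[d - 1];
        s->delay[0] = activation;
        for (int d = 0; d < TAPS; d++)       /* FIR convolution */
            out += s->w[d] * s->delay[d];
        return out;
    }

Training it means unfolding that history in time, which is exactly where the
processing cost comes from.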
The feed-forward nets are the most viable application; I have been coding
several back-prop nets, and have achieved good results with rprop (one of
the optimised variants of backprop); I have it coded in C as well as csound,
and I can help you code it, if you want.
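If it helps, the heart of rprop is tiny; a stripped-down C sketch of the
per-weight update (assuming the gradients dE/dw have already been computed by
ordinary backprop; the constants are the usual ones from the rprop paper):

    #include <math.h>

    #define ETA_PLUS  1.2
    #define ETA_MINUS 0.5
    #define DELTA_MAX 50.0
    #define DELTA_MIN 1e-6

    static double sgn(double x) { return (x > 0.0) - (x < 0.0); }

    /* One rprop epoch over n weights.  grad[] holds dE/dw from backprop,
       prev_grad[] the gradients of the previous epoch, step[] the
       per-weight step sizes (initialise them to something like 0.1). */
    void rprop_update(double *w, double *grad, double *prev_grad,
                      double *step, int n)
    {
        for (int i = 0; i < n; i++) {
            double p = prev_grad[i] * grad[i];
            if (p > 0.0) {                     /* same sign: speed up */
                step[i] = fmin(step[i] * ETA_PLUS, DELTA_MAX);
                w[i] -= sgn(grad[i]) * step[i];
                prev_grad[i] = grad[i];
            } else if (p < 0.0) {              /* sign flip: back off */
                step[i] = fmax(step[i] * ETA_MINUS, DELTA_MIN);
                prev_grad[i] = 0.0;            /* skip the update this epoch */
            } else {                           /* one of them was zero */
                w[i] -= sgn(grad[i]) * step[i];
                prev_grad[i] = grad[i];
            }
        }
    }

Only the sign of the gradient is used, which is why it copes so much better
than plain backprop when the error surface is badly scaled.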
But what can be done with it? (again, with fast sound applications targeted at
csound in mind)
Well, 90% of what can be done results from the symbolic coding we use; it
is more or less obvious to me that some sort of time/frequency mapping must be
established first, like a multiresolution parsing of the sound, that would
break it into manageable 'symbols', leaving to the net the task of uncovering
the 'grammar' behind those symbols.
We have a number of such methods, from vocoders and filterbanks to wavelets
and FFTs; it just remains to figure out which is best.
The major limitation I come across is the need to have the network see the
sound in its globality (something that would be appropriate for a
time-dependent net) without resorting to recurrent methods.
What I mean is having the output of the net depend not only on the present
input, but also on the past history of inputs; now, this is what recurrent
or time-delayed nets are used for in the first place, but as I said, I don't
have a spare server on me right now.
So to overcome it, I want to place more emphasis on the symbolic coding
(that is, have this coding already reflect the evolution of the sound over
time) and stick with a static neural architecture; perhaps the most
obvious way of doing this is by training the net with spectral data.
An example: a neural vocoder.
We separate the sound into several bands, have a network learn each band's
behaviour, and then run each band through a resynthesis process controlled by
the network's output; unless we use frequency values as the network's material,
we won't take advantage of the multiband analysis, since we won't be able to
make assumptions about the range of values each band will have.
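Just to sketch what the per-band material might look like (a toy version, not
my actual code; the filter is a plain second-order bandpass and the frame size
is arbitrary), this is roughly the data I would feed the net:

    #include <math.h>

    #define PI       3.141592653589793
    #define NBANDS   8
    #define FRAMELEN 256                 /* analysis frame size (arbitrary) */

    /* Per-band amplitude envelopes: bandpass the signal around each band's
       centre frequency, then take the RMS of every frame.  env must hold
       NBANDS * (len / FRAMELEN) values; this is the training material. */
    void band_envelopes(const double *x, long len, double sr,
                        const double *centre_hz, double *env)
    {
        long frames = len / FRAMELEN;
        for (int b = 0; b < NBANDS; b++) {
            double w0 = 2.0 * PI * centre_hz[b] / sr;
            double alpha = sin(w0) / (2.0 * 2.0);        /* Q of 2 */
            double b0 = alpha, b2 = -alpha;              /* bandpass biquad */
            double a0 = 1.0 + alpha, a1 = -2.0 * cos(w0), a2 = 1.0 - alpha;
            double x1 = 0, x2 = 0, y1 = 0, y2 = 0, acc = 0;
            long f = 0;
            for (long n = 0; n < len && f < frames; n++) {
                double y = (b0 * x[n] + b2 * x2 - a1 * y1 - a2 * y2) / a0;
                x2 = x1; x1 = x[n]; y2 = y1; y1 = y;
                acc += y * y;
                if ((n + 1) % FRAMELEN == 0) {           /* end of frame */
                    env[b * frames + f++] = sqrt(acc / FRAMELEN);
                    acc = 0.0;
                }
            }
        }
    }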
Another example: learning a modular patch.
Inspired by the paper at narx, I started to contemplate the idea of making
a neural-based patch learner in csound; we would have a symbolic coding
which would describe one possible architecture; the net would, for instance,
learn combinations of vco's and vcf's, together with their connections and
gains between units, to reach a specified target sound; what is needed here
is a measure of how well the network is performing, something like a way of
mathematically comparing two spectra and having some quantity measure how much
they sound alike! (the GA paper uses the Euclidean distance between the two
FFTs)
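A plain Euclidean distance between two magnitude spectra would look something
like this (a sketch; it assumes both spectra have the same number of bins and
are already magnitude values):

    #include <math.h>

    /* Euclidean distance between two magnitude spectra of n bins;
       smaller means the two sounds are, spectrally, more alike. */
    double spectral_distance(const double *mag_a, const double *mag_b, int n)
    {
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            double d = mag_a[k] - mag_b[k];
            sum += d * d;
        }
        return sqrt(sum);
    }

Whether that actually correlates with how alike two sounds are to the ear is
another matter, of course.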
That solved, and with an appropriate mapping, we could use a neural ugen to
learn, say, fof settings within a fof bank to suit a particular sound.
I have this notion (I need to think more thoroughly about it) that we could
have a generic neural opcode in csound, which would learn suitable parameters
to synthesize some kind of target sound.
Actually, that's the beauty of distributed processing: you can have it learn
things you don't know in the first place!
>I got the Stuttgart package working and trained their demo network with it
>now, so I understand a little more about what the network is doing. I had
>forgotten that neurons are binary so as far as I can tell it must take a
>bunch of neurons to do anything useful.
Man, if that simulator only works with binary neurons then its use is even
more limited than I thought! The rprop algorithm I was talking about works with
a continuous input and output range (limited: the input should stay in the
-5.0 to +5.0 range, for instance, and the output is -1.0 to +1.0, but it's
just a simple scaling process).
Binary neurons have their own problems: how do you code a continuous value
into a binary pattern? You can just binary-code it, but then you have the
problem of binary 1000 being numerically very far from binary 0000, while to
the network they look almost identical (they differ in a single bit); you can
use Gray codes, but that won't help much; and you can have 'thermometer'
scales, which do preserve the proximity between input values, but require
either small input ranges or huge input dimensions.
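To make the coding issue concrete, a quick sketch of those two encodings
(sizes arbitrary):

    /* Plain binary -> Gray code: adjacent integers differ in exactly one
       bit, but numeric proximity is still not preserved in general. */
    unsigned gray_encode(unsigned v)
    {
        return v ^ (v >> 1);
    }

    /* Thermometer coding: value k maps to k ones followed by zeros, so
       Hamming distance equals numeric distance, at the cost of one
       input neuron per quantisation step. */
    void thermometer_encode(int k, int nbits, double *bits)
    {
        for (int i = 0; i < nbits; i++)
            bits[i] = (i < k) ? 1.0 : 0.0;
    }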
>Still learning,
same here... actually, I hope I can say that till the day I die!
then again, even longer :)
pedro |