
Re: 3D Sound Opcode?

Date: 1997-04-16 05:11
From: Robin Whittle
Subject: Re: 3D Sound Opcode?
Hello Hans and Csounders interested in binaural sound,

Your email pretty much describes a ugen I wrote around a year ago.

I have not released it publicly since I don't consider it finished, 
and I have not known of anyone who was really interested in exploring 
this field.  However you do sound interested and I can email you the 
source code.

Are you set up for re-compiling Csound and integrating new ugens?

What system are you using?  I am using Linux a little, but
mainly still using DJGPP for an MSDOS executable which runs under 
MSDOS, Win3.1 and Win95. 

There are two ugens.  One establishes some global variables and moves 
the head at k rate.

Head position is in x, y and z, with distance measured in milliseconds 
of sound travel (1 ms is around one foot or 30 cm).  The head axis is 
assumed to 
be vertical, but it would not be out of the question to extend this 
to other forms of rotation.  The rotation of the head is also 
specified at k rate.  This establishes a global pair of mix 
variables to which the second ugen accumulates a binaural mix. 
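
To make the coordinate scheme concrete, here is a minimal sketch in Python (not the ugen source) of how a source position might be expressed relative to the head and turned into a delay for each ear. The speed of sound constant, the rotation convention (head facing +x, rotation counter-clockwise about the vertical z axis), the ear offset and all function names are assumptions for illustration only.

```python
import math

# Assumed: sound travels about 343 m/s in air, i.e. 0.343 m per millisecond.
SPEED_OF_SOUND_M_PER_MS = 0.343


def metres_to_ms(distance_m):
    """Convert a distance in metres to milliseconds of sound travel."""
    return distance_m / SPEED_OF_SOUND_M_PER_MS


def source_relative_to_head(src, head, head_rot_deg):
    """Return the source position in the head's own frame.

    src, head: (x, y, z) tuples in milliseconds of sound travel.
    head_rot_deg: rotation about the vertical (z) axis, in degrees,
    counter-clockwise seen from above (an assumed convention).
    """
    dx, dy, dz = (s - h for s, h in zip(src, head))
    a = math.radians(-head_rot_deg)  # undo the head's rotation
    rx = dx * math.cos(a) - dy * math.sin(a)
    ry = dx * math.sin(a) + dy * math.cos(a)
    return (rx, ry, dz)


def ear_delays_ms(rel, ear_offset_ms=0.25):
    """Delay in ms from the source to the (left, right) ears.

    The ears are assumed to sit on the head's y axis, left ear at
    +ear_offset_ms and right ear at -ear_offset_ms (hypothetical values).
    """
    x, y, z = rel
    left = math.sqrt(x * x + (y - ear_offset_ms) ** 2 + z * z)
    right = math.sqrt(x * x + (y + ear_offset_ms) ** 2 + z * z)
    return left, right
```

Because distances are already in milliseconds, the distance from the source to each ear is directly the delay time for that ear.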

This ugen also sets up a global variable to determine the length of 
the delay lines used by the second ugen.

Another global variable sets how finely the binaural processing is 
done.  In high quality mode it uses filtering, volume, proximity 
effects (low frequency response drops off when small sound sources 
are not close to the ear) and an interpolated delay line over-sampled 
at double the a rate.  Other values reduce quality, but enable faster 
cooking of the piece for draft purposes.

There are no HRTF samples - that would be slow and raise many problems 
with interpolating between the samples for other angles.

The second ugen is given an audio signal, the "size" of the sound (eg 
a 1" tweeter, a 15" woofer or something larger), a factor to control 
to what degree this "size" affects frequency response, and x, y and z 
locations at k rate. There is no "direction" for the sound source - 
that would be possible but would add a lot of processing.  The sound 
source is assumed to have spherical radiation.

The gain when the sound source is at the ear location is 1.0.  
Further away the gain drops off in a rough relation to the distance 
and the "size" of the sound source.  A simple 1/(distance squared) 
approach leads to infinite volumes when the source is at the ear - 
this would only be realistic for a sound source of infinitely small size.
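
One way to picture a gain law of this kind is the following Python sketch. It is not the formula used in the ugen, only a hypothetical curve with the same two properties: gain is exactly 1.0 at zero distance, and far from the source it falls off roughly as 1/(distance squared), scaled by the source "size". The function name and the formula itself are assumptions.

```python
def bounded_gain(distance, size):
    """Hypothetical bounded gain: 1.0 at the ear, ~(size/distance)^2 far away.

    distance and size must be in the same units (for example,
    milliseconds of sound travel). size must be positive.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if distance < 0:
        raise ValueError("distance must not be negative")
    return 1.0 / (1.0 + (distance / size) ** 2)
```

With this curve a larger source stays louder further out, and a source shrinking towards zero size approaches the pure inverse-square behaviour that blows up at the ear.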

All processing is done on the basis of finely calculated delay times 
(hence full Doppler effects are a natural occurrence), volume, 
frequency response (IIR) in terms of proximity and "size" and some crude 
but effective IIR filtering based on azimuth and a little on 
elevation.  Thus it is quite clear whether the sound is going around 
the head clockwise or counter-clockwise.  I wouldn't make such claims 
for up/down response - that is fairly weak in the human brain anyway, 
and would be very complex to research and implement.  One day 
perhaps.
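
For anyone unfamiliar with how a finely calculated delay time gives Doppler for free, here is a minimal Python sketch of a fractional delay line. It uses plain linear interpolation between two samples, which is an assumption made for brevity and is simpler than the over-sampled interpolation described above. The class and function names are hypothetical.

```python
def ms_to_samples(delay_ms, sample_rate):
    """Convert a delay in milliseconds to a (fractional) number of samples."""
    return delay_ms * sample_rate / 1000.0


class FractionalDelay:
    """Circular delay line read at a fractional, time-varying delay."""

    def __init__(self, max_delay_samples):
        # Two extra slots: one for the current sample, one for the
        # interpolation neighbour at the maximum delay.
        self.buf = [0.0] * (int(max_delay_samples) + 2)
        self.write = 0

    def process(self, sample, delay_samples):
        """Write one input sample and return one delayed output sample."""
        n = len(self.buf)
        if not 0.0 <= delay_samples <= n - 2:
            raise ValueError("delay out of range for this delay line")
        self.buf[self.write] = sample
        whole = int(delay_samples)
        frac = delay_samples - whole
        newer = self.buf[(self.write - whole) % n]
        older = self.buf[(self.write - whole - 1) % n]
        self.write = (self.write + 1) % n
        # Linear interpolation between the two neighbouring samples.
        return newer + frac * (older - newer)
```

If delay_samples shrinks from one call to the next (the source is approaching), the read point moves through the buffer faster than the write point and the pitch rises. If it grows, the pitch falls. No separate Doppler calculation is needed.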

The left to right response is excellent.  The front to back response 
is valuable, but probably not as good as with HRTF or a more 
sophisticated approach to filtering.

Overall it produces a very aesthetic sound space, with a few 
anomalies as the sound passes very close or between the ears.  It 
sounds dramatically real with headphones and quite spacious on 
speakers.  With ksmps = 3 and a rate = 44100, it can do smooth 
localisation of rapidly moving sources - and you can easily and 
safely do things moving at a moderate fraction of the speed of 
sound, in 20 metre radius orbits, whizzing just in front of your nose!

Although the elevation cues are limited, the z dimension (up/down) 
is very important since it enables a sound to pass above or below the 
head at a distance which does not make it excessively loud or close.

The system is set up for one head and any number of sound sources, 
however you need to set your distance limit carefully and make sure 
you can fit all the delay lines in RAM.
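
A rough way to check the RAM requirement is sketched below in Python. The sample size of 4 bytes, one delay line per source, and the function name are all assumptions, so adjust them to suit the actual build.

```python
import math


def delay_line_bytes(distance_limit_ms, sample_rate, n_sources,
                     bytes_per_sample=4, lines_per_source=1, oversample=1):
    """Estimate the memory needed for all delay lines, in bytes.

    distance_limit_ms: longest source-to-ear distance, in ms of sound travel.
    bytes_per_sample, lines_per_source, oversample: assumed values.
    """
    samples = math.ceil(distance_limit_ms * sample_rate / 1000.0 * oversample)
    return samples * bytes_per_sample * lines_per_source * n_sources


# Ten sources, 100 ms distance limit (about 34 metres), 44100 Hz:
print(delay_line_bytes(100, 44100, 10))  # 176400
```

Doubling the distance limit or the over-sampling factor doubles the figure, so the limit is worth setting no larger than the piece needs.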

I can send you the source, or an MSDOS binary, and some orc/sco
pieces which use these ugens, on condition that you keep them to 
yourself.  At some stage I will probably refine and release them.  If 
you want to work on the source code *and* if you like to comment and 
document your code, then it would be great for you to contribute to 
the ugen.

The .h and .c files for UGRW3 total 184 k bytes. They include 
extensive comments and a sample rate conversion ugen as well.  (This 
enables Csound to calculate sound at 88.2kHz, but write a 44.1 kHz 
stereo output file.)

Even if you are not interested in how I achieve this binaural 
processing, I would suggest that the following parameters are 
a valuable basis for any binaural system:

Head    - k rate x, y and z location
        - k rate rotation
        - Perhaps other things if the head axis is not vertical.

Source  - "Size" of sound
        - To what degree "size" affects low frequency response:
          0 = none, like a sealed speaker box. 
          1 = a lot, like a loudspeaker without an enclosure.
        - a rate audio signal
        - k rate x, y and z

General - some parameter to globally control the CPU speed vs
          quality tradeoff.  You can grow old waiting for lots
          of high quality binaural sound to be calculated.
         
      
Regards

- Robin

. Robin Whittle                                               .
. http://www.ozemail.com.au/~firstpr   firstpr@ozemail.com.au .
. 11 Miller St. Heidelberg Heights 3081 Melbourne Australia   .
. Ph +61-3-9459-2889    Fax +61-3-9458-1736                   .
. Consumer advocacy in telecommunications, especially privacy .
.                                                             .
. First Principles      - Research and expression - music,    .
.                         music industry, telecommunications  .
.                         human factors in technology adoption.
.                                                             .
. Real World Interfaces - Hardware and software, especially   .
.                         for music                           .