Hello Hans and Csounders interested in binaural sound,
Your email pretty much describes a ugen I wrote around a year ago.
I have not released it publicly since I don't consider it finished,
and I have not known of anyone who was really interested in exploring
this field. However you do sound interested and I can email you the
source code.
Are you set up for re-compiling Csound and integrating new ugens?
What system are you using? I am using Linux a little, but
mainly still using DJGPP for an MSDOS executable which runs under
MSDOS, Win3.1 and Win95.
There are two ugens. One establishes some global variables and moves
the head at k rate.
Head position is in x, y and z, with distance measured in milliseconds of
sound travel (one millisecond is around one foot or 30 cm). The head axis is assumed to
be vertical, but it would not be out of the question to extend this
to other forms of rotation. The rotation of the head is also
specified at k rate. This establishes a global pair of mix
variables to which the second ugen accumulates a binaural mix.
This ugen also sets up a global variable to determine the length of
the delay lines used by the second ugen.
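To make the distance unit concrete, here is a small sketch of the conversions involved. It is not taken from the ugen source: the function names are hypothetical, and the speed of sound is an assumed round figure for air at about 20 degrees C.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at roughly 20 degrees C


def ms_to_metres(distance_ms):
    """Convert a distance in milliseconds of sound travel to metres."""
    return distance_ms * SPEED_OF_SOUND_M_PER_S / 1000.0


def ms_to_samples(distance_ms, sample_rate):
    """Convert a distance in milliseconds of sound travel to a delay in
    samples. The result may be fractional."""
    return distance_ms * sample_rate / 1000.0
```

With these assumptions, one unit of distance is about 0.34 m, and at a 44100 Hz a rate it corresponds to a delay of 44.1 samples, which is why the delay has to be read between samples.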
Another global variable determines how finely the binaural
processing is done. In high quality mode, filtering, volume,
proximity effects (low frequency response drops off when small sound
sources are not close to the ear) and an interpolated delay line,
over-sampled at twice the a rate, are used. Other values reduce quality, but
enable faster cooking of the piece for draft purposes.
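As an illustration of what an interpolated delay line does, here is a minimal sketch using plain linear interpolation between the two nearest samples. It is an assumption for illustration only: the class name is hypothetical, it does no over-sampling, and the actual ugen may interpolate quite differently.

```python
class InterpolatedDelay:
    """Circular delay line read at a fractional delay (linear interpolation)."""

    def __init__(self, length):
        self.buf = [0.0] * length
        self.write_pos = 0

    def tick(self, sample, delay_samples):
        """Write one input sample, then return the signal as it was
        delay_samples ago. delay_samples may be fractional and must lie
        in the range 0 <= delay_samples <= length - 2."""
        n = len(self.buf)
        self.buf[self.write_pos] = sample
        whole = int(delay_samples)
        frac = delay_samples - whole
        newer = self.buf[(self.write_pos - whole) % n]
        older = self.buf[(self.write_pos - whole - 1) % n]
        self.write_pos = (self.write_pos + 1) % n
        # Blend the two neighbouring samples according to the fraction.
        return newer + frac * (older - newer)
```

Changing delay_samples smoothly from one call to the next stretches or compresses the output in time, which is how a moving source picks up a pitch shift without any separate Doppler calculation.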
There are no HRTF samples - that would be slow and raise many problems
with interpolating between the samples for other angles.
The second ugen is given an audio signal, the "size" of the sound (e.g.
a 1" tweeter, a 15" woofer or something larger), a factor to control
to what degree this "size" affects frequency response, and x, y and z
locations at k rate. There is no "direction" for the sound source -
that would be possible but would add a lot of processing. The sound
source is assumed to have spherical radiation.
The gain when the sound source is at the ear location is 1.0.
Further away the gain drops off in a rough relation to the distance
and the "size" of the sound source. A simple 1/(distance squared)
approach leads to infinite volumes when the source is at the ear -
this would only be realistic for a sound source of infinitely small size.
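One possible shape for such a gain law is sketched below. This is a hypothetical formula chosen only because it equals 1.0 at the ear, stays finite, and falls off more slowly for larger sources. It is not the formula used in the ugen.

```python
def source_gain(distance, size):
    """Hypothetical gain law: 1.0 at the ear, roughly size/distance far away.

    distance and size must be in the same units, and size must be positive.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    return size / (size + max(distance, 0.0))
```

With this sketch a source one "size" away from the ear has a gain of 0.5, and at any fixed distance a larger source is louder than a smaller one.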
All processing is done on the basis of finely calculated delay times
(hence full Doppler effects are a natural occurrence), volume,
frequency response (IIR) in terms of proximity and "size" and some crude
but effective IIR filtering based on azimuth and a little on
elevation. Thus it is quite clear whether the sound is going around
the head clockwise or counter-clockwise. I wouldn't make such claims
for up/down response - that is fairly weak in the human brain anyway,
and would be very complex to research and implement. One day
perhaps.
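The delay calculation can be sketched as plain geometry. The version below rests on several assumptions that are not from the ugen source: the ears sit a fixed offset either side of the head position, at zero rotation the right ear points along +x, rotation is in radians about the vertical axis, and all distances are in milliseconds of sound travel so that distance and delay are the same number.

```python
import math

EAR_OFFSET_MS = 0.25  # assumption: half a head width, about 8.5 cm of sound travel


def ear_delays(head, rotation, source, ear_offset=EAR_OFFSET_MS):
    """Return (left_delay_ms, right_delay_ms) for a point source.

    head and source are (x, y, z) tuples in milliseconds of sound travel.
    rotation is the head's rotation about the vertical axis, in radians.
    """
    hx, hy, hz = head
    # Unit vector from the head centre toward the right ear.
    ax, ay = math.cos(rotation), math.sin(rotation)
    left = (hx - ear_offset * ax, hy - ear_offset * ay, hz)
    right = (hx + ear_offset * ax, hy + ear_offset * ay, hz)
    return math.dist(left, source), math.dist(right, source)
```

A source directly to the right arrives at the right ear first, and recomputing the two delays as the head or source moves is what gives both the inter-aural time difference and the Doppler shift.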
The left to right response is excellent. The front to back response
is valuable, but probably not as good as with HRTF or a more
sophisticated approach to filtering.
Overall it produces a very aesthetic sound space, with a few
anomalies as the sound passes very close to or between the ears. It
sounds dramatically real with headphones and quite spacious on
speakers. With ksmps = 3 and a rate = 44100, it can do smooth
localisation of rapidly moving sources - and you can easily and
safely do things moving at a moderate fraction of the speed of
sound, in 20 metre radius orbits, whizzing just in front of your nose!
Although the elevation cues are limited, the z dimension (up/down)
is very important since it enables a sound to pass above or below the
head at a distance which does not make it excessively loud or close.
The system is set up for one head and any number of sound sources;
however, you need to set your distance limit carefully and make sure
you can fit all the delay lines in RAM.
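A rough way to budget the memory is sketched here. It assumes one delay line per source, 4-byte samples and no over-sampling, none of which is confirmed above, so treat the result as an order-of-magnitude estimate only.

```python
import math


def delay_ram_bytes(n_sources, max_distance_ms, sample_rate, bytes_per_sample=4):
    """Estimate the RAM needed for the delay lines.

    Assumes one delay line per source, long enough to hold the distance
    limit plus one extra sample for interpolation.
    """
    samples_per_line = math.ceil(max_distance_ms * sample_rate / 1000.0) + 1
    return n_sources * samples_per_line * bytes_per_sample
```

For example, ten sources with a distance limit of 1000 ms at 44100 Hz come to a little under 1.8 megabytes under these assumptions.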
I can send you the source, or an MSDOS binary, and some orc/sco
pieces which use these ugens, on condition that you keep them to
yourself. At some stage I will probably refine and release them. If
you want to work on the source code *and* if you like to comment and
document your code, then it would be great for you to contribute to
the ugen.
The .h and .c files for UGRW3 total 184 k bytes. They include
extensive comments and a sample rate conversion ugen as well. (This
enables Csound to calculate sound at 88.2kHz, but write a 44.1 kHz
stereo output file.)
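The general idea of halving the sample rate can be sketched as a low-pass filter followed by dropping every second sample. The code below is a generic textbook approach with hypothetical names (a Hamming-windowed sinc FIR), not the conversion ugen in UGRW3.

```python
import math


def halfband_taps(num_taps=31):
    """Hamming-windowed sinc low-pass, cutoff at a quarter of the input rate.

    num_taps should be odd so the filter has a centre tap.
    """
    mid = (num_taps - 1) // 2
    taps = []
    for i in range(num_taps):
        x = i - mid
        ideal = 0.5 if x == 0 else math.sin(math.pi * x / 2) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalise for unity gain at DC


def decimate_by_2(signal, taps=None):
    """Low-pass filter the signal and keep every second output sample."""
    taps = taps or halfband_taps()
    out = []
    for n in range(0, len(signal), 2):
        acc = 0.0
        for k, tap in enumerate(taps):
            if 0 <= n - k < len(signal):
                acc += tap * signal[n - k]
        out.append(acc)
    return out
```

Only the kept output samples are computed, so the filter costs half as much as filtering at the full rate and then discarding samples.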
Even if you are not interested in how I achieve this binaural
processing, I would suggest that the following parameters are
a valuable basis for any binaural system:
Head    - k rate x, y and z location
        - k rate rotation
        - Perhaps other things if the head axis is not vertical.
Source  - "Size" of sound
        - To what degree "size" affects low frequency response:
            0 = none, like a sealed speaker box.
            1 = a lot, like a loudspeaker without an enclosure.
        - a rate audio signal
        - k rate x, y and z
General - some parameter to globally control the CPU speed vs
          quality tradeoff. You can grow old waiting for lots
          of high quality binaural sound to be calculated.
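The parameter set above could be written down as a small data structure, for example as below. All of the names are hypothetical, the rotation unit is left open, and the meaning of the quality value is an assumption, since the numbering of the quality modes is not given here. The a rate audio signal is left out because it would be passed in per block rather than stored.

```python
from dataclasses import dataclass


@dataclass
class Head:
    """k rate head state."""
    x: float = 0.0
    y: float = 0.0
    z: float = 0.0
    rotation: float = 0.0  # about the vertical axis


@dataclass
class Source:
    """k rate state of one sound source."""
    size: float
    size_effect: float  # 0 = none (sealed box), 1 = a lot (no enclosure)
    x: float
    y: float
    z: float

    def __post_init__(self):
        if not 0.0 <= self.size_effect <= 1.0:
            raise ValueError("size_effect must be between 0 and 1")


@dataclass
class BinauralSettings:
    """Global CPU speed vs quality control; the scale is an assumption."""
    quality: int = 0
```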
Regards
- Robin
. Robin Whittle .
. http://www.ozemail.com.au/~firstpr firstpr@ozemail.com.au .
. 11 Miller St. Heidelberg Heights 3081 Melbourne Australia .
. Ph +61-3-9459-2889 Fax +61-3-9458-1736 .
. Consumer advocacy in telecommunications, especially privacy .
. .
. First Principles - Research and expression - music, .
. music industry, telecommunications .
. human factors in technology adoption.
. .
. Real World Interfaces - Hardware and software, especially .
. for music .