
Optimizations

Date: 1999-05-06 19:45
From: Anders Andersson
Subject: Optimizations
Ho!

I've read about the optimization discussion going on..

Yesterday when browsing through some old books at the university library
(where I work) I found a horrifying example of how things can go wrong when
you try to optimize "by the book", and have old books..

It was an algorithm that used 8 muls, but they "optimized" it to "only" 7
muls and 11 adds.
This would be great if the CPU performed 11 adds quicker than 1 mul, as
was the case pre-1990.

Nowadays (post-1990), that code would be more than twice as slow as the
"unoptimized" version, as today's CPUs are VERY optimized for
multiplications, so that a mul is about as fast as an add.
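
To make the trade-off concrete with a smaller, well-known case (this is not the algorithm from the book, which isn't named above): the product of two complex numbers takes 4 muls and 2 adds when written directly, and can be rearranged to use 3 muls and 5 adds. The C sketch below shows both forms; the names cplx, cmul_direct and cmul_3mul are made up for illustration. Which form wins depends entirely on the relative cost of a mul and an add on the target CPU, so it has to be measured rather than assumed.

```c
/* Illustrative only: two ways to compute (a + bi) * (c + di). */
typedef struct { double re, im; } cplx;

/* Direct form: 4 multiplications, 2 additions/subtractions. */
cplx cmul_direct(cplx x, cplx y)
{
    cplx r;
    r.re = x.re * y.re - x.im * y.im;   /* ac - bd */
    r.im = x.re * y.im + x.im * y.re;   /* ad + bc */
    return r;
}

/* Rearranged form: 3 multiplications, 5 additions/subtractions. */
cplx cmul_3mul(cplx x, cplx y)
{
    double k1 = y.re * (x.re + x.im);   /* c(a + b) = ac + bc */
    double k2 = x.re * (y.im - y.re);   /* a(d - c) = ad - ac */
    double k3 = x.im * (y.re + y.im);   /* b(c + d) = bc + bd */
    cplx r;
    r.re = k1 - k3;                     /* ac - bd */
    r.im = k1 + k2;                     /* bc + ad */
    return r;
}
```

Both functions return the same product; only the operation mix differs, which is exactly the kind of "optimization" whose payoff flips when multiplies get cheap.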

I guess there might be a lot of examples like this deep down in the code for
the most "basic" (and thus oldest) opcodes in Csound, since this was a very
common way to optimize a few years ago. (The famous Bresenham DDA
line drawer is a very good example!)


// Anders (who still uses a pre-1990 CPU (MC68030), so the muls
are about 10 times slower than the adds =D)

Date: 1999-05-06 22:08
From: Ed Hall
Subject: Re: Optimizations
On Thu, 06 May 1999 20:45:42 you wrote:
> . . .
> Yesterday when browsing through some old books at the university library
> (where I work) I found a horrifying example of how things can go wrong when
> you try to optimize "by the book", and have old books..
> . . .

Absolutely true!  Counting instruction cycles is no longer the way to
optimize.  Main memory is only about 2.5 times faster than it was ten
years ago, while CPUs are 25 times faster.  As you note, some
instructions (like multiply) have been sped up even more.  Optimizations
have to focus on the cache architecture and the wide range of latencies
between level one and two (and sometimes level three) cache and main
memory.  Since a single 64K Csound function table or 1.5 second delay
will completely fill most PCs' caches, main memory latency affects Csound
more than most applications.
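
As a rough check of those sizes, here is a small C sketch of the arithmetic. It assumes 4-byte float samples and a 44100 Hz sample rate, neither of which is stated above, and it ignores any extra guard point on the table; the names table_bytes and delay_bytes are made up for illustration.

```c
#include <stddef.h>

/* Assumptions (not stated in the message): samples are 4-byte floats
   and the sample rate is 44100 Hz. */
#define SAMPLE_BYTES 4u
#define SAMPLE_RATE  44100u

/* Bytes occupied by a function table with the given number of points. */
size_t table_bytes(size_t points)
{
    return points * SAMPLE_BYTES;
}

/* Bytes occupied by a delay line of the given length in milliseconds. */
size_t delay_bytes(size_t millis)
{
    return (size_t)SAMPLE_RATE * millis / 1000u * SAMPLE_BYTES;
}
```

Under these assumptions table_bytes(65536) is 262144 bytes (256 KiB) and delay_bytes(1500) is 264600 bytes, so either object alone is about a quarter of a megabyte.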

The CPU/memory ratio has shifted so far that optimizations that were
common a few years ago, such as using lookup tables to avoid complex
calculations, can actually slow programs down.  Nonlinear waveshapers
take note: it might be faster to compute your polynomial directly
(though Csound has a high enough per-operation overhead that this
likely isn't true for a coded expression, but would be for a polynomial
opcode).
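
A minimal C sketch of the two approaches to waveshaping, for comparison. This is not Csound code; poly_horner and shape_table are hypothetical names, and the table version assumes an input range of [-1, 1] with a truncating (non-interpolating) lookup. The polynomial version touches no table memory at all, which is the point of the suggestion; whether it is actually faster on a given machine would need to be measured.

```c
/* Evaluate c[0] + c[1]*x + ... + c[n-1]*x^(n-1) by Horner's rule:
   one multiply and one add per coefficient, no table memory touched. */
double poly_horner(const double *c, int n, double x)
{
    double y = 0.0;
    int i;
    for (i = n - 1; i >= 0; i--)
        y = y * x + c[i];
    return y;
}

/* Table-lookup alternative: map x in [-1, 1] onto a precomputed table
   and return the nearest-below entry. Cheap in arithmetic, but each
   call is a memory access that may miss the cache. */
double shape_table(const double *table, int size, double x)
{
    int i = (int)((x + 1.0) * 0.5 * (size - 1));
    if (i < 0) i = 0;                 /* clamp out-of-range input */
    if (i > size - 1) i = size - 1;
    return table[i];
}
```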

		-Ed

Date: 1999-05-07 04:53
From: Terry McDermott
Subject: Re: Optimizations

I have a hazy recollection that there is some programming trick often used
by engineers to generate a sine wave in real time (calculating the values on
the fly, rather than accessing lookup tables) which moves a point around a
circle in the complex z-plane, and voila, there is sinusoidal behavior
on the real axis. If this recollection is correct, then maybe this could be
an alternative way of making sine waves in Csound, and given the speed of
CPUs these days as opposed to memory access, perhaps a more efficient
method? Maybe it's already been done?

 --Terry

Terry McDermott

Music Department
School of Arts & Media
Latrobe University
Bundoora, Victoria, 3083
Australia

email: T.McDermott@latrobe.edu.au

Telephone	+61 3 9479 2167
Fax		+61 3 9479 3651