Optimizations
Date | 1999-05-06 19:45 |
From | Anders Andersson |
Subject | Optimizations |
Ho! I've read about the optimization discussion going on.. Yesterday when browsing through some old books at the university library (where I work) I found a horrifying example of how things can go wrong when you try to optimize "by the book", and have old books.. It was an algorithm that used 8 muls, but they "optimized" it to "only" 7 muls and 11 adds. This would be great if the CPU could perform 11 adds quicker than 1 mul, as was the case pre 1990. Nowadays (post 1990), that code would be more than twice as slow as the "unoptimized" version, as today's CPUs are VERY optimized for multiplications, so that a mul is almost as fast as an add. I guess there might be a lot of these examples deep down in the code for the most "basic" (and thus oldest) opcodes in CSound, as indeed, it was a very common way to optimize a few years ago. (The famous Bresenham DDA linedrawer is a very good example!) // Anders (Who still uses a CPU dated pre 1990 (MC68030), thus the muls are about 10 times slower than the adds =D) |
Date | 1999-05-06 22:08 |
From | Ed Hall |
Subject | Re: Optimizations |
On Thu, 06 May 1999 20:45:42 you wrote:
> . . .
> Yesterday when browsing through some old books at the university-library
> (where I work) I found a horrifying example on how things can go wrong when
> you try to optimize "by the book", and have old books..
> . . .

Absolutely true! Counting instruction cycles is no longer the way to optimize. Main memory is only about 2.5 times faster than it was ten years ago, while CPUs are 25 times faster. As you note, some instructions (like multiply) have been sped up even more. Optimizations have to focus on the cache architecture and the wide range of latencies between level one and two (and sometimes level three) cache and main memory. Since a single 64K Csound function table or 1.5-second delay will completely fill most PCs' cache, main memory latency affects Csound more than most applications.

The CPU/memory ratio has shifted so far that optimizations that were common a few years ago, such as using lookup tables to avoid complex calculations, can actually slow programs down. Nonlinear waveshapers take note: it might be faster to compute your polynomial directly (though Csound has a high enough per-operation overhead that this likely isn't true for a coded expression, but would be for a polynomial opcode).

-Ed |
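To make the table-versus-polynomial trade-off concrete, here is a minimal sketch in Python of the two approaches Ed contrasts: evaluating a waveshaping polynomial directly (using Horner's rule) and reading the same curve from a precomputed, linearly interpolated table. The names `horner`, `build_table`, `table_lookup`, and the example polynomial `CHEBY3` are illustrative assumptions, not Csound code, and the sketch only shows that the two computations agree; which one is faster depends on the cache behaviour of the compiled code on a given machine.

```python
def horner(coeffs, x):
    """Evaluate a polynomial at x. coeffs run from highest degree to constant."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c          # one multiply and one add per coefficient
    return acc


def build_table(coeffs, size):
    """Precompute the polynomial at `size` evenly spaced points over [-1, 1]."""
    return [horner(coeffs, -1.0 + 2.0 * i / (size - 1)) for i in range(size)]


def table_lookup(table, x):
    """Linearly interpolated table read for x in [-1, 1]."""
    pos = (x + 1.0) * 0.5 * (len(table) - 1)
    i = min(int(pos), len(table) - 2)   # clamp so that i + 1 stays in range
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])


# Hypothetical waveshaper: 4x^3 - 3x (the third Chebyshev polynomial).
CHEBY3 = [4.0, 0.0, -3.0, 0.0]

if __name__ == "__main__":
    table = build_table(CHEBY3, 1025)
    for x in (-1.0, -0.3, 0.5, 1.0):
        print(x, horner(CHEBY3, x), table_lookup(table, x))
```

The direct version touches no memory beyond a handful of coefficients, while the table version reads two neighbouring entries from an array that may or may not be in cache; that memory access is the cost Ed is pointing at.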
Date | 1999-05-07 04:53 |
From | Terry McDermott |
Subject | Re: Optimizations |
>On Thu, 06 May 1999 20:45:42 you wrote:
>> . . .
>> Yesterday when browsing through some old books at the university-library
>> (where I work) I found a horrifying example on how things can go wrong when
>> you try to optimize "by the book", and have old books..
>> . . .
>
>Absolutely true! Counting instruction cycles is no longer the way to
>optimize. Main memory is only about 2.5 times faster than it was ten
>years ago, while CPU's are 25 times faster. As you note, some
>instructions (like multiply) have been sped up even more. Optimizations
>have to focus on the cache architecture and the wide range of latencies
>between level one and two (and sometimes level three) cache and main
>memory. Since a single 64K Csound function table or 1.5 second delay
>will completely fill most PC's cache, main memory latency affects Csound
>more than most applications.
>
>The CPU/memory ratio has shifted so far that optimizations that were
>common a few years ago, such as using lookup tables to avoid complex
>calculations, can actually slow programs down. Nonlinear waveshapers
>take note: it might be faster to compute your polynomial directly
>(though Csound has a high enough per-operation overhead that this
>likely isn't true for a coded expression, but would be for a polynomial
>opcode).
>
> -Ed

I have a hazy recollection that there is some programming trick often used by engineers to generate a sine wave in real time (calculating the values on the fly, rather than accessing lookup tables), which traces a circle in the complex z-plane, and voila, there is sinusoidal behavior on the real axis. If this recollection is correct, then maybe this could be an alternative way of making sine waves in Csound, and given the speed of CPUs these days as opposed to memory access, perhaps a more efficient method? Maybe it's already been done?
--Terry

Terry McDermott
Music Department
School of Arts & Media
Latrobe University
Bundoora, Victoria, 3083
Australia
email: T.McDermott@latrobe.edu.au
Telephone +61 3 9479 2167
Fax +61 3 9479 3651 |
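The trick Terry half-remembers can be sketched as follows, assuming it refers to the usual rotating-point oscillator: a point on the unit circle is rotated by a fixed angle each sample, and either coordinate of that point is a sinusoid. The function name `rotation_oscillator` and its parameters are illustrative assumptions, not an existing Csound opcode.

```python
import math


def rotation_oscillator(freq_hz, sample_rate, n_samples):
    """Generate a sine wave by rotating a point around the unit circle.

    cos/sin are called only once, to build the per-sample rotation; after
    that each sample costs four multiplies and two adds, with no table reads.
    """
    theta = 2.0 * math.pi * freq_hz / sample_rate   # angle advanced per sample
    c, s = math.cos(theta), math.sin(theta)
    x, y = 1.0, 0.0            # start at angle 0: x tracks cos, y tracks sin
    out = []
    for _ in range(n_samples):
        out.append(y)
        # Multiply (x + iy) by (c + is): a rotation by theta in the complex plane.
        x, y = x * c - y * s, x * s + y * c
    return out


if __name__ == "__main__":
    samples = rotation_oscillator(440.0, 44100.0, 8)
    print(samples)
```

One caveat worth stating: rounding errors accumulate from sample to sample, so over very long runs the amplitude can drift slowly, and a practical implementation may need to renormalize the point from time to time.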