Pentium/Linux csound speed boost

Date: 1998-06-29 18:49
From: Paul Winkler
Subject: Pentium/Linux csound speed boost

Well, folks, I have some good news for those Csounders running 
unix-derivative OSes on Pentium-class platforms. Actually it's 
undoubtedly old hat to some of you. I just compiled and installed the 
pgcc compiler, which is a patch to the egcs compiler with lots of 
Pentium-specific optimizations. I've done some benchmarking of csound 
3.482 for Linux against an older 3.46 gcc-compiled binary I had 
floating around, and the results are mostly very favorable. I also 
booted into Windows 95 and tested csound95 just to see how that 
compared (very well, actually-- who compiled that executable, and 
with what?).

You can read more about pgcc and egcs at 
http://www.goof.com/pcg/pgcc-faq.html

Apparently there is a DOS version, DJGPP, which I have seen mentioned 
on the Csound list as well. (Maybe it was in connection with 
csound95...)

Compilation and installation of pgcc went pretty smoothly, though it's 
quite a lot to download.  The patch was successful except for a few 
comments in one file which I decided to ignore. The build eats a LOT of 
drive space. My object directory swelled to 70 MB by the end, and the 
bootstrapped compile took a LONG time (maybe an hour). Probably I could 
have gone faster/smaller by being more specific about what to build 
(there's c++ and fortran compilers and other stuff I never use).

Compiling Csound with pgcc:
Initially I just tried adding -mpentium -O6 to the .configure file; 
Csound compiles, but dumps core at run time. I looked at some specific 
flags in the PGCC FAQ, and then changed the compile options to 
COMPILE_OPTIONS= -fPIC -mpentium -O6 -fno-peep-spills 
-fno-omit-frame-pointer -mstack-align-double -funroll-all-loops 
-ffast-math

That did the trick. I'm not sure if both of the "-fno-" flags are 
necessary.


Now for the not-necessarily-very-meaningful-but-traditional csound 
benchmarks: (watch out for line wrapping, ugh!)

Test     Bach-d  Bach-m  Riss-g  Riss-m  Guit-d  Guit-m  Jame-g  pvanal  lpanal
Length  115.73s 115.73s  63.75s  63.75s  87.75s  87.75s   6.75s   6.75s   5.00s
------------------------------------------------------------------------------

P133/48:
pgcc      7.39    7.27    3.94    3.82    6.63    6.47    0.50    4.51    3.20
gcc       7.73    7.47    4.15    4.03   11.16   11.10    0.80     ---     ---
Win95    10.03    7.98    5.95    3.99    9.061   6.81    0.82     ---     ---

platform: Pentium 133, 48 MB SDRAM, 4.3 GB Maxtor EIDE drive.
OS: Windows 95 / linux 2.0.32 + XFree86 + FVWM2
csound versions: 
"pgcc" = csound 3.482 with glibc patches, dynamically linked, compiled with
     "pgcc -O6 -mpentium -fno-peep-spills -fno-omit-frame-pointer
      -mstack-align-double -funroll-all-loops -ffast-math".
"gcc" = csound 3.46, statically linked, compiled with "gcc -O2
     -ffast-math -funroll-loops".
"Win95" = csound95.exe precompiled binary, invoked via a .bat file
          from the Windows 95 GUI (thus popping up a DOS window).


Interestingly, these scores rank higher in the RESULTS table than I
would have guessed. My runs are well ahead of the most
similar hardware platforms in the RESULTS file (G. Maldonado's
P133/32, Antonio Neto's P166/32, & M. Gogins' Pent166/40). 

The pgcc vs. gcc improvement ranges from 2.8% to 71.6%, with a mean
improvement of 31%.
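
For anyone who wants to check those percentages, here is a minimal Python
sketch that recomputes them from the pgcc and gcc rows of the table. It
assumes "improvement" means (gcc time - pgcc time) / pgcc time, which is the
convention that reproduces the quoted figures; the function and variable
names are illustrative only.

```python
# Times copied from the P133/48 table, for the seven tests
# that have both a pgcc and a gcc result.
tests = ["Bach-d", "Bach-m", "Riss-g", "Riss-m", "Guit-d", "Guit-m", "Jame-g"]
pgcc = [7.39, 7.27, 3.94, 3.82, 6.63, 6.47, 0.50]
gcc = [7.73, 7.47, 4.15, 4.03, 11.16, 11.10, 0.80]


def improvement(slow, fast):
    """Percent by which `slow` exceeds `fast`, relative to `fast` (assumed convention)."""
    return (slow - fast) / fast * 100.0


gains = [improvement(g, p) for g, p in zip(gcc, pgcc)]
for name, gain in zip(tests, gains):
    print(f"{name}: {gain:5.1f}%")

mean_gain = sum(gains) / len(gains)
print(f"min {min(gains):.1f}%  max {max(gains):.1f}%  mean {mean_gain:.1f}%")
```

The smallest gain comes from Bach-m and the largest from Guit-m; the two
guitar tests and Jame-g account for nearly all of the mean.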

Moving on, I decided to test Joseph Kung's piece Xanadu, as this has 
been suggested several times as a more worthwhile benchmarking piece for 
modern processors. It can be found via ftp at 
ftp.musique.umontreal.ca/pub/mirrors/dream/documentation/orchestras+scores/others_examples

xanadu44.orc is the sr=44100, kr=4410, stereo version of the orc; 
xanadus.sco is the 1-minute-long version. I benchmarked this with the 
same three setups as above. I also include Robin Whittle's PPro180/64 
results with Csound95 as he reported back in May 97 (see Csound_81 in 
the mailing list archives at .../dream/Csound_List_Archive) ... I 
haven't seen any other tests of Xanadu for comparison.
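
For anyone wanting to repeat this kind of test, a minimal sketch of a
wall-clock timing wrapper in Python follows. It is only an illustration,
assuming that elapsed time around the whole run is the quantity of interest;
the helper name time_command and the csound command line mentioned in the
comment are assumptions, not a description of how the numbers in this
message were produced.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv once; return (elapsed wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode


# Demonstration with a trivial command so the sketch runs anywhere.
# For a real test, substitute something like (assumed invocation):
#   ["csound", "-o", "test.wav", "xanadu44.orc", "xanadus.sco"]
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"{elapsed:.2f} seconds (exit code {code})")
```

Running each setup several times and keeping the best time helps reduce the
influence of disk caching and other background activity.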

Robin Whittle's
PPro180:               165.71 seconds (44% better than my best time)

Win95/Csound95:        238.76 seconds (9.7% better than linux/pgcc, 
                                       31% better than linux/gcc)

Linux/3.482/pgcc:      261.88 seconds  (19.7% better than gcc)

Linux/Csound 3.46/gcc: 313.51 seconds


There is also a sr= 8192, kr= 512, mono version of the same orc
(xanadu.orc) and a longer, 100-second version of the sco
(xanadul.sco). Interestingly, this test showed a quite different spread:


Linux/3.482/pgcc:       99.97 seconds (4% better than csound95,
                                        56% better than gcc!!!)

Win95/Csound95:        103.95 seconds (50.1% better than gcc)

Linux/Csound 3.46/gcc: 156.05 seconds
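
The "N% better" figures in both Xanadu lists can be rechecked with a short
Python sketch. It assumes "A is N% better than B" means
(B's time - A's time) / A's time, the convention that matches the numbers
quoted; the dictionary keys and function names are made up for illustration.

```python
def percent_better(slower, faster):
    """Percent by which `slower` exceeds `faster`, relative to `faster`."""
    return (slower - faster) / faster * 100.0


# Render times in seconds, copied from the two Xanadu lists.
xanadu44 = {
    "PPro180/Csound95": 165.71,
    "Win95/Csound95": 238.76,
    "Linux/pgcc": 261.88,
    "Linux/gcc": 313.51,
}
xanadu_8192 = {
    "Linux/pgcc": 99.97,
    "Win95/Csound95": 103.95,
    "Linux/gcc": 156.05,
}


def compare(times):
    """Print how much better each setup is than every slower one."""
    ranked = sorted(times.items(), key=lambda item: item[1])
    for i, (fast_name, fast) in enumerate(ranked):
        for slow_name, slow in ranked[i + 1:]:
            gain = percent_better(slow, fast)
            print(f"{fast_name} is {gain:.1f}% better than {slow_name}")


compare(xanadu44)
compare(xanadu_8192)
```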

So there you have it. Looks like csound95 and pgcc-compiled linux csound 
are well ahead of gcc-compiled linux csound. To keep this message to a 
reasonable length, I'm putting my next tests (some realtime stuff) in a 
separate message.

Regards,

PW
