[Cs-dev] csound6 speed
Date | 2013-05-25 11:39 |
From | Ben Hackbarth |
Subject | [Cs-dev] csound6 speed |
Attachments | None None |
hello, i'm looking to switch over to cs6 but i am finding that, in my larger csound projects, cs6 runs about 25% slower than cs5 on my machine (osx 10.7 doubles built from git yesterday). i run csound non-realtime.
thanks,
-- ben
|
Date | 2013-05-25 12:01 |
From | Michael Gogins |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
Yes, I think Csound 6 is a bit slower per thread than Csound 5. To deal with that, I expect to put in some time profiling and optimizing the code, as I think the other developers may also do or have done. This may or may not produce speedups. But there already is a way that some Csound pieces can benefit from big speedups right now, given certain pre-conditions. First, your computer has to have at least 2 cores. Second, your piece has to have code that will speed up if run on multiple cores. Usually but not always, that means multiple instances of the same instrument running at the same time.
If your pieces meet these conditions, then you will need to run at ksmps 20 to 100 or so. I haven't tried that with --sample-accurate, but I will be trying that. In such cases, on both 2 and 4 core machines, I've measured speedups of up to 2 times. That would make Csound 6 distinctly faster than Csound 5. I now routinely run multi-threaded. I think that the people who implemented this deserve recognition, as this puts Csound at the head of its class. My own role was pushing for the idea, monkeying with a simple and ineffective earlier implementation, and profiling code developed by John ffitch, Chris Wilson, and perhaps others. (Steven Yi, Victor Lazzarini, did you work on ParCS?)
Expect to have to do some experimenting if you want this to work for you. Hope this helps, Mike On Sat, May 25, 2013 at 6:39 AM, Ben Hackbarth <hackbarth@gmail.com> wrote:
Michael Gogins Irreducible Productions http://www.michael-gogins.com Michael dot Gogins at gmail dot com |
Date | 2013-05-25 13:38 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Ben, we have not, in general, looked too much at optimisation, so there might be some work there. Given the breadth of changes, it's not unexpected that there would be a slowdown for some projects. Some hacks and case-specific bits of code were replaced with clearer, more maintainable code. For the moment, you will need to bear with us until we get to that point. Hopefully, we will be able to recover some of the loss in performance. If performance is critical for your projects, you will need to evaluate whether you want to switch to Csound 6 at this point. Currently, the 6.00 RC2 release of Csound is about 3-5% slower than 5.19, as measured with Trapped on a single thread. Of course, this might be worse or better for other code. As Michael pointed out, it can be much better with multiple cores, for code that is a good match for running in parallel.
Regards
Victor
On 25 May 2013, at 11:39, Ben Hackbarth wrote:
> hello,
>
> i'm looking to switch over to cs6 but i am finding that, in my larger csound projects, cs6 runs about 25% slower than cs5 on my machine (osx 10.7 doubles built from git yesterday). i run csound non-realtime.
>
> i'm using lots of strings and lots of python/pycall*. assuming that cs6 should be the same speed as cs5 (if not faster), how might i go about profiling cpu usage in detail? are there any build options that cs5 has turned off that cs6 has on (like multicore for instance)? the csd is complex and pulling it apart to test individual components would be rather difficult.
>
> i have noticed the difference is compile time between cs6 and cs5 for a while now (months), so i would doubt that it is due to any recent changes in git.
>
> thanks,
> -- ben
Dr Victor Lazzarini Senior Lecturer Dept. of Music NUI Maynooth Ireland tel.: +353 1 708 3545 Victor dot Lazzarini AT nuim dot ie |
Date | 2013-05-25 15:54 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
dear michael, thanks for your response. i started building csound6 explicitly to take advantage of multithreading and the associated speedups. i am most grateful that csound development has been moving in this direction.
-- ben
On Sat, May 25, 2013 at 1:01 PM, Michael Gogins <michael.gogins@gmail.com> wrote:
|
Date | 2013-05-25 16:57 |
From | Michael Gogins |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
Can you email me a test csd, self-contained? Whether you get speedups with multi-core or not very much depends on the nature of the orchestra code. I can try it here and let you know what I get. As for configuring, yes, there is a BUILD_MULTI_CORE option that must be enabled for CMake. What platform are you on -- Windows, OS X, or Linux?
Regards, Mike On Sat, May 25, 2013 at 10:54 AM, Ben Hackbarth <hackbarth@gmail.com> wrote:
|
Date | 2013-05-25 17:32 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
Have you used the -j option?
On 25 May 2013, at 15:54, Ben Hackbarth wrote:
|
Date | 2013-05-25 17:42 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
hey guys, i have not set BUILD_MULTI_CORE in my cmake config, so that would seem to be the most likely culprit :)
-- ben
On Sat, May 25, 2013 at 6:32 PM, Victor Lazzarini <Victor.Lazzarini@nuim.ie> wrote:
|
Date | 2013-05-25 17:57 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
things do not seem much better with multithreading enabled. cs6 with -j 4 is still about 30% slower than cs5 single thread.
-- ben
On Sat, May 25, 2013 at 6:42 PM, Ben Hackbarth <hackbarth@gmail.com> wrote:
|
Date | 2013-05-25 18:01 |
From | Michael Gogins |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
I'm not familiar with OS X. So, perhaps the multicore code behaves differently on OS X. I can say that on both Windows and Linux it behaves more or less the same. Regards, Mike On Sat, May 25, 2013 at 12:57 PM, Ben Hackbarth <hackbarth@gmail.com> wrote:
|
Date | 2013-05-25 18:11 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
No, it doesn't; the behaviour should be similar. Some code does not get the benefits of multicore.
Victor
On 25 May 2013, at 18:01, Michael Gogins wrote:
|
Date | 2013-05-25 19:46 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
hi victor, i think i have found what is causing the significant difference in rendering time between cs5 and cs6 -- diskgrain.

for csound 5.19.02 (double samples), single core:
real 0m4.249s
user 0m4.059s
sys 0m0.186s

for csound 6.00rc1 (double samples), single core:
real 0m7.513s
user 0m7.320s
sys 0m0.192s

with csound6 -j 4:
real 0m4.337s
user 0m13.314s
sys 0m1.187s

so, multicore is working on osx 10.7 and gives a nice boost in this case. unfortunately my more complicated code does not see such a dramatic gain in efficiency. nonetheless, it seems suspicious that there is such a large difference in rendering times between the single core versions. i realize that this might not be considered a problem or a priority, but i just wanted to follow up.
regards,
-- ben
On Sat, May 25, 2013 at 7:11 PM, Victor Lazzarini <Victor.Lazzarini@nuim.ie> wrote:
|
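For readers who want to put numbers on the timings Ben reports, the following is a minimal Python sketch, not part of the original thread. It takes the `real` and `user` figures from his message and derives two rough indicators: wall-clock time relative to the cs5 run, and `user / real` as an approximation of how many cores were kept busy. The function names `wall_ratio` and `busy_cores` are made up for this illustration.

```python
# Timings in seconds, copied from the message above ("real" and "user" rows).
runs = {
    "cs5 single core": {"real": 4.249, "user": 4.059},
    "cs6 single core": {"real": 7.513, "user": 7.320},
    "cs6 -j 4": {"real": 4.337, "user": 13.314},
}


def wall_ratio(run, baseline):
    """Wall-clock time of `run` as a multiple of the baseline run."""
    return run["real"] / baseline["real"]


def busy_cores(run):
    """Rough number of cores kept busy: CPU time divided by wall-clock time."""
    return run["user"] / run["real"]


baseline = runs["cs5 single core"]
for name, run in runs.items():
    print(
        f"{name}: {wall_ratio(run, baseline):.2f}x the cs5 wall time, "
        f"about {busy_cores(run):.1f} cores busy"
    )
```

On these figures the single-core cs6 run takes roughly 1.8 times as long as cs5, and the `-j 4` run gets back to about the cs5 wall time while using about three cores' worth of CPU, which is why the single-core gap still looked suspicious.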
Date | 2013-05-25 19:59 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
OK, looks like disk reading could be the problem, since diskgrain has not really changed as far as I know (at least in normal mode). But I will look. Thanks. On 25 May 2013, at 19:46, Ben Hackbarth wrote:
|
Date | 2013-05-25 20:10 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
I tested your CSD here with the latest GIT and there is not much difference (in fact csound6 is faster). I'm using a soundfile of my own (4 mins long).

=======
coltrane:debug victor$ csound64 diskgrain.csd -d
time resolution is 1000.000 ns
Csound version 5.19.02 beta (double samples) May 7 2013
UnifiedCSD: diskgrain.csd
Creating options
Creating orchestra
Creating score
graph init using callback interface
Parsing successful!
Elapsed time at end of orchestra compile: real: 0.006s, CPU: 0.003s
Sorting score
Elapsed time at end of score sort: real: 0.007s, CPU: 0.004s
displays suppressed
0dBFS level = 1.0
orch now loaded
audio buffered in 1024 sample-frame blocks
not writing to sound disk
SECTION 1:
ftable 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
B 0.000 ..100.000 T100.000 TT100.000 M: 1.02161 1.02161
number of samples out of range: 2 2
Score finished in csoundPerformKsmps().
inactive allocs returned to freespace
end of score. overall amps: 1.02161 1.02161
overall samples out of range: 2 2
0 errors in performance
Elapsed time at end of performance: real: 5.523s, CPU: 5.506s
no sound written to disk
================
coltrane:debug victor$ csound diskgrain.csd -d
CoreMIDI real time MIDI plugin for Csound
PortMIDI real time MIDI plugin for Csound
PortAudio real-time audio module for Csound
virtual_keyboard real time MIDI plugin for Csound
rtaudio: PortAudio module enabled ... using callback interface
rtmidi: PortMIDI module enabled
0dBFS level = 32768.0
Csound version 6.00rc2 (double samples) May 24 2013
libsndfile-1.0.21
WARNING: could not open library './libCsoundAC.6.0.dylib' (-1)
WARNING: could not open library './libCsoundAC.dylib' (-1)
UnifiedCSD: diskgrain.csd
STARTING FILE
Creating options
Creating orchestra
Creating score
rtaudio: PortAudio module enabled ... using callback interface
rtmidi: PortMIDI module enabled
Parsing successful!
Elapsed time at end of orchestra compile: real: 0.003s, CPU: 0.003s
sorting score ...
... done
Elapsed time at end of score sort: real: 0.003s, CPU: 0.003s
Csound version 6.00rc2 (double samples) May 24 2013
displays suppressed
0dBFS level = 1.0
orch now loaded
audio buffered in 1024 sample-frame blocks
not writing to sound disk
SECTION 1:
ftable 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
B 0.000 ..100.000 T100.000 TT100.000 M: 1.02161 1.02161
number of samples out of range: 2 2
Score finished in csoundPerformKsmps().
inactive allocs returned to freespace
end of score. overall amps: 1.02161 1.02161
overall samples out of range: 2 2
0 errors in performance
Elapsed time at end of performance: real: 5.426s, CPU: 5.385s
no sound written to disk

On 25 May 2013, at 19:46, Ben Hackbarth wrote:
|
Date | 2013-05-25 20:11 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
And multicore is almost twice as fast (2 threads):

coltrane:debug victor$ csound diskgrain.csd -d -j 2
CoreMIDI real time MIDI plugin for Csound
PortMIDI real time MIDI plugin for Csound
PortAudio real-time audio module for Csound
virtual_keyboard real time MIDI plugin for Csound
rtaudio: PortAudio module enabled ... using callback interface
rtmidi: PortMIDI module enabled
0dBFS level = 32768.0
Csound version 6.00rc2 (double samples) May 24 2013
libsndfile-1.0.21
WARNING: could not open library './libCsoundAC.6.0.dylib' (-1)
WARNING: could not open library './libCsoundAC.dylib' (-1)
UnifiedCSD: diskgrain.csd
STARTING FILE
Creating options
Creating orchestra
Creating score
rtaudio: PortAudio module enabled ... using callback interface
rtmidi: PortMIDI module enabled
Parsing successful!
Elapsed time at end of orchestra compile: real: 0.004s, CPU: 0.003s
sorting score ...
... done
Elapsed time at end of score sort: real: 0.004s, CPU: 0.003s
Multithread performance: insno: -1 thread 0 of 2 starting.
Csound version 6.00rc2 (double samples) May 24 2013
displays suppressed
0dBFS level = 1.0
orch now loaded
audio buffered in 1024 sample-frame blocks
not writing to sound disk
SECTION 1:
ftable 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
new alloc for instr 1:
B 0.000 ..100.000 T100.000 TT100.000 M: 1.02161 1.02161
number of samples out of range: 2 2
Score finished in csoundPerformKsmps().
inactive allocs returned to freespace
end of score. overall amps: 1.02161 1.02161
overall samples out of range: 2 2
0 errors in performance
Elapsed time at end of performance: real: 3.975s, CPU: 6.673s
no sound written to disk
=================

On 25 May 2013, at 19:59, Victor Lazzarini wrote:
|
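Comparing runs by eye in long Csound logs is error-prone, so here is a small Python sketch, not from the thread, that pulls the "Elapsed time at end of performance" figures out of log text and compares them. It assumes the log line has exactly the shape shown in Victor's output; the helper name `performance_times` is invented for this example, and the three log strings are shortened to just the relevant line.

```python
import re

# Matches the summary line Csound prints at the end of a performance,
# in the form seen in the logs above.
ELAPSED = re.compile(
    r"Elapsed time at end of performance:\s*real:\s*([\d.]+)s,\s*CPU:\s*([\d.]+)s"
)


def performance_times(log_text):
    """Return (real, cpu) seconds from a Csound log, or None if not found."""
    match = ELAPSED.search(log_text)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


# Only the relevant line of each log from the two messages above.
logs = {
    "5.19.02": "Elapsed time at end of performance: real: 5.523s, CPU: 5.506s",
    "6.00rc2": "Elapsed time at end of performance: real: 5.426s, CPU: 5.385s",
    "6.00rc2 -j 2": "Elapsed time at end of performance: real: 3.975s, CPU: 6.673s",
}

times = {name: performance_times(text) for name, text in logs.items()}
single_real = times["6.00rc2"][0]
threaded_real = times["6.00rc2 -j 2"][0]
print(f"-j 2 wall-clock speedup over single thread: {single_real / threaded_real:.2f}x")
```

With the figures in these logs the wall-clock ratio between the single-threaded and `-j 2` runs of 6.00rc2 comes out at about 1.37, while the CPU time rises from 5.385s to 6.673s, which gives a feel for the threading overhead on this test.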
Date | 2013-05-25 20:15 |
From | Michael Gogins |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
Perhaps the architecture or the build flags are different? Regards, Mike On Sat, May 25, 2013 at 2:59 PM, Victor Lazzarini <Victor.Lazzarini@nuim.ie> wrote:
|
Date | 2013-05-25 20:26 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
hi mike, some more info about my machine/build:

Csound version 6.00rc1 (recent git) on osx 10.7 (64-bit kernel)
('USE_DOUBLE:BOOL', 'ON'),
('BUILD_MULTI_CORE:BOOL', 'ON'),
('CMAKE_VERBOSE_MAKEFILE:BOOL', 'ON'),
('NEW_PARSER_DEBUG:BOOL', 'OFF'),
('PORTAUDIO_LIBRARY:FILEPATH', '/Users/ben/lib/libportaudio.dylib'),
('PORTAUDIO_HEADER:FILEPATH', '/Users/ben/include/portaudio.h'),
libsndfile-1.0.25

-- ben
On Sat, May 25, 2013 at 9:15 PM, Michael Gogins <michael.gogins@gmail.com> wrote:
|
Date | 2013-05-25 20:33 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
Try updating to the latest GIT, or even downloading the new binaries, we're in RC2 now. Victor On 25 May 2013, at 20:26, Ben Hackbarth wrote:
|
Date | 2013-05-25 20:47 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
oops.. thought i pulled the latest. same result though:

Elapsed time at end of performance: real: 4.203s, CPU: 4.203s
Elapsed time at end of performance: real: 7.388s, CPU: 7.388s

-- ben
On Sat, May 25, 2013 at 9:33 PM, Victor Lazzarini <Victor.Lazzarini@nuim.ie> wrote:
|
Date | 2013-05-25 20:58 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
By the way, my reported numbers were with the released binaries. Maybe you could try installing that. On 25 May 2013, at 20:47, Ben Hackbarth wrote:
|
Date | 2013-05-25 21:01 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
I can only think that your build is not using any gcc optimisations. Could you look into the command lines to see if there is a -O3 there? Like this (see the -O3 there):

/Users/victor/bin/gcc -DCsoundLib64_EXPORTS -DUSE_DOUBLE -DCS_DEFAULT_PLUGINDIR=\"/Users/victor/Library/Frameworks/CsoundLib64.framework/Versions/6.0/Resources/Opcodes64\" -D_CSOUND_RELEASE_ -DUSE_LRINT -DMACOSX -DPIPES -DNO_FLTK_THREADS -DHAVE_SOCKETS -DHAVE_STRTOK_R -DHAVE_STRTOD_L -W -Wall -O3 -Wno-missing-field-initializers -Wno-unused-parameter -ftree-vectorize -ffast-math -g -fPIC -I/usr/local/include -I/Users/victor/src/csound6/./H -I/Users/victor/src/csound6/./include -I/Users/victor/src/csound6/./Engine -I/Users/victor/src/csound6/. -I/Users/victor/src/csound6/debug -Wno-format -g -D__BUILDING_LIBCSOUND -DPARCS -DHAVE_DIRENT_H -DHAVE_FCNTL_H -DHAVE_UNISTD_H -DHAVE_STDINT_H -DHAVE_SYS_TIME_H -DHAVE_SYS_TYPES_H -DHAVE_TERMIOS_H -fvisibility=hidden -ffast-math -mfpmath=sse -fomit-frame-pointer -o CMakeFiles/CsoundLib64.dir/OOps/str_ops.c.o -c /Users/victor/src/csound6/OOps/str_ops.c

On 25 May 2013, at 20:47, Ben Hackbarth wrote:
|
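The check Victor describes, looking for an optimisation flag in the compiler command lines, can be done mechanically. The Python sketch below is only an illustration and not part of the thread: the helper name `optimisation_flags` is invented, and the two sample commands are shortened, hypothetical stand-ins for real build output (the first is loosely based on the command quoted above). It assumes the command lines have been captured from a verbose build, for example with `CMAKE_VERBOSE_MAKEFILE` switched on as in Ben's configuration.

```python
import shlex


def optimisation_flags(command_line):
    """Return the gcc-style -O<level> flags found in a compiler command line."""
    return [token for token in shlex.split(command_line) if token.startswith("-O")]


# Shortened, hypothetical command lines used only for illustration.
optimised = "gcc -DUSE_DOUBLE -W -Wall -O3 -ffast-math -g -fPIC -c str_ops.c"
unoptimised = "gcc -DUSE_DOUBLE -W -Wall -g -fPIC -c str_ops.c"

for label, command in (("optimised", optimised), ("unoptimised", unoptimised)):
    flags = optimisation_flags(command)
    if flags:
        print(f"{label}: found {' '.join(flags)}")
    else:
        print(f"{label}: no -O flag, the compiler default (no optimisation) applies")
```

Note that the flag is a capital letter O followed by the level, so a search for the digits `-03` would miss it.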
Date | 2013-05-25 21:03 |
From | Steven Yi |
Subject | Re: [Cs-dev] csound6 speed |
Hi Ben, What ksmps are you using? steven On Sat, May 25, 2013 at 9:01 PM, Victor Lazzarini |
Date | 2013-05-25 21:18 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
200

On 25 May 2013, at 21:03, Steven Yi wrote:
> Hi Ben,
>
> What ksmps are you using?
>
> steven
>
> On Sat, May 25, 2013 at 9:01 PM, Victor Lazzarini
|
Date | 2013-05-25 21:28 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
looks like you're right victor.. -- ben
On Sat, May 25, 2013 at 10:18 PM, Victor Lazzarini <Victor.Lazzarini@nuim.ie> wrote: 200 |
Date | 2013-05-25 21:38 |
From | Ben Hackbarth |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
looks like i failed to specify -DCMAKE_BUILD_TYPE=Release... thank you victor, things are looking good now.
-- ben
On Sat, May 25, 2013 at 10:28 PM, Ben Hackbarth <hackbarth@gmail.com> wrote:
|
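Since the slowdown came down to a missing `-DCMAKE_BUILD_TYPE=Release`, one way to catch this earlier is to inspect the `CMakeCache.txt` that CMake writes into the build directory. The Python sketch below is an illustration under stated assumptions, not something from the thread: it relies on the standard `KEY:TYPE=VALUE` line format of that file, the function name `read_cmake_cache` is invented, and the sample text is a hypothetical excerpt rather than Ben's actual cache.

```python
def read_cmake_cache(text):
    """Parse KEY:TYPE=VALUE lines of a CMakeCache.txt into a {key: value} dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the two comment styles used in the cache file.
        if not line or line.startswith(("#", "//")):
            continue
        key_and_type, separator, value = line.partition("=")
        if not separator:
            continue
        key = key_and_type.partition(":")[0]
        entries[key] = value
    return entries


# Hypothetical excerpt; a real cache would be read from <build dir>/CMakeCache.txt.
sample = """\
// Choose the type of build
CMAKE_BUILD_TYPE:STRING=
USE_DOUBLE:BOOL=ON
BUILD_MULTI_CORE:BOOL=ON
"""

cache = read_cmake_cache(sample)
if not cache.get("CMAKE_BUILD_TYPE"):
    print("CMAKE_BUILD_TYPE is empty; re-run cmake with -DCMAKE_BUILD_TYPE=Release")
else:
    print("build type:", cache["CMAKE_BUILD_TYPE"])
```

In a stock CMake setup an empty build type means none of the Release flags are added; whether the Csound build files add optimisation flags of their own in that case is not something this sketch checks.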
Date | 2013-05-25 22:50 |
From | Victor Lazzarini |
Subject | Re: [Cs-dev] csound6 speed |
Attachments | None None |
yeh, I know the gcc optimisations would give a speed up of about 2x. I'm glad you found this. On 25 May 2013, at 21:38, Ben Hackbarth wrote:
|