
[Cs-dev] [OT] Block vs. sampling processing

Date: 2013-06-01 16:00
From: Andres Cabrera
Subject: [Cs-dev] [OT] Block vs. sampling processing
Attachments: ticker.zip, performance1.pdf, performance2.pdf, performance3.pdf, performance4.pdf

Hi,

I ran some tests on my new machine comparing block vs. sample processing and got very unexpected results...

Not only is sample processing not slower, but also debug builds can be faster...

It's likely an error on my side, but I can't see it so if anyone has some procrastination inclinations, maybe they can have a look?

I'm attaching the code I used and the results from 100 runs for 3 cases: (1) block processing inline, (2) block processing with a separate function, (3) sample processing.

Let me know if you find anything...

Thanks,
Andrés

Date: 2013-06-01 16:18
From: Steven Yi
Subject: Re: [Cs-dev] [OT] Block vs. sampling processing

Hi Andres,

I'm wondering if the comparison is correct, since for the block
processing you're making an additional function call to
process_function().  It seems the driving code in main.c should
always call a single function, and those functions should process
either one sample or a block of samples.  The process_audio_block
functions then should not make additional calls to other functions.
I think that will be a more apples-to-apples comparison.  My guess at
that point is that the overhead for function calls will start to creep
up for the sample processing example, assuming it doesn't get inlined
by the compiler.  Maybe a way to get around that is to have a
test_runner function that takes in a function pointer, then run the
tests through the function pointers.
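
A minimal sketch of the test_runner idea, not the code from the
attachments. The names process_audio_block and process_audio_sample
come from this thread, but their signatures here, the process_fn
typedef, and the 0.5 gain used as stand-in work are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical common signature: process `n` samples in place. */
typedef void (*process_fn)(float *buf, size_t n);

/* Block variant: one call handles a whole block, loop lives inside. */
static void process_audio_block(float *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        buf[i] *= 0.5f;              /* stand-in for real DSP work */
}

/* Sample variant: same work, intended to be called with n == 1. */
static void process_audio_sample(float *buf, size_t n)
{
    (void)n;                         /* always exactly one sample */
    buf[0] *= 0.5f;
}

/* Drives any process_fn over `total` samples in chunks of `chunk`.
   Returns the number of calls made through the pointer. */
static size_t test_runner(process_fn fn, float *buf,
                          size_t total, size_t chunk)
{
    size_t calls = 0;
    assert(fn != NULL && chunk > 0 && total % chunk == 0);
    for (size_t pos = 0; pos < total; pos += chunk) {
        fn(buf + pos, chunk);
        calls++;
    }
    return calls;
}
```

Both variants go through the same driver, so the only difference left
between the tests is the chunk size: test_runner(process_audio_block,
buf, total, ksmps) versus test_runner(process_audio_sample, buf, total,
1).  Timing would be wrapped around each test_runner call.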

steven

_______________________________________________
Csound-devel mailing list
Csound-devel@lists.sourceforge.net

Date: 2013-06-05 21:29
From: Andres Cabrera
Subject: Re: [Cs-dev] [OT] Block vs. sampling processing

Hi Steven,

The goal of wrapping every method in a function is to simulate the overhead of a processing callback.

What surprised me about the results was that sample processing was consistently faster on my system than block processing, and even stranger, the debug build was faster...

Cheers,
Andrés



Date: 2013-06-06 00:11
From: Steven Yi
Subject: Re: [Cs-dev] [OT] Block vs. sampling processing
Attachments: performance.pdf, main.c

Hi Andres,

I understand the intention but I think something is off.  I spent some
time modifying the test, this time using function pointers.  I'm not
sure whether this counts as a debug build; I built without any -O
flags.  However, test3 in this case does come out noticeably slower.
I think this might be a more accurate test, because in a real system
there will at some point be a function pointer used for the different
audio processing functions.  My guess is that in your tests the
process_audio_sample() function got inlined.

As far as I understand, with sample-by-sample processing you don't
incur the overhead of the if-check in the loop, but you do incur the
overhead of one function call per sample.  I think that's why, if you
run a block-based system with ksmps=1, it should come out slower than
a sample-by-sample design where there's no for-loop around the audio
function.  On the other hand, with larger ksmps, the overhead of the
if-check becomes greatly outweighed by the time spent on function calls.
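
The reasoning above can be written down as a rough cost model.  This
is only a sketch: block_overhead, sample_overhead, call_cost and
loop_cost are hypothetical names, and the costs are abstract units
rather than measurements from any of the attached tests.

```c
#include <assert.h>

/* Overhead of a block-based design for n samples: one function call
   per block of ksmps samples, plus one loop check per sample. */
static double block_overhead(long n, long ksmps,
                             double call_cost, double loop_cost)
{
    assert(ksmps > 0 && n % ksmps == 0);
    return (double)(n / ksmps) * call_cost + (double)n * loop_cost;
}

/* Overhead of a sample-by-sample design: one function call per
   sample and no inner loop. */
static double sample_overhead(long n, double call_cost)
{
    return (double)n * call_cost;
}
```

With ksmps=1 the block design pays for the call and the loop check on
every sample, so it cannot beat the per-sample design under this
model.  As ksmps grows, the call term shrinks by a factor of ksmps
while the loop term stays the same, so the block design wins whenever
a call costs more than a loop check.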

I tried just now with the main.c attached built with -O3 and got
closer results, but still test3 came out slower.

I'm not sure if my own assumptions for the test are off though.  Could
you try this main.c and see what you get?

Thanks!
steven





Date: 2013-06-06 00:36
From: Victor Lazzarini
Subject: Re: [Cs-dev] [OT] Block vs. sampling processing

I think you're right about the inlining. I built Andres's original code and ran it under gdb, setting a breakpoint in the process_audio_sample() function:

(gdb) break main.c:105
Breakpoint 5 at 0x1000007d4: file main.c, line 105.
Breakpoint 6 (inlined main.c:105) at 0x1000008e8: file main.c, line 105.
Breakpoint 7 (inlined main.c:105) at 0x100000b3e: file main.c, line 105.
warning: Multiple breakpoints were set.
Use the "delete" command to delete unwanted breakpoints.

That would make the whole test meaningless.
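
One possible way to keep the per-sample call from being inlined is
sketched below.  It assumes GCC or Clang for the noinline attribute;
the NOINLINE macro, call_count, process_ptr and run_samples are
hypothetical names, and the signature of process_audio_sample is a
guess, not the one in the attached main.c.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute syntax. On other compilers the
   macro expands to nothing and only the volatile pointer helps. */
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

static unsigned long call_count = 0;   /* counts real calls made */

/* Per-sample function that the compiler is asked not to inline. */
static NOINLINE float process_audio_sample(float in)
{
    call_count++;
    return in * 0.5f;                   /* stand-in for real DSP work */
}

/* A volatile function pointer must be reloaded on every use, so the
   compiler cannot resolve the call target at compile time. */
static float (*volatile process_ptr)(float) = process_audio_sample;

/* Sample-by-sample driver: one indirect call per sample. */
static void run_samples(const float *in, float *out, unsigned long n)
{
    for (unsigned long i = 0; i < n; i++)
        out[i] = process_ptr(in[i]);
}
```

Repeating the gdb check above on a build like this would show whether
the breakpoint still resolves to inlined locations.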

Victor




Dr Victor Lazzarini
Senior Lecturer
Dept. of Music
NUI Maynooth Ireland
tel.: +353 1 708 3545
Victor dot Lazzarini AT nuim dot ie



