
[Csnd-dev] CMake and simd

Date: 2021-07-11 11:55
From: Eduardo Moguillansky
Subject: [Csnd-dev] CMake and simd
Hi,

I just sent a PR which makes sumarray for audio arrays friendlier to
SIMD compilation. It copies SuperCollider's approach, where audio
signals are summed in groups of 4. With properly configured flags this
results in very efficient horizontal SIMD instructions. On my system
(Linux, Sandy Bridge), with a properly configured CMake build, I see
execution up to 9x faster, which is a great result for such a small
modification.

cheers,

eduardo
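
For readers following the thread, here is a minimal sketch in C of the groups-of-4 summation described in the message above. It is not the code from the PR: the function name sum_signals_by_4, its signature, and the use of double samples are assumptions made for illustration only. The idea is that each pass adds four input signals into the output, so the inner loop is a simple element-wise sum that a vectorizing compiler can map onto packed instructions.

```c
#include <stddef.h>

/* Hypothetical helper, not Csound's actual code: mixes `count` audio
 * signals of `nsmps` samples each into `out`, four signals per pass.
 * `out` must not overlap any of the inputs. */
void sum_signals_by_4(double *restrict out, const double *const *in,
                      size_t count, size_t nsmps)
{
    size_t i, n;

    /* Start from silence. */
    for (n = 0; n < nsmps; n++)
        out[n] = 0.0;

    /* Main loop: take four input signals at a time. The inner loop has
     * no dependency between iterations, so a compiler with
     * vectorization enabled can use packed loads and adds here. */
    for (i = 0; i + 4 <= count; i += 4) {
        const double *a = in[i],     *b = in[i + 1];
        const double *c = in[i + 2], *d = in[i + 3];
        for (n = 0; n < nsmps; n++)
            out[n] += a[n] + b[n] + c[n] + d[n];
    }

    /* Tail: the remaining 0 to 3 signals are added one at a time. */
    for (; i < count; i++) {
        const double *a = in[i];
        for (n = 0; n < nsmps; n++)
            out[n] += a[n];
    }
}
```

Whether the inner loop is actually vectorized depends on the compiler and its flags (for example GCC or Clang at -O3 with a suitable -march setting); those flags are given as an illustration and are not taken from the thread.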

Date: 2021-07-11 12:10
From: Victor Lazzarini
Subject: Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CMake and simd
Thanks for this. I think I have the right flags for SIMD on my macOS build that I use for releases (I played with these a few years back), but I will check.

Not sure if there’s any gain to be had, but if you have the chance, could you look at the output mixing functions (aops.c) to see if your solution for arrays can also be applied there?

best
========================
Prof. Victor Lazzarini
Maynooth University
Ireland

Date: 2021-07-11 21:54
From: Eduardo Moguillansky
Subject: Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CMake and simd
I took a brief look but I didn't see clear candidates for this in
aops.c. I did apply this to the "sum" opcode, which resulted in the
same kind of speedup when summing 4 or more signals.

As a side note, to check whether the optimizations are taking effect on
macOS, the disassembled code should contain instructions like "movupd"
and "addpd".

Date: 2021-07-11 22:12
From: Victor Lazzarini
Subject: Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CMake and simd
Thanks. These instructions are already present in the disassembled CsoundLib64, so I expect that your code will use the optimisations (I have not built it yet).
========================
Prof. Victor Lazzarini
Maynooth University
Ireland
