[Csnd-dev] CMake and simd

Date	2021-07-11 11:55
From	Eduardo Moguillansky
Subject	[Csnd-dev] CMake and simd
	Hi, I just sent a PR which makes sumarray for audio arrays friendlier to SIMD compilation. It copies SuperCollider's approach, where audio signals are summed in groups of 4. With properly configured flags this results in very efficient horizontal SIMD instructions. On my system (Linux, Sandy Bridge), with a properly configured CMake build, I see execution up to 9x faster, which is a great result for such a small modification. cheers, eduardo
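
[Editor's sketch] The "groups of 4" idea described above can be illustrated roughly as follows. This is not the code from the PR: the function name sum_signals_by_4, its signature, and the use of plain double buffers are assumptions made for illustration only. The point is that the inner loop adds four input signals to the output in one pass, which gives an optimising compiler (with suitable optimisation and architecture flags, whichever the build uses) a simple pattern it can turn into packed adds.

```c
#include <stddef.h>

/* Hypothetical helper, NOT the Csound implementation: mixes n_sigs audio
   signals of nsmps samples each into out, taking four signals per pass. */
static void sum_signals_by_4(double *restrict out,
                             const double *const *sigs,
                             size_t n_sigs, size_t nsmps)
{
    size_t s = 0, i;

    /* Start from silence. */
    for (i = 0; i < nsmps; i++)
        out[i] = 0.0;

    /* Main loop: four signals at a time, one add chain per sample. */
    for (; s + 4 <= n_sigs; s += 4) {
        const double *a = sigs[s];
        const double *b = sigs[s + 1];
        const double *c = sigs[s + 2];
        const double *d = sigs[s + 3];
        for (i = 0; i < nsmps; i++)
            out[i] += a[i] + b[i] + c[i] + d[i];
    }

    /* Remainder: the last 0 to 3 signals, one at a time. */
    for (; s < n_sigs; s++) {
        const double *a = sigs[s];
        for (i = 0; i < nsmps; i++)
            out[i] += a[i];
    }
}
```

Whether the compiler actually vectorises these loops depends on the flags and target; the disassembly is the only reliable check.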

Date	2021-07-11 12:10
From	Victor Lazzarini
Subject	Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CMake and simd
	Thanks for this. I think I have the right flags for SIMD on my MacOS build that I use for releases (I’ve played with these a few years back), but I will check. Not sure if there’s any gain to be had, but if you have the chance, could you look at the output mixing functions (aops.c) to see if your solutions to arrays can also be applied there. best ======================== Prof. Victor Lazzarini Maynooth University Ireland

Date	2021-07-11 21:54
From	Eduardo Moguillansky
Subject	Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CMake and simd
	I took a brief look but I didn't see clear candidates for this in aops.c. I did apply this to the "sum" opcode, which resulted in the same kind of speed up when summing 4 or more signals. As a side note, to check if the optimizations are taking effect in macos, the disassembled code should have instructions like "movupd" and "addpd".

Date	2021-07-11 22:12
From	Victor Lazzarini
Subject	Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CMake and simd
	Thanks. These instructions are already present in the disassembled CsoundLib64, so I expect that your code will use the optimisations (I have not built it yet). ======================== Prof. Victor Lazzarini Maynooth University Ireland