
[Csnd] Ryzen vs M1

Date2024-01-13 19:14
FromTobiah
Subject[Csnd] Ryzen vs M1
I decided to test csound speed on a new (to me) Thinkpad.  I was surprised
that it greatly outperformed an M1 Macbook air.  The Thinkpad has an
AMD Ryzen 7 PRO 6850U.

Here are two .csd files I used for testing:

https://drive.google.com/drive/folders/1f3y1vDk-q_QQclv--_AYRkrd0BOTQP6t

bench.csd

         Thinkpad        9.8 seconds
         M1              38.8 seconds

bench2.csd
         Thinkpad        8.0  seconds
         M1              21.0 seconds

I was surprised that the Ryzen 7 compared so favorably to the
M1, given all the hype about M1 performance.  Is it just weak
at floating point, or perhaps they throttle the cpu since the
Macbook Air has no fans?

I was also surprised that when comparing bench.csd, the macbook
took almost 4 times as long, whereas with bench2.csd, it took a
little over 2.5 times as long.  Maybe the score sorting needed
for bench.csd took longer?

What numbers are you getting with your machine?


Another curiosity that I thought the list might be able to
help with, was that I don't seem to gain any speed by running
multiple csound instances in parallel.   I tried this:

         for each in `seq 1 8`; do
                 csound bench.csd &
         done

I see 8 cpu's churning, but it takes about the same
amount of time as it would have had I run the processes
consecutively.  The Ryzen is an 8-core 16-thread cpu.
I also tried running these concurrent processes in different
directories, in case the score.srt was getting tied up, but
there was no difference.
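
A minimal sketch of how the sequential and concurrent cases could be timed against each other. Note that a bare loop of `csound ... &` returns at once, so the measurement has to include a `wait`. The `BENCH_CMD` variable and the function names are hypothetical, introduced only for illustration; the `-n -d` options are the ones mentioned elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: compare N sequential runs against N concurrent runs.
# BENCH_CMD is a hypothetical override; the default assumes bench.csd
# is in the current directory and csound is on the PATH.
BENCH_CMD=${BENCH_CMD:-"csound -n -d bench.csd"}

run_sequential() {
    # $1 = number of runs, executed one after another
    i=0
    while [ "$i" -lt "$1" ]; do
        $BENCH_CMD >/dev/null 2>&1
        i=$((i + 1))
    done
}

run_concurrent() {
    # $1 = number of runs, all started in the background
    i=0
    while [ "$i" -lt "$1" ]; do
        $BENCH_CMD >/dev/null 2>&1 &
        i=$((i + 1))
    done
    wait    # block until every background job has exited
}

elapsed() {
    # Run the given function call and print the whole seconds it took.
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo $((end - start))
}
```

Possible usage: `echo "sequential: $(elapsed run_sequential 8) s"` followed by `echo "concurrent: $(elapsed run_concurrent 8) s"`. The resolution is one second, which is coarse but enough for runs of ten seconds or more.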


Thanks,


Toby

Csound mailing list
Csound@listserv.heanet.ie
https://listserv.heanet.ie/cgi-bin/wa?A0=CSOUND
Send bugs reports to
        https://github.com/csound/csound/issues
Discussions of bugs and features can be posted here

Date2024-01-13 21:29
From"Peter P."
SubjectRe: [Csnd] Ryzen vs M1

Date2024-01-14 00:41
FromVictor Lazzarini
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
 Csound can run in parallel with the -j flag. Whether it runs faster or slower depends on the csd, ksmps, and processor.
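
A rough sketch of how one might check this for a particular CSD: render the same file once per thread count and compare the timings. The `CSOUND` variable, the function name, and the output format are hypothetical; `-j` is the thread-count flag referred to above, written in the attached form (`-j4`) used later in this thread.

```shell
#!/bin/sh
# Sketch only: time one render per thread count.
# CSOUND is a hypothetical override for the csound binary.
CSOUND=${CSOUND:-csound}

time_threads() {
    # $1 = CSD file, remaining arguments = thread counts to try
    csd=$1
    shift
    for j in "$@"; do
        start=$(date +%s)
        # -n: no sound output, -d: no displays, -jN: N threads
        "$CSOUND" -n -d -j"$j" "$csd" >/dev/null 2>&1
        end=$(date +%s)
        echo "threads=$j seconds=$((end - start))"
    done
}
```

For example, `time_threads bench.csd 1 2 4 8` prints one line per thread count. Whether any of them beats `-j1` depends on the CSD, ksmps, and processor, as noted above.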

Prof. Victor Lazzarini
Maynooth University
Ireland

> On 13 Jan 2024, at 21:29, Peter P.  wrote:
>
> Hi,
>
> * Tobiah  [2024-01-13 20:14]:
>> I decided to test csound speed on a new (to me) Thinkpad.  I was surprised
>> that it greatly outperformed an M1 Macbook air.  The Thinkpad has an
>> AMD Ryzen 7 PRO 6850U.
>>
>> Here are two .csd files I used for testing:
>>
>> https://drive.google.com/drive/folders/1f3y1vDk-q_QQclv--_AYRkrd0BOTQP6t
>>
>> bench.csd
>>
>>        Thinkpad        9.8 seconds
>>        M1              38.8 seconds
>>
>> bench2.csd
>>        Thinkpad        8.0  seconds
>>        M1              21.0 seconds
>>
>
> How much RAM do these machines have?
>
> [...]
>> What numbers are you getting with your machine?
>
> How are you running these tests? I tried your benchmark csd's on my
> everyday Thinkpad X230 from 2012 with csound using one of the four CPU
> cores which qualify as:
>
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 58
> model name      : Intel(R) Core(TM) i5-3320M CPU @ 2.60GHz
> stepping        : 9
> microcode       : 0x21
> cpu MHz         : 1197.224
> cache size      : 3072 KB
>
> by executing
> time csound -d -n bench2.csd 2> /dev/null
> avoiding writing soundfile to disk and not posting any messages to the
> console.
>
> bench.csd
>        real    1m24.532s
>        user    1m23.433s
>
> bench2.csd takes
>        real    0m32.224s
>        user    0m31.875s
>
> This doesn't help much in optimizing M1 performance, but shows that
> 12-year-old computers can still do a good job.
>
>> Another curiosity that I thought the list might be able to
>> help with, was that I don't seem to gain any speed by running
>> multiple csound instances in parallel.   I tried this:
>>
>>        for each in `seq 1 8`; do
>>                csound bench.csd &
>>        done
> Yes, you are compiling bench.csd eight times yielding identical results.
>
> I don't think csound is capable of running parallel jobs on SMP. You'd
> have to split your composition into individual csd's and compile them
> at the same time, possibly using such a great tool as GNU parallel:
>        parallel csound *.csd
>
>> I see 8 cpu's churning, but it takes about the same
>> amount of time as it would have had I run the processes
>> consecutively.  The Ryzen is an 8-core 16-thread cpu.
>> I also tried running these concurrent processes in different
>> directories, in case the score.srt was getting tied up, but
>> there was no difference.
>

Date2024-01-14 02:04
FromTobiah
SubjectRe: [Csnd] Ryzen vs M1
> How much RAM do these machines have?

Well, the Thinkpad has 24 gig (32 - 8 for video) and the M1 has only 8,
but the csound processes appear to take half a gig or so.

> How are you running these tests?
> by executing
> time csound -d -n bench2.csd

I was using -n -d, but then I put the orc and sco in
a .csd, and forgot the CsOptions tag.  I put amended
copies up.

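>> Another curiosity that I thought the list might be able to
>> help with, was that I don't seem to gain any speed by running
>> multiple csound instances in parallel.   I tried this: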
>>
>>          for each in `seq 1 8`; do
>>                  csound bench.csd &
>>          done
> Yes, you are compiling bench.csd eight times yielding identical results.
> 
> I don't think csound is capable of running parallel jobs on SMP. You'd
> have to split your composition into individual csd's and compile them
> at the same time, possibly using such a great tool as GNU parallel:
> 	parallel csound *.csd

I didn't expect it to get through a single performance faster.  I expected
eight copies of the same performance, run in parallel, to finish in less
than 8 times the single-run time.  I can see 8 CPUs crunching at 100%, but
it still takes 8 times as long as one performance.


Toby


Date2024-01-14 02:13
FromTobiah
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
On 1/13/24 16:41, Victor Lazzarini wrote:
>   Csound can run in parallel with the -j flag. Whether it runs faster or slower depends on the csd, ksmps, and processor.

In this case it more than doubled the execution time.

Still, if I run them in the background:

csound bench.csd &
csound bench.csd &
csound bench.csd &
...


Shouldn't I do better than when I execute them sequentially?
All the processors show 100% usage, so I'd expect to make
faster progress through the jobs, but it's not so.


Toby


Date2024-01-14 10:09
FromVictor Lazzarini
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
Not necessarily, it depends on the CSD and ksmps.

Prof. Victor Lazzarini
Maynooth University
Ireland


Date2024-01-14 11:14
FromJohn ff
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
Are you getting conflicts with writing to the same output? 

==John ffitch


Date2024-01-14 11:25
FromToby Shepard
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
I'm using -n for no sample output, and I tried running the jobs in separate directories.  Also, the CPUs all show 100% use (in htop) for the duration, so it doesn't seem like blocking.


Date2024-01-14 19:12
FromVictor Lazzarini
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
Because there is inter-thread communication and synchronisation, and that has a cost. So it comes down to the ratio between computation and communication: if more time is spent on the latter, the gains of computing in parallel may be lost.

Prof. Victor Lazzarini
Maynooth University
Ireland

> On 14 Jan 2024, at 17:32, Tobiah  wrote:
>
> On 1/14/2024 2:09 AM, Victor Lazzarini wrote:
>> Not necessarily, it depends on the CSD and ksmps.
>
> Why would that be?  I would have thought that whatever
> the process, running simultaneous instances on different
> CPU's would be faster than running them consecutively.
> Otherwise, why do we have multiple cores?
>
>>> Still, if I run them in the background:
>>>
>>> csound bench.csd &
>>> csound bench.csd &
>>> csound bench.csd &
>>> ...
>

Date2024-01-14 19:53
Fromjohn
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
You did not count the code for deciding when it is safe to run in parallel


Date2024-01-14 19:54
FromMichael Gogins
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
What Victor says is correct. In addition, there will be less synchronization and less setup by the multithreading infrastructure if there are multiple copies of one instr template instead of one copy each of multiple instr templates, and also if notes or other events are longer rather than shorter. So a .csd with multiple longer notes played by fewer instr definitions will speed up quite a bit with -j, whereas a .csd with shorter notes played by more instr definitions will not speed up as much, or may even run more slowly than a single thread. You have to experiment to see what works.

Irreducible Productions
http://michaelgogins.tumblr.com
Michael dot Gogins at gmail dot com



Date2024-01-14 20:43
FromTobiah
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
But you guys understand that I'm talking about running multiple
instances of csound in separate processes, right?

   csound bench.csd &
   csound bench.csd &
   csound bench.csd &
   ...

It seems to me that it shouldn't matter what the content of
the background process is.  It should run faster concurrently
than it does consecutively if multiple cores have any use at all.




Date2024-01-14 20:51
FromMichael Gogins
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
Yes, of course, if the three processes are not communicating with each other. 


Date2024-01-14 22:03
FromVictor Lazzarini
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
I thought you were talking about the -j flag.

Now what you'd expect is that if each process runs on a separate processor, then they should run faster than three sequential processes.

However, I am not sure it is a given that the OS will always do that allocation. All that is guaranteed, I think, is that they will be given slices of a processor's time, but not where these will be placed. Intuitively, you'd think they'd run on separate cores, but since they are just another set of processes alongside all the others your computer is running, I am not certain this is always the case.

I actually don't know the answer.

You could try GNU parallel on Linux; I think it gives you more control over the processes.
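
One way to take the scheduler's placement out of the picture on Linux is to pin each instance to its own logical CPU. The sketch below assumes the `taskset` utility from util-linux is installed; the function name `pin_runs` is hypothetical, and nothing equivalent is assumed to exist on macOS.

```shell
#!/bin/sh
# Sketch only (Linux): start N copies of a command, each pinned to one
# logical CPU with taskset, then wait for all of them to finish.
pin_runs() {
    # $1 = number of instances, remaining arguments = command to run
    n=$1
    shift
    cpu=0
    while [ "$cpu" -lt "$n" ]; do
        # taskset -c restricts the process to the listed logical CPU
        taskset -c "$cpu" "$@" >/dev/null 2>&1 &
        echo "started instance on CPU $cpu"
        cpu=$((cpu + 1))
    done
    wait    # return only when every pinned job has exited
}
```

For example, `time pin_runs 8 csound -n -d bench.csd`. On a CPU with two hardware threads per core, consecutive logical CPU numbers may share one physical core, so it is worth checking the layout with `lscpu -e` before drawing conclusions from the timings.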

Prof. Victor Lazzarini
Maynooth University
Ireland


Date2024-01-15 01:31
FromTobiah
SubjectRe: [Csnd] [EXTERNAL] Re: [Csnd] Ryzen vs M1
On 1/14/2024 2:09 AM, Victor Lazzarini wrote:
> Not necessarily, it depends on the CSD and ksmps.

Why would that be?  I would have thought that whatever
the process, running simultaneous instances on different
CPU's would be faster than running them consecutively.
Otherwise, why do we have multiple cores?

>> Still, if I run them in the background:
>>
>> csound bench.csd &
>> csound bench.csd &
>> csound bench.csd &
>> ...


Date2024-01-15 10:47
FromOeyvind Brandtsegg
SubjectRe: [Csnd] Ryzen vs M1
Hi Toby,

It is an interesting result, comparing the M1 to the Ryzen.
Running it here on an Intel i7 (12th gen at 2300 MHz with 14 cores) I get
bench.csd : 25 sec
bench2.csd: 10 sec
So, something in between your two results for both cases.

Now, if I try to use the -j flag, it always seems to slow down by quite a lot. I tried -j4 and -j12, both running significantly slower.
But it is also noteworthy that several cores are active even if I do not use the -j flag.
Perhaps the OS is now distributing the processing load "under the hood"? That would also mean that one cannot compare two processors running different OSes (which you do, if I understand correctly?)

Another thing I noted is that Csound seems to take some more time "cleaning up" when using the -j flag. I can see the cores that were busy with the processing fall back to zero, then 3-4 other cores ramp up and stay that way for some seconds before Csound returns. This is all just eyeballing the CPU meters, nothing really precise.

My takeaway is that multicore load balancing is a complex issue (as Victor and Michael also noted), but it is interesting to see that the M processors are not all the magic they promised when they came out ;-)

all best
Øyvind


Date2024-01-15 14:18
FromSteven Yi
SubjectRe: [Csnd] Ryzen vs M1
This is interesting. I think we need to worry more about diagnosing the single-instance vs. multi-instance behaviour. Could you tell us what version of Csound you were using? Also, just to verify: is the macOS version you are using compiled for Arm?


Date2024-01-15 15:38
FromVictor Lazzarini
SubjectRe: [Csnd] [EXTERNAL] [Csnd] Ryzen vs M1
bench.csd on my M1 ran for 26s
bench2.csd: 17s.

Running three processes of bench.csd on the same terminal with & all completed within 36s (35, 35 and 36), so yes, they
seem to run in parallel a lot faster than sequentially (three runs back to back would take about 3 x 26 = 78s), although
each one is slower than a single process on its own.
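
A minimal sketch of how the sequential and parallel cases could be timed side by side, assuming a POSIX shell. The function names (run_sequential, run_parallel, elapsed) are illustrative and not part of Csound. The `wait` makes the timer cover every background job rather than just the launch of the loop.

```shell
#!/bin/sh
# Hypothetical helpers for comparing back-to-back and concurrent runs.

# usage: run_sequential N command [args...]
# Runs the command N times, one after another.
run_sequential() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
}

# usage: run_parallel N command [args...]
# Starts N copies in the background, then blocks until all have exited.
run_parallel() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1 &
        i=$((i + 1))
    done
    wait
}

# usage: elapsed function-or-command [args...]
# Prints the wall-clock time taken, in whole seconds.
elapsed() {
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo $((end - start))
}

# Example, using the benchmark file from this thread:
#   elapsed run_sequential 3 csound bench.csd
#   elapsed run_parallel   3 csound bench.csd
```

If the second figure comes out close to the first, the instances are not really overlapping; if it is close to the single-run time, they are.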

This is csound 6.18 as released by us.
========================
Prof. Victor Lazzarini
Maynooth University
Ireland

> On 15 Jan 2024, at 10:47, Oeyvind Brandtsegg  wrote:
> 
> Hi Toby, 
> 
> It is an interesting result, comparing the M to the Ryzen.
> Running it here on an intel i7 (12th gen at 2300 MHz with 14 cores) I get
> bench.csd : 25 sec
> bench2.csd: 10 sec
> So, something in between your two results for both cases.
> 
> Now, if I try to use the -j flag, it seems to always slow down by quite a lot. I tried -j4 and -j12, both running significantly slower.
> But also noteworthy is that several cores are active even if I do not use the -j flag. 
> Perhaps the OS is now distributing the processing load "under the hood"? ... which would also mean that one cannot compare two processors running on different OSes (which you do, if I understand correctly?)
> 
> Another thing I noted is that Csound seems to take some more time "cleaning up" when using the -j flag. I can see the cores that were busy with the processing fall back to zero, then 3-4 other cores ramp up, and it stays like that for some seconds before Csound returns. This is all just eyeballing the CPU meters, nothing really precise.
> 
> My takeaway is that multicore load balancing is a complex issue (as also Victor and Michael noted), but it is interesting to see that the M processors are not all the magic they promised when they came out ;-)
> 
> all best
> Øyvind
> 

Date2024-01-15 16:15
FromVictor Lazzarini
SubjectRe: [Csnd] [EXTERNAL] [Csnd] Ryzen vs M1
I also noted a couple of things:

These files have ’ksms = 1’, did you want to test ksmps=1? As written, it’s running at ksmps=10 (the default).

The bench2.csd time seems to be dominated by instantiation, since it only runs for 1s and has to instantiate
10000 instruments. Running it for 1s takes 17s; running it for 2, 3 and 4s we get process times of 25, 33 and 41s, which
give cpu/realtime ratios of 17, 12.5, 11 and 10.25 for 1 to 4s (so it gets better). For 10s we get 88s of running, so
this goes down further to 8.8.
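
The ratios can be checked with a few lines of shell; this is only a sketch of the arithmetic, and the helper name `ratio` is made up for the example. The first four timings also rise by 8s for each extra second of score, which would be consistent with a fixed start-up cost of roughly 9s, though that is a reading of these few numbers rather than a measurement.

```shell
#!/bin/sh
# usage: ratio process_seconds score_seconds
# Prints process time divided by score duration (cpu/realtime ratio).
ratio() {
    awk -v p="$1" -v d="$2" 'BEGIN { printf "%g\n", p / d }'
}

# process_time:score_duration pairs quoted in this message
for pair in 17:1 25:2 33:3 41:4 88:10; do
    p=${pair%%:*}   # part before the colon
    d=${pair##*:}   # part after the colon
    printf '%2d s score: %2d s process time, ratio %s\n' "$d" "$p" "$(ratio "$p" "$d")"
done
```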

This is again with the released csound 6.18 (csound.com ).
========================
Prof. Victor Lazzarini
Maynooth University
Ireland
