[Csnd-dev] CUDA opcodes (again)
Date | 2022-10-29 15:31 |
From | Anders Genell |
Subject | [Csnd-dev] CUDA opcodes (again) |
Hi! I just got a new laptop at work which has a T500 nvidia graphics card, and with some effort I managed to install all the cuda bells and whistles for it. Now I am trying to build the csound cuda opcodes, but I seem to run into problems. cmake finds the cuda installation all right, but there seems to be some trouble with the CUDA/CMakeLists.txt file. I get the following error:
CMake Error at CUDA/CMakeLists.txt:20 (set_target_properties):
  set_target_properties called with incorrect number of arguments.
Call Stack (most recent call first):
  CUDA/CMakeLists.txt:65 (make_cuda_plugin)
This is repeated for every call to 'make_cuda_plugin' in the CMakeLists.txt file. Is there something I have done wrong? Best regards, Anders |
Date | 2022-10-29 17:35 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
I guess the CMakeLists.txt might need to be fixed for the version of CMake you have. No wonder, it's about 10 years old.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 29 Oct 2022, at 15:33, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-29 17:45 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Well, one shouldn’t rush into these things… I’m sure another few years should only make it more stable and mature. On 29 Oct 2022, at 18:35, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-29 18:05 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
10 years of code rot, soon it will be mature like blue cheese. Prof. Victor Lazzarini
Maynooth University
Ireland
On 29 Oct 2022, at 17:46, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-29 18:21 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
A nice glass of chianti and some well-fermented vintage code is something of an acquired taste, but for the initiated it is among the most exquisite. On Sat, Oct 29, 2022 at 7:05 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-29 19:08 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
So, according to an answer here: https://itecnote.com/tecnote/cmake-complains-about-wrong-number-of-arguments/ variables sometimes resolve to empty strings, which cmake then sees as missing arguments. Adding quotes around the variables seems to have done the trick, from
set_target_properties(${libname} PROPERTIES
  RUNTIME_OUTPUT_DIRECTORY ${BUILD_PLUGINS_DIR}
  LIBRARY_OUTPUT_DIRECTORY ${BUILD_PLUGINS_DIR}
  ARCHIVE_OUTPUT_DIRECTORY ${BUILD_PLUGINS_DIR})
to
set_target_properties(${libname} PROPERTIES
  RUNTIME_OUTPUT_DIRECTORY "${BUILD_PLUGINS_DIR}"
  LIBRARY_OUTPUT_DIRECTORY "${BUILD_PLUGINS_DIR}"
  ARCHIVE_OUTPUT_DIRECTORY "${BUILD_PLUGINS_DIR}")
Now for the actual compiling...
Regards, /A
On Sat, Oct 29, 2022 at 7:21 PM Anders Genell <anders.genell@gmail.com> wrote:
|
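The quoting fix works because an unquoted variable reference that expands to an empty string contributes no argument at all, so the property/value pairs stop lining up. The same behaviour is easy to reproduce in a POSIX shell, which is used here only as an analogy and says nothing about the Csound build itself. The function name count_args is hypothetical, and BUILD_PLUGINS_DIR is simply assumed to be empty.

```sh
#!/bin/sh
# Analogy only: like CMake, the shell drops an unquoted variable that
# expands to nothing, so the callee receives one argument fewer.

count_args() {
    # Print how many arguments were actually received.
    echo "$#"
}

BUILD_PLUGINS_DIR=""   # assumption: the variable was never given a value

# Unquoted: the empty value vanishes, leaving only the property name.
count_args RUNTIME_OUTPUT_DIRECTORY $BUILD_PLUGINS_DIR

# Quoted: an empty string is still passed as a second argument.
count_args RUNTIME_OUTPUT_DIRECTORY "$BUILD_PLUGINS_DIR"
```

If quoting alone silences the error, it is likely that BUILD_PLUGINS_DIR was empty at that point in the CMake run, which may be worth checking separately.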
Date | 2022-10-29 19:28 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
I predict more code rot ahead. Prof. Victor Lazzarini
Maynooth University
Ireland
On 29 Oct 2022, at 19:10, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-29 19:35 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
No worries, I already have a bottle of cabernet in the cooler. I was out of chianti, it seems. On 29 Oct 2022, at 20:28, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-30 15:07 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Quick update - I updated the CMakeLists.txt to point specifically to the csound include directory in order to get rid of an error about not finding .h files, and then it seems to have built the opcodes all right. However, I cannot get csound to find them. I set OPCODE6DIR as well as OPCODE6DIR64 to point to where the plugins are, but csound does not find them, and specifying --opcode-lib on the command line didn’t make any difference either. I shall make sure I have built the opcodes against the correct version of csound, but I don’t think I have more than one installed… Any ideas are greatly appreciated. Regards, Anders On 29 Oct 2022, at 20:28, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-30 18:16 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
A couple of things:
1) opcodes do not link to Csound. You only need the headers to build them.
2) with --opcode-lib= you need the full path so if the library is in the same directory you have
--opcode-lib=./library.so
etc
That one should make Csound acknowledge the library has been found.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 30 Oct 2022, at 15:09, Anders Genell <anders.genell@gmail.com> wrote:
|
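The second point above, passing a full path to --opcode-lib, can be sketched in shell. This is a minimal illustration, not a tested recipe: opcode_lib_flag is a hypothetical helper, and libcudaop5.so and cudapconv.csd are example file names.

```sh
#!/bin/sh
# Build an --opcode-lib option that carries an absolute path to a plugin.

opcode_lib_flag() {
    # $1: path to the plugin library, relative or absolute
    dir=$(cd "$(dirname "$1")" && pwd) || return 1
    echo "--opcode-lib=${dir}/$(basename "$1")"
}

# Show the option for a library in the current directory.
opcode_lib_flag ./libcudaop5.so

# Hypothetical usage with Csound:
# csound "$(opcode_lib_flag ./libcudaop5.so)" cudapconv.csd
```

Resolving the directory with cd and pwd avoids depending on realpath, which is not available on every system.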
Date | 2022-10-30 18:24 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
1) The headers were found, so all should be fine on that front. 2) I did set the full path to the plugins, but to no avail. It seems, though, that the cuda opcodes are built as static .a libs, if that matters in any way. I will continue to investigate whether there is something I have missed. Regards, /A On Sun, 30 Oct 2022 at 19:16, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-30 18:41 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
No, that sounds wrong. They can't be static libs. There's something not right with the cmake build.
The result should be a .so lib on Linux, a .dylib on MacOS and .dll on Windows. Anything else is wrong.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 30 Oct 2022, at 18:25, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-30 18:42 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Ok, that is not something I think I will manage to figure out… On Sun, 30 Oct 2022 at 19:41, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-30 18:51 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Alternatively, you could ditch cmake and build the opcodes without it. There is a script there showing how to do it on macOS; it should be a case of adapting it to your linux paths and library name.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 30 Oct 2022, at 18:44, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-30 19:06 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
I’ll give that a go and see what I can manage. Thanks! Regards, /A On Sun, 30 Oct 2022 at 19:52, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-30 20:02 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Ok, after a bit of googling and a bit of hacking I changed the build.sh script as follows:

#!/bin/sh
# use -Xptxas="-v" to check register usage and --maxrregcount 32 to limit it
echo "building cuda opcodes ..."
#nvcc -O3 -shared -o libcudaop1.dylib adsyn.cu -use_fast_math -I../../debug/CsoundLib64.framework/Headers -arch=sm_30 -I/usr/local/cuda/include -L/usr/local/cuda/lib
#nvcc -O3 -shared -o libcudaop2.dylib pvsops.cu -use_fast_math -g -I../../debug/CsoundLib64.framework/Headers -arch=sm_30 -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcufft
#nvcc -O3 -shared -o libcudaop3.dylib slidingm.cu -use_fast_math -I../../debug/CsoundLib64.framework/Headers -arch=sm_30 -I/usr/local/cuda/include -L/usr/local/cuda/lib
#nvcc -O3 -shared -o libcudaop4.dylib conv.cu -I../../debug/CsoundLib64.framework/Headers -arch=sm_30 -I/usr/local/cuda/include -L/usr/local/cuda/lib
#nvcc -O3 -shared -o libcudaop5.dylib pconv.cu -I../../debug/CsoundLib64.framework/Headers -arch=sm_30 -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcufft
nvcc -O3 -Xcompiler -fPIC -shared -o libcudaop1.so adsyn.cu -use_fast_math -I/usr/local/include/csound -arch=sm_75 -I/usr/local/cuda/include -L/usr/local/cuda/lib
nvcc -O3 -Xcompiler -fPIC -shared -o libcudaop2.so pvsops.cu -use_fast_math -g -I/usr/local/include/csound -arch=sm_75 -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcufft
nvcc -O3 -Xcompiler -fPIC -shared -o libcudaop3.so slidingm.cu -use_fast_math -I/usr/local/include/csound -arch=sm_75 -I/usr/local/cuda/include -L/usr/local/cuda/lib
nvcc -O3 -Xcompiler -fPIC -shared -o libcudaop4.so conv.cu -I/usr/local/include/csound -arch=sm_75 -I/usr/local/cuda/include -L/usr/local/cuda/lib
nvcc -O3 -Xcompiler -fPIC -shared -o libcudaop5.so pconv.cu -I/usr/local/include/csound -arch=sm_75 -I/usr/local/cuda/include -L/usr/local/cuda/lib -lcufft
echo "...done"

Before I added -Xcompiler -fPIC I got errors saying

/usr/bin/ld: /tmp/tmpxft_0000e6be_00000000-11_adsyn.o: warning: relocation against `_Z6samplePfS_fPlS_if' in read-only section `.text.startup'
/usr/bin/ld: /tmp/tmpxft_0000e6be_00000000-11_adsyn.o: relocation R_X86_64_PC32 against symbol `_Z6samplePfS_fPlS_if' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status

but some googling said what to add and then the build ran fine. cudapconv seems to work, and if checking with nvidia-smi while running csound with cudapconv I get the following output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA T500         On   | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P8    N/A /  N/A |     84MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2648      G   /usr/bin/gnome-shell                1MiB |
|    0   N/A  N/A     59817      C   csound                             80MiB |
+-----------------------------------------------------------------------------+

So it seems to work, but there are no guarantees there aren't any nasties I missed, so someone who actually knows what they're doing should definitely have a look at things... Thank you again!
Regards, Anders
On Sun, Oct 30, 2022 at 8:06 PM Anders Genell <anders.genell@gmail.com> wrote:
|
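The five near-identical nvcc lines in the build.sh above can be generated from one helper. The following is a dry-run sketch only: nvcc_cmd is a hypothetical function, the default paths and sm_75 are simply the values from that message, and the environment variable names CSOUND_INC, CUDA_HOME and CUDA_ARCH are assumptions made for illustration. It prints the commands instead of running them, so it works on a machine without CUDA installed.

```sh
#!/bin/sh
# Dry run: print the nvcc command for each plugin instead of executing it.
# Defaults are taken from the build.sh in this thread; override them through
# the environment for another machine.

CSOUND_INC=${CSOUND_INC:-/usr/local/include/csound}
CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
CUDA_ARCH=${CUDA_ARCH:-sm_75}

nvcc_cmd() {
    # $1: output library, $2: source file, $3: extra flags, $4: extra libs
    out=$1
    src=$2
    flags=$3
    libs=$4
    # -Xcompiler -fPIC forwards -fPIC to the host compiler, which is what
    # the linker asked for when building a shared object.
    echo "nvcc -O3 -Xcompiler -fPIC -shared -o $out $src${flags:+ $flags}" \
         "-I$CSOUND_INC -arch=$CUDA_ARCH" \
         "-I$CUDA_HOME/include -L$CUDA_HOME/lib${libs:+ $libs}"
}

nvcc_cmd libcudaop1.so adsyn.cu    "-use_fast_math"    ""
nvcc_cmd libcudaop2.so pvsops.cu   "-use_fast_math -g" "-lcufft"
nvcc_cmd libcudaop3.so slidingm.cu "-use_fast_math"    ""
nvcc_cmd libcudaop4.so conv.cu     ""                  ""
nvcc_cmd libcudaop5.so pconv.cu    ""                  "-lcufft"
```

Once the printed commands look right, piping the output of the script to sh would run them; keeping the flag order identical to the original script avoids changing how the link step behaves.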
Date | 2022-10-30 20:08 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Hm. I may have spoken too soon. It seems there are no segfaults but there is also no sound... Score finishes with overall amps of 0.0 Oh well... Regards, /A On Sun, Oct 30, 2022 at 9:02 PM Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-30 21:18 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
no sound could be to do with the impulse response (if it's cudapconv). If you look at the csd there is ftconv there to test alongside it.
You should be able to swap one for the other and test.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 30 Oct 2022, at 20:09, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-31 16:27 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Well, unfortunately ftconv gives me plenty of amplitude, so it seems to be the cudapconv opcode that doesn't work properly here... Regards, Anders On Sun, Oct 30, 2022 at 10:18 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-31 17:45 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
maybe check that the build options match your device Prof. Victor Lazzarini
Maynooth University
Ireland
On 31 Oct 2022, at 16:29, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-31 18:32 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Ah, yes… I wonder how/where I can find what sm_XX setting to use? If XX should correspond to the compute capability, I believe it is correct now, as I have compute capability 7.5 (according to nvidia-smi) and have set sm_75 in the build options. Are there other possibilities? Regards, Anders On 31 Oct 2022, at 18:45, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
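The mapping asked about above is mechanical: the sm_XX value is the compute capability with the dot removed. A minimal sketch, where cap_to_arch is a hypothetical helper; the commented nvidia-smi query assumes a driver recent enough to support the compute_cap field.

```sh
#!/bin/sh
# Convert a compute capability such as "7.5" into an nvcc -arch value.

cap_to_arch() {
    # Drop the dot: 7.5 -> 75, 8.6 -> 86
    echo "sm_$(echo "$1" | tr -d '.')"
}

cap_to_arch 7.5   # prints sm_75, matching the T500 in this thread

# Assumption: recent drivers can report the capability directly.
# cap_to_arch "$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1)"
```

The result can then be passed to nvcc as -arch=sm_75, as in the build script earlier in the thread.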
Date | 2022-10-31 19:04 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
I think you would need to check the latest CUDA manual to see.
If it's building without any warnings, then I would expect the code to work. You can try the other opcodes to see if any works.
I think the last time I had a nvidia setup to work with was 2016, so I don't really know but all was working then. If I have a bit of time, I'll see if there's anything out of date in that code.
best
Prof. Victor Lazzarini
Maynooth University
Ireland
On 31 Oct 2022, at 18:34, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-31 19:13 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
ok I think I spotted a need for updating, maybe you can try this:
Line 244 of pconv.cu
{"cudapconv", sizeof(PCONV),0, 5, "a", "aii", (SUBR) pconv_init, NULL,
(SUBR) pconv_perf},
should be
{"cudapconv", sizeof(PCONV),0, 5, "a", "aii", (SUBR) pconv_init, (SUBR) pconv_perf},
See if that makes any difference.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 31 Oct 2022, at 18:34, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-10-31 20:47 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Alas, no. Still amps 0.0. However, I also tried the other cuda opcode examples and lo and behold, the cuda_pvs_example.csd resulted in sound output, and it also announced the use of the nvidia device:

UnifiedCSD: cuda_pvs_example.csd
Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
sorting score ...
 ... done
Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
--Csound version 6.17 (double samples) May 29 2022
[commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
libsndfile-1.1.0
graphics suppressed, ascii substituted
sr = 44100.0, kr = 689.062, ksmps = 64
0dBFS level = 1.0, A4 tuning = 440.0
orch now loaded
audio buffered in 256 sample-frame blocks
ALSA output: total buffer size: 1024, period size: 256
writing 256 sample blks of 64-bit floats to dac
SECTION 1:
new alloc for instr 1:
diskin2: opened 'SpaceFunk20_91bpm_mono.wav': 48000 Hz, 1 channel(s), 506373 sample frames
CUDAnal: using device NVIDIA T500 (capability 7.5)
CUDAsynth: using device NVIDIA T500 (capability 7.5)
B 0.000 .. 60.000 T 60.000 TT 60.000 M: 1.00535
number of samples out of range: 36
B 60.000 .. 60.011 T 60.011 TT 60.011 M: 0.69522
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps: 1.00535
overall samples out of range: 36
0 errors in performance
Elapsed time at end of performance: real: 60.062s, CPU: 2.196s
10338 256 sample blks of 64-bit floats written to dac
***************************************************************

The cuda_sliding_example.csd announced the use of the nvidia device but outputs no sound:

UnifiedCSD: cuda_sliding_example.csd
Loading command-line libraries:
Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
sorting score ...
 ... done
Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
--Csound version 6.17 (double samples) May 29 2022
[commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
libsndfile-1.1.0
graphics suppressed, ascii substituted
sr = 44100.0, kr = 689.062, ksmps = 64
0dBFS level = 1.0, A4 tuning = 440.0
orch now loaded
audio buffered in 256 sample-frame blocks
ALSA output: total buffer size: 1024, period size: 256
writing 256 sample blks of 64-bit floats to dac
SECTION 1:
new alloc for instr 1:
diskin2: opened 'SpaceFunk20_91bpm.wav': 48000 Hz, 2 channel(s), 506373 sample frames
Sliding PV: using floats on device NVIDIA T500 (capability 7.5)
B 0.000 .. 60.000 T 60.000 TT 60.000 M: 0.00000
B 60.000 .. 60.011 T 60.011 TT 60.011 M: 0.00000
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps: 0.00000
overall samples out of range: 0
0 errors in performance
Elapsed time at end of performance: real: 60.064s, CPU: 2.208s
10338 256 sample blks of 64-bit floats written to dac
***************************************************************

Finally, cudadsyn_pvs_example.csd behaves like the cudapconv example; no announcement about nvidia device and no sound:

UnifiedCSD: cudadsyn_pvs_example.csd
Loading command-line libraries:
Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
sorting score ...
 ... done
Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
--Csound version 6.17 (double samples) May 29 2022
[commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
libsndfile-1.1.0
graphics suppressed, ascii substituted
sr = 44100.0, kr = 689.062, ksmps = 64
0dBFS level = 1.0, A4 tuning = 440.0
orch now loaded
audio buffered in 256 sample-frame blocks
ALSA output: total buffer size: 1024, period size: 256
writing 256 sample blks of 64-bit floats to dac
SECTION 1:
new alloc for instr 1:
diskin2: opened 'SpaceFunk20_91bpm.wav': 48000 Hz, 2 channel(s), 506373 sample frames
B 0.000 .. 60.000 T 60.000 TT 60.000 M: 0.00000
B 60.000 .. 60.011 T 60.011 TT 60.011 M: 0.00000
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps: 0.00000
overall samples out of range: 0
0 errors in performance
Elapsed time at end of performance: real: 60.049s, CPU: 3.513s
10338 256 sample blks of 64-bit floats written to dac

Regards, Anders
On Mon, Oct 31, 2022 at 8:13 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-10-31 21:41 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
The sliding may be too slow for realtime still, so perhaps that's why.
I thought the change I suggested would be needed to get any sound, because of some changes in Csound, but maybe that's not the case.
At least there is some success. There is no printout in the adsyn example because it doesn't print anything anyway. The same goes for pconv.
Now, these two examples exist in 2 versions, the pconv.cu and adsyn.cu, and the pconv11.cu and adsyn11.cu. I think the difference is that the former use some atomic builtins that did not exist in some versions of cuda.
You could try building the 11 versions to see if they do anything.
Prof. Victor Lazzarini
Maynooth University
Ireland
On 31 Oct 2022, at 20:49, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-11-02 12:04 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
adsyn11.cu seems to build properly but pconv11.cu complains a little:
pconv11.cu(147): error: identifier "CUFFT_COMPATIBILITY_NATIVE" is undefined
pconv11.cu(147): error: identifier "cufftSetCompatibilityMode" is undefined
Regards, Anders
On Mon, Oct 31, 2022 at 10:41 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote:
|
Date | 2022-11-02 13:04 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Yes, they’re for older CUDA versions, so that’s probably why. Anyway, I think pconv.cu should work as is. So what I did was to add some tracing to the code, so we will at least know whether it is actually running. Could you pull again, build pconv.cu and try running the cudapconv opcode? Note that cudapconv is like ftconv,
asig cudapconv ain, ifn, iparts
so the partition size is the last argument.
========================
Prof. Victor Lazzarini
Maynooth University
Ireland
> On 2 Nov 2022, at 12:04, Anders Genell |
Date | 2022-11-02 16:17 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Thanks! Right now I’m running a csound score that actually uses ftconv, but the laptop is booted to windows, and I need to let it finish writing sound to disk (it will take another 3hrs or so) since it will be used as stimuli for a study on the effects of noise on sleep beginning next week. I’ll reboot into linux and try pulling and building as soon as it’s done. Regards, Anders On Wed, 2 Nov 2022 at 14:04, Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote: yes, they’re for older CUDA versions so it’s probably why. |
Date | 2022-11-02 17:24 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
don’t worry, whenever you next look at it. ======================== Prof. Victor Lazzarini Maynooth University Ireland > On 2 Nov 2022, at 16:17, Anders Genell |
Date | 2022-11-03 12:48 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Ok, here's some output from my latest test.

UnifiedCSD: cudapconv.csd
Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
sorting score ...
 ... done
Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
--Csound version 6.17 (double samples) May 29 2022
[commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
libsndfile-1.1.0
graphics suppressed, ascii substituted
sr = 44100.0, kr = 5512.500, ksmps = 8
0dBFS level = 32768.0, A4 tuning = 440.0
ftable 1:
deferred alloc for pianoimpulseshort.wav
audio sr = 44100, monaural
opening WAV infile pianoimpulseshort.wav
defer length 295621
ftable 1: 295621 points, scalemax 1.000
[ASCII plot of ftable 1]
orch now loaded
audio buffered in 256 sample-frame blocks
ALSA output: total buffer size: 1024, period size: 256
writing 512 sample blks of 64-bit floats to dac
SECTION 1:
new alloc for instr 1:
diskin2: opened './SpaceFunk20_91bpm_mono.wav': 48000 Hz, 1 channel(s), 506373 sample frames
CUDA init: copy buffer 0 to deviceCUDA init: done transform 0CUDA init: copy buffer 1 to deviceCUDA init: done transform 1CUDA init: copy buffer 2 to deviceCUDA init: done transform 2CUDA init: copy buffer 3 to deviceCUDA init: done transform 3CUDA init: copy buffer 4 to deviceCUDA init: done transform 4CUDA init: copy buffer 5 to deviceCUDA init: done transform 5CUDA init: copy buffer 6 to deviceCUDA init: done transform 6CUDA init: copy buffer 7 to deviceCUDA init: done transform 7CUDA init: copy buffer 8 to deviceCUDA init: done transform 8CUDA init: copy buffer 9 to deviceCUDA init: done transform 9CUDA init: copy buffer 10 to deviceCUDA init: done transform 10CUDA init: copy buffer 11 to deviceCUDA init: done transform 11CUDA init: copy buffer 12 to deviceCUDA init: done transform 12CUDA init: copy buffer 13 to deviceCUDA init: done transform 13CUDA init: copy buffer 14 to deviceCUDA init: done transform 14CUDA init: copy buffer 15 to deviceCUDA init: done transform 15CUDA init: copy buffer 16 to deviceCUDA init: done transform 16CUDA init: copy buffer 17 to deviceCUDA init: done transform 17CUDA init: copy buffer 18 to deviceCUDA init: done transform 18CUDApconv: using device NVIDIA T500 (capability 7.5)
B 0.000 .. 60.000 T 60.000 TT 60.000 M: 0.0 0.0
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps: 0.0 0.0
overall samples out of range: 0 0
0 errors in performance
Elapsed time at end of performance: real: 61.393s, CPU: 2.344s
10336 512 sample blks of 64-bit floats written to dac

Regards, Anders
On Wed, Nov 2, 2022 at 6:25 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote: don’t worry, whenever you next look at it. |
Date | 2022-11-03 13:16 |
From | Richard Dobson |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Just lurking on this thread - interested, but I have no Nvidia devices available to play with. This example uses 0dbfs=32768, while others you posted set it to 1.0. Not a diagnosis as such... but if the output values are, say, normalised floats, they will basically disappear with 0dbfs=32768, and whatever would be left would not be audible.
Richard Dobson
On 03/11/2022 12:48, Anders Genell wrote:
> Ok, here's some output from my latest test.
>
> UnifiedCSD: cudapconv.csd
> Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
> sorting score ...
> ... done
> Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
> --Csound version 6.17 (double samples) May 29 2022
> [commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
> libsndfile-1.1.0
> graphics suppressed, ascii substituted
> sr = 44100.0, kr = 5512.500, ksmps = 8
> 0dBFS level = 32768.0, A4 tuning = 440.0
> ftable 1:
> deferred alloc for pianoimpulseshort.wav
> audio sr = 44100, monaural
> opening WAV infile pianoimpulseshort.wav
> defer length 295621
> ftable 1: 295621 points, scalemax 1.000
|
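Richard's point about scaling can be put in numbers. A minimal sketch, assuming awk is available; level_db is a hypothetical helper and not part of Csound. It expresses a peak sample value in dB relative to a given 0dbfs setting.

```sh
#!/bin/sh
# Express a sample value in dB relative to a given 0dbfs setting.

level_db() {
    # $1: peak sample value, $2: 0dbfs value
    awk -v v="$1" -v ref="$2" \
        'BEGIN { printf "%.2f\n", 20 * log(v / ref) / log(10) }'
}

level_db 1.0 32768   # prints -90.31: a normalised signal under 0dbfs = 32768
level_db 1.0 1.0     # prints 0.00: the same signal under 0dbfs = 1
```

A signal peaking at 1.0 under 0dbfs = 32768 therefore sits about 90 dB below full scale, roughly one step of a 16-bit file, which is consistent with it being inaudible.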
Date | 2022-11-03 13:27 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
yep, so it’s running the init but it’s not running CUDA performance. No wonder you have no audio out. I’ve added some further tracing. Let me know what happens if anything. ======================== Prof. Victor Lazzarini Maynooth University Ireland > On 3 Nov 2022, at 12:48, Anders Genell |
Date | 2022-11-03 15:46 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
New output after last pull:

UnifiedCSD: cudapconv.csd
Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
sorting score ...
 ... done
Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
--Csound version 6.17 (double samples) May 29 2022
[commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
libsndfile-1.1.0
graphics suppressed, ascii substituted
sr = 44100.0, kr = 5512.500, ksmps = 8
0dBFS level = 32768.0, A4 tuning = 440.0
ftable 1:
deferred alloc for pianoimpulseshort.wav
audio sr = 44100, monaural
opening WAV infile pianoimpulseshort.wav
defer length 295621
ftable 1: 295621 points, scalemax 1.000
[ASCII plot of ftable 1]
orch now loaded
audio buffered in 256 sample-frame blocks
writing 1024-byte blks of shorts to cuda.wav (WAV)
SECTION 1:
new alloc for instr 1:
diskin2: opened './SpaceFunk20_91bpm_mono.wav': 48000 Hz, 1 channel(s), 506373 sample frames
CUDA init: copy buffer 0 to deviceCUDA init: done transform 0CUDA init: copy buffer 1 to deviceCUDA init: done transform 1CUDA init: copy buffer 2 to deviceCUDA init: done transform 2CUDA init: copy buffer 3 to deviceCUDA init: done transform 3CUDA init: copy buffer 4 to deviceCUDA init: done transform 4CUDA init: copy buffer 5 to deviceCUDA init: done transform 5CUDA init: copy buffer 6 to deviceCUDA init: done transform 6CUDA init: copy buffer 7 to deviceCUDA init: done transform 7CUDA init: copy buffer 8 to deviceCUDA init: done transform 8CUDA init: copy buffer 9 to deviceCUDA init: done transform 9CUDA init: copy buffer 10 to deviceCUDA init: done transform 10CUDA init: copy buffer 11 to deviceCUDA init: done transform 11CUDA init: copy buffer 12 to deviceCUDA init: done transform 12CUDA init: copy buffer 13 to deviceCUDA init: done transform 13CUDA init: copy buffer 14 to deviceCUDA init: done transform 14CUDA init: copy buffer 15 to deviceCUDA init: done transform 15CUDA init: copy buffer 16 to deviceCUDA init: done transform 16CUDA init: copy buffer 17 to deviceCUDA init: done transform 17CUDA init: copy buffer 18 to deviceCUDA init: done transform 18CUDApconv: using device NVIDIA T500 (capability 7.5)
B 0.000 .. 60.000 T 60.000 TT 60.000 M: 0.0 0.0
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps: 0.0 0.0
overall samples out of range: 0 0
0 errors in performance
Elapsed time at end of performance: real: 1.559s, CPU: 0.244s
512 1024 sample blks of shorts written to cuda.wav (WAV)

Regards, Anders
On Thu, Nov 3, 2022 at 2:28 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote: yep, so it’s running the init but it’s not running CUDA performance. No wonder you have no audio out. |
Date | 2022-11-03 15:48 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Good point! However in this case using the exact same csd-file but with ftconv instead of cudapconv does produce sound, so there are other ghosts in the machine this time I think.. Regards, Anders On Thu, Nov 3, 2022 at 2:16 PM Richard Dobson <richard@rwdobson.com> wrote: Just lurking on this thread -interested, but I have no Nvidia devices |
Date | 2022-11-03 15:52 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Are you sure this is the new output? I added newlines to the messages
csound->Message(csound,"CUDA init: copy buffer %d to device\n",i);
etc but they are not there. I suspect this is the previous version. This is the commit
commit c0f80c6dd862a037bfa53da221271ab44f638061
========================
Prof. Victor Lazzarini
Maynooth University
Ireland
> On 3 Nov 2022, at 15:46, Anders Genell |
Date | 2022-11-03 15:55 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Sorry, forgot to copy to the plugins dir... Here's the correct output:

UnifiedCSD: cudapconv.csd
Elapsed time at end of orchestra compile: real: 0.001s, CPU: 0.001s
sorting score ...
 ... done
Elapsed time at end of score sort: real: 0.001s, CPU: 0.001s
--Csound version 6.17 (double samples) May 29 2022
[commit: 95007834d8afc68b9ae465a6f65b93b23ae41709]
libsndfile-1.1.0
graphics suppressed, ascii substituted
sr = 44100.0, kr = 5512.500, ksmps = 8
0dBFS level = 32768.0, A4 tuning = 440.0
ftable 1:
deferred alloc for pianoimpulseshort.wav
audio sr = 44100, monaural
opening WAV infile pianoimpulseshort.wav
defer length 295621
ftable 1: 295621 points, scalemax 1.000
[ASCII plot of ftable 1]
orch now loaded
audio buffered in 256 sample-frame blocks
writing 1024-byte blks of shorts to cuda.wav (WAV)
SECTION 1:
new alloc for instr 1:
diskin2: opened './SpaceFunk20_91bpm_mono.wav': 48000 Hz, 1 channel(s), 506373 sample frames
CUDA init: copy buffer 0 to device
CUDA init: done transform 0
CUDA init: copy buffer 1 to device
CUDA init: done transform 1
CUDA init: copy buffer 2 to device
CUDA init: done transform 2
CUDA init: copy buffer 3 to device
CUDA init: done transform 3
CUDA init: copy buffer 4 to device
CUDA init: done transform 4
CUDA init: copy buffer 5 to device
CUDA init: done transform 5
CUDA init: copy buffer 6 to device
CUDA init: done transform 6
CUDA init: copy buffer 7 to device
CUDA init: done transform 7
CUDA init: copy buffer 8 to device
CUDA init: done transform 8
CUDA init: copy buffer 9 to device
CUDA init: done transform 9
CUDA init: copy buffer 10 to device
CUDA init: done transform 10
CUDA init: copy buffer 11 to device
CUDA init: done transform 11
CUDA init: copy buffer 12 to device
CUDA init: done transform 12
CUDA init: copy buffer 13 to device
CUDA init: done transform 13
CUDA init: copy buffer 14 to device
CUDA init: done transform 14
CUDA init: copy buffer 15 to device
CUDA init: done transform 15
CUDA init: copy buffer 16 to device
CUDA init: done transform 16
CUDA init: copy buffer 17 to device
CUDA init: done transform 17
CUDA init: copy buffer 18 to device
CUDA init: done transform 18
CUDApconv: using device NVIDIA T500 (capability 7.5)
B 0.000 .. 60.000 T 60.000 TT 60.000 M: 0.0 0.0
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps: 0.0 0.0
overall samples out of range: 0 0
0 errors in performance
Elapsed time at end of performance: real: 0.228s, CPU: 0.227s
512 1024 sample blks of shorts written to cuda.wav (WAV)

Regards, Anders
On Thu, Nov 3, 2022 at 4:53 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote: Are you sure this is the new output? I added newlines to the messages |
Date | 2022-11-03 15:59 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Ok, this confirms that the perform function is never called. I made a small change to OENTRY, see if it runs now. ======================== Prof. Victor Lazzarini Maynooth University Ireland > On 3 Nov 2022, at 15:55, Anders Genell |
Date | 2022-11-03 16:04 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
SUCCESS!!

[...]
CUDA execution:
done copy to device
done transform
done convolution
done inverse transform
done overlap-add
done copy from device
pconv perf count: 0
pconv perf count: 8
pconv perf count: 16
pconv perf count: 24
pconv perf count: 32
[...]
pconv perf count: 8160
pconv perf count: 8168
B 0.000 .. 60.000 T 60.000 TT 60.000 M:15759902.015759902.0
number of samples out of range: 2610308 2610308
Score finished in csoundPerform().
inactive allocs returned to freespace
end of score. overall amps:15759902.015759902.0
overall samples out of range: 2610308 2610308
0 errors in performance
Elapsed time at end of performance: real: 0.974s, CPU: 0.740s
512 1024 sample blks of shorts written to cuda.wav (WAV)

well, I obviously need to scale amplitudes and so on, but otherwise it seems good! Thanks Victor!
Regards, Anders
On Thu, Nov 3, 2022 at 4:59 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote: Ok, this confirms that the perform function is never called. |
Date | 2022-11-03 16:13 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Ok, so let me get rid of the tracing now. ======================== Prof. Victor Lazzarini Maynooth University Ireland > On 3 Nov 2022, at 16:04, Anders Genell |
Date | 2022-11-03 18:57 |
From | Anders Genell |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Great! Right now it still says "done inverse transform" and "done convolution" for each partition - was that intentional? Regards, Anders On Thu, Nov 3, 2022 at 5:14 PM Victor Lazzarini <Victor.Lazzarini@mu.ie> wrote: Ok, so let me get rid of the tracing now. |
Date | 2022-11-03 21:21 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
Probably forgot to remove that... Prof. Victor Lazzarini
Maynooth University
Ireland
On 3 Nov 2022, at 18:59, Anders Genell <anders.genell@gmail.com> wrote:
|
Date | 2022-11-03 23:24 |
From | Victor Lazzarini |
Subject | Re: [Csnd-dev] [EXTERNAL] [Csnd-dev] CUDA opcodes (again) |
OK, all gone now hopefully. ======================== Prof. Victor Lazzarini Maynooth University Ireland > On 3 Nov 2022, at 18:57, Anders Genell |