Csound Csound-dev Csound-tekno Search About

[Csnd-dev] Hash Table Updated

Date2019-04-07 00:40
FromSteven Yi
Subject[Csnd-dev] Hash Table Updated
Attachmentstest.csd  
Hi All,

In response the reported issue with Csound's hash table reported by Eduardo, I modified the table implementation to do resizing based on load factor.  The attached test.csd performance went from ~9 seconds before change and down to 0.9 seconds after the change.  

I've pushed to code to a branch new_hash_table, visible at:


The hash table is used in various places in the code base including channels, maintaining the opcode definitions, and during parsing. The impact probably shouldn't be noticeable for small amounts of items in the table (< 6144 items), but has an impact once you go above that. There is a cost when inserting and table expansion conditions are met, but it should be worth it overall for reads/writes at the larger sizes. 

Could someone review the code?  Unit tests passed here and basic usage is working alright via live coding and Blue, but I'd like a second pair of eyes before merging.  

Thanks,
Steven


Date2019-04-07 01:08
FromEduardo Moguillansky
SubjectRe: [Csnd-dev] Hash Table Updated
When you rehash the table after resizing all pointers cached are 
invalid, but I don't see any code checking this. But maybe I missed 
something

On So, Apr 7, 2019 at 1:40 AM, Steven Yi  wrote:
> Hi All,
> 
> In response the reported issue with Csound's hash table reported by 
> Eduardo, I modified the table implementation to do resizing based on 
> load factor.  The attached test.csd performance went from ~9 seconds 
> before change and down to 0.9 seconds after the change.
> 
> I've pushed to code to a branch new_hash_table, visible at:
> 
> https://github.com/csound/csound/tree/new_hash_table
> 
> The hash table is used in various places in the code base including 
> channels, maintaining the opcode definitions, and during parsing. The 
> impact probably shouldn't be noticeable for small amounts of items in 
> the table (< 6144 items), but has an impact once you go above that. 
> There is a cost when inserting and table expansion conditions are 
> met, but it should be worth it overall for reads/writes at the larger 
> sizes.
> 
> Could someone review the code?  Unit tests passed here and basic 
> usage is working alright via live coding and Blue, but I'd like a 
> second pair of eyes before merging.
> 
> Thanks,
> Steven
> 

Date2019-04-07 01:30
FromEduardo Moguillansky
SubjectRe: [Csnd-dev] Hash Table Updated
Thanks for this. When you resize the table all cached pointers in the 
channel opcodes become invalid, but there is no code checking for this. 
But I did not take a look at the implementation of the hash table, so 
maybe there is some other code ensuring that the pointers are still 
valid after reallocation.

On 07.04.19 01:40, Steven Yi wrote:
> Hi All,
>
> In response the reported issue with Csound's hash table reported by 
> Eduardo, I modified the table implementation to do resizing based on 
> load factor.  The attached test.csd performance went from ~9 seconds 
> before change and down to 0.9 seconds after the change.
>
> I've pushed to code to a branch new_hash_table, visible at:
>
> https://github.com/csound/csound/tree/new_hash_table
>
> The hash table is used in various places in the code base including 
> channels, maintaining the opcode definitions, and during parsing. The 
> impact probably shouldn't be noticeable for small amounts of items in 
> the table (< 6144 items), but has an impact once you go above that. 
> There is a cost when inserting and table expansion conditions are met, 
> but it should be worth it overall for reads/writes at the larger sizes.
>
> Could someone review the code?  Unit tests passed here and basic usage 
> is working alright via live coding and Blue, but I'd like a second 
> pair of eyes before merging.
>
> Thanks,
> Steven

Date2019-04-07 02:04
FromSteven Yi
SubjectRe: [Csnd-dev] Hash Table Updated
Attachmentstest.csd  
I don't see how channel pointers become invalid. Their addresses are stored as the value pointer in the CS_HASH_TABLE_ITEM's which are transferred from the old table to the newly resized one.  

This updated test.csd does assigns an incrementing value for each channel created. It does a second pass to read the values and print it out. The correct values are printed for each channel. 



On Sat, Apr 6, 2019 at 8:30 PM Eduardo Moguillansky <eduardo.moguillansky@gmail.com> wrote:
Thanks for this. When you resize the table all cached pointers in the
channel opcodes become invalid, but there is no code checking for this.
But I did not take a look at the implementation of the hash table, so
maybe there is some other code ensuring that the pointers are still
valid after reallocation.

On 07.04.19 01:40, Steven Yi wrote:
> Hi All,
>
> In response the reported issue with Csound's hash table reported by
> Eduardo, I modified the table implementation to do resizing based on
> load factor.  The attached test.csd performance went from ~9 seconds
> before change and down to 0.9 seconds after the change.
>
> I've pushed to code to a branch new_hash_table, visible at:
>
> https://github.com/csound/csound/tree/new_hash_table
>
> The hash table is used in various places in the code base including
> channels, maintaining the opcode definitions, and during parsing. The
> impact probably shouldn't be noticeable for small amounts of items in
> the table (< 6144 items), but has an impact once you go above that.
> There is a cost when inserting and table expansion conditions are met,
> but it should be worth it overall for reads/writes at the larger sizes.
>
> Could someone review the code?  Unit tests passed here and basic usage
> is working alright via live coding and Blue, but I'd like a second
> pair of eyes before merging.
>
> Thanks,
> Steven
>

Date2019-04-07 11:56
FromEduardo Moguillansky
SubjectRe: [Csnd-dev] Hash Table Updated
Attachmentstestdict.csd  test2.csd  
Of course, I misunderstood what was being cached in the chn opcodes. It 
does run ~ 10x faster, thanks! The hashtable implementation I used in 
the dict opcodes is still around 2x faster than this new hashtable, 
which probably confirms that open-adressing is faster than chaining, 
but this is fast enough now.

On a side note, would it be possible to add a chndel opcode to delete a 
channel?

On So, Apr 7, 2019 at 3:04 AM, Steven Yi  wrote:
> I don't see how channel pointers become invalid. Their addresses are 
> stored as the value pointer in the CS_HASH_TABLE_ITEM's which are 
> transferred from the old table to the newly resized one.
> 
> This updated test.csd does assigns an incrementing value for each 
> channel created. It does a second pass to read the values and print 
> it out. The correct values are printed for each channel.
> 
> 
> 
> On Sat, Apr 6, 2019 at 8:30 PM Eduardo Moguillansky 
>  wrote:
>> Thanks for this. When you resize the table all cached pointers in the
>>  channel opcodes become invalid, but there is no code checking for 
>> this.
>>  But I did not take a look at the implementation of the hash table, 
>> so
>>  maybe there is some other code ensuring that the pointers are still
>>  valid after reallocation.
>> 
>>  On 07.04.19 01:40, Steven Yi wrote:
>>  > Hi All,
>>  >
>>  > In response the reported issue with Csound's hash table reported 
>> by
>>  > Eduardo, I modified the table implementation to do resizing based 
>> on
>>  > load factor.  The attached test.csd performance went from ~9 
>> seconds
>>  > before change and down to 0.9 seconds after the change.
>>  >
>>  > I've pushed to code to a branch new_hash_table, visible at:
>>  >
>>  > https://github.com/csound/csound/tree/new_hash_table
>>  >
>>  > The hash table is used in various places in the code base 
>> including
>>  > channels, maintaining the opcode definitions, and during parsing. 
>> The
>>  > impact probably shouldn't be noticeable for small amounts of 
>> items in
>>  > the table (< 6144 items), but has an impact once you go above 
>> that.
>>  > There is a cost when inserting and table expansion conditions are 
>> met,
>>  > but it should be worth it overall for reads/writes at the larger 
>> sizes.
>>  >
>>  > Could someone review the code?  Unit tests passed here and basic 
>> usage
>>  > is working alright via live coding and Blue, but I'd like a second
>>  > pair of eyes before merging.
>>  >
>>  > Thanks,
>>  > Steven
>>  >


Date2019-09-18 04:00
FromSteven Yi
SubjectRe: [Csnd-dev] Hash Table Updated
Hi Eduardo,

Thanks for the reply and glad to that the performance gains worked out
in the end, at least to usable levels. Certainly nice to know it can
be improved too, would be good to have an issue filed so we don't miss
out revisit this in the future. (Of course, contributions from anyone
who has time very welcome!)

I had looked at the code again and found it alright, but I realized we
discussed the validity of the pointers on April 6th, maybe it got
lost? The reply mentioned:

"I don't see how channel pointers become invalid. Their addresses are
stored as the value pointer in the CS_HASH_TABLE_ITEM's which are
transferred from the old table to the newly resized one.

This updated test.csd does assigns an incrementing value for each
channel created. It does a second pass to read the values and print it
out. The correct values are printed for each channel. "

The code for the channel system's use of the table is in OOps/bus.c,
would be happy if anyone double-checked.

All best,
Steven

On Sat, Sep 14, 2019 at 8:03 PM Eduardo Moguillansky
 wrote:
>
> When you rehash the table after resizing all pointers cached are
> invalid, but I don't see any code checking this. But maybe I missed
> something
>
> On So, Apr 7, 2019 at 1:40 AM, Steven Yi  wrote:
> > Hi All,
> >
> > In response the reported issue with Csound's hash table reported by
> > Eduardo, I modified the table implementation to do resizing based on
> > load factor.  The attached test.csd performance went from ~9 seconds
> > before change and down to 0.9 seconds after the change.
> >
> > I've pushed to code to a branch new_hash_table, visible at:
> >
> > https://github.com/csound/csound/tree/new_hash_table
> >
> > The hash table is used in various places in the code base including
> > channels, maintaining the opcode definitions, and during parsing. The
> > impact probably shouldn't be noticeable for small amounts of items in
> > the table (< 6144 items), but has an impact once you go above that.
> > There is a cost when inserting and table expansion conditions are
> > met, but it should be worth it overall for reads/writes at the larger
> > sizes.
> >
> > Could someone review the code?  Unit tests passed here and basic
> > usage is working alright via live coding and Blue, but I'd like a
> > second pair of eyes before merging.
> >
> > Thanks,
> > Steven
> >

Date2019-09-18 07:32
FromEduardo Moguillansky
SubjectRe: [Csnd-dev] Hash Table Updated
dear steven, this email was wrongly resent when an old email client 
resynched with the imap server, sorry for the noise. I am in fact very 
happy with the new has table implementation

best regards,

eduardo

On 18.09.19 05:00, Steven Yi wrote:
> Hi Eduardo,
>
> Thanks for the reply and glad to that the performance gains worked out
> in the end, at least to usable levels. Certainly nice to know it can
> be improved too, would be good to have an issue filed so we don't miss
> out revisit this in the future. (Of course, contributions from anyone
> who has time very welcome!)
>
> I had looked at the code again and found it alright, but I realized we
> discussed the validity of the pointers on April 6th, maybe it got
> lost? The reply mentioned:
>
> "I don't see how channel pointers become invalid. Their addresses are
> stored as the value pointer in the CS_HASH_TABLE_ITEM's which are
> transferred from the old table to the newly resized one.
>
> This updated test.csd does assigns an incrementing value for each
> channel created. It does a second pass to read the values and print it
> out. The correct values are printed for each channel. "
>
> The code for the channel system's use of the table is in OOps/bus.c,
> would be happy if anyone double-checked.
>
> All best,
> Steven
>
> On Sat, Sep 14, 2019 at 8:03 PM Eduardo Moguillansky
>  wrote:
>> When you rehash the table after resizing all pointers cached are
>> invalid, but I don't see any code checking this. But maybe I missed
>> something
>>
>> On So, Apr 7, 2019 at 1:40 AM, Steven Yi  wrote:
>>> Hi All,
>>>
>>> In response the reported issue with Csound's hash table reported by
>>> Eduardo, I modified the table implementation to do resizing based on
>>> load factor.  The attached test.csd performance went from ~9 seconds
>>> before change and down to 0.9 seconds after the change.
>>>
>>> I've pushed to code to a branch new_hash_table, visible at:
>>>
>>> https://github.com/csound/csound/tree/new_hash_table
>>>
>>> The hash table is used in various places in the code base including
>>> channels, maintaining the opcode definitions, and during parsing. The
>>> impact probably shouldn't be noticeable for small amounts of items in
>>> the table (< 6144 items), but has an impact once you go above that.
>>> There is a cost when inserting and table expansion conditions are
>>> met, but it should be worth it overall for reads/writes at the larger
>>> sizes.
>>>
>>> Could someone review the code?  Unit tests passed here and basic
>>> usage is working alright via live coding and Blue, but I'd like a
>>> second pair of eyes before merging.
>>>
>>> Thanks,
>>> Steven
>>>

Date2019-09-18 15:51
FromSteven Yi
SubjectRe: [Csnd-dev] Hash Table Updated
Ah, that makes sense. ;)

Cheers!
Steven

On Wed, Sep 18, 2019 at 2:33 AM Eduardo Moguillansky
 wrote:
>
> dear steven, this email was wrongly resent when an old email client
> resynched with the imap server, sorry for the noise. I am in fact very
> happy with the new has table implementation
>
> best regards,
>
> eduardo
>
> On 18.09.19 05:00, Steven Yi wrote:
> > Hi Eduardo,
> >
> > Thanks for the reply and glad to that the performance gains worked out
> > in the end, at least to usable levels. Certainly nice to know it can
> > be improved too, would be good to have an issue filed so we don't miss
> > out revisit this in the future. (Of course, contributions from anyone
> > who has time very welcome!)
> >
> > I had looked at the code again and found it alright, but I realized we
> > discussed the validity of the pointers on April 6th, maybe it got
> > lost? The reply mentioned:
> >
> > "I don't see how channel pointers become invalid. Their addresses are
> > stored as the value pointer in the CS_HASH_TABLE_ITEM's which are
> > transferred from the old table to the newly resized one.
> >
> > This updated test.csd does assigns an incrementing value for each
> > channel created. It does a second pass to read the values and print it
> > out. The correct values are printed for each channel. "
> >
> > The code for the channel system's use of the table is in OOps/bus.c,
> > would be happy if anyone double-checked.
> >
> > All best,
> > Steven
> >
> > On Sat, Sep 14, 2019 at 8:03 PM Eduardo Moguillansky
> >  wrote:
> >> When you rehash the table after resizing all pointers cached are
> >> invalid, but I don't see any code checking this. But maybe I missed
> >> something
> >>
> >> On So, Apr 7, 2019 at 1:40 AM, Steven Yi  wrote:
> >>> Hi All,
> >>>
> >>> In response the reported issue with Csound's hash table reported by
> >>> Eduardo, I modified the table implementation to do resizing based on
> >>> load factor.  The attached test.csd performance went from ~9 seconds
> >>> before change and down to 0.9 seconds after the change.
> >>>
> >>> I've pushed to code to a branch new_hash_table, visible at:
> >>>
> >>> https://github.com/csound/csound/tree/new_hash_table
> >>>
> >>> The hash table is used in various places in the code base including
> >>> channels, maintaining the opcode definitions, and during parsing. The
> >>> impact probably shouldn't be noticeable for small amounts of items in
> >>> the table (< 6144 items), but has an impact once you go above that.
> >>> There is a cost when inserting and table expansion conditions are
> >>> met, but it should be worth it overall for reads/writes at the larger
> >>> sizes.
> >>>
> >>> Could someone review the code?  Unit tests passed here and basic
> >>> usage is working alright via live coding and Blue, but I'd like a
> >>> second pair of eyes before merging.
> >>>
> >>> Thanks,
> >>> Steven
> >>>