[Csnd] Mailing list archive amalgamation attempt
Date | 2022-01-20 01:02 |
From | Richard Knight |
Subject | [Csnd] Mailing list archive amalgamation attempt |
The mailing list archives and the demise of Nabble have been mentioned quite a bit. While there was a plan to incorporate all messages into forum.csound.com, I am not sure how this is going. Hence I have attempted to put together a site which incorporates everything I could find. It is at http://ml.csound.1bpm.net/ and is in a basic/testing stage at the moment, so any suggestions, ideas, bug reports etc. are welcome.

Messages are presented in threads, so the originating message is shown in the overview and the replies to it are shown in the thread accordingly (or should be, if the email's message-id and reply-to headers are right). Attachments and multipart messages, i.e. HTML, should be preserved OK (e.g. the attachment in http://ml.csound.1bpm.net/thread/4975 ). The search functionality may be a bit patchy at the moment, especially the full text search, but I will try to optimise that or revisit it at some point.

The messages themselves are actually stored in an NNTP server, so you can connect directly with a newsreader to 1bpm.net and view the messages that way too. I have tried to redact full email addresses where possible via the web frontend and just show names.

The sources I used are as follows:
2007-10 to 2014-09 | http://gaule.cs.bath.ac.uk/Csound-archive/
2017-10 to current | Personal copies
2005 to current, but patchy | Gmane

There may be some messages missed, so if anyone has any I can try to import them. I have been trying to get hold of the raw messages from HEANET, which would cover 2015-10 to current, but have not managed to yet. John mentioned messages on codemist.co.uk from Feb 1997 to Nov 1999, but I could not find them when I had a look around.
The counts of messages per year on the site are:
2005 | 1563
2006 | 2818
2007 | 1995
2008 | 5194
2009 | 5645
2010 | 6100
2011 | 7073
2012 | 7289
2013 | 7285
2014 | 3870
2015 | 3245
2016 | 5763
2017 | 4062
2018 | 3220
2019 | 3010
2020 | 2840
2021 | 1467
2022 | 142

Csound mailing list
Csound@listserv.heanet.ie
https://listserv.heanet.ie/cgi-bin/wa?A0=CSOUND
Send bugs reports to
https://github.com/csound/csound/issues
Discussions of bugs and features can be posted here |
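The threading described in the message above (replies attached to their originating message via headers) could be sketched roughly as follows. This is a minimal illustration and not the code behind the site: it assumes the relevant headers are `Message-ID` and `In-Reply-To`, and the function name `build_threads` is hypothetical.

```python
from collections import defaultdict
from email import message_from_string


def build_threads(raw_messages):
    """Group raw email texts into threads keyed by the root Message-ID.

    Assumption: replies point at their parent through In-Reply-To.
    """
    parent_of = {}  # Message-ID -> parent Message-ID, or None
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg.get("Message-ID") or "").strip()
        if not msg_id:
            continue  # a message without an ID cannot be threaded
        parent_of[msg_id] = (msg.get("In-Reply-To") or "").strip() or None

    def root(msg_id):
        # Walk up the reply chain until the parent is missing or unknown.
        seen = set()
        while parent_of.get(msg_id) in parent_of and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent_of[msg_id]
        return msg_id

    threads = defaultdict(list)
    for msg_id in parent_of:
        threads[root(msg_id)].append(msg_id)
    return dict(threads)
```

A reply whose parent is not in the archive becomes the root of its own thread here, which is one way a thread can appear split when headers are missing or wrong.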
Date | 2022-01-20 06:44 |
From | Victor Lazzarini |
Subject | Re: [Csnd] [EXTERNAL] [Csnd] Mailing list archive amalgamation attempt |
Great work. Would it make sense to host this on our csound.github.io site? I could give you access, or you could do a PR if you prefer.
Prof. Victor Lazzarini
Maynooth University
Ireland |
Date | 2022-01-20 13:33 |
From | John ff |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
Can you read foremost.co.uk/cs_archive/
Those are the 1997 emails. |
Date | 2022-01-20 13:36 |
From | John ff |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
Predictive text!!
codemist.co.uk/cs_archive
|
Date | 2022-01-20 13:46 |
From | Rory Walsh |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
Thanks for this, Richard. It's great to have all of these in one place again. I tried the search feature, and yeah, I got no hits on terms like oscil, metro, etc?
|
Date | 2022-01-21 21:23 |
From | Richard Knight |
Subject | Re: [Csnd] [EXTERNAL] [Csnd] Mailing list archive amalgamation attempt |
Yes, I think hosting it on the csound.github.io site is a good idea. However, at the moment the pages on my site are generated on the fly, parsed from the raw messages, and the overviews (sender, subject etc.) are in a Postgres database along with an attempt at full text search indexing. As far as I understand GitHub Pages, they would have to be static (is that right?).

This could be dealt with by generating them all as static HTML pages. At a rough estimate that would result in around 500MB+ of HTML (the raw messages take up 900MB in total; the GitHub Pages limit appears to be 1GB per site, so that may be pushing it in future?). I am not sure how search functionality could be incorporated if the pages are all static, but I suppose eventually they would be indexed by search engines.

On 2022-01-20 06:44, Victor Lazzarini wrote:
> Great work. Would it make sense to host this in our csound.github.io
> site? I could give you access or you could do a PR if you prefer.
>
> Prof. Victor Lazzarini
> Maynooth University
> Ireland |
Date | 2022-01-21 21:26 |
From | Richard Knight |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
Thanks. Unfortunately, I can see the directory listing but each of the files gives a 403 Forbidden error. Maybe the file permissions need to be different, or something needs tweaking in the .htaccess or Apache config.
|
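One of the possible causes mentioned above, file permissions, can be checked with a short script. The sketch below assumes a Unix-like server and only looks at the "other" read bit on regular files; it says nothing about the .htaccess or Apache configuration, which could equally be responsible. The function name `not_world_readable` is made up for illustration.

```python
import os
import stat


def not_world_readable(directory):
    """Return the names of regular files in `directory` lacking the 'other' read bit."""
    blocked = []
    with os.scandir(directory) as entries:
        for entry in entries:
            mode = entry.stat().st_mode
            # S_IROTH is the read permission for users outside the owner and group.
            if stat.S_ISREG(mode) and not (mode & stat.S_IROTH):
                blocked.append(entry.name)
    return sorted(blocked)
```

If the web server runs as a different user from the file owner, any file listed by this function would be a candidate for a 403 response.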
Date | 2022-01-21 21:33 |
From | Richard Knight |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
Ah, yes, the body/subject/sender conditions were being combined with AND instead of OR, so it should work a bit better now. However, the full text search (body) is really slow, probably too slow to be usable. I will have a look and see if that can be optimised, but a better option may be to have the pages as static HTML hosted on csound.github.io as Victor suggested, and let search engines take care of the indexing.
|
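The AND/OR mix-up described above can be illustrated with a small in-memory sketch. The site itself queries a Postgres database, so this is only an analogy: the dictionary keys and the `matches` function are hypothetical, with `all` standing in for AND and `any` for OR.

```python
def matches(message, term, combine=any):
    """Check whether `term` occurs in a message's subject, sender or body.

    combine=any -> a hit in at least one field is enough (OR).
    combine=all -> every field must contain the term (AND).
    """
    term = term.lower()
    hits = (term in message[field].lower() for field in ("subject", "sender", "body"))
    return combine(hits)


msg = {
    "subject": "metro question",
    "sender": "Rory Walsh",
    "body": "How do I use oscil here?",
}
print(matches(msg, "oscil"))               # OR: found in the body
print(matches(msg, "oscil", combine=all))  # AND: the sender field never matches
```

With AND, a term such as oscil would have to appear in the sender's name as well as the subject and body, which would explain a search returning no hits at all.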
Date | 2022-01-21 23:08 |
From | Victor Lazzarini |
Subject | Re: [Csnd] [EXTERNAL] [Csnd] Mailing list archive amalgamation attempt |
Actually, I am not sure if they need to be static. Certainly the site is dynamic and pulls data from various sources in the repo; I am not sure if that can be leveraged for your archive. Maybe others will know more. Possibly the best thing is to give you access and let you poke around to see what may be achievable.
Prof. Victor Lazzarini
Maynooth University
Ireland |
Date | 2022-01-22 18:00 |
From | john |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
I changed permissions etc., so could you try again? I am not up to speed on web access stuff so it may still be wrong.

On Fri, 21 Jan 2022, Richard Knight wrote:
> Thanks, unfortunately however I can see the directory listing but each of the
> files gives a 403 forbidden error - maybe the file permissions may need to be
> different, or something tweaking in the htaccess or apache config.
>
> On 2022-01-20 13:36, John ff wrote:
> > Predictive text!!
> > codemist.co.uk/cs_archive |
Date | 2022-01-23 21:52 |
From | Richard Knight |
Subject | Re: [Csnd] Mailing list archive amalgamation attempt |
Still doesn't work, unfortunately. There are probably a few things to check in the Apache config, and I can try to give some suggestions. Alternatively, I could give you a login on one of my servers so you could scp them over to me, if that sounds easier?

On 2022-01-22 18:00, john wrote:
> I changed permissions etc so could you try again? I am not up to speed
> on web access stuff so it may still be wrong. |
Date | 2022-01-29 16:37 |
From | Richard Knight |
Subject | Re: [Csnd] [EXTERNAL] [Csnd] Mailing list archive amalgamation attempt |
Yes, I think that would be useful. I've since had a look around the repo at https://github.com/csound/csound.github.io , which looks to be the right one. As far as I understand, the site is served as static HTML, but it is built with Jekyll from markdown files, and GitHub Pages doesn't support server-side scripting (i.e. Python, databases etc.). Hence I think there are two fundamental options:
- Generate the mailing list pages as static HTML (or markdown) and include them in the repo/site.
- Have something on the github.io site that uses JavaScript to interact with an API served by my server.

I'd initially be inclined towards the first, as it keeps everything in a central place and would be more performant. However, concerns include space usage in the repo and the fact that there would need to be some update schedule to keep the repo up to date with the mailing list. The drawback with the second is that I would still have ownership of the archive, and if messages were loaded dynamically with JS then that would likely negatively affect search engine indexing.

I'll have a look at generating my site as static HTML and see how much disk space that uses.

On 2022-01-21 23:08, Victor Lazzarini wrote:
> actually not sure if they need to
> be static. Certainly the site is dynamic and pulls data from various
> sources in the repo, not sure if that can be leveraged for your
> archive.
>
> Maybe others will know more. Possibly the best thing is to give you
> access and let you poke around to see what may be achievable.
>
> Prof. Victor Lazzarini
> Maynooth University
> Ireland |
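The first option above, generating static pages and measuring the disk space they use, might look something like the sketch below. It is only an illustration under assumptions: the message dictionaries with `subject`, `sender` and `body` keys, the one-file-per-thread layout, and the names `render_thread` and `write_site` are all hypothetical and not taken from the existing site.

```python
import html
from pathlib import Path


def render_thread(thread_id, messages):
    """Return a minimal static HTML page for one thread."""
    parts = [f"<h1>Thread {thread_id}</h1>"]
    for m in messages:
        # Escape everything that came from an email so it cannot inject markup.
        parts.append(
            "<article><h2>{}</h2><p>{}</p><pre>{}</pre></article>".format(
                html.escape(m["subject"]),
                html.escape(m["sender"]),
                html.escape(m["body"]),
            )
        )
    return "<!doctype html>\n<html><body>\n" + "\n".join(parts) + "\n</body></html>\n"


def write_site(threads, out_dir):
    """Write one page per thread and return the total number of bytes written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    total = 0
    for thread_id, messages in threads.items():
        data = render_thread(thread_id, messages).encode("utf-8")
        (out / f"{thread_id}.html").write_bytes(data)
        total += len(data)
    return total
```

The byte count returned by `write_site` gives a direct figure to compare against whatever size limit applies to the hosting, before committing anything to a repo.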