| Paul Winkler wrote:
> It's much too big, I'm afraid. The whole damn thing (jpff's complete
> mail archives from Feb. 1997 to April 1998, html-ized with date, thread,
> and author indices) runs to a whopping 16,666 kilobytes. It's the
> mailbox of the beast. The same stuff in original non-html mailbox
> format is still 8257 kb. There's a total of 3,296 messages. I guess we
> talk a lot.
>
> I could shave a bit of that size off by having the various navigational
> links appear only at the top (or bottom ?) of each message, rather than
> at both top and bottom. But that would be pretty minimal help.
>
> Incidentally, the Linux version of Netscape really can't deal with the
> size of the index pages. It freezes after loading 64% of the date index,
> or 68% of the thread index. The thread index is 382 k, the author and
> date indices are 505 k each. Lynx reads 'em just fine, but nobody cares
> about that, right?
> I also booted into Win95 and tried accessing them via Netscape 3.0...
> reads the indexes okay. But I haven't yet found a good way to copy
> 3,000+ files from one partition to another :)
>
> But even if we limited the archive to e.g. the last six months, it's
> still a good 8 megs or so... anyone have that much free web space
> floating around???
>
> Alternately, offer an amount of space and I'll figure out how much of
> the archive could fit, starting with more recent messages.
>
> The whole HTML archive tarred and compressed with gzip makes for a 2723
> KB file, which might be reasonable to put on an ftp server somewhere, if
> people prefer to download the archive and access it locally.
>
> There is one odd detail about the HTML archive, which I probably won't
> bother to fix unless someone requests it: although there is an author
> index, it is not linked to from each message page. I figure that having
> an author index at all is not that important... what would you use it
> for, unless you perhaps haven't had your daily dosage of =cw4t7abs ?
>
> Well, even if no one else ever sees the whole thing, at least I have a
> nice browsable archive on my local machine now...
>
> best,
>
> PW
>
> >Date: Tue, 12 May 1998 07:48:47 -0400
> >From: Jean Piche
> >Paul Winkler wrote:
> >>
> >> Success!
> >>
> >> I now have managed to create an HTML archive of this mailing list,
> (snip)
> >> Now all we need is someone to give these files a home on the web!
> >>
> >> Any volunteers? Dave T., are you still with us?
>
> >How big is the full package?
I don't think that having them in HTML is as important as having the plain
text messages indexed by either an internal or an external search engine.
Since the overall size of the archive appears to be a problem, it makes
sense to leave the messages in text form to keep the archive small, and
also to break each message out into its own file so that browsers don't
need to load one huge file. That way we could start a search with a few
keywords (plus the archive site name if we're talking about an external
search engine) and get just the messages that we're actually interested in.
Having hyperlinked message threads is nice, but a searchable archive would
suffice.

Anybody know how to get a search engine company to point at an archive?
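
For what it's worth, here is a rough sketch of the one-file-per-message
idea with a simple keyword search on top. It assumes the archive is an
ordinary Unix mbox file and that Python is available; the function names
(split_mbox, search) and the file-naming scheme are made up for
illustration and are not part of any existing archive tool.

```python
import mailbox
from pathlib import Path


def split_mbox(mbox_path, out_dir):
    """Write each message in an mbox file to its own numbered text file.

    Returns the number of messages written. The 00001.txt naming scheme
    is an arbitrary choice for this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(str(mbox_path))
    count = 0
    try:
        for count, msg in enumerate(box, start=1):
            # as_bytes() keeps headers and body without guessing a charset.
            (out / f"{count:05d}.txt").write_bytes(msg.as_bytes())
    finally:
        box.close()
    return count


def search(out_dir, keywords):
    """Return names of message files containing every keyword.

    Matching is a plain case-insensitive substring test over headers and
    body; a real search engine would build an index instead.
    """
    wanted = [k.lower() for k in keywords]
    hits = []
    for path in sorted(Path(out_dir).glob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        if all(k in text for k in wanted):
            hits.append(path.name)
    return hits
```

With a hypothetical mailbox file called list.mbox, calling
split_mbox("list.mbox", "msgs") and then search("msgs", ["netscape",
"index"]) would list the per-message files that mention both words. This
scans every file on each query, which is probably tolerable for a few
thousand messages but is no substitute for a proper index.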
-jsb
|