5 years, 1 month and 18 days on from the last 'snapshot' archive made by Joao (no longer available on 'archive.org', it would appear)... here's another one.
It was made on a linux machine using 'wget', the command being:
wget -k -m -E -p -np -R memberlist.php*,faq.php*,viewtopic.php*p=*,posting.php*,search.php*,ucp.php*,viewonline.php*,*sid*,*view=print*,*start=0* -o log.txt rigorousintuition.ca/board2/
It took several days to complete - mainly due to a bug in wget which means it downloads stuff it's told to ignore before deciding to ignore it, plus not wanting to hammer the board too hard.
It includes everything up to around Jan 10th 2022. As before, it does not contain hotlinked images or videos, but if the link to them still works, they will display as expected from within the archive when viewed in a browser. A lot of cruft was excluded, including anything related to the defunct 'blog' function, print-views of pages and duplicates of pages created from links to forum searches (e.g. '..&hilit=noddy...').
Obviously (or maybe not) the 'search' functions won't work on a local copy, as they're performed by the server backend on the site. Likewise, links within the archived pages to user profiles etc. will bring you back to the 'real' site.
The old 'Ezcode' formatted pages now display properly - in so far as you can consider anything displayed in comic sans to be 'proper'.
Tech note: This was a PITA (Joao never did say how it was achieved last time), but I eventually got there by using 'sed', simply replacing all the instances of '&lt;' and '&gt;' with the appropriate '<' and '>' characters in all of the pages (where the topic number 't=...' was less than 10,000, to speed things up). This does mean that where an 'ezcode' formatted post was quoted in a later ('t=' > 10000) one, it will likely still look ugly.
The sed lines were:
find . -type f -exec sed -i 's/&lt;/</g' {} +
find . -type f -exec sed -i 's/&gt;/>/g' {} +
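The tech note says only topics numbered below 10,000 were processed, whereas the find lines as quoted would touch every file. For anyone wanting to reproduce that restriction, here is a rough sketch. It assumes the escaped characters were the HTML entities '&lt;' and '&gt;', and that wget saved topic pages with 't=<number>' in the file name (something like 'viewtopic.php?f=8&t=1234.html'). The function name 'fix_old_topics' and the exact file naming are my own guesses, not what was actually run.

```shell
#!/bin/sh
# Hypothetical helper: unescape &lt; and &gt; only in pages whose topic
# number is below a cutoff. Assumes file names contain "?t=N" or "&t=N";
# that naming is an assumption about how wget saved the pages.
# Uses GNU sed's in-place flag (-i), as on a typical Linux machine.
fix_old_topics() {
    dir=$1
    cutoff=${2:-10000}
    find "$dir" -type f -name '*t=[0-9]*' | while IFS= read -r file; do
        # Extract the digits after "t=" (must follow ? or &, so "start=" is skipped).
        topic=$(basename "$file" | sed -n 's/.*[?&]t=\([0-9][0-9]*\).*/\1/p')
        [ -n "$topic" ] || continue
        if [ "$topic" -lt "$cutoff" ]; then
            sed -i -e 's/&lt;/</g' -e 's/&gt;/>/g' "$file"
        fi
    done
}
```

Run from the top of the unpacked archive, it would be called as 'fix_old_topics . 10000'. Try it on a copy first, since the edit is done in place.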
As a note, what appears to have caused it was that initially, the posts imported from 'ezboards' into an earlier incarnation of phpbb displayed fine. At some point, phpbb was updated (drew?) and the old imported posts no longer displayed correctly.
The file is only 362 MB in size, and is compressed (zipped) with 7-zip. It expands to 72,374 items, totalling ~3.6 GB (Yeah, 7-zip really is that good).
I intend to upload it to archive.org (their registration sign-up doesn't seem to be working at present), but in the meantime, I've uploaded it to a free filehosting service here:
https://www48.zippyshare.com/v/YPZ9aY3F/file.html
It will remain downloadable for '30 days since the last time it was downloaded', so fill yer boots. I'll update this with a link to archive.org when it's available.
If anyone wants to create one for themselves, you can use the commands above from a linux machine, or for Windows/Mac use a tool like the one Joao recommended in the OP - but set a rate limit (delay) or you'll likely crash the board (MYSQLi errors).
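For the wget route, a delay and bandwidth cap can be added with wget's own '--wait', '--random-wait' and '--limit-rate' options. The sketch below wraps the same options and reject list as the command earlier in this post in a hypothetical function, 'polite_mirror'. The 2 second wait and 200k rate are illustrative guesses, not the settings used to make this archive.

```shell
#!/bin/sh
# Sketch of a rate-limited mirror. The --wait and --limit-rate values are
# illustrative assumptions; tune them to what the board can tolerate.
# --wait=2        pause roughly 2 seconds between requests
# --random-wait   vary that pause so requests are less regular
# --limit-rate    cap download bandwidth
# The reject list is quoted so the shell passes the wildcards to wget untouched.
polite_mirror() {
    url=$1
    wget -k -m -E -p -np \
        --wait=2 --random-wait \
        --limit-rate=200k \
        -R 'memberlist.php*,faq.php*,viewtopic.php*p=*,posting.php*,search.php*,ucp.php*,viewonline.php*,*sid*,*view=print*,*start=0*' \
        -o log.txt \
        "$url"
}
```

It would be called as 'polite_mirror rigorousintuition.ca/board2/'. Expect it to take longer than an unthrottled run, which is the point.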