Download Wikipedia Articles
Technically it is my company's MediaWiki page and not Wikipedia, but the principle is the same.
If you go to the Wikipedia main page and do File -> Save Page As -> Web page complete,
then open the file that is saved, it turns out that none of the css or images have been saved and the page looks rubbish.
Same behaviour in Firefox and IE7.
Any idea why this is happening and how I can get round it?
I can print to PDF or even save as *.mht, but I don't want to do that.
I am also not interested in manually looking for the css files and copying them one by one locally.
Re: Download Wikipedia Articles
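That's because a lot of the CSS files specified in the page's markup are not hard links. By hard links, I mean plain paths to static .css files. Instead they're more 'programmatic' (though that isn't the correct term either): they are pulled in by @import rules, and several of those point at index.php, which generates the CSS on request.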
Have a look at the markup
Code:
<style type="text/css">/*<![CDATA[*/
@import "/w/index.php?title=MediaWiki:Common.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=2678400";
@import "/w/index.php?title=MediaWiki:Monobook.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=2678400";
@import "/w/index.php?title=-&action=raw&gen=css&maxage=2678400";
/*]]>*/</style>
and
Code:
<!--[if lt IE 5.5000]><style type="text/css">@import "/skins-1.5/monobook/IE50Fixes.css?156";</style><![endif]-->
<!--[if IE 5.5000]><style type="text/css">@import "/skins-1.5/monobook/IE55Fixes.css?156";</style><![endif]-->
<!--[if IE 6]><style type="text/css">@import "/skins-1.5/monobook/IE60Fixes.css?156";</style><![endif]-->
<!--[if IE 7]><style type="text/css">@import "/skins-1.5/monobook/IE70Fixes.css?156";</style><![endif]-->
<!--[if lt IE 7]><script type="text/javascript" src="/skins-1.5/common/IEFixes.js?156"></script>
<meta http-equiv="imagetoolbar" content="no" /><![endif]-->
When saving the page, the browser will only save those files that are referenced directly and explicitly.
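To see exactly which stylesheets the saved page is missing, something like the following could list them. It is only a rough sketch: the function name find_css_imports and the wiki.example.com address are made up for illustration, and the regular expression only covers the quoted @import "..." form shown in the markup above.

```python
import re
from urllib.parse import urljoin

# Matches the quoted form used in the markup above: @import "some/url";
# (assumption: the url(...) form of @import is not used on the page)
IMPORT_RE = re.compile(r'@import\s+["\']([^"\']+)["\']')


def find_css_imports(html, base_url):
    """Return the absolute URL of every @import rule found in the page source."""
    return [urljoin(base_url, path) for path in IMPORT_RE.findall(html)]


# Hypothetical usage with a snippet of the markup quoted above.
sample = '''<style type="text/css">/*<![CDATA[*/
@import "/w/index.php?title=-&action=raw&gen=css&maxage=2678400";
/*]]>*/</style>
<!--[if IE 7]><style type="text/css">@import "/skins-1.5/monobook/IE70Fixes.css?156";</style><![endif]-->'''

for url in find_css_imports(sample, "http://wiki.example.com/wiki/Main_Page"):
    print(url)
```

Note that this also picks up the @import rules inside the IE conditional comments, which may or may not be what you want.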
Re: Download Wikipedia Articles
Does this stop you in your tracks for the task you want to perform?
Re: Download Wikipedia Articles
I can view them offline correctly with mht files, but it turns out I can't print them as I would like, online or offline, as there is an onPrint method which takes out the css.
So far the only option I can see is to download the html, then manually follow the links and download them as well, and do a little bit of modification of the code.
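That manual step could probably be scripted. Below is a minimal sketch, not a tested solution for any particular wiki: it replaces each @import rule with the text of the stylesheet it points to, so the saved html carries its own css. The names fetch_text and inline_imports and the wiki.example.com address are hypothetical, UTF-8 is assumed for the downloads, and image paths inside the downloaded css are not rewritten.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

# Matches the quoted form seen in the page markup: @import "some/url";
IMPORT_RE = re.compile(r'@import\s+["\']([^"\']+)["\']\s*;')


def fetch_text(url):
    """Download a URL and decode it as UTF-8 (assumed encoding)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def inline_imports(html, base_url, fetch=fetch_text):
    """Replace each @import rule with the text of the stylesheet it names.

    `fetch` is any function taking a URL and returning text, so the
    download step can be swapped out (for example, to read local copies).
    """
    def replace(match):
        css_url = urljoin(base_url, match.group(1))
        return "/* inlined from %s */\n%s\n" % (css_url, fetch(css_url))

    return IMPORT_RE.sub(replace, html)


# Hypothetical usage (needs network access, so left commented out):
# page_url = "http://wiki.example.com/wiki/Main_Page"
# with open("Main_Page_offline.html", "w", encoding="utf-8") as out:
#     out.write(inline_imports(fetch_text(page_url), page_url))
```

Images would still need to be fetched separately, and this does nothing about the onPrint behaviour.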
The core reason behind it is to print off some info from the wiki to give to someone else to read.
My next step is to look at web site rippers to see if they help.
Any further ideas gratefully received!