Ever had that terrifying feeling you’ve lost your blog? Perhaps your WordPress installation got hacked, or you’ve got to move to a new web host because your old one botched a “database upgrade”. Either way, there’s an almost infinite array of reasons to download and back up a copy of your website, and precisely zero reasons to neglect doing it.
If you’re a Linux user, there are lots of guides out there on how to use WGET, the free network utility to retrieve files from the World Wide Web using HTTP and FTP, but no guides to doing so with Windows. Unless you fancy installing Ubuntu or Crunchbang, here’s a handy guide to downloading your site using WGET in Windows.
Summary: Here’s how to install and use WGET in Windows
- Download WGET.
- Make WGET a command you can run from any directory in Command Prompt.
- Restart the command terminal and test WGET.
- Make a directory to download your site to.
- Use the commands listed in this article to download your site.
Down to the details – get started:
Download and save WGET to your desktop. You can get wget.exe here. I recommend downloading WGET for Windows (win32) from Ugent.be as it’s the most up to date version I could find. For info, you can also get WGET from Brothersoft. Avoid the WGET for Windows download page, because their installer doesn’t work with Windows versions later than Vista!
Make WGET a command you can run from any directory in Command Prompt
If you want to be able to run WGET from any directory inside the command terminal, you’ll need to learn about the path command to work out where to copy your new executable.
First, open a command terminal by selecting “run” in the start menu (if you’re using Windows XP) and typing “cmd”. If you’re running Windows Vista go to “All Programs > Accessories > Command Prompt” from the start bar.
You’ll see something that looks like this:
We’re going to move wget.exe into a Windows directory that will allow WGET to be run from anywhere. First, we need to find out which directory that should be. Type path into the command prompt to list the directories Windows searches for executables.
Thanks to the “Path” environment variable, we know that we need to copy wget.exe to either the C:\Windows\System32 directory or the C:\Windows directory.
Go ahead and copy WGET to either of the directories you see in your Command Terminal.
Restart the command terminal and test WGET
To test that WGET is working properly, restart your terminal and type wget -h. If you’ve copied the file to the right place, you’ll see a help file appear with all of the available commands.
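If you’d rather check this from a script, here’s a rough Python sketch. It assumes you have Python installed, which this guide doesn’t otherwise need. It lists the directories in your Path and reports whether a wget executable can be found in any of them. The function names are made up for illustration.

```python
import os
import shutil


def path_directories():
    """Return the directories listed in the PATH environment variable."""
    raw = os.environ.get("PATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


def find_wget():
    """Return the full path of the wget executable, or None if it is not on PATH."""
    # On Windows, shutil.which also tries the usual extensions such as .exe.
    return shutil.which("wget")


# Print every directory Windows (or any other OS) searches for executables.
for directory in path_directories():
    print(directory)

location = find_wget()
print("wget found at:", location if location else "nowhere on your Path yet")
```

If the last line says wget wasn’t found, the copy step probably put wget.exe somewhere that isn’t on your Path.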
Make a directory to download your site to
Seeing that we’ll be working in Command Prompt, let’s create a download directory just for WGET downloads. (If you’re familiar with Command Prompt basics, just skip this step.) Change to the C: drive and use the md command (mkdir works too) to make a directory.
Change to your new directory and you’re ready to do some downloading!
Use these commands to download your site using WGET
Ok, the fun bit begins. Once you’ve got WGET installed and you’ve created a new directory, all you have to do is learn some of the finer points of WGET arguments to make sure you get what you need.
To mirror your site, execute this command:
wget -r http://www.yoursite.com
To mirror the site and localise all of the URLs, so that links point at your downloaded copies:
wget --convert-links -r http://www.yoursite.com
To mirror the site and save the files as .html:
wget --html-extension -r http://www.yoursite.com
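If you end up running these backups regularly, the flags above can be combined into a single command. Here’s a minimal Python sketch, assuming Python is installed and wget is already on your Path. The names build_wget_command and mirror_site, and the wget-downloads folder, are hypothetical and only there for illustration; they aren’t part of WGET.

```python
import subprocess
from pathlib import Path


def build_wget_command(url, convert_links=True, html_extension=True):
    """Assemble a wget argument list from the flags described above."""
    command = ["wget", "-r"]
    if convert_links:
        command.append("--convert-links")
    if html_extension:
        command.append("--html-extension")
    command.append(url)
    return command


def mirror_site(url, download_dir):
    """Create download_dir if needed, then run wget inside it."""
    target = Path(download_dir)
    target.mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if wget exits with an error code.
    subprocess.run(build_wget_command(url), cwd=str(target), check=True)


# Example call (not run here, because it starts a real download):
# mirror_site("http://www.yoursite.com", "wget-downloads")
```

Building the command as a list rather than one long string means you don’t have to worry about quoting when a URL or folder name contains spaces.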
Stop WGET from being blocked
Some web servers are set up to deny WGET’s default user agent, for obvious bandwidth-saving reasons. You could try changing your user agent to get round this. Try, er, pretending to be Googlebot:
wget --user-agent="Googlebot/2.1 (+http://www.googlebot.com/bot.html)" -r http://www.yoursite.com
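The same user agent trick can be scripted. This is only a sketch, assuming Python is installed; wget_with_user_agent is a hypothetical helper name, and the user agent string is the one from the command above.

```python
def wget_with_user_agent(url, user_agent):
    """Return a wget argument list that sends a custom User-Agent header."""
    # Passing the arguments as a list (for example to subprocess.run) means
    # the spaces and brackets in the user agent need no extra quoting.
    return ["wget", "--user-agent=" + user_agent, "-r", url]


GOOGLEBOT = "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"
command = wget_with_user_agent("http://www.yoursite.com", GOOGLEBOT)
print(command)
```

Keeping the user agent in one variable makes it easy to swap in a different string if your host blocks this one as well.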
And finally, here’s WGET downloading my website:
On that last note, lots of hosting companies block WGET. Mine included! It took me a while to be able to back my own site up, but now I feel pretty safe that I have backups of the database, the plugins, the images and even the HTML of the site itself.