How to Download Your Website Using WGET for Windows

Ever had that terrifying feeling you’ve lost your blog? Perhaps your WordPress installation got hacked, or your web host royally screwed up a “database upgrade”. Either way, there’s an almost infinite array of reasons to download and back up a copy of your website, and precisely zero reasons to neglect doing it.

If you’re a Linux user, there are lots of guides out there on how to use WGET – the free network utility that retrieves files from the World Wide Web over HTTP and FTP – but precious few for Windows. Unless you fancy installing Ubuntu or CrunchBang, here’s a handy guide to downloading your site using WGET on Windows.

1) Download WGET

Download and save WGET to your desktop. I recommend downloading WGET for Windows (win32) from Ugent.be, as it’s the most up-to-date version I could find. For reference, you can also get WGET from Brothersoft, but avoid the WGET for Windows download page, because their installer doesn’t work with Windows Vista.

2) Make WGET a command you can run from anywhere in the Command Prompt

If you want to be able to run WGET from any directory in the Command Prompt, you’ll need to use the path command to work out where to copy your new executable.

First, open a command terminal: on Windows XP, select “Run” from the Start menu and type “cmd”. On Windows Vista, go to “All Programs > Accessories > Command Prompt” from the Start menu. You’ll see something that looks like this:

[Screenshot: the Command Prompt window]

We’re going to move wget.exe into a Windows directory that will allow WGET to be run from anywhere. First, we need to find out which directory that should be. Type path into the command prompt to find out:

[Screenshot: output of the path command]

Thanks to the “Path” environment variable, we know that we need to copy wget.exe to either the C:\Windows\System32 directory or the C:\Windows directory. Go ahead and copy WGET to either of the directories you see in your Command Terminal.
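For example, assuming you saved wget.exe to your desktop as in step 1, the copy looks something like this (on Vista you’ll need to open the Command Prompt as administrator, or the write to System32 will be refused):

copy "%USERPROFILE%\Desktop\wget.exe" C:\Windows\System32\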

3) Restart terminal and test WGET

If you want to test WGET is working properly, restart your terminal and type:

wget -h

If you’ve copied the file to the right place, you’ll see a help screen listing all of the available options.
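The top of that help output should look something like this (the exact version number depends on which build you downloaded):

GNU Wget 1.11.4, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...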

4) Make a directory to download your site to

Seeing that we’ll be working in the Command Prompt, let’s create a download directory just for WGET downloads. (If you’re familiar with Command Prompt basics, just skip this step.) Change to the C: drive and use md (make directory) to create a directory:

[Screenshot: creating a directory with md]

Change into your new directory (cd site-download) and you’re ready to do some downloading!
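In full, the step is just a few commands – the name site-download is only an example, so call the directory whatever you like:

c:
cd \
md site-download
cd site-download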

5) Download your site using WGET

OK, the fun bit begins. Once you’ve got WGET installed and you’ve created a new directory, all you have to do is learn some of the finer points of WGET arguments to make sure you get what you need.

I found two particularly useful resources for WGET usage: the GNU.org WGET manual and About.com’s Linux WGET guide are definitely the best.

After some research, I came up with a set of WGET arguments to recursively mirror your site, download all the images, CSS and JavaScript, localise all of the URLs (so the site works on your local machine), and save all the pages as .html files.

To mirror your site:

wget -r http://www.yoursite.com

To mirror the site and localise all of the urls:

wget --convert-links -r http://www.yoursite.com

To mirror the site and save the files as .html:

wget --html-extension -r http://www.yoursite.com
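You can also combine all three flags into a single command. Adding -p (--page-requisites) on top is a good idea too, as it tells WGET to fetch every image, stylesheet and script a page needs to display properly, even ones the recursion wouldn’t otherwise pick up:

wget -r --convert-links --html-extension -p http://www.yoursite.com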

6) Is your WGETting getting you blocked?

See what I did there? Some web servers are set up to deny WGET’s default user agent – for obvious, bandwidth-saving reasons. You could try changing your user agent to get round this. Try, er, pretending to be Googlebot:

wget --user-agent="Googlebot/2.1 (+http://www.googlebot.com/bot.html)" -r http://www.yoursite.com

And finally, here’s WGET downloading my website:

[Screenshot: WGET downloading my website]

On that last note, lots of hosting companies block WGET – mine included! It took me a while to be able to back my own site up, but now I feel pretty safe knowing I have backups of the database, the plugins, the images and even the HTML of the site itself. Happy WGETting! :-)




32 thoughts on “How to Download Your Website Using WGET for Windows”

  1. Rarst says:

    That’s quite a twist on site backup. :) I use a less inventive way – Cobian Backup to back up folders from my FTP account. It fetches only the needed folders (theme, plugins, images).

  2. Matthew says:

    Nice article – wget is a very useful tool to be aware of. You might also like Unison.

  3. Nifty technique and unique guide here, Richard. Way to contribute to the community.

    2 points:
    1) “Recursively mirror your site” – huh? Try that again in English, please, Mr. Englishman ;P
    2) How do Mac users do this?

  4. Hey Gab – OK, the -r flag is the mirror command: recursively follow all links. As for the Mac – no idea, dude! I keep meaning to get hold of a Mac to learn how. If I come across the answer, I’ll post it here. Thanks for dropping by!

  5. Dan says:

    This works for me: wget -e robots=off -E -r -k -l inf -p --restrict-file-names=windows -H -K -D [Your Blog Name].wordpress.com,[Your Blog Name].files.wordpress.com --random-wait http://[Your Blog Name].wordpress.com

  6. Owen English says:

    Hey, can anyone help? Trying to download files using wget v1.10.2 from the command prompt gives this (filenames blanked for commercial reasons):

    --2009-04-07 07:53:52-- http://www.medistat-software.net/*******/******.***
    Resolving http://www.medistat-software.net... seconds 0.00, 77.92.81.1
    Caching http://www.medistat-software.net => 77.92.81.1
    Connecting to http://www.medistat-software.net|77.92.81.1|:80... seconds 0.00, Closed fd 1936
    failed: Connection timed out.
    Releasing 0x009259d8 (new refcount 1).
    Retrying.

    This actually works from about 75% of my clients but the other 25% get this error.

    Help – what does it mean? What is ‘Closed fd 1936’?

    Cheers Owen E

  7. Ivan says:

    Hello,

    Very nice article indeed. However, it does not help me download my site – or any site. I copied and pasted all the commands here with the same result: only the index file and a js file (http://jscook.sourceforge.net/JSCookMenu/). Why can't I download the site?

    Please help.

  8. I've found that some sites don't respond correctly unless you add a user agent to the request. Have you tried that?

  9. Ivan says:

    How do I add a user agent, and what is it?

    I cannot download my site with wget; WinHTTrack did it. Other sites can be downloaded with wget, but mine, nada.

    Here is the URL: http://www.all-e-services.com (still in development phase)

    Thank you!

  10. No problem Ivan,

    I suggest you follow point 6 in the post, as it may be that your web host is blocking WGET's standard user agent.

    Good luck!

  11. Darius says:

    The suggested wget download website is corrupt:

    http://users.ugent.be/~bpuype/wget/#download

    It installs hidden under Vista – no apps, no interface.

    Can you check it under Vista?

  12. randomstranger says:

    Thanks Dan, that totally worked!

  13. Manali says:

    It helped me in every way… the details are informative…
    Thanks!

  14. thank says:

    I’m saving a site that’s about to be deleted as I type!

    Thanks for sharing.

  15. Cameron Fraser says:

    The site I’m trying to archive has “?” in the links, and wget saves the files on Windows replacing the “?” with “@” – but it leaves the links with “?”, so the links don’t match the filenames. Also, “@” is problematic, since it looks like an email link to the browser. Any workaround for this? Thanks.

  16. HP Bryce says:

    This looks like a great 11th-hour plan B option. I am sure changing the “?” is easier than starting over! However, if there is a way to get rid of the “?”, that would be great.

  17. ashish says:

    Thanks. wget helped me save the site that I wanted.

  18. php ide says:

    Great tutorial. A problem I have is downloading a page that auto-redirects to the home page, so wget only downloads the home page. Can I set wget not to follow redirects?

  19. Matthijn says:

    Mac users can just install wget through something like MacPorts.

  20. Anonymous says:

    This is the first guide on wget I have read that is fairly up-to-date and actually helpful. Thank you.

    Everything is working fine for me, but I was wondering: is it possible to save directory/folder names as html files?

    Example:

    http://www.sitename.com/directory-name-1/ would be saved as /site-name/directory-name-1/index.html

    Is there no way to save as /site-name/directory-name-1.html ?

    Any help would be greatly appreciated!

  21. hoowha says:

    Goooooooooooooooooooooooooooooooooooooooood!
    The --user-agent option is what I was looking for.
    Thanks!

  22. Arrh… my hosting company blocks WGET. I can’t try this awesome tool!

  23. Thanks Richard,

    Between your steps and the other comments, I finally got my problem worked out with downloading images from an external site.

    For Windows users, though – you might find it easier with WinHTTrack.

  24. Evan says:

    Awesome! Thanks!

  25. Pferdeklinik says:

    Better late than never. Having recovered from a hack, I’ve been looking for ways to protect my site and back it up better in future.

    Thanks for the great info…

  26. BaZz says:

    Thank you for this useful article.

    Please note: the two versions of wget you link to differ in their command-line options!

  27. PC says:

    Buy a Windows computer.

  28. bluerika08 says:

    thank you so much for the instructions!!!!! really! :D

  29. Adam says:

    Why use wget and not just FTP your site locally for backup?

  30. Kellie says:

    Terrific instructions, very easy to follow.

    worked like a charm :D

  31. Helper says:

    this is a fake page, plz don’t download wget, it’s a hacking VIRUS. if you click the file it doesn’t show anything… I SAY IT’S A VIRUS

  32. Nobody here seems to know about WHTT, a free site grabber. (Possibly the omission is because it’s Windows? – I don’t know.) Once you work out how it operates, you can grab assorted sites from around the Internet for future reference. This includes forums – I have a copy of the nukelies forum, which I uploaded for reference – but any forum can be downloaded, though ones with millions of postings are best avoided, partly because of the size, partly the low quality.
