
How to Download an Entire Website with wget

Ahh, the power of wget. It's probably my favorite shell command of all. One of the many cool things you can do with it is download an entire website (images, CSS and all) right to your local computer. And it's pretty easy: open a terminal (in my case, Terminal on the Mac), navigate to where you want the downloaded files to go, and run a wget command something like this:

$ wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains thewebsite.com --no-parent www.thewebsite.com

Note: $ is the prompt, so you wouldn't type it.
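
If you mirror sites often, the command above can be wrapped in a small shell function. This is only a sketch: the function name mirror_site and the DRY_RUN variable are made up for illustration and are not part of wget. It assumes the site is reachable at www. plus the domain you pass in, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper, not part of wget: wraps the mirroring command above.
# Usage: mirror_site thewebsite.com
# If DRY_RUN is set to a non-empty value, the command is printed, not run.
mirror_site() {
    domain="${1:-}"
    if [ -z "$domain" ]; then
        echo "usage: mirror_site <domain>" >&2
        return 1
    fi

    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty,
    # which turns the wget call into a harmless print.
    ${DRY_RUN:+echo} wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains "$domain" \
        --no-parent \
        "www.$domain"
}
```

Running DRY_RUN=1 mirror_site thewebsite.com prints the full command so you can check it first; mirror_site thewebsite.com on its own starts the download.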

There are several options that you may find useful when downloading a website with wget. They are:

  • --recursive (tells wget to follow links and download the files and folders it finds, by default up to five levels deep)
  • --page-requisites (tells wget to include everything a page needs to display properly, such as images, CSS, and other files)
  • --domains thewebsite.com (a useful option that tells wget not to follow links outside of thewebsite.com)
  • --no-parent (another useful option that tells wget never to climb above the directory you started from, so only that directory and everything below it is downloaded)
  • --html-extension (saves pages with an .html extension so they are easy to open on your local machine; newer versions of wget call this option --adjust-extension)
  • --convert-links (another great option; it tells wget to convert the links in the downloaded pages so that they work on your local machine)
  • --no-clobber (tells wget not to overwrite files that already exist)
  • --restrict-file-names=windows (makes sure the file names are compatible with Windows, which solves the problem of characters such as ? in URLs)
  • --no-host-directories (disables the creation of host-prefixed directories)
  • --no-directories (doesn't create a hierarchical directory structure, but dumps everything in the current directory. Happily, it doesn't overwrite files with the same name, but appends .n to the name)
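
The last two options only change where files land on disk, which is easier to see side by side. The sketch below is illustrative only: the function name wget_layout_args and the layout names tree, no-host, and flat are invented here, and the example paths assume a file at www.thewebsite.com/img/logo.png.

```shell
#!/bin/sh
# Hypothetical helper, not part of wget: print the wget option that
# produces a given on-disk layout.
#   tree    -> default layout:          ./www.thewebsite.com/img/logo.png
#   no-host -> drop the host directory: ./img/logo.png
#   flat    -> no directories at all:   ./logo.png
wget_layout_args() {
    case "${1:-}" in
        tree)    echo "" ;;
        no-host) echo "--no-host-directories" ;;
        flat)    echo "--no-directories" ;;
        *)       echo "unknown layout: ${1:-}" >&2; return 1 ;;
    esac
}
```

It could then be dropped into a command such as wget --recursive --page-requisites $(wget_layout_args flat) www.thewebsite.com.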

Where to Get wget

wget doesn't come installed on a Mac, so you need to download it. For Mac, get it here: http://download.cnet.com/Wget/3000-18506_4-128268.html For Windows, look here: http://gnuwin32.sourceforge.net/packages/wget.htm
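
Before downloading anything, it can be worth checking whether wget is already on your machine. A minimal sketch, where have_tool is a made-up helper name and command -v is the standard shell way to look a program up on your PATH:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named program is on PATH.
# Defaults to checking for wget when no name is given.
have_tool() {
    command -v "${1:-wget}" >/dev/null 2>&1
}

if have_tool wget; then
    # Show the first line of the version banner
    wget --version | head -n 1
else
    echo "wget not found on PATH; install it from one of the links above"
fi
```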

List of wget Commands

wget has a ton of them; see http://linux.about.com/od/commands/l/blcmdl1_wget.htm for the full list. Running wget --help in your terminal also prints them.
