Latest revision as of 12:59, 11 August 2016

wget

wget is a command to mirror/crawl the web.

A commonly used command grabs everything reachable from the specified URL (-r turns on recursion, and -l 0 removes the depth limit):

wget -r -l 0 http://www.example.com/

For the simple but recurrent and possibly annoying problem of downloading the attachments linked from a page, use -H to let wget follow links to other hosts and --domains to restrict it to the hosts you are interested in, should the files be on a different host, while still avoiding mirroring the whole internet. For example, the following downloads Guillemin's conferences from a page that also links to www.rts.ch, which you don't want to touch:

wget --recursive --level=10 --convert-links -H --domains=www.byblios.fr http://www.ecosynchro.org/henri-guillemin-mp3.php
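A possible variant, as a sketch only: the helper below prints a similar command so it can be checked before anything is downloaded. The function name build_wget_cmd is made up for this page, and the added --accept=mp3 (keep only .mp3 files) and --wait=1 (pause one second between requests) assume that the attachments are MP3 files and that a slower crawl is acceptable.

```shell
# Hypothetical helper (not part of wget): prints the command instead of running it.
# $1 = start page, $2 = the only extra host wget may follow links to.
build_wget_cmd() {
    start_url="$1"
    allowed_domain="$2"
    # --span-hosts is the long form of -H; --accept and --wait are assumptions
    # added for illustration (MP3 attachments, one-second pause between requests).
    printf 'wget --recursive --level=10 --span-hosts --domains=%s --accept=mp3 --wait=1 %s\n' \
        "$allowed_domain" "$start_url"
}

# Print the command for the page used above; nothing is downloaded here.
build_wget_cmd http://www.ecosynchro.org/henri-guillemin-mp3.php www.byblios.fr
```

Copy the printed line into the shell once it looks right. With --accept, wget still fetches the HTML pages it needs in order to find links, then deletes them and keeps only the matching files.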