Hacker Public Radio

HPR4293: HTTrack website copier software



This show has been flagged as Clean by the host.

The Wayback Machine by The Internet Archive is a very good resource for web sites that no longer exist, or for older revisions of sites that still do.


However, I have sometimes found it useful to have my own copy of a web site. It means I have control over the copy, it can be accessed offline, and there is no world wide wait for the page to load.


My most typical use case is for web sites that I manage myself. For one reason or another, I want to keep a snapshot of the site. I have also used it for fact-based sites that I always want to have access to, like a reference book. One of my recent use cases was a magazine that has closed down and announced that its web site will also soon be terminated. Although it is available in the Wayback Machine, I wanted to have a copy myself for a short period of time.


The software I use for this is HTTrack. It is available for Windows, Android, Linux and other Unix-like systems, and on at least some platforms it comes with a graphical user interface. I have only used HTTrack from the terminal on Linux. HTTrack is free and open source software.


In its simplest form, you just type "httrack" followed by the URL of the start page of the site to be copied.
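As a minimal sketch of that simplest invocation, the Python below builds the same command line and runs it only if HTTrack is installed. The URL and the folder name are placeholders, the helper names are hypothetical, and the optional `-O` (output path) argument is an addition that the basic usage does not require.

```python
import shutil
import subprocess


def build_httrack_command(start_url, output_dir=None):
    """Return the argument list for a basic HTTrack run."""
    command = ["httrack", start_url]
    if output_dir is not None:
        # -O tells HTTrack where to write the mirrored copy.
        command += ["-O", output_dir]
    return command


def mirror_site(start_url, output_dir=None):
    """Run HTTrack if it is on the PATH.

    Returns the process exit code, or None when HTTrack is not installed.
    """
    if shutil.which("httrack") is None:
        return None
    return subprocess.run(build_httrack_command(start_url, output_dir)).returncode


if __name__ == "__main__":
    # Placeholder URL and folder, for illustration only.
    print(build_httrack_command("https://www.example.com/", "./example-mirror"))
```

Keeping the command construction separate from the actual run makes it easy to check what would be executed before starting a long copy.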


In many cases this works well and I get a perfect copy. In other cases, it works less well. First of all, of course, I do not copy very big web sites, both because of the time it takes and because of the disk space. What is stated in the site's robots.txt file can also matter for the result. Another issue can be the folder structure of the site: in its default setup, HTTrack may not find all folders, for example the ones where images are stored. I have also had cases where menus and links in the copy do not work normally, and I have to right-click to open the link instead.
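The issues above (robots.txt rules, images stored outside the copied folders, and total size) each seem to have a matching HTTrack option. The sketch below assumes the option letters as I understand them from HTTrack's built-in help: `-s0` to not follow robots.txt, `-n` to also fetch non-HTML files "near" a page, and `-M` followed by a byte count to cap the overall size. Treat these as assumptions and confirm them with `httrack --help` for your version; the function name and the example values are hypothetical.

```python
def build_tuned_command(start_url, output_dir,
                        ignore_robots=False,
                        fetch_near_files=False,
                        max_total_bytes=None):
    """Build an HTTrack argument list with a few optional adjustments."""
    command = ["httrack", start_url, "-O", output_dir]
    if ignore_robots:
        # Assumed option: -s0 means "never follow robots.txt".
        # Only sensible for sites you manage yourself.
        command.append("-s0")
    if fetch_near_files:
        # Assumed option: -n also fetches files (for example images)
        # that are linked from a page but stored outside its folder.
        command.append("-n")
    if max_total_bytes is not None:
        # Assumed option: -M<N> stops after N bytes in total.
        command.append(f"-M{max_total_bytes}")
    return command


if __name__ == "__main__":
    # Placeholder URL, folder and size limit, for illustration only.
    print(build_tuned_command("https://www.example.com/", "./example-mirror",
                              fetch_near_files=True,
                              max_total_bytes=500_000_000))
```

Each adjustment is opt-in, so calling the function with only a URL and a folder produces the plain default run.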


The HTTrack web site has quite a lot of documentation, and it also has a forum. In the terminal, there is also good help covering all the additional options. For my usage, I have generally found that the simple first attempt gives a perfect or good enough result directly, without any need to research the details.


So, when I want to preserve a snapshot of earlier releases of my own sites, or when I want to have an offline and preserved copy of an important site, I consider HTTrack to be an easy to use and yet powerful tool. I am aware that other similar tools exist, but this is the one I currently use.


HTTrack website copier website:
https://www.httrack.com/

