MESSAGE
DATE | 2008-09-30 |
FROM | From: "Paul Robert Marino"
|
SUBJECT | Re: [NYLXS - HANGOUT] wget & mirrors
|
There is a program called w3m that does web site mirroring, although I've never had a need to use it, so I couldn't tell you how well it works.
-----Original Message-----
From: Matthew
Subj: Re: [NYLXS - HANGOUT] wget & mirrors
Date: Tue Sep 30, 2008 12:15 pm
Size: 960 bytes
To: hangout-at-mrbrklyn.com
The biggest question is: who owns the website? If you do, then copy the database behind it.
If it is not your site, then wget with a bit of Perl might do the job. To do a decent job, it really has to be worked out site by site. For this type of work you should use a Swiss Army chainsaw, a.k.a. Perl, or similar.
Matthew

On Mon, Sep 29, 2008 at 11:55:01PM -0400, email wrote:
> Hi,
>
> I'm curious if anyone has experience using wget to mirror a website? I
> have some but limited.
>
> More specifically, I am looking to really archive a site. For example,
> if I'm mirroring a blog, I would want it to update (add) new pages,
> update pages where the post is updated, plus, and here's the caveat, not
> delete pages that are deleted, and not delete text that is deleted from
> posts.
>
> So it would be more of an archival system than a mirror. Not sure if
> wget is the tool for this but I'd love to hear any input/opinions from
> anyone out there.
>
> Thx
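
For what it's worth, the "archive, don't mirror" behaviour asked about in the original question (add new pages, keep updated pages, never delete anything) could be sketched roughly as below. This is only a minimal sketch in Python rather than Perl, and not something anyone in the thread posted. The function names (page_dir, store_version, archive_url) and the on-disk layout are made up for illustration. It assumes you already have a list of page URLs from somewhere (a wget crawl, a sitemap, a feed), and it saves every distinct version of a page side by side instead of overwriting.

```python
import hashlib
import time
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def page_dir(archive_root: Path, url: str) -> Path:
    """Map a URL to the directory holding every saved version of that page.

    Hypothetical layout: <archive_root>/<host>/<hash of path and query>/
    """
    parts = urlparse(url)
    # Hash the path and query so odd characters never reach the filesystem.
    key = hashlib.sha256((parts.path + "?" + parts.query).encode()).hexdigest()[:16]
    return archive_root / parts.netloc / key


def store_version(archive_root: Path, url: str, content: bytes) -> bool:
    """Save content as a new version unless an identical copy is already stored.

    Returns True if a new file was written. Existing files are never
    overwritten or deleted, so text removed from a post survives in the
    older versions.
    """
    target = page_dir(archive_root, url)
    target.mkdir(parents=True, exist_ok=True)
    digest = hashlib.sha256(content).hexdigest()
    # An identical version already on disk means nothing changed.
    if any(p.name.endswith(digest + ".html") for p in target.iterdir()):
        return False
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime())
    (target / f"{stamp}-{digest}.html").write_bytes(content)
    return True


def archive_url(archive_root: Path, url: str) -> bool:
    """Fetch one page and archive it. A failed fetch (e.g. the page was
    deleted upstream) raises an error and leaves the stored versions alone."""
    with urllib.request.urlopen(url) as response:
        return store_version(archive_root, url, response.read())
```

Run from cron, calling archive_url(Path("archive"), url) for each known URL would pile up timestamped versions over time. One limitation of this sketch: if a page is edited and later reverted to an exact earlier version, the revert is not recorded as a separate file, because that content is already stored.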