MESSAGE
DATE | 2008-09-30 |
FROM | Matthew
|
SUBJECT | Re: [NYLXS - HANGOUT] wget & mirrors
|
The biggest question is: who owns the website? If you do, then copy the database behind it.
If it is not your site, then wget with a bit of Perl might do the job. Doing a decent job really has to be worked out site by site. For this type of work you should use a Swiss Army chainsaw, a.k.a. Perl, or something similar.
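As a rough illustration of the "archive, don't mirror" part, here is a minimal sketch of the storage step only, written in Python rather than Perl purely for the example. It assumes the pages have already been fetched (by wget or anything else) and are handed in as bytes; the function name archive_snapshot, the directory layout, and the file naming are all made up for this sketch. Each page gets its own directory, a new numbered copy is written only when the content differs from the latest stored copy, and nothing is ever deleted.

```python
import hashlib
from pathlib import Path


def archive_snapshot(archive_dir, page_name, content):
    """Store `content` as a new version of `page_name` unless it is unchanged.

    Returns the path of the newly written version, or None if the content
    matches the most recent stored version. Old versions are never removed.
    (Hypothetical helper: names and layout are assumptions for illustration.)
    """
    page_dir = Path(archive_dir) / page_name
    page_dir.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256(content).hexdigest()

    # Version files are named "<sequence>_<sha256>", so sorting by name
    # gives them in the order they were archived.
    versions = sorted(page_dir.iterdir())
    if versions and versions[-1].name.split("_")[1] == digest:
        return None  # same as the latest copy, nothing to add

    target = page_dir / f"{len(versions):06d}_{digest}"
    target.write_bytes(content)
    return target
```

Called once per fetched page on every run, this keeps each edit of a post as a separate file, and a page that disappears from the site simply stops getting new versions while its old ones stay put.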
Matthew

On Mon, Sep 29, 2008 at 11:55:01PM -0400, email wrote:
> Hi,
>
> I'm curious if anyone has experience using wget to mirror a website? I
> have some but limited.
>
> More specifically, I am looking to really archive a site. For example,
> if I'm mirroring a blog, I would want it to update (add) new pages,
> update pages where the post is updated, plus, and heres the caveat, not
> delete pages that are deleted, and not delete text that is deleted from
> posts.
>
> So it would be more of an archival system than a mirror. Not sure if
> wget is the tool for this but I'd love to hear any input/opinions from
> anyone out there.
>
> Thx
|
|