MESSAGE
DATE | 2004-11-24
FROM | Billy
SUBJECT | Re: [hangout] C/C++ coding problem.
On Tue, Nov 23, 2004 at 03:06:51AM -0500, Ruben Safir Secretary NYLXS wrote:
> On Tue, Nov 23, 2004 at 02:02:17AM -0500, swd wrote:
> >
> > C/C++ Coders,
> > I need to code a function that retrieves the HTML source code
> > from a web site. I want to be able to do this from a command
> > prompt. How the heck do I do this? Thanks for the help.
>
> And what is wrong with wget or LWP?
Well, he did say he needed a FUNCTION.
LWP is libwww-perl. There's also a libwww (for C), published by the W3C, which (eventually) does what he wants:
http://www.w3.org/Library/src/
But libcurl appears to be far superior:
http://curl.netmirror.org/libcurl/
Available in Debian via: apt-get install libcurl-dev
> Otherwise you need to open sockets and read the httpd protocols and
> follow them.
Sheesh! Do you think Perl is the only language with libraries?
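For comparison, here is a rough sketch of what the "open sockets and follow the protocol" route involves. It is not from the original exchange: build_get_request and http_get are hypothetical helper names, and the code assumes a POSIX system, plain HTTP on port 80, and an HTTP/1.0 request so the server closes the connection when it is done.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <netdb.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: format a minimal HTTP/1.0 GET request into buf.
 * Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t buflen,
                      const char *host, const char *path)
{
    int n = snprintf(buf, buflen,
                     "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n", path, host);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}

/* Hypothetical helper: connect to host on port 80, send the request,
 * and copy the raw response (status line and headers included) to out.
 * Returns 0 on success, -1 on any failure. */
int http_get(const char *host, const char *path, FILE *out)
{
    struct addrinfo hints, *res, *ai;
    char buf[4096];
    int fd = -1, len;
    ssize_t n;

    len = build_get_request(buf, sizeof buf, host, path);
    if (len < 0)
        return -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */
    if (getaddrinfo(host, "80", &hints, &res) != 0)
        return -1;

    /* try each resolved address until one connects */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0)
            break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    if (fd < 0)
        return -1;

    /* send the whole request, allowing for partial writes */
    for (int off = 0; off < len; ) {
        n = send(fd, buf + off, (size_t)(len - off), 0);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        off += (int)n;
    }

    /* read until the server closes the connection */
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0)
        fwrite(buf, 1, (size_t)n, out);

    close(fd);
    return n < 0 ? -1 : 0;
}
```

Calling http_get("www.nylxs.com", "/", stdout) would print the raw response, headers and all. This sketch does nothing about redirects, HTTPS, chunked encoding, or error statuses, which is exactly the work a library takes off your hands.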
> Unless this is a homework assignment, it's not worth it.
Sometimes, it's worth it. C can go many places Perl scripts and shell scripts cannot, and compiled C programs are much easier to distribute, because they can be linked for minimal host environmental dependency.
Also:
easiest solution to implement != easiest solution to support.
You have forced my hand, sir.. I present a poor-man's lwp-request, which compiles down to a 6k executable on my Debian system.
Most of the code is error checking.. if you didn't care about that, you could probably do this in 10 lines and just dump the URL to stdout.
===== curlbilly.c =====
#include <stdio.h>
#include <stdlib.h>
#include <curl/curl.h>

/* Fetch url and write the response body to fp.
 * Returns 0 (CURLE_OK) on success, a CURLcode on failure,
 * or -1 if the curl handle could not be created. */
int download(char *url, FILE *fp)
{
    CURLcode res;
    CURL *curl;
    char curlerrbuf[CURL_ERROR_SIZE];

    curl = curl_easy_init();
    /* check the handle before setting any options on it */
    if (!curl) {
        fprintf(stderr, "curl initialization error\n");
        return -1;
    }
    curlerrbuf[0] = '\0';
    curl_easy_setopt(curl, CURLOPT_ERRORBUFFER, curlerrbuf);
    curl_easy_setopt(curl, CURLOPT_FAILONERROR, 1L);    /* HTTP status >= 400 is a failure */
    curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1L); /* follow redirects */
    curl_easy_setopt(curl, CURLOPT_URL, url);
    curl_easy_setopt(curl, CURLOPT_FILE, fp);           /* newer name: CURLOPT_WRITEDATA */

    res = curl_easy_perform(curl);
    if (res) {
        fprintf(stderr, "download failure:\n%s\n", curlerrbuf);
    }
    curl_easy_cleanup(curl);
    return res;
}

int main(int argc, char **argv)
{
    char *url;
    char *ofn;
    FILE *ofile;
    int rc;

    if (argc != 3) {
        fprintf(stderr, "Usage:\n\n\t%s url outfile\n\n", argv[0]);
        exit(1);
    }
    curl_global_init(CURL_GLOBAL_ALL);
    url = argv[1];
    ofn = argv[2];
    if (!(ofile = fopen(ofn, "w"))) {
        fprintf(stderr, "Error: couldn't open '%s' for writing:\n", ofn);
        perror("");
        exit(1);
    }
    fprintf(stderr, "getting url: '%s' as file '%s'\n", url, ofn);
    rc = download(url, ofile);
    fclose(ofile);
    curl_global_cleanup();
    return rc ? 1 : 0;  /* nonzero exit status if the download failed */
}
===== Makefile =====
CC=`curl-config --cc`
CFLAGS=`curl-config --cflags`
LDFLAGS=`curl-config --libs`
TESTURL="http://www.nylxs.com"
all: curlbilly
test: nylxs-index.html
nylxs-index.html: curlbilly
	./curlbilly $(TESTURL) nylxs-index.html
____________________________
NYLXS: New Yorker Free Software Users Scene
Fair Use - because it's either fair use or useless....
NYLXS is a trademark of NYLXS, Inc