One approach, if you have the lynx text-based HTML browser installed on your system, is this:

    $html_code = `lynx -source $url`;
    $text_data = `lynx -dump $url`;

The libwww-perl (LWP) modules from CPAN provide a more powerful way to do this. They don't require lynx, but like lynx, can still work through proxies:

    # simplest version
    use LWP::Simple;
    $content = get($URL);

    # or print HTML from a URL
    use LWP::Simple;
    getprint "http://www.linpro.no/lwp/";

    # or print ASCII from HTML from a URL
    # also needs HTML::TreeBuilder (HTML-Tree) and HTML::FormatText from CPAN
    use LWP::Simple;
    use HTML::TreeBuilder;
    use HTML::FormatText;
    my ($html, $ascii);
    $html = get("http://www.perl.com/");
    defined $html
        or die "Can't fetch HTML from http://www.perl.com/";
    # build a parse tree from the page, then render it as plain text
    $ascii = HTML::FormatText->new->format(HTML::TreeBuilder->new_from_content($html));
    print $ascii;
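
LWP::Simple's get() returns only undef on failure, so it cannot say why a fetch failed. If you need the HTTP status, a timeout, or explicit proxy handling, one possible sketch uses LWP::UserAgent from the same libwww-perl distribution. The helper name fetch_html and the 10-second timeout are assumptions made for this illustration, not part of LWP:

```perl
use strict;
use warnings;
use LWP::UserAgent;

# fetch_html is a hypothetical helper name chosen for this sketch.
# It returns the decoded body of $url, or dies with the status line.
sub fetch_html {
    my ($url) = @_;

    # the 10-second timeout is an arbitrary value for illustration
    my $ua = LWP::UserAgent->new(timeout => 10);

    # pick up proxy settings such as http_proxy and no_proxy
    # from the environment
    $ua->env_proxy;

    my $response = $ua->get($url);
    die "Can't fetch $url: ", $response->status_line, "\n"
        unless $response->is_success;

    return $response->decoded_content;
}

# fetch the URL given on the command line, if there is one
print fetch_html($ARGV[0]) if @ARGV;
```

Saved under a name of your choosing and run with a URL as its only argument, this prints the page source, or dies with a message that includes the HTTP status line.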