One approach, if you have the lynx text-based HTML browser installed on your system, is this:

    $html_code = `lynx -source $url`;
    $text_data = `lynx -dump $url`;

The libwww-perl (LWP) modules from CPAN provide a more powerful way
to do this. They don't require lynx, but like lynx, can still work
through proxies:

    # simplest version
    use LWP::Simple;
    $content = get($URL);

    # or print HTML from a URL
    use LWP::Simple;
    getprint "http://www.linpro.no/lwp/";

    # or print ASCII from HTML from a URL
    # also need HTML-Tree package from CPAN
    use LWP::Simple;
    use HTML::Parse;        # exports parse_html (part of HTML-Tree)
    use HTML::FormatText;
    my ($html, $ascii);
    $html = get("http://www.perl.com/");
    defined $html
        or die "Can't fetch HTML from http://www.perl.com/";
    $ascii = HTML::FormatText->new->format(parse_html($html));
    print $ascii;
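
The LWP::Simple calls above only tell you whether a fetch failed, not why.
If you need the failure reason, or want proxy settings taken from the
environment, a minimal sketch using LWP::UserAgent might look like the
following. The helper name fetch_page and the 30-second timeout are
assumptions made for illustration; they are not part of LWP.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# fetch_page is a hypothetical helper name, not part of LWP.
# It returns the decoded body of $url, or dies with the status line.
sub fetch_page {
    my ($url) = @_;

    my $ua = LWP::UserAgent->new(
        env_proxy => 1,     # read http_proxy, no_proxy, etc. from %ENV
        timeout   => 30,    # seconds; an arbitrary choice for this sketch
    );

    my $response = $ua->get($url);
    $response->is_success
        or die "Can't fetch $url: ", $response->status_line, "\n";

    return $response->decoded_content;
}

# Fetch the URL given on the command line, if there is one.
print fetch_page($ARGV[0]) if @ARGV;
```

Saved under a name of your choosing and run with a URL as its only
argument, this prints the page source or dies with the status line
reported by LWP.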