    open(my $fh, 'filename.txt');

where, in this example, $fh is the filehandle associated with filename.txt. A file opened this way is in read mode; more generally, one can specify the mode explicitly, as in
    open(my $rfh, '<in.txt');   # open in read mode
    open(my $wfh, '>out.txt');  # open in write mode
    open(my $afh, '>>add.txt'); # open in append mode

You must have appropriate permissions to do the requested operations on these files. Note also that, in read mode, the file must exist prior to opening, and in write mode, any existing file of the same name will be overwritten.
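The difference between write mode and append mode can be checked with a short sketch. It assumes the three-argument form of open, which takes the mode and the file name as separate arguments, rather than the combined form used above; the helpers put_text and get_text and the temporary file are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write $text to $path using the given mode ('>' or '>>').
sub put_text {
    my ($mode, $path, $text) = @_;
    open(my $fh, $mode, $path) or die qq{Cannot open $path with mode $mode: $!};
    print $fh $text;
    close($fh) or die qq{Cannot close $path: $!};
}

# Hypothetical helper: return the whole contents of $path as one string.
sub get_text {
    my ($path) = @_;
    open(my $fh, '<', $path) or die qq{Cannot open $path for reading: $!};
    local $/;               # no record separator: read everything at once
    my $text = <$fh>;
    close($fh);
    return $text;
}

# A scratch file that is removed when the program exits.
my (undef, $path) = tempfile(UNLINK => 1);

put_text('>',  $path, "first\n");
put_text('>',  $path, "second\n");   # write mode discards "first"
put_text('>>', $path, "third\n");    # append mode keeps "second"

print get_text($path);               # prints "second" then "third", one per line
```

Passing the mode separately means that a file name beginning with < or > cannot be mistaken for a mode, which is the main reason for preferring this form when the name comes from a variable.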
For binary files, such as images, on Win32 you should generally call

    binmode($fh);

after opening the filehandle, for either reading or writing.
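As a minimal sketch of where the binmode calls go, the following copies a file byte for byte. The helper name copy_binary, the 4096-byte chunk size, and the use of three-argument open are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst without any newline translation.
# Returns the number of bytes copied.
sub copy_binary {
    my ($src, $dst) = @_;
    open(my $in,  '<', $src) or die qq{Cannot open $src for reading: $!};
    open(my $out, '>', $dst) or die qq{Cannot open $dst for writing: $!};
    binmode($in);     # treat input as raw bytes
    binmode($out);    # treat output as raw bytes

    my $total = 0;
    my $buffer;
    # read() returns the number of bytes read, 0 at end of file.
    while (my $count = read($in, $buffer, 4096)) {
        print $out $buffer;
        $total += $count;
    }
    close($in);
    close($out) or die qq{Cannot close $dst: $!};
    return $total;
}
```

A call such as copy_binary('lena.jpeg', 'copy.jpeg') would then produce an identical copy of the image. On systems that do not translate line endings, binmode has no effect on such handles, so calling it unconditionally keeps the code portable.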
For one reason or another, opening a file may fail - for example, you may not have sufficient permission, or, for read mode, the file doesn't exist. Because of this, it is very good practice to use, if appropriate, a die statement to abort the program if the open call fails. The syntax of this is
    open(my $fh, 'filename.txt')
        or die qq{Cannot open filename.txt for reading: $!};

Then, if the open call fails, the program will print the specified error message and cease execution. In this message, the special Perl variable $! holds the system error message (e.g., No such file or directory, if the file does not exist).
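To see what such a message looks like without ending the program, the die can be trapped with eval, which places the message in $@. This is only a sketch: the helper first_line and the file name no-such-file.txt are hypothetical, and the exact text supplied by $! depends on the operating system.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line of a file,
# or die with a descriptive message if the file cannot be opened.
sub first_line {
    my ($path) = @_;
    open(my $fh, '<', $path)
        or die qq{Cannot open $path for reading: $!\n};
    my $line = <$fh>;
    close($fh);
    return $line;
}

# eval traps the die; the message is then available in $@.
my $line  = eval { first_line('no-such-file.txt') };
my $error = $@;
print "Caught: $error" if $error;
```

Ending the die message with a newline stops Perl from appending the script name and line number to it.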
After opening a file, you can print to it (if opened in write or append mode) by specifying the filehandle in the print statement:
    print $fh qq{This is some text\n};

The printf statement also accepts a filehandle:
    my $pi = 3.14159265358979;
    printf $fh ( qq{%.5f}, $pi);

To open a file and loop over all lines in the file, the following construction can be used:
    open(my $fh, '<data.txt') or die qq{Cannot open data.txt: $!};
    while (my $line = <$fh>) {
        print qq{The line read in is $line};
    }

The while loop cycles through each line of the file, assigning the current line to the variable $line, until the end of the file is reached, at which point the loop ends. Note that $line includes the trailing newline character; if you want to remove it, use chomp($line);.
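Writing and reading can be combined into one round trip, sketched below. The scratch file from File::Temp, the variable names, and the three-argument form of open are assumptions of this example rather than part of the constructions above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() returns an open filehandle and the name of a new scratch file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh);

# Write two lines, one with print and one with printf.
open(my $out, '>', $path) or die qq{Cannot open $path for writing: $!};
print  $out qq{This is some text\n};
printf $out qq{%.5f\n}, 3.14159265358979;
close($out) or die qq{Cannot close $path: $!};

# Read the lines back, removing the newline from each.
my @lines;
open(my $in, '<', $path) or die qq{Cannot open $path for reading: $!};
while (my $line = <$in>) {
    chomp($line);
    push @lines, $line;
}
close($in);

print qq{Read "$_"\n} for @lines;
# prints: Read "This is some text"
#         Read "3.14159"
```

Closing the output filehandle before reopening the file for reading ensures that buffered output has actually reached the file.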
Often data files contain, on each line, various types of information in different columns. For example, suppose we have a file music.txt with some information arranged as:
    Spears Britney female pop
    Lightfoot Gordon male folk

representing the last name, first name, gender, and category of some singers. We can extract the information in each line as follows, using a split function:
    open(my $fh, '<music.txt') or die qq{Cannot open music.txt: $!};
    while (my $line = <$fh>) {
        # The columns are: last name, first name, gender, category.
        my ($last_name, $first_name, $gender, $category) = split ' ', $line;
        print qq{$first_name $last_name is a $gender who's into $category\n};
    }

The call split ' ', $line takes $line and splits it into a list (of however many elements that turns out to be), with whitespace being used as a separator between elements. More generally,
    my @array = split /$separator/, $line, $number;

will split $line into at most $number fields, based on the pattern contained in $separator as the field separator.
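A few small cases may make the behavior of split concrete. The sample strings below are invented for illustration and are not taken from music.txt.

```perl
use strict;
use warnings;

# split ' ' skips leading whitespace and treats any run of
# whitespace (spaces, tabs, the trailing newline) as one separator.
my @fields = split ' ', "  Lightfoot   Gordon\tmale folk\n";
print join('|', @fields), "\n";     # prints Lightfoot|Gordon|male|folk

# With a limit of 2, the text after the first separator is left unsplit.
my ($key, $value) = split /:/, 'time:11:57:03', 2;
print "$key => $value\n";           # prints time => 11:57:03

# With a pattern separator, an empty field between two separators is kept.
my @cells = split /,/, 'a,,b';
print scalar(@cells), "\n";         # prints 3
```

The limit is useful when only the first few columns have a fixed meaning and the rest of the line should be kept intact.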
The split function is a very powerful tool for extracting wanted information from some general structure. Matching against a regular expression can also come into play here. For example, suppose we have a directory listing:
    04/04/2003  11:56a  <DIR>      .
    04/04/2003  11:56a  <DIR>      ..
    04/04/2003  11:57a      2,718  ifsa.aux
    04/04/2003  11:57a     42,712  ifsa.dvi
    04/04/2003  11:57a      5,976  ifsa.log
    04/04/2003  11:57a     25,717  ifsa.tex
    04/04/2003  11:57a     25,481  ifsa.tex~
    04/04/2003  11:57a     13,631  ifsa1.tex
    04/04/2003  11:57a    484,374  lena.eps
    04/04/2003  11:57a     12,244  lena.jpeg
    04/04/2003  11:57a      1,511  lena.pl
    04/04/2003  11:57a      1,529  lena.pl~
    04/04/2003  11:57a  1,972,444  lenalinear.eps
    04/04/2003  11:57a  1,841,056  lenaquadratic.eps
    04/04/2003  11:57a  3,369,583  maglin3.eps
    04/04/2003  11:57a    136,714  maglin3.jpg
    04/04/2003  11:57a  3,369,584  magquad3.eps
    04/04/2003  11:57a     92,857  magquad3.jpg
    04/04/2003  11:57a      1,919  save.txt
    05/16/2003  09:57a      7,852  Resize.pm~
    05/16/2003  09:57a        318  draw.pl~
    05/16/2003  09:57a        413  test.pl
    05/16/2003  04:08p      8,271  Resize.pm
    05/16/2003  04:11p        318  draw.pl
    05/16/2003  04:19p     20,502  t.jpeg

stored in a file data.txt, and wish to extract from this the sizes of all JPEG images (files ending in .jpg or .jpeg). One could do this using split:
    use strict;
    use warnings;

    open(my $fh, "<data.txt") or die "Could not open data.txt: $!";
    while (my $line = <$fh>) {
        chomp $line;
        my @entries = split ' ', $line;
        # Skip lines with fewer than four fields (no file name column).
        next unless defined $entries[3];
        if ($entries[3] =~ /\.(jpg|jpeg)$/) {
            print "Image $entries[3] has size $entries[2] bytes\n";
        }
    }

(the line next unless defined $entries[3]; skips any line that has no fourth field, such as a blank line, so that the pattern match is never applied to an undefined value; the directory entries . and .. do have a fourth field, but they simply fail the pattern match). Alternatively, one could check $line using a regular expression, and capture the appropriate information if a match succeeds:
    use strict;
    use warnings;

    open(my $fh, "<data.txt") or die "Could not open data.txt: $!";
    while (my $line = <$fh>) {
        chomp $line;
        if ( $line =~ /([,0-9]+)\s+(\w+\.jpeg|\w+\.jpg)$/ ) {
            my $file = $2;
            my $size = $1;
            print "File $file has size $size bytes\n";
        }
    }

(here we capture the file size with the character class [,0-9]+, which matches one or more digits or commas). Whether to use split or regular expressions in cases like this can be a matter of taste, but often one or the other is more natural.
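The captured sizes still contain commas, so they have to be cleaned before any arithmetic is done with them. The following sketch totals the sizes of the JPEG files in a few listing lines; the helper name total_jpeg_bytes is hypothetical, and the pattern is a slightly shortened variant of the one above.

```perl
use strict;
use warnings;

# Hypothetical helper: total size in bytes of the JPEG files named in
# the given listing lines (assumed to have no trailing newline).
sub total_jpeg_bytes {
    my @lines = @_;
    my $total = 0;
    for my $line (@lines) {
        # jpe?g matches both "jpg" and "jpeg".
        next unless $line =~ /([,0-9]+)\s+\w+\.jpe?g$/;
        (my $size = $1) =~ tr/,//d;    # "136,714" becomes "136714"
        $total += $size;
    }
    return $total;
}

my @sample = (
    '04/04/2003 11:57a 12,244 lena.jpeg',
    '04/04/2003 11:57a 136,714 maglin3.jpg',
    '04/04/2003 11:57a 3,369,584 magquad3.eps',
);
print total_jpeg_bytes(@sample), "\n";    # prints 148958
```

Copying $1 into $size before modifying it is needed because the capture variables themselves are read-only.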