<file_handle> - Reading Data from File Handles

This section describes several ways to use the <file_handle> input operator to read data from file handles.

Reading input from a file handle is done with the input operator, <file_handle>. In a scalar context, it returns the next record from the input channel. In a list (array) context, it returns all of the remaining records from the input channel:

$s = <file_handle>;
@a = <file_handle>;
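
The difference between the two contexts can be seen in a minimal sketch like the one below. It assumes an in-memory file handle (opened on a reference to a scalar) as a stand-in for a real file, and the sample data and variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data; the in-memory handle stands in for a real file.
my $data = "line 1\nline 2\nline 3\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my $first = <$fh>;   # scalar context: reads the next record only
my @rest  = <$fh>;   # list context: reads all remaining records

close($fh);

print "First record: $first";
print "Remaining records: ", scalar(@rest), "\n";
```

Because the first read already consumed one record, the list-context read returns only the two records that are left.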

Note that:

There is a special variable, $/, which stores the input record separator. The default value is \n (0x0A).
With ActivePerl on a Windows system, \r (0x0D) is automatically removed from an input record that ends with \r\n, because \r\n is normally used as the text line delimiter on Windows systems.
The input record separator is kept as part of the string returned by the input operator <>.
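
To illustrate the notes above, here is a small sketch that prints the default value of $/ and shows that the separator stays attached to the returned record. It assumes an in-memory handle with made-up data, and it uses chomp, which removes a trailing $/ only if one is present:

```perl
use strict;
use warnings;

# Show the default input record separator as a hex code.
my $separator_hex = sprintf("0x%02X", ord($/));
print "Default separator = $separator_hex\n";

# Hypothetical sample data read through an in-memory handle.
my $data = "123\nABC\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my $record = <$fh>;                        # "123\n": separator still attached
my $with_separator = length($record);

chomp($record);                            # removes the trailing $/ if present
my $without_separator = length($record);

close($fh);

print "Length with separator = $with_separator\n";
print "Length after chomp = $without_separator\n";
```

Unlike chomp, chop removes the last character of the string no matter what it is.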

To test how the input operator reacts to different record separators, I wrote the following program, InputStat.pl, to count characters and records:

#- InputStat.pl
#- Copyright (c) HerongYang.com. All Rights Reserved.
#
   ($file) = @ARGV;
   die "Missing file name.\n" unless $file;
   $recordCount = 0;
   $charCount = 0;
   $totalCount = 0;
   open(IN, "<", $file) or die "Cannot open $file: $!\n";
   while (<IN>) {
      $recordCount++;
      $totalCount += length($_);   # length including the record separator
      chop;                        # remove the last character of the record
      $charCount += length($_);    # length after chop
   }
   close(IN);
   print "Number of records = $recordCount\n";
   print "Number of characters after chop = $charCount\n";
   print "Number of total characters = $totalCount\n";
   exit;

The first file, text.rn, has \r\n at the end of each record, like a normal text file on a Windows system. text.rn has only 10 bytes in two records: "123\r\nABC\r\n". If you run InputStat.pl with this file, you will get:

C:\herong>InputStat.pl text.rn
Number of records = 2
Number of characters after chop = 6
Number of total characters = 8

The second file, text.n, has \n at the end of each record, like a normal text file on a Unix system. text.n has only 10 bytes in two records: "1234\nABCD\n". If you run InputStat.pl with this file, you will get:

C:\herong>InputStat.pl text.n
Number of records = 2
Number of characters after chop = 8
Number of total characters = 10

The third file, text.r, has \r at the end of each record. text.r has only 10 bytes in two records: "1234\rABCD\r". If you run InputStat.pl with this file, you will get:

C:\herong>InputStat.pl text.r
Number of records = 1
Number of characters after chop = 9
Number of total characters = 10
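
To reproduce the three tests, the input files could be generated with a short script like the sketch below. The script is not part of the original tutorial. It assumes the files are written to the current directory, and it uses binmode so that the bytes are written exactly as given, without any \n to \r\n translation on Windows:

```perl
use strict;
use warnings;

# File names and contents as described in the three tests.
my %files = (
   'text.rn' => "123\r\nABC\r\n",
   'text.n'  => "1234\nABCD\n",
   'text.r'  => "1234\rABCD\r",
);

for my $name (sort keys %files) {
   open(my $out, '>', $name) or die "Cannot create $name: $!";
   binmode($out);                 # write the bytes as-is
   print {$out} $files{$name};
   close($out) or die "Cannot close $name: $!";
   my $size = -s $name;           # file size in bytes
   print "$name: $size bytes\n";
}
```

Each file should be reported as 10 bytes, matching the sizes stated in the test descriptions.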

As you can see from the test results:

\r is removed by the input operator if found as part of \r\n. This is why I got only 8 total characters, instead of 10, on the first test.
\r is kept as part of the input record if found without \n, and it is not treated as a record separator by default. This is why I got only one record on the third test.
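
If a file like text.r needs to be read as two records, one option is to change $/ before reading. The sketch below is not part of the original test program. It assumes the same bytes as text.r, supplied through an in-memory handle, and it uses local so that the change to $/ applies only inside the block:

```perl
use strict;
use warnings;

# Same bytes as text.r, read through an in-memory handle as a stand-in.
my $data = "1234\rABCD\r";
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my $recordCount = 0;
{
   local $/ = "\r";               # treat \r as the record separator here only
   while (my $record = <$fh>) {
      $recordCount++;
   }
}
close($fh);

print "Number of records = $recordCount\n";
```

With $/ set to \r, the same 10 bytes are counted as two records instead of one.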