The StreamReader class




The StreamReader class is helpful when reading files. The main advantage of streaming a file instead of reading it into memory all at once, with something like file_get_contents(), is that streaming consumes far less memory. And if you only want to read a chunk of bytes from the file, there's no need to read the complete file into memory.
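To make the memory point concrete without PLib, here is a minimal sketch using only PHP's built-in file functions. The helper name readFirstBytes is my own invention for illustration and is not part of StreamReader; it simply shows that a few bytes can be fetched from a large file without loading the rest.

```php
<?php
// Hypothetical helper (not part of PLib): read only the first $length
// bytes of a file instead of loading the whole file into memory.
function readFirstBytes(string $path, int $length): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $bytes = fread($handle, $length);
    fclose($handle);
    return $bytes === false ? '' : $bytes;
}

// Demo on a temporary 1 MB file: only 8 bytes are ever read.
$path = tempnam(sys_get_temp_dir(), 'sr');
file_put_contents($path, str_repeat('A', 1024 * 1024));

$head = readFirstBytes($path, 8);
echo strlen($head), " bytes read\n"; // 8 bytes read

unlink($path);
```

With file_get_contents() the whole megabyte would have ended up in a PHP string just to look at the first eight bytes.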

A good example of when the StreamReader class comes in handy is when you want to read binary data. Let's say I want to verify that an image I have really is a PNG image. The first eight bytes of a PNG image are a signature that is always the same. In decimal form they are 137 80 78 71 13 10 26 10.

So to verify that our image really is a PNG image we can read the first eight bytes, unpack them to decimal form and verify that they match the PNG signature.
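As a quick illustration of what those eight decimal values are, the sketch below writes the signature as a raw byte string and unpacks it again. It uses only built-in PHP functions; the constant PNG_SIGNATURE and the helper hasPngSignature are names I made up for this example.

```php
<?php
// The PNG signature as a raw byte string (137 80 78 71 13 10 26 10).
const PNG_SIGNATURE = "\x89PNG\r\n\x1a\n";

// Hypothetical helper: true if $bytes starts with the PNG signature.
function hasPngSignature(string $bytes): bool
{
    return strncmp($bytes, PNG_SIGNATURE, 8) === 0;
}

// unpack('C*', ...) yields unsigned byte values with keys starting at 1.
$decimals = unpack('C*', PNG_SIGNATURE);
echo implode(' ', $decimals), "\n"; // 137 80 78 71 13 10 26 10

var_dump(hasPngSignature(PNG_SIGNATURE . 'rest of file')); // bool(true)
var_dump(hasPngSignature('GIF89a'));                       // bool(false)
```

Comparing the raw bytes directly is an alternative to unpacking them into an array; either approach checks the same eight bytes.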

I will be working on this image:

PNG-24 image

  <?php
  require_once 'PLib.php';
  PLib::Import('IO.StreamReader');

  $image = 'png24.png';

  //! The PNG signature. The keys start at 1 because unpack('C*', ...)
  //! returns an array indexed from 1.
  $pngsig = array(
    1 => 137,
    2 => 80,
    3 => 78,
    4 => 71,
    5 => 13,
    6 => 10,
    7 => 26,
    8 => 10
  );

  //! Create a stream for the image
  $reader = new StreamReader($image);

  //! Read the first eight bytes and unpack them.
  //! Hopefully this will result in an array identical to $pngsig
  $sig = unpack('C*', $reader->Read(8));

  if ($sig === $pngsig)
    echo "It's a PNG image";
  else
    echo "It's not a PNG image!";
  ?>

Not too difficult!

We can also use the ReadBlock method to read a block of bytes at a given position. For instance, let's read 4 bytes starting at byte 12 (that is, bytes 13, 14, 15 and 16 of the file) to check that this really is where the PNG image header (IHDR) begins. If it is, we then read the next 13 bytes, which are the actual header data, to determine the width, height, bit depth, colour type and so on of the image:

  <?php
  require_once 'PLib.php';
  PLib::Import('IO.StreamReader');

  $image = 'png24.png';

  $reader = new StreamReader($image);

  //! Move to byte 12 and read four bytes.
  $ihdr = $reader->ReadBlock(12, 4);

  if ($ihdr !== 'IHDR')
      die('Malformed PNG image!');

  //! Since we read up to and including byte 16 previously, the file
  //! pointer in the StreamReader object now sits just after it, so we can
  //! simply read the next 13 bytes.
  $ihdr = $reader->Read(13);

  //! Format for unpacking the header data
  $ihdrFormat = 'Nwidth/'             .
                'Nheight/'            .
                'Cbitdepth/'          .
                'Ccolourtype/'        .
                'CcompressionMethod/' .
                'CfilterMethod/'      .
                'CinterlaceMethod';

  $ihdr = unpack($ihdrFormat, $ihdr);

  print_r($ihdr);

  //! The result is:
  //!
  //! Array
  //! (
  //!     [width] => 191
  //!     [height] => 69
  //!     [bitdepth] => 8
  //!     [colourtype] => 6
  //!     [compressionMethod] => 0
  //!     [filterMethod] => 0
  //!     [interlaceMethod] => 0
  //! )
  ?>
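The numbers in the header become more readable once the colour type code is translated. The sketch below does that with built-in functions only, on a synthetic 13-byte header carrying the same values as the result above, so it runs without PLib or an actual image. The helper describeIhdr and the constant PNG_COLOUR_TYPES are hypothetical names of mine; the code-to-name mapping follows the PNG specification.

```php
<?php
// Colour type codes as defined by the PNG specification.
const PNG_COLOUR_TYPES = [
    0 => 'greyscale',
    2 => 'truecolour',
    3 => 'indexed-colour',
    4 => 'greyscale with alpha',
    6 => 'truecolour with alpha',
];

// Hypothetical helper: decode the 13 data bytes of an IHDR chunk.
function describeIhdr(string $data): array
{
    if (strlen($data) !== 13) {
        throw new InvalidArgumentException('IHDR data must be 13 bytes');
    }
    // N = unsigned 32-bit big-endian integer, C = unsigned byte.
    $ihdr = unpack(
        'Nwidth/Nheight/Cbitdepth/Ccolourtype/Ccompression/Cfilter/Cinterlace',
        $data
    );
    $ihdr['colourname'] = PNG_COLOUR_TYPES[$ihdr['colourtype']] ?? 'unknown';
    return $ihdr;
}

// Synthetic header: 191x69, bit depth 8, colour type 6.
$data = pack('NNCCCCC', 191, 69, 8, 6, 0, 0, 0);
$info = describeIhdr($data);
echo "{$info['width']}x{$info['height']}, {$info['colourname']}\n";
// 191x69, truecolour with alpha
```

So the example image is an 8 bits per channel truecolour image with an alpha channel, which is what you would expect from a PNG-24 file with transparency.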

If you want to know more about the PNG format you can check out the specification at W3C.

Line by line

Another handy method of StreamReader is ReadLine. A common way to read a file line by line is something like this:

  <?php
  $lines = file('thefile.txt');

  foreach ($lines as $line) {
    // Do some stuff
  }
  ?>

This is really easy, but file() loads every line of the file into an array first, so it can have a major impact on memory consumption if you happen to deal with a really large file. Since the StreamReader class streams the file line by line, the memory consumption will be far lower.
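The same idea can be sketched with PHP's built-in fgets() and a generator. This is not how StreamReader is implemented as far as I can tell from here; streamLines is a hypothetical helper that only illustrates reading one line at a time.

```php
<?php
// Hypothetical helper: yield a file's lines one at a time using fgets(),
// so only the current line is held in memory.
function streamLines(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            // Strip the trailing line break, whichever style it is.
            yield rtrim($line, "\r\n");
        }
    } finally {
        fclose($handle);
    }
}

// Demo on a small temporary file.
$path = tempnam(sys_get_temp_dir(), 'sr');
file_put_contents($path, "first\nsecond\nthird\n");

foreach (streamLines($path) as $number => $line) {
    echo $number + 1, ": $line\n";
}

unlink($path);
```

The foreach loop looks just like the file() version, but the lines are produced on demand instead of being collected in an array up front.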

In the following example I will list the contents of the file /etc/mime.types, which contains a list of MIME types and their file extensions (if one or more are associated with the MIME type).

  <?php
  require_once 'PLib.php';
  PLib::Import('IO.StreamReader');

  $reader = new StreamReader('/etc/mime.types');

  echo "<table>\n<tr><th>Mimetype</th><th>Extension</th></tr>\n";

  while (($line = $reader->ReadLine()) !== false) {
      $line = trim($line);

      //! Skip empty lines and comments. Compare against '' rather than
      //! using empty(), which would also skip a line containing just "0".
      if ($line === '' || $line[0] === '#')
          continue;

      //! $m[1] is the MIME type, $m[2] the (possibly empty) extension list
      preg_match('/(.[^ \t]*)[ \t]*(.*)$/s', $line, $m);
      echo "<tr><td>{$m[1]}</td><td>{$m[2]}</td></tr>\n";
  }

  echo "</table>\n";
  ?>

Here's the result of the code above
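For a sense of what each table row holds, the splitting step can also be tried on its own, without PLib or the real file. The sketch below is an alternative to the regular expression in the example: parseMimeLine is a hypothetical helper of mine that uses preg_split() and returns the extensions as an array.

```php
<?php
// Hypothetical helper: split one mime.types line into its MIME type and
// a list of extensions. Returns null for blank lines and comments.
function parseMimeLine(string $line): ?array
{
    $line = trim($line);
    if ($line === '' || $line[0] === '#') {
        return null;
    }
    // Fields are separated by runs of spaces and/or tabs.
    $parts = preg_split('/[ \t]+/', $line);
    return [
        'type'       => $parts[0],
        'extensions' => array_slice($parts, 1),
    ];
}

print_r(parseMimeLine("text/html\t\t\thtml htm"));
var_dump(parseMimeLine("# This is a comment")); // NULL
```

Returning an array of extensions instead of one string makes it easier to do something per extension later, for instance building a lookup table from extension to MIME type.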