Ryan P

CSV Parsing Library

I absolutely despise parsing CSV files because you have to know the column index of every field you want to read, and the columns can be in a different order from one file to the next. That means you have to sanity check each file before parsing it, and the check is going to be different for every file. I did kind of find a way around it.

$header = array_flip(fgetcsv($fh));

This creates an array where each header title is a key that maps to its column index:

Array(
    'column1' => 0,
    'column2' => 1,
    'column3' => 2,
    ...
)

This is helpful, but you still have to put something like...

while($data = fgetcsv($fh)) {
    $column1 = $data[$header['column1']];
...
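For reference, here is a complete, runnable version of the header-flip loop as a minimal sketch. It reads from an in-memory stream so it runs on its own; the sample rows and the `readByHeader` helper name are made up for illustration.

```php
<?php
// Hypothetical helper: returns every data row as [header title => value].
function readByHeader($fh): array
{
    // Map each header title to its column index, e.g. ['Name' => 2].
    $header = array_flip(fgetcsv($fh, 0, ',', '"', '\\'));
    $rows = [];
    while (($data = fgetcsv($fh, 0, ',', '"', '\\')) !== false) {
        $row = [];
        foreach ($header as $title => $index) {
            // Short rows simply yield null for the missing columns.
            $row[$title] = $data[$index] ?? null;
        }
        $rows[] = $row;
    }
    return $rows;
}

// Sample data with the columns in an arbitrary order.
$fh = fopen('php://memory', 'w+');
fwrite($fh, "Email,ID,Name\nryan@example.com,1,Ryan\njohn@example.com,2,John\n");
rewind($fh);

$rows = readByHeader($fh);
fclose($fh);

echo $rows[0]['Name'], PHP_EOL; // looked up by title, not by position
```

Because every lookup goes through the header title, the same code keeps working if the next file puts the columns in a different order.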

That works, but it is not very simple, and it is a pain to write every time you need to parse a file. So I did what any natural dev would do and wrote my own library that will do it all for me.

composer require godsgood33/csv-reader

Then all you have to do is create a reader object and start parsing.

So given the following file:

ID | Name   | Phone Number | Email            | State
1  | Ryan   | 123-456-7890 | ryan@example.com | IN
2  | John   | 234-567-8901 | john@example.com | GA

Each row can then be read through properties named after the headers:

$csv = new CSVReader($file);
do
{
    $name = $csv->Name;
    $phone = $csv->PhoneNumber;
    $email = $csv->Email;
} while($csv->next());

The library removes any character outside the set [A-Z, a-z, 0-9, _] from the header titles so that they are valid property names (so the space is removed from "Phone Number", giving PhoneNumber).
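As a rough sketch of that sanitizing step (an assumption about the approach, not the library's actual code), a single `preg_replace()` call can strip everything outside that character set. The `sanitizeHeader` name is hypothetical.

```php
<?php
// Hypothetical helper: keep only letters, digits, and underscores.
function sanitizeHeader(string $title): string
{
    return preg_replace('/[^A-Za-z0-9_]/', '', $title);
}

$headers = ['ID', 'Name', 'Phone Number', 'Email', 'State'];
$properties = array_map('sanitizeHeader', $headers);

print_r($properties); // "Phone Number" becomes "PhoneNumber"
```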

There are a few options you can pass in as an array when you create the object.

[
    'delimiter' => ',', // character that splits the fields
    'enclosure' => '"', // character that encloses a field should it contain the delimiter
    'header' => 0, // 0-based index of the row the header is on (default 0)
    'required_headers' => [], // headers (post sanitizing) that must be present in the file before any parsing can take place
]
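The `delimiter` and `enclosure` options mean the same thing as the separator and enclosure arguments of PHP's built-in `fgetcsv()`. Whether the library passes them straight through is an assumption here, but this sketch shows what the two settings do, using `fgetcsv()` directly with a semicolon delimiter and a single-quote enclosure on made-up data.

```php
<?php
// A semicolon-delimited file where one field contains the delimiter,
// so it is wrapped in the enclosure character (a single quote here).
$fh = fopen('php://memory', 'w+');
fwrite($fh, "ID;Name;Note\n1;Ryan;'likes a; b'\n");
rewind($fh);

$delimiter = ';';
$enclosure = "'";

$header = fgetcsv($fh, 0, $delimiter, $enclosure, '\\');
$row    = fgetcsv($fh, 0, $delimiter, $enclosure, '\\');
fclose($fh);

print_r($row); // the enclosed field stays in one piece
```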

So if you wanted to require the ID, Name, and Email fields, your object instantiation would look like:

$csv = new CSVReader($file, ['required_headers' => ['ID','Name','Email']]);

Other fields are still available, but they are optional, so you'd want to validate them before using them. If a column header is not present, the property simply returns NULL.
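To make that behaviour concrete, here is a toy class that mimics it with PHP's `__get()` magic method. It is not the library's code: the `TinyReader` name, its constructor signature, and the exception it throws are all assumptions made for illustration.

```php
<?php
// Toy stand-in for illustration only; NOT the csv-reader implementation.
class TinyReader
{
    private $fh;
    private array $header;
    private array $row = [];

    public function __construct($fh, array $requiredHeaders = [])
    {
        $this->fh = $fh;
        $titles = fgetcsv($fh, 0, ',', '"', '\\');
        // Sanitize titles so they are usable as property names.
        $clean = array_map(
            fn ($t) => preg_replace('/[^A-Za-z0-9_]/', '', $t),
            $titles
        );
        $this->header = array_flip($clean);

        // Refuse to parse if a required header is missing.
        $missing = array_diff($requiredHeaders, $clean);
        if ($missing) {
            throw new RuntimeException('Missing headers: ' . implode(', ', $missing));
        }
        $this->next(); // load the first data row
    }

    // Advance to the next row; returns false once the file is exhausted.
    public function next(): bool
    {
        $data = fgetcsv($this->fh, 0, ',', '"', '\\');
        $this->row = $data === false ? [] : $data;
        return $data !== false;
    }

    // Unknown headers (or missing cells) come back as null.
    public function __get(string $name)
    {
        $index = $this->header[$name] ?? null;
        return $index === null ? null : ($this->row[$index] ?? null);
    }
}

$fh = fopen('php://memory', 'w+');
fwrite($fh, "ID,Name,Phone Number\n1,Ryan,123-456-7890\n");
rewind($fh);

$csv = new TinyReader($fh, ['ID', 'Name']);
$state = $csv->State;            // column not in the file, so null
if ($state === null) {
    $state = 'unknown';          // validate optional fields before use
}
echo $csv->Name, ' / ', $state, PHP_EOL;
```

The null check on `$state` is the kind of validation an optional column needs, while the required headers are checked once up front.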

This library can also be used for remote files, but it currently only supports http(s). I would not recommend it for large remote files, though.

godsgood33 / csv-reader

This library is to create a more simple readable way to parse CSV files.

csv-reader

The purpose of this library is to simplify reading and parsing CSV files. I have parsed CSV files so many times and there is not really an easy way to do it. You have to know the index of the column you want to read, and it would be so much more readable if you could just use the header. A while ago, I started reading the header row and flipping the array so that each header title becomes a key and its column index becomes the value.

$header = array_flip(fgetcsv($fh));
/*
 array(
     'column1' => 0,
     'column2' => 1,
     'column3' => 2,
 )
 */

This lets you use the header title as an index into the data array.

$column1 = $data[$header['column1']];

This is nice, but isn't any more readable than just using the index itself (or storing the index in a variable and using that)...

$column1
