DEV Community

A lesson in line endings

Alex Oragwu ・2 min read

How many folks out there have installed Git on a Windows machine and seen this prompt:

[Screenshot: the Git for Windows installer's line-ending options]

I have come across this before and just clicked next with the default "Checkout Windows-style, commit Unix-style line endings" selection without thinking about it. Well, I had to think about line endings yesterday - a lot.

It started when I had to parse some comma-separated values (CSV) data, something that looked like this:

""First Name","Last Name","Age""

☝️ I assumed each line was terminated with a newline (\n) character (more on that soon).

I had to clean the data: specifically, strip the extra leading and trailing quote on each row, which would otherwise be left over once I split the data on the commas. For example, I wanted to avoid having ""First Name" in column 1 or "Age"" in column 3 of the header row and instead have "First Name" and "Age" respectively. So I went right to work:

const tableData = wweSuperstars.split('\n')
  .map(rowText => {
    return rowText.split(',')
      .map(cellText => cellText.replace(/^"|"$/g, '')); // 👈 replace leading and trailing quotes (from first and last columns)
  });

Here is what I ended up with:

[
  ['First Name', 'Last Name', 'Age"'],
  ['Michael', 'Mizanin', '39"'],
  ['Stephanie', 'Garcia', '36"'],
  ['Bryan', 'Danielson', '39"']
]

Notice the last cell in each row still ends with the quote. What made this more puzzling was that when running the project locally, the data would get cleaned correctly but once deployed, the anomaly would surface.

I spent longer than I would like to admit trying to figure out the root cause and why the behavior differed between my local environment and the deployed one. Then a colleague came to my rescue and suggested that a carriage return might be present at the end of each row. Armed with this new knowledge, I modified the code a little bit:

return rowText.split(',')
      .map(cellText => cellText.replace(/^"|"\r*$/g, ''))

☝️ Added \r* to the regex so that it removes the trailing " together with any carriage returns (\r) that follow it.

And that did it!
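To see why the first pattern missed the last cell, it can help to print the strings with their escape sequences visible. Here is a minimal sketch, assuming a hypothetical cell value `lastCell` that stands in for the last column of a row ending in \r\n (it is not taken from the real data):

```javascript
// Hypothetical last cell of a row that was split on '\n' only,
// so the carriage return from a '\r\n' ending is still attached.
const lastCell = '"Age"\r';

// Original pattern: $ only matches at the very end of the string,
// and the character before the end is '\r', not '"'.
const original = lastCell.replace(/^"|"$/g, '');

// Adjusted pattern: allows zero or more '\r' between the quote and the end.
const fixed = lastCell.replace(/^"|"\r*$/g, '');

// JSON.stringify makes the otherwise invisible '\r' show up as an escape.
console.log(JSON.stringify(original)); // "Age\"\r"
console.log(JSON.stringify(fixed));    // "Age"
```

Logging through JSON.stringify is just one convenient way to expose control characters that a plain console.log may hide.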

Basically, what I was missing was that on my machine each row was terminated solely by a line feed (\n), while on the remote server each row was terminated by a carriage return followed by a line feed (\r\n). The explanation: my machine was a Mac (which uses Unix-style line endings) while the remote server ran Windows. Since I split only on \n, the \r stuck around at the end of each row, invisible to the naked eye, and kept the $ anchor from matching right after the closing quote. Aren't computers fun?!
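Another option, instead of patching the quote-stripping regex, is to split the rows on either kind of line ending up front. The sketch below is an alternative to the fix in this post, not what I actually shipped; `parseRows`, `lfData`, and `crlfData` are hypothetical names, and the sample data is made up for illustration:

```javascript
// Hypothetical sample data with Unix-style (\n) line endings.
const lfData = '"First Name","Last Name","Age"\n"Michael","Mizanin","39"';

// The same data with Windows-style (\r\n) line endings.
const crlfData = lfData.replace(/\n/g, '\r\n');

// Split on \n or \r\n so no carriage return survives into the cells,
// then strip one leading and one trailing quote from each cell.
function parseRows(text) {
  return text
    .split(/\r?\n/)
    .map(rowText => rowText
      .split(',')
      .map(cellText => cellText.replace(/^"|"$/g, '')));
}

// Both inputs should produce the same cleaned table.
console.log(parseRows(lfData));
console.log(parseRows(crlfData));
```

Like the code above, this naive split on commas assumes no cell contains a comma inside its quotes; data like that would call for a proper CSV parser.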

The git installer tried to warn me about line endings - I should have listened.
