Andre Willomitzer

Element(ary), my dear Watson.

Today I took on the task of fixing my HTML output files for Release 0.1 of textToHTML, my command line SSG (static site generator) tool.

A couple of days ago I thought I had it working because I tested some txt files I wrote and the paragraph HTML tags seemed to be around each line.

Then... I tested the Sherlock Holmes texts to see how it handled large data. And it turns out my program was creating a paragraph tag around every single line. To fix this, I had to ask "what is a paragraph?". A paragraph is one or more lines of text followed by a blank line (2 newlines in a row) rather than by another line of text.
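To make the difference concrete, here is a small sketch. The sample text and the variable names are made up for illustration, and the per-line split is only my guess at what "a paragraph tag around every single line" looks like in code, not the original implementation:

```javascript
// Hypothetical sample: two paragraphs, the first one wrapped over two lines.
const sample = 'To Sherlock Holmes she is always\nthe woman.\n\nI had seen little of Holmes lately.';

// Assumed naive approach: every line break starts a new chunk.
const perLine = sample.split(/\r?\n/);

// Paragraph-aware approach: only two line breaks in a row end a chunk.
const perParagraph = sample.split(/\r?\n\r?\n/);

console.log(perLine.length);      // 4 (3 text lines plus the empty line between them)
console.log(perParagraph.length); // 2
```

Wrapping each of the 4 chunks in a paragraph tag is what breaks wrapped text; the 2 chunks are the real paragraphs.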

So the solution was a combination of split on the string and map:
const html = data.split(/\r?\n\r?\n/)
  .map(para => `<p>${para.replace(/\r?\n/g, ' ')}</p>`)
  .join(' ');

To break it down line by line:

  1. Split the data on 0 or 1 occurrences of \r, followed by \n, followed by 0 or 1 occurrences of \r, followed by \n. This matches two line breaks in a row (a blank line) whether the file uses Unix (\n) or Windows (\r\n) line endings. Returns an array with one string per paragraph.
  2. Map the returned array, and within each paragraph string replace every remaining line break (either \r and \n together, or just \n) with a space, then wrap the result in <p> tags.
  3. .join(' ') turns the array into 1 string with the paragraphs separated by spaces. Without it, converting the array to a string would put "," between the paragraph tags.
  4. const html will contain our returned paragraphs.
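The steps above can be wrapped in a small function and tried on a sample string. This is just a sketch: the name toParagraphs and the input text are hypothetical and not part of textToHTML. It also varies the approach slightly, as an assumption about what "improved" could mean: the split pattern accepts 2 or more line breaks in a row, and empty chunks are filtered out so extra blank lines don't produce empty paragraph tags.

```javascript
// Hypothetical helper; the name toParagraphs is not from textToHTML.
function toParagraphs(data) {
  return data
    .split(/(?:\r?\n){2,}/)            // 2 or more line breaks in a row end a paragraph
    .map(para => para.trim())          // drop stray whitespace at the edges
    .filter(para => para.length > 0)   // skip empty chunks
    // The g flag makes replace() swap every line break, not just the first one.
    .map(para => `<p>${para.replace(/\r?\n/g, ' ')}</p>`)
    .join(' ');
}

// Made-up input: Windows line endings, an extra blank line, a trailing newline.
const input = 'First line\r\nsecond line\r\n\r\n\r\nAnother paragraph\n';
console.log(toParagraphs(input));
// <p>First line second line</p> <p>Another paragraph</p>
```

Without the g flag, replace() with a regex only changes the first match, so a paragraph wrapped over three or more lines would keep some of its line breaks.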

It was fun seeing a use of regex like that in web development, considering I am used to seeing them in Linux and bash scripts hehe.

Seems to be working well... to be improved. :)

All the best, friends!!!

Andre Willomitzer
