DEV Community

sfrunza13

Request 2 Pull

In the second lab for the OSD course, our class was tasked with adding a feature to another student's Static Site Generator repository. The feature was support for Markdown files as input: parsing a few Markdown tokens of our choice and converting them to their corresponding HTML tags.

I paired up with Rudy Chung, whose repo can be found here: https://github.com/rudychung/SauSaGe

Rudy and I both chose to parse bold and italics out of the Markdown and convert them to HTML. I did it on his C++ project linked above, and he did it on my Python-based project, linked here: https://github.com/sfrunza13/SiteGenerationTool

Getting set up was a bit rocky at first because I cloned Rudy's repository before he refactored the directory structure and added the .sln file. We exchanged messages directly, and he helped me set up the src folder and gave me a zip of his project with the working directory structure. After that, his README.md explained the rest well and I was ready to get started on my work.

At first, getting back into C++ was a little jarring. I was thinking about ways to separate the work into smaller functions to find the tokens, and for the first time in a while I was considering the difference between passing by value and passing by reference. After a while I realized that I had started writing before thinking, and I stopped myself. I commented out the code I had written and looked to see whether there was a C++ library for working with regular expressions, so that maybe I could make life easier for myself.

There is a library for working with regex in C++ (the standard <regex> header), although I am not sure how much easier it made things. In my mind I had the logic that I wanted to write. First I would check whether a file had the extension .md. If it did, then each time we processed a new line from it I would run a regex replace on that line to find the patterns that were meaningful to me, in my case stretches of text wrapped in *, **, _ or __, and replace them with the same text wrapped in <b> or <i> HTML tags.
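As a rough illustration of that flow, here is a minimal Python sketch. It is not the C++ code from the PR; the names `is_markdown` and `process_lines` are made up for this example, and the converter is passed in as a parameter so the sketch stays independent of any particular regex.

```python
from pathlib import Path


def is_markdown(file_name):
    """Return True when the file name ends in .md (case-insensitive)."""
    return Path(file_name).suffix.lower() == ".md"


def process_lines(file_name, lines, convert):
    """Apply `convert` to each line, but only for Markdown input files."""
    if not is_markdown(file_name):
        # Other inputs (for example .txt) pass through untouched.
        return list(lines)
    return [convert(line) for line in lines]


# str.upper is only a stand-in for the real Markdown-to-HTML step.
print(process_lines("notes.md", ["hello"], str.upper))
print(process_lines("notes.txt", ["hello"], str.upper))
```

Keeping the extension check separate from the conversion means plain text input is never run through the Markdown patterns by accident.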

Finding the correct way to escape special characters and build the regex patterns I wanted took longer than I expected. Eventually, however, I came up with the following lines of code, and I did the same for every pattern I was interested in:

std::regex boldAs("(\\*\\*)([^\\*]+)(\\*\\*)");

tempString = std::regex_replace(tempString, boldAs, "<b>$2</b>");

In this case I was looking for a pair of double asterisks wrapping some text, which indicates that they need to be replaced by bold tags.

I learned that you can split a regex into really neat and useful capture groups. I took the second group as my group of interest, so to speak, and carried it into the replacement with $2.
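The same capture-group idea can be sketched in Python, where the replacement string refers to the second group as `\2` instead of `$2`. This is only an illustration under my own assumptions: `RULES` and `convert_line` are hypothetical names, and the underscore patterns are a guess at how the other three cases might look, not the code from the PR.

```python
import re

# Order matters: the two-character markers are handled before the
# one-character ones, otherwise "**text**" would be read as italics.
RULES = [
    (re.compile(r"(\*\*)([^*]+)(\*\*)"), r"<b>\2</b>"),
    (re.compile(r"(__)([^_]+)(__)"), r"<b>\2</b>"),
    (re.compile(r"(\*)([^*]+)(\*)"), r"<i>\2</i>"),
    (re.compile(r"(_)([^_]+)(_)"), r"<i>\2</i>"),
]


def convert_line(line):
    """Replace each marker pair with HTML tags, keeping only group 2."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


print(convert_line("**bold** and *italic*, __bold__ and _italic_"))
# <b>bold</b> and <i>italic</i>, <b>bold</b> and <i>italic</i>
```

Groups 1 and 3 capture the markers themselves, so leaving them out of the replacement is what removes the asterisks or underscores from the output.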

The PR looks a bit wonky because it seems as though I deleted and added entire files, given that I had to change the directory structure. Rudy ended up fixing the directory structure very shortly afterwards, and the actual changes I made were only a few lines of code within the createHtml method he wrote earlier.

The issue and corresponding PR that I have been discussing:
https://github.com/rudychung/SauSaGe/issues/7
https://github.com/rudychung/SauSaGe/pull/9/files

The PR was accepted, but Rudy later asked that I update the README as well to reflect the changes, so I made a separate PR for that.

https://github.com/rudychung/SauSaGe/pull/12

As for Rudy's changes to my code, they initially found only the first occurrence of a special token pair within a line, replaced it, and moved on. I brought this to his attention after his first PR, and he quickly changed it to keep looking for matches throughout the entire line. I really liked the way he wrote the code because he separated the work into neatly contained functions: one function takes the new line and passes it to another function 4 times with a different set of parameters for each case, 2 for bold and 2 for italics. I am referring to how convertMarkdown calls markdownSearch here:

import re

def markdownSearch(regex, indChars, tag, line):
    # Wrap every match of `regex` in <tag>...</tag>, dropping the
    # `indChars` marker characters at each end of the match.
    newLine = line
    match = re.search(regex, newLine)
    while match is not None:
        start, end = match.span()
        newLine = newLine[:start] + "<" + tag + ">" + newLine[start+indChars:end-indChars] + "</" + tag + ">" + newLine[end:]
        match = re.search(regex, newLine)
    return newLine

def convertMarkdown(line):
    newLine = line
    # bold (handled first so ** and __ are not read as italics)
    newLine = SSJ.markdownSearch(r"\*\*[^*]+\*\*", 2, "b", newLine)
    newLine = SSJ.markdownSearch(r"__[^_]+__", 2, "b", newLine)
    # italics
    newLine = SSJ.markdownSearch(r"\*[^*]+\*", 1, "i", newLine)
    newLine = SSJ.markdownSearch(r"_[^_]+_", 1, "i", newLine)
    return newLine

The idea of using that while loop and searching for a match again inside it was pretty smart too. I just think it's a really cool thought process, and I'm glad he added this to my project.
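To make the first-match-versus-all-matches difference concrete, here is a small standalone sketch. It is not taken from either PR: `BOLD` and `replace_with_loop` are hypothetical names, and `re.sub` with `count=1` is used only to imitate the behaviour of stopping after one replacement.

```python
import re

BOLD = re.compile(r"\*\*([^*]+)\*\*")
line = "**one** and **two**"

# Stopping after one replacement leaves the second pair untouched.
first_only = BOLD.sub(r"<b>\1</b>", line, count=1)

# Without a count, re.sub replaces every non-overlapping match.
all_matches = BOLD.sub(r"<b>\1</b>", line)


def replace_with_loop(text):
    """Search-and-rebuild loop in the same spirit as the while loop above."""
    match = BOLD.search(text)
    while match is not None:
        start, end = match.span()
        text = text[:start] + "<b>" + match.group(1) + "</b>" + text[end:]
        match = BOLD.search(text)
    return text


print(first_only)   # <b>one</b> and **two**
print(all_matches)  # <b>one</b> and <b>two</b>
print(replace_with_loop(line) == all_matches)  # True
```

The loop terminates because each pass removes one pair of markers and the inserted tags contain no asterisks, so the pattern eventually stops matching.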

Rudy's issue and PR: https://github.com/sfrunza13/SiteGenerationTool/issues/4
https://github.com/sfrunza13/SiteGenerationTool/pull/5

Now that this is all done and dusted, I will have to start looking at the comments from David (my prof) and take his suggestions into consideration. I have to fix the directory structure of my project by adding a src directory, and I also have to look at the way I treat paths as strings instead of using a library for that.
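For the path handling, one option in Python is the standard pathlib module. The sketch below only shows the general idea; `output_path_for` and the `dist` output folder are assumptions made for this example, not names from my project.

```python
from pathlib import Path


def output_path_for(source_file, output_dir="dist"):
    """Build the .html output path for an input file without string slicing."""
    source = Path(source_file)
    # .stem drops the directory and the extension: "docs/notes.md" -> "notes"
    return Path(output_dir) / f"{source.stem}.html"


print(output_path_for("docs/notes.md"))
print(output_path_for("story.txt", "out"))
```

Joining with the `/` operator lets pathlib pick the right separator for the operating system, which is the main thing manual string concatenation tends to get wrong.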

There is a lot yet to do, but my project will look even better soon.
