During the weekend, I found the following small JS function in a blog post:
const lineChecker = (line, isFirstLine) => {
  let document = ``;
  if (line !== "" && isFirstLine) {
    document += `<h1>${line}</h1>`;
  } else if (line !== "" && !isFirstLine) {
    document += `<p>${line}</p>`;
  } else if (line === "") {
    document += "<br />";
  }
  return document;
};
I refactored it and was thinking that it could be an excellent beginner-level refactoring kata.
How would you refactor it?
Top comments (93)
Interesting function. I would need to have a closer look at the rest of the code to see whether refactoring this specific function is really the best option. It seems like a weird function to me, and it does a few things.
The most straightforward refactor would be:
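A minimal sketch of what such a straightforward refactor might look like. The `hasLine` variable is named in the replies; everything else is an assumption, since the commenter's snippet is not reproduced here.

```javascript
// Sketch only: keeps the same inputs and outputs as the original lineChecker.
const lineChecker = (line, isFirstLine) => {
  const hasLine = line !== "";

  if (hasLine && isFirstLine) return `<h1>${line}</h1>`;
  if (hasLine) return `<p>${line}</p>`;
  return "<br />";
};
```

The accumulator variable and the third condition disappear, because an empty line is the only case left once the first two checks fail.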
But I don't know if `line` could be null or another falsy value as well, and whether those would also return `<br />` or not. In case we want to check for null values too:
If it were me, I would prefer splitting the logic into separate functions to separate concerns: a function to check whether the line is empty or not, another one to get the line, one to generate the header, and so on.
This is the initial idea that comes to mind:
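A sketch of how such a split might look. `isEmpty` and `paragraph` are mentioned by name in the replies; `heading` and `lineBreak` are hypothetical names, and treating null/undefined as empty is an assumption about the intended behavior.

```javascript
// Predicate: true for "", null and undefined (assumed behavior).
const isEmpty = (line) => line === "" || line == null;

// Small element builders (heading and lineBreak are hypothetical names).
const heading = (text) => `<h1>${text}</h1>`;
const paragraph = (text) => `<p>${text}</p>`;
const lineBreak = () => "<br />";

// The original function, now composed from the pieces above.
const lineChecker = (line, isFirstLine) => {
  if (isEmpty(line)) return lineBreak();
  return isFirstLine ? heading(line) : paragraph(line);
};
```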
Note that I have separated it into smaller, more composable functions for better reusability. It's a longer solution, but I feel like it makes it a lot more extendable and reusable across the rest of the codebase.
This could be taken in multiple ways though, so this is just my take!
Love it!
I'm a fan of simplicity, so I ended up with the first solution you proposed (plus inlined `hasLine`), but I very much agree with your thoughts regarding `null`/`undefined` being passed in.
For your 3rd solution, I was wondering where the line towards overengineering is. I think it depends very much on the project and its context.
"It depends" definitely strikes me as the answer to the overengineering question.
Extracting predicates feels to me like an easy sell. Their reusability and the readability improvement approach self-evident. How many general-purpose JS libraries come without an `isEmpty` function?
The HTML strings would require more context (and pose more questions - what situation are we in that this is preferable to a robust templating solution?). Either Keff didn't mention it or I overlooked it, but the separate functions also permit independent testing.
In this particular example, I would consider extracting predicates premature abstraction. You might never need them, and you add unnecessary cognitive complexity. When it comes to testing, the function can easily be unit tested as is.
That being said, I am in favor of extracting predicates etc. when there is reuse and value in doing so, i.e. following the 3 strikes rule for refactoring.
What added cognitive complexity is there in reading `isEmpty` rather than parsing the expression `line !== '' && line != null`?
WRT testing, a mistake in a predicate would cause a test of that branch's eventual value to fail. Not because the code that produces that value is wrong, but because the predicate is wrong. Yes, you can still test every one of the diverse things this function is doing, but conflating multiple behaviors just makes finding the specific problem harder. It's changing 'the problem is here' to 'the problem is in this general area'.
The complexity is in the number of lines to read and in the links that you need to follow. The individual pieces might be simpler, but their sum is not, due to their links. E.g. if you want to understand what `paragraph` does, you need to find and read the definition, keep it in your head, go back to the place where it was used, and then put it together.
Re testing, I agree that it's good to tear things apart (and introduce e.g. predicates), but only when the code is complex enough. I consider this method way too trivial to do that - it's far easier to test as one.
Okay, I follow what you mean by complexity. Thank you for clarifying. I don't wish to say I disagree, but I don't think I share your perspective.
I think there's an argument to be made in the very particular case that a function (e.g. `isEmpty`) has a name (or function signature, or instantly visible documentation if one has the convenience of IDE-like features) that is clear to the reader, and that function is separately tested. That it's separately tested allows me to trust that the implementation does what it should. That the name or signature (particularly the signature, tbh) accurately describes behavior allows me to ignore implementation details when I'm reading `lineChecker`.
If my units (i.e. functions as the unit of composition) are well named and tested, I should be able to compose them without repeatedly diving into every implementation of every unit. If one can't do that, I think there is a massive problem in one's approach to writing and testing software.
Diving into every implementation detail of every function while reading to understand the behavior of a function/module/unit/etc strikes me as a premature step in a manner similar to what you've expressed about extracting things before there are demonstrated instances of reuse.
Edit to add: If one has to jump back to the implementation of every unit within their own code, I'm not sure I see how one can logically avoid doing the same thing with library code. Does the average React user feel the need to read the implementation of `useState`, for example? If not, why is that function call privileged?
Overall I think I agree with you and would have made similar arguments in the past. I often split code along similar lines in my projects, maybe with slightly different thresholds to method/function extraction.
However it seems to assume a perfect world with perfect software development projects, or that you and only you own the code, and you never need to hand it over to someone else.
In the reality that I've encountered in professional life, time pressure, employee turnover, mistakes, and other forces lead to source code being far from perfect, sadly. Trusting the function name or that a function is sufficiently tested is a bet in that world. I would not call that "a massive problem in one's approach" - for one-person projects, it's not an issue at all. It happens when shifting teams of developers work on moving requirements over multiple years, and it's non-trivial to avoid.
In that world, I prefer going for code that's as simple and predictable as possible, and that's why I prefer not extracting functions or predicates unless it is genuinely needed (either for testing or because it reduces code complexity, e.g., by removing duplication).
Thanks for the long reply; I enjoyed hearing your perspective!
Fair enough.
Professionally I've encountered sprawling monoliths with no tests at all and team leads who felt it was okay because it had been in production longer than I had been with the company. You're quite right, none of us can avoid reality. It does not, however, change what I would aim for or advise to others.
Likewise, thank you for the civil discussion!
Great point on overengineering, I must admit the last solution could be a bit overengineered. That's the reason I would need to check the full code base and see the real use of this function.
I too prefer simplicity in general, so I would probably go for the first example.
On the point of the `hasLine` variable, it's just my personal preference, I really like how readable it makes the conditions.
It definitely makes the code more understandable.
There is a small bug in your last solution
returns false when line = "" and true when line = "test"
possible solution
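A sketch of the fix, assuming the predicate in question is the `isEmpty` helper whose body is quoted earlier in the thread as `line !== '' && line != null`. That expression is true for non-empty lines, so it has to be inverted. The name `isEmptyBuggy` is made up for this comparison.

```javascript
// Buggy version as quoted in the thread: false for "", true for "test".
const isEmptyBuggy = (line) => line !== "" && line != null;

// Inverted version: true for "", null and undefined, false for "test".
const isEmpty = (line) => line === "" || line == null;
```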
Ohh my bad. Thanks for pointing it out! I did not run the code after the refactor (big mistake when refactoring)
haha can happen :)
It definitely does. I've been away from JavaScript for a while, mostly working with Dart/Flutter...
the easiest way to check for all falsy values would be
if(!!line === false)
What do you think?
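For reference, `!!line === false` is equivalent to `!line`. A small sketch of which inputs such a check treats as empty (the `isFalsy` name is made up for this example):

```javascript
// Same result as (!!line === false), just shorter.
const isFalsy = (line) => !line;

const samples = ["", null, undefined, 0, NaN, false, "0", " ", "text"];
const results = samples.map((value) => isFalsy(value));
// results: [true, true, true, true, true, true, false, false, false]
```

Note that `0`, `NaN` and `false` also count as empty here, while a whitespace-only string does not, which may or may not be what the caller wants.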
To check for non-nullish I'd suggest using `??`.
I love your refactor. So simple!
Solution to this question
For this particular function, I would just remove the unnecessary conditionals. I wouldn't turn it into a one-liner. I also wouldn't give it more complicated logic than simple `if` statements.
The real problem
However, passing a Boolean into a function is considered a code smell. Some downsides are that:
- Calling code may only want to create `h1` tags, without caring about the other cases. But, instead of calling something like `createHeading(text)`, it needs to bother with more parameters and call `lineToHtml(text, true)`.
The real solution
The solution is to refactor the parent of this function to something like this:
Then, make the lower-level functions only do one thing:
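A sketch of both steps under stated assumptions: `toHtmlHeading` is named later in this thread, while `toHtmlParagraph` and `linesToHtml` are hypothetical names. Like the commenter's fictional example, it does not handle empty lines.

```javascript
// Lower-level functions: each one does a single thing.
const toHtmlHeading = (text) => `<h1>${text}</h1>`;
const toHtmlParagraph = (text) => `<p>${text}</p>`; // hypothetical name

// Parent (hypothetical): decides which element each line becomes,
// so no boolean flag has to be passed down.
const linesToHtml = (lines) =>
  lines.map((line, index) =>
    index === 0 ? toHtmlHeading(line) : toHtmlParagraph(line)
  );
```

The decision about "first line or not" now lives in one place, and each helper can be called directly by code that only needs that element.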
An H1 could be used multiple times; therefore it could be on line 3, 9, 18, etc. So checking and setting it only on the first iteration is, from my perspective, not correct.
Also, in the provided solution the `<br />` is missing, so an empty `<h1>` would be placed on the first iteration and an empty `<p>` on all others.
The solution was just an example of how you can remove the Boolean argument for better code overall. It's just a fictional example.
Depending on what you need, you could make changes. For example, you can add a function for the `<br>` just like `toHtmlHeading`, or you could have a function such as `toHtmlElement(text, tag)` instead of individual functions for each element, or you could have a completely different parent.
Edit: Replaced my answer to also answer the added second part.
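The `toHtmlElement(text, tag)` variant mentioned above could be sketched like this. `toHtmlLineBreak` is a hypothetical name for the `<br>` helper.

```javascript
// Generic builder: wraps text in whatever tag the caller asks for.
const toHtmlElement = (text, tag) => `<${tag}>${text}</${tag}>`;

// Specific helpers can still exist as thin wrappers.
const toHtmlHeading = (text) => toHtmlElement(text, "h1");

// <br> has no content, so it gets its own helper (hypothetical name).
const toHtmlLineBreak = () => "<br />";
```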
So, based on the originally provided code, there is no iteration.
And in that function the parameter is named `isFirstLine`, a boolean. And, just to be an asshole, `isFirstLine` could also refer to a new section.
I know there's no iteration in the original example. In my answer I'm making a guess at what a caller of `lineChecker` might intend to do. It's not shown in the opening post. I'm just creating it myself. It's fictional, so it won't always apply.
Sure, `isFirstLine` could refer to whatever you want. Anyone can name anything whatever they want. If, in your application, that could refer to a new section, then my example solution wouldn't apply.
Sorry, I didn't quite catch that, but did you mean to call me an "asshole" there, or am I misunderstanding?
No, I wasn't referring to you; I meant myself, because of the question I asked. I can totally understand your approach there.
I see, sorry for asking and for misunderstanding.
Good points :)
No need to apologize. It's always better to ask and clarify.
It's also good to see and hear different approaches.
And it's good that you can defend and clarify your solution.
Excellent answer, I like that you got rid of the boolean argument.
Now I'm only waiting for someone to write unit tests, in my world those are written before refactoring :)
I didn't test these, but I think this would be a start for testing.
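A framework-free sketch of what such tests might look like. It pins down the behavior of the original function with a table of inputs and expected outputs; the table format and error message are assumptions, not the commenter's code.

```javascript
// Original function from the post, copied here so the sketch is self-contained.
const lineChecker = (line, isFirstLine) => {
  let document = ``;
  if (line !== "" && isFirstLine) {
    document += `<h1>${line}</h1>`;
  } else if (line !== "" && !isFirstLine) {
    document += `<p>${line}</p>`;
  } else if (line === "") {
    document += "<br />";
  }
  return document;
};

// Each row: [line, isFirstLine, expected output].
const cases = [
  ["Title", true, "<h1>Title</h1>"],
  ["Body", false, "<p>Body</p>"],
  ["", true, "<br />"],
  ["", false, "<br />"],
];

for (const [line, isFirstLine, expected] of cases) {
  const actual = lineChecker(line, isFirstLine);
  if (actual !== expected) {
    throw new Error(`lineChecker(${JSON.stringify(line)}, ${isFirstLine}) returned ${actual}`);
  }
}
```

Because the cases only describe inputs and outputs, the same table can be run unchanged against any refactored version that keeps the signature.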
And it's a choice between TDD and BDD. It all depends on the application and the preferred working approach.
And in this case the question is to refactor the code. Basically, there would be no need to change the tests. You would still pass in the same input and expect the same output.
The exception is the 2 or 3 solutions that added new functions. Only in that case would you write new tests.
Good suggestion :)
I would first start thinking about what this function actually does.
For me the name `lineChecker` is a little bit misleading because it looks like the function actually parses strings to HTML.
This is what I came up with:
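A sketch of a renamed and typed version. The replies indicate the commenter's version introduced types; JSDoc annotations stand in for them here, and the name `parseLineToHtml` is an assumption.

```javascript
/**
 * Converts a single line of text into an HTML string.
 * @param {string} line - the raw line of text
 * @param {boolean} isFirstLine - whether this line should become the heading
 * @returns {string} an <h1>, a <p>, or a <br /> for an empty line
 */
const parseLineToHtml = (line, isFirstLine) => {
  if (line === "") return "<br />";
  return isFirstLine ? `<h1>${line}</h1>` : `<p>${line}</p>`;
};
```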
For a function this small there isn't really much to refactor. Parsing markdown has been done thousands of times already and there are libraries that do this perfectly.
Cool puzzle nonetheless.
Adding the types makes sense.
Only the `else if`/`else` statement could be replaced. My suggestion, based on your example:
I'd suggest putting the `line` argument first in the function, as `line` is more important and `isFirstLine` can have a default value.
True, I just copy-pasted it and didn't read the arguments.
The renaming idea + introducing types is great!
To me, the method seems to be more of a `print` than a `parse` method (but I don't have the full context, just saw it in isolation).
Extra bonus points for renaming the function. This would be my first improvement. Even if you left the function as is and just renamed it to describe its purpose that's a huge improvement in my book.
Assumption:
Nice refactoring, I like the tag extraction and the input type checking!
THX.
I'd address two problems:
I don't know what this function does
`lineChecker` doesn't tell me what it's checking, or what a "line" is, for that matter. I can guess it's a single line from some text based on the contents of the function, but...
This function doesn't do what it says
This function returns a formatted string based on the current line. It's producing output, not checking something. I'd expect a "check" function to return some meta information or a boolean regarding the passed data's validity.
As far as refactoring goes, there's no point in defining the `document` variable at all, and its name is misleading - it's a single HTML element as a string. "Document" is a word we associate with the DOM or, well, entire documents.
The last `else if` is redundant, it's the same as an `else` there.
I'd probably do this, but I'm not overjoyed about it:
Triple-backtick `javascript` worked for me to get JS formatting.
I just changed it to that and it still breaketh!
`javascript` needs to be immediately after the 3 backticks, and needs to be followed by a newline.
It is, and it's exactly how I've done it in all my other posts where it works fine. I'm not sure what's wrong with this one. Anyway, people can imagine what it's like with colours :)
Simplified one liner, with default props
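A sketch of what such a one-liner might look like. The default values (`""` and `false`) are assumptions; the commenter's choices are not shown.

```javascript
// Nested conditional expression; defaults apply only when an argument is undefined.
const lineChecker = (line = "", isFirstLine = false) =>
  line === "" ? "<br />" : isFirstLine ? `<h1>${line}</h1>` : `<p>${line}</p>`;
```

Calling `lineChecker()` with no arguments now yields a line break, and `lineChecker("text")` yields a paragraph.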
Note that this isn't a very scalable solution. Scaling this will need a complete rewrite.
The bigger answer is "it depends". If this is it (which it probably isn't), you can use my solution. If not, there are many ways of making this extensible, like mapping conditions and tags in an array so we can simply add an item to the array to add more elements. Or we could refactor the wrapping in elements to a custom function. There are many more ways we can do this, I'm just mentioning the ones which come to mind.
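The array-mapping idea could be sketched as follows. The `rules` shape (`matches`/`render`) and the name `lineToHtml` are assumptions made for illustration.

```javascript
// Ordered list of rules: the first rule whose condition matches wins.
const rules = [
  { matches: (line) => line === "", render: () => "<br />" },
  { matches: (line, isFirstLine) => isFirstLine, render: (line) => `<h1>${line}</h1>` },
  // Fallback rule: always matches, so find() below never returns undefined.
  { matches: () => true, render: (line) => `<p>${line}</p>` },
];

const lineToHtml = (line, isFirstLine = false) =>
  rules.find((rule) => rule.matches(line, isFirstLine)).render(line);
```

Supporting another element then means adding one entry before the fallback rule, without touching `lineToHtml` itself.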
Rerefactor based on Keff's answer.
Theoretically, this is the most optimized you can get.
BUT, if we use TypeScript, and go the mathematical route of only using one letter per var-name, we get this:
Assuming people use the correct types in JS, as if we were in a strongly typed language, we get the final result of:
But, we can make `e` a one-time expression:
LOOK! It's a 2-liner! :O
However, (to me) this looks like the end of the road for refactoring. I don't think we can make this any smaller, and if you can, it's because of a very niche detail.
But I think this is the best optimization from the original function.
Your solution is very hard to understand and read.
Refactoring is not about making the code smaller.
I prefer `===` checks instead of `==`.
I prefer `!l?.length` instead of `l.length == 0`.
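A short sketch of the practical difference between the two checks. The function names are made up for the comparison, and `l` follows the one-letter naming used in the comment above.

```javascript
// Plain length check: throws a TypeError when l is null or undefined.
const isBlankLoose = (l) => l.length == 0;

// Optional chaining: true for "", null and undefined, without throwing.
const isBlankSafe = (l) => !l?.length;
```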
According to W3C guidelines, a space should be included before the trailing / and > of empty elements, for example, `<br />`. These guidelines are for XHTML documents to render on existing HTML user agents.
Then I think the people who submitted those one-liners (Which are even harder to understand than my code!) misunderstood the point. :p
the one line solutions with nested ternary expressions are also difficult to understand.
Goals for refactoring code: readability, understandability, maintainability, performance, usage of latest code style / programming / functions / methods
I don't think your example works as intended.
Also, why is that? Does the space do something to make the HTML more understandable to some interpreters?
edited my comment
According to W3C guidelines, a space should be included before the trailing / and > of empty elements, for example, `<br />`. These guidelines are for XHTML documents to render on existing HTML user agents.
I came up with two revisions. The first is what I would actually prefer because of its readability. The second was just for fun.
This is nice and readable -- and a trick I learned early on was to exit early if you have a simple condition that determines the outcome.
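A sketch of the early-exit idea applied to this function. Choosing the tag name through a `tag` variable is an assumption, not necessarily what the commenter wrote.

```javascript
const lineChecker = (line, isFirstLine) => {
  // Guard clause: the empty line is settled first, so nothing below re-checks it.
  if (line === "") {
    return "<br />";
  }

  const tag = isFirstLine ? "h1" : "p";
  return `<${tag}>${line}</${tag}>`;
};
```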
This one does the same, but more concise.
The parentheses are not necessary, but they make it more readable. Also, I broke this into multiple lines to avoid horizontal scroll, but this is actually a single-line function.
I like the shortest way, but it doesn't seem very readable.
I tend to prefer a balance between smaller and readable code:
But as with many things, this is just a personal preference, and the team conventions should always supersede.