Introduction
In the first article we learned about Separation of Concerns: how to see a problem and how to start separating it. Now we move to the next step: learning how to judge the problem.
Once we can identify structural problems in code, the next question is not just what the problems are, but which ones need more attention. Which problems should be solved as soon as possible, and which ones are still good enough for the current context?
The main difference from the previous article is that the question is no longer just:
what concerns exist in this code?
but now becomes:
where is the code structure starting to become unhealthy?
This part is about learning how to look at working code and ask:
- what kind of structural issue is happening here?
- how serious is it?
- what should be refactored first?
- what is a real problem, and what is only a smaller imperfection?
We will learn about code smells that can be identified in code. But you need to remember it's not about memorizing famous smell names. It's about learning how to judge structural risk.
What is Code Smell
Based on the GeeksforGeeks article about Code Smells, the term code smell was first introduced by *Kent Beck*, an American software engineer and the creator of Extreme Programming. When we work on an application and write code for it, we notice patterns that need to be refactored: patterns that duplicate logic, complicate it, or make code dependent on other code. Such patterns are called code smells, and detecting them is called code smelling.
So in simple terms, a code smell is a sign that something in the code structure may cause problems later, even if the code still works right now.
So smells are more like:
- warnings
- indicators
- structural pressure points
This helps us answer a question that came from the first article:
If this code grows, where will changes become more expensive than they should be?
There are many categories of code smells. Some of the main categories are Bloaters, Object-Orientation Abusers, Change Preventers, Dispensables, and Couplers. Each of these categories has its own examples. For instance, Bloaters include Long Method, Large Class, Primitive Obsession, Long Parameter List, and Data Clumps. You can read a more detailed explanation of each category on the Refactoring Guru website.
To make them easier to remember, I asked ChatGPT to create a simplified version for me. After reviewing it, I think it is close enough to summarize the actual list above. So for this article, here are the code smell categories that I will use:
- poor boundaries
- fragile change points
- duplication
- shotgun surgery
- large function / mixed responsibility
- god object
- feature envy
Code Smell Categories
Poor boundaries
Poor boundaries happen when different responsibilities are placed too close together, or when it is not clear where one responsibility should stop and another should begin. This usually makes the code harder to understand because input/output, state changes, business logic, and flow control start mixing in the same place. In simple terms, poor boundaries mean the code has weak separation between jobs that should be clearer.
Fragile change points
Fragile change points are parts of the code where even a small change feels risky. A simple update can accidentally affect other behavior because the code is too tightly connected, too unclear, or too overloaded with responsibility. This smell matters because it shows where the code is starting to resist change, even if it still works today.
Duplication
Duplication happens when the same logic, structure, or decision appears in more than one place. This is not only about identical lines of code. It can also mean repeating the same idea in slightly different forms. Duplication becomes a problem because when the rule changes, we may need to remember to update it in multiple places, and it becomes easy for those places to drift apart.
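As a minimal sketch (the helper names are hypothetical, not from the Todo app), duplication of a rule rather than of exact lines might look like this:

```swift
import Foundation

// Hypothetical sketch: the same "title must not be blank" rule
// written in two places. If the rule ever changes (say, also
// rejecting overly long titles), it is easy to update one copy
// and forget the other, so the two slowly drift apart.
func isValidTitleForAdd(_ title: String) -> Bool {
    !title.trimmingCharacters(in: .whitespaces).isEmpty
}

func isValidTitleForRename(_ title: String) -> Bool {
    // the same decision, repeated in a slightly different context
    !title.trimmingCharacters(in: .whitespaces).isEmpty
}
```

The usual fix is to give the rule a single owner, for example one `isValidTitle(_:)` function used by both call sites.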
Shotgun surgery
Shotgun surgery happens when one small change forces us to edit many different parts of the codebase. Instead of changing one clear owner, we have to “shoot” changes across multiple files, functions, or objects. This is a strong sign that responsibility ownership is weak, because one concern is spread too widely.
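As a small, hypothetical sketch (simplified from the Todo app's pattern): the 1-based "todo number" shown to the user is converted to a 0-based index separately in each function, so changing that convention would mean editing every copy:

```swift
struct TodoItem {
    var title: String
    var isDone: Bool
}

var todos = [TodoItem(title: "Write draft", isDone: false)]

// Copy #1 of the "user number -> array index" decision.
func toggleTodo(number: Int) {
    let index = number - 1
    guard todos.indices.contains(index) else { return }
    todos[index].isDone.toggle()
}

// Copy #2 of the same decision. One conceptual change (how user-facing
// numbers map to items) now forces edits in both places: a small-scale
// shotgun surgery that only gets worse as more operations are added.
func removeTodo(number: Int) {
    let index = number - 1
    guard todos.indices.contains(index) else { return }
    todos.remove(at: index)
}
```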
Large function / mixed responsibility
A large function is not bad just because it is long. It becomes a smell when that function starts doing many unrelated jobs at once. This usually means user input, validation, business logic, state mutation, output rendering, or flow control are all being handled together. The real problem is not the size alone, but the mixed responsibility inside it.
God object
A god object is an object that knows too much, owns too much, or controls too much. Instead of having one focused responsibility, it becomes the place where many unrelated concerns gather. This kind of object often becomes a central dependency in the system, which makes the code harder to understand, harder to test, and harder to change safely.
Feature envy
Feature envy happens when a function seems more interested in another object’s data than its own. Instead of using its own responsibility well, it keeps reaching into another object and doing work that probably belongs there. This is usually a sign that behavior is living in the wrong place, and that the ownership of logic may need to be reconsidered.
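As a minimal sketch (hypothetical names, not from the Todo app), here is a function that is only interested in another type's data, and one way to move the behavior next to the data it envies:

```swift
struct TodoItem {
    var title: String
    var isDone: Bool
}

// This free function uses nothing of its own; it only reads
// TodoItem's fields. A small case of feature envy.
func formatTodoLine(_ todo: TodoItem) -> String {
    let mark = todo.isDone ? "[x]" : "[ ]"
    return "\(mark) \(todo.title)"
}

// One possible fix: let the owner of the data own the behavior.
extension TodoItem {
    var displayLine: String {
        let mark = isDone ? "[x]" : "[ ]"
        return "\(mark) \(title)"
    }
}
```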
Code Smell Severity
Smell severity is about how serious a code smell is in a specific codebase and context.
There are many ways to rate smell severity. A lot of research has explored this, including work that uses machine learning to detect and evaluate code smells. Different approaches may use different criteria and scales. But for this article, I will use a simpler rating system:
- weak
- medium
- strong
Smell severity is basically about this question:
How serious is this issue in this specific codebase?
The important point is that a code smell category and its severity are not the same thing. A smell may exist, but its severity can still be weak, medium, or strong depending on the context. This means:
Not every true smell is automatically a strong smell.
Writer note:
Here is some research about it:
https://www.sciencedirect.com/science/article/abs/pii/S0950705117301880, https://www.sciencedirect.com/science/article/abs/pii/S0957417424003488, https://sol.sbc.org.br/index.php/sbes/article/view/30345
You can find more using these keywords: code smell severity, code smell prioritization, code smell risk assessment, code smell severity classification, machine learning for code smell severity
Understanding Smell Severity More
When I first tried to understand this concept, I was confused about how to actually judge the code and give a rating. If I found one issue here and another issue there, should I rate both as strong just because both are real smells? That part confused me a lot.
After doing more research and discussing it with GPT, I understood the idea better when I tested it with a simple comparison.
For example, using Bahasa Indonesia for variable names in a codebase can be a real issue. It can reduce readability, especially if the team expects one shared language. But now compare that with a public function that can mutate the database in a dangerous way. Both are still valid problems, yet they clearly do not have the same severity.
That comparison helped me see the difference more clearly. The first issue may still count as a smell, but in that context it feels weak or mild. The second one is much more serious because the risk and impact are much higher.
This made me realize an important point:
smell presence and smell severity are different judgments
In other words, recognizing that a smell exists is only the first step. After that, I still need to judge how serious it is in the current codebase. That usually means comparing things like:
- risk
- impact
- centrality
- future damage
And those depend on the context, such as:
- the size of the app
- how often the smell appears
- how central that code is
- how much future pain it may create
- whether it is the main structural problem or just a side issue
It also means comparing one smell against the other smells in the same codebase, not judging it in isolation.
That was the point where the whole idea of smell severity started to make much more sense to me.
Smell Classification for ToDo App
We will still use the base code from the previous article. Try to classify the smell categories and their severity for this code yourself before looking at my answers.
import Foundation

struct TodoItem {
    var title: String
    var isDone: Bool
}

var todos: [TodoItem] = []

func showMenu() {
    print("")
    print("=== Todo App ===")
    print("1. Show all todos")
    print("2. Add todo")
    print("3. Toggle todo")
    print("4. Remove todo")
    print("5. Exit")
    print("Choose:")
}

func showTodos() {
    if todos.isEmpty {
        print("No todos yet.")
        return
    }
    for (index, todo) in todos.enumerated() {
        let mark = todo.isDone ? "[x]" : "[ ]"
        print("\(index + 1). \(mark) \(todo.title)")
    }
}

func addTodo() {
    print("Enter todo title:")
    if let input = readLine(), !input.trimmingCharacters(in: .whitespaces).isEmpty {
        todos.append(TodoItem(title: input, isDone: false))
        print("Todo added.")
    } else {
        print("Invalid title.")
    }
}

func toggleTodo() {
    showTodos()
    if todos.isEmpty {
        return
    }
    print("Enter todo number to toggle:")
    if let input = readLine(), let number = Int(input) {
        let index = number - 1
        if index >= 0 && index < todos.count {
            todos[index].isDone.toggle()
            print("Todo updated.")
        } else {
            print("Invalid number.")
        }
    } else {
        print("Please enter a valid number.")
    }
}

func removeTodo() {
    showTodos()
    if todos.isEmpty {
        return
    }
    print("Enter todo number to remove:")
    if let input = readLine(), let number = Int(input) {
        let index = number - 1
        if index >= 0 && index < todos.count {
            let removed = todos.remove(at: index)
            print("Removed: \(removed.title)")
        } else {
            print("Invalid number.")
        }
    } else {
        print("Please enter a valid number.")
    }
}

func runApp() {
    var isRunning = true
    while isRunning {
        showMenu()
        let choice = readLine() ?? ""
        switch choice {
        case "1":
            showTodos()
        case "2":
            addTodo()
        case "3":
            toggleTodo()
        case "4":
            removeTodo()
        case "5":
            isRunning = false
            print("Goodbye.")
        default:
            print("Unknown option.")
        }
    }
}

runApp()
Smell Classification Answer
Before giving the classification, I want to repeat one important thing: this is not about memorizing famous smell names and forcing every codebase into them. The goal is to judge structural risk in context. A smell may be real, but its severity still depends on how central it is, how much damage it can create, and how strongly it affects change in the current codebase.
From my perspective, this is the smell classification I ended up with for the Todo app above.
Poor boundaries : strong
I identified poor boundaries as a strong smell.
My reasoning connects directly to the first article. In Module 1, I used boundaries such as input/output, state, data, app flow, and business logic as the main lens for reading the code. In this Todo app, functions like addTodo(), toggleTodo(), and removeTodo() mix several of those together in the same place.
For example, addTodo() mixes:
- input
- output
- state mutation
- todo-related logic/business logic
That means the function does not have clear responsibility boundaries.
The important refinement here is that this smell is not strong just because “there are many things inside one function”. It is strong because those responsibilities can change for different reasons. The UI text may change. The input mechanism may change. The todo rules may change. Those should not all force edits in the same function. That is why poor boundaries is a real strong smell here.
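To make that concrete, here is the app's addTodo() again, annotated with the concern each line carries (the comments are mine):

```swift
import Foundation

struct TodoItem {
    var title: String
    var isDone: Bool
}

var todos: [TodoItem] = []

func addTodo() {
    print("Enter todo title:")                                  // output
    if let input = readLine(),                                  // input
       !input.trimmingCharacters(in: .whitespaces).isEmpty {    // validation rule
        todos.append(TodoItem(title: input, isDone: false))     // state mutation
        print("Todo added.")                                    // output / feedback
    } else {
        print("Invalid title.")                                 // output / feedback
    }
}
```

Four or five distinct reasons to change, all inside one short function with no clear boundary between them.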
Fragile change points : strong
I also identified fragile change points as strong.
One example that helped me see this was imagining a model change. If I add something like priority to TodoItem, that could force changes to:
- object creation
- input flow
- output display
- maybe sorting or filtering later
I also connected this to changing the medium of the app itself, for example moving from a CLI app to a graphical UI. Because input and output are mixed directly into the logic, changing the interface would be more painful than it should be.
The important refinement here is that the model changing is not bad by itself. The pain happens because that change ripples into multiple mixed places. That is why fragile change points are strong in this codebase: one concept change can leak through many parts of the current structure.
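A quick sketch of that ripple (the priority field is my hypothetical addition, not part of the app):

```swift
// Adding one concept to the model...
struct TodoItem {
    var title: String
    var isDone: Bool
    var priority: Int   // new field
}

// ...immediately touches every creation site, because the
// synthesized initializer now needs one more argument...
let item = TodoItem(title: "Write article", isDone: false, priority: 1)

// ...and every display site, because each one must decide
// how (or whether) to show the new concept.
let mark = item.isDone ? "[x]" : "[ ]"
let line = "\(mark) (p\(item.priority)) \(item.title)"
```

In a structure with clearer boundaries, most of these touch points would collapse into one or two owning places.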
Duplication : medium
I noticed duplication in:
- repeated empty checking
- repeated input validation
- the repeated overall shape between toggleTodo() and removeTodo()
At first, I was unsure whether duplication might actually be strong, because repeated validation across multiple places can create real pain if I forget to update one of them. But after rethinking the severity, I think the duplication here is medium.
Why not strong? Because the larger structural problems in this codebase are still:
- poor boundaries
- mixed responsibilities
- weak ownership
So duplication definitely exists, but it is not the most central problem yet.
One important lesson I got from this is that duplication is not only about copy-paste text. It can also appear as duplicated control-flow shape. That helped me see the smell more broadly.
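To sketch what "duplicated control-flow shape" means here, the skeleton that toggleTodo() and removeTodo() share (parse a number, convert it to an index, bounds-check, then act) can be extracted so the shape lives in one place. The helper name withValidIndex is my own, not from the article's code:

```swift
struct TodoItem {
    var title: String
    var isDone: Bool
}

var todos = [TodoItem(title: "Draft", isDone: false)]

// The control-flow shape both functions share:
// parse the input, convert 1-based to 0-based, bounds-check, act.
func withValidIndex(from input: String?, perform action: (Int) -> Void) {
    guard let input = input, let number = Int(input) else {
        print("Please enter a valid number.")
        return
    }
    let index = number - 1
    guard todos.indices.contains(index) else {
        print("Invalid number.")
        return
    }
    action(index)
}

// The two operations now differ only in the action they pass in.
withValidIndex(from: "1") { todos[$0].isDone.toggle() }   // toggle
withValidIndex(from: "1") { _ = todos.remove(at: $0) }    // remove
```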
Shotgun surgery : medium
I understood shotgun surgery as one conceptual change forcing many small edits across multiple places.
I connected this to things like:
- model changes
- validation changes
- input flow changes
The smell is present, but in this app I would rate it as medium, not strong.
The reason is contextual. The app is still small, so the edit points are visible and manageable. The conditions for shotgun surgery already exist, but they have not exploded yet. So the smell is real, but at this stage it looks more like a visible structural risk than the most dominant pain point.
This was a useful reminder that smell severity always depends on context.
Writer Note:
Shotgun surgery is related to fragile change points, but they are not the same. Fragile change points focus on how risky and unstable a change location already is, while shotgun surgery focuses on how widely one conceptual change must spread across the code. In this Todo app, the core functions are already fragile because they mix multiple responsibilities, which is why fragile change points can be rated strong. But the codebase is still small, so although one change may spread to several functions, that spread is still limited and manageable. That is why shotgun surgery is better rated as medium here, not strong.
Large function / mixed responsibility : strong
This was one of the most important corrections for me.
At first, I thought the functions were not very large by line count, so maybe the smell was only mild. But that turned out to be the wrong lens.
The key lesson here was:
line count is not the main issue
responsibility load is
Even though addTodo(), toggleTodo(), and removeTodo() are still short functions, they mix:
- output and prompting
- input
- validation
- mutation
- feedback
So the real smell is not “large function” in a superficial line-count sense. The real smell is mixed responsibility, and that is strongly present.
This was an important calibration point for me, because it connects to a recurring weakness from Module 1 too: I sometimes underestimate mixed responsibility when the function is still short. After correcting that, I now rate this smell as strong.
God object : weak / not really present
I rated god object as weak or not really present.
The reason is simple: this app is still too small to honestly call anything a god object. There is no single giant object that owns everything in a meaningful object-oriented sense. The bigger problems here are still poor boundaries and mixed responsibilities.
Feature envy : weak / not really present
I also rated feature envy as weak or not really present.
Again, the reason is that this is not the main pattern in the current structure. The app is too small, and the code is not interacting in a way that makes feature envy the most useful label.
Create Refactoring Plan
After identifying the smells and their severity, we can start creating a refactoring plan based on which problems matter most. Let me ask you first: what do you think should be the first priority? Try answering that in your own mind, along with the reasoning behind it.
For me, the first priority is poor boundaries. The reason is that this is the smell most closely connected to the deeper structural problem in the Todo app. Input, output, state mutation, and business logic are still too mixed together. As long as those boundaries remain unclear, other problems like fragile change points and duplication will keep appearing more easily.
Because of that, my first refactor direction was to create a store that holds state and business logic.
The idea is to introduce something like this:
final class TodoStore {
    private var todos: [TodoItem] = []

    var items: [TodoItem] {
        todos
    }

    func addTodo(title: String) throws {
        // validate the title, then append a new TodoItem
    }

    func toggleTodo(at index: Int) throws {
        // bounds-check the index, then toggle isDone
    }

    func removeTodo(at index: Int) throws -> TodoItem {
        // bounds-check the index, then remove and return the item
    }
}
This felt like a good first move because it:
- gives state a clearer owner
- centralizes mutation
- separates todo-related logic from input/output
- reduces ripple effects
- makes future changes safer
So this became the first refactor direction I chose.
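To show the payoff without running the whole refactor here, this is only a sketch: a minimal, hypothetical subset of such a TodoStore with just addTodo filled in (the error name TodoError is mine), plus the thinner CLI flow that would sit on top of it:

```swift
import Foundation

struct TodoItem {
    var title: String
    var isDone: Bool
}

enum TodoError: Error {
    case emptyTitle
}

// Minimal subset of the store idea: state is private,
// and the validation rule has exactly one owner.
final class TodoStore {
    private var todos: [TodoItem] = []
    var items: [TodoItem] { todos }

    func addTodo(title: String) throws {
        let trimmed = title.trimmingCharacters(in: .whitespaces)
        guard !trimmed.isEmpty else { throw TodoError.emptyTitle }
        todos.append(TodoItem(title: trimmed, isDone: false))
    }
}

// The CLI flow keeps only input and output; the rule and the
// mutation now live behind the store boundary.
let store = TodoStore()

func addTodoFlow() {
    print("Enter todo title:")
    let input = readLine() ?? ""
    do {
        try store.addTodo(title: input)
        print("Todo added.")
    } catch {
        print("Invalid title.")
    }
}
```

Compared with the original addTodo(), a UI text change, an input mechanism change, and a todo rule change now land in different places instead of the same function.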
How to refactor?
If this part feels a little jumpy because I suddenly introduced TodoStore and showed the code without walking through the full refactoring process, that is completely fair.
I will cover the actual refactoring process in the next article of this series (What I’m Learning About Writing Better Structured Code). The next article is “Guide to Solve the Problem: Introduction to SOLID Principles.” In that article, I will focus more on how I gradually moved from the original Todo app into a structure with clearer ownership, better boundaries, and more focused responsibilities, using the SOLID principles as a guide.
From the Writer
Hello, allow me to introduce myself. I’m Cakoko. We’ve reached the end of this article, and I sincerely thank you for taking the time to read it.
If you have any questions or feedback, feel free to reach out to me directly via email at cakoko.dev@gmail.com. I’m more than happy to hear your thoughts, whether they are about my English writing, the technical ideas in this article, or anything I may have misunderstood. Your feedback will help me grow.
I look forward to connecting with you in future articles. By the way, I'm a mobile developer, a final-year Computer Science student, and an Apple Developer Academy @ IL graduate. I'm also open to various opportunities such as collaborations, internships, or full-time positions. It would make me very happy to explore those possibilities.
Until next time, stay curious and keep learning.
Open for Feedback
This article is part of my personal learning journey. It may not be completely accurate or perfect, and that is okay. I’m sharing what I’ve learned so far in the hope that it can also help others who are exploring similar topics.
If you have any feedback, suggestions, or corrections, I would truly appreciate them. I’m always open to learning more and improving along the way.