Author: Andrey Karpov
In this article, we'll discuss how the PVS-Studio static analyzer encourages developers to refactor their code. After all, the shorter, simpler, and clearer the code is, the fewer errors it contains.
In the previous article — "Let's check the qdEngine game engine, part one: top 10 warnings issued by PVS-Studio" — we've already looked at an interesting example of code refactoring that was encouraged by the PVS-Studio warning message (see the tenth code fragment). Let's look at some more examples of how we can enhance the code using a static analyzer.
Overcomplicated 'restore' function
class qdCamera
{
....
sGridCell *Grid;
....
};
bool qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
if (Grid)
delete[] Grid;
Grid = new sGridCell[sx * sy];
if (!Grid)
return false;
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
return true;
}
Here we have a function that reallocates a buffer of the required size. The analyzer spots this code because it detects a meaningless check of the pointer returned by the operator new[]. I'm talking about the following lines:
Grid = new sGridCell[sx * sy];
if (!Grid)
return false;
The PVS-Studio warning: V668 There is no sense in testing the 'Grid' pointer against null, as the memory was allocated using the 'new' operator. The exception will be generated in the case of memory allocation error. qd_camera.cpp 910
Indeed, if there isn't enough memory to create an array, the operator new[] throws an exception of the std::bad_alloc type. You can get a null pointer only using new (std::nothrow) [], but this isn't our case.
Here's a function where an unnecessary check has been removed:
bool qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
if (Grid)
delete[] Grid;
Grid = new sGridCell[sx * sy];
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
return true;
}
Now the restore function always returns true. It either throws an exception or works properly. Returning a success status (true or false) no longer makes sense. We can simplify the function by removing the return value:
void qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
if (Grid)
delete[] Grid;
Grid = new sGridCell[sx * sy];
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
As a bonus, we no longer need to check what the function returned in other code fragments.
What else can we enhance? There's no point in checking the pointer before calling the operator delete[]. The operator handles null pointers correctly (in this case, it does nothing).
Someone may argue that such a check is a micro-optimization, because if the pointer is null, you don't need to call the operator. That's nonsense. The function call time can be neglected when there are heavy operations such as memory allocation and deallocation. I've covered the topic in the following article: "Shall we check pointer for NULL before calling free function?" As you can see from the title, the article is about the free function, but it's the same with the operator delete.
The code without unnecessary checks:
void qdCamera::restore(sGridCell* grid,
int sx, int sy,
int csx, int csy)
{
delete[] Grid;
Grid = new sGridCell[sx*sy];
memcpy(Grid, grid, sizeof(sGridCell) * sx * sy);
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
We've significantly shortened the code without losing anything. Unfortunately, both the old and the new code still have a weakness:
delete[] Grid;
Grid = new sGridCell[sx * sy];
....
qdCamera::~qdCamera()
{
if (GSX)
{
delete[] Grid;
}
}
If an exception occurs when the operator new is called, the stack unwinding and destruction of various objects begin. The destructor of the qdCamera object is also called. And the Grid pointer may not be null, but it's no longer valid. The array is destroyed just before the operator new is called. This results in undefined behavior.
The simplest fix is to add pointer zeroing:
delete[] Grid;
Grid = nullptr;
Grid = new sGridCell[sx * sy];
That's a bad fix, though. We're wasting our energy on a lost cause. The main issue in the code is manual memory management. It causes programmers to write long, buggy code.
The correct solution is to use a smart pointer or std::vector. Let's start with the std::unique_ptr smart pointer, as std::vector isn't always practical and efficient.
class qdCamera
{
....
std::unique_ptr<sGridCell[]> Grid;
~qdCamera() = default;
....
};
void qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
Grid = std::make_unique<sGridCell[]>(sx * sy);
std::copy(grid, grid + sx * sy, Grid.get());
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
This is much better:
- we've shortened the function code;
- we've got rid of low-level new/delete calls thanks to a smart pointer and std::make_unique;
- we've secured the code;
- now we don't need a destructor at all.
Finally, here's an example of using std::vector.
class qdCamera
{
....
std::vector<sGridCell> Grid;
~qdCamera() = default;
....
};
void qdCamera::restore(sGridCell *grid,
int sx, int sy,
int csx, int csy)
{
Grid.clear();
Grid.reserve(sx * sy);
std::copy(grid, grid + sx * sy, std::back_inserter(Grid));
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
However, if you switch to containers during refactoring, the interface of many functions changes too. This is what the restore function is likely to look like:
void qdCamera::restore(const std::vector<sGridCell> &grid,
int sx, int sy,
int csx, int csy)
{
Grid = grid;
GSX = sx;
GSY = sy;
cellSX = csx;
cellSY = csy;
}
Although, I'm getting way off the point of the story. Obviously, during extensive refactoring, the code may become completely different. The restore function, for example, may disappear or transform into something else.
The main idea is that the PVS-Studio analyzer warnings help you write shorter and safer code.
C-style memory management in arButton class
In the arButton class, manual memory management is even more apparent than in the previous case. The class creates copies of strings and can return pointers to them.
There are string copy functions:
class arButton
{
....
void set_obj(const char* p)
{ if(obj_name) free(obj_name);
obj_name = strdup(p); }
void set_obj_regvalue(const char* p)
{ if(obj_name_regvalue) free(obj_name_regvalue);
obj_name_regvalue = strdup(p); }
void set_cmdline(const char* p)
{ if(cmd_line) free(cmd_line); cmd_line = strdup(p); }
void set_regkey(const char* p)
{ if(reg_key) free(reg_key); reg_key = strdup(p); }
void set_reg_exec_path(const char* p)
{ if(reg_exec_path_value) free(reg_exec_path_value);
reg_exec_path_value = strdup(p); }
void set_checkstr(const char* p)
{ if(check_after_exec) free(check_after_exec);
check_after_exec = strdup(p); }
....
char *obj_name;
char *obj_name_regvalue;
char *cmd_line;
char *reg_key;
char *reg_exec_path_value;
char *check_after_exec;
};
Each string has a its own getter, but we don't care about that for now. Here's just one example:
char* get_obj(void) { return obj_name; }
The analyzer issues six similar warnings about potential memory leaks for this class. Here's just one such warning from PVS-Studio, so as not to make the article too long: V773 The 'obj_name' pointer was not released in destructor. A memory leak is possible. ar_button.cpp 50
The thing is, the class destructor doesn't do anything.
arButton::~arButton(void)
{
}
Strings created using strdup aren't destroyed in set_* functions. Memory leaks are possible.
We could fix the code by refining the destructor, but would we really want to take that intermediate step? The point is that we not only want to fix the code, we'd like to enhance it right away. Let's look at one of the set functions again:
void set_obj(const char* p)
{ if(obj_name) free(obj_name);
obj_name = strdup(p); }
Here are the weaknesses in the code:
- The check before calling free is unnecessary. The free function does nothing if NULL is passed to it.
- There's no check for the pointer returned by the strdup function. So, the check is moved to the external code. And we can't determine why the get_* function returns NULL. There are two options. The first one is that the set_* function hasn't been called. The second option is that a memory allocation error has occurred, and the application has been disrupted.
- As we have already seen above, such code with manual memory management causes errors.
We can rewrite the code using smart pointers. However, this is pointless, as the best way is to use std::string. After all, the program is written in C++, not in C.
class arButton
{
....
void set_obj(const char* p) { obj_name = p; }
void set_obj_regvalue(const char* p) { obj_name_regvalue = p; }
void set_cmdline(const char* p) { cmd_line = p; }
void set_regkey(const char* p) { reg_key = p; }
void set_reg_exec_path(const char* p) { reg_exec_path_value = p; }
void set_checkstr(const char* p) { _after_exec = p; }
....
std::string obj_name;
std::string obj_name_regvalue;
std::string cmd_line;
std::string reg_key;
std::string reg_exec_path_value;
std::string check_after_exec;
};
Depending on the refactoring extent, the getter functions can look like this:
const char* get_obj(void) const { return obj_name.c_str(); }
Or like this:
const std::string& get_obj(void) const { return obj_name; }
Let's get back to the set functions. They're not "good" yet, but without external context, it's hard to say what's the best way to refine them. The functions have the following flaws so far:
- It's not clear what's the best way to handle cases where nullptr is passed to the function. Is that an "empty string" or "oh no, it's time to throw an exception"?
- Copying strings is inefficient. Before allocating a buffer for a string, it's necessary to calculate its length first.
Here's an abstract option for a possible enhancement:
....
void set_cmdline(std::string s) noexcept { cmd_line = std::move(s); }
void set_regkey(std::string s) noexcept { reg_key = std::move(s); }
....
Let's leave it at that, otherwise the question will follow: do we really need a layer of get/set functions? The code may need even more extensive refactoring, so we may end up with a completely different class structure.
Handy stuff for experienced programmers
Some readers may be a bit puzzled and thinking:
"What was that all about? We already know that manual memory management is bad, and it causes errors. Why would you mention the analyzer? We don't need it to see that the shown code needs to be rewritten".
I agree, but please note the following points:
- It was the analyzer that drew attention to the bad code.
- There's a reason not to do some abstract refactoring, but to fix bugs. If you always lack time for refactoring, give it a go! A good reason to justify the need to edit the code is to fix bugs/weak code fragments. You can also make the surrounding code look nicer :)
Thank you for reading! Download PVS-Studio to enhance your code.
Top comments (0)