Refactoring Tips and Tricks: From 1400 to 300 Lines of Code.
Click the links in the list to jump to the specific tips and tricks and skip the “cutscenes."
- Object Inheritance
- Using High-Level Functions
- Using GIT to Avoid Legacy Commented Code Blocks
- Unnecessarily Specific Functions and Repeated Code
- Make the Most of Your Data Sources
- Conclusion
The Background
This week, I had the privilege of working in a very interesting project: one of our clients received an email stating that the API version of the CRM they were using to process their commercial leads was going to be deprecated starting January 1, 2020. The client reached out to us and asked if we could update the library they were using to connect to the old API so that it worked with the V2 of the API. No big deal, right?
Well, of course, I immediately under budgeted the project and agreed with enthusiasm... but I was not ready to see what was under the hood of the library they were using to connect to the API.
It turns out that all of the scripting and communication to the API was written on not one, but two very legacy-ish Google script projects that resembled the example of how NOT to write maintainable code. (Don’t get me wrong - I never diss on someone else's work and I always try to be supportive and helpful to the developers, but as this was some development agency that was no longer connected to our client anymore. I had a good laugh on their expense.)
So, that’s where my journey in refactoring started. As this was for a client, I won’t be sharing the real code of his app, but I will be translating the issues that I found and their solutions in generic code.
Two Identical Projects?
The first thing I wanted to understand about this projects was simple: why two separate projects? I found that the reason for this was that they had a slightly different way to approach the handling of their leads, depending on the country of origin for the leads. All of the leads were treated the same, with the exception of commercial leads from Argentina, which were treated slightly different in a couple of steps of the processing.
This resulted in almost two identical copies of the same library. In an effort to avoid this, I decided to use object inheritance.
The Power of Object Inheritance
As this client already had a Laravel app already running, I decided to use it to save time on the setup and rewrote the library in PHP. Otherwise, I suppose Python would’ve been nice as well. (What would you have used?)
In order to tackle the problem of having two separate projects, I created one father class for the library with all the functionality and another special class that extended this father class.
Here's the parent class with all the functionality for handling the leads:
class LeadsHandler {
public $leadId;
public function __construct($leadId) {
$this->leadId = $leadId;
}
//This is the function that is different for the special case:
public function message(){
echo "I'm the global leadsHandler";
}
//..
/* OTHER VERY IMPORTANT FUNCTIONS THAT EVERY LEAD HANDLER HAS IN COMMON */
//..
}
Here's the class I used for the special case:
//For this class, the function message has been rewritten so that it does a different specific action, but all the other functionalities from the parent class will be the same.
class ArgentinaLeadsHandler extends LeadsHandler {
public function message() {
echo "Che, I'm the Argentinian leads handler!";
}
}
If we try it, we get this:
//These are hardcoded example values. In the real project, we got them through HTTP REST Requests.
$leadCountry = 'Japan';
$leadId = 'AABB12345';
if ($leadCountry == 'Argentina'){ //this is False
$leadHandler = new ArgentinaLeadsHandler($leadId);
}else{
$leadHandler = new LeadsHandler($leadId);
}
$leadHandler->message(); //This prints "I'm the global leadsHandler"
Now, if we change the leadCountry:
$leadCountry = 'Argentina';
$leadId = 'AABB12345';
if ($leadCountry == 'Argentina'){ //this is True
$leadHandler = new ArgentinaLeadsHandler($leadId);
}else{
$leadHandler = new LeadsHandler($leadId);
}
$leadHandler->message(); //This prints "Che, I'm the argentinian leads handler!"
There you have it. Now we don't need to have two almost identical classes and can just rewrite the few functions that are different for the special case.
If we wanted to make this scalable to other countries, we could create a class for each country with special needs and use a switch statement to instantiate the proper class.
//Switch approach
switch ($leadCountry) {
case "Argentina":
return new ArgentinaLeadsHandler();
case "Japan":
return new JapanLeadsHandler();
case "Colombia":
return new ColombiaLeadsHandler();
//Etc...
}
//This list can get very long if there are many different possible classes to instantiate.
However, since we are trying to keep the code minimal, we can go for this approach where we dynamically set the name of the class to instantiate:
//Shorter approach
$leadCountry = 'Argentina';
$class = $leadCountry . 'LeadsHandler'; //This returns a new instance of the ArgentinaLeadsHandler class
$leadHandler = new $class();
Keep in mind that you would have to follow the naming convention of finishing the classes country name with "LeadHandler" in order for this to work properly.
High-Level Functions
These are great to reduce the amount of lines in code while keeping the readability of it.
Using the Ternary Operator
If you're a somewhat experienced developer, you might have cringed a little bit at the "IF ELSE" block of the first section.That's because you could've replaced those 5 lines of code with a single line if I would've used the Ternary Operator.
This operator is present in most modern languages and works like this:
variable = condition ? value_if_true : value_if_false
For example, we could've rewritten these 5 lines:
if ($leadCountry == 'Argentina'){ //this is False
$leadHandler = new ArgentinaLeadsHandler($leadId);
}else{
$leadHandler = new LeadsHandler($leadId);
}
Into this single line:
$leadHandler = ($leadCountry == 'Argentina') ? new ArgentinaLeadsHandler($leadId) : new LeadsHandler($leadId);
Keep in mind that using the ternary operator can sometimes make your code a little less readable, so be smart when you use it. However, once you're comfortable with it, you're going to find yourself using it very often.
Research Your Framework Operators and Functions
While trying to make sense and rewrite this function, which can be fixed in many different ways, I browsed the Laravel documentation and I saw the function I was looking for: increment.
The following is the original (bad) function that required rewriting:
function updateLeadsCount($operator, $value){
//Basically, this function counts the total of leads and increases or decreases that value in the database.
$newLeadsCount = count($this->getRecords('leads'));
if($operator == 'decrease'){
$oldLeadsCount = $this->getLeadsCount();
$total = $newLeadsCount - $value;
DB::table('configs')->update('leadsCount', $total);
}else if($operator == 'increment'){
$oldLeadsCount = $this->getLeadsCount();
$total = $newLeadsCount + $value;
DB::table('configs')->update('leadsCount', $total);
}
}
Here is my rewritten version:
function updateLeadsCount($value){
//value is a positive or negative value that I will add up to the stored value 'leadsCount'
DB::table('configs')->increment('leadsCount', $value);
}
Now instead of having to retrieve the value, check if I have to increase or decrease it, and then store the value in the database, I can just pass that value to the function increment and it will do it all for me in one line!
It even supports negative values so I don't have to check if I have to use the function increment or decrement.
High-level functions can save you dozens of lines while increasing the readability of your code.
Using GIT to Avoid Legacy Commented Code Blocks
While browsing the library, I found a lot of test functions or deprecated functions that the previous developers were too afraid to delete. There were even comments warning not to delete these functions "just in case."
Although it is true that clients change their minds and requirements all the time, a huge commented code block is not the place to store legacy code that is not currently being used. My approach to avoid this is to always use a version control repository like GIT to store the different versions of my apps. This way I can make sure that I can always go back in time to take a look at the library code and, if necessary, retrieve functionalities if the client wants them back.
Regardless of this specific issue, you should always use GIT for a variety of reasons.
Unnecessarily Specific Functions and Repeated Code
Another problem I encountered was a bunch of very specific functions that all worked very similarly. For example, in order to get data from their CRM, they would have a function for each module to get the data as well as functions to get a specific public property of an object that had no need to be encapsulated.
To avoid this, I used generic functions and parameters to send the specifics of the different queries to perform.
Here's an example list of very specific functions:
//Getting one specific client:
function getClient($clientId){
$CRMconnection = new CRMconnection();
$CRMconnection->module = 'clients';
$CRMconnection->recordID = $clientId;
return $CRMconnection->get();
}
//Getting all clients:
function getClients(){
$CRMconnection = new CRMconnection();
$CRMconnection->module = 'clients';
return $CRMconnection->get();
}
//Getting all leads:
function getLeads(){
$CRMconnection = new CRMconnection();
$CRMconnection->module = 'leads';
return $CRMconnection->get();
}
//Getting specific data from client:
function getClientName($clientId){
$CRMconnection = new CRMconnection();
$CRMconnection->module = 'clients';
$CRMconnection->recordID = $clientId;
$client = $CRMconnection->get();
return $client->name;
}
//Getting one specific lead:
function getLead($leadId){
$CRMconnection = new CRMconnection();
$CRMconnection->module = 'leads';
$CRMconnection->recordID = $clientId;
return $CRMconnection->get();
}
//Custom specific get for client:
function getActiveClient($clientId){
$CRMconnection = new CRMconnection();
$CRMconnection->module = 'clients';
$CRMconnection->criteria = '(Active:true)';
$CRMconnection->recordID = $clientId;
return $CRMconnection->get();
}
//etc..
Here's the list of the two generic functions that suit all the other cases and more:
//Generic record get
function getRecord($module, $recordId, $criteria = null){
//I've also used the constructor class method with parameters to reduce more lines.
$CRMconnection = new CRMconnection($module, $recordId, $criteria);
return $CRMconnection->get();
}
//Generic records get:
function getRecords($module, $criteria = null){
$CRMconnection = new CRMconnection($module, null, $criteria);
return $CRMconnection->get();
}
The best thing about this approach of using generic functions is that, in case that the $CRMconnection class gets updated, I would only have to revisit the two generic functions instead of all the specific functions that were there in the first place.
Make the Most of Your Data Sources
When we are getting data from a data source like an API or a Database. I recommend trying to get the minimal raw data that you're gonna use.
For example, this library needs to get the latest uploaded lead in the system. In order to do that, they were getting all of the leads and iterating through them to find the latest one like this.
function getLatestLead(){
$latestLead = null;
$latestDate = date('01-01-1900');
$leads = $this->getRecords('leads');
foreach ($leads as $lead) {
if ($lead->date > $latestDate){
$latestDate = $lead->date;
$latestLead = $lead;
}
}
return $latestLead;
}
But, why don't we give the beaten up old server a rest and relay the work on the external data source?
The same data could've been retrieved with a function like this one:
function getLatestLead(){
//Here we are saying in the criteria that we just want one record and we want it to be the last one by specifying the order:
$criteria = '((Order:DESC)and(limit:1))';
$leads = $this->getRecords('leads', $criteria);
return $leads[0];
}
A similar approach can be used when querying databases.
Conclusion
I don’t think of myself as a refactoring guru, but I think that after going from 1400 lines of code written in two separate projects to just 300 lines of maintainable code, the project can definitely be considered a success. The code is much easier to follow, understand, and debug - and it works as well as the previous library.
However, I’m curious about how other developers would have approached the same project and issues. Which language and techniques would you have used?
Let me know in the comments.
Top comments (0)