Jarosław Szutkowski

Posted on Aug 22

When Classes Do Too Much: Using LCOM to Spot 'God Classes' in PHP

#php #cleanarchitecture #codequality #testing

Big, messy classes slow projects down and scare developers away.
In this post I’ll show you how to use LCOM to spot those "god classes" in your PHP code - and how to break them into something leaner and easier to test.

LCOM in PHP – spotting classes that do too much

Every project has them. The UserManager, the DataHelper, the dreaded Utils.

Classes so big and messy that nobody wants to touch them. They grow over time, swallow more responsibilities, and eventually slow the whole team down.

But how do you know when a class is doing too much?

Instead of guessing, you can measure it - with a metric called LCOM.

What is LCOM?

LCOM stands for Lack of Cohesion of Methods.

In simple terms, it measures how much the methods in a class actually belong together.

If most methods use the same set of properties, the class is cohesive - that’s good.

If methods work on completely different properties, the class is basically doing multiple jobs at once - that’s bad.

High LCOM = low cohesion = a "god class" that tries to handle too many responsibilities.

LCOM flavours (LCOM1 vs LCOM2)

There isn’t just one formula for LCOM. Over the years, different versions have been proposed.

The two most common are:

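LCOM1 – looks at pairs of methods: it counts the pairs that share no property, subtracts the pairs that share at least one, and never goes below 0. The more pairs that share nothing, the higher the score.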
LCOM2 – groups methods into clusters that share at least one property; the number of clusters is the score.

Quick examples

Messy class
- foo() uses $a,$d
- bar() uses nothing
- baz() uses $b,$d
- qux() uses $c

-> LCOM2 = 3 clusters ({foo,baz}, {bar}, {qux}) -> 3
-> LCOM1 = of the 6 method pairs, 5 share nothing and 1 (foo, baz) shares $d (5 - 1 = 4) -> 4

Cohesive class
- All methods use $a (one uses $a,$b,$c,$d)

-> LCOM2: one cluster -> 1.
-> LCOM1: every pair shares something -> 0.

Both metrics point to the same thing (messy vs cohesive), but the values look different.
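
To make the two definitions concrete, here is a minimal sketch that recomputes the numbers from the quick examples. The function names lcom1() and lcom2() and the hand-written "method => properties" map are my own illustration, not part of any tool. Real analysers build that map from your source code and may apply extra rules, so their values can differ.

```php
<?php
declare(strict_types=1);

/**
 * LCOM1 as used in the examples above: (pairs sharing nothing) minus
 * (pairs sharing at least one property), floored at zero.
 *
 * @param array<string, list<string>> $usage method name => properties it uses
 */
function lcom1(array $usage): int
{
    $methods = array_keys($usage);
    $disjoint = 0;
    $sharing = 0;
    for ($i = 0; $i < count($methods); $i++) {
        for ($j = $i + 1; $j < count($methods); $j++) {
            $common = array_intersect($usage[$methods[$i]], $usage[$methods[$j]]);
            if (count($common) === 0) {
                $disjoint++;
            } else {
                $sharing++;
            }
        }
    }
    return max(0, $disjoint - $sharing);
}

/**
 * LCOM2 as used in the examples above: the number of clusters of methods
 * that are linked, directly or transitively, by a shared property.
 *
 * @param array<string, list<string>> $usage method name => properties it uses
 */
function lcom2(array $usage): int
{
    $clusters = []; // each entry is the list of properties one cluster touches
    foreach ($usage as $props) {
        $merged = $props;
        $untouched = [];
        foreach ($clusters as $clusterProps) {
            if (count(array_intersect($clusterProps, $merged)) > 0) {
                // This cluster shares a property with the method: fold it in.
                $merged = array_values(array_unique(array_merge($merged, $clusterProps)));
            } else {
                $untouched[] = $clusterProps;
            }
        }
        $untouched[] = $merged;
        $clusters = $untouched;
    }
    return count($clusters);
}

$messy = ['foo' => ['a', 'd'], 'bar' => [], 'baz' => ['b', 'd'], 'qux' => ['c']];
$cohesive = ['one' => ['a'], 'two' => ['a'], 'three' => ['a', 'b', 'c', 'd']];

echo lcom1($messy), ' ', lcom2($messy), PHP_EOL;       // prints "4 3"
echo lcom1($cohesive), ' ', lcom2($cohesive), PHP_EOL; // prints "0 1"
```

Note that a method using no properties at all (bar) always ends up as its own cluster in this sketch.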

And beyond…

There are also LCOM3 and LCOM4, which tweak the maths further. I won’t dive into them here, but the key point is:

👉 different definitions can give different numbers, even for the same class. That’s why it’s best to treat LCOM as a relative signal — spot the outliers, watch trends over time, don’t obsess over exact figures.

How to measure LCOM in PHP

The easiest way is to use static analysis tools such as phpmetrics.

They scan your codebase and calculate LCOM values for each class.

Example with phpmetrics:

php vendor/bin/phpmetrics --report-json=build/report.json src

Phpmetrics also generates handy HTML reports with charts and diagrams.
These make it easy to spot classes with high LCOM values at a glance.

If you just want to test it quickly without installing anything, you can use the phpqa Docker image.

Here’s the one-liner I used to generate an HTML report:

docker run --init -it --rm \
  -v "$(pwd):/project" \
  -v "$(pwd)/tmp-phpqa:/tmp" \
  -w /project jakzal/phpqa \
  phpmetrics --report-html=build/report.html src

In the HTML report, open build/report.html/oop.html (the --report-html path is created as a directory).
There you can sort classes by their LCOM value (highest or lowest) to immediately find the worst offenders.
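
If you generated the JSON report instead, you can rank classes yourself. The sketch below is only a starting point: the helper name rankByLcom() is made up, and it assumes the decoded report is a map of "name => metrics" in which class entries carry a numeric "lcom" key. Check that against the file your phpmetrics version actually writes before relying on it.

```php
<?php
declare(strict_types=1);

/**
 * Return the classes with the highest LCOM first.
 *
 * @param array<string, mixed> $report decoded JSON report
 * @return array<string, int|float> class name => LCOM value
 */
function rankByLcom(array $report, int $limit = 10): array
{
    $scores = [];
    foreach ($report as $name => $metrics) {
        // ASSUMPTION: class entries are arrays with a numeric "lcom" key;
        // anything else (packages, summaries) is skipped.
        if (is_array($metrics) && isset($metrics['lcom']) && is_numeric($metrics['lcom'])) {
            $scores[$name] = $metrics['lcom'] + 0;
        }
    }
    arsort($scores); // highest score first, keys (class names) preserved
    return array_slice($scores, 0, $limit, true);
}

// Usage: path taken from the phpmetrics command shown earlier.
$path = 'build/report.json';
if (is_file($path)) {
    $report = json_decode((string) file_get_contents($path), true, 512, JSON_THROW_ON_ERROR);
    print_r(rankByLcom($report, 5));
}
```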

Example in PHP code

Let’s look at a simple class that shows poor cohesion:

class UserManager
{
    private array $users = [];
    private $mailer;

    public function addUser(string $name): void { /* ... */ }
    public function findUser(string $name): ?User { /* ... */ }

    public function sendNewsletter(): void { /* ... */ } // unrelated to user storage
}

addUser and findUser both use the $users property.
sendNewsletter uses $mailer but has nothing to do with $users.

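This means the class mixes two very different responsibilities. Tools will report a higher LCOM here, because the methods don’t share common data: by the LCOM2 definition above they fall into two clusters ({addUser, findUser} and {sendNewsletter}), so the score is 2 where a cohesive class would score 1. In practice, this is a sign that you need to split the class.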

How to fix high LCOM

When a class shows a high LCOM score, the remedy is almost always the same: split it into smaller, focused classes.

Here’s the earlier UserManager refactored:

class UserRepository
{
    private array $users = [];

    public function addUser(string $name): void { /* ... */ }
    public function findUser(string $name): ?User { /* ... */ }
}

class NewsletterService
{
    private $mailer;

    public function sendNewsletter(): void { /* ... */ }
}

Now each class has a single responsibility.

UserRepository handles storing and finding users.
NewsletterService deals with sending newsletters.

This makes your design cleaner and your tests simpler: each class can be tested in isolation, with fewer dependencies and fewer edge cases to worry about.
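
Here is a rough sketch of what "tested in isolation" can look like. The method bodies, the User class, the Mailer interface, the recipients list and the RecordingMailer fake are all invented for illustration (the original example leaves them out), and it needs PHP 8.0+ for constructor promotion. In a real project you would write these checks as PHPUnit tests rather than plain assert() calls.

```php
<?php
declare(strict_types=1);

// Hypothetical value object and mailer contract, not from the original example.
final class User
{
    public function __construct(public string $name) {}
}

interface Mailer
{
    public function send(string $to, string $subject): void;
}

class UserRepository
{
    /** @var array<string, User> */
    private array $users = [];

    public function addUser(string $name): void
    {
        $this->users[$name] = new User($name);
    }

    public function findUser(string $name): ?User
    {
        return $this->users[$name] ?? null;
    }
}

class NewsletterService
{
    /** @param list<string> $recipients assumed to be injected for this sketch */
    public function __construct(private Mailer $mailer, private array $recipients) {}

    public function sendNewsletter(): void
    {
        foreach ($this->recipients as $to) {
            $this->mailer->send($to, 'Newsletter');
        }
    }
}

// In-memory fake: records recipients instead of sending real mail.
final class RecordingMailer implements Mailer
{
    /** @var list<string> */
    public array $sent = [];

    public function send(string $to, string $subject): void
    {
        $this->sent[] = $to;
    }
}

// The repository needs no mailer at all.
// (assert() only fires when zend.assertions is enabled.)
$repo = new UserRepository();
$repo->addUser('alice');
assert($repo->findUser('alice')?->name === 'alice');
assert($repo->findUser('bob') === null);

// The newsletter service needs no user storage, only a fake mailer.
$mailer = new RecordingMailer();
(new NewsletterService($mailer, ['a@example.com', 'b@example.com']))->sendNewsletter();
assert($mailer->sent === ['a@example.com', 'b@example.com']);
```

Neither test has to set up the other class's dependencies, which is exactly what the split buys you.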

Wrapping up

LCOM might sound academic, but in practice it’s a simple early warning system:

Low LCOM (0 for LCOM1, 1 for LCOM2) -> cohesive class, one clear job.
High LCOM -> class is mixing responsibilities, harder to maintain and test.

You don’t need to obsess over the exact numbers. Use them to spot the outliers in your codebase.

Top comments (11)

david duymelinck • Aug 22

I think it is a metric you shouldn't use. It can be skewed too easily to create an outlier.
A few examples off the top of my head:

if you use getter and setter methods
if you use factory methods to make it easier to add an argument to the constructor
if you have fluent API methods
if you use different names for the same property because of context (user, loggedIn, admin)

For the example I think the sendNewsletter method is ambiguous.
If it is implemented the way you describe, it shouldn't be in the class. I think this will happen more in the case of a Utils class than a more defined class like UserManager.
I think a more likely scenario is the use of a current user property inside the function, with something done with that property to send the newsletter.
But again it shows that LCOM formulas are skewed too easily, just by using protected or private properties in this case.

As a side note, I think we should stop using the god class term. The term itself is a god object, because it refers to too many dependencies or too many unrelated methods or both. It should be separated into heavy dependency object and inconsistent methods object. The LCOM metric only relates to the inconsistent methods object. The heavy dependency object can be checked by other metrics.

Jarosław Szutkowski • Aug 22

Thanks! LCOM’s not perfect, but it’s a handy signal that something might be off. And PhpMetrics even skips getters/setters, so they don’t mess with the score. The sendNewsletter was just an example, not my dream implementation 😉 And yeah, "god class" isn’t the cleanest term, but devs know what it means.

david duymelinck • Aug 24 • Edited

it’s a handy signal that something might be off

If you have to add suggestions to the metric, like "don't obsess over the numbers" and "look for the outliers", I think it is not a good signal.
It is like driving your car with the blinker on all the time. It could cause more harm than good.

PhpMetrics even skips getters/setters, so they don’t mess with the score

That is a good thing.

sendNewsletter was just an example

I understand, I just wanted to add more nuance to the example in the context of the metric.

Jarosław Szutkowski • Aug 25

Thanks for sharing your perspective, I really appreciate the nuance you bring 🙏 Out of curiosity, what kind of metrics do you usually rely on, and what tools do you use to measure them? Would be great to learn from a different point of view

david duymelinck • Aug 25

I'm not big on metrics to judge code quality. I prefer a more manual process.
I read the code and assess the mental model behind the code. If it is easy to find, the code is good.

I think the problem with every metric is that you only can measure the code.
And because there is a lot of hidden meaning in code, I think most code metrics are very low on the rating scale.

Best practices done indiscriminately lead to bad code. For instance loose coupling can become a major problem in certain scenarios.

Jarosław Szutkowski • Aug 25

I get your point about reading the code directly - that’s a good way to really understand it. I think scale makes a big difference. In a small app you can do that easily, but when you’ve got thousands of classes it gets a lot harder to see where the trouble spots might be. That’s where I find metrics useful - to point me in the right direction so I know where to look closer

david duymelinck • Aug 26

The way I evaluate a big project is by reading the documentation if there is any, reading the tests, identifying the core functionality and reading that code. Most of the time the core does not have that many classes that connect everything. If that is maintainable I give it a green flag.

I think you don't need to check the whole project if it works as is. I think you have to make the code your problem if it gets you into trouble.
A Utils class can be a dump of random methods, but as long as those methods don't cause a problem they are not a high priority. They could get higher on the priority scale if you need to add or change a method. Then you can assess whether moving it to a different class is beneficial, and maybe take a few related methods along in the move.

I think it is better to think about maintainability and time management than to use metrics as a way to evaluate a project.

Lars Moelleken • Aug 28

I get your point: metrics like LCOM can be skewed. Getters, factories, fluent APIs… you can make any number look bad. But to me, the question isn’t whether the math is perfect, it’s whether the system works with us or starts working against us.

At small scale, you can just read the code and trust your mental model. At large scale, no one has time to eyeball thousands of classes. That’s where metrics are still useful... not as truth, but as headlights in the fog. They tell you where to look, not what to think.

What really drives change, though, is developer experience. If the new way is simpler, clearer, and feels right, people will adopt it naturally. Step by step, the old cruft... the Utils dumps, the ambiguous hybrids - fades (80%) away... the remaining 20% is often work that never gets done. Let's call it the trade-off. 🙈🙉🙊

“Simple things should be simple, complex things should be possible.” – Alan Kay

david duymelinck • Aug 29

I'm assuming when you talk about a project with thousands of classes, they are all custom code. A project does not reach that number of classes without following loose coupling and single responsibility practices.
So that is why I look for god objects in the core of the application.

I agree that a Datahelper or a Utils class are not the best names to collect the methods. But those classes can be refactored over a longer period, instead of needing to be refactored as soon as possible.

not as truth, but as headlights in the fog

I like the analogy, but I think it is not the right one. I think the better analogy is a searchlight. It makes a specific place visible in the dark.
The point I want to make is that when metrics can't prove an absolute they are not high value metrics.
Is someone productive because they close a lot of issue tickets, or because they write a lot of code? For me neither of them is a good metric, because it is possible to rig them and they don't expose the unmeasured work.

I think the more important metrics are how fast can the code process the request, and what part of the code requires the most time.

Eric Walter • Aug 26

Great explanation! LCOM is a simple yet powerful way to spot “god classes” that try to do too much. A high LCOM score usually means the class is mixing different responsibilities, which makes it harder to test and maintain. The fix is straightforward: split big classes into smaller, focused ones. This keeps responsibilities clear, improves cohesion, and makes your PHP codebase easier to scale and work with.

SkyAccess • Aug 26

Thanks
