Doeke Norg

Posted on • Originally published at doeken.org

The Visitor Pattern in PHP

The Visitor Pattern isn't used often. This is because there are few situations in which it is applicable or even makes sense. However, it's a nice pattern to know and to have in your tool belt when the time comes. Let's look at how this pattern can be applied in a PHP environment.

🛑 The problem

Like a few other patterns, the Visitor Pattern tries to solve the problem of adding functionality to an entity without changing it (much...). In addition to this very generic problem, it provides a way of adding the functionality to multiple similar entities, which can't be completely handled in the same way.

So let's make the problem a bit more practical. Imagine you have two entities: Book and Document. And for both of these entities we want to know how many pages there are. Our Document has a public function getPageCount(): int which returns the number of pages, while the Book consists of an array of Chapter entities, which also have this function.

class Document
{
    public function __construct(private int $page_count) {}

    public function getPageCount(): int
    {
        return $this->page_count;
    }
}

class Chapter extends Document
{
    // Chapter specific code 
}

class Book
{
    public function getChapters(): array
    {
        return [
            new Chapter(5),
            new Chapter(7),
            new Chapter(2),
        ];
    }
}

To streamline the process of returning the page count for either of these entity types, we create a PageCountDumper. A (somewhat naive) implementation of this could look like this:

class PageCountDumper
{
    public function handle($entity)
    {
        if ($entity instanceof Document) {
            var_dump($entity->getPageCount());
        } elseif ($entity instanceof Book) {
            $count = 0;

            foreach ($entity->getChapters() as $chapter) {
                $count += $chapter->getPageCount();
            }

            var_dump($count);
        } else {
            throw new \InvalidArgumentException('PageCountDumper cannot handle the provided type.');
        }
    }
}

And we can call it like this:

$document = new Document(20);
$book = new Book();

$dumper = new PageCountDumper();

$dumper->handle($document); // int(20)
$dumper->handle($book); // int(14)

This PageCountDumper has a handle() function that can handle both the Book and the Document entity, and will var_dump the proper page count for both. There are however a few things that stand out:

  • Because there is no shared interface or abstraction between Document and Book, the handle() function receives a mixed $entity and contains the logic for either situation. When adding on more entities, this type checking will pile on and can become quite cumbersome and unreadable.
  • We throw an exception when the entity type is unknown to avoid improper use.

We can do better!

👋 The Visitor Pattern Solution

So the Visitor Pattern provides a solution for this particular problem. It removes the need for the instanceof type checks while keeping the reference to the entity type intact, and it removes the need to explicitly throw an exception. Let's see how the Visitor Pattern tackles these issues.

Entity specific functions

First off, to remove the instanceof checks, the pattern requires a method for every possible entity type. For convention's sake, we'll call these methods visitBook(Book $book) and visitDocument(Document $document). And because we are creating a Visitor, let's rename the dumper to PageCountVisitor.

class PageCountVisitor
{
    public function visitBook(Book $book)
    {
        $count = 0;

        foreach ($book->getChapters() as $chapter) {
            $count += $chapter->getPageCount();
        }

        var_dump($count);
    }

    public function visitDocument(Document $document)
    {
        var_dump($document->getPageCount());
    }
}

By implementing separate methods with type-hinted arguments, we've removed the need for the instanceof checks. And because we can only call these methods with the appropriate type, there is no need to throw an exception ourselves. PHP already throws a TypeError when we provide an invalid argument.
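To see that in action, here is a minimal sketch that passes the wrong entity to a type-hinted method. It uses stripped-down stand-ins for the classes from this post, and the $caught variable is only there for illustration. PHP rejects the call before the method body runs.

```php
<?php

declare(strict_types=1);

// Stripped-down stand-ins for the entities in this post.
class Document
{
    public function __construct(private int $page_count) {}

    public function getPageCount(): int
    {
        return $this->page_count;
    }
}

class Book {}

class PageCountVisitor
{
    public function visitDocument(Document $document): void
    {
        var_dump($document->getPageCount());
    }
}

$visitor = new PageCountVisitor();
$caught = null; // illustrative variable, not part of the pattern

try {
    // A Book is not a Document, so PHP rejects the call itself.
    $visitor->visitDocument(new Book());
} catch (\TypeError $error) {
    $caught = $error;
}

var_dump($caught instanceof \TypeError); // bool(true)
```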

If there is another entity in the future that needs its pages to be counted, let's say a Report, we can add a public function visitReport(Report $report) and implement that logic separately.
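As a rough sketch of such an extension: the post only names Report, so the entity below and its getBodyPages() and getAppendixPages() methods are made up for illustration. The point is only that the new logic gets its own method and leaves the existing ones alone.

```php
<?php

declare(strict_types=1);

// Hypothetical entity: its constructor and getters are assumptions.
class Report
{
    public function __construct(
        private int $body_pages,
        private int $appendix_pages,
    ) {}

    public function getBodyPages(): int
    {
        return $this->body_pages;
    }

    public function getAppendixPages(): int
    {
        return $this->appendix_pages;
    }
}

class PageCountVisitor
{
    // visitBook() and visitDocument() would stay exactly as they were.

    public function visitReport(Report $report): void
    {
        // Report-specific logic lives in its own method.
        var_dump($report->getBodyPages() + $report->getAppendixPages());
    }
}

$visitor = new PageCountVisitor();
$visitor->visitReport(new Report(12, 3)); // int(15)
```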

But, you might be thinking: "This isn't better. I still need to know what type my entity is in order to call the correct method!" And you would be correct. But hold on, this refactoring is only half of the Visitor Pattern.

Accepting a visitor

Remember when I said the entities the visitor works on should not be changed much? Yeah, well; there is one change that is needed on every entity to make the Visitor Pattern work. But only one, and this will make it accept any visitor, and therefore add any (future) functionality.

To avoid the instanceof check, there is only one context in which we can be sure the entity is of a certain type: within the entity itself. Only inside a (non-static) method of a class do we know for certain that $this is an instance of that type. That is why the Visitor Pattern uses a technique called Double Dispatch, in which the entity calls the correct function on the visitor, while providing itself as the argument.

To implement this double dispatch we need a generic method that receives the visitor, and relays the call to the correct method on the visitor. By convention this method is called: accept(). This method will receive the visitor as its argument. In order to accept other visitors in the future, we first extract a VisitorInterface.

interface VisitorInterface
{
    public function visitBook(Book $book);

    public function visitDocument(Document $document);
}

class PageCountVisitor implements VisitorInterface
{
    // visitBook() and visitDocument() as implemented above
}

Then we create a VisitableInterface and apply it on Book and Document.

interface VisitableInterface
{
    public function accept(VisitorInterface $visitor);
}

class Book implements VisitableInterface
{
    // ...
    public function accept(VisitorInterface $visitor)
    {
        $visitor->visitBook($this);
    }
}

class Document implements VisitableInterface
{
    // ...
    public function accept(VisitorInterface $visitor)
    {
        $visitor->visitDocument($this);
    }
}

Here you can see the double dispatch in action. The Book class calls the visitBook() method on the visitor and Document calls visitDocument(). Both provide themselves as the argument. Because of this minor change to the entities, we can now apply all kinds of different visitors that provide a certain functionality for every entity.

To use the visitor on the entities we need to adjust our calling code like this:

$document = new Document(20);
$book = new Book();

$visitor = new PageCountVisitor();

$document->accept($visitor); // int(20)
$book->accept($visitor); // int(14)

With all the pieces now in place, we are free to create more visitors that implement the VisitorInterface and can perform a certain feature for both Book and Document. A WordCountVisitor for example.
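A WordCountVisitor could look roughly like the sketch below. The entities in this post don't track words, so the $word_count constructor argument and getWordCount() are assumptions added purely for the example. The interfaces and accept() methods are the ones from above.

```php
<?php

declare(strict_types=1);

interface VisitorInterface
{
    public function visitBook(Book $book);

    public function visitDocument(Document $document);
}

interface VisitableInterface
{
    public function accept(VisitorInterface $visitor);
}

class Document implements VisitableInterface
{
    // $word_count and getWordCount() are assumptions made for this sketch.
    public function __construct(private int $page_count, private int $word_count = 0) {}

    public function getWordCount(): int
    {
        return $this->word_count;
    }

    public function accept(VisitorInterface $visitor)
    {
        $visitor->visitDocument($this);
    }
}

class Chapter extends Document {}

class Book implements VisitableInterface
{
    public function getChapters(): array
    {
        return [new Chapter(5, 1200), new Chapter(7, 1800), new Chapter(2, 500)];
    }

    public function accept(VisitorInterface $visitor)
    {
        $visitor->visitBook($this);
    }
}

class WordCountVisitor implements VisitorInterface
{
    public function visitBook(Book $book)
    {
        $count = 0;

        foreach ($book->getChapters() as $chapter) {
            $count += $chapter->getWordCount();
        }

        var_dump($count);
    }

    public function visitDocument(Document $document)
    {
        var_dump($document->getWordCount());
    }
}

$visitor = new WordCountVisitor();

(new Document(20, 6000))->accept($visitor); // int(6000)
(new Book())->accept($visitor); // int(3500)
```

Note that neither Book nor Document needed a new method to support this visitor beyond exposing the data it reads.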

Note: I'm well aware that this is a contrived example, and that in this case it would be way easier to implement the Countable interface and make use of the count() function. This example is merely for demonstration purposes, and to make it easier to understand. However, when stacking on more and more actions, a class would need more and more interfaces. So please keep in mind that this would be a beneficial pattern when multiple actions are required, and possibly even new actions in the future.
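For comparison, the Countable route could be sketched like this. It is one possible way to do it, not code from the original example: the entities implement PHP's built-in \Countable, and count() then works on them directly.

```php
<?php

declare(strict_types=1);

class Document implements \Countable
{
    public function __construct(private int $page_count) {}

    public function count(): int
    {
        return $this->page_count;
    }
}

class Chapter extends Document {}

class Book implements \Countable
{
    public function getChapters(): array
    {
        return [new Chapter(5), new Chapter(7), new Chapter(2)];
    }

    public function count(): int
    {
        // count() on a Countable object calls that object's count() method.
        return array_sum(array_map('count', $this->getChapters()));
    }
}

var_dump(count(new Document(20))); // int(20)
var_dump(count(new Book()));       // int(14)
```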

Pros & cons

Like many other patterns, the Visitor Pattern isn't the one pattern to rule them all. There are multiple solutions to different problems. The Visitor Pattern is just that: a possible solution to a specific problem. Let's look at some reasons you might use it, and some reasons you might not.

✔️ Pros

  • You can add functionality to any entity by implementing the VisitableInterface once. This makes the entity more extendable.
  • By moving the functionality into visitors you enforce separation of concerns.
  • The entity is in control of whether the visitor is accepted. You can omit the relay and cut the double dispatch.
  • The individual visitors are easier to test.

❌ Cons

  • The double dispatch can be confusing and make the code harder to understand.
  • The accept() and visit...() methods usually don't return anything, so you need to keep records on the visitor itself.
  • Every visitor needs every method on the VisitorInterface, even when it has no meaningful implementation for some of them.
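To illustrate the second con: because accept() returns nothing, a visitor that needs to produce a result has to collect it internally. The sketch below assumes a hypothetical TotalPageCountVisitor with a getTotal() method. Neither name appears earlier in this post.

```php
<?php

declare(strict_types=1);

interface VisitorInterface
{
    public function visitBook(Book $book);

    public function visitDocument(Document $document);
}

class Document
{
    public function __construct(private int $page_count) {}

    public function getPageCount(): int
    {
        return $this->page_count;
    }

    public function accept(VisitorInterface $visitor)
    {
        $visitor->visitDocument($this);
    }
}

class Chapter extends Document {}

class Book
{
    public function getChapters(): array
    {
        return [new Chapter(5), new Chapter(7), new Chapter(2)];
    }

    public function accept(VisitorInterface $visitor)
    {
        $visitor->visitBook($this);
    }
}

// Hypothetical variant of PageCountVisitor that records instead of dumping.
class TotalPageCountVisitor implements VisitorInterface
{
    private int $total = 0;

    public function visitBook(Book $book)
    {
        foreach ($book->getChapters() as $chapter) {
            $this->total += $chapter->getPageCount();
        }
    }

    public function visitDocument(Document $document)
    {
        $this->total += $document->getPageCount();
    }

    public function getTotal(): int
    {
        return $this->total;
    }
}

$visitor = new TotalPageCountVisitor();

foreach ([new Document(20), new Book()] as $entity) {
    $entity->accept($visitor); // returns nothing
}

var_dump($visitor->getTotal()); // int(34)
```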

Real world examples

Realistically, you aren't likely to find this pattern much in the wild. However, it is a common practice in combination with Trees and Tree Traversal.

If you are unfamiliar with Trees & Tree Traversal, you can check out my previous blog on that.

When traversing a Tree, you are iterating over a continuous stream of Nodes. We can perform an action for every node in that Tree. This is called visiting... coincidence? These nodes are usually just entities holding a value. Instead of adding a bunch of methods to these nodes, a visitor is actually a nice way of adding different features to these otherwise dumb entities.

Some tree implementations I've seen actually have a PreOrderVisitor and a PostOrderVisitor. These will then return an array of nodes in that order. While that is a perfectly acceptable visitor, I believe a Visitor should not dictate the order in which it is applied to the tree. For some features it might not even matter what the traversal order is, while in some cases it might.
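One way to keep the order out of the visitor is to let a separate traversal function decide it. The following is only a sketch under that assumption. Node, NodeVisitorInterface, preOrder(), postOrder() and CollectValuesVisitor are all illustrative names, not taken from any particular tree library.

```php
<?php

declare(strict_types=1);

// All names in this sketch are illustrative assumptions.
interface NodeVisitorInterface
{
    public function visitNode(Node $node): void;
}

class Node
{
    /** @param Node[] $children */
    public function __construct(public string $value, public array $children = []) {}

    public function accept(NodeVisitorInterface $visitor): void
    {
        $visitor->visitNode($this);
    }
}

// The traversal decides the order ...
function preOrder(Node $node): \Generator
{
    yield $node;
    foreach ($node->children as $child) {
        yield from preOrder($child);
    }
}

function postOrder(Node $node): \Generator
{
    foreach ($node->children as $child) {
        yield from postOrder($child);
    }
    yield $node;
}

// ... and the visitor only decides what happens at each node.
class CollectValuesVisitor implements NodeVisitorInterface
{
    public array $values = [];

    public function visitNode(Node $node): void
    {
        $this->values[] = $node->value;
    }
}

$tree = new Node('cover', [
    new Node('chapter 1', [new Node('section 1.1')]),
    new Node('chapter 2'),
]);

$pre = new CollectValuesVisitor();
foreach (preOrder($tree) as $node) {
    $node->accept($pre);
}

$post = new CollectValuesVisitor();
foreach (postOrder($tree) as $node) {
    $node->accept($post);
}

echo implode(', ', $pre->values), PHP_EOL;  // cover, chapter 1, section 1.1, chapter 2
echo implode(', ', $post->values), PHP_EOL; // section 1.1, chapter 1, chapter 2, cover
```

The same visitor class is used for both runs. Only the traversal function changes.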

In my Trees & Tree Traversal post I gave the example of a document inside a tree structure. When traversing that tree in PreOrder you get a logical flow of the document; starting at the cover page. Some visitors we might want to build for that tree are:

  • RenderPdfVisitor which could render every node as a PDF file.
  • TableOfContentsVisitor which could create a table of contents with the correct page numbering.
  • CombinePdfVisitor which could combine every previously rendered PDF into a single PDF document.

And basically every example from that blog post can be built as a visitor.

Thanks for reading

Like I said, the Visitor Pattern isn't very common, but it's nice to have up your sleeve. Do you have any experience with this pattern? Please let me know in the comments. I'm curious to hear what you think of it.

I hope you enjoyed reading this article! If so, please leave a ❤️ or a 🦄 and consider subscribing! I write posts on PHP almost every week. You can also follow me on Twitter for more content and the occasional tip. If you want to be the first to read my next blog post, consider subscribing to my newsletter.

Top comments (3)

Benjamin Delespierre • Edited

The double dispatch can be confusing and make the code harder to understand.

While it is true more indirection can create confusion, the naive implementation (a service relying on instanceof to determine the correct algorithm to use) is very rigid (creates static dependencies between components) and doesn't scale at all (imagine handling hundreds of components this way, good luck 🙄)

The visitor pattern's benefits clearly outweigh the added complexity and create much cleaner and more understandable code IMHO.

The accept() and visit...() methods usually don't return anything, so you need to keep records on the visitor itself.

I don't see why 🤷

All Visitors need every method on the VisitorInterface while it might not have an implementation for it.

You can mitigate that problem by applying the Interface Segregation Principle (ISP) ("no client should be forced to depend on methods it does not use.")

Pro tip: using traits helps keep things clean & reusable.

Using the solution below, we see the Book class now only relies on the BookPageCountVisitorInterface and not the whole visitor. The concrete visitor class doesn't have to implement visitDocument if it's not needed or possible 👍

TL;DR making small interfaces & traits gives you the flexibility to only implement what you need 😎

interface BookPageCountVisitorInterface
{
    public function visitBook(Book $book): int;
}

interface DocumentPageCountVisitorInterface
{
    public function visitDocument(Document $doc): int;
}

// you can still assemble them into a single interface if you wish
interface PageCountVisitorInterface extends BookPageCountVisitorInterface, DocumentPageCountVisitorInterface
{
    // empty
}

class Book
{
    public function accept(BookPageCountVisitorInterface $visitor)
    {
        return $visitor->visitBook($this);
    }
}

trait BookPageCountVisitor
{
    public function visitBook(Book $book): int
    {
        return 1;
    }
}

trait DocumentPageCountVisitor
{
    public function visitDocument(Document $doc): int
    {
        return 2;
    }
}

class PageCountVisitor implements PageCountVisitorInterface
{
    use BookPageCountVisitor;
    use DocumentPageCountVisitor;
}

$visitor = new PageCountVisitor();
$book = new Book();

echo $book->accept($visitor); // 1
 
Doeke Norg • Edited

I don't see why 🤷

Because every visitor has a different reason for being, so they all will have different results. There is no single return type or return value type.

TL;DR making small interfaces & traits gives you the flexibility to only implement what you need 😎

This will however replace one VisitorInterface with several new interfaces. Which doesn't make much sense to me either. Because now my Book can only receive a BookPageCountVisitorInterface which, by its name only serves one purpose. It needs to have the generic VisitorInterface to be able to receive any visitor.

And traits like BookPageCountVisitor will never be reused, because each serves the purpose of one visitor. So they might as well live on that visitor only, right?

I personally would rather have one interface that has a function for every type, and create an abstract base class that implements all those functions with an empty body. Then there would be less overhead in files, and my visitors only need to override the functions that matter.
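A minimal sketch of that idea, assuming a hypothetical AbstractVisitor base class and a made-up BookCounterVisitor that only cares about books:

```php
<?php

declare(strict_types=1);

// Empty stand-ins for the entities, to keep the sketch short.
class Book {}
class Document {}

interface VisitorInterface
{
    public function visitBook(Book $book);

    public function visitDocument(Document $document);
}

// Hypothetical base class: every visit method is a no-op by default.
abstract class AbstractVisitor implements VisitorInterface
{
    public function visitBook(Book $book) {}

    public function visitDocument(Document $document) {}
}

// Hypothetical visitor that overrides only the method it needs.
class BookCounterVisitor extends AbstractVisitor
{
    public int $books = 0;

    public function visitBook(Book $book)
    {
        $this->books++;
    }
}

$visitor = new BookCounterVisitor();
$visitor->visitBook(new Book());
$visitor->visitDocument(new Document()); // inherited no-op

var_dump($visitor->books); // int(1)
```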

Anders Björkland

I really like this series of articles. There's much that is new to me and I'm learning a ton! 👍👍