tl;dr Passing an array as the second parameter in the query builder's where function allows you to use different comparison operators for your Typesense filters, as [$operator, $value].
Laravel Scout is the first-party solution for adding a variety of full-text search tools, such as Algolia, Meilisearch or Typesense. Since it's a plug-and-play solution, it has to make the query builder API one-size-fits-all across every driver.
In this post I'll talk about how I added support for more complex filtering for Typesense that was undocumented in Scout using the built-in where query builder helper.
I'll be talking in the context of Laravel Scout with Typesense only. These principles or ideas won't necessarily apply to the other search tools.
You probably don't need Scout
First, a bit of a warning.
In my experience, you probably don't actually need Scout. Scout adds technical burden (read: debt) in a few ways, and unless you absolutely need full-text searching capabilities, with built-in typo support or other unique features, a simple query will do the trick more often than not.
- Configuration takes time. Setting up the fields, applying the configuration to each model, and ensuring the right data types can be time consuming. There are some caveats, such as not being able to use null for a string field.
- You have to maintain the schema for the collection in Typesense. This is a deployment burden if/when the schema for your model changes.
  - If it changes often enough, then you'll be doing a lot of scout:flush and scout:import calls. If you're not careful, exceptions will be thrown before you're able to update it in production.
- Keeping the collection up-to-date is sometimes tricky. Scout does its best to keep the data in Typesense current, but getting out of sync is still easy.
When you need it
When you do actually need the capabilities of Typesense, then it's absolutely powerful. If you can overcome the annoyances I've described above and are certain you need to leverage Typesense, you have a lot of power at your fingertips.
Reading all the search options available in Typesense, it's easy to see they do all the heavy lifting for filtering and searching collections. It's fast and efficient, as advertised.
There are things to keep in mind in a Laravel app with Scout, though.
- You have to really understand the Scout query builder. Again, it's a unified query builder API for each vendor. This means it's simplified compared to what each vendor actually offers.
- Laravel is searching in Typesense, then converting that back to your model instances using your own database. You have to have a good mental model of the Typesense collection schema and the one that it actually references.
- Relationships aren't a thing in Typesense. You have to be creative in the way you build your Typesense collection schema so that you can leverage filtering based on relationships.
The Setup
grantholle/scout-typesense-example
An example of using Typesense with Scout using undocumented filtering
Laravel Scout with Typesense
- Clone the repo.
- Run composer install.
- Copy .env.example to .env.
- Set your Typesense server URL and API key in the .env file.
- php artisan key:generate
- php artisan migrate --seed
- php artisan scout:import "App\Models\Post"
I've created a sample repo that shows how this works.
It's a simple configuration with a Post model that has many Comment relationships. The Scout configuration for Typesense looks like this:
\App\Models\Post::class => [
    'collection-schema' => [
        'fields' => [
            [
                'name' => 'id',
                'type' => 'string',
            ],
            [
                'name' => 'title',
                'type' => 'string',
            ],
            [
                'name' => 'content',
                'type' => 'string',
            ],
            [
                'name' => 'comments',
                'type' => 'string[]',
            ],
            [
                'name' => 'created_at',
                'type' => 'int64',
            ],
        ],
        'default_sorting_field' => 'created_at',
    ],
    'search-parameters' => [
        'query_by' => 'title,content,comments',
    ],
],
When we call Post::search(), by default we want to search across the title, content and comment contents. We can adjust this if we want to, but this setup lets us leverage Typesense for something that would be harder to do with a typical Eloquent query. Again, the use cases for reaching for Scout are less common than you might think. A blog search is a good example, though.
Here's how we're indexing our actual data:
public function makeSearchableUsing(Collection $models): Collection
{
    return $models->load('comments');
}

public function toSearchableArray(): array
{
    return [
        'id' => (string) $this->id,
        'title' => $this->title,
        'content' => $this->content,
        'comments' => $this->comments->pluck('content')->toArray(),
        'created_at' => $this->created_at->timestamp,
    ];
}
First, we're making indexing a bit more efficient by eager loading comments in makeSearchableUsing(). Then we're populating the fields we configured in scout.php.
We want to search by comments as well, so we're creating an array of each comment's content, which is supported by the field's string[] type. Now Typesense knows about the comments and can search them. Note: since we're indexing the Post model, it's not going to reindex when a comment changes. We're going to have to re-trigger the index manually on the comment's post when comments change.
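Here's a minimal sketch of one way to do that re-triggering. It assumes a Comment model with a post relationship and a Post model that uses Scout's Searchable trait. The model events and the searchable() call are standard Laravel and Scout features, but the wiring itself is my assumption and may not match the sample repo. It only runs inside a Laravel app.

```php
<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Model;
use Illuminate\Database\Eloquent\Relations\BelongsTo;

class Comment extends Model
{
    // Assumed relationship: each comment belongs to one post.
    public function post(): BelongsTo
    {
        return $this->belongsTo(Post::class);
    }

    protected static function booted(): void
    {
        // Re-index the parent post whenever one of its comments is
        // created, updated or deleted, so the comments field stays current.
        $reindex = fn (Comment $comment) => $comment->post?->searchable();

        static::saved($reindex);
        static::deleted($reindex);
    }
}
```

The trade-off is one extra index update per comment change, which is probably fine for a blog but worth queueing on a busier app.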
One caveat I'll point out here: if title was nullable in our database, we'd want to update toSearchableArray() in the title field to be 'title' => $this->title ?? '',. If we had null as a title value, indexing would throw an error: "Error importing document: Field title must be a string."
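If several string fields are nullable, the same idea can be applied in one place. This is only a sketch: withoutNullStrings() is a hypothetical helper, not something Scout provides, and the sample document is made up.

```php
<?php

/**
 * Replace null (or missing) values with an empty string for the given
 * fields, so the document satisfies a Typesense "string" field type.
 * Hypothetical helper for illustration only.
 */
function withoutNullStrings(array $document, array $stringFields): array
{
    foreach ($stringFields as $field) {
        // ??= only assigns when the key is missing or its value is null.
        $document[$field] ??= '';
    }

    return $document;
}

$document = withoutNullStrings(
    ['id' => '1', 'title' => null, 'content' => 'Hello'],
    ['title', 'content']
);

var_dump($document['title'] === '');        // bool(true)
var_dump($document['content'] === 'Hello'); // bool(true)
```

In toSearchableArray() you'd wrap the returned array in a call like this rather than sprinkling ?? '' across every field.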
Searching
When we visit the endpoint, there's a simple controller to perform the searches for us.
public function __invoke(Request $request)
{
    $posts = Post::search($request->input('q', '*'))
        ->when(is_numeric($request->input('created')), function (Builder $query) use ($request) {
            $query->where('created_at', ['>=', now()->subDays($request->integer('created'))->timestamp]);
        })
        ->when(! empty($request->input('exclude')), function (Builder $query) use ($request) {
            $query->whereNotIn('id', explode(',', $request->input('exclude')));
        })
        ->when($request->input('not_title'), function (Builder $query, string $title) {
            $query->where('title', ['!', $title]);
        })
        ->get();

    return $posts->load('comments');
}
Here's the breakdown of the search options:
- If we pass in a q query string variable, it's used as the search term, which is the whole point of Scout. This performs the "search" in Typesense leveraging all of its magic. It's searching across the fields we've set up in query_by. It's typo-tolerant, etc.
- Sending a created value will search for posts that have been created within the last X days. More on this later.
- We can exclude post IDs by comma separating them in exclude.
- Lastly, with not_title we can exclude posts whose titles contain the given value.
So, what's special?
Behind the scenes, Scout is using the query builder to retrieve results from Typesense and taking those results to subsequently query the actual database to retrieve the model values.
In this example, we could have just leveraged Scout for its searching capabilities and called it a day. However, that's not really a real-world example. Usually we're going to have multiple filtering options like I've implemented.
Since Scout is a one-size-fits-all solution and each driver implements filtering differently, it's basic by nature out of the box. As the docs state:
Since a search index is not a relational database, more advanced "where" clauses are not currently supported.
We can hook into the query that's used to retrieve our models after performing the search in Typesense using the query hook. There's also a catch there:
Since this callback is invoked after the relevant models have already been retrieved from your application's search engine, the query method should not be used for "filtering" results. Instead, you should use Scout where clauses.
Here's an example. Let's say we're just performing our search and paginating on those results with page sizes of 5. If we use the query hook to filter further, we're only going to filter on those 5 results that were returned. Each page would have an inconsistent number of results, or even no results if the filter doesn't match any of those 5.
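The effect is easy to reproduce with plain arrays. The sketch below uses made-up data (post ID => age in days) and no Scout code at all. It only contrasts filtering each page after pagination with filtering before pagination.

```php
<?php

// Hypothetical data: post ID => age in days.
$ages = [1 => 2, 2 => 30, 3 => 45, 4 => 1, 5 => 60, 6 => 90, 7 => 3, 8 => 4, 9 => 5, 10 => 120];

// "Recent" means created within the last 10 days.
$isRecent = fn (int $age): bool => $age <= 10;

// Paginate first (pages of 5), then filter each page.
// This mimics filtering in the query hook.
$filteredAfter = array_map(
    fn (array $page): array => array_keys(array_filter($page, $isRecent)),
    array_chunk($ages, 5, true)
);

// Filter first, then paginate.
// This mimics filtering inside the search engine.
$filteredBefore = array_chunk(array_keys(array_filter($ages, $isRecent)), 5);

echo json_encode($filteredAfter), PHP_EOL;  // [[1,4],[7,8,9]]
echo json_encode($filteredBefore), PHP_EOL; // [[1,4,7,8,9]]
```

Filtering after the fact yields pages of 2 and 3 results, while filtering up front yields one full page of 5.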
Scout supports simple where clauses out of the box, but it's always a strict equality comparison. When the value is static, like "give me posts with a given tag ID" (tag = 1), then it's straightforward. But what about other types of comparisons?
Spoiler: Typesense allows us to configure the comparison operators.
Behind the scenes
In our example, we're retrieving posts that were created within the last X days (for example, created_at >= the timestamp from 10 days ago). By default, Scout doesn't support this. The method signature for where is the following:
public function where($field, $value)
{
    $this->wheres[$field] = $value;

    return $this;
}
It only accepts a field key and its value. Unlike Eloquent's query builder, which accepts an $operator parameter, Scout's is simplified.
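To make that concrete, here's a stripped-down stand-in for the builder. MiniBuilder is hypothetical and only mirrors the where() body shown above. It shows two things that follow from that signature: the value is stored untouched, so an array is passed along as-is, and a second where() on the same field replaces the first.

```php
<?php

// Hypothetical stand-in that mirrors only the where() method quoted above.
final class MiniBuilder
{
    public array $wheres = [];

    public function where(string $field, mixed $value): static
    {
        // Keyed by field name: the value is stored exactly as given.
        $this->wheres[$field] = $value;

        return $this;
    }
}

$builder = (new MiniBuilder())
    ->where('tag', 1)
    ->where('created_at', ['>=', 1700000000])
    ->where('created_at', ['<', 1800000000]); // overwrites the line above

echo json_encode($builder->wheres), PHP_EOL;
// {"tag":1,"created_at":["<",1800000000]}
```

So with this signature you get one clause per field, which is worth remembering before chaining two comparisons on the same column.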
The good news is, since each engine (Algolia, Typesense, Meilisearch) implements the filtering themselves, we can dive into Typesense's engine to see what's possible.
The beginning of the hunt for the functionality comes from buildSearchParameters(). The key we're after for filtering is the filter_by key, which is set by the filters() function, which ultimately builds the filter clauses in parseWhereFilter():
protected function parseWhereFilter(array|string $value, string $key): string
{
    return is_array($value)
        ? sprintf('%s:%s', $key, implode('', $value))
        : sprintf('%s:=%s', $key, $value);
}
If the $value we pass in is an array, it's constructing the comparison using the two parameters. Otherwise, it's a strict "exactly equals" filter (:=).
This is what explains our filter of created_at in the controller:
$query->where('created_at', ['>=', now()->subDays($request->integer('created'))->timestamp]);
We can do numeric filtering now (greater than, less than, etc.) by passing an array as the value.
$query->where($field, [$comparison, $value]);
Now we can retrieve consistent results for pagination without relying on Eloquent to filter further, which would produce inconsistent pages. The same goes for the not_title query variable. We can use the :! Typesense operator (a negated, non-exact match) to exclude posts with a title that contains a given value.
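To see the filter clauses these calls end up producing, here's a standalone copy of the parseWhereFilter() logic quoted above, pulled out of the engine so it can run on its own. The timestamp and title are made-up values.

```php
<?php

// Standalone copy of the method quoted above, for illustration only.
function parseWhereFilter(array|string $value, string $key): string
{
    return is_array($value)
        ? sprintf('%s:%s', $key, implode('', $value)) // [$operator, $value] pair
        : sprintf('%s:=%s', $key, $value);            // plain value: exact match
}

// Plain value: strict equality.
echo parseWhereFilter('1', 'tag'), PHP_EOL;                       // tag:=1

// [$operator, $value]: numeric comparison.
echo parseWhereFilter(['>=', 1700000000], 'created_at'), PHP_EOL; // created_at:>=1700000000

// [$operator, $value]: negated match on the title.
echo parseWhereFilter(['!', 'Draft'], 'title'), PHP_EOL;          // title:!Draft
```

The array is simply glued together after the colon, so the first element acts as the operator in the resulting filter.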
Conclusion
Thankfully, the Scout Typesense engine was built to be flexible enough to support these filters out of the box. This allows us to leverage Typesense to its full potential. By passing a tuple of [$operator, $value], we can use Typesense's built-in filtering much more simply, providing an excellent all-in-one search/filter solution for our data.
Extra credit: the Typesense engine also allows us to define a typesenseSearchParameters() function on the searchable model to include or overwrite other parameters that it supports when searching.