Filter By Null or Missing Attributes
What happens when your index contains an attribute that isn’t present in all records?
For example, consider an online bookstore where people can buy books and also rate them from 0 to 5. Any record without the rating attribute is assumed not to be rated yet.
Objects are schemaless, so this isn’t a problem until you want to filter on records with and without a specific attribute.
Generally speaking, selective filtering becomes a problem when the existence or non-existence of a filter value actually means something. The Algolia engine doesn’t support filtering on null values or missing attributes.
In other words, in the preceding example, if you wanted to combine books with a specific rating and books that aren’t yet rated in the same filtering statement, this would require some modification of the data.
There are two approaches:
- using the _tags attribute,
- using a boolean attribute.
Dataset
Consider the following three records: one has a correctly filled rating attribute, a second has a null rating, and the third doesn’t have a rating attribute at all:
[
  {
    "title": "The Shining",
    "author": "Stephen King",
    "rating": 5
  },
  {
    "title": "Fantastic Beasts and Where to Find Them",
    "author": "J. K. Rowling",
    "rating": null
  },
  {
    "title": "Run Away",
    "author": "Harlan Coben"
  }
]
Here, only the first record has a rating. The other two are assumed not to have been rated yet. Note that a null or nonexistent attribute is different from zero, which represents a book with a rating equal to 0.
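The difference between zero and a null or missing value can be checked in code before indexing. The following is a minimal PHP sketch, not part of any Algolia API: the helper name hasRating is hypothetical. It relies on PHP's isset, which returns false for both a missing key and a null value, but true for 0.

```php
<?php

// Hypothetical helper (not an Algolia function): true only when the record
// carries an actual rating. isset() is false for a missing key and for null,
// but true for 0, so a zero rating still counts as rated.
function hasRating(array $record): bool
{
    return isset($record['rating']);
}

// The dataset from this page, decoded into associative arrays.
$records = json_decode('[
    {"title": "The Shining", "author": "Stephen King", "rating": 5},
    {"title": "Fantastic Beasts and Where to Find Them", "author": "J. K. Rowling", "rating": null},
    {"title": "Run Away", "author": "Harlan Coben"}
]', true);

foreach ($records as $record) {
    echo $record['title'], ': ', hasRating($record) ? 'rated' : 'not rated', PHP_EOL;
}
```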
Creating a tag
At indexing time, you can compute a tag that records whether the attribute has a value, or is null or absent.
[
  {
    "title": "The Shining",
    "author": "Stephen King",
    "rating": 5,
    "_tags": ["is_rated"]
  },
  {
    "title": "Fantastic Beasts and Where to Find Them",
    "author": "J. K. Rowling",
    "rating": null,
    "_tags": ["is_not_rated"]
  },
  {
    "title": "Run Away",
    "author": "Harlan Coben",
    "_tags": ["is_not_rated"]
  }
]
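One way to produce the tagged records above is to transform each record before sending it to the index. This is a sketch under assumptions: the function name withRatingTag is hypothetical, and the code only prepares the data, leaving the actual indexing call to your API client.

```php
<?php

// Hypothetical helper: returns a copy of the record with an "is_rated" or
// "is_not_rated" tag added to its _tags array. Existing tags are preserved.
function withRatingTag(array $record): array
{
    // isset() is false for a missing key and for null, true for 0.
    $tag = isset($record['rating']) ? 'is_rated' : 'is_not_rated';

    $tags = $record['_tags'] ?? [];
    if (!in_array($tag, $tags, true)) {
        $tags[] = $tag;
    }
    $record['_tags'] = $tags;

    return $record;
}

$records = [
    ['title' => 'The Shining', 'author' => 'Stephen King', 'rating' => 5],
    ['title' => 'Fantastic Beasts and Where to Find Them', 'author' => 'J. K. Rowling', 'rating' => null],
    ['title' => 'Run Away', 'author' => 'Harlan Coben'],
];

// Compute the tags for every record before indexing.
$tagged = array_map('withRatingTag', $records);

echo json_encode($tagged, JSON_PRETTY_PRINT), PHP_EOL;
```

Keeping existing tags intact matters if your records already use _tags for other purposes.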
To search for records that don’t have the attribute or attribute value present, you can now use tags filtering:
$index->search('query', [
  'filters' => '_tags:is_not_rated'
]);
Creating a boolean attribute
At indexing time, you can compute a boolean attribute named is_rated:
[
  {
    "title": "The Shining",
    "author": "Stephen King",
    "rating": 5,
    "is_rated": true
  },
  {
    "title": "Fantastic Beasts and Where to Find Them",
    "author": "J. K. Rowling",
    "rating": null,
    "is_rated": false
  },
  {
    "title": "Run Away",
    "author": "Harlan Coben",
    "is_rated": false
  }
]
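The boolean attribute can be derived in the same pre-indexing step. The sketch below assumes a hypothetical helper named withIsRated. It only prepares the records and doesn't call any Algolia API.

```php
<?php

// Hypothetical helper: returns a copy of the record with a boolean is_rated
// attribute. A null or missing rating yields false; any value, including 0,
// yields true.
function withIsRated(array $record): array
{
    $record['is_rated'] = isset($record['rating']);

    return $record;
}

$records = [
    ['title' => 'The Shining', 'author' => 'Stephen King', 'rating' => 5],
    ['title' => 'Fantastic Beasts and Where to Find Them', 'author' => 'J. K. Rowling', 'rating' => null],
    ['title' => 'Run Away', 'author' => 'Harlan Coben'],
];

// Add the boolean to every record before indexing.
$prepared = array_map('withIsRated', $records);

echo json_encode($prepared, JSON_PRETTY_PRINT), PHP_EOL;
```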
To search for records that don’t have the attribute or attribute value present, you can now use boolean filtering:
$index->search('query', [
  'filters' => 'is_rated = 0'
]);