Indexing Queue
Introduction
If any issues arise with your data, indices, or queue, please check the troubleshooting guide.
Before the Magento data is searchable, it needs to be uploaded to our servers and indexed. The indexing queue manages all the synchronization with our servers automatically.
To read more about the indexing process for Magento 2, please read the page on indexing.
The indexing queue processes updates to products, categories, pages, and any other data by sending the updates to our servers asynchronously. This way, the data in Magento and on our servers will be up to date at all times, providing the best user experience for the customers.
Because the queue processes asynchronously, the Magento administrator does not have to wait after every change until the index is updated.
Configuring the queue
To enable the indexing queue, navigate to Stores > Configuration > Algolia Search > Indexing Queue/Cron in the Magento administration.
Enabling the indexing queue is recommended for production environments.
All queued operations are stored in the database, in a table called algoliasearch_queue.
By default, the queue runs no more than 10 operations at a time.
This number can be raised or lowered in the settings to better suit the resources available on the server.
To find out how many operations the server can handle, follow these steps:
1. Turn off the cron job.
2. Set the number of operations to process to 10.
3. Manually run the indexer.
4. Measure how long one run takes.
5. If it takes less than 4 minutes, increase the number of jobs to process.
6. Repeat from step 3 until you find the highest value that still completes in under 4 minutes.
Don’t forget to turn the cron job back on.
Note: the steps use 4 minutes rather than the 5-minute interval of the recommended cron job. This keeps a safety margin in case a run is slower than normal due to unforeseen circumstances, such as high server load.
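The measuring step can be scripted. The following is a minimal sketch, not part of the extension: it times one manual run with date and compares the result to the 240-second limit. The function names advise and time_queue_run and the MAGENTO_BIN variable are assumptions made for illustration, and the path is the same placeholder used in the commands in this guide.

```shell
#!/bin/sh
# Placeholder path taken from this guide; adjust it to your installation.
MAGENTO_BIN="${MAGENTO_BIN:-path/to/magento/bin/magento}"
# Safe upper bound in seconds (4 minutes for a 5-minute cron interval).
MAX_SECONDS=240

# Print a recommendation for a measured run time given in whole seconds.
advise() {
    if [ "$1" -lt "$MAX_SECONDS" ]; then
        echo "run took ${1}s: under ${MAX_SECONDS}s, the number of jobs can be increased"
    else
        echo "run took ${1}s: at or over ${MAX_SECONDS}s, reduce the number of jobs"
    fi
}

# Run the queue runner once and report how long it took.
time_queue_run() {
    start=$(date +%s)
    php "$MAGENTO_BIN" indexer:reindex algolia_queue_runner
    end=$(date +%s)
    advise "$((end - start))"
}
```

After each change to the number of jobs, call time_queue_run once and read the recommendation. The measurement is only meaningful when the queue holds at least as many jobs as the configured number.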
Failed operations
Whenever an operation fails while the queue is being processed, it is retried the next time the queue runs. To prevent operations from being retried indefinitely, the maximum number of retries can be configured in the settings.
Processing the queue
Once the queue is enabled, the process to run it needs to be set up. There are two ways to achieve this.
Automatically
The preferred way to handle the queue is by processing it at a regular time interval. To do this, the following crontab entry has to be configured.
*/5 * * * * php absolute/path/to/magento/bin/magento indexer:reindex algolia_queue_runner
This crontab entry runs every five minutes and processes the number of operations set in the configuration (10 by default).
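If you prefer to add the entry from a script rather than by editing the crontab by hand, something along these lines may work. It is a sketch under assumptions: the function names queue_cron_line and install_queue_cron are made up for this example, and the entry is installed in the crontab of the user running the script, which should be the user that owns the Magento files.

```shell
#!/bin/sh
# Build the crontab line for a given absolute path to bin/magento.
queue_cron_line() {
    echo "*/5 * * * * php $1 indexer:reindex algolia_queue_runner"
}

# Append the line to the current user's crontab unless it is already there.
install_queue_cron() {
    line=$(queue_cron_line "$1")
    # An empty result simply means the user has no crontab yet.
    current=$(crontab -l 2>/dev/null || true)
    if printf '%s\n' "$current" | grep -qxF -- "$line"; then
        echo "entry already present"
    else
        {
            [ -n "$current" ] && printf '%s\n' "$current"
            printf '%s\n' "$line"
        } | crontab -
    fi
}
```

Calling install_queue_cron /absolute/path/to/magento/bin/magento a second time leaves the crontab unchanged, so the script is safe to rerun during deployments.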
Manually
While the crontab entry is preferred to get regular updates to the data when necessary, it’s also possible to manually trigger the indexing jobs. To process the queue manually, run the following command from the command line:
php path/to/magento/bin/magento indexer:reindex algolia_queue_runner
Running the queue manually may not empty the whole queue, as each run only processes the number of operations set in the configuration.
Processing the full queue
When all queued operations need to be processed at once, set the PROCESS_FULL_QUEUE=1 environment variable when manually processing the queue.
Run the following command in the command line to process the complete queue at once:
PROCESS_FULL_QUEUE=1 php path/to/magento/bin/magento indexer:reindex algolia_queue_runner
Please note that this is not recommended. When processing a large number of operations, the chance of errors increases: network timeouts, PHP timeouts, and memory problems can occur. If these errors occur, the queue will not be completely emptied, and the data will not be fully up to date. To resolve any errors you encounter, please have a look at our troubleshooting pages.
Clearing the queue
To clear your queue, go to the Stores > Algolia Search > Indexing Queue page in your Magento back office and click the “Clear Queue” button to remove all jobs from the queue.
Alternatively, you can truncate the algoliasearch_queue table in your database.
If you clear your indexing queue, we recommend running a full reindex of your data afterwards to ensure that it is up to date in Algolia.
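The database route can be sketched as a small script. Several things here are assumptions rather than documented behavior: the database name magento is hypothetical, credentials are expected to come from your MySQL client configuration, the table is assumed to have no prefix, and the reindex covers only the three entity indexers named in this guide (your installation may have more). The run helper and the DRY_RUN switch are illustrative names; with DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Placeholder path taken from this guide; adjust it to your installation.
MAGENTO_BIN="${MAGENTO_BIN:-path/to/magento/bin/magento}"
# Hypothetical database name; credentials are expected in the MySQL client config.
DB_NAME="${DB_NAME:-magento}"
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (shell quoting is not shown), else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Empty the queue table, then reindex the Algolia entity indexers.
clear_queue_and_reindex() {
    run mysql "$DB_NAME" -e "TRUNCATE TABLE algoliasearch_queue"
    run php "$MAGENTO_BIN" indexer:reindex algolia_products algolia_categories algolia_pages
}
```

Review the printed commands first, then set DRY_RUN=0 and call clear_queue_and_reindex again to apply them. Truncating the table discards every pending job, so the reindex step should not be skipped.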
Indexing queue logs
To determine how well your indexing queue is performing, you can review the algoliasearch_queue_log table in your database. Each row represents one run of the algolia_queue_runner indexer, whether it comes from the cron job or a manual run.
The duration column is expressed in seconds. This duration should be at least a minute shorter than your cron job interval. For instance, if the cron job runs every 5 minutes, each run should take less than 4 minutes (240 seconds). This accounts for any extra processing time.
If your runs finish well under the recommended duration, you can increase the number of processed jobs to optimize your queue runner. You can find this setting in Stores > Configuration > Algolia Search > Indexing queue / Cron > Number of jobs to run each time the cron is run.
If your runs take longer than the recommended duration, we recommend that you reduce the number of processed jobs.
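The duration check can be approximated from the command line. This is a sketch only: max_duration and count_slow_runs are hypothetical helper names, and the only column it relies on is duration, as described above. It computes the limit from the cron interval and counts how many logged runs reached or exceeded it.

```shell
#!/bin/sh
# Recommended maximum run time in seconds: the cron interval (in minutes) minus one minute.
max_duration() {
    echo $(( $1 * 60 - 60 ))
}

# Read one duration per line on stdin and print how many are at or over the limit.
count_slow_runs() {
    awk -v limit="$1" '$1 + 0 >= limit { slow++ } END { print slow + 0 }'
}
```

With the mysql client, the durations could be piped in like this, where your_database stands for your own database name: mysql -N -e "SELECT duration FROM algoliasearch_queue_log" your_database | count_slow_runs "$(max_duration 5)". A result above zero suggests lowering the number of processed jobs.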
Starting with version 1.13.0, you can find the indexing queue logs in the Magento back office by going to Stores > Algolia Search > Indexing Queue > See Run Logs.
Indexer modes
Magento has two indexing modes: Update on Save and Update on Schedule. With Update on Save, when an entity updates, Magento indexes the data that the save event holds. If you use Update on Schedule, Magento bypasses these events by creating MySQL table triggers to store updated entity IDs in a change log *_cl table. The Magento cron indexes these IDs at a later time.
You can set the following indexers to Update on Schedule:
Indexer name | Indexer ID |
---|---|
Algolia Search Products | algolia_products |
Algolia Search Categories | algolia_categories |
Algolia Search Pages | algolia_pages |
Setting the Algolia Search Queue Runner (algolia_queue_runner) indexer to Update on Schedule isn’t recommended. In combination with the recommended cron, changing this mode to Update on Schedule can cause unexpected indexing behavior, including data loss from concurrent job processing. For better performance, keep this indexer set to Update on Save.
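The modes can also be set from the command line with Magento's indexer:set-mode command, where realtime corresponds to Update on Save and schedule to Update on Schedule. The sketch below only prints the commands so they can be reviewed first. The function name indexer_mode_commands and the MAGENTO_BIN variable are assumptions for illustration, and the path is the placeholder used throughout this guide.

```shell
#!/bin/sh
# Placeholder path taken from this guide; adjust it to your installation.
MAGENTO_BIN="${MAGENTO_BIN:-path/to/magento/bin/magento}"

# Print the commands that apply the recommended indexer modes.
indexer_mode_commands() {
    # Entity indexers may use Update on Schedule.
    echo "php $MAGENTO_BIN indexer:set-mode schedule algolia_products algolia_categories algolia_pages"
    # The queue runner stays on Update on Save.
    echo "php $MAGENTO_BIN indexer:set-mode realtime algolia_queue_runner"
    # Show the resulting modes for verification.
    echo "php $MAGENTO_BIN indexer:show-mode"
}
```

Run indexer_mode_commands to inspect the commands, and pipe the output to sh once they look right for your installation.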