Whether you’re using the API or the Algolia dashboard, it’s best to send several records at a time instead of pushing them one by one. Batching reduces network calls and speeds up indexing. Its impact on performance is greatest when you have many records, but you should send indexing operations in batches whenever possible.
For example, imagine you’re fetching all data from your database and end up with a million records to index. That’s too much to send in one go, because Algolia limits you to 1 GB per request.
Besides, sending that much data in a single network call would likely fail before ever reaching the API. You might instead loop over each record and send it with the saveObjects method. The problem is that this performs a million individual network calls, which takes far too long and saturates your Algolia cluster with as many indexing jobs.
A leaner approach is to split your collection of records into smaller chunks and send each chunk one by one. For optimal indexing performance, aim for a batch size of around 10 MB, which represents between 1,000 and 10,000 records depending on the average record size.
Batching records doesn’t reduce your operations count. Algolia counts indexing operations per record, not per method call, so from a pricing perspective, batching records is no different from indexing them one by one.
$client = new \AlgoliaSearch\Client(
    'AJ0P3S7DWQ',
    '••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb'
);
$index = $client->initIndex('actors');

$records = json_decode(file_get_contents('actors.json'), true);

// Batching is done automatically by the API client
$index->saveObjects($records, ['autoGenerateObjectIDIfNotExist' => true]);
require 'json'
require 'algolia'

client = Algolia::Search::Client.create(
  'AJ0P3S7DWQ',
  '••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb'
)
index = client.init_index('actors')

file = File.read('actors.json')
records = JSON.parse(file)

# The API client automatically batches your records
index.save_objects(records, { autoGenerateObjectIDIfNotExist: true })
import json
from algoliasearch.search_client import SearchClient

client = SearchClient.create(
    'AJ0P3S7DWQ',
    '••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb'
)
index = client.init_index('actors')

with open('actors.json') as f:
    records = json.load(f)

# Batching is done automatically by the API client
index.save_objects(records, {'autoGenerateObjectIDIfNotExist': True})
using System.IO;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class Actor
{
    public string Name { get; set; }
    public string ObjectId { get; set; }
    public int Rating { get; set; }
    public string ImagePath { get; set; }
    public string AlternativePath { get; set; }
}

AlgoliaClient client = new AlgoliaClient(
    "AJ0P3S7DWQ",
    "••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb"
);
Index index = client.InitIndex("actors");

// Don't forget to set the naming strategy of the serializer to handle Pascal/Camel casing
IEnumerable<Actor> actors = JsonConvert.DeserializeObject<IEnumerable<Actor>>(
    File.ReadAllText("actors.json")
);

// Batching/Chunking is done automatically by the API client
bool autoGenerateObjectIDIfNotExist = true;
index.SaveObjects(actors, autoGenerateObjectIDIfNotExist);
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Arrays;

import com.fasterxml.jackson.databind.ObjectMapper;

public class Actor {
    // Getters/Setters omitted
    private String name;
    private String objectId;
    private int rating;
    private String imagePath;
    private String alternativePath;
}

// Synchronous version
SearchClient client = DefaultSearchClient.create(
    "AJ0P3S7DWQ",
    "••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb"
);
SearchIndex<Actor> index = client.initIndex("actors", Actor.class);

ObjectMapper objectMapper = Defaults.getObjectMapper();

InputStream input = new FileInputStream("actors.json");
Actor[] actors = objectMapper.readValue(input, Actor[].class);

// Batching/Chunking is done automatically by the API client
boolean autoGenerateObjectIDIfNotExist = true;
index.saveObjects(Arrays.asList(actors), autoGenerateObjectIDIfNotExist);
package main

import (
	"encoding/json"
	"io/ioutil"

	"github.com/algolia/algoliasearch-client-go/v3/algolia/search"
)

type Actor struct {
	Name            string `json:"name"`
	Rating          int    `json:"rating"`
	ImagePath       string `json:"image_path"`
	AlternativeName string `json:"alternative_name"`
	ObjectID        string `json:"objectID"`
}

func main() {
	client := search.NewClient("AJ0P3S7DWQ", "••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb")
	index := client.InitIndex("actors")

	var actors []Actor
	data, _ := ioutil.ReadFile("actors.json")
	_ = json.Unmarshal(data, &actors)

	// Batching is done automatically by the API client
	_, _ = index.SaveObjects(actors)
}
val client = ClientSearch(
    ApplicationID("AJ0P3S7DWQ"),
    APIKey("••••••••••••••••••••ce1181300d403d21311d5bca9ef1e6fb")
)
val index = client.initIndex(IndexName("actors"))

val string = File("actors.json").readText()
val actors = Json.plain.parse(JsonObjectSerializer.list, string)

index.apply {
    actors.chunked(1000).map { saveObjects(it) }.wait() // Wait for all indexing operations to complete.
}
With this approach, batching 10,000 records at a time, you would make 100 API calls instead of 1,000,000. Depending on the size of your records and your network speed, you could create bigger or smaller chunks.
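If you prefer to control the batching yourself, for example when relying on the client's automatic batching isn't an option, here's a minimal sketch in Python of count-based chunking. The application ID and API key are placeholders, the actors.json file is reused from the examples above, and the chunk size of 10,000 is an assumption you should tune so each batch stays around 10 MB.

import json
from algoliasearch.search_client import SearchClient

# Placeholder credentials: replace with your application ID and admin API key
client = SearchClient.create('YourApplicationID', 'YourAdminAPIKey')
index = client.init_index('actors')

with open('actors.json') as f:
    records = json.load(f)

# Send the records in chunks of 10,000 instead of one request per record.
# Tune CHUNK_SIZE so each batch stays around 10 MB.
CHUNK_SIZE = 10_000
for start in range(0, len(records), CHUNK_SIZE):
    chunk = records[start:start + CHUNK_SIZE]
    index.save_objects(chunk, {'autoGenerateObjectIDIfNotExist': True})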