Bulk Import Documents Into Elasticsearch Using NEST
Elasticsearch is a best-of-breed search platform, but before you can search, you’ll need to import your documents. Being a .NET shop, we have adopted NEST as our client library for talking to our Elasticsearch cluster. This post will show you how to take a large set of documents and bulk import them into your Elasticsearch cluster with relative ease.
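// Requires: using System; using System.Runtime.ExceptionServices; using System.Threading; using Nest;
var documents = /* your list of documents */;
var waitHandle = new CountdownEvent(1);
ExceptionDispatchInfo capturedError = null;

var bulkAll = _elasticClient.BulkAll(documents, b => b
    .Index(indexName)          // target index
    .BackOffRetries(2)         // retry a failed bulk request up to 2 times
    .BackOffTime("30s")        // wait 30 seconds between retries
    .RefreshOnCompleted(true)  // refresh the index once the import finishes
    .MaxDegreeOfParallelism(4) // send up to 4 bulk requests concurrently
    .Size(1000)                // 1,000 documents per bulk request
);

bulkAll.Subscribe(new BulkAllObserver(
    onNext: (b) => { Console.Write("."); }, // one dot per completed bulk request
    onError: (e) =>
    {
        // onCompleted is not called after an error, so capture it and signal here to unblock Wait().
        capturedError = ExceptionDispatchInfo.Capture(e);
        waitHandle.Signal();
    },
    onCompleted: () => waitHandle.Signal()
));

waitHandle.Wait();
capturedError?.Throw(); // rethrow any import failure on the calling thread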
There are a few notable things happening in this code:
- We create a wait handle using a CountdownEvent. This tells our process when all the documents have been imported.
- We have retry logic in case an attempt fails: BackOffRetries(2) retries a failed bulk request up to two times, and BackOffTime("30s") waits 30 seconds between attempts.
- We call RefreshOnCompleted, which tells Elasticsearch to refresh the index once the import completes, so the new documents are searchable right away.
It really is that simple, and this code is generic enough to be used in any project utilizing NEST and Elasticsearch. Also, reactive for the win!
Note: We found this code in the great sample project written by Martijn Laarman.