Allegro OpenSource — embedded-elasticsearch
At Allegro we want to be sure that our software works as designed. That’s why tests are so important to us. We use Elasticsearch in several projects. To make writing integration tests that use Elasticsearch easier, we’ve created a little tool called embedded-elasticsearch. It sets up the Elasticsearch instance that you need for your tests (including installation of plugins) and gives you full control over it.
Background and motivation
Once, we were writing a service that used Elasticsearch extensively. Almost every request to that service
resulted in a few requests to Elasticsearch, whose results were wrapped in some other structure and returned
to the requester. In software that relies so heavily on other software, integration tests give you much
more feedback than simple unit tests that mock out external services. So we started writing our
integration tests using Elasticsearch’s NodeBuilder to create an embedded instance of Elasticsearch:
Node node = nodeBuilder()
        .settings(Settings.settingsBuilder().put("http.enabled", false))
        .client(true)
        .node();
Client client = node.client();
We were happy with that solution. But then we started to struggle with external plugins, which cannot
simply be added using NodeBuilder. We also discovered that in projects that use Elasticsearch we tend
to create simple utility classes that make our tests easier. Those classes let us recreate indices using a predefined
schema and settings, start and stop a cluster to test behaviour during cluster stability problems,
and so on. We were not fully satisfied with this approach, because our main dependency, Elasticsearch, was
started using its internal classes rather than with a startup script, as it is in the production
environment. So we decided to create a tool that would solve these problems.
Solution
Our tool, embedded-elasticsearch, is actually pretty simple. You describe the desired Elasticsearch instance, and it downloads the proper archives and sets everything up. It also gives you control over the created instance and its indices.
To start using embedded-elasticsearch in your project, add it as a test dependency:
Gradle:
testCompile 'pl.allegro.tech:embedded-elasticsearch:1.0.0'
Maven:
<dependency>
    <groupId>pl.allegro.tech</groupId>
    <artifactId>embedded-elasticsearch</artifactId>
    <version>1.0.0</version>
    <scope>test</scope>
</dependency>
Below is an example of a simple integration test. It’s a deliberately trivial spec that writes a document into Elasticsearch and checks that it has really been written. Note that the test is written in Groovy using the Spock framework:
package tech.allegro.blog

import pl.allegro.tech.embeddedelasticsearch.EmbeddedElastic
import spock.lang.Specification

class EmbeddedElasticExampleSpec extends Specification {

    static final CLUSTER_NAME = "my_example_cluster"
    static final PORT = 12913
    static final ELASTIC_VERSION = "2.3.3"

    // Describes the desired installation, then sets it up and starts it
    def embeddedElastic = EmbeddedElastic.builder()
            .withElasticVersion(ELASTIC_VERSION)
            .withClusterName(CLUSTER_NAME)
            .withPortNumber(PORT)
            .build()
            .start()

    def client = embeddedElastic.createClient()

    def "should write document into Elasticsearch"() {
        given: "index with single user"
        def usersIndex = "users"
        embeddedElastic.index(usersIndex, "user", '{ "name": "Joe", "surname": "Doe" }')

        when: "searching for all documents"
        def result = client.prepareSearch(usersIndex).execute().actionGet()

        then: "one document should be returned"
        result.hits.hits.length == 1
    }
}
When you run that test, the logs show that the proper Elasticsearch archive is downloaded first, and only
then does the test start. The archive is downloaded to a temporary directory and is not removed after the tests finish,
so if you run the tests again, the same archive is reused to speed up the whole process. But let’s get back to our test.
The most important thing for us here is the initialization of the embeddedElastic object. Here you describe your
desired installation: the Elasticsearch version (if you prefer, you can use the URL of your own
archive instead), plugins, cluster name, port number, the indices used and their settings. Later on, you use that object
to operate on the cluster: create/delete/recreate indices, create documents, start/stop the instance, and a few other
things. Here is another, more sophisticated example. We have a base class for integration tests of a service
that uses four indices and an external plugin called Decompound:
package tech.allegro.blog

import pl.allegro.tech.embeddedelasticsearch.EmbeddedElastic
import pl.allegro.tech.embeddedelasticsearch.IndexSettings
import spock.lang.Specification

class IntegrationBaseSpec extends Specification {

    // Constants must be static, because the static embeddedElastic initializer below uses them
    static final CLUSTER_NAME = "elasticsearch"
    static final SEARCH_INDEX_NAME = "products_v2"
    static final SUGGESTER_PRODUCT_NAMES_INDEX_NAME = "suggester_product_names"
    static final SUGGESTER_PRODUCT_NAMES_INDEX_TYPE = "products"
    static final SUGGESTER_PHRASES_INDEX_NAME = "suggester_phrases"
    static final SUGGESTER_PHRASES_INDEX_TYPE = "phrases"
    static final SUGGESTER_PHRASES_AGGS_INDEX_NAME = "suggester_phrases_aggs"
    static final SUGGESTER_PHRASES_AGGS_INDEX_TYPE = "phrases_aggs"
    static final PORT = 9300
    static final ELASTIC_VERSION = "2.3.3"
    static final DECOMPOUND_DOWNLOAD_URL = new URL("http://xbib.org/repository/org/xbib/elasticsearch/plugin/elasticsearch-analysis-decompound/${ELASTIC_VERSION}.0/elasticsearch-analysis-decompound-${ELASTIC_VERSION}.0-plugin.zip")

    // One Elasticsearch instance shared by all specs that extend this base class
    static def embeddedElastic = EmbeddedElastic.builder()
            .withElasticVersion(ELASTIC_VERSION)
            .withPlugin("decompound", DECOMPOUND_DOWNLOAD_URL)
            .withClusterName(CLUSTER_NAME)
            .withPortNumber(PORT)
            .withIndex(SEARCH_INDEX_NAME)
            .withIndex(SUGGESTER_PRODUCT_NAMES_INDEX_NAME, IndexSettings.builder()
                    .withSettings(getResourceAsStream("products/elastic-settings.json"))
                    .withType(SUGGESTER_PRODUCT_NAMES_INDEX_TYPE, getResourceAsStream("products/elastic-mapping.json"))
                    .build())
            .withIndex(SUGGESTER_PHRASES_INDEX_NAME, IndexSettings.builder()
                    .withSettings(getResourceAsStream("phrases/elastic-settings.json"))
                    .withType(SUGGESTER_PHRASES_INDEX_TYPE, getResourceAsStream("phrases/elastic-mapping.json"))
                    .build())
            .withIndex(SUGGESTER_PHRASES_AGGS_INDEX_NAME, IndexSettings.builder()
                    .withSettings(getResourceAsStream("phrases-aggs/elastic-settings.json"))
                    .withType(SUGGESTER_PHRASES_AGGS_INDEX_TYPE, getResourceAsStream("phrases-aggs/elastic-mapping.json"))
                    .build())
            .build()
            .start()

    // Start every specification from freshly created indices
    def cleanupSpec() {
        embeddedElastic.recreateIndices()
    }
}
Index definition is straightforward: you must specify the index name and settings, and optionally one or more types
with their schemas. It is also advisable to recreate indices after (or before) each specification is executed, to make the tests more reliable. Here, this is done in cleanupSpec(). Note that you don’t have to bother stopping Elasticsearch after the tests:
it’s done automatically in a shutdown hook.
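Two of the behaviours described in this section happen behind the scenes: the downloaded archive is kept in a temporary directory and reused on later runs, and the instance is stopped from a JVM shutdown hook. The Java sketch below illustrates how such behaviour could be built using only the standard library. It is not the actual embedded-elasticsearch implementation: the class name, both method names and the fake downloader are hypothetical and exist only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; not part of embedded-elasticsearch.
final class EmbeddedLifecycleSketch {

    /**
     * Returns the archive stored under cacheDir/fileName. The downloader is
     * invoked only when no such file exists yet, so later runs reuse it.
     */
    static Path ensureArchive(Path cacheDir, String fileName, Supplier<byte[]> downloader) {
        try {
            Files.createDirectories(cacheDir);
            Path target = cacheDir.resolve(fileName);
            if (Files.exists(target)) {
                return target; // archive left behind by an earlier run
            }
            // Write to a temporary file first, so an interrupted download
            // never leaves a half-written archive under the final name.
            Path partial = Files.createTempFile(cacheDir, fileName, ".part");
            Files.write(partial, downloader.get());
            return Files.move(partial, target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Registers stopAction to run when the JVM exits and returns the hook thread. */
    static Thread stopOnJvmExit(Runnable stopAction) {
        Thread hook = new Thread(stopAction, "embedded-instance-stop");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) throws IOException {
        // A fresh directory keeps the demo repeatable; a real cache would use a fixed location.
        Path cacheDir = Files.createTempDirectory("archive-cache-sketch");
        AtomicInteger downloads = new AtomicInteger();
        // Stand-in for a real HTTP download: counts calls and returns dummy bytes.
        Supplier<byte[]> fakeDownload = () -> {
            downloads.incrementAndGet();
            return "fake archive".getBytes(StandardCharsets.UTF_8);
        };

        ensureArchive(cacheDir, "elasticsearch-2.3.3.zip", fakeDownload);
        ensureArchive(cacheDir, "elasticsearch-2.3.3.zip", fakeDownload); // served from the cache

        System.out.println("downloads performed: " + downloads.get());
        stopOnJvmExit(() -> System.out.println("stopping embedded instance"));
    }
}
```

In this sketch the second call to ensureArchive finds the file already in place and never invokes the downloader, which is the same trade-off the tool makes: a slower first run in exchange for faster subsequent ones. The shutdown hook runs when the JVM exits normally, which is why a test does not need its own explicit stop call.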
Source code
Source code is available under the Apache licence and can be found on GitHub. Feel free to use the tool and to submit suggestions and pull requests.