One of the key elements ensuring efficient operation of the services we work on every day at Allegro is fast responses from the database. We spend a lot of time to properly model the data so that storing and querying take as little time as possible. You can read more about why good schema design is important in one of my earlier posts. It’s also equally important to make sure that all queries are covered with indexes of the correct type whenever possible. Indexes are used to quickly search the database and under certain conditions even allow results to be returned directly from the index, without the need to access the data itself. However, indexes are not all the same and it’s important to learn more about their different types in order to make the right choices later on.
It’s easy to find resources about improving Elasticsearch performance, but what if you wanted to reduce it? In Part I of this two-part series we looked under the hood in order to learn how ES works internally. Now, in Part II, is the time to apply this knowledge in practice and ruin our ES performance. Most tips should also be applicable to Solr, raw Lucene, or, for that matter, to any other full-text search engine as well.
It’s easy to find resources about improving Elasticsearch performance, but what if you wanted to reduce it? This is Part I of a two-post series, and will present some ES internals. In Part II we’ll deduce from them a collection of select tips which can help you ruin your ES performance in no time. Most should also be applicable to Solr, raw Lucene, or, for that matter, to any other full-text search engine as well.
The main goal of boosting website performance is to improve the user experience. In theory, a satisfied customer is more likely to use a particular company’s services, which is then reflected in business results. However, from my own experience I can say that not every change can be easily converted into money. I would like to tell you how to reconcile these two worlds, how to convince the business that the benefits of better performance are a long-term investment, and how to streamline the development process during the design or code writing process.
The following article is an excerpt from Software Mistakes and Trade-offs book. In real-world big data applications, the amount of data that we need to store and process can be often counted in the hundreds of terabytes or petabytes. It is not feasible to store such an amount of data on one physical node. We need a way to split that data into N data nodes.