Online MongoDB migration

MongoDB is the most popular database used at Allegro. We have hundreds of MongoDB databases running on our on—premise servers. In 2022 we decided that we need to migrate all our MongoDB databases from existing shared clusters to new MongoDB clusters hosted on Kubernetes pods with separated resources. To perform the migration of all databases we needed a tool for transfering all the data and keeping consistency between old and new databases. That’s how mongo-migration-stream project was born.

From 3TB to 100GB: A Cost-Saving Journey in Database Maintenance

In the era of ubiquitous cloud services and an increasingly growing PaaS and serverless-oriented approach, performance and resources seem to be becoming less and less important. After all, we can scale horizontally and vertically at any time, without worrying about potential performance challenges that the business may introduce.

Debugging hangs - piecing together why nothing happens

As a part of a broader initiative of refreshing Allegro platform, we are upgrading our internal libraries to Spring Boot 3.0 and Java 17. The task is daunting and filled with challenges, however overall progress is steady and thanks to the modular nature of our code it should end in finite time. Everyone who has performed such an upgrade knows that you need to expect the unexpected and at the end of the day prepare for lots of debugging. No amount of migration guide would prepare you for what’s coming in the field. In the words of Donald Rumsfeld there are unknown unknowns and we need to be equipped with the tools to uncover these unknowns and patch them up. In this blog post I’d like to walk you through a process that should show where the application hangs, although there seems to be nothing wrong with it. I will also show that you don’t always know what code you have – problem known as dependecy hell, place we got quite cosy in during this upgrade.

Trust no one, not even your training data! Machine learning from noisy data

Label noise is ever-present in machine learning practice. Allegro datasets are no exception. We compared 7 methods for training classifiers robust to label noise. All of them improved the model’s performance on noisy datasets. Some of the methods decreased the model’s performance in the absence of label noise.

Dynamic Workload Balancing in Hermes

Hermes is a distributed publish-subscribe message broker that we use at Allegro to facilitate asynchronous communication between our microservices. As our usage of Hermes has grown over time, we faced a challenge in effectively distributing the load it handles to optimize resource utilization. In this blog post, we will present the implementation of a dynamic workload balancing algorithm that we developed to address this challenge. We will describe the approach we took, the lessons we learned along the way, and the results we achieved.

allegro.tech

At Allegro, we build and maintain some of the most distributed and scalable applications in Central Europe. This poses many challenges, starting with architecture and design, through the choice of technologies, code quality and performance tuning, and ending with deployment and devops. In this blog, our engineers share their experiences and write about some of the interesting things taking place at the company.