Sep 10 2024
At Allegro, we continuously improve our development processes to maintain high
code quality and efficiency standards. One of the significant challenges we
encounter is managing code migrations at scale, especially with breaking changes
in our internal libraries or workflows. With over 2000 services (and their repositories),
manual code migration is a severe burden, so we need to introduce some kind
of code migration management.
Sep 4 2024
In one of our core services, the execution of a single unit test took approximately 30 seconds,
while a single integration test ranged between 65 and 70 seconds.
Running the entire test suite took about 6 minutes.
Aug 26 2024
Hi, I am Magda and I will tell you a story about coming back to work after a break of 21 months and 2 days. Everything here is my subjective perspective on that experience.
Aug 5 2024
Are you, as a test automation engineer, tired of Selenium’s flakiness? Are you seeking a better tool to automate your end-to-end tests? Have you heard of
Playwright? Perhaps you’ve encountered opinions that it is only worth using within a Node.js environment. I have. And as a tester, I decided to verify if
this is true. If you’re interested in the results, I encourage you to read the following article.
Jul 26 2024
This article is a case study of how we improved stability in our critical application.
It’s mostly a technical analysis of what happens in a fresh Java-based instance,
how the JIT compiler toyed with us at application start, and how we learned to control it.
Jul 16 2024
If you have experience with Event Storming and have ever found yourself wishing there was a way to document the insights gathered during a session,
or wanting to communicate the process to other team members, then I have a solution for you. This idea can be expressed in a famous saying:
One picture is worth more than a thousand words.
Jul 1 2024
Site performance is very important, first of all, from the perspective of users, who expect a good experience when visiting the site.
The user should not wait too long for the page to load. We all know how annoying it can be when we want to press an element
and it jumps to another place on the page, or when we click a button and nothing happens for a very long time. A
site’s performance in these respects is measured by Web Vitals metrics, most importantly by the three
Core Web Vitals: Largest Contentful Paint (LCP), Cumulative Layout Shift (CLS), and Interaction to Next Paint (INP). They
measure three things, respectively: loading time, visual stability, and interactivity. These metrics also matter to the
websites themselves because, beyond user experience, they are taken into account in search engine
ranking (SEO), which is crucial for most websites on the Internet, Allegro included.
Jun 20 2024
In this article we’ll present methods for efficiently optimizing physical resources and fine-tuning the configuration of a Google Cloud Platform (GCP)
Dataflow pipeline in order to achieve cost reductions.
Optimization will be presented as a real-life scenario, which will be performed in stages.
Jun 11 2024
One tech blog/newsletter has been gaining traction and popularity for a couple of years now: Pragmatic Engineer.
Jun 4 2024
The purpose of this article is to present how to design, test, and monitor a REST service client.
The article includes a repository with clients written in Kotlin using various technologies such as WebClient,
RestClient, Ktor Client, and Retrofit.
It demonstrates how to send and retrieve data from an external service, add a cache layer, and parse the received response into domain objects.
May 16 2024
This story shows our journey in addressing a platform stability issue related to autoscaling, which, paradoxically, added overhead instead
of reducing the load. A pivotal part of this narrative is how we used Couchbase, a distributed NoSQL database. If you find
yourself intrigued by another enigmatic story involving Couchbase, don’t miss my
blog post on tuning expired doc settings.
Apr 12 2024
In early 2024, I hit ten years at Allegro, which also happens to be how long I’ve been working with microservices.
This timespan also roughly corresponds to how long the company as a whole has been using them, so I think it’s a good time to outline the story of project
Rubicon: a very ambitious gamble which completely changed how we work and what our software is like. The idea probably seemed rather extreme at the time, yet I
am certain that without this change, Allegro would not be where it is today, or perhaps would not be there at all.
Mar 6 2024
At Allegro, we use Kafka as a backbone for asynchronous communication between microservices. With up to
300k messages published and 1M messages consumed every second, it is a key part of our infrastructure. A few months ago, in our main Kafka cluster, we noticed
the following discrepancy: while median response times for produce requests
were in single-digit milliseconds, the tail latency was much worse. Namely, the
p99 latency was up to 1 second, and the p999 latency was up to 3 seconds. This was unacceptable for a new project that we were about to start, so we
decided to look into this issue. In this blog post, we would like to describe our journey — how we used Kafka protocol sniffing and eBPF to identify and remove
the performance bottleneck.
Feb 20 2024
Have you ever thought about ways of reducing repetitive, monotonous tasks? Maybe you would like to try to automate your own tasks? I will show you what
technology we use at Allegro, what processes we have automated, and how to do it on your own.
Feb 12 2024
This story shows how we strive to fix issues reported by our customers regarding inconsistent listing views on our e-commerce platform.
We will guide you through our story in a top-down manner. We begin by highlighting the challenges faced by our customers, then present
basic information on how views are personalized in our web application. We then delve deeper into our internal architecture, aiming to clarify how
it supports High Availability (HA) by using two data centers. Finally, we say a few words in favor of Couchbase,
a distributed NoSQL database, and explain why it is an excellent storage solution for such an architecture.
Jan 24 2024
Ready to turn web accessibility from a headache into a breeze? Join us as we demystify WCAG, explore its latest 2.2 version, and gaze into the future of digital
inclusivity. Get ready for a journey that’s as enlightening as it is entertaining!
Jan 10 2024
Icons are an integral part of most modern UIs. What is the best way to embed icons nowadays?
Dec 14 2023
This article is a public postmortem of sorts, in which we would like to share our bumpy road to uncovering the cause of a mysterious performance problem.
Besides unveiling part of our technical stack based on open-source solutions, we also show how some false assumptions made the bug triage process much
harder.
Besides all the NOT TO DOs, you can find some exciting information about performance hunting and reproducing performance issues on a small scale.
As a perk, we prepared a repository where you can reproduce the problem and make yourself familiar with tools
that allowed us to confirm the cause.
The last part (lessons learned) is the most valuable if you prefer to learn from the mistakes of others.
Nov 27 2023
The B-tree is a data structure that helps to search through large amounts of data.
It was invented over 40 years ago, yet it is still employed by the majority of modern databases.
Although there are newer index structures, like LSM trees,
the B-tree remains unbeaten at handling most database queries.
Oct 30 2023
The idea for this article arose during a meeting where we learned that our supervisor would be leaving the company to pursue new opportunities. In response, a
colleague lamented that what we would miss most is the knowledge departing with the leader. Unfortunately, that’s how it goes. Not only do we lose a colleague,
but we also lose valuable knowledge and experience. However, this isn’t a story about my supervisor; it’s a story about all those individuals who are experts in
their fields, who understand the paths to success and paths that lead to catastrophic failures. When they leave, they take with them knowledge that you won’t
find in any book, note, or Jira ticket. And this leads to a fundamental question: What can be done to avoid this “black hole” of knowledge? How can we ensure
it doesn’t vanish along with them? That’s what this article is all about.
Sep 14 2023
MongoDB is the most popular database used at Allegro. We have hundreds of MongoDB databases running on our on-premise servers.
In 2022 we decided that we needed to migrate all our MongoDB databases
from existing shared clusters to new MongoDB clusters hosted on Kubernetes pods with separated resources.
To perform the migration of all databases we needed a tool for transferring all the data and keeping the old and new databases consistent.
That’s how the mongo-migration-stream project was born.
Aug 22 2023
After six years as a Team Leader, I went back to hands-on engineering work, and I’m very happy about taking
this step. While it may appear surprising at first, it was a well-thought-out decision, and actually I’ve already
performed such a maneuver once before.
Jul 10 2023
In the era of ubiquitous cloud services and an ever more common PaaS- and serverless-oriented approach, performance
and resources seem to be becoming less and less important.
After all, we can scale horizontally and vertically at any time, without worrying about potential performance challenges
that the business may introduce.
May 31 2023
As part of a broader initiative to refresh the Allegro platform, we are upgrading our internal libraries to Spring Boot 3.0 and Java 17.
The task is daunting and filled with challenges;
however, overall progress is steady, and thanks to the modular nature of our code it should end in finite time.
Everyone who has performed such an upgrade knows that you need to expect the unexpected and, at the end of the day, prepare for lots of debugging.
No migration guide can fully prepare you for what’s coming in the field.
In the words of Donald Rumsfeld, there are unknown unknowns, and we need to be equipped with the tools to uncover these unknowns and patch them up.
In this blog post I’d like to walk you through a process that should show where the application hangs,
although there seems to be nothing wrong with it. I will also show that you don’t always know what code you have: a problem known as dependency hell,
a place we got quite cosy in during this upgrade.
Apr 18 2023
Label noise is ever-present in machine learning practice.
Allegro datasets are no exception.
We compared 7 methods for training classifiers robust to label noise.
All of them improved the model’s performance on noisy datasets.
Some of the methods decreased the model’s performance in the absence of label noise.