BigFlow — a Python framework for data processing on GCP
We are excited to announce that we have just released BigFlow 1.0 as open source. It’s a Python framework for big data processing on the Google Cloud Platform.
We are excited to announce that we have just released BigFlow 1.0 as open source. It’s a Python framework for big data processing on the Google Cloud Platform.
Imagine that you were charged with finishing a task started by your colleague. He or she implemented it just before leaving for a vacation. Now it is your job to finish and release it.
Most applications need to be able to persist and retrieve their state to be fully functional. In my previous post I compared methods for persisting application state. In this post I will compare the methods for retrieving this state.
Every day, many developers of enterprise systems have to face complex problems for which incorrect solutions often end with a loss of enormous amounts of money. That’s why they constantly learn new paradigms, frameworks, tools, read books and broaden their knowledge in technical conferences. It’s a lot of time and effort! All this dedication is on behalf of creating a high-quality product — but what is the result? Obviously it depends (as every software architect would say), but too many times the result isn’t the best. It takes a while, has got too many errors, often the code quality isn’t high… And let’s better not mention testing. Why is all this happening then? Everyone does their best, spends additional time on learning, so what exactly is missing? The answer is quite straightforward - constant practicing. Let’s say a few words about “code kata” which can make a remedy for the mentioned problems.
Let me start with a story. Once upon a time I stumbled upon an excellent article by Philip Walton where he describes how expensive script evaluations could (and should!) be deferred until the browser is idle or they are actually needed. One of the examples that awakened my interest was creating an instance of the Intl.DateTimeFormat object, as I was using this great API quite often but never thought it can cause real performance problems. Turns out it can, especially if used inside loops. Apart from the technique described in Philip’s article, another solution is to simply reuse Intl.DateTimeFormat instances instead of creating them every time.
We run over 800 microservices in our on-premise and public clouds. They are developed by hundreds of engineers in various technologies: from the most popular Java and Kotlin, through Scala, Clojure, Python, NodeJS, Golang, to less mainstream like Elixir or Swift. All these applications need to handle common technical concerns: logging, monitoring, service discovery, tracing, internationalization, security and more. How to provide common solutions to these requirements in such a heterogeneous setup? In this post I’m going to explain how we originally solved this problem with common libraries and how we are currently changing our runtime environment to make things even easier.
An application can be defined as a set of use cases. It often happens that use case A requires a previously executed use case B for its execution. In such situation, it should be ensured that use case B has been executed while executing use case A. To achieve this, application state that is common to both use cases, is introduced. The state must be persisted to be visible to more than one use case. Most often, various types of databases are used for this purpose. While working with source code, I have encountered various methods of persisting the application state. I also came up with my own variations. In this post I will make a subjective comparison of these methods based on specific criteria.
When you go through articles related to Hexagonal Architecture (HA) you usually search for practical examples. HA isn’t simple, that’s why most trivial examples make readers even more confused, though it is not as complex as many theoretical elucidations present it. In most posts you have to scroll through exact citations or rephrased definitions of concepts such as Ports and Adapters or their conceptual diagrams. They have already been well defined and described by popular authors i.e. Alistair Cockburn or Martin Fowler. I assume you already have a general understanding of Domain Driven Design and that you understand terms such as Ports and Adapters. I’m not a HA expert, yet I use it everyday and I find it useful. The only reason I write this post is to show you that Hexagonal Architecture makes sense, at least if your service is a little more than a JsonToDatabaseMapper.
This year Allegro.pl turns 21. The company, while serving millions of Poles in their online shopping, has taken part in many technological advances. Breaking the monolith, utilising public cloud offerings, machine learning, you name it. Even though many technologies we use might seem as just following the hype, their adoption is backed by solid reasoning. Let me tell you the story of a project I’ve had the privilege of working on.
As a developer interested in both web technologies and game development I always found myself disagreeing with a large part of articles about using a particular technology to solve some problems. While such articles are often true, they often skip some important details that make given solution unacceptable in some other cases. And in this article I will try to look at immutability in a negative way from game development perspective and how it can affect web services too. It is always more fun to look in a negative way at something everyone loves ;)
This post was published on April 1st, 2020, and should not be taken too seriously.
The group that you could help is huge. The Polish Association of the Blind reports that there are more than 3 million people in Poland with visual impairment and 42 thousand with complete blindness (based on official, national statistical data of 2016 - not so fresh; still, it is not better in 2020, so…). These numbers do not include people with other problems, for example, those resulting from severe sickness, with movement disabilities, or with learning problems, older or just foreigners, who are not fluent with the language. All of them, and many more, are not able to use digital products the way people without disabilities do.
You may not know this, but there is a part of Allegro codebase which we started developing in C# due to some special requirements. This implies new programming opportunities and challenges — one of these is creating a completely new .NET Core starter project. Let’s explore one potential solution: dotnet new templates.
Recently, our crucial microservice delivering listing data switched to Spring WebFlux. A non-blocking approach gave us the possibility to reduce the number of server worker threads compared to Spring WebMvc. The reactive approach helped us to effectively build a scalable solution. Also, we entered the world of functional programming where code becomes declarative: statements reduced to the minimum immutable data preferred no side effects quite a neat chaining of method calls causing it easier to understand.
Configuration management is one of the key challenges you have to face when you decide to build an application as a distributed system based on microservices deployed to the Cloud. There are multiple ways of addressing different aspects of this problem, using several tools such as Spring Cloud Config Server or Hashicorp Consul. However, this article will focus on the tools that Google Cloud Platform offers out of the box. The approaches mentioned should be seen as complementary rather than mutually exclusive.
When designing the architecture of a system, one always needs to think about what can go wrong and what kind of failures can occur in the system. This kind of problem analysis is especially hard in distributed systems. Failure is inevitable and the best we can do is to prepare for it.
Surprisingly, this is the first time we share with you a report from Allegro Tech Meeting (ATM) — even though it’s already the 12th edition of this special event.
One of the first challenges a programmer has to face is organizing classes within a project. This problem may look trivial but it’s not. Still, it’s worth spending enough time to do it right. I’ll show you why this aspect of software development is crucial by designing a sample project’s architecture.
In this last instalment of our series about team tourism at Allegro, our two engineers describe their eventful visits in two rather technical teams, one dealing with our message broker - Hermes, the other with web performance.
One of the coolest features added in just announced TypeScript 3.7 is optional chaining syntax. It promises a much shorter and more readable code for dealing with deeply nested data structures. How may this nice new feature affect the performance of your project?
Following up on our previous post about team tourism at Allegro, as promised we present you with three case studies describing experiences of our engineers during their visits in other teams. The teams visited worked on content automation, machine learning and search engine optimisation.
Are you dreaming about getting a job in IT, but you feel stuck? You may be constantly working your way through books and tutorials about the technology that excites you (be it JavaScript, PHP or whatever else), but it still feels like you will never be able to get that Junior Developer position you want.
We often hear about the importance of exchanging knowledge and practices between different teams. However, less often do we hear concrete suggestions for how to do it. In this article, we discuss one of our practices to address the problem.
When we measure the page loading speed from the user’s perspective, we pay attention to the appearance of subsequent elements on the screen. Metrics such as First Contentful Paint, First Meaningful Paint and Visually Complete directly reflect what the user sees and when. But what if the page is invisible, when it loads in the background, for example in a different tab? Should we consider such views interesting for us? Don’t the collected metrics distort the results?
Developing our own A/B testing platform from scratch came with many lessons during the process. Our platform was built for people who differ vastly in terms of statistical knowledge, also for those who are not familiar with statistics at all. So the main challenge wasn’t technical implementation. The hardest thing was (and still is) developing good practices and spreading them among our users. We want to share some of our current knowledge that we think is crucial for A/B testing at scale. From this article, you will get to know about calculating sample size, what it means and what benefits it provides for you and your organization.