Articles tagged with big data





29 Jun 2017

Presto - a small step for DevOps engineer but a big step for BigData analyst

I bet you found this article after googling some of the issues you encounter when working with a Hadoop cluster. You probably deal with Hive queries for exploratory data analysis that take far too long to run. Moreover, you cannot adopt Spark in your organization for every use case, because writing jobs requires fairly strong programming skills. Clogged YARN queues might be your nightmare, and waiting for a container to launch when you run even a small query drives you mad. Before we deployed Presto, a fast SQL engine developed by Facebook, our analysts struggled with these problems on a regular basis.


26 Jan 2017

Estimating the cache efficiency using big data

Caching is a well-known technique used to increase application performance and decrease overall system load. Usually, small or medium data sets that are read often and changed rarely are considered good candidates for caching. In this article we focus on determining the optimal cache size using big data techniques.


17 Dec 2014

Big Data Spain 2014 review

Big Data Spain is an annual conference on Big Data and related topics held in the suburbs of Madrid. This year’s edition, the third, has been the biggest so far; it attracted more than 500 guests and various speakers, including Big Data celebrities like Paco Nathan of Databricks. During the two days of the conference, guests could attend many keynotes, talks and workshops and learn about various products, services and specific use cases, in both English and Spanish. Allegro was represented by two employees with a presentation on Hadoop pitfalls and gotchas.


05 Nov 2014

Hadoop World 2014 New York from a developer’s point of view

This year’s edition of Strata Hadoop World, held in New York, was humongous: 16 workshops, over 20 keynotes, over 130 talks and, most importantly, over 5000 attendees! This massive crowd wouldn’t fit in the Hilton hotel where the previous edition was held. That is why the organizers had to move the conference to the Javits Conference Center, an enormous building in which Big Data believers occupied just one sector. The fact that the European edition of Hadoop Summit experienced exactly the same transition (the third edition is going to be held in a bigger location in Brussels) gives pleasant assurance that Big Data technologies are still a hot topic and that the Big Data community is growing at a steady pace.