Probabilistic Data Structures and Algorithms in NoSQL databases
What would you say if we stored 1 000 records in a database, and the database claimed that there were only 998 of them? Or, if we created a database storing sets of values and in some cases the database would claim that some element was in that set, while in fact it was not? It definitely must be a bug, right? It turns out such behavior is not necessarily an error, as long as we use a database that implements probabilistic algorithms and data structures. In this post we will learn about two probability-based techniques, perform some experiments and consider when it is worth using a database that lies to us a bit.