Deletable Learned Bloom Filter

TL;DR

We explore deletable variants of Learned Bloom Filters (LBF) and Sandwiched LBF (SLBF) to support element deletion. Our two main proposals are: SLBF with two Counting Bloom Filters (CBFs) for minimal false positive rate (FPR), and LBF with three Standard Bloom Filters (SBFs) for guaranteed deletability at the cost of a controlled false negative rate (FNR). Simulation results confirm a trade-off between FPR, FNR, and deletability, offering practical design choices based on memory constraints and application needs.

Abstract

The Bloom Filter is a space-efficient probabilistic data structure widely used for approximate membership queries. More recently, the Learned Bloom Filter (LBF) has been introduced to improve performance by leveraging models trained on the data. However, both traditional Bloom Filters and LBFs lack native support for element deletion - a critical limitation for applications such as spam filtering and IP routing.

This study addresses this gap by extending the LBF and Sandwiched Learned Bloom Filter (SLBF) to support deletions while maintaining low false positive rates (FPR). We propose two primary designs: (1) an SLBF with two Counting Bloom Filters (CBFs) that enables deletion by replacing Standard Bloom Filters (SBFs) with CBFs, and (2) an LBF with 3 SBFs that guarantees deletion by explicitly tracking removed elements, introducing a controlled false negative rate (FNR) to optimize for FPR.

We derive analytical expressions for FPR, deletability, and FNR for each design and validate our approach through simulation-based evaluation. Results show that the SLBF with 2 CBFs achieves the lowest FPR, while the LBF with 3 SBFs provides perfect deletability at the cost of slightly increased FNR. These findings underscore a fundamental trade-off among FPR, FNR, and deletability, and offer new strategies for enabling deletion in LBF variants.

Errata

Deletability

The analysis of the deletability of the SLBF with two Counting Bloom Filters (CBFs) overlooks the impact of $\alpha \times n$ deletions deletions. This omission is problematic, as deletability is inherently dependent on the number of elements that have been removed. As deletions accumulate, the probability that a given element remains deletable changes. Specifically, in the SLBF with two CBFs, deletability tends to increase as more elements are deleted.

While the report formally defines deletability as the probability of successfully deleting the first inserted element, and the technical analysis adheres to this definition, the interpretation used in the evaluation deviates from it. In the experimental section, deletability is implicitly redefined as the probability that an element can be deleted after $\alpha \times n$ deletions deletions. This reinterpretation is inconsistent with the original definition and potentially misleading. In retrospect, a more appropriate approach would have been to define deletability explicitly as a function of the number of deletions, thereby capturing its dynamic behavior more accurately.

Citation

@misc{qiudlbf2025,
  author = {Yuxiang Qiu},
  title = {Deletable Learned Bloom Filter},
  year = {2025},
  url = {https://yuxqiu.github.io/assets/pdf/writtings/2025/dlbf.pdf},
  note = {Preprint}
}