Thesis Defense: Albert Kim, Title: Optimizing Queries with Disjunctions


Albert Kim


Sam Madden
Abstract: Despite decades of research into query optimization, optimizing queries with disjunctive predicate expressions remains a challenge. Solutions employed by existing systems (if any) are often simplistic and lead to much redundant work being performed by the execution engine. In this thesis, we present a two-pronged approach to address this problem: (1) For single-table queries, we provide a formal analysis of the problem for column-oriented engines, and based on our analysis, we present several algorithms capable of generating optimal predicate evaluation plans. Our algorithms are polynomial-time (unlike existing works), theoretically proven to produce optimal plans, and can be implemented in existing systems with minimal effort. (2) For multi-table queries, we present a novel form of query execution called tagged execution, which groups tuples into subrelations based on the predicates they satisfy and tags them with that information. Query operators can take advantage of the additional context provided by tags during runtime, and this allows them to eliminate much of the redundant work performed by traditional engines and even achieve pushdown optimizations for disjunctive predicates. Both our approaches greatly outperform existing solutions, sometimes by orders of magnitude, and we believe the combination of a theoretical approach (1) and a practical approach (2) helps meet the needs of most users.