Dubo ranks number 1 in accuracy on BIRD

November 27, 2023

We’re the #1 most accurate text-to-SQL model on the BIRD benchmark.

Progress with BIRD

Early on, we noticed that many text-to-SQL benchmarks didn’t represent a real-world business environment. Our jobs as data scientists often involved messy data, convoluted multi-part JOINs, and confusing WHERE clauses that filter nested JSON. Some well-known benchmarks mostly covered a single table or a small set of JOINs. They lacked "ecological validity": the evaluation failed to map to the real-world context it was meant to measure.

The BIRD benchmark is designed to be more representative of real-world SQL. Developed by researchers from the University of Hong Kong, Tsinghua University, MIT CSAIL, the University of Illinois at Urbana-Champaign, and elsewhere, BIRD contains phenomena like large tables with many values and considers data across industries from entertainment to healthcare. We think it is a much better representation of corporate databases.

We are pleased to have achieved this result, and thank the BIRD team for running the benchmark.