My Project on Github
https://github.com/nkwork9999/icedduck
The Problem
When implementing a data visualization extension for DuckDB, I encountered a critical issue: executing SQL queries within a scalar function context caused the entire database to deadlock. The original implementation looked like this:
// This causes deadlock!
auto x_result = context.Query(x_query_str, false);
auto y_result = context.Query(y_query_str, false);
The Solution: Using LIST Types
Instead of accepting SQL query strings and executing them within the function, the solution was to accept pre-aggregated LIST types directly. This approach is both safer and more idiomatic for DuckDB extensions.
Key Changes
- Function Signature Update
// Before: Accepting query strings
auto bar_chart_func = ScalarFunction(
"bar_chart",
{LogicalType::VARCHAR, LogicalType::VARCHAR, LogicalType::VARCHAR},
LogicalType::VARCHAR,
BarChartFunction
);
// After: Accepting LIST types
auto bar_chart_func = ScalarFunction(
"bar_chart",
{LogicalType::LIST(LogicalType::VARCHAR),
LogicalType::LIST(LogicalType::DOUBLE),
LogicalType::VARCHAR},
LogicalType::VARCHAR,
BarChartArrayFunction
);
- Data Extraction Logic
// Extract data from LIST types
auto x_list_value = x_list.GetValue(0);
if (!x_list_value.IsNull() && x_list_value.type().id() == LogicalTypeId::LIST) {
auto &x_children = ListValue::GetChildren(x_list_value);
for (idx_t i = 0; i < x_children.size(); i++) {
if (i > 0) x_data += ",";
x_data += x_children[i].ToString();
}
}
How to Use
Prerequisites
- Build the Rust chart viewer:
cd rust_hello_duck
cargo build --release
cd ..
- Build the DuckDB extension:
make clean
make
Running the Extension
-- Start DuckDB
./build/release/duckdb
-- Load the extension
LOAD 'build/release/extension/quack/quack.duckdb_extension';
-- Create sample data
CREATE TABLE sales (category VARCHAR, amount DECIMAL);
INSERT INTO sales VALUES
('Electronics', 1500.50),
('Clothing', 800.25),
('Food', 1200.00);
-- Generate bar chart using LIST aggregation
SELECT bar_chart(
LIST(category ORDER BY amount DESC),
LIST(amount ORDER BY amount DESC),
'Sales Chart'
) FROM sales;
What Happens
- The
LIST()
functions aggregate the data in DuckDB's query engine - The aggregated data is passed to the extension function
- The function writes the data to a temporary file
- A separate Rust process is spawned to display the chart using Iced GUI
Advanced Usage
You can use more complex aggregations:
-- Group by month and sum revenues
WITH monthly_data AS (
SELECT
month,
SUM(revenue) as total_revenue
FROM monthly_sales
GROUP BY month
ORDER BY total_revenue DESC
)
SELECT bar_chart(
LIST(month),
LIST(total_revenue),
'Monthly Revenue Analysis'
) FROM monthly_data;
Key Takeaways
- Avoid Nested Queries: Never execute SQL queries within DuckDB function implementations
- Use Aggregate Types: LIST types are perfect for passing collections of data
- Separate Concerns: Data aggregation happens in SQL, visualization in the extension
- Process Isolation: GUI applications should run in separate processes to avoid threading issues
Performance Benefits
This approach is not only safer but also more performant:
- No query parsing overhead within the function
- Direct memory access to aggregated data
- Better query optimization by DuckDB's planner
The complete implementation demonstrates how proper function design can avoid common pitfalls in database extension development while providing a clean, efficient interface for users.
Top comments (0)