Elusion v8.0.0 just dropped with something I'm genuinely excited about: native SQL execution and a new CopyData feature.
Functional API still going strong:
Write queries however you want - Unlike SQL, PySpark, or Polars, you can chain operations in ANY order. No more "wait, does filter go before group_by or after?" Just write what makes sense:
use elusion::prelude::*;

#[tokio::main]
async fn main() -> ElusionResult<()> {
    let sales = CustomDataFrame::new("sales.csv", "sales").await?;

    let result = sales
        .select(["customer_id", "amount", "order_date"])
        .filter("amount > 1000")
        .agg(["SUM(amount) AS total", "COUNT(*) AS orders"])
        .group_by(["customer_id"])
        .having("total > 50000")
        .order_by(["total"], ["DESC"])
        .limit(10)
        .elusion("top_customers")
        .await?;

    result.display().await?;
    Ok(())
}
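To show why the call order does not have to matter, here is a minimal sketch of one way an order-independent builder can work: each method only records its clause, and the SQL is assembled in the fixed clause order at the very end. This is not Elusion's actual implementation; `QuerySketch` and all of its methods are hypothetical names made up for illustration.

```rust
// Hypothetical sketch, NOT Elusion's internals: every builder call only
// records its clause, and to_sql() emits the clauses in SQL's fixed order.
#[derive(Default)]
struct QuerySketch {
    table: String,
    select: Vec<String>,
    filters: Vec<String>,
    group_by: Vec<String>,
    having: Vec<String>,
    order_by: Vec<String>,
    limit: Option<usize>,
}

impl QuerySketch {
    fn from_table(table: &str) -> Self {
        QuerySketch { table: table.to_string(), ..Default::default() }
    }

    fn select(mut self, cols: &[&str]) -> Self {
        self.select.extend(cols.iter().map(|c| c.to_string()));
        self
    }

    fn filter(mut self, cond: &str) -> Self {
        self.filters.push(cond.to_string());
        self
    }

    fn group_by(mut self, cols: &[&str]) -> Self {
        self.group_by.extend(cols.iter().map(|c| c.to_string()));
        self
    }

    fn having(mut self, cond: &str) -> Self {
        self.having.push(cond.to_string());
        self
    }

    fn order_by(mut self, col: &str, dir: &str) -> Self {
        self.order_by.push(format!("{} {}", col, dir));
        self
    }

    fn limit(mut self, n: usize) -> Self {
        self.limit = Some(n);
        self
    }

    // Clauses are emitted in the order SQL requires, regardless of the
    // order in which the builder methods were called.
    fn to_sql(&self) -> String {
        let cols = if self.select.is_empty() {
            "*".to_string()
        } else {
            self.select.join(", ")
        };
        let mut sql = format!("SELECT {} FROM {}", cols, self.table);
        if !self.filters.is_empty() {
            sql.push_str(&format!(" WHERE {}", self.filters.join(" AND ")));
        }
        if !self.group_by.is_empty() {
            sql.push_str(&format!(" GROUP BY {}", self.group_by.join(", ")));
        }
        if !self.having.is_empty() {
            sql.push_str(&format!(" HAVING {}", self.having.join(" AND ")));
        }
        if !self.order_by.is_empty() {
            sql.push_str(&format!(" ORDER BY {}", self.order_by.join(", ")));
        }
        if let Some(n) = self.limit {
            sql.push_str(&format!(" LIMIT {}", n));
        }
        sql
    }
}
```

Calling `QuerySketch::from_table("sales").limit(10).filter("amount > 1000").select(&["customer_id"]).to_sql()` and the same chain with the calls shuffled should give the same SQL string, because ordering is decided only inside `to_sql()`.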
Raw SQL when you need it - Sometimes you just want to write SQL. Now you can:
There is also a small sql! macro that simplifies usage and avoids passing &[&df] for each DataFrame included in the query.
use elusion::prelude::*;

#[tokio::main]
async fn main() -> ElusionResult<()> {
    let sales = CustomDataFrame::new("sales.csv", "sales").await?;
    let customers = CustomDataFrame::new("customers.csv", "customers").await?;
    let products = CustomDataFrame::new("products.csv", "products").await?;

    let result = sql!(
        r#"
        WITH monthly_totals AS (
            SELECT
                DATE_TRUNC('month', s.order_date) as month,
                c.region,
                p.category,
                SUM(s.amount) as total
            FROM sales s
            JOIN customers c ON s.customer_id = c.id
            JOIN products p ON s.product_id = p.id
            GROUP BY month, c.region, p.category
        )
        SELECT
            month,
            region,
            category,
            total,
            SUM(total) OVER (
                PARTITION BY region, category
                ORDER BY month
            ) as running_total
        FROM monthly_totals
        ORDER BY month DESC, total DESC
        LIMIT 100
        "#,
        "monthly_analysis",
        sales,
        customers,
        products
    ).await?;

    result.display().await?;
    Ok(())
}
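If the window function in that query is new to you, here is a rough plain-Rust sketch of what `SUM(total) OVER (PARTITION BY region, category ORDER BY month)` computes. It has nothing to do with how Elusion executes the query; `MonthlyTotal` and `running_totals` are hypothetical names, months are assumed to be "YYYY-MM" strings so that string order matches date order, and each (region, category, month) combination is assumed to appear once, as the GROUP BY in the CTE guarantees.

```rust
use std::collections::HashMap;

// Hypothetical row type mirroring the monthly_totals CTE.
#[derive(Debug, Clone, PartialEq)]
struct MonthlyTotal {
    month: String, // assumed "YYYY-MM", so lexicographic order is chronological
    region: String,
    category: String,
    total: i64,
}

// Returns every row paired with its running total inside its
// (region, category) partition, accumulated in month order.
fn running_totals(mut rows: Vec<MonthlyTotal>) -> Vec<(MonthlyTotal, i64)> {
    // ORDER BY month: sorting the whole input by month also orders each partition.
    rows.sort_by(|a, b| a.month.cmp(&b.month));

    // PARTITION BY region, category: one accumulator per partition key.
    let mut sums: HashMap<(String, String), i64> = HashMap::new();

    rows.into_iter()
        .map(|row| {
            let acc = sums
                .entry((row.region.clone(), row.category.clone()))
                .or_insert(0);
            *acc += row.total;
            let running = *acc;
            (row, running)
        })
        .collect()
}
```

Each partition keeps its own accumulator, so totals from one region and category never leak into another.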
COPY DATA:
You can now copy data from one file to another in true streaming fashion, in two ways: 1. Custom Configuration, 2. Simplified file conversion
// Custom Configuration
copy_data(
    CopySource::File {
        path: "C:\\Borivoj\\RUST\\Elusion\\bigdata\\test.json",
        csv_delimiter: None,
    },
    CopyDestination::File {
        path: "C:\\Borivoj\\RUST\\Elusion\\CopyData\\test.csv",
    },
    Some(CopyConfig {
        batch_size: 500_000,
        compression: None,
        csv_delimiter: Some(b','),
        infer_schema: true,
        output_format: OutputFormat::Csv,
    }),
).await?;

// Simplified file conversion
copy_file_to_parquet(
    "input.json",
    "output.parquet",
    Some(ParquetCompression::Uncompressed), // or Snappy
).await?;
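For anyone wondering what a batch size means in a streaming copy, here is a minimal standard-library sketch of the general idea: read lines one at a time, buffer at most `batch_size` of them, write the batch out, and repeat, so memory use stays bounded no matter how big the input is. This is not how Elusion's copy_data is implemented; `copy_in_batches` and `write_batch` are hypothetical helpers, and the tab-to-comma rewrite is a deliberately naive stand-in for real format conversion (it ignores quoting and escaping).

```rust
use std::io::{self, BufRead, Write};

// Hypothetical sketch, NOT Elusion's copy_data: streams delimited text from
// `reader` to `writer`, holding at most `batch_size` lines in memory.
// Returns the number of lines copied.
fn copy_in_batches<R: BufRead, W: Write>(
    reader: R,
    mut writer: W,
    in_delim: char,
    out_delim: char,
    batch_size: usize,
) -> io::Result<usize> {
    // Guard against a zero batch size, which would otherwise buffer everything.
    let batch_size = batch_size.max(1);
    let out_delim = out_delim.to_string();
    let mut batch: Vec<String> = Vec::with_capacity(batch_size);
    let mut copied = 0;

    for line in reader.lines() {
        // Naive delimiter rewrite; real CSV handling needs quoting rules.
        batch.push(line?.replace(in_delim, &out_delim));
        if batch.len() == batch_size {
            copied += write_batch(&mut writer, &mut batch)?;
        }
    }

    // Write whatever is left in the final, partially filled batch.
    copied += write_batch(&mut writer, &mut batch)?;
    writer.flush()?;
    Ok(copied)
}

// Writes and empties the current batch, returning how many lines it held.
fn write_batch<W: Write>(writer: &mut W, batch: &mut Vec<String>) -> io::Result<usize> {
    let n = batch.len();
    for line in batch.drain(..) {
        writeln!(writer, "{}", line)?;
    }
    Ok(n)
}
```

To try it on real files, wrap a `std::fs::File` in `std::io::BufReader` for the input and in `std::io::BufWriter` for the output. A larger batch size means fewer, bigger writes at the cost of more memory per batch.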
If this is the first time you are hearing about Elusion, below are some of its core features:
Microsoft Fabric - OneLake connectivity
Azure BLOB storage connectivity
SharePoint connectivity
FTP/FTPS connectivity
Excel file operations
PostgreSQL database connectivity
MySQL database connectivity
HTTP API integration
Dashboard Data visualization
CopyData High-performance streaming operations
Built-in formats: CSV, JSON, Parquet, Delta Lake, XML, EXCEL
Plus:
Redis caching + in-memory query cache
Pipeline scheduling with tokio-cron-scheduler
Materialized views
To learn more about the crate, visit: https://github.com/DataBora/elusion