All Data and AI Weekly #225 - 19 Jan 2026
( AI, Data, NiFi, Iceberg, Polaris, Streamlit, Flink, Kafka, Python, Java, SQL, MCP, LLM, RAG, Cortex AI, AISQL, Search, Unstructured Data )
Sorry for missing the Flink talk on Thursday; there was a death in my family.
❄️ Tim Spann's New Stuff ❄️
https://www.youtube.com/watch?v=9PkA6rzcsEo
https://www.youtube.com/watch?v=OYwvfK3U6LA
🚀 Top Headlines
OpenAI Introduces GPT-5.2 Codex
A massive leap forward in code generation and reasoning. This release promises to change how we approach automated development pipelines.
Read the Announcement
Snowflake & OpenAI: Cortex AI Evolution
Snowflake has officially announced deep integration with OpenAI for Cortex AI, streamlining the ability to bring LLMs to your data without moving it.
Read the Blog
Snowflake to Acquire Observe for $1 Billion
A major move in the observability space. Snowflake announces its intent to acquire Observe to deliver AI-powered observability at enterprise scale, bridging the gap between data and action.
Press Release | News Coverage | Why Observe?
2025 Databases Retrospective
Professor Andy Pavlo’s annual retrospective is out. A must-read summary of the winners, losers, and major shifts in the database landscape over the last year.
Read the Blog
Data Engineering in 2026: From ETL to Autonomy
An outlook on how the role of the data engineer is shifting from building pipelines to managing autonomous data systems.
Read Article
❄️ Snowflake Technical Deep Dives
- Real-Time CDC: A guide to low-latency Postgres CDC to Snowflake for real-time AI analytics. Medium Guide
- OpenFlow Integration:
- Iceberg & Interop: Connect Databricks Unity Catalog to Snowflake via Iceberg REST. Tutorial
- Optimization:
- Apps: Build a News Sentiment Analytics app completely with prompts. Tutorial
- Community:
❄️ Snowflake Ecosystem Updates
The Snowflake platform continues to evolve with a focus on governance and data apps.
- Trust Center & Classification: New features for sensitive data classification are now live. Release Notes
- Data Clean Rooms (DCR): Updates to DCR functionality for secure data sharing. Documentation
- Schema Evolution: Snowpipe Streaming now supports schema evolution, reducing pipeline breakage. Learn More
- Performance Tuning:
- Warehouse Sizing Guide
- Query Insights - Specific recommendations for per-query improvements.
- Views & Lineage:
🧊 Data Engineering: Iceberg, Flink & Streaming
Real-time architectures are shifting heavily toward open table formats.
- Deep Dive PDF: Real-Time AI Pipeline Architectures with Flink SQL, NiFi, Kafka, and Iceberg. View Slides
- Apache Polaris: Mapping legacy and heterogeneous data lakes in Polaris. Blog Post
- Implementation Guides:
🤖 AI Agents & LLM Tooling
The "Agentic" workflow is becoming the standard for enterprise AI.
- Connecting Agents: Bridging Snowflake Cortex to the A2A (Agent-to-Agent) protocol. Medium Article | GitHub Repo
- Healthcare Agents: A curated list of awesome AI agents for healthcare. GitHub Repo
- Asset Management: Agentic AI for Asset Management guide. GitHub Repo
- Pocket TTS: High-quality text-to-speech generation. Kyutai Labs
- Java for AI: Java 25 features JBang notebooks and GraalPy for interactive learning. JavaPro Article
🤖 AI Agents, Cortex & LLMs
- Cortex & Agents Resources:
- Learning & Frameworks:
- 30 Days of AI: A streamlined learning path for building AI apps with Streamlit. GitHub
- TruLens + LlamaIndex: Evaluating Agent workflows. Notebook
- Auto-Claude: Automating tasks with Claude. GitHub
- LLMRouter: Optimized routing for LLM queries. GitHub
- VideoRAG: Retrieval Augmented Generation for video content. GitHub
🛡️ Security & OSINT Toolkit
A large collection of Open Source Intelligence and security tools dropped this week.
- IntelOwl: Open source Intelligence Management Platform. GitHub
- DataSploit: OSINT Framework to perform reconnaissance. GitHub
- TheHive Cortex: Analyzers for the Security Orchestration platform (Note: Distinct from Snowflake Cortex). GitHub
- Osweep: Open Source Wireless Security Auditing. GitHub
- Web Check: All-in-one website OSINT tool. Website
- Snowflake Security:
🛠 Developer Toolkit
New utilities to speed up your workflow.
- Data Lake Explorer: A tool for navigating your data lake. GitHub
- Frogmouth: A markdown viewer for the terminal. GitHub
- Glow: Render markdown on the CLI. GitHub
- Bat: A cat clone with wings (syntax highlighting). GitHub
- Kangaroo: SQL client for popular databases. GitHub
🛠 Developer Utilities
- Taws: A Terminal User Interface (TUI) for AWS. GitHub
- Hardwood: A Kafka log viewer/tool. GitHub
- Gastown: Steve Yegge's latest project. GitHub
- VSCodium: Free/Libre Open Source Software Binaries of VS Code. Why VSCodium?
- Quantum Tunnel: Expose your local service to the internet. Docs
- Strix: GitHub
- YT Playlist Downloader: GitHub
📺 Watchlist & Events
- Webinar: Architect for AI with Open Formats (AWS + Snowflake). Watch Here
- Upcoming Meetup: NY Apache Flink Meetup. RSVP Here
- Video: Java 25 & Interactive Learning
- Video: Latest AI Talks
- Webinar: Build Your Own Enterprise Intelligence Agent (Jan 8, 2026). Watch Replay
- Event: Retail AI @ Hakkoda. Event Page
- Video: Featured Tech Talk
Data
https://data.nj.gov/Transportation/Traffic-Counts-Data/c74r-6c8d/about_data
https://open-data-portal-njdot.hub.arcgis.com/maps/9b7498a04a6146ff859c07ac242ccbd1/about
https://open-data-portal-njdot.hub.arcgis.com/datasets/ab33be6e4a51439c9ed179809b4d99cf_0/explore
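If you want to pull the traffic counts into a pipeline, here is a minimal sketch. It assumes data.nj.gov is a Socrata-hosted portal, so the dataset ID in the page URL (`c74r-6c8d`) maps to a SODA JSON endpoint under `/resource/`. The helper name `soda_endpoint` and the `limit` default are mine, not part of any library, and the sketch only builds the URL; it does not call the API.

```python
import re
from urllib.parse import urlparse


def soda_endpoint(page_url: str, limit: int = 5) -> str:
    """Build a SODA JSON endpoint from a Socrata dataset page URL.

    Assumption: the portal follows the usual Socrata layout, where a
    dataset page ends in /<4x4 id>/about_data and the API lives at
    /resource/<4x4 id>.json on the same host.
    """
    parsed = urlparse(page_url)
    # Socrata dataset IDs are two groups of four lowercase alphanumerics
    # joined by a hyphen, e.g. c74r-6c8d.
    match = re.search(r"/([a-z0-9]{4}-[a-z0-9]{4})(?:/|$)", parsed.path)
    if match is None:
        raise ValueError(f"no Socrata dataset id found in {page_url!r}")
    dataset_id = match.group(1)
    # $limit is the SODA query parameter that caps the number of rows returned.
    return f"{parsed.scheme}://{parsed.netloc}/resource/{dataset_id}.json?$limit={limit}"


endpoint = soda_endpoint(
    "https://data.nj.gov/Transportation/Traffic-Counts-Data/c74r-6c8d/about_data"
)
print(endpoint)
```

The resulting URL can be handed to whatever fetches data in your stack (a NiFi InvokeHTTP processor, `requests`, and so on). The two ArcGIS Hub links use a different URL scheme, so this helper raises `ValueError` for them.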
🍭 2026: SNACK-AI
As we look forward in 2026, we are introducing SNACK-AI. This architectural pattern is designed to power AI applications at any scale—from mobile phones and robots to massive enterprise clusters.
SNACK-AI integrates:
- Snowflake Platform
- NiFi (Apache NiFi)
- Apache Iceberg
- Cortex AI (Agents/Search/MCP)
- Kafka (Apache Kafka)
Deep Dive into SNACK-AI:
- Read: SNACK-AI: The 2026 Pattern (Medium)
- Code: Official SNACK-AI GitHub Repository
- Weekly Context: SNACK-AI Overview
Thanks
https://github.com/timothyspann
© 2020-2026 Tim Spann https://www.youtube.com/@FLaNK-Stack