DEV Community: Asaf Lecht | אסף לכט

Deployment grind and basic fixes

Asaf Lecht | אסף לכט — Fri, 15 May 2026 00:00:00 +0000

This past week was a mixed bag for the FloorSight Video Analysis project. We’re all about taking raw video and pulling out useful insights, like activity patterns or object tracking. The goal is to turn hours of unwatchable footage into actionable data. Most of my time went into deployment, but I squeezed in some bug fixes too.

Upload crash fix: I had a classic “duh” moment. Video uploads were crashing because I forgot to include the width and height fields in the VideoInfo object. It was a simple data-integrity check that I completely missed, and it caused a real headache.
Coolify deployment was a grind: Getting Coolify to play nice took up most of my week. I spent ages documenting its API in CLAUDE.md, wrote a coolify_manager.py script for debugging, and dug into environment variables like COOLIFY_API_TOKEN.
Real-time logs, finally: The biggest win was adding PYTHONUNBUFFERED=1 to the Dockerfile. This might sound small, but it was a game-changer for getting real-time logs. Before that, Python was buffering its output inside the container, so logs showed up late and debugging was a nightmare. Talk about fighting the runtime for basic visibility.
Schema validation and polite errors: Tidied up some schema validation errors that kept popping up. Also, made sure our user-facing error messages were a bit more polite, hiding internal model names from users. Just good practice.
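A minimal Python sketch of the two fixes above: requiring width and height up front, and keeping internal names out of user-facing errors. This is not the project's actual code. The filename field, the UploadError class, and parse_video_info are hypothetical; the post only confirms that VideoInfo has width and height.

```python
from dataclasses import dataclass


class UploadError(Exception):
    """Error whose message is safe to show to end users."""


@dataclass(frozen=True)
class VideoInfo:
    # Hypothetical shape: the post only confirms that width and height exist.
    filename: str
    width: int
    height: int


def parse_video_info(payload: dict) -> VideoInfo:
    """Build a VideoInfo, rejecting payloads with missing or invalid dimensions."""
    missing = [key for key in ("filename", "width", "height") if key not in payload]
    if missing:
        # Name the missing fields, not the internal class, in the message.
        raise UploadError("Upload is missing required fields: " + ", ".join(missing))
    width, height = payload["width"], payload["height"]
    if not (isinstance(width, int) and isinstance(height, int) and width > 0 and height > 0):
        raise UploadError("Video width and height must be positive whole numbers.")
    return VideoInfo(filename=payload["filename"], width=width, height=height)
```

Failing at the boundary like this turns a crash deep in the pipeline into a clear message at upload time.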

Honestly, it was frustrating at times. You fix one thing, and another deployment hurdle pops up. It felt like a constant battle against configuration and environment issues instead of building new features. But there’s a certain satisfaction in wrestling with these infrastructure demons and coming out on top. It’s a stark reminder that building software isn’t just about writing features, it’s about making sure it can actually live somewhere reliably. I learned a ton about deployment pipelines and the importance of good observability. Felt like a level-up in my devops understanding, even if it wasn’t the ‘fun’ coding I usually prefer.

Next: Continue stabilizing deployment, and maybe, just maybe, some actual feature work.

The great recorder CLI pivot

Asaf Lecht | אסף לכט — Fri, 15 May 2026 00:00:00 +0000

My google-recorder-cli project is all about bringing Google Recorder’s transcription and search to the command line. I use Recorder constantly, but I always wanted more programmatic control, like searching recordings from the terminal or bulk exporting transcripts. This is my personal quest to make that happen. The last week or so has been a whirlwind.

API archaeology: I kicked things off with a deep dive into the underlying service’s API. I spent a “live discovery session” mapping out endpoints, requests, and responses. It felt like being an archaeologist, digging through network requests. Documenting a pretty full API surface felt like a huge win.
Hitting the API wall: After all that discovery, the hard truth hit: directly interacting with the API was going to be a nightmare. Authentication was complex, payloads intricate, and I felt like I was constantly fighting anti-bot measures. I made the tough call: a complete architectural rewrite for Phase 2-3.
Browser-as-runtime: I pivoted to a “browser-as-runtime” approach, using a headless browser to interact with the web interface, managed by a “watchdog” process. It feels a bit like admitting defeat on the direct API front, but it’s a pragmatic solution. This also meant adding an API monitoring strategy to the TODO list, because a web interface can change.
Squashing a classic: I squashed a classic bug: a hardcoded path in snapshot-a11y.js. Rookie mistake, but it’s always satisfying to replace a brittle path with os.tmpdir().
Polishing the CLI: I’ve been looking at patterns from other great CLIs, like the millionco/expect style, and started incorporating some of those improvements. It’s not just about making it work, it’s about making it feel good to use. I also updated author credits in the docs, adding my Hebrew name and clarifying I modified it, not built the original service.

Honestly, the biggest struggle was that API wall. It was frustrating to spend so much time on discovery only to realize the direct approach was a dead end. But the pivot to “browser-as-runtime” felt like a creative workaround, even if it adds complexity I initially wanted to avoid. It’s a reminder that sometimes the “ideal” solution isn’t feasible, and you have to find a robust “working” solution. Adapting felt good, rather than getting stuck. Seeing the CLI start to feel more polished is always a boost.

Next: Implement core recording search and export functionality using the new architecture.

Building a content sync pipeline: The struggle with large files and CI

Asaf Lecht | אסף לכט — Tue, 12 May 2026 00:00:00 +0000

This past week, I kicked off the halachot-orvishua project. The idea is a reliable pipeline to pull Jewish law content from orvishua.net. I want an automated system that checks for new content, downloads it, and then tells me what changed, especially which “mega-files” need re-uploading. It’s about keeping a local, updated copy and knowing what’s fresh.

This period was mostly about setting up the core infrastructure and figuring out the content flow.

Scripts-only repo, content on Google Drive: My first big decision was to make this a “scripts-only repo.” The actual content wouldn’t live in Git; it would be on Google Drive. This immediately caused problems.
CI facepalm: My initial CI setup tried to download content. That was a no-go, given the “scripts-only” decision and the sheer size of the content. I had to quickly switch it to a “detect-only” mode where the workflow doesn’t try to pull down the actual files. Total facepalm moment: I had set CI up to do the exact opposite of what the repo was designed for.
Core logic: change detection and notifications: I got the system to detect changes. Crucially, it now specifies which sources needed re-uploading and exactly which mega-files were affected. This was a big win, the whole point of the project.
Evolving content strategy: The “scripts-only” idea felt clean, but then I realized I needed some representation of the output in Git for tracking. I implemented syncing the output to a git-tracked directory. It felt like a necessary compromise. Later, I reorganized everything: scripts into scripts/, tracked content into content/, using new_content/ and content_index.json for systematic management. It’s a constant battle between keeping it simple and making it robust for large, dynamic data.
Line endings, of course: Had to add a .gitattributes file and normalize line endings to LF. It’s always line endings.
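The change-detection idea above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the repo's script: the index layout (item id mapped to a content hash), the mega_file_of mapping, and all function names are hypothetical. The real content_index.json may look different.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash used to spot edits between sync runs."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(previous_index: dict, current_items: dict, mega_file_of: dict) -> dict:
    """Compare a stored index against freshly fetched items.

    previous_index: item id -> hash from the last run (hypothetical layout)
    current_items:  item id -> content fetched in this run
    mega_file_of:   item id -> name of the mega-file that bundles the item
    """
    new_ids, changed_ids = [], []
    for item_id, text in current_items.items():
        digest = fingerprint(text)
        if item_id not in previous_index:
            new_ids.append(item_id)
        elif previous_index[item_id] != digest:
            changed_ids.append(item_id)
    # Any mega-file containing a new or changed item needs re-uploading.
    affected = sorted({mega_file_of[i] for i in new_ids + changed_ids if i in mega_file_of})
    return {"new": sorted(new_ids), "changed": sorted(changed_ids), "mega_files": affected}
```

Because only hashes and file names are compared, a detect-only CI run could produce this report without storing the content itself in Git.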

I finally got a successful sync run, pulling down 2 new Q&A items and confirming 0 new articles. That little green checkmark felt good.

Honestly, the hardest part was wrestling with the content strategy. “Scripts-only, content on Drive” seemed clean, but the practicalities of CI, tracking changes, and needing some versioned output meant constant re-evaluation. It felt like I was building the plane while flying it. The “mega-files” aspect really drove home that Git isn’t always the right tool for all data, but you still need a way to track metadata about that data. Architectural decisions are rarely one-and-done; they evolve as you understand the problem better.

Next: Refine the notification content and potentially add more robust error handling.

VisionMaker: Growing pains and pruning gains

Asaf Lecht | אסף לכט — Tue, 12 May 2026 00:00:00 +0000

VisionMaker started as a focused pipeline for annotating visual data, specifically for material science involving silica. It’s meant to be a smart assistant for our ARAD team, automating image and video frame labeling. This past week, things got wild.

SSIM for data deduplication: We got the initial ARAD silica pipeline running, but immediately hit a wall with data volume. Implementing SSIM (Structural Similarity Index Measure) for frame deduplication was a huge win. I verified a ~40% drop in frame count at a 0.95 similarity threshold, which felt amazing. That’s a massive cut in processing time and storage.
Monorepo architecture shift: VisionMaker was outgrowing its initial structure. I bit the bullet and restructured everything into a monorepo with a proper per-project layout. This wasn’t just moving files; it was rethinking how components interact and scale. The mental overhead of ensuring nothing broke was intense, like open-heart surgery on a running system.
Massive cleanup and standardization: Along with the monorepo, I dropped an old “junction” component that was more trouble than it was worth. Standardized all labeling naming conventions (finally!) and set up consistent subfolder reports. It’s wild how much clearer things get when naming is consistent. I also pruned older cluster and silica_filter tools, plus some outdated reports/ and references/ directories. Clearing out that cruft felt good, even if some of those tools were “my babies” once.
YOLO upgrade for object detection: We retired OWLv2 and brought in YOLO. This meant setting up both training and prediction pipelines for YOLO, then restructuring the documentation to reflect all these changes. Switching core models is always a bit nerve-wracking, making sure the new one performs and integrates smoothly.
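To make the SSIM deduplication step concrete, here is a simplified, dependency-free Python sketch. It computes a single global SSIM value over the whole frame, whereas imaging libraries usually average SSIM over local windows, so the scores will not match those exactly. Frames are assumed to be flat sequences of 8-bit grayscale values. The function names are hypothetical, and comparing each frame to the last kept frame is one possible policy, not necessarily the one VisionMaker uses.

```python
def global_ssim(a, b, dynamic_range=255.0):
    """Single-window SSIM over two equal-length grayscale pixel sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / n
    var_b = sum((y - mean_b) ** 2 for y in b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    # Standard stabilizing constants from the SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    return ((2 * mean_a * mean_b + c1) * (2 * cov + c2)) / (
        (mean_a ** 2 + mean_b ** 2 + c1) * (var_a + var_b + c2)
    )


def dedupe_frames(frames, threshold=0.95):
    """Return indices of frames to keep.

    A frame is kept only if its SSIM against the last kept frame is below
    the threshold, i.e. it is different enough to be worth labeling.
    """
    kept = []
    for index, frame in enumerate(frames):
        if not kept or global_ssim(frames[kept[-1]], frame) < threshold:
            kept.append(index)
    return kept
```

Raising the threshold keeps more near-duplicates; lowering it drops more frames at the risk of losing subtle changes.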

This week felt like a microcosm of software development. Start with a problem, build a solution, immediately hit a scaling issue, solve that, then realize the whole structure needs an overhaul because the project is growing. The satisfaction of seeing the SSIM dedupe work, and the clarity from the monorepo and cleanup, was immense. It’s a constant reminder that code isn’t just about new features; it’s about tending the garden, pruning, reorganizing, and sometimes replanting entirely. I’m learning to embrace the messiness of growth and the discipline of cleanup.

Next: Integrating new data sources and refining the YOLO models further.

My AI digest project: From weekend hack to operational beast

Asaf Lecht | אסף לכט — Fri, 08 May 2026 00:00:00 +0000

This whole thing started simple. I’m in dozens of messaging groups, everything from professional communities to neighborhood chats. Keeping up was impossible. I’d open an app and see 500+ unread messages across 30 groups. Most of it was noise, but the important stuff was buried deep.

So I built a monitor. It watches my groups and generates an AI-powered digest, a concise summary of everything important, organized by topic. Gemini handles the AI part. That worked great. Then, naturally, scope creep kicked in.

Now it also:

Ingests email newsletters via IMAP polling and includes them in the digest.
Filters online articles using the LLM as a relevance judge based on my interest profile.
Has separate digest pipelines for different content categories (tech, community, religious content).
Auto-forwards the daily digest to relevant groups with randomized timing.
Respects schedule constraints: it knows about holidays and off-hours.
Sends push notifications when something critical happens, like session loss or system failures.

Honestly, the operational side was half the project, maybe more.

Fighting Windows and Docker: My system runs in Docker on my Windows desktop. Docker Desktop crashes sometimes, and Windows loves sleep mode or random updates. I had to build a whole keep-awake system just to stop the PC from sleeping. Talk about fighting the OS!
I built a watchdog with escalating recovery: restart container, then restart Docker, then alert me.
Catch-up logic and duplicates: When the system came back from downtime, it needed to process missed messages without creating duplicates. I cleaned up over 2000 duplicate records from the early days before I got dedup right. Felt amazing when that finally worked.
LLM prompt engineering: Getting the digest quality right took many iterations. Gemini would either be too verbose (defeating the purpose) or miss important things. I ended up giving the prompt group names and content categories as context, and telling it to be as long as needed and to skip only truly redundant content.
Some messaging groups have permanently broken API responses. No fix available upstream. I just had to accept it and handle the errors gracefully.
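The escalating recovery described above (restart the container, then Docker, then alert) can be sketched as a small ladder in Python. This is an illustration, not the actual watchdog: the recover function and its callbacks are hypothetical, and in practice each step would wrap whatever command restarts the container or Docker on the host.

```python
from typing import Callable, Sequence, Tuple


def recover(
    is_healthy: Callable[[], bool],
    steps: Sequence[Tuple[str, Callable[[], None]]],
    alert: Callable[[str], None],
) -> str:
    """Try recovery steps from least to most disruptive; alert if none helps.

    Returns the name of the step that restored health, "healthy" if nothing
    was needed, or "alerted" when every step failed.
    """
    if is_healthy():
        return "healthy"
    for name, action in steps:
        try:
            action()
        except Exception as exc:  # a failed step should not stop the escalation
            alert(f"recovery step {name!r} raised: {exc}")
        if is_healthy():
            return name
    alert("all recovery steps failed; manual attention needed")
    return "alerted"
```

Keeping the health check and the actions as plain callables makes the ladder easy to test without touching Docker at all.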

This started as a weekend hack. Now it’s the tool I check every morning before anything else. It’s like having a personal research assistant that reads everything I can’t and tells me what matters. I showed this to a friend and they said, “you built a product.” I don’t know about that, but it’s definitely the most complex system I’ve built on my own.

Next: Migrating it to a Raspberry Pi so it doesn’t depend on my desktop being awake.

Scraping Hebrew news: A deep dive into unexpected complexity

Asaf Lecht | אסף לכט — Fri, 08 May 2026 00:00:00 +0000

I thought scraping clean article text from Israeli news sites would be a quick win. Just grab the body, right? Turns out, that’s one of those problems that looks easy until you actually try it, especially with Hebrew content. I’ve now got a web service running on Google Cloud Run that handles 17 major Israeli news outlets, but getting here was a proper fight.

Most off-the-shelf extraction libraries (like newspaper3k) are built for English. They make assumptions about text direction, paragraph structure, and encoding that just fall apart with Hebrew. RTL text handling adds a whole layer of pain.
The tech stacks are wild. One site uses DraftJS, another is AMP, a third has a WordPress API, and a fourth runs on some custom CMS from the early 2000s. Each one needed a unique approach.
Some sites are actively hostile to automation. They serve completely different content depending on who’s asking, or they’re just plain unreachable from cloud infrastructure. Talk about a hostile environment!
My solution ended up being a chain of per-site extraction strategies. If the primary CSS selector fails, I try the next one. If that fails, I hit the site’s API if it has one. Only as a last resort do I fall back to a generic library.
The real work isn’t building it, it’s keeping it alive. I’ve done two full audit rounds across all 17 sites, fixing bugs on about half of them. Selectors drift constantly as sites update their markup. It’s a never-ending reverse-engineering project.
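The chain of per-site strategies can be sketched in Python like this. It is a minimal illustration, not the service's code: the extract_article name, the (name, callable) strategy format, and the min_chars usability check are assumptions.

```python
from typing import Callable, Iterable, Optional, Tuple

Extractor = Callable[[str], Optional[str]]


def extract_article(
    html: str,
    strategies: Iterable[Tuple[str, Extractor]],
    min_chars: int = 200,
) -> Tuple[Optional[str], Optional[str]]:
    """Run strategies in priority order.

    Returns (strategy_name, text) for the first result that is long enough
    to be a plausible article body, or (None, None) if every strategy fails.
    """
    for name, strategy in strategies:
        try:
            text = strategy(html)
        except Exception:
            continue  # a broken selector or API error just means "try the next one"
        if text and len(text.strip()) >= min_chars:
            return name, text.strip()
    return None, None
```

Returning the winning strategy's name alongside the text gives a cheap signal for audits: if a site suddenly starts resolving through the generic fallback, its primary selector has probably drifted.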

This project taught me that building a scraper is maybe 20% of the job. Maintaining it is the other 80%. Every site redesign or new content protection layer means another mini-battle. I’m seriously considering whether a browser extension approach might be more sustainable long-term. The interesting part here isn’t the code, it’s the constant problem-solving and learning how the web actually works behind the scenes.

Next: Investigate browser extension feasibility for more robust, long-term extraction.

Automating invoices with Apps Script and Gemini

Asaf Lecht | אסף לכט — Tue, 05 May 2026 00:00:00 +0000

Just shipped something I’m genuinely proud of. A client’s accounting team was buried under manual data entry, opening dozens of supplier invoice PDFs each week, reading details, and typing everything into their ERP. It took 15-30 minutes per invoice, mind-numbingly tedious work.

I built a system to automate this with AI, running entirely on Google Apps Script. No servers, no deployment pipeline, no infrastructure to maintain. The team already lives in Google Sheets, so the whole thing sits right inside their existing workflow. A file watcher monitors their inbox, Gemini AI extracts structured data, and if the fast model isn’t confident, it upgrades to a larger one. Results land in a spreadsheet, and the system generates ERP export files. I even built a Hebrew web dashboard for review.

The challenges were not what I expected:

Multi-invoice PDFs: Some suppliers bundle five or more invoices into one file. I had to detect these, split them, extract unique invoice numbers, and then merge rows for the same invoice if they spanned pages. That was a fun puzzle.
AI hallucinated dates: Gemini just made up dates sometimes. Completely fabricated. I had to add validation ranges and trigger re-extraction with the stronger model when dates looked suspicious. Trust, but verify, right?
Hebrew RTL rendering: The web dashboard, built for review, was a headache with right-to-left text layout. Mixing Hebrew with numbers and English text broke in surprising ways, because bidirectional text gets reordered when it is displayed. Fighting the browser on that one.
Apps Script 6-minute timeout: Processing dozens of invoices meant hitting Apps Script’s execution limit. I had to build a checkpoint and resume system so processing could continue across multiple runs.
Strict ERP import format: The client’s ERP expected an extremely specific format: specific tab delimiters, DD/MM/YY dates, exact column positions. One wrong character and the import would fail silently. Debugging those was a joy.
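The checkpoint-and-resume idea behind the timeout workaround can be sketched as follows. The real system runs on Apps Script; this Python version only illustrates the pattern, and the function name, the JSON checkpoint file, and the injectable clock are all assumptions.

```python
import json
import time
from pathlib import Path


def process_with_checkpoint(items, handle, checkpoint_path, budget_seconds, clock=time.monotonic):
    """Process items in order, stopping before the time budget runs out.

    Progress is stored as the index of the next unprocessed item, so a later
    run picks up where this one stopped. Returns True once everything is done.
    """
    path = Path(checkpoint_path)
    start_index = json.loads(path.read_text())["next_index"] if path.exists() else 0
    deadline = clock() + budget_seconds
    for index in range(start_index, len(items)):
        if clock() >= deadline:
            path.write_text(json.dumps({"next_index": index}))
            return False  # out of time; schedule another run
        handle(items[index])
        # Save after each item so a hard kill loses at most one item of work.
        path.write_text(json.dumps({"next_index": index + 1}))
    path.unlink(missing_ok=True)  # finished: clear the checkpoint
    return True
```

Setting the budget comfortably below the platform limit leaves headroom for the last item to finish before the run is cut off.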

I’m especially proud of the model fallback logic. Flash handles about 85% of invoices correctly. The ones it struggles with (bad handwriting, unusual layouts) automatically escalate. I save both model responses for comparison, which helps me improve prompts over time.
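A rough Python sketch of the fallback logic, combined with the date sanity check mentioned earlier. The model callables, the confidence field, the 0.8 threshold, and the one-year date window are hypothetical stand-ins; the post does not say how confidence is measured or which ranges are used.

```python
from datetime import date, timedelta


def date_is_plausible(invoice_date: date, today: date, max_age_days: int = 365) -> bool:
    """Accept dates that are not in the future and not older than max_age_days."""
    return today - timedelta(days=max_age_days) <= invoice_date <= today


def extract_with_fallback(pdf_bytes, fast_model, strong_model, today, min_confidence=0.8):
    """Try the fast model first; escalate when it is unsure or the date looks wrong.

    Both models are callables returning a dict with "invoice_date" (a date)
    and "confidence" (0..1). Every response is kept for later comparison.
    """
    responses = {"fast": fast_model(pdf_bytes)}
    result = responses["fast"]
    suspicious = (
        result.get("confidence", 0.0) < min_confidence
        or result.get("invoice_date") is None
        or not date_is_plausible(result["invoice_date"], today)
    )
    if suspicious:
        responses["strong"] = strong_model(pdf_bytes)
        result = responses["strong"]
    return result, responses
```

Returning both responses mirrors the practice of saving each model's output, which makes it easy to see where the fast model went wrong.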

What started as “just parse some PDFs” turned into full accounts payable automation. The client went from 30 minutes per batch to a 5-minute review workflow. Classic scope creep, but the result genuinely saves them hours every week. It reminds me that AI doesn’t need to be a fancy chatbot to be valuable. Sometimes the biggest impact is just removing tedious manual work from someone’s day. The accounting team doesn’t care about the AI, they care that they don’t have to squint at PDFs anymore.

Next: Refine prompt engineering for edge cases.

Taming multi-invoice PDFs and building a customer dashboard

Asaf Lecht | אסף לכט — Tue, 05 May 2026 00:00:00 +0000

This sprint was a beast. We’re automating invoice processing with AI, pulling critical data like supplier, amounts, and line items, then spitting out structured files for accounting systems. It’s all about cutting manual data entry and human error. This time, I tackled multi-invoice PDFs and built a full customer dashboard.

Smarter AI extraction: I enabled LLM “thinking mode” for better accuracy, clarified date prompts, and added robust validations (like 9-digit supplier HP, total matching subtotal + VAT). If validation fails or key fields are missing, it now falls back to a more powerful AI or flags for review.
File management and auditing: Built an auditInvoiceFiles() function to scan folders, hash files, and detect duplicates. It resumes gracefully and has a timeout. Squashed a bug with empty audit arrays. Also added utilities to manage the inbox, moving unique files and trashing duplicates, all while tracking full file paths.
Multi-invoice PDFs were a nightmare. Supporting them meant a full data model overhaul, adding parent_row_id and ensuring child rows inherited sources. The re-processing logic was brutal: managing existing child rows, deleting old data, and merging. It now falls back to the Pro model for these and skips duplicates. After much pain, it’s finally working and exporting tricky batches.
TXT export format update: Dates are now DD/MM/YY (required a one-time fix for existing files), and the whole export is tab-delimited. Allocation numbers and the Google Drive URL for the original invoice are now included.
Customer web dashboard (GAS): Built a customer web dashboard using Google Apps Script. This is a huge UX win, with a full Hebrew UI, translated messages, and direct data editing. I added fallback chains for expense accounts, split settings, and per-status guidance. Users can now edit line items, rename suppliers, and compare amounts. Squashed a few UI bugs along the way.
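The two validations named above (a 9-digit supplier HP, and a total that equals subtotal + VAT) can be sketched in Python as below. The production code is Apps Script. The field names, the validate_invoice function, and the one-cent rounding tolerance are assumptions made for illustration.

```python
from decimal import Decimal


def validate_invoice(fields: dict, tolerance: Decimal = Decimal("0.01")) -> list:
    """Return a list of human-readable problems; an empty list means the row passes."""
    problems = []
    supplier_id = str(fields.get("supplier_hp", ""))
    if not (supplier_id.isascii() and supplier_id.isdigit() and len(supplier_id) == 9):
        problems.append("supplier HP must be exactly 9 digits")
    try:
        # Decimal avoids float rounding surprises when comparing money amounts.
        subtotal = Decimal(str(fields["subtotal"]))
        vat = Decimal(str(fields["vat"]))
        total = Decimal(str(fields["total"]))
    except (KeyError, ArithmeticError):
        problems.append("subtotal, vat and total must all be present and numeric")
    else:
        if abs(subtotal + vat - total) > tolerance:
            problems.append("total does not equal subtotal + VAT")
    return problems
```

A non-empty result is the signal to escalate to the stronger model or flag the row for review.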

This sprint was intense, but deeply satisfying. I had moments of pure frustration, especially wrestling with multi-invoice logic or getting the GAS UI to behave. But seeing those multi-invoice PDFs finally export, or the dashboard come alive with a localized UI, felt amazing. It really hammered home the importance of breaking down complex problems. I learned a ton about robust error handling, data integrity, and building a decent UI on a restrictive platform.

Next: Address remaining TODOs and refine the dashboard’s user experience.

Wrestling with data: OCR, monitors, and the never-ending quest for clean info

Asaf Lecht | אסף לכט — Fri, 01 May 2026 00:00:00 +0000

This past week, I’ve been deep in the guts of our RAG knowledge base pipeline. The whole point is to build a system that can ingest a ton of diverse content, like articles, Q&As, and transcripts, and make it searchable for our RAG models. It’s all about getting good, clean data into the system so the AI can actually do its job.

Here’s what’s been happening:

Hebrew OCR notebook: I finally tackled Hebrew OCR. A lot of our source material is scanned documents or images in Hebrew, so this was a big step. I put together a notebook, which feels like a solid foundation, and made sure to link up backup info for the processed files. Don’t want to lose that output.
New content monitor: I built out a monitor for a specific source. This thing automatically downloads new content and pushes it through our conversion pipeline. It sounds simple, but getting it right was a beast.
- I refined the URL matching to be manifest-based, which is way more robust than my initial naive approach.
- Canonical URL checks were crucial to avoid downloading duplicates. Nothing worse than redundant data clogging the system.
- The convert_all.py script needed tweaks. Q&A content requires a different touch than regular articles, so it got some love to handle both properly.
- A small but annoying fix: parsing dates for answers. Sometimes an author prefix messes things up, so I added logic to strip that out and fall back to the published date if needed. Data cleaning, man, it’s always the little things that bite you.
The hardest part was getting that content monitor to be truly reliable. It’s one thing to write a script that downloads some stuff, another to build something that consistently ingests all new content without breaking or duplicating. Each “fix monitor” commit felt like a small battle won against the chaos of real-world data.
I’ve also got about 2,000 online lesson transcripts waiting in the wings, but that’s currently blocked on someone else. Sometimes the biggest blockers aren’t technical.
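The date fix for answers might look something like this Python sketch. The post does not show the real field format, so everything here is an assumption: a DD/MM/YYYY date that may follow an author prefix, the parse_answer_date name, and the example prefix in the usage note.

```python
import re
from datetime import date
from typing import Optional

# Assumed format: a DD/MM/YYYY date somewhere in the field, possibly after an author prefix.
_DATE_PATTERN = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{4})")


def parse_answer_date(raw: Optional[str], published: date) -> date:
    """Pull the first valid date out of raw, ignoring any leading author text.

    Falls back to the published date when no valid date can be found.
    """
    for match in _DATE_PATTERN.finditer(raw or ""):
        day, month, year = (int(part) for part in match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            continue  # not a real calendar date, e.g. 31/02/2026: keep looking
    return published
```

For example, a field like "Rabbi Example: 12/03/2026" would yield 12 March 2026, while a field holding only a name would fall back to the published date.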

It’s been a mix of satisfaction and mild frustration. There’s a real sense of accomplishment when the monitor pulls in new content flawlessly, or when the OCR spits out readable Hebrew. But then there’s the grind of debugging why a date isn’t parsing correctly. It really hammers home the importance of robust, fault-tolerant data pipelines. You can have the fanciest RAG model, but if the data going in is garbage, the output will be too. This process is teaching me a lot about the “dirty work” of AI, the data engineering side that’s absolutely critical.

Next: Get those ~2,000 online lesson transcripts processed once they’re unblocked.

Browser automation: Fighting for session stability

Asaf Lecht | אסף לכט — Fri, 01 May 2026 00:00:00 +0000

This project wraps a complex web application in an API, letting other services interact with it programmatically. This past week was all about making that actually reliable, especially around authentication and session persistence.

SDK integration and storage_state.json: We ditched our custom client for the official SDK, which felt like a no-brainer for stability. The big win was getting native authentication working using Playwright’s storage_state.json for session saving. This immediately led to tackling automated session refresh, because these things will expire.
Auth hardening was a beast. I added robust cookie validation after export, with backoff on repeated failures, and crucially, made sure we never overwrite a valid storage_state.json with invalid cookies. Distinguishing between genuine auth errors and transient network glitches in the keep_alive loop was a subtle but important fix.
Those pesky CSRF tokens were silently killing sessions. We started auto-rebuilding the client every 25-35 minutes with jitter to refresh them. Auto-retrying chat RPC errors with a rebuilt client was another key reliability improvement.
Browser-as-runtime shift: The biggest architectural change was moving all RPC calls through a persistent Playwright browser instance. This simplified the keep_alive logic significantly and felt much more robust. I also realized periodic page refreshes were inadvertently killing active sessions, so those got disabled.
Google login is always a “fun” challenge, especially with the account chooser. After all that, I ran 24-hour endurance tests, simulating human-like pacing across two separate notebooks. Brutal, but crucial for finding long-tail issues like memory leaks or services silently dropping connections after a few hours.
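Two of the hardening ideas above, sketched in Python: never overwriting a good storage_state.json with bad cookies, and rebuilding the client on a jittered 25-35 minute schedule. This is a simplified illustration rather than the project's code. The function names and the required cookie names are hypothetical, and the validation shown (presence plus expiry) is only one plausible check.

```python
import json
import os
import random
import tempfile
import time


def cookies_look_valid(state: dict, required_names, now=None) -> bool:
    """True if every required cookie is present and not already expired."""
    now = time.time() if now is None else now
    live = {
        cookie["name"]
        for cookie in state.get("cookies", [])
        # Playwright reports session cookies with expires == -1.
        if cookie.get("expires", -1) == -1 or cookie["expires"] > now
    }
    return set(required_names) <= live


def save_state_if_valid(state: dict, path: str, required_names) -> bool:
    """Replace the file at path only when the new state passes validation."""
    if not cookies_look_valid(state, required_names):
        return False  # keep the last known-good file untouched
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)  # atomic swap, so readers never see a partial file
    return True


def next_rebuild_delay(rng=random) -> float:
    """Seconds until the next client rebuild: 25 to 35 minutes, uniformly jittered."""
    return rng.uniform(25 * 60, 35 * 60)
```

Writing to a temporary file and swapping it in means a crash mid-write cannot corrupt the saved session.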

This past week felt like a deep dive into the trenches of making something truly robust. It wasn’t about flashy new features, but grinding through the details, fixing subtle bugs, and building resilience. There were moments of frustration, especially when a session would mysteriously die after an hour. But the satisfaction of seeing the endurance tests run for 24 hours without a hitch was immense. Reliability isn’t a feature you add; it’s a continuous process of refinement and anticipating failure. I feel like I’ve leveled up my understanding of stateful systems and the sheer complexity of modern web applications.

Next: Scaling for concurrent users, multi-session support, and proper process supervision with Docker.

Taming the Colab beast for a RAG pipeline

Asaf Lecht | אסף לכט — Tue, 28 Apr 2026 00:00:00 +0000

My “Rav Oury Cherki RAG pipeline” project is all about making his teachings accessible. This past week, I spent most of my time wrestling with environments and getting the data ingestion solid. The initial RAG idea was easy, but then the real work began: making it runnable in Google Colab.

Colab environment was a nightmare. I had to pin lightning-fabric to a specific version, deal with some weird partial-numpy swaps after an install cascade, and switch to %pip magic because regular pip was causing issues. Then came the “runtime shims” for things like torchaudio.info and numpy.NAN. It felt like I was constantly patching holes in a leaky boat.
Dependency hell and pyannote: The numpy issues were particularly frustrating. Getting pyannote (for diarization) to work consistently across different Colab runtimes and local venv setups was a nightmare. I spent a lot of time exploring venv theory just to understand why things were breaking.
Data ingestion wins: Despite the environment fight, I built playlist recovery to grab content, a WhatsApp chat parser (which was surprisingly tricky), and name-variation search. A batch diarize notebook is now working, and getting a GDrive downloader working was a big win for source material. I also added candidate review tools and fixed needs_transcription regeneration.
Hacky solutions: I even had to bypass SSL certificate verification for some URL shorteners just to follow redirects. Not ideal, but necessary to get the data.
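A runtime shim of the kind mentioned above can be as small as the Python sketch below. It is an illustration, not the notebook's code: add_missing_alias is a hypothetical helper, and the only shim shown is restoring numpy.NAN when the installed NumPy no longer provides it. A torchaudio.info shim would depend on details the post does not give.

```python
import types


def add_missing_alias(module: types.ModuleType, name: str, value) -> bool:
    """Define module.<name> only if it is absent; return True when a shim was added."""
    if hasattr(module, name):
        return False
    setattr(module, name, value)
    return True


try:
    import numpy as np
except ImportError:
    np = None  # the helper itself has no NumPy dependency

if np is not None:
    # Restore the uppercase alias for older code that still references it.
    add_missing_alias(np, "NAN", np.nan)
```

Checking before patching keeps the shim harmless on runtimes where the attribute still exists, which matters when the same notebook runs on both Colab and a local venv.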

When I finally got a “successful E2E run,” it was a huge relief. All those little fixes, the constant debugging, the moments of wanting to throw my laptop across the room… it paid off. It really hammered home how much time environment setup can suck up, even on seemingly simple projects. The “glamorous” AI stuff often sits on a foundation of gritty, low-level engineering. I learned a ton about dependency management and the quirks of cloud notebooks.

Next: Focus on improving RAG retrieval quality and expanding data sources.

RAG’s dirty secret: data prep is the real fight

Asaf Lecht | אסף לכט — Tue, 28 Apr 2026 00:00:00 +0000

I’ve been knee-deep in RAG lately, building knowledge bases for a couple of projects. Everyone talks about retrieval and generation, but the real monster? Data preparation. Garbage in, garbage out, especially when your source documents are a mess. Scanned PDFs with OCR artifacts, inconsistent formatting, tables that broke, duplicate content, irrelevant boilerplate – if you feed that raw, the AI’s answers are mediocre at best.

So I built an LLM-powered cleanup pipeline. The idea was to run every document through Gemini to fix OCR errors, standardize formatting, and remove noise, but with strict guardrails to stop the AI from changing meaning.

Here’s what went into it:

The pipeline steps: Everything converts to clean Markdown first. Then we score each document for RAG-readiness, detect duplicate content, and run the LLM cleanup via Gemini. I used async concurrent chunk processing for speed.
The 105% quality gate: This was the most important decision. I rejected any chunk where the output exceeded 105% of the input length. Without it, the LLM would “helpfully” paraphrase and expand, which is exactly what you don’t want for reference material. I also had to explicitly tell the model not to reconstruct mathematical formulas; it kept “fixing” correct notation.
Hebrew language surprises: I built a benchmark to compare different AI platforms on Hebrew questions. The differences in multilingual support were pretty stark. Some platforms clearly invested more there.
Async gotcha: When processing multiple files sequentially with async code, you have to reset the Google SDK client between event loop runs. Otherwise, you get obscure connection pool errors that look like network issues. That cost me hours.
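The 105% gate and the concurrent chunk processing can be sketched together in Python. The clean_one callable stands in for the Gemini call and is hypothetical, as are the function names and the concurrency limit. Falling back to the original chunk when the gate rejects an output is an assumption; the post only says such chunks were rejected.

```python
import asyncio


def passes_length_gate(original: str, cleaned: str, max_ratio: float = 1.05) -> bool:
    """Reject outputs longer than max_ratio times the input length."""
    return len(cleaned) <= max_ratio * len(original)


async def clean_chunks(chunks, clean_one, max_concurrency: int = 4):
    """Clean chunks concurrently; keep the original text when the gate fails."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(chunk: str) -> str:
        async with semaphore:  # cap the number of in-flight model calls
            cleaned = await clean_one(chunk)
        return cleaned if passes_length_gate(chunk, cleaned) else chunk

    # gather preserves input order, so results line up with the chunks.
    return await asyncio.gather(*(guarded(chunk) for chunk in chunks))
```

A length ratio is a blunt check, but it is cheap and catches the main failure mode named above: a model that paraphrases and expands instead of cleaning.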

At the end of this phase, I archived 75 one-off scripts. Cleaning up that messy root directory to just the pipeline scripts and docs felt amazing.

This whole thing hammered home that AI-assisted workflows aren’t about having a smart model. They’re about building the right pipeline around it: validation, quality gates, fallbacks, monitoring. The AI is just one step.

Next: Integrating this cleanup pipeline into our main ingestion flow.