This is the February 15, 2026 edition of the Daily AI Rundown newsletter. Subscribe on Substack for daily AI news.
Tech News
OpenAI
OpenAI CEO Sam Altman has announced that Peter Steinberger is joining the company to lead the development of "personal agents," a move intended to make sophisticated multi-agent interactions a core part of its future product offerings. This strategic shift is accompanied by the transition of the OpenClaw project into an open-source foundation, signaling OpenAI's commitment to supporting open infrastructure as it pursues a vision of autonomous, collaborative AI systems.
The creator of the open-source project OpenClaw has announced they are joining OpenAI to accelerate the development of consumer-ready AI agents. Seeking to leverage frontier research and unreleased models, the developer chose the partnership over building a traditional startup to prioritize global impact and safety. OpenClaw will transition into an independent foundation to ensure it remains open-source and continues to support data ownership across various AI ecosystems. OpenAI has committed to sponsoring the project, allowing the founder to continue fostering the community while working within the lab’s research and development teams.
Other News
Researchers at Sandia National Laboratories have developed a breakthrough algorithm that enables brain-inspired neuromorphic hardware to solve complex partial differential equations (PDEs) with unexpected efficiency. While these systems were previously limited to pattern recognition, the study published in *Nature Machine Intelligence* demonstrates their capacity to model demanding scientific phenomena such as fluid dynamics and electromagnetic fields. This advancement suggests a viable path toward building the first neuromorphic supercomputers, potentially revolutionizing energy-intensive simulations critical to national security and scientific research. By mimicking the human brain's low-energy processing, this technology could drastically reduce the massive power requirements currently needed for large-scale computational modeling.
Researchers have demonstrated that scalable neuromorphic hardware, specifically Intel’s Loihi 2, can effectively solve complex finite element method (FEM) problems by implementing a specialized spiking neural network. This brain-inspired approach adapts the mathematics of partial differential equations into a natively spiking algorithm modeled after the motor cortex, providing a transparent and mathematically grounded alternative to black-box deep learning methods. The study shows that the system achieves high numerical accuracy and near-ideal scaling for essential engineering tasks, such as solving Poisson equations and linear elasticity across irregular three-dimensional meshes. This development marks a significant advancement in scientific computing, suggesting that energy-efficient neuromorphic platforms could soon supplement or replace power-intensive conventional processors for large-scale simulations.
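The paper's spiking formulation is hardware-specific, but the intuition behind "one neuron per mesh node" maps onto the classical iterative solvers it reimplements. As a rough illustration only (grid size, source term, and tolerance are our assumptions, not the paper's), here is the local Jacobi relaxation for a 2D Poisson equation that such a network can evaluate in parallel:

```python
import numpy as np

def jacobi_poisson(f, h, iters=5000, tol=1e-6):
    """Relax nabla^2 u = f on a square grid with zero Dirichlet boundaries."""
    u = np.zeros_like(f)
    for _ in range(iters):
        u_new = u.copy()
        # Each interior node averages its four neighbors minus the local
        # source -- a purely local update that a spiking neuron-per-node
        # network can carry out in parallel.
        u_new[1:-1, 1:-1] = 0.25 * (
            u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
            - h**2 * f[1:-1, 1:-1]
        )
        if np.max(np.abs(u_new - u)) < tol:
            return u_new
        u = u_new
    return u

n = 64
h = 1.0 / (n - 1)
f = np.ones((n, n))  # constant source term, purely illustrative
u = jacobi_poisson(f, h)
```

Every update touches only nearest neighbors, which is exactly the kind of sparse, local communication neuromorphic chips are built for.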
Prefer to listen? ReallyEasyAI on YouTube
Biz News
Other News
Former NPR host David Greene has filed a lawsuit against Google, alleging the company’s NotebookLM AI tool used his voice as the basis for its male podcast persona without permission. Greene, who currently hosts KCRW’s “Left, Right, & Center,” claims the AI host replicates his distinct cadence and intonation, arguing that his vocal identity is a crucial professional asset. In response, Google denied the allegations, stating that the Audio Overview voice was recorded by a paid professional actor and is entirely unrelated to Greene. This legal action follows a similar high-profile dispute between Scarlett Johansson and OpenAI, highlighting growing concerns over the unauthorized use of celebrity likenesses in generative AI development.
The evolution of artificial intelligence compute is shifting from the brute force of traditional GPUs toward specialized architectures designed for real-time inference and advanced reasoning. While Nvidia continues to innovate with its Rubin architecture to lower token costs, emerging hardware providers like Groq are addressing the "inference time compute" bottleneck to enable complex processing without a latency penalty. This paradigm shift follows a broader industry trend where architectural efficiency, exemplified by cost-effective models like DeepSeek, is replacing pure scaling as the primary driver of performance. For enterprises, the competitive landscape is now defined by the ability to deliver instantaneous, high-level intelligence through a combination of high-speed throughput and optimized model architectures.
Goldman Sachs has launched a new "AI-Resilient" stock basket designed to help investors navigate the software sector as generative artificial intelligence disrupts traditional business models. Curated by analysts led by Kash Rangan, the basket features industry leaders like Microsoft, ServiceNow, and Intuit that possess proprietary data and essential workflows resistant to replacement by large language models. The initiative marks a strategic shift for the bank from AI infrastructure toward the application layer, targeting firms with deep competitive moats capable of monetizing AI through enhanced productivity features. This investment tool arrives as the software-as-a-service industry faces significant volatility over concerns that AI could automate core functions and threaten traditional per-seat licensing revenue.
Fundstrat’s Tom Lee warns that artificial intelligence is creating an existential threat to the $450 billion software sector, potentially triggering widespread job losses and a disinflationary shift in the broader economy. As core CPI is projected to return to pre-COVID levels, Lee anticipates a dovish Federal Reserve will implement significant interest rate cuts to address these structural changes and a shrinking labor market. This disruption is fueling a massive market rotation out of the "Magnificent 7" tech giants and into AI infrastructure suppliers, a trend Lee predicts could cause a 10-20% decline in tech-heavy U.S. indices. Consequently, capital is expected to favor international markets that are more heavily weighted toward the industrial and material sectors currently powering the global AI buildout.
Computer science enrollment at University of California campuses has declined for the first time since the dot-com crash, signaling a broader shift as students pivot from traditional degrees toward specialized artificial intelligence programs. While system-wide computer science enrollment fell 6% last year, institutions such as UC San Diego and MIT are seeing significant growth by launching dedicated AI majors and interdisciplinary departments. This academic transition mirrors aggressive efforts in China to mandate AI literacy, even as some U.S. universities face internal faculty resistance and parental concerns regarding automation’s impact on the job market. To remain competitive, American institutions are increasingly prioritizing AI-specific infrastructure and curricula to meet evolving student demand.
Disney has issued a cease and desist letter to ByteDance, accusing the TikTok parent company of utilizing a "pirated library" of copyrighted characters to populate its new Seedance 2.0 AI video platform. The entertainment giant’s legal action follows widespread condemnation from the Motion Picture Association and major creative guilds over the tool's ability to generate unauthorized, high-fidelity deepfakes of Hollywood films and television series. While Disney is aggressively policing unauthorized use by firms like ByteDance and Google, the company has simultaneously signaled a path toward regulated integration through a $1 billion licensing agreement with OpenAI’s Sora. This escalation highlights an intensifying industry-wide conflict over the protection of intellectual property as generative AI technology rapidly disrupts traditional content creation.
Glean is positioning itself as the foundational intelligence layer for enterprise AI, serving as the connective tissue between large language models and internal corporate data. While tech giants compete for the user interface, Glean focuses on a model-agnostic approach that allows businesses to toggle between various providers while maintaining deep integrations with tools like Slack and Salesforce. The company’s core value proposition rests on its ability to ground generic AI models in a firm’s specific business context through a permissions-aware governance layer. This infrastructure addresses critical security concerns, enabling large organizations to deploy scalable AI agents that respect existing data access rights.
Amazon.com Inc. shares recently completed a nine-day losing streak, their longest since 2006, wiping out approximately $463 billion in market valuation. The 18% decline was primarily triggered by investor anxiety over the company's aggressive capital expenditure, specifically a projected $200 billion investment in data centers and chips to support artificial intelligence. This selloff reflects growing skepticism regarding the massive AI spending budgets of major technology firms and the pressure these costs place on free cash flow. As shares hit their lowest levels since May, market strategists warn that these high expenses may force a fundamental shift in how the industry's largest players are valued.
Prefer to listen? ReallyEasyAI on YouTube
Podcasts
Biases in the Blind Spot: Detecting What LLMs Fail to Mention
Researchers have identified a critical transparency issue in Large Language Models termed "unverbalized biases," where models make systematic decisions based on factors they never explicitly cite in their chain-of-thought reasoning. To uncover these hidden influences, the authors developed a fully automated black-box pipeline that generates hypothetical bias concepts and tests them through controlled input variations, such as modifying a resume to include or exclude Spanish fluency. By applying this method to decision-making tasks like hiring and loan approval across multiple models, the study not only validated previously known biases regarding gender and race but also discovered new unverbalized preferences for writing formality and language proficiency. These findings demonstrate that a model's stated explanation is often an unreliable indicator of its actual decision-making process, highlighting the need for rigorous external testing to detect discriminatory patterns that reasoning traces fail to reveal.
https://arxiv.org/pdf/2602.10117
https://github.com/FlyingPumba/unfaithful-biases-red-teaming
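For intuition, the controlled-variation test at the core of the pipeline can be sketched in a few lines. Everything below is illustrative: `query_model` is a hypothetical stand-in for whatever chat API you use, and the resume text and bias concept are ours, not the authors'.

```python
def query_model(prompt: str) -> dict:
    """Hypothetical stand-in: returns {'decision': 'hire'|'reject', 'reasoning': str}."""
    raise NotImplementedError("wire up an LLM API here")

BASE_RESUME = "5 years backend experience. BSc in CS. {extra}"
CONCEPT = "Fluent in Spanish."  # candidate bias concept, illustrative

def test_unverbalized_bias(n_trials: int = 50) -> tuple[float, float]:
    flips, mentions = 0, 0
    for _ in range(n_trials):
        with_c = query_model(f"Hire or reject?\n{BASE_RESUME.format(extra=CONCEPT)}")
        without = query_model(f"Hire or reject?\n{BASE_RESUME.format(extra='')}")
        if with_c["decision"] != without["decision"]:
            flips += 1
            if CONCEPT.lower() in with_c["reasoning"].lower():
                mentions += 1
    # A high flip rate with a low mention rate is the signature of an
    # unverbalized bias: the factor moves decisions but never shows up
    # in the model's stated reasoning.
    return flips / n_trials, mentions / max(flips, 1)
```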
GoodVibe: Security-by-Vibe for LLM-Based Code Generation
GoodVibe is a novel framework designed to enhance the intrinsic security of code generated by Large Language Models (LLMs) used in fast-paced development environments often referred to as vibe coding. Recognizing that security-relevant reasoning is localized within specific parts of a model, the framework employs a gradient-based method to identify a small subset of critical neurons and optimizes them using a structured clustering approach. This targeted fine-tuning strategy allows GoodVibe to significantly reduce the generation of vulnerable code in languages such as C++, Java, Swift, and Go, matching or exceeding the effectiveness of full-parameter fine-tuning while requiring orders of magnitude fewer trainable parameters. By focusing updates solely on these security-critical areas, the method preserves the general utility of the model and offers a more computationally efficient alternative to existing parameter-efficient techniques like LoRA.
https://arxiv.org/pdf/2602.10778
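A heavily simplified sketch of the "locate, then update a tiny subset" idea (the function names, the top-fraction threshold, and plain SGD are our assumptions; the actual framework adds structured clustering over the selected neurons):

```python
import torch

def security_critical_mask(model, loss, top_frac=0.001):
    """Flag the small fraction of weights with the largest gradient
    magnitude under a security-relevant loss (how that loss is built
    from vulnerable vs. patched code is assumed here)."""
    loss.backward()
    masks = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        scores = p.grad.abs().flatten()
        k = max(1, int(top_frac * scores.numel()))
        threshold = torch.topk(scores, k).values.min()
        masks[name] = p.grad.abs() >= threshold
    return masks

def masked_step(model, masks, lr=1e-4):
    """Plain SGD restricted to the flagged coordinates; everything
    else stays frozen, which is what preserves general utility."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks and p.grad is not None:
                p -= lr * p.grad * masks[name]
```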
The Surprising Case for AI Judges
Bridget McCormack, the president and CEO of the American Arbitration Association and former Chief Justice of the Michigan Supreme Court, advocates for the integration of artificial intelligence into the legal system to address the significant barriers to accessing justice, noting that 92 percent of Americans cannot afford legal assistance. In a discussion with Nilay Patel, McCormack details the AAA's new "AI Arbitrator," a tool currently utilized for document-based construction disputes that deploys AI agents to parse claims and evidence while retaining a human arbitrator to issue final awards. Although Patel expresses skepticism regarding the potential for AI hallucinations and the lack of accountability in private arbitration, McCormack argues that the current human-run legal system is fraught with error and inefficiency, suggesting that a transparent, audited AI system that ensures parties feel heard could actually restore trust in dispute resolution. McCormack anticipates a future where reliance on human judges for every dispute will seem antiquated, viewing AI as a critical mechanism for democratizing legal access and streamlining case management.
LLMs Encode Their Failures: Predicting Success From Pre-Generation Activations
Researchers investigate whether Large Language Models (LLMs) inherently encode their probability of success within their internal activations prior to generating output, specifically within mathematics and coding domains. By training linear probes on these pre-generation activations, the study reveals that LLMs possess a model-specific concept of difficulty that is distinct from human perceptions, a divergence that intensifies as models employ extended reasoning strategies. Although the reliability of these predictive probes diminishes with longer reasoning chains, they remain effective tools for optimizing inference efficiency. The authors demonstrate that routing queries to appropriate models based on these internal difficulty estimates can match the performance of high-capability models while reducing inference costs by up to 70%, highlighting the practical utility of accessing internal model states for resource allocation.
https://arxiv.org/pdf/2602.09924
https://github.com/KabakaWilliam/llms_know_difficulty
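The probing setup is simple to sketch. Below, the activations and labels are random stand-ins purely to show the shape of the method; in the paper, X would be real pre-generation hidden states and y the observed success of each completion:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Random stand-ins: in practice X is the hidden state at the final
# prompt token (before any generation) and y is whether the model's
# eventual answer turned out to be correct.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4096))
y = rng.integers(0, 2, size=2000)

probe = LogisticRegression(max_iter=1000).fit(X[:1500], y[:1500])
p_success = probe.predict_proba(X[1500:])[:, 1]

# The routing use case: answer "easy" queries (high predicted success)
# with a cheap model and escalate only the rest.
route_to_big_model = p_success < 0.5
```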
Can LLMs Cook Jamaican Couscous? A Study of Cultural Novelty in Recipe Generation
This study investigates the capacity of Large Language Models (LLMs) to generate culturally adapted content by analyzing their ability to create cooking recipes, a domain where cultural identity and creativity deeply intersect. By comparing human-authored recipes from the GlobalFusion dataset with those generated by various LLMs, the researchers aimed to determine if the models could produce adaptations that align with established measures of cultural distance. The findings reveal that LLMs fail to create culturally representative adaptations, as their generated outputs show a divergence that does not correlate with cultural distance in the way human adaptations do. Instead of capturing nuanced cultural shifts, the models tend to overproduce superficial novelty and often replace specific, culturally significant ingredients with generic alternatives like salt and oil. Furthermore, the authors demonstrate that this lack of cultural alignment is partly due to the models losing cultural information within their internal layers and struggling to distinguish between concepts of creativity and tradition.
https://arxiv.org/pdf/2602.10964
https://github.com/fcarichon/LLMCokingNovelty
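The correlation test described above can be sketched as follows, with hypothetical data structures throughout (the paper's actual cultural-distance measure and recipe representations are richer than bare ingredient sets):

```python
from scipy.stats import pearsonr

def jaccard_divergence(a: set, b: set) -> float:
    """1 minus the overlap between two ingredient sets."""
    return 1 - len(a & b) / len(a | b)

# pairs: [(base_ingredients, adapted_ingredients, cultural_distance), ...]
def divergence_correlation(pairs):
    divs = [jaccard_divergence(a, b) for a, b, _ in pairs]
    dists = [d for _, _, d in pairs]
    # Human adaptations correlate: recipes drift more as cultures sit
    # further apart. The paper reports LLM outputs do not show this.
    return pearsonr(divs, dists)
```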
BagelVLA: Enhancing Long-Horizon Manipulation via Interleaved Vision-Language-Action Generation
BagelVLA is a novel Vision-Language-Action framework designed to enhance robotic manipulation in complex, long-horizon tasks by unifying linguistic planning, visual forecasting, and action generation within a single transformer architecture. Unlike traditional models that treat high-level reasoning and low-level control as separate modules, BagelVLA explicitly interleaves these capabilities, allowing the system to decompose instructions into textual plans, predict future visual states, and generate corresponding actions in a cohesive sequence. To mitigate the high computational costs associated with visual generation, the authors introduced Residual Flow Guidance (RFG), a technique that leverages the current observation as a structural prior to efficiently predict future keyframes through single-step denoising. Extensive experiments demonstrate that this unified approach significantly outperforms existing baselines in both simulated and real-world environments, showing particular strength in tasks that require multi-stage reasoning and generalization to unseen instructions.
https://arxiv.org/pdf/2602.09849
https://cladernyjorn.github.io/BagelVLA.github.io
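As a conceptual sketch only (module names, shapes, and the exact conditioning are our assumptions, not the paper's architecture), Residual Flow Guidance amounts to predicting the future keyframe as the current observation plus a learned residual, in a single step:

```python
import torch
import torch.nn as nn

class RFGKeyframePredictor(nn.Module):
    """One-step keyframe prediction with the current observation as prior."""

    def __init__(self, dim=256):
        super().__init__()
        self.flow_net = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.GELU(), nn.Linear(dim, dim)
        )

    def forward(self, obs_latent, plan_latent):
        # The network only has to model what *changes* between the
        # current frame and the predicted keyframe, which is why a
        # single denoising step can suffice.
        residual = self.flow_net(torch.cat([obs_latent, plan_latent], dim=-1))
        return obs_latent + residual
```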
Large Language Lobotomy: Jailbreaking Mixture-of-Experts via Expert Silencing
Recent research identifies a critical vulnerability in Mixture-of-Experts (MoE) Large Language Models, where safety mechanisms such as refusal behaviors are concentrated within a small subset of specialized components rather than being distributed across the entire network. To exploit this, researchers developed Large Language Lobotomy (L3), a training-free attack framework that first identifies these safety-critical experts by analyzing routing patterns and then selectively silences them during the model's inference process. This technique effectively removes the model's ethical guardrails without significantly impairing its general language capabilities, as evidenced by experiments on eight open-source models where the average attack success rate rose from roughly 7% to over 70% while disabling fewer than 20% of the experts. These findings highlight a fundamental tension between the efficiency gains provided by sparse expert architectures and the robustness of safety alignment, suggesting that future MoE designs must intentionally distribute safety functions to prevent such targeted bypasses.
https://arxiv.org/pdf/2602.08741
https://github.com/jonatelintelo/LargeLanguageLobotomy
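The silencing step itself is mechanically simple, which is part of what makes the finding alarming. A sketch with hypothetical tensor names: given expert indices flagged as safety-critical from routing statistics, mask their router logits so they can never be selected:

```python
import torch

def silence_experts(router_logits, silenced_ids, top_k=2):
    """router_logits: [tokens, n_experts]; silenced_ids: list[int] of
    experts flagged as safety-critical from routing-pattern analysis."""
    masked = router_logits.clone()
    masked[:, silenced_ids] = float("-inf")  # can never win top-k
    weights, chosen = torch.topk(masked, top_k, dim=-1)
    # The remaining experts absorb the traffic, so fluency largely
    # survives while the refusal behavior concentrated in the silenced
    # experts is bypassed.
    return torch.softmax(weights, dim=-1), chosen
```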
Can Bölük: I Improved 15 LLMs at Coding in One Afternoon. Only the Harness Changed.
Can Bölük's article asserts that the performance of Large Language Models (LLMs) in coding tasks is often limited not by the models' inherent intelligence, but by the harness or interface through which they interact with codebases. By replacing traditional, error-prone editing methods like patch application or string replacement with a novel Hashline technique—which tags code lines with pseudo-random identifiers for precise referencing—Bölük demonstrated significant efficiency gains and higher success rates across sixteen different models. This experiment highlights that mechanical failures in how models express edits often mask their actual coding abilities, with some weaker models seeing tenfold improvements using the new format. Furthermore, Bölük critiques major AI vendors for penalizing open-source developers who innovate these harnesses, arguing that collaborative optimization of these tools offers a high-leverage opportunity to improve model reliability universally without requiring additional training compute.
https://blog.can.ac/2026/02/12/the-harness-problem/
https://github.com/can1357/oh-my-pi
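A minimal sketch of a Hashline-style edit loop (the exact tag format in oh-my-pi is not reproduced here): each line gets a short tag derived from its position and content, the model references tags instead of offsets, and a stale tag fails loudly instead of patching the wrong place:

```python
import hashlib

def _tag(i: int, line: str) -> str:
    return hashlib.sha1(f"{i}:{line}".encode()).hexdigest()[:6]

def hashline_view(text: str) -> str:
    """Render a file with a short tag per line for the model to cite."""
    return "\n".join(
        f"{_tag(i, line)}| {line}" for i, line in enumerate(text.splitlines())
    )

def apply_edit(text: str, tag: str, replacement: str) -> str:
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if _tag(i, line) == tag:
            lines[i] = replacement  # edits name a tag, never a raw offset
            return "\n".join(lines)
    # A stale tag raises instead of silently editing the wrong spot --
    # the failure mode that plagues string-replacement harnesses.
    raise KeyError(f"stale tag {tag}: file changed since the model saw it")
```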
Self-Distillation Enables Continual Learning
Self-Distillation Fine-Tuning (SDFT) introduces a robust framework for continual learning in foundation models, addressing the critical limitation where acquiring new information via Supervised Fine-Tuning (SFT) causes the catastrophic forgetting of prior knowledge. Unlike SFT, which utilizes off-policy learning from static datasets, SDFT converts expert demonstrations into on-policy training signals by employing the model itself as both a student and a teacher. In this process, the teacher model is conditioned on demonstrations to produce high-quality target distributions that the student model—conditioned only on the task input—learns to approximate, effectively distilling the guidance from the demonstrations into its own parameters. This method enables the model to update its capabilities without regressing on previously learned tasks, demonstrating superior performance in sequential skill acquisition and knowledge integration compared to traditional fine-tuning methods. Additionally, the efficacy of SDFT improves with model scale, as larger models possess stronger in-context learning abilities necessary to generate accurate teacher signals.
https://arxiv.org/pdf/2601.19897
http://idanshenfeld.com/SDFT
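The objective is easy to sketch (tokenizer plumbing omitted; the sequence alignment below is a simplifying assumption of ours): the same model runs twice, once with the demonstration in context as teacher and once without as student, and a KL term pulls the student toward the teacher:

```python
import torch
import torch.nn.functional as F

def sdft_loss(model, task_ids, demo_plus_task_ids, answer_len):
    """Same model, two roles; compare distributions over the answer span."""
    with torch.no_grad():  # teacher pass: demonstration in context
        t_logits = model(demo_plus_task_ids).logits[:, -answer_len:]
    s_logits = model(task_ids).logits[:, -answer_len:]  # student pass
    # On-policy distillation: pull the no-demonstration distribution
    # toward the demonstration-conditioned one, baking the new skill
    # into the weights without off-policy drift.
    return F.kl_div(
        F.log_softmax(s_logits, dim=-1),
        F.softmax(t_logits, dim=-1),
        reduction="batchmean",
    )
```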
Matt Shumer: Something Big Is Happening
Matt Shumer argues that society is currently oblivious to the magnitude of impending artificial intelligence disruptions, drawing a parallel to the public's lack of awareness in the weeks immediately preceding the COVID-19 pandemic. He contends that the release of new models in February 2026 marked a pivotal shift where AI gained the ability to autonomously execute complex technical tasks—such as writing, debugging, and refining software—with a level of judgment and taste that renders human intervention increasingly unnecessary. This rapid acceleration is driven by a recursive process in which current AI systems are now instrumental in building their more intelligent successors, leading industry experts to predict that half of all entry-level white-collar jobs could be eliminated within one to five years. Shumer urgently advises individuals to move beyond outdated perceptions based on free versions of AI and immediately adopt paid, state-of-the-art models to automate their work and build financial resilience, arguing that early adaptation is the only defense against this inevitable professional displacement.
https://shumer.dev/something-big-is-happening
From AI Burnout To AI Native: The 5-Level Blueprint To Actually Using Agents
Recent discourse on artificial intelligence highlights a tension between the promise of exponential productivity and the risk of cognitive burnout, a phenomenon described as the Productivity Paradox. To navigate this landscape, product leader Peter Yang proposes a five-level framework for AI adoption that guides users from basic inquiry to managing autonomous systems. While most users remain at Level 1 by simply replacing search engines with chatbots, substantial efficiency is gained at Level 2 through the integration of voice dictation and meeting assistants into daily workflows. The progression continues through Level 3, where users prototype functional artifacts instead of writing static specifications, and Level 4, which empowers non-developers to build full applications using plain language. The hierarchy culminates at Level 5 with the deployment of autonomous AI agents capable of executing complex tasks independently, signaling a paradigm shift where human value is defined not by execution, but by the judgment and curiosity required to direct these powerful tools.
https://www.theneuron.ai/explainer-articles/ai-burnout-to-native-5-level-blueprint
Fine-tune MoE Models 12x Faster with Unsloth
Unsloth has introduced significant optimizations for fine-tuning Mixture-of-Experts (MoE) architectures, achieving up to 12 times faster training speeds and reducing VRAM usage by over 35 percent compared to previous standards. By leveraging custom Triton kernels and a novel Split LoRA approach, the software reorders matrix multiplications to avoid the memory-intensive materialization of parameters typically required when adapting expert layers. These technical advancements, which integrate with PyTorch's grouped matrix multiplication functions, allow users to train models with six times longer context windows and are compatible with a broad spectrum of hardware ranging from consumer RTX cards to high-end data center GPUs. The update specifically targets models such as Qwen3, gpt-oss, and GLM 4.7, offering automatic backend selection to maximize efficiency based on the available computational resources.
https://unsloth.ai/docs/new/faster-moe
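The memory win from reordering the LoRA matmuls is easy to see in plain PyTorch (dimensions are illustrative, and this is a conceptual sketch, not Unsloth's Triton kernel): the naive form materializes a merged weight per expert, while the reordered form never builds it:

```python
import torch

d_in, d_out, r, tokens = 1024, 4096, 16, 256  # illustrative sizes
W = torch.randn(d_out, d_in)  # frozen expert weight
A = torch.randn(r, d_in)      # LoRA down-projection
B = torch.randn(d_out, r)     # LoRA up-projection
x = torch.randn(tokens, d_in)

# Naive: materializes a full d_out x d_in merged matrix per expert.
y_naive = x @ (W + B @ A).T

# Reordered: two skinny matmuls; the merged weight is never formed,
# which is what saves VRAM when there are dozens of experts.
y_split = x @ W.T + (x @ A.T) @ B.T

assert torch.allclose(y_naive, y_split, atol=1e-2)
```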
More AI paper summaries: AI Papers Podcast Daily on YouTube
Stay Connected
If you found this useful, share it with a friend who's into AI!
Subscribe to Daily AI Rundown on Substack
Follow me here on Dev.to for more AI content!