Zain Naboulsi

Posted on Feb 17 • Originally published at dailyairundown.substack.com

Daily AI Rundown - February 16, 2026

#ai #machinelearning #news #newsletter

This is the February 16, 2026 edition of the Daily AI Rundown newsletter. Subscribe on Substack for daily AI news.

Tech News

Other News

**[ð Qwen3.5-397B-A17B is here: The first open-weight model in the Qwen3.5 series.

Native multimoda...](https://x.com/Alibaba_Qwen/status/2023331062433153103)**

Alibaba’s Qwen team has launched Qwen3.5-397B-A17B, the first open-weight model in its new series, offering a native multimodal architecture optimized for real-world AI agents and coding. The model represents a significant leap in efficiency, utilizing a sparse Mixture-of-Experts (MoE) design to achieve up to 19 times faster decoding throughput than its predecessors while supporting over 200 languages. Released under an Apache 2.0 license, this development provides the open-source community with a high-performance, accessible alternative to proprietary frontier models through advanced reinforcement learning scaling.

Unitree Spring Festival Gala Robots — a Full Release of Additional Details

Unitree Robotics has announced a series of technological milestones following a high-profile demonstration at the Spring Festival Gala, featuring the world’s first fully autonomous humanoid robot cluster performance. The display, which involved dozens of G1 humanoid units executing coordinated Kung Fu maneuvers and the H2 model integrated with quadruped platforms, showcases significant advancements in motion control and large-scale autonomous synchronization. These achievements are considered newsworthy for setting multiple world records in robot cluster autonomy and highlighting the rapid evolution of humanoid technology in complex, public-facing applications.

I think it must be a very interesting time to be in programming languages and formal methods because...

AI researcher Andrej Karpathy asserts that Large Language Models (LLMs) are fundamentally altering the software engineering landscape by enabling the high-scale translation of legacy codebases into modern languages like Rust. Karpathy suggests that because translation is more reliable than de-novo generation, the industry is poised to rewrite a vast portion of global software, potentially necessitating the development of new programming languages optimized specifically for AI rather than human developers.

Prefer to listen? ReallyEasyAI on YouTube

Biz News

Other News

Fractal Analytics’ muted IPO debut signals persistent AI fears in India

Fractal Analytics, India’s first AI unicorn to go public, saw its shares close 7% below the issue price on Monday, ending its debut session with a market capitalization of approximately $1.6 billion. This valuation marks a notable decline from the company's $2.4 billion private-market high reached in mid-2025, despite the firm reporting a swing to profitability and a 26% year-over-year revenue increase. The subdued performance followed a strategic 40% reduction in the offering's size, signaling persistent investor caution toward AI-focused software stocks even as India hosts global summits to position itself as a premier technology hub. Fractal intends to use the IPO proceeds to fund research and development, expand its infrastructure, and facilitate international growth.

Something Big Is Happening

Artificial intelligence industry insiders are warning of an imminent and profound societal transformation, comparing the current technological climate to the period immediately preceding the 2020 global pandemic. Experts contend that a small cohort of researchers at firms like OpenAI and Google DeepMind have unlocked an exponential pace of development that has already fundamentally altered the professional landscape for those within the sector. Recent breakthroughs in model training have shifted the technology from steady improvement to a state of rapid-fire acceleration that is expected to restructure daily life and global economic systems. This warning suggests that while the broader public remains in a "pre-crisis" phase of normalcy, the underlying technology has already reached a critical inflection point of irreversible change.

Don't miss tomorrow's social media industry news

Meta has expanded the integration of Manus AI into its Ads Manager platform, granting advertisers direct access to automated agents designed for report building and audience research. The rollout, which follows Meta's acquisition of the AI startup last month, aims to streamline the advertising process through in-stream prompts and a dedicated entry in the platform's tools menu. By embedding these advanced capabilities, Meta intends to demonstrate the practical value of its multi-billion dollar AI investments to both ad partners and investors seeking a return on investment. This strategic deployment highlights the company’s push to prioritize advertising efficiencies as the primary monetization driver for its evolving artificial intelligence ecosystem.

Apple Faces Regulatory Scrutiny And AI Delays As Valuation Sits Near Fair Value

Apple Inc. is currently facing increased regulatory scrutiny regarding Apple News alongside significant delays in the rollout of its AI-powered Siri updates. These operational challenges coincide with a recent 6.9% decline in the company’s share price, which now sits near its estimated fair value amid a broader year-to-date dip of 5.6%. Investors are closely monitoring how these setbacks will affect the company’s brand ecosystem and competitive position relative to other major technology peers in the artificial intelligence sector. Despite these short-term pressures, the company maintains a solid balance sheet and a strong long-term performance record.

Flapping Airplanes on the future of AI: ‘We want to try really radically different things’

Flapping Airplanes, a newly launched artificial intelligence research lab, has secured $180 million in seed funding to develop models that prioritize data efficiency over sheer scale. Led by co-founders Ben Spector, Asher Spector, and Aidan Smith, the startup aims to replicate the human brain's ability to learn from limited information rather than relying on the massive datasets used by current industry leaders. By moving away from traditional transformer architectures and gradient descent, the team seeks to solve the high costs and technical limitations of contemporary AI training. This radical approach positions the lab as a specialized alternative to existing foundation model companies, focusing on fundamental algorithmic innovation to drive the next wave of AI capabilities.

Prefer to listen? ReallyEasyAI on YouTube

Podcasts

The U.S. Department of Labor’s Artificial Intelligence Literacy Framework

To support the integration of artificial intelligence competencies into the nation's workforce and education systems, the U.S. Department of Labor has issued the AI Literacy Framework as a voluntary resource for program design. Designed to advance the Administration's reindustrialization agenda, this framework defines AI literacy as the ability to responsibly use and evaluate AI technologies, with a specific emphasis on the generative AI tools currently reshaping the modern workplace. The document details five foundational content areas—ranging from understanding AI principles and directing systems effectively to evaluating outputs and ensuring responsible use—alongside seven delivery principles that prioritize experiential, contextualized, and agile learning methods to ensure training remains relevant as technology evolves. Ultimately, this initiative aims to equip diverse stakeholders, including workers, employers, and educators, with a flexible standard to enhance productivity and maintain economic competitiveness in an increasingly AI-driven economy.

https://www.dol.gov/sites/dolgov/files/ETA/advisories/TEN/2025/TEN%2007-25/TEN%2007-25%20%28complete%20document%29.pdf

https://www.dol.gov/newsroom/releases/eta/eta20260213

Google: Towards Autonomous Mathematics Research

Researchers at Google DeepMind have introduced Aletheia, an advanced artificial intelligence agent designed to transition from competition-level problem solving to professional mathematics research. By employing a system that generates, verifies, and revises proofs while utilizing external tools to minimize errors, Aletheia has demonstrated the ability to produce publishable research, including an independently generated paper on arithmetic geometry and collaborative work on particle systems. The project also involved testing the agent on the Erdős Conjectures, where it solved four open problems, though the authors caution that some of these were less complex than anticipated or had been solved previously without record. In light of these developments, the paper proposes a standardized system to classify AI contributions based on their autonomy and mathematical significance, concluding that while AI is becoming a potent assistant for discovery, it currently lacks the reliability to replace human mathematicians.

https://arxiv.org/pdf/2602.10177
https://github.com/google-deepmind/superhuman/tree/main/aletheia

Linguistic Indicators of Early Cognitive Decline in the Dementia Bank Pitt Corpus

This research investigates the use of computational linguistics to detect early signs of cognitive decline, specifically within the context of dementia. By analyzing speech transcripts from the DementiaBank Pitt Corpus, the authors employed both machine learning models and statistical tests to identify robust linguistic markers that differentiate between individuals with dementia and healthy controls. The study compared different methods of text representation, ranging from raw text to abstract grammatical patterns, and found that indicators such as reduced vocabulary variety, simplified sentence structures, and an increased reliance on pronouns and functional words were strong predictors of cognitive impairment. Significantly, the results demonstrated that syntactic and grammatical features alone retained high discriminative power even when specific lexical content was removed, suggesting that structural degradation in language is a core characteristic of early-stage dementia. These findings support the potential for developing transparent, non-invasive screening tools based on the objective analysis of spontaneous speech patterns.

https://arxiv.org/pdf/2602.11028

MERIT Feedback Elicits Better Bargaining in LLM Negotiators

The researchers address the current limitations of Large Language Models in navigating complex bargaining scenarios by introducing a utility feedback framework designed to enhance strategic depth and alignment with human values. They present AGORABENCH, a comprehensive benchmark capable of simulating diverse economic environments, such as monopolies, deceptive markets, and installment plans, which tests agents beyond simple profit maximization. To evaluate negotiation performance more accurately, the authors developed MERIT, a metric grounded in economic theory that aggregates consumer surplus, negotiation power, and the acquisition ratio of desired goods to capture the nuances of human preference. By utilizing MERIT for in-context learning and fine-tuning, the study demonstrates that LLMs can develop stronger opponent-aware reasoning and achieve superior deal rates compared to baseline strategies that rely on rigid tactics or shallow reasoning.

https://arxiv.org/pdf/2602.10467

Why Do AI Agents Systematically Fail at Cloud Root Cause Analysis?

This research paper investigates why Large Language Model (LLM) agents consistently struggle with Cloud Root Cause Analysis (RCA), a process critical for identifying the source of failures in large-scale web services. By analyzing 1,675 agent executions across five different models using the OpenRCA benchmark, the authors identified twelve distinct failure types, or "pitfalls," categorized into internal reasoning, communication between agents, and environmental interactions. The study reveals that the most common errors, such as hallucinating data interpretations and failing to explore enough data, occur across all models regardless of their capability, suggesting these are fundamental flaws in the shared agent architecture rather than specific model limitations. While attempting to fix these issues through prompt engineering proved ineffective, the researchers demonstrated that structural changes—specifically enriching the communication protocol between agents to include code and error outputs—significantly reduced failures and improved efficiency.

https://arxiv.org/pdf/2602.09937

Chatting with Images for Introspective Visual Thinking

To address the limitations of current Large Vision-Language Models (LVLMs) which suffer from information loss due to static single-pass encoding or rely on disjointed external tools, researchers have proposed a new reasoning paradigm called "chatting with images". This framework reframes visual manipulation as language-guided feature modulation, utilizing a novel architecture named VILAVT that features a dynamic vision encoder capable of jointly processing multiple image regions conditioned on expressive textual inquiries. By employing a two-stage training curriculum that combines supervised fine-tuning with reinforcement learning, VILAVT learns to introspectively query and re-encode visual data to retrieve fine-grained details that might otherwise be lost. Extensive empirical evaluations across eight benchmarks indicate that this approach yields strong, consistent improvements, achieving state-of-the-art performance on complex spatial reasoning tasks involving multiple images and videos by effectively bridging high-level linguistic intent with low-level visual feature processing.

https://arxiv.org/pdf/2602.11073
https://github.com/AntResearchNLP/ViLaVT

Image Quality in the Era of Artificial Intelligence

The rapid deployment of artificial intelligence in radiology aims to enhance image reconstruction and streamline workflows, yet it introduces distinct failure modes that can compromise patient care. While AI-enabled devices can produce images that appear visually superior—often sharper and smoother—there is frequently a disconnect between this perceived quality and the actual diagnostic information contained within the image. Research indicates that neural networks can introduce artifacts or hallucinations and even remove critical pathological features, such as lesions, which standard subjective and quantitative quality metrics often fail to identify. Although the FDA reviews these technologies for safety and effectiveness, devices often enter the market with general indications that may not account for performance deficits in specific clinical tasks. Ultimately, safe implementation requires the medical community to recognize that AI cannot generate authentic patient-specific data and to look beyond mere visual appeal when evaluating diagnostic utility.

https://arxiv.org/pdf/2602.09347

A Collaborative Safety Shield for Safe and Efficient CAV Lane Changes in Congested On-Ramp Merging

This research paper introduces a novel lane change controller called MARL-MASS designed to help Connected and Autonomous Vehicles (CAVs) safely merge into congested traffic. Existing autonomous driving systems often struggle to balance safety with efficiency in dense conditions, as they either prioritize caution to the point of causing delays or take risks that lead to collisions. To address this trade-off, the authors combined Multi-Agent Reinforcement Learning (MARL) with a collaborative safety mechanism known as the Multi-Agent Safety Shield (MASS), which uses Control Barrier Functions to override unsafe actions while allowing vehicles to coordinate their movements. The system also utilizes a graph-based interaction topology to map dependencies between vehicles and employs a customized reward function that encourages faster driving and efficient merging since the safety shield handles collision prevention. Simulation results demonstrated that MARL-MASS maintained strict safety guarantees with zero collisions while achieving higher average speeds and better merging rates than comparable safety-focused controllers.

https://arxiv.org/pdf/2602.10007
https://github.com/hkbharath/MARL-MASS

When the Prompt Becomes Visual: Vision-Centric Jailbreak Attacks for Large Image Editing Models

Recent advancements in large image editing models have enabled users to guide edits through visual cues, but this capability has introduced a significant security vulnerability known as Vision-Centric Jailbreak Attacks (VJA). Unlike traditional text-based attacks that are often caught by existing safeguards, VJA embeds malicious instructions directly into the input image using visual signals like arrows or text within the image, effectively bypassing safety mechanisms designed primarily for textual analysis. To systematically evaluate this threat, the authors introduced the Image Editing Safety Benchmark (IESBench), revealing that state-of-the-art commercial models such as Nano Banana Pro and GPT Image 1.5 are highly susceptible to these visual attacks, often executing harmful edits like evidence tampering or hate speech generation that they would refuse if requested via text. To mitigate this risk, the study proposes a training-free defense mechanism that utilizes introspective multimodal reasoning to trigger the model's internal safety awareness before generating an image, a method shown to significantly enhance robustness against these visual jailbreaks with negligible computational cost.

https://arxiv.org/pdf/2602.10179
https://csu-jpg.github.io/vja.github.io/

Would an LLM Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices

Researchers investigated the capability of Large Language Models (LLMs) to function as autonomous travel assistants by presenting them with hotel booking dilemmas to calculate their implied Willingness to Pay (WTP) for various room attributes. By applying economic methodologies to analyze the choices of models like GPT-4o and Llama 3.3 70B, the study determined that while larger models can produce structured decision patterns, they consistently overestimate the monetary value of amenities compared to human benchmarks, particularly for features described in detail like club access. The experiments demonstrated that LLM preferences are highly malleable and sensitive to prompt engineering, as assigning business personas significantly inflated valuations while providing context examples of budget-friendly choices helped align the models' decisions more closely with realistic human preferences. The authors conclude that deploying LLMs for subjective decision-making requires significant caution because their outputs can be easily skewed by the framing of the options and the specific user personas assigned in the prompts.

https://arxiv.org/pdf/2602.09802
https://github.com/manon-reusens/WTP_LLMs

More AI paper summaries: AI Papers Podcast Daily on YouTube