Shawn

Posted on Jun 21

FutureX · Physical AI Daily — Issue 35 (06/22)

#ai #robotics #machinelearning #research

Today's Highlights

· China's Ministry of Industry and Information Technology (MIIT) and State-owned Assets Supervision and Administration Commission (SASAC) jointly launched the 2026 Humanoid Robot and Embodied Intelligence Real-World Training Special Action: by year-end, the initiative targets identifying over 100 high-value deployment scenarios and building 10,000-unit-scale deployment capacity, with encouragement to explore "Humanoid Robot as a Service" leasing models.

· T-Rex (UC Berkeley × Nvidia × Stanford et al.) is a large tactile-reactive model for dexterous hands. Trained on 100 hours of real-hardware tactile data with a three-expert Mixture-of-Transformers (MoT) architecture, it achieves an average success rate over 30% higher than the strongest baselines across 12 fine manipulation tasks; removing tactile signals drops performance by roughly 23%.

· Spain's Theker closes a €73 million (~$85 million) Series A led by CRV, with Samsung and LVMH making their first-ever investments in a Spanish company, and Inditex adding to its stake — setting a new European robotics Series A record.

· A Stanford-led multi-institution position paper argues that general-purpose robots need more than "VLA + world-model policy scaling" — the real gap is grounding vast unstructured physical experience into robot supervision via four missing interfaces.

· Tesla's Austin robotaxi fleet has fewer than 60 vehicles in operation, with notably long passenger wait times — early-stage reality that contrasts with the company's stated expansion ambitions for the year.

I. Research Papers

T-Rex: A Tactile-Reactive Manipulation Foundation Model That Lets Dexterous Hands "Feel While Acting" · manipulation

Most VLAs today either ignore tactile feedback or use only static tactile encodings — essentially "working by sight." T-Rex turns high-frequency touch into a real-time reactive pathway of its own, directly addressing contact-intensive tasks like slip detection, force control, and soft-object grasping that vision alone handles poorly.

Zhuoyang Liu et al. (UC Berkeley · Nvidia · Stanford · Panasonic · Sapienza University of Rome, et al.) · arXiv 2606.17055 source · Commentary: 机器人解剖师 source (WeChat, CN)

The team collected 100 hours of real-hardware dexterous hand manipulation data covering 200+ everyday objects, 22 motor primitive types, and synchronized tactile signals, then trained a three-expert Mixture-of-Transformers architecture in stages. A Latent Expert serves as a world model, pre-trained on approximately 23,000 hours of egocentric human video to predict future latent states; an Action Expert generates coarse actions at 5 Hz from noise; and a Tactile Expert produces final actions at 20 Hz using spatiotemporally encoded tactile signals. The three experts handle different denoising stages under flow matching. The low- and high-frequency experts run asynchronously, so tactile-driven fine-grained adjustments can happen at the higher rate. Across 12 tasks requiring precise force control and deformable object manipulation, the average success rate exceeded the strongest baselines (including Pi0.5) by over 30%; ablations show that removing tactile input causes roughly 23% performance degradation. The project page includes interactive demos such as screwing in a light bulb.
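To make the two-rate idea concrete, here is a minimal sketch of a loop in which a slow policy refreshes a coarse command at 5 Hz while a fast tactile step adjusts it at 20 Hz. It is not the T-Rex implementation: the function names, the scalar "grip command", and the fixed 4:1 tick ratio (a synchronous stand-in for the asynchronous experts) are all assumptions made for illustration.

```python
from typing import Callable, List

COARSE_HZ = 5   # Action Expert rate quoted in the summary
FINE_HZ = 20    # Tactile Expert rate quoted in the summary
TICKS_PER_COARSE = FINE_HZ // COARSE_HZ  # 4 fast ticks per coarse update


def run_dual_rate(
    n_ticks: int,
    coarse_policy: Callable[[int], float],          # hypothetical slow expert
    tactile_refine: Callable[[float, float], float],  # hypothetical fast expert
    read_tactile: Callable[[int], float],            # hypothetical sensor read
) -> List[float]:
    """Emit one final action per fast tick, reusing the latest coarse action."""
    actions: List[float] = []
    coarse = 0.0
    for tick in range(n_ticks):
        if tick % TICKS_PER_COARSE == 0:
            coarse = coarse_policy(tick)   # slow path: refreshed every 4th tick
        touch = read_tactile(tick)         # fast path: read touch on every tick
        actions.append(tactile_refine(coarse, touch))
    return actions


def demo():
    coarse_calls: List[int] = []

    def coarse_policy(tick: int) -> float:
        coarse_calls.append(tick)
        return 1.0  # placeholder grip command

    def read_tactile(tick: int) -> float:
        return 1.0 if tick == 5 else 0.0  # pretend a slip is sensed at tick 5

    def tactile_refine(coarse: float, touch: float) -> float:
        return coarse + 0.5 * touch  # tighten the grip when slip is sensed

    return coarse_calls, run_dual_rate(8, coarse_policy, tactile_refine, read_tactile)
```

In this toy run the slip at tick 5 is corrected immediately by the fast path, without waiting for the next coarse update at tick 8. That is the kind of reaction a 5 Hz-only controller would miss.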

Robots Need More than VLA and World Models: The Missing Piece Is Grounding, Not Bigger Policies · vla

This position paper directly challenges the mainstream narrative that "more robot data + larger VLAs = general-purpose robots," proposing a "grounding-centric" framework as a pointed counterargument to the current VLA/world-model arms race.

Elis Karcini et al. (Motoniq.ai · Stanford · Italian Institute of Technology IIT · ETH Zurich · TU Darmstadt, et al.) · arXiv 2606.06556 source · Commentary: paper艾克赛 source (WeChat, CN)

The paper argues that language and vision foundation models succeeded because internet data is natively digital and densely labeled; "physical text" — human manipulation videos, motion capture, factory workflows — is abundant but lacks action labels, force signals, task semantics, and reward structure, making it unfit for direct consumption by robot policies. The authors reposition VLAs as a single policy interface within a larger "physical intelligence stack" and identify four missing components: a data interface for automatically annotating unstructured behavior; an embodiment interface for retargeting human actions to robot morphologies; a world model interface for physical 3D reasoning; and a reward interface for inferring task progress and success from video and language. Together these form a self-improving deployment loop in which every physical experience — including failures — becomes a supervisory signal. This is a conceptual position paper and contains no concrete algorithmic implementation.
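Since the paper itself offers no implementation, the following is only one reading of how the four interfaces could compose into a single grounding step. Every class, method name, and toy behavior here is hypothetical and chosen for illustration; none of it comes from the authors.

```python
from dataclasses import dataclass
from typing import List, Protocol, Tuple


@dataclass
class Supervision:
    task: str
    robot_actions: List[str]
    predicted_outcome: str
    reward: float


class DataInterface(Protocol):
    def annotate(self, raw_clip: str) -> Tuple[str, List[str]]: ...


class EmbodimentInterface(Protocol):
    def retarget(self, human_actions: List[str]) -> List[str]: ...


class WorldModelInterface(Protocol):
    def predict(self, robot_actions: List[str]) -> str: ...


class RewardInterface(Protocol):
    def score(self, task: str, outcome: str) -> float: ...


def ground(raw_clip: str, data: DataInterface, body: EmbodimentInterface,
           world: WorldModelInterface, reward: RewardInterface) -> Supervision:
    """Turn one unlabeled clip into a supervision record via the four interfaces."""
    task, human_actions = data.annotate(raw_clip)      # data interface
    robot_actions = body.retarget(human_actions)       # embodiment interface
    outcome = world.predict(robot_actions)             # world model interface
    return Supervision(task, robot_actions, outcome,
                       reward.score(task, outcome))    # reward interface


# Toy stand-ins so the sketch runs end to end.
class ToyData:
    def annotate(self, raw_clip: str) -> Tuple[str, List[str]]:
        task, steps = raw_clip.split(":")
        return task.strip(), [s.strip() for s in steps.split(",")]


class ToyBody:
    def retarget(self, human_actions: List[str]) -> List[str]:
        return ["robot_" + a for a in human_actions]


class ToyWorld:
    def predict(self, robot_actions: List[str]) -> str:
        return "done" if robot_actions else "no-op"


class ToyReward:
    def score(self, task: str, outcome: str) -> float:
        return 1.0 if outcome == "done" else 0.0


sample = ground("open drawer: reach, pull", ToyData(), ToyBody(), ToyWorld(), ToyReward())
```

The point of the sketch is only the shape of the pipeline: the policy (the VLA) would consume `Supervision` records, and the other four pieces decide what those records contain.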

VLA Survey: The Bottleneck Is Not Just the Model — It's Datasets, Benchmarks, and Data Engines · benchmark

A thorough mapping of the three infrastructure pillars underpinning VLAs from 2023–2025, with a clear diagnosis that existing benchmarks are largely limited to short tabletop tasks and under-evaluate long-horizon performance and error recovery — useful reading for anyone doing evaluation or model selection.

Survey · Commentary: 具身智能与空间感知 source (WeChat, CN)

The survey systematically reviews the three infrastructure pillars supporting VLA training and evaluation — datasets, benchmarks, and data engines — and notes that existing benchmarks typically measure capability via success rate while rarely covering long-horizon tasks, multi-step composition, cross-scene generalization, or error recovery, which are precisely the capabilities that matter most for real-world deployment.
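As a rough illustration of the gap the survey describes, the sketch below reports two extra numbers alongside plain success rate: average progress through a multi-step task and the share of errors that were recovered from. The field and metric names are assumptions for this example, not definitions taken from the survey or from any specific benchmark.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Episode:
    subtasks_done: int    # subtasks completed before the episode ended
    subtasks_total: int   # subtasks in the full long-horizon task
    errors: int           # failures observed during the episode
    recoveries: int       # failures the policy recovered from


def summarize(episodes: List[Episode]) -> Dict[str, Optional[float]]:
    n = len(episodes)
    # Binary success: every subtask finished.
    success = sum(e.subtasks_done == e.subtasks_total for e in episodes) / n
    # Partial credit: mean fraction of subtasks completed.
    progress = sum(e.subtasks_done / e.subtasks_total for e in episodes) / n
    # Error recovery: recovered failures over all failures (None if no failures).
    total_errors = sum(e.errors for e in episodes)
    recovery = (sum(e.recoveries for e in episodes) / total_errors
                if total_errors else None)
    return {"success_rate": success, "mean_progress": progress,
            "recovery_rate": recovery}


report = summarize([
    Episode(subtasks_done=4, subtasks_total=4, errors=1, recoveries=1),
    Episode(subtasks_done=2, subtasks_total=4, errors=2, recoveries=0),
])
```

Two policies with the same success rate can differ sharply on the other two numbers, which is why a success-rate-only leaderboard says little about deployment readiness.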

Unified World Model Survey: Decomposing the Functional Modules from "Understanding" to "Acting" · world-model

Survey · Commentary: 具身智能排行榜 source (WeChat, CN)

The survey breaks modern world models into functional modules — encoder, dynamics prediction, reward/value estimation, and others — and discusses how to unify "understanding the world" and "acting within it" into a single framework as a foundational component for general embodied intelligence.

Other papers today: Fei-Fei Li's "A Functional Taxonomy of World Models" — clarifying the overloaded term "world model" with a unified functional classification source (WeChat, CN); "Granger Causal Discovery" for time series moves toward world models, emphasizing causality over pure prediction source (WeChat, CN); Nanyang Technological University introduces a 3D generative model with physics simulation support, with generated assets directly deployable in robot training source.

Open Source · Tools · Benchmarks

· DreamX-World 1.0: AMAP (Chinese mapping and navigation platform) has released an open-source version of its general interactive world model, with the official account claiming 16 FPS real-time generation (the model was originally published in mid-June; this release is the open-source version) source (WeChat, CN).

II. Funding & Deals

Theker (Spain) | Series A | €73 million (~$85 million) · industrial

Led by CRV, with participation from Samsung, LVMH (via Aglaé Ventures), Cathay Innovation, 20VC, and Henkel Ventures; existing investor Inditex (parent of Zara) added to its stake. This is the largest Series A in European robotics history, and it also marks CRV's first investment in Spain and the first bets by Samsung and LVMH on a Spanish startup. Founded in 2022, Theker applies AI, computer vision, and deep learning to automate tasks in industrial environments; the funds will accelerate deployment with major industrial customers and expand the software and hardware teams. While robotics funding in China remains heavily concentrated on humanoid embodied AI, this deal signals that European industrial capital (fashion retail and consumer electronics) is bypassing the hardware platform race and investing directly in the AI software layer for industrial automation. Source: 六观阿尔法 source (WeChat, CN)

LISSOME (China) | Series A | tens of millions of RMB · adjacent

Led by Sequoia China and Brizan Ventures, with follow-on from existing investors and HKX. LISSOME positions itself as an AI kitchen robot company, having previously entered the consumer market with a compact "small, fast, clean" capsule dishwasher. Top-tier fund backing for a consumer kitchen robotics play reflects embodied AI deployment expanding beyond factories and logistics into high-frequency household use cases. Source: 硬氪 source (WeChat, CN)

Feikuo Technology (Hangzhou, China) | Series A | hundreds of millions of RMB · embodied

Led by Cybernaut Investment (Chinese PE firm), with co-investment from Dangvirtual Technology, Meigao Intelligent, and Shengao Technology. The company develops embodied intelligence solutions for robotics, and is one example in a wave of recent fundraising concentrated in China's robot "brain/solution" layer. Source: 高工人形机器人 source (WeChat, CN)

ANSCER Robotics (India) | Series A | $5.4 million · industrial

Funds will be used to scale its hybrid autonomous mobile robot platform for industrial material handling and expand into North American and global markets. This Indian AMR manufacturer differentiates itself through on-site integration capabilities that combine the flexibility of AMRs with the reliability of AGVs. Source: 中叉网 source (WeChat, CN)

Weekly Capital Summary · embodied

According to industry media aggregates, 15 embodied robotics companies in China collectively raised over RMB 6 billion in the past week. Laifu Harmonic (Chinese harmonic reducer manufacturer) passed its Hong Kong Stock Exchange listing hearing; the marine robotics firm Shihang Intelligent's Series A of over RMB 1 billion and Daka Robotics' latest round of several hundred million RMB were reported previously. Source: 高工机器人 source (WeChat, CN)

III. Commercial Deployment

Amazon's Next-Generation Warehouse Robot Understands Natural Language Commands · industrial

Amazon has reportedly introduced a new warehouse robot capable of understanding everyday spoken instructions, enabling pick-and-place operations via natural language without per-task preprogramming, a step toward integrating language commands directly into warehouse operations. Source: warpnews source

Cainiao ZeeBot Climbing Warehouse Robots Deployed Across Global Fulfillment Network · industrial

Cainiao (Alibaba's logistics arm) is deploying its ZeeBot climbing robots, which operate in the vertical space of storage racks, to transform picking operations across its global warehouse network. This sets them apart from traditional AGVs and AMRs, which move only in two dimensions on the floor. Source: Pandaily source

Spain's Mercadona Opens Semi-Automated Warehouse: 70 Robots, ~5,000 Orders Per Day · industrial

Spain's largest supermarket chain Mercadona has opened a roughly 32,000-square-meter semi-automated "hive" warehouse in Vallecas, deploying 70 robots in a "goods-to-person" model that brings order processing capacity to approximately 5,000 orders per day. This is another example of a retailer building its own automated fulfillment center. Source: CPG Click source

Tesla Austin Robotaxi: Fewer Than 60 Vehicles in Operation, Long Wait Times · autonomy ⚠️ Operational status

Tesla's robotaxi fleet in Austin reportedly numbers fewer than 60 vehicles, with passengers facing long wait times. Measured against Tesla's stated expansion ambitions, the actual operational capacity of its driverless taxi service remains at an early stage, a useful reality check on the currently overheated robotaxi narrative. Source: MSN source

Volkswagen ID. Buzz Robotaxi Hits Los Angeles Streets · autonomy

Volkswagen's autonomous taxi based on the ID. Buzz has begun testing and operating on Los Angeles roads, continuing the established automaker strategy of using production vehicle platforms as the basis for robotaxi entry. Source: InsideEVs source

IV. Industry Developments

BAAI: None of Today's "World Models" Are Real — and We're Attacking the Problem Directly · world-model ⚠️ Institutional opinion

The Beijing Academy of Artificial Intelligence (BAAI, Chinese AI research institute) has publicly stated that none of the current "world models" qualify as genuine world models, including those that have topped leaderboards and been called next-generation AI, and that the institute intends to tackle the problem head-on. At a time when the "world model" label is broadly inflated and products are crowding benchmark tables, this is a dissenting view; it points toward physical consistency and causal reasoning rather than visually convincing generation. Source: 冰镇车厘子 source (WeChat, CN)

Unitree Robotics (Chinese humanoid robot maker) IPO Approved; Prospectus Reveals Profitability and Scale Data · humanoid

Following approval of Unitree's STAR Market IPO (targeting ~RMB 4.2 billion in proceeds, previously reported), the prospectus discloses further details: 5,500 humanoid robots shipped in 2025, revenue of RMB 1.699 billion, year-on-year growth of 335%, and net profit of RMB 238 million. That makes Unitree one of the few profitable players in an industry that broadly burns cash. The overall valuation is estimated at approximately RMB 42 billion. Institutions including Nomura are discussing its cost control and scale-up trajectory on the basis of these figures. Source: 调研纪要更新 source (WeChat, CN)
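For readers who want to sanity-check the figures, here is a quick back-of-the-envelope calculation. It assumes the 335% year-on-year growth refers to revenue (the summary does not say so explicitly) and uses the proceeds and valuation numbers exactly as quoted; the derived values are rough implications, not disclosed figures.

```python
# Figures as quoted in the summary above, in RMB billions.
revenue_2025 = 1.699
net_profit_2025 = 0.238
yoy_growth = 3.35      # 335%, ASSUMED here to describe revenue growth
ipo_proceeds = 4.2
valuation = 42.0

# Growth of 335% means 2025 revenue is (1 + 3.35) times the prior year's.
implied_revenue_2024 = revenue_2025 / (1 + yoy_growth)

# Net margin: net profit as a share of revenue.
net_margin = net_profit_2025 / revenue_2025

# Targeted proceeds relative to the estimated valuation.
proceeds_to_valuation = ipo_proceeds / valuation

print(f"Implied 2024 revenue: RMB {implied_revenue_2024:.2f} billion")
print(f"Net margin: {net_margin:.1%}")
print(f"Proceeds as share of valuation: {proceeds_to_valuation:.0%}")
```

Under that assumption the prior-year revenue works out to roughly RMB 0.39 billion, the net margin to about 14%, and the targeted raise to about 10% of the estimated valuation.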

LG Executives Visit U.S. to Meet with Nvidia, Deepening AI and Robotics Cooperation · industrial ⚠️ Meeting readout

South Korean media report that an LG executive delegation traveled to the U.S. for talks with Nvidia officials on AI and robotics, as a follow-up to the meeting between LG Chairman Koo Kwang-mo and Nvidia CEO Jensen Huang two weeks prior, with a focus on manufacturing AI and robotization. Specific cooperation details have not been disclosed. Source: The Korea Herald source

VinFast Partners with Nvidia and Autobrains on Southeast Asia Robotaxi · autonomy ⚠️ Partnership announcement

Vietnamese automaker VinFast has announced a partnership with Nvidia and Autobrains to advance robotaxi services in Southeast Asian markets, another example of an emerging-market automaker leveraging chip and algorithm partners to enter autonomous mobility. The arrangement currently remains a cooperation framework. Source: MSN source

Sutton and Carmack Collaborate: Let Robots "Play Games" in the Real World for Reinforcement Learning · world-model

Reinforcement learning pioneer Richard Sutton and game technology legend John Carmack are collaborating on a vision of agents and robots learning continuously through game-like interaction in the real world, betting on online reinforcement learning rather than pure offline large-scale data. This is a different technical direction from the mainstream VLA scaling approach. Source: Sohu source

Zhiyuan Robotics (Chinese humanoid startup) Sets RMB 10 Billion Revenue Target for 2027, Opens World's First Flagship Store in Shanghai · humanoid ⚠️ Company guidance

Zhiyuan Robotics has publicly announced a RMB 10 billion revenue target for 2027 and opened its first brick-and-mortar store worldwide in Shanghai. The revenue target is a strategic statement that only actual deliveries can validate; the physical store is an early channel test for reaching consumer-facing audiences. Source: 1fyuhhh source (WeChat, CN)

Hardware · Supply Chain

· Dexterous Hand Pricing and Capacity: Linker Hand (Chinese dexterous hand maker) claims its mass-production price has fallen to the RMB 10,000 range per unit, down roughly 99% from previous levels, with a 2026 delivery target of 50,000–100,000 units; Yinshi Robotics says its annual dexterous hand shipments have already exceeded 10,000 units. ⚠️ Vendor claims; delivery targets and cost reductions remain to be borne out in production source (WeChat, CN) source (WeChat, CN).

· GaN Chips for Robotics Scaling Up: ON Semiconductor has launched its GaNEXUS platform targeting AI data centers and robotics; Infineon is expanding into robotics GaN; Chinese GaN magnetic encoder chips for robotics are also entering mass production — gallium nitride is increasingly penetrating robot joint and drive applications source source source.

· Coreless Motors: Fengzhao Technology (Chinese motor controller firm) and Sanhua Holdings (Chinese thermal management manufacturer) have formed a joint venture to develop coreless motors and recently announced a joint technology breakthrough targeting cost reduction and localization of coreless motors for humanoid robot rotary joints source (WeChat, CN).

· Sona Comstar Enters Robotics Components: Indian automotive components manufacturer Sona Comstar has announced a move into robotics parts manufacturing — another instance of traditional automotive supply chains migrating toward humanoid robotics source.