
AI Daily · April 20, 2026

Eight notable developments: a new language model architecture reframes generation as algebraic constraint solving; a multi-institution study traces SFT-induced hallucinations to semantic interference and mitigates them with self-distillation; Anthropic's Claude Opus 4.7 tokenizer changes cost economics; Physical Intelligence releases pi_0.7; DeepSeek pursues its first external capital at a $100B valuation; a humanoid robot breaks the human half-marathon world record; NBER finds nearly 90% of executives report zero AI productivity impact; and Vercel suffers a supply-chain breach via a third-party AI analytics tool.

1. DALM Replaces Flat Token Decoding with Algebraically Constrained Structured Generation Over a Domain Lattice.

A new paper posted to arXiv (cs.CL, April 17, 2026) introduces DALM, the Domain-Algebraic Language Model — a reframing of language generation that abandons unconstrained token-space decoding in favor of structured denoising over a mathematically defined domain lattice. The framework requires three ingredients: a lattice of domains with computable meet, join, and implication operations; a typing function over relations controlling inheritance across domains; and a fiber partition that localizes knowledge to domain-specific subsets. Given these, DALM follows a three-phase generation path — resolving domain uncertainty first, then relation uncertainty, then concept uncertainty — so each stage operates under explicit algebraic constraints. Cross-domain contamination is structurally prevented in closed-vocabulary mode and auditably bounded in open-vocabulary mode. A single query can produce a domain-indexed multi-perspective answer space. The framework is instantiated with CDC knowledge representation on crystal libraries, with training and evaluation conducted on validated domain-annotated datasets. The work reframes language generation as a domain-algebraic problem rather than a flat token prediction problem, a departure from the standard autoregressive paradigm.
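As a toy illustration of the required algebra, consider a powerset (Boolean) lattice over domain tags, where meet is intersection, join is union, and implication takes its Boolean form; a fiber partition then maps each domain to the concepts decodable inside it. All names and the example data here are illustrative assumptions, not the paper's notation:

```python
# Hypothetical sketch of DALM's three ingredients on a powerset lattice.
# Domains are frozensets of tags; the universe fixes the top element.
UNIVERSE = frozenset({"chem", "bio", "law"})

def meet(a, b):
    # Greatest lower bound: the domain scope shared by both.
    return a & b

def join(a, b):
    # Least upper bound: the combined domain scope.
    return a | b

def implies(a, b):
    # Implication in a Boolean lattice: (complement of a) joined with b.
    return (UNIVERSE - a) | b

# Fiber partition (made-up data): knowledge localized to domain subsets.
FIBERS = {
    frozenset({"chem"}): {"benzene", "pKa"},
    frozenset({"bio"}): {"ribosome"},
}

def allowed_concepts(domain):
    """Closed-vocabulary mode: only concepts whose fiber lies within the
    resolved domain are decodable, structurally blocking contamination."""
    return set().union(*(v for k, v in FIBERS.items() if k <= domain))
```

Resolving the domain first and only then decoding concepts from its fibers mirrors the paper's coarse-to-fine uncertainty reduction.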

Source: arXiv cs.CL | 2026-04-17


2. Supervised Fine-Tuning Induces Hallucinations via Overlapping Semantic Interference; Self-Distillation Mitigates Without Architectural Changes.

A multi-institution study (Guy Kaplan, Zorik Gekhman, Zhen Zhu, Lotem Rozner, Yuval Reif, Swabha Swayamdipta, Derek Hoiem, Roy Schwartz; arXiv cs.CL, April 16, 2026) identifies the root cause of SFT-induced hallucinations as interference among overlapping semantic representations — not capacity limits, not behavior cloning, but localized semantic interference. When models are fine-tuned on new factual information, overlapping knowledge from pre-training interferes with the updated weights, producing confident but incorrect outputs. The researchers tested three hypotheses — capacity limitations, behavior cloning, and localized interference — and found localized interference to be the dominant driver. Two mitigation strategies emerged from continual learning literature: a self-distillation-based SFT method that regularizes output-distribution drift to preserve pre-existing knowledge while absorbing new facts, and freezing parameter groups to suppress factual plasticity when new knowledge acquisition is unnecessary. Both approaches reduce hallucinations without modifying model architecture. The finding matters because SFT is the standard approach for adapting frontier models to domain-specific tasks, and the interference mechanism explains why domain fine-tuning so often degrades general capability.
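The self-distillation idea can be sketched in a few lines: alongside the standard cross-entropy on the new fact, penalize divergence between the fine-tuned model's output distribution and the frozen pre-SFT distribution. This is a minimal illustration with an additive KL regularizer weighted by `lam`; the function names and exact loss form are assumptions, not taken from the paper:

```python
import math

def cross_entropy(p_student, target_idx):
    # Standard SFT term: negative log-probability of the new fact's token.
    return -math.log(p_student[target_idx])

def kl_divergence(p_teacher, p_student):
    # Drift of the student's output distribution away from the frozen
    # pre-SFT teacher; zero when the distributions coincide.
    return sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)

def self_distill_sft_loss(p_student, p_teacher, target_idx, lam=0.5):
    # Fit the new fact while regularizing output-distribution drift,
    # preserving pre-existing knowledge without architectural changes.
    return cross_entropy(p_student, target_idx) + lam * kl_divergence(p_teacher, p_student)
```

With `lam = 0` this collapses to plain SFT; larger values trade new-fact fitting against retention of pre-training knowledge.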

Source: arXiv cs.CL | 2026-04-17


3. Claude Opus 4.7's New Tokenizer Inflates Token Counts by Up to 1.46x, Making the Model ~40% More Expensive at Identical Pricing.

Simon Willison's empirical analysis (April 20, 2026) documents that Claude Opus 4.7 — released by Anthropic on April 16, 2026 — is the first Claude model to ship a changed tokenizer. The same input text maps to significantly more tokens under the new tokenizer than under Opus 4.6. Willison's measurements: the Opus 4.7 system prompt consumed 7,335 tokens versus 5,039 under 4.6, a 1.46x inflation. For a 3.7MB high-resolution PNG image (3,456×2,234 pixels), Opus 4.7 reported 4,744 tokens versus 1,578 for 4.6 — a 3.01x increase, attributable to the model's new 3.75-megapixel maximum image resolution (3x the prior limit). For a 15MB 30-page text-heavy PDF, the multiplier was 1.08x. Despite the token inflation, Anthropic maintained identical pricing: $5 per million input tokens and $25 per million output tokens. The effective cost increase for typical text workloads is approximately 40%. The Opus 4.7 system prompt also adds Claude in PowerPoint as a named tool, expands child-safety conversation-level caution flags, introduces a tool_search mechanism requiring the model to check for available tools before claiming it lacks a capability, and adds anti-screenshot-attack guidance allowing Claude to decline yes/no answers on contested issues. The knowledge cutoff now reflects January 2026.
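Willison's system-prompt numbers make the cost arithmetic easy to check: with per-token pricing unchanged, the effective cost multiplier for a workload is simply its token-count multiplier. A quick sketch using the figures from the article:

```python
# Pricing is identical across Opus 4.6 and 4.7 (from the article).
PRICE_PER_M_INPUT = 5.00  # USD per million input tokens

def input_cost(tokens):
    # Cost of an input of the given token count at the stated price.
    return tokens / 1_000_000 * PRICE_PER_M_INPUT

# Same system prompt, two tokenizers (Willison's measurements).
opus_46_tokens, opus_47_tokens = 5_039, 7_335

multiplier = opus_47_tokens / opus_46_tokens  # token (= cost) inflation
```

Here the multiplier is about 1.46x for the system prompt specifically; the article's ~40% figure for typical text workloads reflects a somewhat lower average inflation across varied inputs.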

Source: Simon Willison | 2026-04-20


4. Physical Intelligence's pi_0.7 Achieves RL-Finetuned-Level Robotic Performance Out of the Box via Diverse Context Conditioning.

pi_0.7, released by Physical Intelligence on April 16, 2026 (arXiv cs.LG, 90+ authors including Chelsea Finn, Sergey Levine, Brian Ichter), is a steerable generalist robotic foundation model that achieves strong out-of-the-box performance across a wide range of manipulation and locomotion tasks without any per-task fine-tuning. The core innovation is diverse context conditioning: in addition to a language command describing what to do, the model is conditioned on multimodal metadata including task performance signals and subgoal images that describe the manner and strategy of execution. This enables pi_0.7 to absorb diverse training data including suboptimal autonomous data, failure data, and non-robot sources — inputs that standard RL pipelines exclude. Demonstrated capabilities include zero-shot cross-embodiment generalization (folding laundry without having seen the task), and espresso machine operation at a level matching specialized RL-finetuned models. The 0.7 release continues Physical Intelligence's progression toward a generalist robotic policy that subsumes multiple specialized RL workflows through prompt-based steering alone.

Source: arXiv cs.LG | 2026-04-17


5. DeepSeek Seeks First External Funding at a $100B Valuation, Targeting at Least $300M Raise.

DeepSeek, the Chinese AI laboratory behind the R1 and V3 models that disrupted the global AI landscape in early 2025, is engaging investors for the first time in its history. Two sources familiar with the matter told 36Kr (April 20, 2026) that DeepSeek is seeking to raise at least $300 million at a target valuation of at least $100 billion (approximately CNY 681.77 billion). The company had previously declined investment overtures from leading Chinese venture capital firms and technology companies. A spokesperson at a large state-backed equity firm described the fundraising reports as likely accurate, noting DeepSeek had become effectively inaccessible to external investors. The capital raise is intended to fund the high cost of training and developing frontier AI models, where compute requirements have grown substantially. DeepSeek's open-source model releases have repeatedly forced recalibrations of the competitive landscape, and the fundraising — if completed at the reported valuation — would make it one of the most valuable private AI companies globally.

Source: 36Kr | 2026-04-20


6. A Humanoid Robot Beats the Human Half-Marathon World Record, While Another Top Contender Falls During the Race.

At the 2026 Beijing Yizhuang Humanoid Robot Half-Marathon on April 19, 2026, the robot "闪电" (Lightning), fielded by the Honor Incredible Monkey team, completed the 21.0975 km course in 50 minutes and 26 seconds, beating the human half-marathon world record of 56 minutes and 42 seconds. The robot features Honor's self-developed liquid-cooling thermal management system with a heat-exchange flow rate exceeding 4 liters per minute, integrated joint modules with 400 Nm of peak torque, and a high-dynamics motion control algorithm that fuses multiple sensors for real-time terrain adaptation. In the same event, Unitree's H1 robot, considered a top contender after qualifying runs on April 16 in which its pace proportionally exceeded the human 1500m world record, fell during the race and was carried off on a stretcher. The contrast between Lightning's record-setting run and H1's failure under race conditions highlights how widely stability, durability, and real-world robustness still diverge across humanoid robot platforms, even among top-tier designs.
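The reported times translate into average speeds as follows (a simple sanity check on the source's figures, not a calculation from the article itself):

```python
COURSE_KM = 21.0975  # half-marathon distance

def avg_speed_kmh(minutes, seconds):
    # Convert a finish time to average speed over the full course.
    hours = (minutes * 60 + seconds) / 3600
    return COURSE_KM / hours

robot_speed = avg_speed_kmh(50, 26)  # Lightning's 50:26 → ~25.1 km/h
human_speed = avg_speed_kmh(56, 42)  # human world record 56:42 → ~22.3 km/h
```

So the robot's reported run works out to roughly 2.8 km/h faster, sustained for the full 21.1 km.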

Source: 36Kr | 2026-04-19


7. NBER Study Finds Nearly 90% of 6,000 Executives Report Zero AI Impact on Employment or Productivity Over Three Years.

A National Bureau of Economic Research study surveyed 6,000 executives — CEOs, CFOs, and senior leaders — across the United States, United Kingdom, Germany, and Australia, finding that nearly 90% reported AI had produced no measurable impact on employment or productivity over the preceding three years. Two-thirds of respondents said they used AI, but average usage was approximately 1.5 hours per week, and one quarter reported no workplace AI use at all. Corporate AI investments exceeded $250 billion in 2024, yet Apollo chief economist Torsten Slok observed that "AI is everywhere except in the incoming macroeconomic data." A Boston Consulting Group study found that productivity gains were achievable when using three or fewer AI tools, but that using four or more tools produced a self-reported productivity collapse — a phenomenon labeled "AI brain fry." MIT Nobel laureate Daron Acemoglu estimated AI would drive only a 0.5% productivity increase over the next decade, describing the outcome as "disappointing relative to industry promises." The one dissenting data point: Erik Brynjolfsson attributed a 2.7% U.S. productivity jump last year to AI transition, suggesting the aggregate impact may be beginning to materialize unevenly.

Source: VentureBeat / Fortune | 2026-04-19


8. Vercel Discloses Supply-Chain Breach via Compromised Context.ai Google Workspace OAuth, 580 Employee Records Exposed.

Vercel confirmed a security incident on April 19, 2026, after a threat actor claiming affiliation with ShinyHunters posted stolen data on a hacking forum. The initial access vector was a breach at Context.ai, a third-party AI analytics platform: a Vercel employee's Google Workspace account was compromised through Context.ai's OAuth application, then escalated into Vercel's internal environments. The attacker accessed environment variables not flagged as sensitive — Vercel encrypts sensitive variables at rest by default but leaves non-sensitive ones unencrypted. From there, the actor enumerated further access. Stolen data advertised for sale included access keys, source code, database data, internal deployment credentials, and API keys including NPM tokens and GitHub tokens. The attacker published 580 Vercel employee records (names, email addresses, account status, activity timestamps) as proof. Vercel CEO Guillermo Rauch confirmed the attack path and advised Google Workspace administrators to revoke OAuth App ID 110671459871-30f1spbu0hptbs60cb4vsmv79i7bbvqj.apps.googleusercontent.com. Vercel stated that Next.js, Turbopack, and its other open-source projects remain unaffected. This represents the first documented major supply-chain breach in which a third-party AI analytics platform served as the initial entry point into developer infrastructure.

Source: Hacker News / Bleeping Computer | 2026-04-19
