DEV Community

Paperium

Paperium AI Analysis & Review of Latest Scientific Research Articles

Personal website: https://paperium.net
Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall
1 min read
HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application
2 min read
DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking
1 min read
Text or Pixels? It Takes Half: On the Token Efficiency of Visual Text Inputs in Multimodal LLMs
1 min read
Accelerating Vision Transformers with Adaptive Patch Sizes
1 min read
SAVANT: Semantic Analysis with Vision-Augmented Anomaly deTection
1 min read
Machine Text Detectors are Membership Inference Attacks
1 min read
DeLeaker: Dynamic Inference-Time Reweighting For Semantic Leakage Mitigation in Text-to-Image Models
2 min read
What Questions Should Robots Be Able to Answer? A Dataset of User Questions for Explainable Robotics
2 min read
RIR-Mega: a large-scale simulated room impulse response dataset for machine learning and room acoustics modeling
1 min read
See the Text: From Tokenization to Visual Reading
1 min read
When Do Transformers Learn Heuristics for Graph Connectivity?
2 min read
Learning from the Best, Differently: A Diversity-Driven Rethinking on Data Selection
1 min read
ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge
1 min read
Steering Autoregressive Music Generation with Recursive Feature Machines
1 min read
MINED: Probing and Updating with Multimodal Time-Sensitive Knowledge for Large Multimodal Models
2 min read
From Charts to Code: A Hierarchical Benchmark for Multimodal Models
1 min read
NeuroAda: Activating Each Neuron's Potential for Parameter-Efficient Fine-Tuning
1 min read
TheMCPCompany: Creating General-purpose Agents with Task-specific Tools
1 min read
ColorAgent: Building A Robust, Personalized, and Interactive OS Agent
1 min read
OmniNWM: Omniscient Driving Navigation World Models
1 min read
Are they lovers or friends? Evaluating LLMs' Social Reasoning in English and Korean Dialogues
1 min read
KORE: Enhancing Knowledge Injection for Large Multimodal Models via Knowledge-Oriented Augmentations and Constraints
1 min read
Directional Reasoning Injection for Fine-Tuning MLLMs
1 min read
FinSight: Towards Real-World Financial Deep Research
1 min read
Decomposed Attention Fusion in MLLMs for Training-Free Video Reasoning Segmentation
1 min read
olmOCR 2: Unit Test Rewards for Document OCR
1 min read
Unified Reinforcement and Imitation Learning for Vision-Language Models
1 min read
Attention Sinks in Diffusion Language Models
1 min read
Language Models are Injective and Hence Invertible
1 min read
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
1 min read
VideoAgentTrek: Computer Use Pretraining from Unlabeled Videos
1 min read
Towards Faithful and Controllable Personalization via Critique-Post-Edit Reinforcement Learning
2 min read
Human-Agent Collaborative Paper-to-Page Crafting for Under $0.1
1 min read
ImpossibleBench: Measuring LLMs' Propensity of Exploiting Test Cases
1 min read
GigaBrain-0: A World Model-Powered Vision-Language-Action Model
1 min read
DaMo: Data Mixing Optimizer in Fine-tuning Multimodal LLMs for Mobile Phone Agents
1 min read
BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping
1 min read
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning
1 min read
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
1 min read
When Correct Is Not Safe: Can We Trust Functionally Correct Patches Generated by Code Agents?
1 min read
Pruning Overparameterized Multi-Task Networks for Degraded Web Image Restoration
1 min read
PokeeResearch: Effective Deep Research via Reinforcement Learning from AI Feedback and Robust Reasoning Scaffold
2 min read
Static Sandboxes Are Inadequate: Modeling Societal Complexity Requires Open-Ended Co-Evolution in LLM-Based Multi-Agent Simulatio
1 min read
Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD)
1 min read
Unimedvl: Unifying Medical Multimodal Understanding And Generation Through Observation-Knowledge-Analysis
1 min read
Planned Diffusion
1 min read
Expanding the Action Space of LLMs to Reason Beyond Language
1 min read
Any-Depth Alignment: Unlocking Innate Safety Alignment of LLMs to Any-Depth
1 min read
Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views
1 min read
DeepSeek-OCR: Contexts Optical Compression
1 min read
Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution
1 min read
GAS: Improving Discretization of Diffusion ODEs via Generalized Adversarial Solver
1 min read
Efficient Long-context Language Model Training by Core Attention Disaggregation
1 min read
EvoSyn: Generalizable Evolutionary Data Synthesis for Verifiable Learning
1 min read
Extracting alignment data in open models
1 min read
AlphaQuanter: An End-to-End Tool-Orchestrated Agentic Reinforcement Learning Framework for Stock Trading
1 min read
PRISMM-Bench: A Benchmark of Peer-Review Grounded Multimodal Inconsistencies
1 min read
Mono4DGS-HDR: High Dynamic Range 4D Gaussian Splatting from Alternating-exposure Monocular Videos
2 min read
Video Reasoning without Training
1 min read