AI Research Trends 

VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning

This work introduces VIKI-Bench, a hierarchical benchmark for embodied multi-agent cooperation, along with VIKI-R, a framework that trains agents to cooperate via reinforcement learning and significantly improves performance across diverse tasks.

Cosmos-Drive-Dreams: Scalable Synthetic Driving Data Generation with World Foundation Models

The paper introduces Cosmos-Drive-Dreams, a synthetic data generation pipeline built on world foundation models that sidesteps the difficulty of collecting diverse real-world driving data by generating high-fidelity driving scenarios for autonomous vehicle training.

AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions

AbstentionBench provides a large-scale benchmark for evaluating whether LLMs correctly abstain from answering unanswerable questions, revealing that models, reasoning-focused ones included, frequently fail to do so.

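For a sense of what such an evaluation involves, here is a minimal sketch of abstention scoring: flag responses that refuse to answer, then measure how often a model abstains on questions labeled unanswerable. The refusal-phrase heuristic is an illustrative assumption; AbstentionBench's actual judging protocol is described in the paper.

```python
# Minimal abstention-scoring sketch (illustrative; not AbstentionBench's judge).
REFUSAL_MARKERS = [
    "i don't know", "cannot be determined", "not enough information",
    "unanswerable", "i'm not sure",
]

def is_abstention(response: str) -> bool:
    """Heuristic: treat a response as an abstention if it contains a refusal phrase."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def abstention_recall(responses, labels):
    """Fraction of unanswerable questions (label=True) on which the model abstained."""
    unanswerable = [r for r, lab in zip(responses, labels) if lab]
    if not unanswerable:
        return float("nan")
    return sum(is_abstention(r) for r in unanswerable) / len(unanswerable)

# Toy usage: two unanswerable questions, one correct abstention.
responses = ["The answer is 42.", "There is not enough information to say."]
labels = [True, True]  # both questions are unanswerable
print(abstention_recall(responses, labels))  # 0.5
```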

FZOO: Fast Zeroth-Order Optimizer for Fine-Tuning Large Language Models towards Adam-Scale Speed

FZOO is a fast zeroth-order optimizer that brings fine-tuning of large language models to Adam-scale speed by sharply reducing the number of forward passes needed to converge.

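Zeroth-order methods estimate gradients from forward passes alone, which is what makes them attractive when backpropagation through a large model is too costly. As background, here is a minimal sketch of the classic two-point (SPSA-style) estimator these methods build on; FZOO's specific techniques for cutting the number of forward passes are not reproduced here.

```python
import numpy as np

def spsa_grad(loss_fn, theta, eps=1e-3, rng=None):
    """Two-point zeroth-order gradient estimate: forward passes only, no backprop.
    g ~= [L(theta + eps*u) - L(theta - eps*u)] / (2*eps) * u, with u ~ N(0, I)."""
    rng = rng or np.random.default_rng()
    u = rng.standard_normal(theta.shape)
    delta = loss_fn(theta + eps * u) - loss_fn(theta - eps * u)
    return (delta / (2 * eps)) * u

# Toy usage: minimize a quadratic with plain zeroth-order SGD.
rng = np.random.default_rng(0)
theta = rng.standard_normal(10)
loss = lambda t: float(np.sum(t ** 2))
for _ in range(500):
    theta -= 0.05 * spsa_grad(loss, theta, rng=rng)
print(round(loss(theta), 4))  # close to 0
```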

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Router-R1 presents a reinforcement learning framework that teaches an LLM to route user queries across multiple candidate models over several rounds and aggregate their responses, demonstrating improved performance on multi-hop QA benchmarks.

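As a toy illustration of the core idea, the sketch below trains a softmax routing policy with REINFORCE to pick which of several candidate models should handle a query. The query features, reward function, and single-round setup are all illustrative simplifications of Router-R1's multi-round formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_models = 4, 3
W = np.zeros((n_features, n_models))  # routing policy parameters

def route(x):
    """Sample a model index from a softmax policy over query features x."""
    logits = x @ W
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(n_models, p=p), p

def reward(x, model_idx):
    """Stand-in reward: model i is 'best' on queries whose i-th feature dominates."""
    return 1.0 if model_idx == int(np.argmax(x[:n_models])) else 0.0

lr = 0.5
for _ in range(2000):
    x = rng.random(n_features)
    a, p = route(x)
    r = reward(x, a)
    grad_logits = -p          # REINFORCE: grad of log pi is one_hot(a) - p
    grad_logits[a] += 1.0
    W += lr * r * np.outer(x, grad_logits)

# After training, the router mostly picks the best model for each query.
```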

Learning to Reason Across Parallel Samples for LLM Reasoning

This study proposes the Sample Set Aggregator (SSA), a model trained to aggregate the answers from multiple sampled outputs into a single final answer, improving reasoning accuracy in language models.

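SSA is a trained aggregator: it reads a set of sampled solutions and outputs a final answer. The simplest reference point is self-consistency, which just majority-votes over the samples' extracted answers; a minimal sketch of that baseline:

```python
from collections import Counter

def majority_vote(answers):
    """Self-consistency baseline: return the most common extracted answer.
    SSA instead feeds the sampled outputs to a trained aggregator model,
    which can recover a correct answer even when it is not the majority."""
    counts = Counter(a for a in answers if a is not None)
    return counts.most_common(1)[0][0] if counts else None

# Toy usage: five sampled chains of thought, with extracted final answers.
samples = ["42", "41", "42", None, "42"]
print(majority_vote(samples))  # "42"
```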

When Simple Model Just Works: Is Network Traffic Classification in Crisis?

This paper investigates the efficacy of a simple k-NN baseline for network traffic classification, revealing issues with current practices and redundancy in labeled datasets.

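The baseline really is that simple; here is a sketch with scikit-learn, where the per-flow features and labels are random stand-ins for the statistical flow features (packet sizes, inter-arrival times, and the like) that real traffic datasets provide.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative stand-in for per-flow features with application labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 20))          # 1000 flows, 20 statistical features
y = rng.integers(0, 5, size=1000)   # 5 application classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
clf.fit(X_tr, y_tr)
print(f"accuracy: {clf.score(X_te, y_te):.3f}")  # ~chance on this random data
```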

TokenBreak: Bypassing Text Classification Models Through Token Manipulation

TokenBreak introduces an attack that bypasses text classification models by exploiting their tokenization strategy, exposing a vulnerability in systems that rely on such models for protection.

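The failure mode is easy to reproduce in miniature: a classifier that keys on token identities can miss a trigger word when a single inserted character changes how the tokenizer splits it. The toy WordPiece-style vocabulary and keyword filter below are invented for illustration; the paper's actual perturbation strategies and target models differ.

```python
def wordpiece(word, vocab):
    """Greedy longest-match WordPiece-style tokenization of a single word."""
    pieces, i = [], 0
    while i < len(word):
        prefix = "" if i == 0 else "##"
        for j in range(len(word), i, -1):
            piece = prefix + word[i:j]
            if piece in vocab:
                pieces.append(piece)
                i = j
                break
        else:
            return ["[UNK]"]
    return pieces

VOCAB = {"you", "win", "the", "lottery", "x", "##lot", "##tery"}
BLOCKLIST = {"lottery"}  # toy filter: flags the input if the trigger token appears

def flagged(text):
    return any(p in BLOCKLIST for word in text.lower().split()
               for p in wordpiece(word, VOCAB))

print(flagged("you win the lottery"))   # True  -> token "lottery" is produced
print(flagged("you win the xlottery"))  # False -> splits as "x", "##lot", "##tery"
```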

LoRMA: Low-Rank Multiplicative Adaptation for LLMs

LoRMA proposes a new approach to low-rank adaptation in language models, shifting from additive to multiplicative updates to enhance training efficiency and performance.

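To make the additive-versus-multiplicative distinction concrete, here is a numpy sketch contrasting LoRA's update, W' = W + BA, with one natural multiplicative form, W' = (I + BA)W. This form is illustrative; LoRMA's exact parameterization may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                      # hidden size, adapter rank (r << d)
W = rng.standard_normal((d, d))   # frozen pretrained weight

# LoRA (additive): W' = W + B @ A, with only A and B trained.
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))              # zero-init so W' == W at the start
W_additive = W + B @ A

# Multiplicative variant: W' = (I + B @ A) @ W, so the low-rank factor
# rescales and mixes W's rows instead of adding an independent correction.
# (Illustrative form; LoRMA's exact parameterization is in the paper.)
W_multiplicative = (np.eye(d) + B @ A) @ W

# Both schemes start as the identity update: W' == W before training,
# and either can be merged into a single matrix at inference time.
assert np.allclose(W_additive, W) and np.allclose(W_multiplicative, W)
```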