Academic

Latest AI Research

Stay ahead of the curve with our curated collection of the most impactful Artificial Intelligence research papers.

Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

Agentic web search increasingly faces two distinct demands: deep reasoning over a single target, and structured aggregation across many entities and heterogeneous sources. Current systems struggle on both fronts.

Tue 10 Feb 2026

Authors: Yuxuan Huang, Yihang Chen, Zhiyuan He, Yuxiang Chen, Ka Yiu Lee, Huichi Zhou, Weilin Luo, Meng Fang, Jun Wang

Toward Personalized Digital Twins for Cognitive Decline Assessment: A Multimodal, Uncertainty-Aware Framework

Cognitive decline is highly heterogeneous across individuals, which complicates prognosis, trial design, and treatment planning. We present the Personalized Cognitive Decline Assessment Digital Twin (PCD-DT), a multimodal and uncertainty-aware framework for modeling patient-specific disease trajectories from sparse, noisy, and irregular longitudinal data.

Mon 9 Feb 2026

Authors: Bulent Soykan, Gulsah Hancerliogullari Koksalmis, Hsin-Hsiung Huang, Laura J. Brattain

Evaluating TabPFN for Mild Cognitive Impairment to Alzheimer's Disease Conversion in Data Limited Settings

Accurate prediction of conversion from Mild Cognitive Impairment (MCI) to Alzheimer's Disease (AD) is essential for early intervention; however, reliable conversion prediction models are difficult to develop because longitudinal data are limited. We evaluate TabPFN (Tabular Pre-Trained Foundation Network) against traditional machine learning methods for predicting 3-year MCI-to-AD conversion using the TADPOLE dataset derived from ADNI. Using multimodal biomarker features extracted from demographics, APOE4, MRI volumes, CSF markers, and PET imaging, we conducted an experimental comparison across varying training set sizes (N=50 to 1000) and models including XGBoost, Random Forest, LightGBM, and Logistic Regression.

Sun 8 Feb 2026

Authors: Brad Ye, Bulent Soykan, Gulsah Hancerliogullari Koksalmis, Hsin-Hsiung Huang, Laura J. Brattain

Interval Orders, Biorders and Credibility-limited Belief Revision

Rational belief revision is commonly viewed as being based on a preference order between possible worlds, with the resulting new belief set being those sentences true in all the most preferred models of the incoming new information. Usually, such a preference order is taken to be a total preorder.

Sat 7 Feb 2026

Authors: Richard Booth, Ivan Varzinczak

Step-level Optimization for Efficient Computer-use Agents

Computer-use agents provide a promising path toward general software automation because they can interact directly with arbitrary graphical user interfaces instead of relying on brittle, application-specific integrations. Despite recent advances in benchmark performance, strong computer-use agents remain expensive and slow in practice, since most systems invoke large multimodal models at nearly every interaction step.

Fri 6 Feb 2026

Authors: Jinbiao Wei, Kangqi Ni, Yilun Zhao, Guo Gan, Arman Cohan

Optimal Stop-Loss and Take-Profit Parameterization for Autonomous Trading Agent Swarm

Autonomous crypto trading systems often spend most of their design effort on finding entries, while exits are left to fixed rules that are rarely tested in a systematic way. This paper examines whether better stop-loss and take-profit settings can improve the performance of an autonomous trading agent swarm.

Thu 5 Feb 2026

Authors: Nathan Li, Aikins Laryea, Yigit Ihlamur

Unpacking Vibe Coding: Help-Seeking Processes in Student-AI Interactions While Programming

Generative AI is reshaping higher education programming through vibe coding, where students collaborate with AI via natural language rather than writing code line-by-line. We conceptualize this practice as help-seeking, analyzing 19,418 interaction turns from 110 undergraduate students.

Wed 4 Feb 2026

Authors: Daiana Rinja, Eduardo Araujo Oliveira, Sonsoles López-Pernas, Mohammed Saqr, Marcus Specht, Kamila Misiejuk

TRUST: A Framework for Decentralized AI Service v.0.1

Large Reasoning Models (LRMs) and Multi-Agent Systems (MAS) in high-stakes domains demand reliable verification, yet centralized approaches suffer four limitations: (1) Robustness, with single points of failure vulnerable to attacks and bias; (2) Scalability, as reasoning complexity creates bottlenecks; (3) Opacity, as hidden auditing erodes trust; and (4) Privacy, as exposed reasoning traces risk model theft. We introduce TRUST (Transparent, Robust, and Unified Services for Trustworthy AI), a decentralized framework with three innovations: (i) Hierarchical Directed Acyclic Graphs (HDAGs) that decompose Chain-of-Thought reasoning into five abstraction levels for parallel distributed auditing; (ii) the DAAN protocol, which projects multi-agent interactions into Causal Interaction Graphs (CIGs) for deterministic root-cause attribution; and (iii) a multi-tier consensus mechanism among computational checkers, LLM evaluators, and human experts with stake-weighted voting that guarantees correctness under 30% adversarial participation.

Tue 3 Feb 2026

Authors: Yu-Chao Huang, Zhen Tan, Mohan Zhang, Pingzhi Li, Zhuo Zhang, Tianlong Chen

Unsupervised Electrofacies Classification and Porosity Characterization in the Offshore Keta Basin Using Wireline Logs

This study presents an unsupervised machine learning workflow for electrofacies analysis in the offshore Keta Basin, Ghana, where core data are scarce. Six standard wireline logs from Well C were analysed over a depth interval comprising approximately 11,195 samples.

Mon 2 Feb 2026

Authors: Hamdiya Adams, Theophilus Ansah-Narh, Daniel Kwadwo Asiedu, Bruce Kofi Banoeng-Yakubo, Marcellin Atemkeng, Thomas Armah, Richmond Opoku-Sarkodie, Rebecca Davis, Ezekiel Nii Noye Nortey

Think it, Run it: Autonomous ML pipeline generation via self-healing multi-agent AI

The purpose of our paper is to develop a unified multi-agent architecture that automates end-to-end machine learning (ML) pipeline generation from datasets and natural-language (NL) goals, improving efficiency, robustness and explainability. A five-agent system is proposed to handle profiling, intent parsing, microservice recommendation, Directed Acyclic Graph (DAG) construction and execution.

Sun 1 Feb 2026

Authors: Adela Bara, Gabriela Dobrita, Simona-Vasilica Oprea

End-to-end autonomous scientific discovery on a real optical platform

Scientific research has long been human-led, driving new knowledge and transformative technologies through the continual revision of questions, methods and claims as evidence accumulates. Although large language model (LLM)-based agents are beginning to move beyond assisting predefined research workflows, none has yet demonstrated end-to-end autonomous discovery in a real physical system that produces a nontrivial result supported by experimental evidence.

Sat 31 Jan 2026

Authors: Shuxing Yang, Fujia Chen, Rui Zhao, Junyao Wu, Yize Wang, Haiyao Luo, Ning Han, Qiaolu Chen, Yuze Hu, Wenhao Li, Mingzhu Li, Hongsheng Chen, Yihao Yang

When Your LLM Reaches End-of-Life: A Framework for Confident Model Migration in Production Systems

We present a framework for migrating production Large Language Model (LLM) based systems when the underlying model reaches end-of-life or requires replacement. The key contribution is a Bayesian statistical approach that calibrates automated evaluation metrics against human judgments, enabling confident model comparison even with limited manual evaluation data.

Fri 30 Jan 2026

Authors: Emma Casey, David Roberts, David Sim, Ian Beaver

Binary Spiking Neural Networks as Causal Models

We provide a causal analysis of Binary Spiking Neural Networks (BSNNs) to explain their behavior. We formally define a BSNN and represent its spiking activity as a binary causal model.

Thu 29 Jan 2026

Authors: Aditya Kar (CNRS, IRIT), Emiliano Lorini (CNRS, IRIT), Timothée Masquelier (CNRS, CERCO UMR5549)

Compositional Meta-Learning for Mitigating Task Heterogeneity in Physics-Informed Neural Networks

Physics-informed neural networks (PINNs) approximate solutions of partial differential equations (PDEs) by embedding physical laws into the loss function. In parameterized PDE families, variations in coefficients or boundary/initial conditions define distinct tasks.

Wed 28 Jan 2026

Authors: Beomchul Park, Minsu Koh, Heejo Kong, Seong-Whan Lee

Computing Equilibrium beyond Unilateral Deviation

Most familiar equilibrium concepts, such as Nash and correlated equilibrium, guarantee only that no single player can improve their utility by deviating unilaterally. They offer no guarantees against profitable coordinated deviations by coalitions.

Tue 27 Jan 2026

Authors: Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

PhyCo: Learning Controllable Physical Priors for Generative Motion

Modern video diffusion models excel at appearance synthesis but still struggle with physical consistency: objects drift, collisions lack realistic rebound, and material responses seldom match their underlying properties. We present PhyCo, a framework that introduces continuous, interpretable, and physically grounded control into video generation.

Mon 26 Jan 2026

Authors: Sriram Narayanan, Ziyu Jiang, Srinivasa Narasimhan, Manmohan Chandraker

FlexiTac: A Low-Cost, Open-Source, Scalable Tactile Sensing Solution for Robotic Systems

We present FlexiTac, a low-cost, open-source, and scalable piezoresistive tactile sensing solution designed for robotic end-effectors. FlexiTac is a practical "plug-in" module consisting of (i) thin, flexible tactile sensor pads that provide dense tactile signals and (ii) a compact multi-channel readout board that streams synchronized measurements for real-time control and large-scale data collection.

Sun 25 Jan 2026

Authors: Binghao Huang, Yunzhu Li

Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows

LLM agents are expected to complete end-to-end units of work across software tools, business services, and local workspaces. Yet many agent benchmarks freeze a curated task set at release time and grade mainly the final response, making it difficult to evaluate agents against evolving workflow demand or verify whether a task was executed.

Sat 24 Jan 2026

Authors: Chenxin Li, Zhengyang Tang, Huangxin Lin, Yunlong Lin, Shijue Huang, Shengyuan Liu, Bowen Ye, Rang Li, Lei Li, Benyou Wang, Yixuan Yuan

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

Autonomous agents act through sandboxed containers and microVMs whose state spans filesystems, processes, and runtime artifacts. Checkpoint and restore (C/R) of this state is needed for fault tolerance, spot execution, RL rollout branching, and safe rollback, yet existing approaches fall into two extremes: application-level recovery preserves chat history but misses OS-side effects, while full per-turn checkpointing is correct but too expensive under dense co-location.

Fri 23 Jan 2026

Authors: Tianyuan Wu, Chaokun Chang, Lunxi Cao, Wei Gao, Wei Wang

Latent Adversarial Detection: Adaptive Probing of LLM Activations for Multi-Turn Attack Detection

Multi-turn prompt injection follows a known attack path (trust-building, pivoting, escalation), but text-level defenses miss covert attacks where individual turns appear benign. We show this attack path leaves an activation-level signature in the model's residual stream: each phase shift moves the activation, producing a total path length far exceeding that of benign conversations.

Thu 22 Jan 2026

Authors: Prashant Kulkarni