Junyao Yang(杨竣尧)
About Me
Hi there, this is Junyao Yang. I am a graduate student at the School of Computing, National University of Singapore (NUS), specializing in Artificial Intelligence. My research interests lie in Natural Language Processing, Explainable Artificial Intelligence, and Trustworthy Machine Learning.
My research story revolves around the underlying principles and understanding of Artificial Intelligence. In particular, I focus on enhancing the "Robustness" and "Safety" of LLM-generated information and on interpreting model mechanisms. This connects to related areas such as Trustworthy LLMs [ACL 2025 Main, EMNLP 2025 Main] and Agents [Agentic Attribution, AgentDoG], Reasoning Model Merging [AAAI 2026, ReasonAny], and Malicious Attacks [ACL 2025 Main].
News
- 2026.02 Blog post: The Entropy-Gradient Inversion. R1/o1-like reasoning models exhibit significant negative correlations between gradient strength and token entropy, emerging rapidly within the first 200 steps of SFT.
- 2026.01 Tech report: AgentDoG! State-of-the-art diagnostic guardrail framework with an Agentic XAI attribution module.
- 2026.01 Paper: Agentic Attribution! A hierarchical framework to unveil internal factors driving LLM-based agent actions.
- 2026.01 Attending AAAI 2026 in Singapore, Jan 20-27, 2026.
- 2026.01 Paper: ReasonAny! Contrastive gradient identification to resolve destructive performance collapse in model merging.
- 2025.11 First-Author paper RCP-Merging accepted to AAAI 2026 Main Track.
- 2025.08 RewardDS accepted to EMNLP 2025 Main.
- 2025.08 Joined Shanghai AI Lab as a Research Intern, advised by Dongrui Liu.
- 2025.08 New work: RCP-Merging! Integrating long CoT capability into domain-specific LLMs.
- 2025.05 Passed undergraduate thesis defense.
- 2025.05 Co-First-Author paper PrivacyRestore accepted to ACL 2025 Main.
- 2025.02 New papers: RewardDS and PrivacyRestore.
- 2024.07 Joined ZeroNLP as a Research Assistant, advised by Prof. Ziqian Zeng.
- 2024.07 Completed machine learning internship at Tencent.
- 2024.07 Contextless CS reached 20,000 DAU.
- 2024.04 Joined Tencent as a machine learning intern.
- 2024.03 Completed machine learning internship at the Shenzhen Stock Exchange (SZSE).
Publications & Preprints
arXiv Preprint ReasonAny: Incorporating Reasoning Capability to Any Model via Simple and Effective Model Merging
Junyao Yang, Chen Qian, Dongrui Liu†, Wen Shen, Yong Liu†, Jing Shao†
TL;DR: Merging robust chain-of-thought capabilities into domain-specific models (Safety, Biomedicine) using Contrastive Gradient Identification.
arXiv Preprint The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution
Chen Qian, Peng Wang, Dongrui Liu†, Junyao Yang, Dadi Guo, Ling Tang, Jilin Mei, Qihan Ren, Shuai Shao, Yong Liu, Jie Fu, Jing Shao, Xia Hu
TL;DR: A hierarchical framework for agentic attribution, using temporal likelihood and perturbation-based analysis to unveil internal factors driving LLM-based agent actions.
AAAI 2026 Main Track RCP-Merging: Merging Long Chain-of-Thought Models with Domain-Specific Models by Considering Reasoning Capability as Prior
Junyao Yang, Jianwei Wang, Huiping Zhuang, Cen Chen, Ziqian Zeng*†
TL;DR: Enhancing domain performance while preserving chain-of-thought reasoning abilities by treating reasoning as a prior.
ACL 2025 Main PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration
Ziqian Zeng*†, Jianwei Wang*, Junyao Yang*, Zhengdong Lu, Haoran Li, Huiping Zhuang, Cen Chen
TL;DR: Protecting privacy via activation steering using a protected meta-vector without retraining.
EMNLP 2025 Main RewardDS: Privacy-Preserving Fine-Tuning for Large Language Models via Reward Driven Data Synthesis
Jianwei Wang, Chengming Shi, Junyao Yang, Haoran Li, Qianli Ma, Huiping Zhuang, Cen Chen, Ziqian Zeng†
TL;DR: Using client-side reward models to filter synthetic data, mitigating noise while protecting privacy.
Tech Reports & Projects
Tech Report AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security
Shanghai Artificial Intelligence Laboratory (Contributor)
TL;DR: A state-of-the-art diagnostic guardrail framework utilizing a unified three-dimensional taxonomy to provide fine-grained monitoring and root-cause analysis of AI agent safety risks.
Blogs
The Entropy-Gradient Inversion: A New Perspective on LLM Reasoning Capabilities
TL;DR: We discover that reasoning models exhibit a unique "fingerprint": a significant negative correlation between gradient strength and token entropy, the opposite of what traditional base models exhibit. This capability emerges rapidly within the first 200 steps of SFT.
Education

M.S. in AI
National University of Singapore
2025 - 2027 (Expected)
B.S. in CS (with honors)
South China University of Technology
2021 - 2025
High School
Shenzhen Experimental School
2018 - 2021
Experience

Shanghai AI Lab
Research Intern | 2025.06 - Present
South China University of Technology
Research Intern | 2024.07 - 2025.06

Tencent
Machine Learning Intern | 2024.04 - 2024.07

SZSE
Machine Learning Intern | 2024.01 - 2024.04
Honor & Awards
- Excellent Graduation Thesis (2025.06)
- Outstanding Student Leader (2022-2024)
- Second-Class Scholarship of SCUT (2024.10)
- Second-Class Award in CUMCM, Guangdong Province (2022.09)
- Second Prize, Mathematics Olympiad (2020.05)
- Second Prize, Physics Olympiad (2020.02)