Chenyang An

Ph.D. Candidate @ University of California, San Diego


HSS Building, UC San Diego

9500 Gilman Dr

La Jolla, California 92092

email: cya.portfolio at gmail dot com

I am a fifth-year Ph.D. student in the Mathematics Department at UC San Diego, where I am advised by Prof. Sam Buss and co-advised by Prof. Jingbo Shang. Prior to my Ph.D., I completed a B.S. in Applied Mathematics and a B.A. in Economics at UC San Diego.

My current research focuses on Large Language Model (LLM) reasoning in both natural-language and formalized environments, and on LLM post-training with supervised fine-tuning and reinforcement learning. I believe that mathematics will likely fall well within the capabilities of LLMs in the near future.

My prior research focused on 2D quantum gravity and mathematical physics, studying the interplay between algebra, geometry, and physics.

I interned at Microsoft Research in Seattle during the summer of 2024, focusing on improving the training efficiency of large language models (LLMs) for reasoning tasks. I also worked part-time at Scale AI as an AI Consultant, contributing to the development of LLM-based web agents and scalable verification systems for reasoning data. In Spring 2025, I joined Amazon AWS Neurosymbolic as an Applied Scientist Intern, where I designed a new reinforcement learning pipeline incorporating a diversity-based reward to encourage the generation of varied chains of thought (CoTs), along with the supporting data preprocessing framework.

If you are interested in any of the topics above, feel free to drop me an email!

news

Apr 01, 2025 I’m excited to share that I’ll be joining Amazon AWS Neurosymbolic as an Applied Scientist Intern, where I’ll be working on LLM reasoning in both natural and formal language!
Jan 22, 2025 I’m thrilled to see that our ACL 2024 publication, “Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving”, is featured by Neptunes News Agency! News link.
Jan 22, 2025 I’m thrilled to share that our paper, “Correlation and Navigation in the Vocabulary Key Representation Space of Language Models”, has been accepted to the International Conference on Learning Representations (ICLR)! This work studies spurious correlations in the vocabulary key space of LLMs and proposes a novel in-context learning method, In-Context Navigation, to sample high-quality results from the key space that cannot otherwise be obtained through standard top-k inference.
Oct 01, 2024 I’m excited to share that I will be joining Scale AI as an AI Consultant, working on fine-tuning LLMs for real-world applications.
Jun 04, 2024 I’m excited to share that I will be joining Microsoft as a research intern in ML and Generative AI in Redmond, Washington, in the summer of 2024.

selected publications

  1. arXiv
    The Price of Format: Diversity Collapse in LLMs
    May 2025
  2. arXiv
    Linear Correlation in LM’s Compositional Generalization and Hallucination
    Feb 2025
  3. arXiv
    Next-Token Prediction Task Assumes Optimal Data Ordering for LLM Training in Proof Generation
    Oct 2024
  4. ICLR
    Correlation and Navigation in the Vocabulary Key Representation Space of Language Models
    Jan 2025
  5. ACL
    Learn from Failure: Fine-Tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving
    May 2024