I am a research scientist at Google DeepMind. I work on machine learning, with an emphasis on:
- Generative modeling: using probabilistic methods to capture the structure and uncertainty inherent in data
- Algorithmic modeling: building models that have long-term memory, efficiently encode dependencies, and extract meaningful representations
I am particularly interested in tackling challenging inference problems at the intersection of these two areas, such as variational inference, sampling, gradient estimation, and score-based modeling.
News
- Talk at GenU 2024 on discrete generative modeling with Masked Diffusion Models. [slides]
- I am an area chair for NeurIPS 2024.
- Talk at FIMI 2024 on designing sequence models with wavelets and MultiresConv. [slides]
- Talk at Optimal Transport Berlin 2024 on our recent developments in Stein's method for machine learning. [slides]
- Check out MultiresConv, our new state-of-the-art convolutional sequence modeling architecture.
- I am an area chair for NeurIPS 2023.
- Our work on gradient estimation for discrete distributions won the NeurIPS 2022 Outstanding Paper Award!
- I am an area chair for AISTATS 2023.
- I am a top reviewer for NeurIPS 2022.
Selected Publications
Diffusion and Score-Based Modeling
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi*, Kehang Han*, Zhe Wang, Arnaud Doucet, Michalis K. Titsias.
NeurIPS 2024. [pdf] [abs] [slides]
Nonparametric Score Estimators
Yuhao Zhou, Jiaxin Shi, Jun Zhu.
ICML 2020. [pdf] [abs] [code] [slides]
Sliced Score Matching: A Scalable Approach to Density and Score Estimation
Yang Song*, Sahaj Garg*, Jiaxin Shi, Stefano Ermon.
UAI 2019. [pdf] [abs] [code] [video] [blog]
Oral Presentation (top 8.7%).
A Spectral Approach to Gradient Estimation for Implicit Distributions
Jiaxin Shi, Shengyang Sun, Jun Zhu.
ICML 2018. [pdf] [abs] [code] [slides]
Sequence Modeling and MultiresConv Architecture
Sequence Modeling with Multiresolution Convolutional Memory
Jiaxin Shi, Ke Alexander Wang, Emily B. Fox.
ICML 2023. [pdf] [abs] [code] [slides]
Probabilistic Inference and Gradient Estimation
A Finite-Particle Convergence Rate for Stein Variational Gradient Descent
Jiaxin Shi, Lester Mackey.
NeurIPS 2023.
Gradient Estimation with Discrete Stein Operators
Jiaxin Shi, Yuhao Zhou, Jessica Hwang, Michalis K. Titsias, Lester Mackey.
NeurIPS 2022. [pdf] [abs] [code]
NeurIPS 2022 Outstanding Paper Award.
Double Control Variates for Gradient Estimation in Discrete Latent Variable Models
Michalis K. Titsias, Jiaxin Shi.
AISTATS 2022. [pdf] [abs] [code]
Sampling with Mirrored Stein Operators
Jiaxin Shi, Chang Liu, Lester Mackey.
ICLR 2022. [pdf] [abs] [code] [slides]
Spotlight Presentation (top 5.1%).
Representation Learning
NeuralEF: Deconstructing Kernels by Deep Neural Networks
Zhijie Deng, Jiaxin Shi, Jun Zhu.
ICML 2022.
Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng*, Jiaxin Shi*, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu.
Neural Networks as Inter-domain Inducing Points
Shengyang Sun*, Jiaxin Shi*, Roger Grosse.
AABI Symposium 2020. [pdf] [slides] [video]
Predictive Uncertainty Estimation
Sparse Orthogonal Variational Inference for Gaussian Processes
Jiaxin Shi, Michalis K. Titsias, Andriy Mnih.
AISTATS 2020. [pdf] [abs] [code] [slides]
Best Student Paper Runner-Up at AABI Symposium 2019.
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition
Shengyang Sun, Jiaxin Shi, Andrew Gordon Wilson, Roger Grosse.
ICML 2021. [pdf] [abs] [code]
Functional Variational Bayesian Neural Networks
Shengyang Sun*, Guodong Zhang*, Jiaxin Shi*, Roger Grosse.
ICLR 2019. [pdf] [abs] [code] [video]
Software
During my PhD studies, I led the development of ZhuSuan [github] [doc] [arxiv], an open-source differentiable probabilistic programming library built on TensorFlow.
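For a flavor of what model specification in ZhuSuan looks like, here is a minimal sketch of a Bayesian linear regression model in its TF1-era interface. The exact names and signatures (`zs.meta_bayesian_net`, `zs.BayesianNet`, `bn.normal`, `observe`, `log_joint`) are assumptions based on the 0.4 documentation, not a verbatim excerpt.

```python
import tensorflow as tf
import zhusuan as zs

# Sketch of a Bayesian linear regression model in ZhuSuan's TF1-era
# interface; API names follow the 0.4 docs and are assumptions.
@zs.meta_bayesian_net(scope="blr", reuse_variables=True)
def build_model(x, d, alpha, beta):
    bn = zs.BayesianNet()
    # Prior over the weights: w ~ N(0, alpha^2 I)
    w = bn.normal("w", tf.zeros([d]), std=alpha, group_ndims=1)
    # Likelihood: y_i ~ N(x_i^T w, beta^2)
    y_mean = tf.reduce_sum(x * w, axis=-1)
    bn.normal("y", y_mean, std=beta)
    return bn

# Binding observations by name yields a net whose log joint density
# is an ordinary differentiable tensor, e.g. (assumed usage):
#   model = build_model(x_train, d=x_train.shape[1], alpha=1., beta=0.1)
#   bn = model.observe(y=y_train)
#   log_p = bn.log_joint()
```

Because the log joint is just a TensorFlow tensor, it can be plugged directly into gradient-based variational inference or MCMC, which is the point of making the probabilistic program differentiable.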