I am a research scientist at Google DeepMind. I work on machine learning, with an emphasis on:
- Generative modeling: using probabilistic methods to capture the structure and uncertainty inherent in data
- Algorithmic modeling: building models that have long-term memory, efficiently encode dependencies, and extract meaningful representations
I am particularly interested in challenging inference problems at the intersection of these two areas, including variational inference, sampling, gradient estimation, and score-based modeling.
News
- We have open-sourced our code for Masked Diffusion Models (with data and model parallelism support). Try our examples on OpenWebText and ImageNet64!
- Talk at GenU 2024 about discrete generative modeling with Masked Diffusion Models. [slides]
- I am an area chair for NeurIPS 2024.
- Talk at FIMI 2024 on designing sequence models with wavelets and MultiresConv. [slides]
- Talk at Optimal Transport Berlin 2024 about our recent developments in Stein’s method for machine learning. [slides]
- Check out our new state-of-the-art convolutional architecture for sequence modeling.
- I am an area chair for NeurIPS 2023.
- Our work on gradient estimation for discrete distributions won the NeurIPS 2022 Outstanding Paper Award!
- I am an area chair for AISTATS 2023.
- I am a top reviewer for NeurIPS 2022.
Selected Publications
Diffusion and Score-Based Modeling
Simplified and Generalized Masked Diffusion for Discrete Data
Jiaxin Shi*, Kehang Han*, Zhe Wang, Arnaud Doucet, Michalis K. Titsias.
NeurIPS 2024. [pdf] [abs] [code] [slides]
Nonparametric Score Estimators
Yuhao Zhou, Jiaxin Shi, Jun Zhu.
ICML 2020. [pdf] [abs] [code] [slides]
Sliced Score Matching: A Scalable Approach to Density and Score Estimation
Yang Song*, Sahaj Garg*, Jiaxin Shi, Stefano Ermon.
UAI 2019. [pdf] [abs] [code] [video] [blog]
Oral Presentation (top 8.7%).
A Spectral Approach to Gradient Estimation for Implicit Distributions
Jiaxin Shi, Shengyang Sun, Jun Zhu.
ICML 2018. [pdf] [abs] [code] [slides]
Sequence Modeling and MultiresConv Architecture
Sequence Modeling with Multiresolution Convolutional Memory
Jiaxin Shi, Ke Alexander Wang, Emily B. Fox.
ICML 2023. [pdf] [abs] [code] [slides]
Probabilistic Inference and Gradient Estimation
A Finite-Particle Convergence Rate for Stein Variational Gradient Descent
Jiaxin Shi, Lester Mackey.
Gradient Estimation with Discrete Stein Operators
Jiaxin Shi, Yuhao Zhou, Jessica Hwang, Michalis K. Titsias, Lester Mackey.
NeurIPS 2022. [pdf] [abs] [code]
NeurIPS 2022 Outstanding Paper Award.
Double Control Variates for Gradient Estimation in Discrete Latent Variable Models
Michalis K. Titsias, Jiaxin Shi.
AISTATS 2022. [pdf] [abs] [code]
Sampling with Mirrored Stein Operators
Jiaxin Shi, Chang Liu, Lester Mackey.
ICLR 2022. [pdf] [abs] [code] [slides]
Spotlight Presentation (top 5.1%).
Representation Learning
NeuralEF: Deconstructing Kernels by Deep Neural Networks
Zhijie Deng, Jiaxin Shi, Jun Zhu.
Neural Eigenfunctions Are Structured Representation Learners
Zhijie Deng*, Jiaxin Shi*, Hao Zhang, Peng Cui, Cewu Lu, Jun Zhu.
Neural Networks as Inter-domain Inducing Points
Shengyang Sun*, Jiaxin Shi*, Roger Grosse.
AABI Symposium 2020. [pdf] [slides] [video]
Predictive Uncertainty Estimation
Sparse Orthogonal Variational Inference for Gaussian Processes
Jiaxin Shi, Michalis K. Titsias, Andriy Mnih.
AISTATS 2020. [pdf] [abs] [code] [slides]
Best Student Paper Runner-Up at AABI Symposium, 2019.
Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition
Shengyang Sun, Jiaxin Shi, Andrew Gordon Wilson, Roger Grosse.
ICML 2021. [pdf] [abs] [code]
Functional Variational Bayesian Neural Networks
Shengyang Sun*, Guodong Zhang*, Jiaxin Shi*, Roger Grosse.
ICLR 2019. [pdf] [abs] [code] [video]
Software
During my PhD studies, I led the development of ZhuSuan [github] [doc] [arxiv], an open-source differentiable probabilistic programming project based on TensorFlow.