Data assimilation for weather forecasting is commonly achieved by combining model state "trajectories" (for example, daily temperature series) with observations. However, for processes that require long-term statistics, such as climate projections, the emphasis is on capturing the long-term state distribution (i.e., the frequency and range of possible temperature values) rather than individual trajectories. This study introduces a probabilistic framework for parameter inference based on the probability distribution functions of state variables, using the Lorenz '96 (L96) system as a toy model for a proof of concept. We develop a distribution emulator for the L96 system using conditional normalizing flow models. This emulator replicates the state distributions without the need to unroll the entire series of state trajectories. Building on this emulator, we present a distribution-driven framework for model parameter inference that includes uncertainty quantification. Finally, we discuss applications of the proposed framework to distribution matching, extreme value estimation, and joint quantile analysis, highlighting its potential for current climate models and data analysis.
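To make the distinction between trajectories and state distributions concrete, the following is a minimal sketch that integrates the standard single-scale L96 equations, dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F, and summarizes the long-run distribution of the state for a given forcing F. It is the brute-force trajectory unrolling that a distribution emulator is meant to avoid, not the normalizing-flow emulator itself. The function names, the choice of 8 variables, the step size, and the single-scale form are all assumptions made for illustration; the paper's L96 configuration may differ.

```python
import numpy as np


def l96_tendency(x, forcing):
    """Single-scale Lorenz '96 tendency with periodic indexing in k."""
    # np.roll(x, -1)[k] = x[k+1], np.roll(x, 2)[k] = x[k-2], np.roll(x, 1)[k] = x[k-1]
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, forcing, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = l96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = l96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def sample_states(forcing, n_vars=8, n_steps=20000, spinup=2000, dt=0.01, seed=0):
    """Unroll one trajectory and pool all post-spinup state values (illustrative defaults)."""
    rng = np.random.default_rng(seed)
    # Start near the fixed point X_k = F with a small perturbation.
    x = forcing + 0.01 * rng.standard_normal(n_vars)
    samples = np.empty((n_steps, n_vars))
    for i in range(spinup + n_steps):
        x = rk4_step(x, forcing, dt)
        if i >= spinup:
            samples[i - spinup] = x
    return samples.ravel()


def summarize(values):
    """Distribution-level statistics: mean, spread, and an upper-tail quantile."""
    return {
        "mean": float(values.mean()),
        "std": float(values.std()),
        "q99": float(np.quantile(values, 0.99)),
    }
```

Calling `summarize(sample_states(8.0))` and `summarize(sample_states(10.0))` gives two sets of distribution statistics that can be compared directly, which is the kind of parameter-dependent signal a distribution-driven inference scheme would exploit, without ever matching individual trajectories.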

The paper, by Shawn Li at Columbia University, is entitled "Probabilistic Data Assimilation for Ensemble Distribution Projections With Generative Machine Learning: A Lorenz '96 Proof-of-Concept" and is published in Geophysical Research Letters.