Improve the diffusion model by using a better learning-rate scheduler (including a warm-up period) and an exponential moving average (EMA) of the model weights.
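
As a rough illustration of the two changes named above, here is a minimal framework-free sketch. The source does not say which schedule or decay value is used, so the linear warm-up followed by cosine decay, the function name `lr_at_step`, the class name `EMA`, and all default values (`base_lr`, `warmup_steps`, `total_steps`, `decay`) are assumptions chosen for illustration only.

```python
import math


def lr_at_step(step, base_lr=1e-4, warmup_steps=1000, total_steps=100_000, min_lr=0.0):
    """Learning rate for a 0-indexed training step.

    Assumed schedule: linear warm-up to base_lr, then cosine decay to min_lr.
    """
    if step < warmup_steps:
        # Ramp linearly from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warm-up phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


class EMA:
    """Exponential moving average over named scalar parameters (hypothetical helper)."""

    def __init__(self, params, decay=0.999):
        self.decay = decay
        # Copy the dict so the averages start at the initial parameter values.
        self.shadow = dict(params)

    def update(self, params):
        # shadow <- decay * shadow + (1 - decay) * current
        for name, value in params.items():
            self.shadow[name] = self.decay * self.shadow[name] + (1.0 - self.decay) * value
```

In a training loop, one would typically set the optimizer's learning rate to `lr_at_step(step)` before each update, call `ema.update(...)` with the current weights after each optimizer step, and sample from the averaged weights in `ema.shadow` rather than the raw ones. With a real framework the same idea applies to tensors instead of floats.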