Kuo Wang, Zhanqiang Zhang, Keqilao Meng, Pengbing Lei, Rui Wang, Wenlu Yang, Zhihua Lin; Optimal energy scheduling for microgrid based on GAIL with Wasserstein distance. AIP Advances 1 August 2024; 14 (8): 085013. https://doi.org/10.1063/5.0207444
This paper is organized as follows: in Sec. II, the microgrid model is described. Section III outlines the energy optimal scheduling problem as a Markov decision process. Section IV presents the proposed methodology for solving the energy optimal scheduling problem. Our case study and results are discussed in Sec. V. Finally, our conclusions are summarized in Sec. VI.
Microgrids typically consist of renewable energy generation units, traditional fossil energy generation units, energy storage devices, and user loads. The microgrid system model is illustrated in Fig. 1. In grid-connected mode, the microgrid exchanges power with the main grid (MG) and allocates power according to the output of the distributed power sources and the demand of the loads, with the goal of achieving optimal energy scheduling.
This paper proposes a WGAIL algorithm for optimal energy scheduling in microgrids, based on the constructed MDP problem. The algorithm consists of expert policy data, a generator network, and a discriminator network. The policy network of the agent is updated using a reinforcement learning algorithm. After iterative updates with feedback from the discriminator, the optimal decision for the energy scheduling problem in the LDR scenario is finally obtained. The GAIL method, combining the Wasserstein distance and the PPO algorithm, is shown in Algorithm 1.
ALGORITHM 1. Generative adversarial imitation learning with Wasserstein distance.
Structure of the WGAIL algorithmic framework.
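The adversarial part of the framework can be illustrated with a minimal sketch. It assumes the standard Wasserstein critic objective, in which the discriminator is trained to score expert state-action pairs higher than generated ones. The source does not state the exact loss or its sign convention, so the function name and the convention below are assumptions, not the authors' implementation.

```python
def wasserstein_critic_loss(expert_scores, generated_scores):
    """Loss minimized by the discriminator (critic) under the assumed convention.

    Minimizing (mean generated score - mean expert score) pushes expert
    scores up and generated scores down; its negative is an estimate of
    the Wasserstein distance between the two score distributions.
    """
    mean_expert = sum(expert_scores) / len(expert_scores)
    mean_generated = sum(generated_scores) / len(generated_scores)
    return mean_generated - mean_expert


# Illustrative scores only, not data from the paper.
loss = wasserstein_critic_loss([1.0, 3.0], [0.0, 1.0])
print(loss)
```

In the original WGAN formulation the critic must also be kept approximately Lipschitz, for example by weight clipping or a gradient penalty; which variant the paper uses is not stated in this excerpt.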
Imitation learning requires fitting expert datasets, and the quality of the expert data determines how well the agent learns. In this paper, an expert data collection method is designed to train in a microgrid environment using the PPO algorithm. The trained model serves as the expert policy model, and state-action expert data samples are collected through this model. The process of capturing expert experience is illustrated in Fig. 3.
Capturing the experience of experts.
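The collection step described above can be sketched as a simple rollout loop. The environment class, its `reset`/`step` interface, and the placeholder policy are hypothetical stand-ins for the microgrid environment and the trained PPO expert, which are not specified in this excerpt.

```python
class ToyMicrogridEnv:
    """Hypothetical stand-in environment with a 3-dimensional state."""

    def __init__(self, horizon=24):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return (0.0, 0.0, float(self.t))

    def step(self, action):
        self.t += 1
        next_state = (0.0, 0.0, float(self.t))
        reward = 0.0  # the reward is not needed for collecting expert pairs
        done = self.t >= self.horizon
        return next_state, reward, done


def placeholder_expert_policy(state):
    """Stands in for the trained PPO model; returns a 4-dimensional action."""
    return (0.0, 0.0, 0.0, 0.0)


def collect_expert_pairs(env, expert_policy, num_episodes):
    """Roll out the expert policy and record every (state, action) pair."""
    pairs = []
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = expert_policy(state)
            pairs.append((state, action))
            state, _reward, done = env.step(action)
    return pairs


expert_data = collect_expert_pairs(ToyMicrogridEnv(horizon=24), placeholder_expert_policy, 2)
print(len(expert_data))
```

Only the state-action pairs are stored, since imitation learning fits the expert's behavior rather than its reward signal.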
Effect of advantage functions on strategy updating.
The PPO algorithm is a policy gradient method based on the actor–critic framework, which contains two types of neural networks: the policy network (actor) and the value network (critic). The actor network takes the environment state as input and outputs the probability of each action. The critic network evaluates the current state and outputs an evaluation value. Generally, both networks share a common feature extraction layer.
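For illustration, the actor update in PPO is commonly driven by the clipped surrogate objective. The sketch below assumes that standard form; the clipping range of 0.2 and the function name are assumptions, since the paper's hyperparameters are given in a table not reproduced here.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective, to be maximized by the actor.

    The probability ratio between the new and old policy is clipped to
    [1 - clip_eps, 1 + clip_eps], so a single update cannot move the
    policy too far from the one that generated the data.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


# A ratio of 1.5 with a positive advantage is capped at 1 + clip_eps.
print(ppo_clipped_objective([math.log(1.5)], [0.0], [1.0]))
```

Taking the minimum of the clipped and unclipped terms makes the objective a pessimistic bound: the policy gains nothing from pushing the ratio beyond the clipping range in the direction the advantage favors.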
In this paper, the state space is three-dimensional and the action space is four-dimensional, and we construct the neural networks based on this setup. The parameters of the policy network (actor) and the value network (critic) are shown in Tables I and II.
Actor network structure and parameterization.
Critic network structure and parameterization.
The discriminator is a neural network with fully connected layers, including input, hidden, and output layers, used to differentiate between real and generated data. The parameters of the discriminator network are shown in Table III. The inputs to the input layer are state–action pairs, so the number of neurons in the input layer is the sum of the state space dimension and the action space dimension, totaling seven neurons. There are four hidden layers, each with 100 neurons and a tanh activation function. The output layer has only one neuron.
Discriminator network structure and parameterization.
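The layer sizes described above (7 inputs, four tanh hidden layers of 100 neurons, 1 output) can be sketched as a plain forward pass. The weight initialization and the linear (unbounded) output are assumptions; a linear output is typical for a Wasserstein critic, but the excerpt does not state the output activation.

```python
import math
import random

STATE_DIM, ACTION_DIM = 3, 4
HIDDEN_LAYERS, HIDDEN_UNITS = 4, 100


def init_discriminator(seed=0):
    """Build (weights, biases) for each fully connected layer."""
    rng = random.Random(seed)
    sizes = [STATE_DIM + ACTION_DIM] + [HIDDEN_UNITS] * HIDDEN_LAYERS + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        bound = 1.0 / math.sqrt(fan_in)  # assumed uniform initialization
        weights = [[rng.uniform(-bound, bound) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        biases = [0.0] * fan_out
        layers.append((weights, biases))
    return layers


def discriminator_score(layers, state, action):
    """Score one state-action pair; tanh on hidden layers, linear output."""
    x = list(state) + list(action)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < last:
            x = [math.tanh(v) for v in x]
    return x[0]


layers = init_discriminator()
print(discriminator_score(layers, [0.1, 0.2, 0.3], [0.0, 0.0, 0.0, 0.0]))
```

In practice this would be written with a deep learning framework so the weights can be trained by backpropagation; the pure-Python version only makes the shapes explicit.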
The reward values computed for the generated data are fed back into the generator and stored for subsequent computation of the advantage function.
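As a sketch of how stored rewards might feed the advantage computation, the example below uses generalized advantage estimation (GAE), a common choice with PPO. The excerpt does not say which advantage estimator the authors use, so GAE, the discount values, and the function name are all assumptions.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Generalized advantage estimation over one trajectory.

    rewards: discriminator-derived rewards, one per step.
    values:  critic estimates, one per step plus a final bootstrap value,
             so len(values) == len(rewards) + 1.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Illustrative numbers only: two steps, zero value estimates, no discounting.
print(gae_advantages([1.0, 1.0], [0.0, 0.0, 0.0], gamma=1.0, lam=1.0))
```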
In this paper, we use the open-source dataset from CASIO 2019 for model training. After training, the model is tested using a regional real-world dataset. The microgrid environment configuration used in this study is detailed in Table IV, and the hyperparameters of the WGAIL algorithm are listed in Table V.
Operational parameters of each output device of the microgrid.
Hyperparameters of the WGAIL algorithm.
Average reward curves for three different algorithms.
Algorithmic policy loss function curves.
The value loss function, shown in Fig. 7, represents the difference between the fitted output of the value network and the actual value. The curve trends downward: the error between the network output and the actual value is large at the beginning, and as the agent continues to learn, the curve levels off at around 700 rounds, indicating that the agent has learned a good policy.
The delta error value of the value loss function, i.e., the difference between time t and t + 1, is shown in Fig. 8. The error is large initially, indicating that the agent is in the learning phase, and the curve flattens out by about 700 rounds.
To further verify the method's effectiveness, we tested the algorithm using data from a specific day. The scheduling results are shown in Fig. 9. The PV output power and load demand are shown in Fig. 9(a).