Improvements stabilize the model's training data, but training results before and after poisoning now differ very little

MuJ 2024-01-09 09:42:01 +08:00
parent 58e266e184
commit 31f4fbc323
2 changed files with 87 additions and 55 deletions

430
1711.02278.pdf Normal file

@ -0,0 +1,430 @@
Modeling and Optimization of Complex Building
Energy Systems with Deep Neural Networks
Yize Chen, Yuanyuan Shi and Baosen Zhang
Department of Electrical Engineering, University of Washington, Seattle, WA, USA
{yizechen, yyshi, zhangbao}@uw.edu
arXiv:1711.02278v1 [math.OC] 7 Nov 2017

Abstract: Modern buildings encompass complex dynamics of multiple electrical, mechanical, and control systems. One of the biggest hurdles in applying conventional model-based optimization and control methods to building energy management is the huge cost and effort of capturing diverse and temporally correlated dynamics. Here we propose an alternative approach which is model-free and data-driven. By utilizing the high volume of data coming from advanced sensors, we train a deep Recurrent Neural Network (RNN) which can accurately represent the temporal dynamics of the operation of building complexes. The trained network is then directly fitted into a constrained optimization problem with finite horizons. By reformulating the constrained optimization as an unconstrained optimization problem, we use an iterative gradient descent method with momentum to find optimal control inputs. Simulation results demonstrate the proposed method's improved performance over a model-based approach on both building system modeling and control.

Index Terms: Building energy management, deep learning, gradient algorithms, HVAC systems

I. INTRODUCTION

According to a recent United Nations Environment Programme (UNEP) report, buildings are responsible for 40% of the global energy consumption [1]. Consequently, managing the energy consumption of buildings has significant economic, social, and environmental impacts, and has received much attention from researchers. Many approaches have been proposed to control building systems (e.g., commercial and office buildings, data centers) for energy efficiency, such as nonlinear adaptive control, Model Predictive Control (MPC) and decentralized control for building heating, ventilation, and air conditioning (HVAC) systems [2], [3], [4]. However, most previous research on building energy management is based either on a detailed physics model of buildings [5] or on simplified RC circuit models [2], [3], [6]. The former often involves tedious and complex modeling processes with a huge number of variables and parameters, whereas the latter cannot fully capture the long-term dynamics of large commercial buildings.

With the advance of sensing, communication and computing, detailed operation data are being collected for many buildings. These data along with future weather forecasts can be utilized for data-driven real-time optimization approaches. In [7], the authors developed a data predictive control method to replace the traditional MPC controller by using data to build a regression tree that represents the dynamical model for a building. However, regression trees still result in a linear model that can be far away from the true dynamics of building systems. In [8], [9], reinforcement learning was proposed to learn control policies without any explicit modeling assumptions, but the computational costs of searching through large state and action spaces are high. Ill-defined reward functions (e.g., sparse, noisy and delayed rewards) could also prevent reinforcement learning algorithms from finding the optimal control solutions [10]. Furthermore, large commercial buildings may have quality of service constraints that prevent the deep exploration of some states in reinforcement learning.

[Figure 1: (a) RNN fitting: the building running profile is fed to a deep recurrent neural network, $P_t = f_{RNN}(X_{t-T}, \dots, X_t)$, which outputs energy consumption. (b) Inputs optimization: constrained optimization on $X_t^c$ maps the original control inputs to optimal control inputs $X_t^{c*}$ with consumption $P_t^*$.]
Fig. 1. Our model architecture for building energy system modeling and optimization based on a deep Recurrent Neural Networks (RNN).

In this work, we address these challenges by proposing a data-driven method which closes the loop between an accurate predictive model and real-time control. The method is based on deep recurrent neural networks that leverage rich volumes of sensor data [11]. Though neural networks have previously been adopted as an approach for designing controllers, the lack of large datasets and computation capabilities has prevented them from being deployed in real-time applications [12]. First, in a supervised learning manner, our Recurrent Neural Network (RNN) learns the complex temporal dynamics mapping from various measurements of building operation profiles to energy consumption. Next we formulate an optimization problem with the objective of minimizing building energy consumption, which is subject to RNN-modeled building dynamics as well as physical constraints over a finite horizon of time. To solve the constrained optimization problem in a block-splitting approach, we take iterative gradient descent steps on the set of controllable inputs (e.g., zone temperature setpoints, heat rejected/added into each zone) at the current timestep. It thus finds the control inputs for each timestep. Fig. 1 illustrates our model framework. Our approach does not need analysis of complex interactions within conduction, convection or radiation processes. In addition, it can be easily scaled up to large buildings and distributed algorithms.

The main contributions of our paper are:
• We model the building energy dynamics using recurrent neural networks, which leverage large volumes of data to represent the complex dynamics of buildings.
• We propose an input/output optimization algorithm which efficiently finds the optimal control inputs for the model represented by the RNN.
• The proposed modeling and optimization approaches open the door to the integration of complex system dynamics modeling and decision-making.

The contents of the paper are as follows. The rolling horizon control problem formulation and the model-based method are first presented in Section II. In Section III we show the design of a deep RNN which models the dynamics of complex building systems. We then reformulate the control problem as an unconstrained optimization problem, and propose the algorithm to find optimal control inputs in Section IV. Finally, simulation results on a large building HVAC system are evaluated and compared with the model-based control method in Section V.

II. PROBLEM FORMULATION & PRELIMINARIES

A. Problem Formulation

We consider a building energy system which includes several subsystems and zones with potentially complex interactions between them. No information about the exact system dynamics is known. At time $t$, we are provided with the building's running profile $X_t := [X_t^{uc}, X_t^{c}, X_t^{phy}]^T$, where $X_t^{uc}$ denotes a collection of uncontrollable measurements such as zone temperature measurements, system node temperature measurements, lighting schedule, in-room appliances schedule, room occupancies, etc.; $X_t^{c}$ denotes a collection of controllable measurements such as zone temperature setpoints, appliances working schedule, etc.; and $X_t^{phy}$ denotes the set of physical measurements or forecast values, such as dry bulb temperature, humidity and radiation volume. There are physical constraints on some of $X_t^{c}$ and $X_t^{uc}$; for example, the temperature setpoints as well as real measurements should not fall out of users' comfort regions. Without loss of generality, we denote the constraints as $\underline{X}_t^{c} \le X_t^{c} \le \overline{X}_t^{c}$ and $\underline{X}_t^{uc} \le X_t^{uc} \le \overline{X}_t^{uc}$. Building system operators have a group of past running profiles $X = \{X_t\}$ along with the collection of energy consumption metering at each time step $P = \{P_t\}$.

We are interested in first learning a model $f(X_{t-T}, \dots, X_t) = P_t$, where $f(\cdot)$ denotes the predictive model with known parameters representing the building's physical dynamics. $f(\cdot)$ maps the running profile of the past $T$ timesteps to the energy consumption at timestep $t$.

With a model $f(\cdot)$ representing the building dynamics, we formulate an optimal finite-horizon predictive control problem, and propose an efficient algorithm to find the group of optimal control inputs $X_t^{c}$. At timestep $t$, the control input $X_t^{c}$ minimizes the energy consumption of the building for the future $T$ steps. Meanwhile, the control inputs of the previous $T$ steps affect current energy consumption. The objective of the controller is to minimize the energy consumption with a rolling horizon $T$, while maintaining some variables within comfortable intervals. Mathematically, we formulate the general control problem as

$$
\begin{aligned}
\underset{X_t^{c}, \dots, X_{t+T}^{c}}{\text{minimize}} \quad & \sum_{\tau=0}^{T} P_{t+\tau}^2 && \text{(1a)} \\
\text{subject to} \quad & P_{t+\tau} = f(X_{t-T+\tau}, \dots, X_{t+\tau}), \ \forall \tau && \text{(1b)} \\
& \underline{X}_{t+\tau}^{c} \le X_{t+\tau}^{c} \le \overline{X}_{t+\tau}^{c}, \ \forall \tau && \text{(1c)} \\
& \underline{X}_{t+\tau}^{uc} \le X_{t+\tau}^{uc} \le \overline{X}_{t+\tau}^{uc}, \ \forall \tau && \text{(1d)} \\
& X_{t+\tau}^{uc} = h(X_{t-T+\tau}, \dots, X_{t-1+\tau}, X_{t+\tau}^{c}, X_{t+\tau}^{phy}), \ \forall \tau && \text{(1e)}
\end{aligned}
$$

where $f(\cdot)$ in (1b) denotes the rolling horizon predictive model; (1c) and (1d) are the constraints on controllable and uncontrollable variables respectively; and $h(\cdot)$ in (1e) denotes a rolling horizon predictive function for uncontrollable variables based on the observations of the past $T$ steps as well as the current step's control inputs and physical forecasts.

B. First-Order Thermal Dynamic Model

For building HVAC systems, one popular method used in finite-horizon MPC to model the thermal dynamics is the reduced Resistance-Capacitance (RC) model [2], [3], [6]. Here we use a rolling horizon MPC controller as a benchmark for comparison.

Denoting $N(i)$ as the neighboring zones of zone $i$, the first-order RC model of the HVAC dynamics is formulated as

$$
C_i \dot{T}_{i,t} = \frac{T_{o,t} - T_{i,t}}{R_i} + \sum_{j \in N(i)} \frac{T_{j,t} - T_{i,t}}{R_{ij}} + P_{i,t} \tag{2}
$$

where $C_i$, $T_i$ are the thermal capacitance and room temperature for each zone $i$, $T_o$ is the outside dry bulb temperature, and $R_i$, $R_{ij}$ are the thermal resistances of zone $i$ against the outside and the neighboring zone $j$. The schematic of the RC network for modeling the HVAC system is shown in Fig. 2.

Once we find $C_i$, $R_i$, $R_{ij}$ for all the zones, we have a 1st-order system to model the thermal dynamics. Since $T_i \in X^{uc}$ and $T_o \in X^{phy}$, by reformulating (2) and taking a sum of $P_i$ over all zones, we write the building's overall thermal dynamics as

$$
P_t = f_{RC}(X_{t-T}, \dots, X_t) \tag{3}
$$
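As an illustration of how (2) can be simulated, the following is a minimal sketch of one forward-Euler step of the RC model. The function name `rc_step`, the time step `dt`, and the data layout are assumptions made for this example; the paper states only the differential equation.

```python
def rc_step(temps, t_out, power, C, R_out, R_adj, dt):
    """One forward-Euler step of the first-order RC model in (2).

    temps : list of zone temperatures T_i
    t_out : outside dry bulb temperature T_o
    power : list of heat inputs P_i, one per zone
    C     : list of thermal capacitances C_i
    R_out : list of resistances R_i between zone i and the outside
    R_adj : dict mapping (i, j) -> R_ij for neighboring zones
    dt    : time step (hypothetical; not specified in the paper)
    """
    new_temps = []
    for i, T_i in enumerate(temps):
        # Exchange with the outside plus the zone's own heat input.
        flow = (t_out - T_i) / R_out[i] + power[i]
        # Exchange with every neighboring zone j in N(i).
        for (a, b), R_ij in R_adj.items():
            if a == i:
                flow += (temps[b] - T_i) / R_ij
        new_temps.append(T_i + dt * flow / C[i])
    return new_temps


# Two coupled zones with no heat input, cooling toward a 10 degree exterior.
next_temps = rc_step(
    temps=[20.0, 22.0], t_out=10.0, power=[0.0, 0.0],
    C=[1.0, 1.0], R_out=[1.0, 1.0],
    R_adj={(0, 1): 1.0, (1, 0): 1.0}, dt=0.1,
)
print(next_temps)
```

With these toy values both zones move toward the outside temperature, and the warmer zone loses heat faster.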
[Figure 2: RC network schematic with node temperatures $T_i$, $T_j$, $T_o$, resistances $R_i$, $R_{ij}$, capacitance $C_i$, heat input $P_i$ and $Q_{radiation}$.]
Fig. 2. RC network with thermal exchange between different comp.

which is further used in the optimal control problem defined in (1a)-(1e). MPC for building HVAC systems under different model settings has been implemented in [2], [3]. We focus on the performance comparison of the RC model to our proposed method in both model fitting and optimization tasks.

III. RECURRENT NEURAL NETWORKS

Since the 1st-order thermal dynamic model defined in (2) neither captures complicated nonlinear dynamics nor models the long-term temporal dependencies of the building HVAC system, the deep RNN model becomes a good replacement.

RNN is a class of artificial neural networks specially designed for sequential data modeling. Unlike fully-connected neural networks, where inputs are fed into the neural network as a full vector, an RNN feeds input sequentially into a neural network with directed connections. It uses its internal memory to process time-series inputs. In Fig. 3 we show the structure of an RNN model.

Fig. 3. A graphical model illustrating the RNN which is used for modeling $T$-length input-output sequential data. $\theta_{h,t}$, $\theta_{o,t}$, $\theta_{x,t}$ are the neural weights associated with hidden states $h_{t+1}$, output $\hat{o}_t$, and input $x_t$ respectively.

We specifically design the RNN model to solve a time-dependent regression problem. That is to say, we want the RNN to automatically learn the relationship between the sequential input $x_t$, $t = 0, \dots, T$ and the output $o_T$. At timestep $t$, the RNN is provided with the hidden state vector $h_t$ and the input vector $x_t$, and outputs its computation vector $\hat{o}_t$. The $t$-step RNN cell is composed of three groups of neurons, $\theta_{x,t}$, $\theta_{h,t}$, $\theta_{o,t}$. They are associated with input, hidden state and output respectively, and are organized in functions $f_{\theta_{x,t},\theta_{o,t}}$ and $f_{\theta_{h,t},\theta_{x,t}}$ to complete the following computations:

$$
\begin{aligned}
\hat{o}_t &= f_{\theta_{x,t},\theta_{o,t}}(x_t, h_t), && \text{(4a)} \\
h_{t+1} &= f_{\theta_{h,t},\theta_{x,t}}(x_t, h_t) && \text{(4b)}
\end{aligned}
$$

where $\hat{o}_t$ is the RNN's prediction output, while $h_{t+1}$ is passed into the next neuron group and takes part in step $t+1$'s computation.

After concatenating all the neuron cells from 0 to $T$, we get the chain function to compute $h_t$. Thus the RNN computes the final prediction value $\hat{o}_T$:

$$
\hat{o}_T = f_{\theta_{x,T},\theta_{o,T}}(x_T, h_T) \tag{5}
$$

Since $h_t$ captures information from past inputs $x_{t-1}$, we trace hidden states back into functions of the previous steps' hidden states and inputs. Thus the final output $\hat{o}_T$ is eventually a function of the sequential inputs $x_t$, $t = 0, \dots, T$. For simplicity, let us denote $\theta = \{\theta_{h,t}, \theta_{o,t}, \theta_{x,t}\}$, $t = 0, \dots, T$ to be the set of neurons used in modeling the $T$-length temporal data, and wrap up all neural-composed functions $\{f_{\theta_{x,t},\theta_{o,t}}, f_{\theta_{h,t},\theta_{x,t}}\}$ to get the overall function $f_{RNN}$, which utilizes $\theta$ to find the output prediction for a time-series input of length $T$:

$$
\hat{o}_T = f_{RNN}(x_0, \dots, x_T) \tag{6}
$$

We set up the RNN model and initialize neuron weights $\theta$ by sampling from a normal distribution. During the batch-training process, with a group of sequential inputs $x_t$, $t = 0, \dots, T$, $\hat{o}_T$ is first computed, and by doing back-propagation using stochastic gradient descent (SGD) with respect to all neurons [13], $\theta$ is optimized to minimize the regression loss defined in mean-square-error (MSE) form:

$$
\begin{aligned}
L_{training}(\theta) &= \lVert \hat{o}_T - o_T \rVert_2^2, && \text{(7a)} \\
\theta^* &= \arg\min_{\theta} L_{training}(\theta) && \text{(7b)}
\end{aligned}
$$

We then set up a length-$T$ RNN accordingly for our building dynamics modeling problem. With training sets consisting of input vectors of historical building operating profiles $\{X_{t-T}, \dots, X_t\}$ and an output energy consumption $P_t$, our RNN model $f_{RNN}$ is trained to represent the system dynamics

$$
\hat{P}_t = f_{RNN}(X_{t-T}, \dots, X_t) \tag{8}
$$

Our RNN is totally data-driven, and can process and represent temporal dependencies. With a rich volume of historical building operating data $X$ and $P$ provided as the training datasets, we train a deep RNN, which accurately models the nonlinear, complex temporal dynamics of the building system. We will show in Section V that our deep RNN model outperforms the RC model in fitting the dynamics of a large-scale building HVAC system.
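The recurrence in (4a)-(4b) and the loss in (7a) can be sketched with scalar weights as follows. This is an illustrative toy, not the network used in the paper: the `tanh` nonlinearity, the weights shared across timesteps, and the names `rnn_forward` and `squared_error` are all assumptions of this example.

```python
import math


def rnn_forward(xs, w_x, w_h, w_o, h0=0.0):
    """Run a scalar RNN over the sequence xs and return the last output.

    Weights are shared across timesteps here (an assumption); the paper
    indexes them by t.
    """
    h = h0
    o_hat = None
    for x in xs:
        pre = w_x * x + w_h * h
        o_hat = w_o * math.tanh(pre)  # (4a): output computed from (x_t, h_t)
        h = math.tanh(pre)            # (4b): hidden state passed to step t+1
    return o_hat


def squared_error(o_hat, o):
    """Regression loss of (7a) for a scalar output."""
    return (o_hat - o) ** 2


# The same values in a different order give different outputs, because the
# hidden state carries information from earlier inputs.
a = rnn_forward([1.0, 0.0], w_x=1.0, w_h=1.0, w_o=1.0)
b = rnn_forward([0.0, 1.0], w_x=1.0, w_h=1.0, w_o=1.0)
print(a, b, squared_error(a, b))
```

The order sensitivity is the property that lets the final output depend on the whole input sequence, as in (6).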
IV. INPUTS OPTIMIZATION FOR BUILDING CONTROL

In this section we describe our control algorithm, which is based on our pre-trained deep learning model. We demonstrate how it is able to incorporate (8) into the optimization problem (1). We also illustrate how to solve such an optimization problem to find a collection of optimal sequential control inputs.

By substituting $f(\cdot)$ in (1) with $f_{RNN}$, and denoting $X_t^{var} = [X_t^{c}, X_t^{uc}]$, the finite horizon control problem for building energy management is written as

$$
\begin{aligned}
\underset{X_t^{c}, \dots, X_{t+T}^{c}}{\text{minimize}} \quad & \sum_{\tau=0}^{T} P_{t+\tau}^2 && \text{(9a)} \\
\text{subject to} \quad & P_{t+\tau} = f_{RNN}(X_{t-T+\tau}, \dots, X_{t+\tau}), \ \forall \tau && \text{(9b)} \\
& \underline{X}_{t+\tau}^{var} \le X_{t+\tau}^{var} \le \overline{X}_{t+\tau}^{var}, \ \forall \tau && \text{(9c)} \\
& X_{t+\tau}^{uc} = h(X_{t-T+\tau}, \dots, X_{t-1+\tau}, X_{t+\tau}^{c}, X_{t+\tau}^{phy}), \ \forall \tau && \text{(9d)}
\end{aligned}
$$

$X_{t+\tau}^{uc}$, $\tau = 1, \dots, T$ is directly controlled by the control inputs of the previous time. All the uncontrollable variables with constraints that we model also possess paired controllable variables, e.g., temperature measurements and temperature setpoints. We then choose $X_{t+\tau}^{uc} = X_{t-1+\tau}^{c}$, $\tau = 1, \dots, T$, since such uncontrollable values are the control outputs corresponding to the previous step's control inputs. Thus we eliminate constraint (9d).

Since the constrained optimization problem (9) includes a non-convex deep neural network in the constraints, we use log barrier functions to rewrite the problem in an unconstrained form:

$$
\min_{X_t^{c}, \dots, X_{t+T}^{c}} L_{opti}(X_t^{c}, \dots, X_{t+T}^{c}) = \sum_{\tau=0}^{T} f_{RNN}^2(X_{t-T+\tau}, \dots, X_{t+\tau}) - \lambda \sum_{\tau=0}^{T} \log(\overline{X}_{t+\tau}^{var} - X_{t+\tau}^{var}) - \lambda \sum_{\tau=0}^{T} \log(X_{t+\tau}^{var} - \underline{X}_{t+\tau}^{var}) \tag{10}
$$

where $\lambda$ is a tuning parameter, and $L_{opti}(X_t^{c}, \dots, X_{t+T}^{c})$ defines a loss function with inputs $X_t^{c}, \dots, X_{t+T}^{c}$. We solve this loss minimization problem by iteratively taking gradient descent steps on (10). Note that during RNN model training, we take gradients $\nabla_\theta L_{training}(\theta)$ with respect to all the neurons. Once training is done, $L_{training}(\theta)$ has converged. The RNN model then serves as the temporal physical model of the building system dynamics. Here we take gradients with this fixed, pre-trained RNN model, and find the gradients $\nabla_{X_{t+\tau}^{c}} L_{opti}(X_t^{c}, \dots, X_{t+T}^{c})$, $\tau = 0, \dots, T$ with respect to the group of controllable variables. Once $L_{opti}(X_t^{c}, \dots, X_{t+T}^{c})$ has converged, we find an $X_t^{c}$ that is a locally optimal solution. $X_t^{c}$ is also the solution of controllable inputs for the finite horizon optimal control problem at timestep $t$.

The $k$-step gradient descent method works as follows:

$$
\begin{aligned}
g_{t+\tau,k} &= \eta \nabla_{X_{t+\tau,k}^{c}} L_{opti}(X_{t,k-1}^{c}, \dots, X_{t+T,k-1}^{c}) && \text{(11a)} \\
X_{t+\tau,k}^{c} &= X_{t+\tau,k-1}^{c} - g_{t+\tau,k}, \quad \tau = 0, \dots, T && \text{(11b)}
\end{aligned}
$$

where $\eta$ is the learning rate, and $X_{t+\tau,k}^{c}$ denotes the value of $X_{t+\tau}^{c}$ after $k$ update steps.

Throughout our modeling and optimization approach, we do not make any physical model assumptions, and directly utilize a deep RNN to extract the model dynamics as well as to find the optimal actions to take at each time step to cut down energy consumption. We summarize the proposed method in Algorithm 1, which closes the loop for building dynamics modeling and control inputs optimization. In our implementation, we improve the algorithm performance by adding momentum to gradient descent (MomentumGD), which is shown to get over some local minima during optimization iterations as well as to accelerate convergence [14]. The MomentumGD is realized as follows:

$$
\begin{aligned}
g_{t+\tau,k} &= \gamma g_{t+\tau,k-1} + \eta \nabla_{X_{t+\tau,k}^{c}} L_{opti}(X_{t,k-1}^{c}, \dots, X_{t+T,k-1}^{c}) && \text{(12a)} \\
X_{t+\tau,k}^{c} &= X_{t+\tau,k-1}^{c} - g_{t+\tau,k}, \quad \tau = 0, \dots, T && \text{(12b)}
\end{aligned}
$$

where $\gamma$ is a momentum term determining how much of the previous gradients is incorporated into the current step's update.

Algorithm 1 Input Optimization for Building Control
Input: Pre-trained RNN $f_{RNN}$, learning rate $\eta$, momentum $\gamma$, input optimization iterations $N_{iter}$
Input: Control window-size $T$
Input: Sensor measurements $X_t^{uc}$, weather forecasts $X_t^{phy}$
Initialize: $X_t, \dots, X_{t+T}$
Initialize: Optimal control inputs $X^* \leftarrow \emptyset$
for iteration $= 0, \dots, N_{iter}$ do
    Update $X_t^{c}$ using gradient descent:
    for $\tau = 0, \dots, T$ do
        $g_{t+\tau} \leftarrow \nabla_{X_{t+\tau}^{c}} L_{opti}(X_t^{c}, \dots, X_{t+T}^{c})$
        $X_{t+\tau}^{c} \leftarrow X_{t+\tau}^{c} - \eta \cdot MomentumGD(X_{t+\tau}^{c}, g_{t+\tau}, \gamma)$
    end for
    Update $X_t^{uc}$ using gradient descent:
    for $\tau = 0, \dots, T$ do
        $X_{t+\tau}^{uc} = X_{t+\tau-1}^{c}$
    end for
end for
$X^*$.insert($X_t^{c}$)

V. CASE STUDY

In this section, we set up a realistic model in the standard building simulation software EnergyPlus [15]. We demonstrate the effectiveness of our data-driven approach for both system dynamics modeling and building energy management. In order
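To make the interplay of the log barrier in (10) and the momentum update in (12) concrete, here is a one-dimensional sketch. It is a toy under stated assumptions: the stand-in model $f(x) = x$ replaces the trained RNN, the bounds, $\lambda$, $\eta$ and $\gamma$ are arbitrary illustrative values, and the step-halving safeguard is an addition of this example that the paper does not describe.

```python
import math


def l_opti(x, lo, hi, lam):
    """Toy version of (10) with f(x) = x standing in for the RNN (assumption)."""
    return x ** 2 - lam * math.log(hi - x) - lam * math.log(x - lo)


def grad_l_opti(x, lo, hi, lam):
    """Analytic derivative of l_opti with respect to x."""
    return 2 * x + lam / (hi - x) - lam / (x - lo)


def momentum_gd(x0, lo, hi, lam=10.0, eta=1e-3, gamma=0.5, n_iter=2000):
    """Minimize l_opti over (lo, hi) with the momentum update of (12)."""
    x, g = x0, 0.0
    for _ in range(n_iter):
        g = gamma * g + eta * grad_l_opti(x, lo, hi, lam)  # (12a)
        step = g
        # Safeguard (not in the paper): halve the step until the iterate
        # stays strictly inside the barrier's domain.
        while not (lo < x - step < hi):
            step *= 0.5
        x -= step  # (12b)
    return x


# A setpoint constrained to (18, 26): the energy term pulls x toward the
# lower bound while the barrier keeps it strictly inside the interval.
x_opt = momentum_gd(x0=22.0, lo=18.0, hi=26.0)
print(round(x_opt, 3))
```

The minimizer settles slightly above the lower bound; shrinking $\lambda$ would move it closer to the bound at the cost of a steeper barrier and a smaller stable step size.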
to compare with the model-based approach, we focus on the HVAC system for a large building complex. But our method is a general regression and optimization approach, which could be easily applied to the overall building energy management problem.

A. Experimental Setup

We set up our EnergyPlus simulations using a 12-storey large office building (in Fig. 4) listed in the commercial reference buildings from the U.S. Department of Energy (DoE CRB) [16]. The building has a total floor area of 498,584 square feet, which is divided into 16 separate zones. We simulate the building running through the year of 2004 in Seattle, WA, and record $(X_t, P_t)$ with a resolution of 10 minutes. We shuffle and separate 2 months of data as our stand-alone testing dataset for both regression and control performance evaluation, while the remaining 10 months of data are used for RNN training. The processed datasets have 55 input features, which include controllable variables such as zone temperature setpoints, and uncontrollable variables such as zone occupancies and temperature measurements. The output is a single feature for energy consumption at each timestep. We directly feed historical weather data records into both the RC model and the RNN model. For future work, the forecast model should also be considered in the pipeline. A finite horizon of 4 hours is set for both the MPC method and the proposed method.

We set up our deep learning model using Tensorflow, a Python open-source package. Our RNN model is composed of 1 recurrent layer with 3 subsequent fully-connected layers. We adopt rectified linear unit (ReLU) activation functions, dropout layers and the Stochastic Gradient Descent (SGD) optimizer to improve our neural network training.

Fig. 4. Schematic diagram of simulated large commercial building.

B. Simulation Results

We first compare the model fitting performance of the 1st-order model and the RNN model; the fitting result for two weeks of energy consumption is shown in Fig. 5. To quantitatively compare the model fitting error, we calculate the Root-Mean-Square-Error (RMSE) value for normalized energy consumption on the test dataset. The RMSE for the first-order RC model is 0.240. The RNN model improves on the RC model by 68.33% with an RMSE of 0.076. It is also interesting to notice that this large office building actually has an energy consumption drop at noon on weekdays due to the occupancy schedule. The RC model fails to capture this dynamic characteristic, while the RNN model is able to fit noon values given the past 4 hours of input measurements. Moreover, the RC model performs poorly on the weekend regression task and hardly represents the HVAC dynamics. This inaccurate model would make the subsequent MPC algorithm fail to operate on the correct model space.

[Figure 5: energy consumption [J] versus days for Measurements, RC and RNN.]
Fig. 5. Comparison of buildings real energy consumption measurements(real), Recurrent Neural Networks predicted energy consumption (blue) and RC model prediction (green) on a week of testing data.

Next we show that the constrained optimization problem formulated in (10) is efficient in finding optimal inputs $X_t^{c}$ for the HVAC system. In Fig. 6 we show a group of 3 plots corresponding to different zone temperature setpoint constraints. We keep setpoint constraints the same for all the 16 zones. Comparing the results of $X_t^{c} \in$ [18°C, 26°C] and the results of $X_t^{c} \in$ [19°C, 24°C], we observe that our approach is able to find sharper control inputs with less energy consumption when constraint intervals are bigger. When there is no constraint on temperature, our approach simply finds extreme control inputs such that the energy consumption is nearly the same as the midnight consumption.

We then compare the optimization performance of the RC model and the RNN model. Fig. 7 illustrates a Monday-Friday energy consumption profile with temperature setpoint constraints $X_t^{c} \in$ [18°C, 26°C]. By using the RNN model and taking the gradient steps, we find a sequence of control inputs that could reduce energy consumption by 30.74%. On the other hand, the solution found by the RC model only gives us a 4.07% reduction of energy consumption. This further illustrates that the RC model is not good at modeling large-scale building system dynamics.

Fig. 8 demonstrates how our proposed approach is able to find a group of control inputs for the building system globally. The setpoint schedules of all four zones exhibit daily patterns. Yet they are set to different values and evolution patterns. These setpoint schedules can be provided to building operators, and it remains to be examined in real buildings whether such optimized schedules could benefit the complex system as a whole.
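The 68.33% figure quoted above follows from the two reported RMSE values as a relative reduction. The short sketch below reproduces that arithmetic; the helper names `rmse` and `relative_improvement` are chosen for this example and do not come from the paper.

```python
import math


def rmse(predictions, targets):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


def relative_improvement(baseline_rmse, new_rmse):
    """Fraction by which new_rmse reduces baseline_rmse."""
    return (baseline_rmse - new_rmse) / baseline_rmse


# RMSE values reported in the text: 0.240 for the RC model, 0.076 for the RNN.
improvement = relative_improvement(0.240, 0.076)
print(f"{improvement:.2%}")  # prints 68.33%
```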
[Figure 6: energy consumption [J] versus time of the day (h) for Measurements, L19H24, L18H26 and Unconstrained.]
Fig. 6. Effects of constraints interval on optimization performance.

[Figure 7: energy consumption [J] over Mon-Fri for Measurements, RC-Opti and RNN-Opti.]
Fig. 7. Comparison of optimization results of RC model (green) and RNN model (blue) with respect to original measurements (red).

[Figure 8: temperature [C] panels for zones basement, bottom_core, top_core and top_sub4.]
Fig. 8. One week temperature setpoint profile for 4 different zones of the office building.

VI. CONCLUSION

In this work, we are exploiting Recurrent Neural Networks' ability to learn complex temporal interactions among high-dimensional building dynamics. Our proposed method consists of Recurrent Neural Network regression and sequence optimization steps, which can both be solved efficiently. Our proposed approach is easy to deploy for any building unit provided with rich historical running data. Simulation results show that our method outperforms existing ones both in capturing the thermal dynamics of the building and in providing effective control solutions.

REFERENCES

[1] C.-C. Cheng, S. Pouffary, N. Svenningsen, and J. M. Callaway, “The kyoto protocol, the clean development mechanism and the building and construction sector: A report for the unep sustainable buildings and construction initiative,” 2008.
[2] Y. Ma, A. Kelman, A. Daly, and F. Borrelli, “Predictive control for energy efficient buildings with thermal storage: Modeling, stimulation, and experiments,” IEEE Control Systems, vol. 32, no. 1, pp. 44-64, 2012.
[3] X. Zhang, W. Shi, B. Yan, A. Malkawi, and N. Li, “Decentralized and distributed temperature control via hvac systems in energy efficient buildings,” arXiv preprint arXiv:1702.03308, 2017.
[4] Y. Shi, B. Xu, B. Zhang, and D. Wang, “Leveraging energy storage to optimize data center electricity cost in emerging power markets,” arXiv preprint arXiv:1606.01536, 2016.
[5] M. Trčka and J. L. Hensen, “Overview of hvac system simulation,” Automation in Construction, vol. 19, no. 2, pp. 93-99, 2010.
[6] Y. Ma, F. Borrelli, B. Hencey, A. Packard, and S. Bortoff, “Model predictive control of thermal energy storage in building cooling systems,” in Decision and Control, 2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proceedings of the 48th IEEE Conference on. IEEE, 2009, pp. 392-397.
[7] A. Jain, M. Behl, and R. Mangharam, “Data predictive control for building energy management,” in BuildSys, 2017.
[8] L. Yang, Z. Nagy, P. Goffin, and A. Schlueter, “Reinforcement learning for optimal control of low exergy buildings,” Applied Energy, vol. 156, pp. 577-586, 2015.
[9] T. Wei, Y. Wang, and Q. Zhu, “Deep reinforcement learning for building hvac control,” in The Design and Automation Conference, 2017.
[10] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, “Playing atari with deep reinforcement learning,” arXiv preprint arXiv:1312.5602, 2013.
[11] K.-i. Funahashi and Y. Nakamura, “Approximation of dynamical systems by continuous time recurrent neural networks,” Neural networks, vol. 6, no. 6, pp. 801-806, 1993.
[12] D. H. Nguyen and B. Widrow, “Neural networks for self-learning control systems,” IEEE Control systems magazine, vol. 10, no. 3, pp. 18-23, 1990.
[13] L. Bottou, “Large-scale machine learning with stochastic gradient descent,” in Proceedings of COMPSTAT2010. Springer, 2010, pp. 177-186.
[14] I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” in International conference on machine learning, 2013, pp. 1139-1147.
[15] D. B. Crawley, L. K. Lawrie, F. C. Winkelmann, W. F. Buhl, Y. J. Huang, C. O. Pedersen, R. K. Strand, R. J. Liesen, D. E. Fisher, M. J. Witte et al., “Energyplus: creating a new-generation building energy simulation program,” Energy and buildings, vol. 33, no. 4, pp. 319-331, 2001.
[16] M. Deru, K. Field, D. Studer, K. Benne, B. Griffith, P. Torcellini, B. Liu, M. Halverson, D. Winiarski, M. Rosenberg et al., “Us department of energy commercial reference building models of the national building stock,” 2011.

126
main.py

@@ -2,6 +2,10 @@ import pandas as pd
 import numpy as np
 import tensorflow as tf
 import matplotlib.pyplot as plt
+from keras import regularizers
+from keras.callbacks import TensorBoard, LearningRateScheduler
+from keras.layers import Dropout
+from sklearn.preprocessing import MinMaxScaler
 def data_read(data_address):
@@ -16,8 +20,7 @@ def data_read(data_address):
     X = np.array(df['Voltage'])
     Y = np.array(df['Problem'])
-    # Normalize
-    X = (X - np.min(X)) / (np.max(X) - np.min(X))
+    X = tf.nn.relu(X)
     # Convert to time-series data format
     time_steps = 34
@@ -26,7 +29,7 @@ def data_read(data_address):
         X_series.append(X[i:(i + time_steps)])
         Y_series.append(Y[i + time_steps - 1])
-    return np.array(X_series).reshape(-1, time_steps, 1), np.array(Y_series)
+    return np.array(X_series), np.array(Y_series)
 X_train, Y_train = data_read(
@@ -34,39 +37,67 @@ X_train, Y_train = data_read(
 X_test, Y_test = data_read(
     'Liu/data/VOLTAGE-QUALITY-CLASSIFICATION-MODEL--main/Voltage Quality Test.csv')
+# Normalize
+sc = MinMaxScaler(feature_range=(0, 1))
+X_train = sc.fit_transform(X_train)
+X_test = sc.transform(X_test)
+X_train.reshape(-1, 34, 1)
+X_test.reshape(-1, 34, 1)
+np.random.seed(7)
+np.random.shuffle(X_train)
+np.random.seed(7)
+np.random.shuffle(Y_train)
+tf.random.set_seed(7)
 # Get the number of classes
 n_classes = len(np.unique(Y_train))
-# Build the model
-model = tf.keras.models.Sequential([
-    tf.keras.layers.LSTM(50, return_sequences=True, input_shape=(34, 1)),
-    tf.keras.layers.LSTM(50),
-    tf.keras.layers.Dense(n_classes, activation='softmax')  # changed to support multi-class classification
-])
-# Compile the model
-model.compile(optimizer='adam',
-              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
-# Train the model
-model.fit(X_train, Y_train, epochs=10, validation_split=0.2)
-# Evaluate the model
-loss, accuracy = model.evaluate(X_test, Y_test)
-print(X_train.shape)
-# Create the perturbed data
 # Loss function
 loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
-# Define the loss function
-loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
+# Build the model using an RNN
+model = tf.keras.models.Sequential([
+    tf.keras.layers.SimpleRNN(100, return_sequences=True, input_shape=(34, 1)),
+    Dropout(0.2),
+    tf.keras.layers.SimpleRNN(100),
+    Dropout(0.2),
+    tf.keras.layers.Dense(n_classes, activation='relu',
+                          kernel_regularizer=regularizers.l2(0.3))  # supports multi-class classification
+])
+# Compile the model
+model.compile(
+    optimizer='SGD',
+    loss=loss_fn,
+    metrics=['accuracy'])
+# Define the exponentially decaying learning-rate function
+def lr_schedule(epoch):
+    initial_learning_rate = 0.01
+    decay_rate = 0.1
+    decay_steps = 250
+    new_learning_rate = initial_learning_rate * decay_rate ** (epoch / decay_steps)
+    return new_learning_rate
+# Define the learning-rate scheduler
+lr_scheduler = LearningRateScheduler(lr_schedule)
+# Save initial weights
+initial_weights = model.get_weights()
+# TensorBoard callback
+log_dir = "logs/fit"
+tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)
+# Create the perturbed data
 # Convert X_train and Y_train to TensorFlow tensors
-X_train_tensor = tf.convert_to_tensor(X_train, dtype=tf.float32)
+X_train_tensor = tf.convert_to_tensor(X_train, dtype=tf.float64)
 Y_train_tensor = tf.convert_to_tensor(Y_train, dtype=tf.int32)
 # Use tf.GradientTape to compute the gradients
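Review note on the `lr_schedule` added in this hunk: the sketch below evaluates the same formula at a few epochs to show how far the learning rate decays over the 1500-epoch runs used later in the file. The constants are copied from the diff; exposing them as keyword arguments is a choice made for this example only.

```python
def lr_schedule(epoch, initial_learning_rate=0.01, decay_rate=0.1, decay_steps=250):
    """Exponential decay: the rate drops by a factor of decay_rate every decay_steps epochs."""
    return initial_learning_rate * decay_rate ** (epoch / decay_steps)


# Learning rate at the start, after one decay period, and at the end of training.
for epoch in (0, 250, 500, 1500):
    print(epoch, lr_schedule(epoch))
```

With these constants the rate falls tenfold every 250 epochs, so by epoch 1500 it is roughly 1e-8, which means the later epochs change the weights very little.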
@@ -81,35 +112,34 @@ with tf.GradientTape() as tape:
 # Compute the gradients with respect to the input X
 gradients = tape.gradient(loss, X_train_tensor)
-# Compute the L2 norm of each input's gradient
-gradient_magnitudes = tf.norm(gradients, axis=1)
 # Dictionary mapping each gamma to its accuracies
 accuracy_per_gamma = {}
 # Flatten the gradients
 flattened_gradients = tf.reshape(gradients, [-1])
 # Select the largest γ * |X| gradients
 for gamma in [0.05, 0.1, 0.2, 0.4]:
-    num_gradients_to_select = int(gamma * tf.size(flattened_gradients, out_type=tf.dtypes.float32))
-    top_gradients_indices = tf.argsort(flattened_gradients, direction='DESCENDING')[:num_gradients_to_select]
+    num_gradients_to_select = int(
+        gamma * tf.size(flattened_gradients, out_type=tf.dtypes.float32))
+    top_gradients_indices = tf.argsort(flattened_gradients, direction='DESCENDING')[
+        :num_gradients_to_select]
     # Create a new gradient tensor, initialized as a copy of the original gradients
     updated_gradients = tf.identity(flattened_gradients)
     # Create a boolean mask that is False for the selected largest gradients and True elsewhere
     mask = tf.ones_like(updated_gradients, dtype=bool)
-    mask = tf.tensor_scatter_nd_update(mask, tf.expand_dims(top_gradients_indices, 1), tf.zeros_like(top_gradients_indices, dtype=bool))
+    mask = tf.tensor_scatter_nd_update(mask, tf.expand_dims(
+        top_gradients_indices, 1), tf.zeros_like(top_gradients_indices, dtype=bool))
     # Use this mask to update the gradients
-    updated_gradients = tf.where(mask, tf.zeros_like(updated_gradients), updated_gradients)
+    updated_gradients = tf.where(mask, tf.zeros_like(
+        updated_gradients), updated_gradients)
     # Reshape the gradients back to the original shape
     updated_gradients = tf.reshape(updated_gradients, tf.shape(gradients))
     # Create the accuracy list
     accuracy_list = []
@@ -118,18 +148,16 @@ for gamma in [0.05, 0.1, 0.2, 0.4]:
         # Apply the learning rate to the gradients
         scaled_gradients = learning_rate * updated_gradients
         # Update X_train_tensor with the scaled gradients
-        X_train_updated = X_train_tensor - scaled_gradients
+        X_train_updated = tf.add(X_train_tensor, scaled_gradients)
+        tf.reshape(X_train_updated, (3332,34,1))
         X_train_updated = X_train_updated.numpy()
-        # Compile the model
-        model.compile(
-            optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
+        # Reset model weights to initial weights
+        model.set_weights(initial_weights)
-        # Train the model
-        model.fit(X_train_updated, Y_train, epochs=1500, validation_split=0.2)
+        # Train the model with the TensorBoard callback
+        history = model.fit(X_train_updated, Y_train, epochs=1500,
+                            batch_size=32, callbacks=[tensorboard_callback, lr_scheduler])
         # Evaluate the model
         loss, accuracy = model.evaluate(X_test, Y_test)
@@ -137,9 +165,11 @@ for gamma in [0.05, 0.1, 0.2, 0.4]:
         # Record the accuracy
         accuracy_list.append(accuracy)
     # Record the accuracies for this gamma
     accuracy_per_gamma[gamma] = accuracy_list
 # Sample learning rates
 learning_rates = [0.1, 0.2, 0.3, 0.4, 0.5]
@@ -147,11 +177,12 @@ learning_rates = [0.1, 0.2, 0.3, 0.4, 0.5]
 gammas = [0.05, 0.1, 0.2, 0.4]
 # Create the figure
-plt.figure(figsize=(10, 6))
+last_plt = plt.figure(figsize=(10, 6))
 # Plot one curve per gamma value
 for gamma in gammas:
-    plt.plot(learning_rates, accuracy_per_gamma[gamma], marker='o', label=f'Gamma={gamma}')
+    plt.plot(learning_rates,
+             accuracy_per_gamma[gamma], marker='o', label=f'Gamma={gamma}')
 # Add the title and labels
 plt.title('Accuracy vs Learning Rate for Different Gammas')
@@ -162,3 +193,4 @@ plt.legend()
 # Show the figure
 plt.show()
+# tensorboard --logdir logs/fit
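Review note on the perturbation step in main.py: the selection logic (keep only the top γ fraction of gradient entries, zero the rest, then add the scaled result to the inputs) can be checked in isolation with a small pure-Python sketch. The function name `top_gamma_perturbation` and the flat-list layout are assumptions of this example; it ranks by signed gradient value, as the `tf.argsort(..., direction='DESCENDING')` call in the diff does, so ranking by magnitude would be a different choice.

```python
def top_gamma_perturbation(x, grads, gamma, step):
    """Add step * gradient to the gamma fraction of entries with the largest gradients.

    x, grads : flat lists of equal length
    gamma    : fraction of entries to perturb
    step     : scale applied to the selected gradients
    """
    k = int(gamma * len(grads))
    # Indices ordered by signed gradient value, largest first.
    order = sorted(range(len(grads)), key=lambda i: grads[i], reverse=True)
    selected = set(order[:k])
    # Unselected entries are left unchanged, which matches zeroing their gradient.
    return [xi + step * gi if i in selected else xi
            for i, (xi, gi) in enumerate(zip(x, grads))]


perturbed = top_gamma_perturbation(
    x=[0.0, 0.0, 0.0, 0.0, 0.0],
    grads=[0.1, -0.5, 0.9, 0.3, 0.0],
    gamma=0.4,
    step=0.5,
)
print(perturbed)
```

In this toy run only the entries with gradients 0.9 and 0.3 are changed; the large negative gradient at index 1 is never selected because the ranking is signed.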