Compare commits
12 Commits
Author | SHA1 | Date |
---|---|---|
MuJ | 2edab9976a | |
MuJ | e3d153646f | |
MuJ | 5726c2a45e | |
MuJ | 111afc5b4c | |
MuJ | b4431d20ab | |
MuJ | 1aa5d08034 | |
MuJ | 31f4fbc323 | |
MuJ | 58e266e184 | |
MuJ | 19a079f76a | |
MuJ | cf0b038283 | |
MuJ | f7cce10e97 | |
MuJ | f93459c4d0 |
|
@ -0,0 +1,459 @@
|
|||
Is Machine Learning in Power Systems Vulnerable?

Yize Chen∗, Yushi Tan∗, and Deepjyoti Deka†
∗Department of Electrical Engineering, University of Washington, Seattle, USA
{yizechen, ystan}@uw.edu
†Theory Division, Los Alamos National Laboratory, Los Alamos, USA
deepjyoti@lanl.gov
arXiv:1808.08197v2 [cs.SY] 27 Aug 2018

Abstract: Recent advances in Machine Learning (ML) have led to its broad adoption in a series of power system applications, ranging from meter data analytics and renewable/load/price forecasting to grid security assessment. Although these data-driven methods yield state-of-the-art performance in many tasks, the robustness and security of applying such algorithms in modern power grids have not been discussed. In this paper, we attempt to address the issues regarding the security of ML applications in power systems. We first show that most of the current ML algorithms proposed in power systems are vulnerable to adversarial examples, which are maliciously crafted input data. We then adopt and extend a simple yet efficient algorithm for finding subtle perturbations, which could be used for generating adversaries for both categorical (e.g., user load profile classification) and sequential applications (e.g., renewables generation forecasting). Case studies on classification of power quality disturbances and forecast of building loads demonstrate the vulnerabilities of current ML algorithms in power networks under our adversarial designs. These vulnerabilities call for the design of robust and secure ML algorithms for real-world applications.

I. INTRODUCTION

Modern power systems, with deeper penetration of renewable generation and a higher level of demand-side participation, face an increasing degree of complexity and uncertainty [1], [2]. Reliable operation of the grid in this context calls for improved techniques in system modeling, assessment and decision making [3], [4], [5]. On the one hand, smart meters and advanced sensing technologies have made the collection of fine-grained electricity data, both historical and streaming, available to system operators [6]. On the other hand, there is an urgent need for efficient and near real-time algorithms to analyze and make better use of these available data.

Recent advancements in Machine Learning (ML) algorithms, especially the giant leaps in deep learning, make ML a good candidate for solving a series of data-driven problems in power systems [7]. To name a few, ML methods such as Recurrent Neural Networks (RNN) find straightforward applications in wind/solar power and building load forecasting [8], [9], [10]. In [4], [11], ML algorithms are applied to power grid outage detection, while in [6], deep convolutional neural networks are adopted for classifying user load profiles. Planning and control problems in power systems, such as HVAC control and grid protection policy-making, can also be solved via ML approaches [9], [12]. All of the algorithms mentioned above have either achieved better performance than traditional model-based methods or have proven to be computationally more efficient. This progress has shown the great potential of applying ML in power systems.

However, since power systems are at the core of critical infrastructures, we take a cautious step back and ask ourselves two simple yet unanswered questions:

"Is ML in power systems vulnerable to data attacks? Are vulnerabilities of ML-integrated power systems easy to decipher by an adversary?"

[Fig. 1 diagram. Labels: Raw Data; ML Model; Power System Tasks (Renewables Forecasts, Outage Detection, Load Forecasting, Smart Meter Data Classification, Planning and Control); Normal Operation; Adversarial Data; Adversarial ML Model; Craft Adversaries; Adversarial Inputs; Attacks on Learning Algorithms; Tasks Fail.]

Fig. 1: The schematic of the proposed attack on ML in power systems. (Black) Normal ML operations, which learn from the given raw data and have various applications in power systems; (Red) without any knowledge of the targeted ML model (Blue), attackers can generate adversarial examples using only raw data. Such adversaries exploit the vulnerabilities of the targeted ML models.

Unfortunately, in this paper, we answer both of these questions affirmatively. By adopting and extending the algorithms proposed in [13], [14], we show that most of the ML algorithms designed for power systems are vulnerable to adversarial data manipulation, often under very weak assumptions on adversarial ability. As depicted in Fig. 1, attackers do not need any access to the operating ML model itself. Using limited access to the input data, one can generate adversarial data by injecting designed perturbations into the original data. The operating ML model's performance (e.g., classification accuracy) is greatly impaired by these adversarial inputs.

To demonstrate that such vulnerabilities broadly exist in currently proposed ML algorithms for power systems, we show two typical cases, one categorical and one time-series. In the first case, we successfully attack a power quality disturbance classifier [4], [11], which leads to misclassification of over 70% of the given adversarial voltage signals (e.g., sag signals labeled as normal). In the second case, we consider an RNN-based building load forecasting model [9], [15]. After imposing crafted perturbations on input variables such as the temperature setpoints and building occupancies, the attack results in a significant performance degradation, in the sense that the prediction accuracy drops by a factor of ten. The adversaries in both cases thus have detrimental impacts on power system operations.

A. Contributions

In the area of computer vision, researchers have found that Neural Network models behave poorly on some crafted images created by simply adding noise to clean images [16]. This kind of misbehavior on noisy input may be more hazardous for highly automated power systems, since one single wrong decision made by the ML model could undermine secure operation and lead to a large-scale blackout. In light of the criticality of secure sensing and estimation in power grids, this paper includes the following key contributions:

• We highlight and discuss the general security issues of ML algorithms in power systems.

• We propose an efficient attacking strategy, which can find the vulnerabilities of ML algorithms in both static and transient cases.

• We provide detailed numerical simulations of the proposed adversarial algorithm design, which reveal the vulnerabilities of current ML approaches. We also open-source our code for reproducing the results and testing the security of other physical-ML integrated systems (see footnote 1).

The rest of the paper is organized as follows. In Section II we discuss the general model setup for learning problems in power systems; in Section III we describe our implementations of attacks on ML models; in Section IV we show two representative cases of vulnerabilities in current algorithms; we draw conclusions in Section V with more discussion on the security and robustness of ML models in power systems.

II. MACHINE LEARNING IN POWER SYSTEMS

In this section, we briefly review the ML models of interest, along with the specific model architecture in the case of Neural Networks. We also introduce the model setup of Recurrent Neural Networks (RNN), a powerful modeling and learning algorithm for sequential data.

A. Learning Task

Machine Learning provides tools for learning the patterns or relationships in available data, which can be generalized to future operation and decision-making in power systems. The supervised learning setup is normally considered, where a paired training dataset X, Y is given. X, Y are vectors of fixed dimensions. For instance, in the case of power quality classification, X are the collected fixed-length power signals, while Y are the one-hot encoded vectors of the respective labels [4]. The ML model aims to learn a function f_θ(X) that maps from X to Y with model parameters θ. For convenience, we sometimes suppress the θ symbol. In order to find such a mapping, we consider the general algorithm

$$\theta^{*} = \arg\min_{\theta} L(f_{\theta}(X), Y) \tag{1}$$

where L(·, ·) is a pre-defined loss function. For instance, the L2 loss can be used directly to measure the distance, as is common in LASSO along with L1 regularization on θ; in the case of classification using Neural Networks, we may choose cross-entropy for L(·, ·); and in the case of regression via Neural Networks, an L2 loss is suitable for measuring the deviation of the model's outputs from the true values.

Since many of the ML applications [12], [9], [4] have focused on utilizing the learning and representation capabilities provided by Neural Networks, here we briefly illustrate the learning procedure for Neural Networks' parameters (neurons' weights). Neural Networks are composed of stacked, differentiable "neuronal" layers, such as fully connected layers, convolutional layers and activation functions. They are powerful in learning tasks with high-dimensional X and Y. Though there are many variants of the iterative steps, the standard back-propagation procedure via gradient descent for updating model weights is summarized as follows:

$$\theta_{i+1} = \theta_{i} - \eta \nabla_{\theta} L(f_{\theta}(X), Y) \tag{2}$$

where η is the learning rate, and the subscripts on θ denote the iteration steps on the weight parameters of the Neural Network. Once the model is trained using X, Y via (1) and (2), we get an accurate model f_θ∗. Recent progress in deep learning has enabled Neural Networks composed of millions of neurons to outperform other algorithms in many real-world applications [7].

B. RNN Model

In many cases, the states in power systems are not static, but rather evolve in a sequential manner. For instance, future solar power and wind generation have temporal and spatial correlations. Under this scenario, the Recurrent Neural Network (RNN) becomes a good fit, as its structure allows it to model temporal dependencies in sequence data [17].

Modeling via RNN requires a group of sequential input samples x = {x_0, ..., x_T}, where T is the memory length. The weight coefficients of the RNN consist of three subsets: θ_{in,t}, θ_{out,t} and θ_{hidden,t}. The RNN also allows for linked neurons between neighboring timesteps. The RNN cell at timestep t takes the hidden state h_t and the input x_t, and delivers the output ŷ_t as well as the next step's hidden state h_{t+1}. The t-step RNN cell thus completes the following computations:

$$\hat{y}_{t} = f_{\theta_{in,t},\,\theta_{out,t}}(x_{t}, h_{t}) \tag{3a}$$
$$h_{t+1} = f_{\theta_{hidden,t},\,\theta_{in,t}}(x_{t}, h_{t}) \tag{3b}$$
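The recurrence in (3a) and (3b) can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes scalar inputs and hidden states, a tanh nonlinearity, and weights shared across timesteps. The names `w_in`, `w_hidden` and `w_out` are hypothetical stand-ins for θ_in, θ_hidden and θ_out.

```python
import math


def rnn_cell(x_t, h_t, w_in, w_hidden, w_out):
    """One RNN step, returning (y_hat_t, h_next).

    Assumed toy forms of (3a)/(3b); the paper does not fix the exact
    functional form of f.
    """
    # (3a): the output depends on x_t and h_t through w_in and w_out.
    y_hat_t = w_out * math.tanh(w_in * x_t + h_t)
    # (3b): the next hidden state depends on x_t and h_t through
    # w_in and w_hidden.
    h_next = math.tanh(w_in * x_t + w_hidden * h_t)
    return y_hat_t, h_next


def rnn_forward(xs, w_in=0.5, w_hidden=0.8, w_out=1.0):
    """Unroll the cell over a sequence x_0, ..., x_T, starting from h_0 = 0."""
    h = 0.0
    outputs = []
    for x_t in xs:
        y_hat_t, h = rnn_cell(x_t, h, w_in, w_hidden, w_out)
        outputs.append(y_hat_t)
    return outputs


# The last output still depends on the first input via the hidden state.
baseline = rnn_forward([0.0, 0.0, 0.0])
perturbed = rnn_forward([1.0, 0.0, 0.0])
```

Because every step is differentiable in `x_t`, a change in an early input propagates to later outputs, which is the property the attack in Section III relies on.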
By stacking such cells over time, the hidden state can be used to store and transfer the input information from previous steps. With a memory length of T, the output ŷ_T is essentially a function of x_0, ..., x_T. We can then conclude that the RNN's modeling and learning strategies also take the form of (1) and (2), where X is composed of the sequential vectors {x_t, t = 0, ..., T}.

[Fig. 2 diagram: an RNN unrolled over timesteps t-1, t, t+1, t+2, with input neurons (Input X, weights θ_in), hidden neurons (weights θ_hidden) and output neurons (Output Y, weights θ_out).]

Fig. 2: Basic RNN structure composed of hidden, input and output neurons. The output ŷ_T is a function of the sequential input x_0, ..., x_T with a memory length T.

1 Code repository: https://github.com/chennnnnyize/PowerAdversary

III. CRAFTING ATTACKS FOR ML

In this section, we first give mathematical definitions of adversarial examples, which exploit ML's vulnerabilities. We then propose an algorithm that is a variant of the Neural Network attack approach proposed in [16]. Our proposed algorithm produces adversarial examples for both normal Neural Networks and sequential models such as RNN.

A. Adversarial Examples

Consider any given supervised ML model f_θ with a corresponding paired dataset X, Y. We assume that an attacker has no access to the model f and cannot modify it. Instead we consider the mild setting in which the attacker can only change the input samples X to X∗ fed to the model, so that its output f_θ(X∗) is no longer accurate compared to the ground truth Y. Moreover, to avoid detection by the system operator, the attacker ensures that the adversarial input X∗ is close to the true input X. For instance, an attacker tries to modify the system voltage wavelet signals such that an ML-based power quality classifier classifies them falsely, while making sure that such changes to the signals are not observed by the system operator. Formally, the attacker crafts an adversary by solving the following optimization problem:

$$\max_{\delta_X} \; L(f_{\theta}(X^{*}), Y) \tag{4a}$$
$$\text{s.t.} \quad X^{*} = X + \delta_X \tag{4b}$$
$$\|\delta_X\|_{d} \le \gamma \cdot |X| \tag{4c}$$

where δ_X in (4b) is the perturbation we add to the clean samples X, and (4c) constrains the level of perturbation γ allowed for X∗. Different choices of d for the norm of δ_X lead to different constraints on adversarial manipulation:

• d = 0: (4) has a similar objective to the Grad0 attack proposed in [14], where γ denotes how many dimensions of the input data are allowed to be modified.

• d = ∞: (4) has a similar objective to the Fast Gradient Sign (FGS) attack proposed in [16], where γ denotes the maximum level of noise allowed on each dimension of δ_X.

We also observe an interesting connection between (2) (operator) and (4) (adversary): the ML training algorithm optimizes over the model parameters θ to minimize the model loss, while the adversary's task is the opposite, namely to optimize over the model inputs X to maximize the model loss. Specifically, we look into the case of Neural Networks, which are highly non-convex in both X and θ and have been shown to achieve state-of-the-art performance in several power system applications. Since solving (2) typically yields an accurate model in practice, we are interested in finding ways to solve (4), which would provide insights into the vulnerabilities of Neural Networks used in power systems.

B. Crafting Adversarial Examples

In this sub-section, we propose an efficient attack algorithm which can incorporate the constraint (4c) with d = 0 and d = ∞ and exploit the vulnerabilities of both normal Neural Networks and sequential models like RNN.

1) Adversarial Samples without d = 0 Constraint: Since the optimization problem (4) is highly nonconvex and high-dimensional, it is intractable to find the globally optimal solution X∗. Alternatively, since the gradients of L(f_θ(X), Y) encode the loss landscape, we propose a gradient ascent step on the loss function with respect to X to obtain small perturbations that increase L(f_θ(X), Y):

$$X^{*} = X + \delta_X = X + \epsilon \nabla_{X} L(f_{\theta}(X), Y) \tag{5}$$

where ε controls the noise level added to the clean samples. Crafting attacks via (5) follows the FGS attack strategy, which has found vulnerabilities in ML models used in computer vision. Yet this attack places no constraint of the form ||δ_X||_0 ≤ γ · |X|, so the attacker is assumed to have control of and access to every entry of X, which adds relatively large perturbations to the input.

2) Adversarial Samples with d = 0 Constraint: We now discuss the constraint on the number of entries the attacker is allowed to modify. The attacker shall only change the γ · |X| input entries that have the most impact on L(f_θ(X + δ_X), Y). Formally, let A denote the set of the largest γ · |X| entries of ∇_X L(f_θ(X + δ_X), Y), and let S denote the entire set of entries. We then propose the following operation to obtain adversarial samples under the constraint ||δ_X||_0 ≤ γ · |X|:

$$\delta_{X_A} = \epsilon \nabla_{X_A} L(f_{\theta}(X), Y) \tag{6a}$$
$$\delta_{X_{S \setminus A}} = 0 \tag{6b}$$

Here S\A denotes the complement of A, i.e., the input entries that are left unmodified. The final adversarial examples can still be generated via X∗ = X + δ_X. Since all the ML models considered in this paper, including normal Neural Networks and RNN structures, are differentiable with respect to the input, we highlight the universality of the proposed algorithm in finding the vulnerabilities of any trained model.

Note that even though (5) is applied only once in our proposed algorithm, without any iterative optimization on δ_X, we show in Section IV that the trained, unknown model is vulnerable to such attacks.

We also distinguish our work from previous attack and defense research in power systems [18], [19]. Previous research only exploited the vulnerabilities of state estimation, while we find weaknesses in general ML tasks in power systems. Moreover, the proposed algorithm works under the black-box setting. In other words, the attacker only needs to train its own surrogate ML model, without any knowledge of the operator's model f_θ. Adversarial examples X∗ found for the surrogate model can then be used to attack the unknown ML model f_θ operating in the power system. We summarize the algorithm in Algorithm 1.

[Fig. 3 plots: voltage (p.u.) versus sampling steps. Panels are labeled "ground truth | NN classification result". Top row, clean test samples: Normal|Normal, Sag|Sag, Impulse|Impulse, Distortion|Distortion. Bottom row, adversarial test samples: Normal|Sag, Sag|Normal, Impulse|Normal, Distortion|Normal.]

Fig. 3: Case studies on power quality signal classification with randomly selected clean samples from our test sets (top) versus corresponding adversarial samples crafted by Algorithm 1 (bottom). The original Neural Network accurately classifies the four classes of power signals, yet it fails to classify adversarial samples with high probability.

Algorithm 1 Crafting Adversarial Examples
Input: Clean paired training data X, Y, input entries set A
Input: Training iterations N_iter, number of adversarial examples N_adv
Input: Clean testing samples {x_i, y_i}, i = 1, ..., N_adv
Initialize: Attacker surrogate ML model f_θ
Initialize: Adversarial examples set X∗ ← ∅
# Training the surrogate ML model
for iteration = 0, ..., N_iter do
    Update θ using gradient descent (2) on X, Y
end for
# Find adversarial examples using clean data {x_i, y_i}
for i = 1, ..., N_adv do
    Calculate gradients w.r.t. x_i: δ_{x_i} = ε ∇_{x_i} L(f_θ(x_i), y_i)
    Find set A: the largest γ · |X| gradients of δ_{x_i}
    Set δ_{x_i} = 0 on the entries in S\A
    x_i∗ = x_i + δ_{x_i}
    X∗.insert(x_i∗)
end for

IV. CASE STUDIES

We evaluate the proposed algorithm's performance on two tasks: power quality assessment via classifying voltage signals with a feed-forward Neural Network [4], [11], and short-term building load forecasting via RNN [9]. We set up the deep learning models using Tensorflow and Keras, two open-source Python packages. We adopt rectified linear unit (ReLU) activation functions, dropout layers and Stochastic Gradient Descent, a variant of (2), to improve the performance of our Neural Network models. Two Nvidia Geforce GTX TITAN X GPUs are used for training acceleration, and the average training times of both tasks are within 10 seconds.

A. Power Quality Classification

In this task, we investigate whether an ML model can detect power quality disturbances in waveform signals. Past research claims that Neural Network based classifiers can detect such disturbances in signals, which would in turn avoid damage and improve power quality [4], [11]. Here, we add slight perturbations to the input signals and see whether such a classifier fails to classify these disturbances.

1) Data Description: We consider four types of wave signals as illustrated in the first row of Fig. 3, with one group of normal signals and three types of disturbances: sags, impulses and distortion. We construct a labeled dataset with 200 signals from each class, with each signal of fixed length. After shuffling and separating 1/4 of the data as the testing set, we construct a 3-layer fully connected Neural Network to classify these signals into their respective classes.

2) Simulation Results: We first observe that the Neural Network classifier is powerful in classifying wave signals with different sources of disturbance. The model achieves 97.5% testing accuracy on the split test data.

Then we test whether the trained classifier is able to correctly classify the adversarial signals crafted by Algorithm 1. As shown in Fig. 3, with ε = 0.03 and γ = 10%, the black-box classifier wrongly classifies the adversarial signals. Specifically, the adversarial impulse and distortion signals look similar to the corresponding clean signals and can still be classified as impulse and distortion signals by a technician, yet the ML model incorrectly regards them as normal signals. As shown in Fig. 4, we quantitatively test the adversaries' performance by evaluating the Neural Network's classification results on adversarial examples. The model's classification accuracy drops drastically with higher levels of ε and γ, which meets our assumption. When γ = 40%, in which case our algorithm changes 40% of the entries of the input signal, injecting only a small perturbation ε = 0.1 leaves the ML model able to classify only 67.5% of the test samples.

[Fig. 4 plot: test accuracy (0 to 100) versus ε value (0 to 0.5), with one curve each for γ = 5%, 10%, 20% and 40%.]

Fig. 4: Voltage signal classification accuracy with varying noise level ε and input perturbation percentage γ for adversaries X∗.

B. Building Load Forecasting

In this example, we first train an RNN model, which can forecast building load accurately using input features such as temperature measurements, building occupancy and solar radiation. We then construct sequential adversarial inputs using a surrogate model and evaluate the vulnerabilities of the original load forecasting model.

1) Data Description: We set up our building simulation platform using EnergyPlus's 12-storey large office building listed in the commercial reference buildings from the U.S. Department of Energy (DoE CRB) [20]. The building has a total floor area of 498,584 square feet, which is divided into 16 separate zones. We simulate the building running through the year 2004 in Seattle, WA, and record x_t, y_t with a resolution of 10 minutes, where x_t includes data coming from various sensors, such as building occupancy, temperature setpoint and temperature measurements, and y_t is the building energy consumption. We shuffle and separate 2 months' data as our stand-alone testing dataset for both predictive accuracy validation and vulnerability testing. The RNN model is composed of 1 recurrent layer and 2 subsequent fully-connected layers with a memory length of 2 hours. Our ML model is also easy to extend to a Long Short-Term Memory network (LSTM) or any other variant of the RNN structure. Since all these architectures are differentiable w.r.t. x_t, they would exhibit similar vulnerabilities to the proposed adversaries.

Noise level ε | Temp Deviation (MAPE) | Occupancy Deviation (MAPE) | Prediction Error (MAPE) |
---|---|---|---|
0.0 | 0% | 0% | 5.29% |
0.01 | 0.35% | 2.44% | 25.90% |
0.03 | 1.07% | 6.94% | 31.55% |
0.05 | 1.86% | 12.36% | 55.37% |

TABLE I: The building load forecasting performance using adversarial examples with varying noise level ε under γ = 10%. Note that ε = 0 is the case with clean testing data.

[Fig. 5 plots over one week (Mon to Fri): (a) Temperature Setpoints [C], original versus adversarial signal; (b) Human Occupancy, original versus adversarial signal; (c) Energy Consumption [J] (×10^8), ground truth versus original prediction versus adversarial prediction.]

Fig. 5: Building forecast results under ε = 0.03 and γ = 10%. (a) and (b) the data profiles for one week's sub-region temperature setpoints and occupancy level before and after the attack; (c) the ground truth of one week's energy consumption, the predicted energy consumption using clean testing data, and the predicted result after injecting adversarial data profiles.

2) Simulation Results: We use the Mean Absolute Percentage Error (MAPE) to evaluate both the forecasting error and the input feature deviation caused by adding adversarial perturbations:

$$\mathrm{MAPE}(var^{*}, var) = \frac{1}{N} \sum_{i=1}^{N} \frac{|var^{*} - var|}{var} \times 100\% \tag{7}$$

where var represents either an input feature or the output energy consumption, while var∗ represents the corresponding adversarial feature or output energy consumption prediction. We test the ML model's performance using the same one week of testing data with different levels of ε on the adversarial data.

As can be seen in Table I, the model performs well on clean data, with only 5.29% MAPE. However, by only injecting δ_X with ε = 0.01, the model's forecast has a 25.90% deviation from the ground truth. The results worsen with more intense levels of injected noise. Meanwhile, the input features deviate little from the clean data. We can also visually inspect the vulnerabilities of the RNN model in Fig. 5, where we only change 10% of the input features with noise level ε = 0.03. The output prediction jumps significantly compared to the previous forecasts, which makes it uninformative for building operators.

V. CONCLUSION AND DISCUSSION

In this work, we look into the security and vulnerability of Machine Learning algorithms in power systems to adversaries. We propose an attack algorithm that universally exploits the vulnerabilities of ML in power systems, especially Neural Network based algorithms. The adversarial strategy is practical, as it does not change the system operator's ML engine but manipulates only the input data. Case studies on two representative power system examples reveal the vulnerability of proposed ML algorithms. As researchers have not looked into such vulnerabilities in current algorithm design, we hope our work will stimulate future discussion on increasing the robustness of current ML algorithms in power systems to data manipulation. Going forward, the following directions regarding secure ML applications are worth investigating, and we are also interested in investigating security issues of broader estimation/learning algorithms in power system operation and control.

A. Adversaries in Learning

In this work, we only discuss the scenario in which ML models in power systems output inaccurate results given adversarial inputs. Stronger attacks such as targeted attacks can be considered, where instead of solely maximizing the predicted loss, the attacker also adds perturbations to steer trained models toward a new adversarial objective. Such attacks are discussed in our previous work [14], which can be extended to the case of power systems. Moreover, the model itself is also vulnerable. For instance, an attacker could hack into the operation room to change the weights of the trained model. Even though there is a line of work addressing the security issues in the control, communication and infrastructure of power systems, there is scope for work to address the security of learning in power and cyber-physical systems.

B. Defense for ML Algorithms in Power Systems

Up to now, there has been some work on defending against ML attacks in computer vision research. Yet most of these defenses operate on the ensemble or filtering of input images [21], which may not be applicable to power systems, as most power applications have clear physical definitions of the features involved, which may not be modifiable. Thus the defense against such intrusion attacks on ML algorithms in power systems is still an urgent yet open problem.

REFERENCES

[1] M. H. Albadi and E. F. El-Saadany, "A summary of demand response in electricity markets," Electric Power Systems Research, vol. 78, no. 11, pp. 1989–1996, 2008.

[2] M. R. Patel, Wind and solar power systems: design, analysis, and operation. CRC Press, 2005.

[3] Y. Chen, Y. Wang, D. Kirschen, and B. Zhang, "Model-free renewable scenario generation using generative adversarial networks," IEEE Transactions on Power Systems, vol. 33, no. 3, pp. 3265–3275, 2018.

[4] M. Valtierra-Rodriguez, R. de Jesus Romero-Troncoso, R. A. Osornio-Rios, and A. Garcia-Perez, "Detection and classification of single and combined power quality disturbances using neural networks," IEEE Transactions on Industrial Electronics, vol. 61, no. 5, pp. 2473–2482, 2014.

[5] P. Li, H. Wang, and B. Zhang, "A distributed online pricing strategy for demand response programs," arXiv preprint arXiv:1702.05551, 2017.

[6] Y. Wang, Q. Chen, D. Gan, J. Yang, D. S. Kirschen, and C. Kang, "Deep learning-based socio-demographic information identification from smart meter data," IEEE Transactions on Smart Grid, 2018.

[7] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, p. 436, 2015.

[8] T. Hong, P. Pinson, S. Fan, H. Zareipour, A. Troccoli, and R. J. Hyndman, "Probabilistic energy forecasting: Global energy forecasting competition 2014 and beyond," 2016.

[9] Y. Chen, Y. Shi, and B. Zhang, "Modeling and optimization of complex building energy systems with deep neural networks," in Signals, Systems, and Computers, 2017 51st Asilomar Conference on. IEEE, 2017, pp. 1368–1373.

[10] Y. Chen, X. Wang, and B. Zhang, "An unsupervised deep learning approach for scenario forecasts," Power Systems Computation Conference (PSCC), 2018.

[11] R. Eskandarpour and A. Khodaei, "Machine learning based power grid outage prediction in response to extreme events," IEEE Transactions on Power Systems, vol. 32, no. 4, pp. 3315–3316, 2017.

[12] C. Lassetter, E. Cotilla-Sanchez, and J. Kim, "Learning schemes for power system planning and control," in System Sciences (HICSS), 51st Hawaii International Conference on, 2018.

[13] N. Papernot, P. McDaniel, A. Swami, and R. Harang, "Crafting adversarial input sequences for recurrent neural networks," in Military Communications Conference, MILCOM 2016-2016 IEEE. IEEE, 2016, pp. 49–54.

[14] H. Hosseini, Y. Chen, S. Kannan, B. Zhang, and R. Poovendran, "Blocking transferability of adversarial examples in black-box learning systems," arXiv preprint arXiv:1703.04318, 2017.

[15] H. Hahn, S. Meyer-Nieberg, and S. Pickl, "Electric load forecasting methods: Tools for decision making," European Journal of Operational Research, vol. 199, no. 3, pp. 902–907, 2009.

[16] I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," arXiv preprint arXiv:1412.6572, 2014.

[17] T. Mikolov, M. Karafiát, L. Burget, J. Černocký, and S. Khudanpur, "Recurrent neural network based language model," in Eleventh Annual Conference of the International Speech Communication Association, 2010.

[18] Y. Liu, P. Ning, and M. K. Reiter, "False data injection attacks against state estimation in electric power grids," ACM Transactions on Information and System Security (TISSEC), vol. 14, no. 1, p. 13, 2011.

[19] Y. Huang, M. Esmalifalak, H. Nguyen, R. Zheng, Z. Han, H. Li,
and L. Song, “Bad data injection in smart grid: attack and defense
|
||||
mechanisms,” IEEE Communications Magazine, vol. 51, no. 1, pp. 27–
|
||||
33, 2013.
|
||||
|
||||
[20] P. Torcellini, M. Deru, B. Griffith, K. Benne, M. Halverson,
|
||||
D. Winiarski, and D. Crawley, “Doe commercial building benchmark
|
||||
models,” in Proceeding of, 2008, pp. 17–22.
|
||||
|
||||
[21] F. Trame`r, A. Kurakin, N. Papernot, I. Goodfellow, D. Boneh, and
|
||||
P. McDaniel, “Ensemble adversarial training: Attacks and defenses,”
|
||||
arXiv preprint arXiv:1705.07204, 2017.
|
||||
|
|
@ -0,0 +1,79 @@
|
|||
from data_load import data_format
import tensorflow as tf
import numpy as np
from keras.layers import Dropout
from keras import regularizers
from keras.callbacks import TensorBoard, LearningRateScheduler
import keras


def model_train(X_train, X_test, Y_train, Y_test):
    """Train a two-layer LSTM forecaster and print its test loss and MAPE.

    Args:
        X_train (np.array): training inputs, shape (samples, timesteps, 1)
        X_test (np.array): test inputs, same layout as X_train
        Y_train (np.array): training targets
        Y_test (np.array): test targets
    """
    # Shuffle the data; re-seeding before each call is meant to apply
    # the same permutation to X_train and Y_train
    np.random.seed(7)
    np.random.shuffle(X_train)
    np.random.seed(7)
    np.random.shuffle(Y_train)
    tf.random.set_seed(7)

    # Build the model
    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(100, return_sequences=True),  # first layer
        Dropout(0.2),
        tf.keras.layers.LSTM(80),  # second layer
        Dropout(0.2),
        tf.keras.layers.Dense(
            1, kernel_regularizer=regularizers.l2(0.01))
    ])

    # Loss function
    loss_fn = tf.keras.losses.MeanSquaredError()

    # Compile the model
    model.compile(
        optimizer='SGD',
        loss=loss_fn,
        metrics=[tf.keras.metrics.MeanAbsolutePercentageError()]
    )

    # Exponentially decaying learning rate
    def lr_schedule(epoch):
        initial_learning_rate = 0.01
        decay_rate = 0.1
        decay_steps = 2000
        new_learning_rate = initial_learning_rate * \
            decay_rate ** (epoch / decay_steps)
        return new_learning_rate

    # Learning-rate scheduler callback
    lr_scheduler = LearningRateScheduler(lr_schedule)

    # TensorBoard callback
    log_dir = "logs/fit"
    tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

    # Train the model with the TensorBoard and scheduler callbacks
    model.fit(X_train, Y_train, epochs=6000,
              callbacks=[tensorboard_callback, lr_scheduler], batch_size=256)

    loss, mape = model.evaluate(X_test, Y_test)
    print("Test loss:", loss)
    print("Test MAPE:", mape)

    # Save the model
    keras.models.save_model(model, 'model')


if __name__ == "__main__":
    X_train, X_test, Y_train, Y_test = data_format(
        'data/archive/PowerQualityDistributionDataset1.csv', md=1)
    # The LSTM expects inputs of shape (samples, timesteps, features)
    X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
    X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
    model_train(X_train, X_test, Y_train, Y_test)
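The learning-rate schedule used in the training script above can be checked in isolation. The following is a minimal sketch in plain Python; the function name `exponential_decay` and its keyword arguments are illustrative and not part of the repository. It uses the same formula and constants as `lr_schedule`.

```python
def exponential_decay(epoch, initial_lr=0.01, decay_rate=0.1, decay_steps=2000):
    """Same formula as lr_schedule: the rate shrinks by `decay_rate`
    every `decay_steps` epochs (hypothetical standalone helper)."""
    return initial_lr * decay_rate ** (epoch / decay_steps)


if __name__ == "__main__":
    # Print the learning rate at a few epochs of the 6000-epoch run.
    for epoch in (0, 1000, 2000, 4000, 6000):
        print(epoch, exponential_decay(epoch))
```

With these constants the rate falls by a factor of ten every 2000 epochs, so the last epochs of a 6000-epoch run use a rate about a thousand times smaller than the first.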
|
Binary file not shown.
Binary file not shown.
|
@ -0,0 +1,72 @@
|
|||
import tensorflow as tf


def craft_adv(X, Y, gamma, learning_rate, model, loss_fn, md=0):
    """Craft adversarial inputs sample by sample and evaluate the model on them.

    md: mode, 0 for classification, 1 for forecasting.
    """
    # Convert the test data to TensorFlow tensors
    X_test_tensor = tf.convert_to_tensor(X, dtype=tf.float64)
    if md == 0:
        Y_test_tensor = tf.convert_to_tensor(Y, dtype=tf.int32)
    elif md == 1:
        Y_test_tensor = tf.convert_to_tensor(Y, dtype=tf.float64)
    else:
        raise ValueError("md must be 0 (classification) or 1 (forecasting)")

    # Perturbed samples are collected here
    X_updated = []

    for i in range(X_test_tensor.shape[0]):
        # Use a separate GradientTape for each sample
        with tf.GradientTape() as tape:
            # Watch the current sample
            current_sample = X_test_tensor[i:i+1]
            tape.watch(current_sample)

            # Predict on the current sample and compute the loss
            predictions = model(current_sample)
            loss = loss_fn(Y_test_tensor[i:i+1], predictions)

        # Gradient of the loss with respect to the input
        gradients = tape.gradient(loss, current_sample)

        # Flatten the gradient for processing
        flattened_gradients = tf.reshape(gradients, [-1])

        # Select the gamma * |X| largest gradients
        # (largest signed values, not largest magnitudes)
        num_gradients_to_select = int(gamma * int(tf.size(flattened_gradients)))
        top_gradients_indices = tf.argsort(
            flattened_gradients, direction='DESCENDING')[:num_gradients_to_select]

        # Boolean mask: False at the selected positions, True elsewhere
        mask = tf.ones_like(flattened_gradients, dtype=bool)
        mask = tf.tensor_scatter_nd_update(
            mask, tf.expand_dims(top_gradients_indices, 1),
            tf.zeros_like(top_gradients_indices, dtype=bool))

        # Zero out every gradient that was not selected
        updated_gradients = tf.where(
            mask, tf.zeros_like(flattened_gradients), flattened_gradients)

        # Restore the original shape
        updated_gradients = tf.reshape(updated_gradients, tf.shape(gradients))

        # Scale the gradient by the learning rate
        scaled_gradients = learning_rate * updated_gradients

        # Perturb the current sample
        current_sample_updated = tf.add(current_sample, scaled_gradients)

        # Store the perturbed sample
        X_updated.append(current_sample_updated.numpy())

    # Concatenate the list into a single tensor
    X_updated = tf.concat(X_updated, axis=0)

    # Evaluate the model on the perturbed inputs
    if md == 1:
        loss, mape = model.evaluate(X_updated, Y)
        print(f"gamma: {gamma}, learning rate: {learning_rate}, loss: {loss}, MAPE: {mape}")
        return X_updated, loss, mape
    else:
        loss, accuracy = model.evaluate(X_updated, Y)
        print(f"gamma: {gamma}, learning rate: {learning_rate}, accuracy: {accuracy}")
        return X_updated, accuracy
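The selection step in `craft_adv` ranks gradients by signed value, so only the largest positive entries are ever perturbed. A common variant ranks by magnitude instead. The sketch below illustrates that variant in plain Python on a single flattened gradient; `sparse_perturbation` and its arguments are hypothetical names, and this is not the code the repository runs.

```python
def sparse_perturbation(gradient, gamma, step):
    """Keep the gamma * len(gradient) entries with the largest magnitude,
    scale them by `step`, and zero the rest (illustrative helper)."""
    k = int(gamma * len(gradient))
    # Indices sorted by absolute gradient value, largest first.
    ranked = sorted(range(len(gradient)),
                    key=lambda i: abs(gradient[i]), reverse=True)
    delta = [0.0] * len(gradient)
    for i in ranked[:k]:
        delta[i] = step * gradient[i]
    return delta


if __name__ == "__main__":
    grad = [0.1, -0.9, 0.3, 0.05]
    # With gamma = 0.5, two of the four features are perturbed.
    print(sparse_perturbation(grad, gamma=0.5, step=1.0))
```

Ranking by magnitude lets strongly negative gradients contribute to the perturbation as well, which the signed ranking would skip.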
|
|
@ -0,0 +1,125 @@
|
|||
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split


def data_format(data_path, is_column=False, rate=0.25, md=0):
    """Load, normalize and split the dataset.

    Args:
        data_path (str): path to the CSV file
        is_column (bool, optional): drop columns (instead of rows) that contain NaN. Defaults to False.
        rate (float, optional): fraction of the data used for testing. Defaults to 0.25.
        md (int, optional): mode, 0 for classification, 1 for forecasting. Defaults to 0.

    Returns:
        X_train, X_test, Y_train, Y_test (np.array)
    """
    if md == 0:
        # Load the data
        X, Y = data_load_classify(data_path, is_column)

        # Normalize the data
        sc = MinMaxScaler(feature_range=(-1, 1))
        X = sc.fit_transform(X)
    elif md == 1:
        # Load the data
        X = data_load_forecast(data_path, is_column)

        # Normalize the data
        sc = MinMaxScaler(feature_range=(-1, 1))
        X = sc.fit_transform(X)

        # Split off Y
        # The 128th element is the target
        Y = X[:, -1]
        # The first 127 elements are the inputs
        X = X[:, :-1]

    # Split the dataset: 75% for training, 25% for testing
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=rate, random_state=7)

    return X_train, X_test, Y_train, Y_test


def data_load_classify(data_path, is_column=False):
    """
    Load the data for classification.
    data_path: path to the CSV file
    is_column: drop columns (instead of rows) that contain NaN
    return: X, Y
    """
    # Read the CSV file
    df = pd.read_csv(data_path)

    # Clean the data (dropna returns a new frame, so keep the result)
    df = data_clean(df, is_column)

    # Drop the first column
    df = df.drop(df.columns[0], axis=1)

    # Initialize X and Y
    X, Y = [], []

    # Iterate over every row of the DataFrame
    for index, row in df.iterrows():
        # The first 128 values are the inputs, the 129th is the label
        X.append(row.iloc[0:128])
        Y.append(int(row.iloc[128]))

    return np.array(X), np.array(Y)


def data_load_forecast(data_path, is_column=False):
    """
    Load the data for forecasting.
    data_path: path to the CSV file
    is_column: drop columns (instead of rows) that contain NaN
    return: X
    """
    # Read the CSV file
    df = pd.read_csv(data_path)

    # Clean the data (dropna returns a new frame, so keep the result)
    df = data_clean(df, is_column)
    df = df[df['output'] == 1]

    # Drop the first column
    df = df.drop(df.columns[0], axis=1)

    # Initialize X
    X = []

    # Iterate over every row of the DataFrame
    for index, row in df.iterrows():
        # Take the first 128 values (127 inputs plus the target,
        # which data_format separates later)
        X.append(row.iloc[0:128])

    return np.array(X)


def data_clean(data, is_column=False):
    """Drop entries that contain NaN.

    Args:
        data (pd.DataFrame): data read from the CSV file
        is_column (bool, optional): if True, drop columns that contain NaN;
            if False (default), drop rows that contain NaN.

    Returns:
        pd.DataFrame: the cleaned data
    """
    if not is_column:
        data = data.dropna(axis=0)
        return data
    else:
        data = data.dropna(axis=1)
        return data


if __name__ == '__main__':
    # Load the data
    X_train, X_test, Y_train, Y_test = data_format(
        'data/archive/PowerQualityDistributionDataset1.csv', md=1)
    print(Y_train)
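`data_format` fits the min-max scaler on the full dataset before splitting, so test-set statistics influence the scaling of the training data. A common alternative is to fit the scaling on the training rows only and reuse it for the test rows. The sketch below shows that idea in plain Python; `fit_min_max` and `apply_min_max` are hypothetical helpers, not functions from the repository or from scikit-learn.

```python
def fit_min_max(rows):
    """Return per-column minima and maxima of a list of equal-length rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs, feature_range=(-1.0, 1.0)):
    """Scale rows to `feature_range` using previously fitted statistics."""
    a, b = feature_range
    scaled = []
    for row in rows:
        scaled.append([
            # Constant columns map to the lower bound to avoid dividing by zero.
            a + (b - a) * (v - lo) / (hi - lo) if hi != lo else a
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


if __name__ == "__main__":
    train = [[0.0, 10.0], [4.0, 20.0]]
    test = [[2.0, 25.0]]
    lows, highs = fit_min_max(train)          # statistics from training rows only
    print(apply_min_max(train, lows, highs))  # training rows span [-1, 1]
    print(apply_min_max(test, lows, highs))   # test rows may fall outside [-1, 1]
```

Test values outside the training range land outside [-1, 1]; that is expected and reflects what the model would see at deployment time.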
|
|
@ -0,0 +1,71 @@
|
|||
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import keras
from data_load import data_format
from attack_craft import craft_adv

md = 0
print("Enter 0 or 1:\n0 attacks the fully connected model\n1 attacks the LSTM (RNN) model")
md = int(input())

# Load the dataset
X_train, X_test, Y_train, Y_test = data_format(
    'data/archive/PowerQualityDistributionDataset1.csv', md=md)

# Set the random seeds for reproducibility
np.random.seed(7)
np.random.shuffle(X_test)
np.random.seed(7)
np.random.shuffle(Y_test)
tf.random.set_seed(7)

if md == 1:
    # Load the trained model
    model = keras.models.load_model('model_rnn')

    # Define the loss function
    loss_fn = tf.keras.losses.MeanSquaredError()
elif md == 0:
    # Load the trained model
    model = keras.models.load_model('model_normal')

    # Define the loss function
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# Accuracy for each gamma value
accuracy_per_gamma = {}

# Iterate over the gamma values
for gamma in [0.05, 0.1, 0.2, 0.4]:
    # Accuracy for each learning rate at this gamma
    accuracy_list = []
    # Iterate over the learning rates
    for learning_rate in [0.1, 0.2, 0.3, 0.4, 0.5]:
        if md == 1:
            x_adv, loss, mape = craft_adv(
                X_test, Y_test, gamma, learning_rate, model, loss_fn, md=1)
            # For the forecasting model, 100 - MAPE is reported as accuracy
            accuracy_list.append(100 - mape)
        elif md == 0:
            x_adv, accuracy = craft_adv(
                X_test, Y_test, gamma, learning_rate, model, loss_fn)
            accuracy_list.append(accuracy)
    # Store the accuracies for this gamma
    accuracy_per_gamma[gamma] = accuracy_list

# Learning rates and gamma values used for plotting
learning_rates = [0.1, 0.2, 0.3, 0.4, 0.5]
gammas = [0.05, 0.1, 0.2, 0.4]

# Plot the results
plt.figure(figsize=(10, 6))
for gamma in gammas:
    plt.plot(learning_rates,
             accuracy_per_gamma[gamma], marker='o', label=f'Gamma={gamma}')

plt.title('Accuracy vs Learning Rate for Different Gammas')
plt.xlabel('Learning Rate')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
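The scripts above keep inputs and targets aligned by re-seeding the generator before each shuffle, which relies on both calls consuming random numbers in the same way. Drawing one permutation and indexing both sequences with it makes the pairing explicit. This is a minimal sketch using only the standard library; `shuffle_together` is a hypothetical helper and not part of the repository.

```python
import random


def shuffle_together(xs, ys, seed=7):
    """Shuffle two equal-length sequences with one shared permutation."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    order = list(range(len(xs)))
    random.Random(seed).shuffle(order)  # a single permutation for both
    return [xs[i] for i in order], [ys[i] for i in order]


if __name__ == "__main__":
    inputs = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    targets = [0, 1, 2]
    shuffled_inputs, shuffled_targets = shuffle_together(inputs, targets)
    print(shuffled_inputs, shuffled_targets)
```

With NumPy arrays the same idea is usually written as an index array from a permutation followed by `X[idx]` and `Y[idx]`.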
|
|
@ -0,0 +1,75 @@
|
|||
from data_load import data_format
import tensorflow as tf
import numpy as np
from keras.layers import Dropout
from keras import regularizers
from keras.callbacks import TensorBoard, LearningRateScheduler
import keras


def model_train(X_train, X_test, Y_train, Y_test):
    """Train a fully connected classifier and print its test accuracy.

    Args:
        X_train (np.array): training inputs
        X_test (np.array): test inputs
        Y_train (np.array): integer training labels
        Y_test (np.array): integer test labels
    """
    # Shuffle the data; re-seeding before each call is meant to apply
    # the same permutation to X_train and Y_train
    np.random.seed(7)
    np.random.shuffle(X_train)
    np.random.seed(7)
    np.random.shuffle(Y_train)
    tf.random.set_seed(7)

    # Build the model
    model = tf.keras.models.Sequential([
        tf.keras.layers.Dense(10000, activation='relu'),  # first layer
        Dropout(0.2),
        tf.keras.layers.Dense(800, activation='relu'),  # second layer
        Dropout(0.2),
        # Note: the ReLU here clips the outputs to be non-negative before
        # they reach the from_logits loss below
        tf.keras.layers.Dense(
            (len(np.unique(Y_train)) + 1), activation='relu', kernel_regularizer=regularizers.l2(0.01))
    ])

    # Loss function
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

    # Compile the model
    model.compile(
        optimizer='SGD',
        loss=loss_fn,
        metrics=['accuracy'])

    # Exponentially decaying learning rate
    def lr_schedule(epoch):
        initial_learning_rate = 0.01
        decay_rate = 0.1
        decay_steps = 1500
        new_learning_rate = initial_learning_rate * \
            decay_rate ** (epoch / decay_steps)
        return new_learning_rate

    # Learning-rate scheduler callback
    lr_scheduler = LearningRateScheduler(lr_schedule)

    # TensorBoard callback
    log_dir = "logs/fit"
    tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

    # Train the model with the TensorBoard and scheduler callbacks
    model.fit(X_train, Y_train, epochs=1000,
              callbacks=[tensorboard_callback, lr_scheduler], batch_size=256)

    loss, accuracy = model.evaluate(X_test, Y_test)
    print("Test accuracy:", accuracy)

    # Save the model
    keras.models.save_model(model, 'model')


if __name__ == "__main__":
    X_train, X_test, Y_train, Y_test = data_format(
        'data/archive/PowerQualityDistributionDataset1.csv')
    model_train(X_train, X_test, Y_train, Y_test)
|
File diff suppressed because it is too large
Load Diff
|
@ -0,0 +1,86 @@
|
|||
from data_load import data_format
import tensorflow as tf
import numpy as np
from keras.layers import Dropout
from keras import regularizers
from keras.callbacks import TensorBoard, LearningRateScheduler
import keras


def model_train(X_train, X_test, Y_train, Y_test):
    """Train a two-layer LSTM forecaster with early stopping and print its test loss.

    Args:
        X_train (np.array): training inputs, shape (samples, timesteps, 1)
        X_test (np.array): test inputs, same layout as X_train
        Y_train (np.array): training targets
        Y_test (np.array): test targets
    """
    # Shuffle the data; re-seeding before each call is meant to apply
    # the same permutation to X_train and Y_train
    np.random.seed(7)
    np.random.shuffle(X_train)
    np.random.seed(7)
    np.random.shuffle(Y_train)
    tf.random.set_seed(7)

    # Build the model
    model = tf.keras.models.Sequential([
        tf.keras.layers.LSTM(100, return_sequences=True),  # first layer
        Dropout(0.2),
        tf.keras.layers.LSTM(80),  # second layer
        Dropout(0.2),
        tf.keras.layers.Dense(
            1, kernel_regularizer=regularizers.l2(0.01))
    ])

    # Loss function
    loss_fn = tf.keras.losses.MeanSquaredError()

    # Compile the model
    model.compile(
        optimizer='SGD',
        loss=loss_fn)

    # Exponentially decaying learning rate
    def lr_schedule(epoch):
        initial_learning_rate = 0.01
        decay_rate = 0.1
        decay_steps = 1500
        new_learning_rate = initial_learning_rate * \
            decay_rate ** (epoch / decay_steps)
        return new_learning_rate

    # Learning-rate scheduler callback
    lr_scheduler = LearningRateScheduler(lr_schedule)

    # TensorBoard callback
    log_dir = "logs/fit"
    tensorboard_callback = TensorBoard(log_dir=log_dir, histogram_freq=1)

    # EarlyStopping callback
    early_stopping_callback = tf.keras.callbacks.EarlyStopping(
        monitor='val_loss',  # monitor the validation loss
        patience=10,         # number of epochs to wait without improvement
        min_delta=0.001,     # smaller changes do not count as an improvement
        mode='min',          # a decrease of the monitored loss is an improvement
        verbose=1            # print a message when stopping
    )

    # Train the model with the TensorBoard, scheduler and early-stopping callbacks
    model.fit(X_train, Y_train, epochs=1000,
              callbacks=[tensorboard_callback, lr_scheduler, early_stopping_callback], batch_size=256, validation_split=0.2)

    loss = model.evaluate(X_test, Y_test)
    print("Test loss:", loss)

    # Save the model
    keras.models.save_model(model, 'model')


if __name__ == "__main__":
    X_train, X_test, Y_train, Y_test = data_format(
        'data/archive/PowerQualityDistributionDataset1.csv')
    # The LSTM expects inputs of shape (samples, timesteps, features)
    X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
    X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
    model_train(X_train, X_test, Y_train, Y_test)
|
Binary file not shown.
Binary file not shown.
|
@ -0,0 +1,49 @@
|
|||
import tensorflow as tf


def craft_adv(X, Y, gamma, learning_rate, model, loss_fn):
    """Craft adversarial inputs for the whole batch at once and evaluate the model on them."""
    # Convert the test data to TensorFlow tensors
    X_test_tensor = tf.convert_to_tensor(X, dtype=tf.float64)
    Y_test_tensor = tf.convert_to_tensor(Y, dtype=tf.float64)

    # Compute the loss under a GradientTape
    with tf.GradientTape() as tape:
        tape.watch(X_test_tensor)
        predictions = model(X_test_tensor)
        loss = loss_fn(Y_test_tensor, predictions)

    # Gradient of the loss with respect to the input
    gradients = tape.gradient(loss, X_test_tensor)

    # Flatten the gradient for processing
    flattened_gradients = tf.reshape(gradients, [-1])

    # Select the gamma * |X| largest gradients
    # (largest signed values, not largest magnitudes)
    num_gradients_to_select = int(gamma * int(tf.size(flattened_gradients)))
    top_gradients_indices = tf.argsort(
        flattened_gradients, direction='DESCENDING')[:num_gradients_to_select]

    # Boolean mask: False at the selected positions, True elsewhere
    mask = tf.ones_like(flattened_gradients, dtype=bool)
    mask = tf.tensor_scatter_nd_update(
        mask, tf.expand_dims(top_gradients_indices, 1),
        tf.zeros_like(top_gradients_indices, dtype=bool))

    # Zero out every gradient that was not selected
    updated_gradients = tf.where(
        mask, tf.zeros_like(flattened_gradients), flattened_gradients)

    # Restore the original shape
    updated_gradients = tf.reshape(updated_gradients, tf.shape(gradients))

    # Scale the gradient by the learning rate
    # (700 is a hard-coded scale factor in the original code)
    scaled_gradients = (learning_rate * 700) * updated_gradients

    # Perturb the inputs
    X_updated = tf.add(X_test_tensor, scaled_gradients)
    X_updated = X_updated.numpy()

    # Evaluate the model on the perturbed inputs
    loss = model.evaluate(X_updated, Y)
    print(f"gamma: {gamma}, learning rate: {learning_rate}, loss: {loss}")

    return X_updated, loss
|
|
@ -0,0 +1,89 @@
|
|||
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split


def data_format(data_path, is_column=False, rate=0.25):
    """Load, normalize and split the forecasting dataset.

    Args:
        data_path (str): path to the CSV file
        is_column (bool, optional): drop columns (instead of rows) that contain NaN. Defaults to False.
        rate (float, optional): fraction of the data used for testing. Defaults to 0.25.

    Returns:
        X_train, X_test, Y_train, Y_test (np.array)
    """
    # Load the data
    X = data_load_forecast(data_path, is_column)

    # Normalize the data
    sc = MinMaxScaler(feature_range=(-1, 1))
    X = sc.fit_transform(X)

    # Split off Y
    # The 128th element is the target
    Y = X[:, -1]
    # The first 127 elements are the inputs
    X = X[:, :-1]

    # Split the dataset: 75% for training, 25% for testing
    X_train, X_test, Y_train, Y_test = train_test_split(
        X, Y, test_size=rate, random_state=7)

    return X_train, X_test, Y_train, Y_test


def data_load_forecast(data_path, is_column=False):
    """
    Load the data for forecasting.
    data_path: path to the CSV file
    is_column: drop columns (instead of rows) that contain NaN
    return: X
    """
    # Read the CSV file
    df = pd.read_csv(data_path)

    # Clean the data (dropna returns a new frame, so keep the result)
    df = data_clean(df, is_column)
    df = df[df['output'] == 1]

    # Drop the first column
    df = df.drop(df.columns[0], axis=1)

    # Initialize X
    X = []

    # Iterate over every row of the DataFrame
    for index, row in df.iterrows():
        # Take the first 128 values (127 inputs plus the target,
        # which data_format separates later)
        X.append(row.iloc[0:128])

    return np.array(X)


def data_clean(data, is_column=False):
    """Drop entries that contain NaN.

    Args:
        data (pd.DataFrame): data read from the CSV file
        is_column (bool, optional): if True, drop columns that contain NaN;
            if False (default), drop rows that contain NaN.

    Returns:
        pd.DataFrame: the cleaned data
    """
    if not is_column:
        data = data.dropna(axis=0)
        return data
    else:
        data = data.dropna(axis=1)
        return data


if __name__ == '__main__':
    # Load the data
    X_train, X_test, Y_train, Y_test = data_format(
        'data/archive/PowerQualityDistributionDataset1.csv')
    print(X_train.shape)
|
|
@ -0,0 +1,36 @@
|
|||
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import keras
from data_load import data_format
from attack_craft import craft_adv

# Load the dataset
X_train, X_test, Y_train, Y_test = data_format(
    'data/archive/PowerQualityDistributionDataset1.csv')

# Set the random seeds for reproducibility
np.random.seed(7)
np.random.shuffle(X_test)
np.random.seed(7)
np.random.shuffle(Y_test)
tf.random.set_seed(7)

# Load the trained model
model = keras.models.load_model('model')

# Load the adversarially trained model
model_adv = keras.models.load_model('model_adv')

# Define the loss function
loss_fn = tf.keras.losses.MeanSquaredError()

# Craft adversarial inputs against the original model
x_adv, loss = craft_adv(
    X_test, Y_test, 0.4, 0.5, model, loss_fn)

# Evaluate the adversarially trained model on the same adversarial inputs
loss_adv = model_adv.evaluate(x_adv, Y_test)

print(f"Original model: {loss}, adversarially trained model: {loss_adv}")
|
||||
|
||||
|
Binary file not shown.
|
@ -0,0 +1,16 @@
|
|||
# Build the NN models: DNN module
import tensorflow

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten


def dnn_model(input_dim):
    # Fully connected classifier with four output classes
    model = Sequential()
    model.add(Dense(128, input_dim=input_dim))
    model.add(Dropout(0.2))
    model.add(Dense(32))
    model.add(Activation('relu'))
    model.add(Dense(16))
    model.add(Activation('relu'))
    # `kernel_initializer` is the Keras 2 name of the older `init` argument
    model.add(Dense(4, kernel_initializer='normal', activation='softmax'))
    return model
|
|
@ -0,0 +1 @@
|
|||
This is a code example for attacking a standard neural network with adversarial inputs.
|
|
@ -0,0 +1,133 @@
|
|||
import tensorflow as tf
import keras
from keras.optimizers import SGD
from Neural_Net_Module import dnn_model
import csv
from numpy import shape
import numpy as np
import matplotlib.pyplot as plt

batch_size = 32
nb_epoch = 15
eps = 0.5    # perturbation step size
gamma = 80   # percentile threshold: only gradients above it are perturbed


def scaled_gradient(x, y, predictions):
    # loss: mean cross-entropy over the batch
    # Note: dnn_model already ends in a softmax, so `predictions` are
    # probabilities rather than raw logits here
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=predictions, labels=y))
    grad, = tf.gradients(loss, x)
    signed_grad = tf.sign(grad)
    return grad, signed_grad


def load_rows(filename):
    # Read one CSV file of waveforms into a float array
    with open(filename, 'r') as csvfile:
        rows = [row for row in csv.reader(csvfile)]
    return np.array(rows, dtype=float)


if __name__ == '__main__':
    if keras.backend.image_dim_ordering() != 'tf':
        keras.backend.set_image_dim_ordering('tf')

    sess = tf.Session()
    keras.backend.set_session(sess)

    # One file per class: 0 normal, 1 sag, 2 distortion, 3 impulse
    data_parts, label_parts = [], []
    for class_id, filename in enumerate(
            ['normal.csv', 'sag.csv', 'distortion.csv', 'impulse.csv']):
        rows = load_rows(filename)
        data_parts.append(rows)
        label_parts.append(np.full((len(rows), 1), class_id, dtype=float))
    data = np.concatenate(data_parts)
    label = np.concatenate(label_parts)
    label = label.reshape(-1, 1)
    label = keras.utils.to_categorical(label, num_classes=None)
    print("Input label shape", shape(label))
    print("Input data shape", shape(data))

    # Shuffle data and labels with one shared index permutation
    index = np.arange(len(label))
    np.random.shuffle(index)
    label = label[index]
    data = data[index]

    # First 600 samples for training, the rest for testing
    trX = data[:600]
    trY = label[:600]
    teX = data[600:]
    teY = label[600:]

    x = tf.placeholder(tf.float32, shape=(None, 1000))
    y = tf.placeholder(tf.float32, shape=(None, 4))

    model = dnn_model(input_dim=1000)
    predictions = model(x)
    # Unused: the model is compiled with Adadelta below
    sgd = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss=keras.losses.categorical_crossentropy,
                  optimizer=keras.optimizers.Adadelta(),
                  metrics=['accuracy'])

    model.fit(trX, trY, batch_size=batch_size, epochs=nb_epoch, shuffle=True)  # validation_split=0.1
    # model.save_weights('dnn_clean.h5')
    score = model.evaluate(teX, teY, verbose=0)
    print('Test loss:', score[0])
    print('Test accuracy:', score[1])

    with sess.as_default():
        adv_sample = []
        # Build the gradient ops once, outside the loop
        grad, sign_grad = scaled_gradient(x, y, predictions)
        for counter in range(200):
            if counter % 50 == 0 and counter > 0:
                print("Attack on samples" + str(counter))
            X_new_group = np.copy(teX[counter])
            gradient_value, signed_grad = sess.run([grad, sign_grad], feed_dict={x: X_new_group.reshape(-1, 1000),
                                                                                 y: teY[counter].reshape(-1, 4),
                                                                                 keras.backend.learning_phase(): 0})
            # Keep only the features whose |gradient| exceeds the gamma-th percentile
            saliency_mat = np.abs(gradient_value)
            saliency_mat = (saliency_mat > np.percentile(np.abs(gradient_value), [gamma])).astype(int)
            # Step of size eps in the sign direction on the selected features
            X_new_group = X_new_group + np.multiply(eps * signed_grad, saliency_mat)
            adv_sample.append(X_new_group)
            '''print("Ground truth", teY[counter])
            print(model.predict(teX[counter].reshape(-1, 1000)))
            print(model.predict(X_new_group.reshape(-1,1000)))
            plt.plot(teX[counter])
            plt.show()
            plt.plot(X_new_group.reshape(-1,1),'r')
            plt.show()'''

        adv_sample = np.array(adv_sample, dtype=float).reshape(-1, 1000)
        score = model.evaluate(adv_sample, teY, verbose=0)
        print('Test loss:', score[0])
        print('Test accuracy:', score[1])

        teY_pred = np.argmax(model.predict(teX, batch_size=32), axis=1)
        adv_pred = np.argmax(model.predict(adv_sample, batch_size=32), axis=1)
        '''with open('test_true.csv', 'w') as f:
            writer = csv.writer(f)
            writer.writerows(np.argmax(teY, axis=1).reshape(-1, 1))

        with open('test_pred.csv', 'w') as f:
            writer = csv.writer(f)
            writer.writerows(teY_pred.reshape(-1, 1))

        with open('test_adversary.csv', 'w') as f:
            writer = csv.writer(f)
            writer.writerows(adv_pred.reshape(-1, 1))'''
|
||||
|
|
@ -0,0 +1,90 @@
|
|||
import numpy as np
import csv
import matplotlib.pyplot as plt


def voltage_sag(signal, level):
    # Replace samples 200..499 with a sine of reduced amplitude `level`.
    # Work on a copy so the caller's base waveform is not modified.
    signal = np.copy(signal)
    x = np.linspace(4*np.pi, 10*np.pi, 300)
    y = level*np.sin(x)
    signal[200:500] = y
    return signal


def voltage_distortion(signal, level):
    # Add Gaussian noise with standard deviation `level` to the whole waveform.
    # Out-of-place addition, so noise does not accumulate in the base waveform
    # across repeated calls.
    noise = np.random.normal(loc=0, scale=level, size=np.shape(signal))
    signal = signal + noise
    #plt.plot(signal)
    #plt.show()

    return signal


def voltage_impulse(signal, level):
    # Add Gaussian noise with standard deviation `level` to samples 400..419 only.
    # Work on a copy so the caller's base waveform is not modified.
    signal = np.copy(signal)
    noise = np.random.normal(loc=0, scale=level, size=np.shape(signal))
    signal[400:420] += noise[400:420]
    #plt.plot(signal)
    #plt.show()

    return signal


if __name__ == '__main__':
    # Base waveform: ten sine cycles sampled at 1000 points
    x = np.linspace(0 * np.pi, 20 * np.pi, 1000)
    y = np.sin(x)
    signal = y
    signal_all = []
    '''for i in range(200):
        levels=np.random.uniform(0.5, 0.9)
        #print(levels)
        signals=voltage_sag(signal, level=levels)
        #plt.plot(signals)
        #plt.show()
        signal_all.append(np.copy(signals))
    signal_all=np.array(signal_all, dtype=float).reshape(-1,1000)
    with open('sag.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerows(signal_all)'''

    '''signal_all=[]
    for i in range(200):
        levels=np.random.uniform(0.0, 0.1)
        #print(levels)
        signals=voltage_distortion(signal, level=levels)
        #plt.plot(signals)
        #plt.show()
        signal_all.append(np.copy(signals))
    signal_all=np.array(signal_all, dtype=float).reshape(-1,1000)
    with open('distortion.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerows(signal_all)'''

    '''signal_all=[]
    for i in range(200):
        levels=np.random.uniform(0.5, 0.8)
        #print(levels)
        signals=voltage_impulse(signal, level=levels)
        #plt.plot(signals)
        #plt.show()
        signal_all.append(np.copy(signals))
    signal_all=np.array(signal_all, dtype=float).reshape(-1,1000)
    with open('impulse.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerows(signal_all)'''

    # "Normal" class: the base waveform with very small noise
    signal_all = []
    for i in range(200):
        levels = np.random.uniform(0.0, 0.01)
        # print(levels)
        signals = voltage_distortion(signal, level=levels)
        # plt.plot(signals)
        # plt.show()
        signal_all.append(np.copy(signals))
    signal_all = np.array(signal_all, dtype=float).reshape(-1, 1000)
    with open('normal.csv', 'w') as f:
        writer = csv.writer(f)
        writer.writerows(signal_all)
|
||||
|
||||
|
|
@ -0,0 +1,244 @@
|
|||
import tensorflow as tf
|
||||
import keras
|
||||
from keras.optimizers import SGD
|
||||
from Neural_Net_Module import rnn_model
|
||||
import csv
|
||||
import matplotlib.pyplot as plt
|
||||
from utils import *
|
||||
|
||||
lr = 0.01
|
||||
batch_size = 200
|
||||
nb_epoch = 10
|
||||
controllable_dim = 16
|
||||
seq_length = 10
|
||||
TEMP_MAX = 24
|
||||
TEMP_MIN = 19
|
||||
eps=0.03
|
||||
gamma=90
|
||||
|
||||
|
||||
def scaled_gradient(x, predictions, target):
    # Squared error between the model prediction and the target
    loss = tf.square(predictions - target)
    # Take the gradient with respect to x_{T}, since it contains all the x values that need to be updated
    grad, = tf.gradients(loss, x)
    signed_grad = tf.sign(grad)
    # Gradient of the log-barrier function on the constraints (currently disabled)
    # grad_comfort_high = 1 / ((tref_high - tset))
    # grad_comfort_low = 1 / ((tset - tref_low))
    # grad_constrained = grad[:, :, 0:16] + 0.000000001 * (grad_comfort_high + grad_comfort_low)
    return grad, signed_grad
|
||||
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
if keras.backend.image_dim_ordering() != 'tf':
|
||||
keras.backend.set_image_dim_ordering('tf')
|
||||
|
||||
sess = tf.Session()
|
||||
keras.backend.set_session(sess)
|
||||
|
||||
with open('building_data.csv', 'r') as csvfile:
|
||||
reader = csv.reader(csvfile)
|
||||
rows = [row for row in reader]
|
||||
rows = rows[1:43264]
|
||||
print("Dataset shape", shape(rows))
|
||||
rows = np.array(rows[1:], dtype=float)
|
||||
|
||||
feature_dim = rows.shape[1]
|
||||
print("Feature dimension", feature_dim)
|
||||
|
||||
# Normalize the feature and response
|
||||
max_value = np.max(rows, axis=0)
|
||||
print("Max power values: ", max_value)
|
||||
min_value = np.min(rows, axis=0)
|
||||
rows2 = (rows - min_value) / (max_value - min_value)
|
||||
|
||||
# Reorganize to the RNN-like sequence
|
||||
X_train, Y_train = reorganize(rows2[:, 0:feature_dim - 1], rows2[:, feature_dim - 1], seq_length=seq_length)
|
||||
print("Training data shape", shape(X_train))
|
||||
print("X_train None:", np.argwhere(np.isnan(X_train)))
|
||||
X_train = np.array(X_train, dtype=float)
|
||||
Y_train = np.array(Y_train, dtype=float)
|
||||
|
||||
# Test data: the slice below starts at sample 3500, so it overlaps the training range [0, 35000); change here for real held-out testing data
|
||||
Y_test = np.copy(Y_train[3500:])
|
||||
X_test = np.copy(X_train[3500:])
|
||||
X_train = X_train[:35000]
|
||||
Y_train = Y_train[:35000]
|
||||
print('Number of testing samples', Y_test.shape[0])
|
||||
print('Number of training samples', Y_train.shape[0])
|
||||
|
||||
# Define tensor
|
||||
x = tf.placeholder(tf.float32, shape=(None, seq_length, feature_dim - 1))
|
||||
y = tf.placeholder(tf.float32, shape=(None, 1))
|
||||
tset = tf.placeholder(tf.float32, shape=(None, seq_length, controllable_dim))
|
||||
target = tf.placeholder(tf.float32, shape=(None, 1))
|
||||
|
||||
# Define the temperature setpoint upper and lower bounds
|
||||
temp_low = TEMP_MIN * np.ones((1, controllable_dim)) # lowest temperature setpoint (TEMP_MIN = 19)
|
||||
temp_low = (temp_low - min_value[0:controllable_dim]) / (max_value[0:controllable_dim] - min_value[0:controllable_dim])
|
||||
temp_high = TEMP_MAX * np.ones((1, controllable_dim)) # highest temperature setpoint (TEMP_MAX = 24)
|
||||
temp_high = (temp_high - min_value[0:controllable_dim]) / (max_value[0:controllable_dim] - min_value[0:controllable_dim])
|
||||
|
||||
# Define the RNN model, establish the graph and SGD solver
|
||||
model = rnn_model(seq_length=seq_length, input_dim=feature_dim - 1)
|
||||
predictions = model(x)
|
||||
sgd = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)
|
||||
model.compile(loss='mean_squared_error', optimizer=sgd)
|
||||
|
||||
# Fit the RNN model with training data and save the model weight
|
||||
model.fit(X_train, Y_train, batch_size=batch_size, epochs=nb_epoch, shuffle=True) # validation_split=0.1
|
||||
# model.save_weights('rnn_clean.h5')
|
||||
model.load_weights('rnn_clean.h5')
|
||||
y_value = model.predict(X_test[0:5000], batch_size=32)
|
||||
|
||||
# Record the prediction result
|
||||
with open('predicted_rnn2.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(y_value)
|
||||
|
||||
with open('truth_rnn2.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(Y_test[0:5000])
|
||||
|
||||
# Plot the prediction result. This is the same as Building_Load_Forecasting.py
|
||||
t = np.arange(0, 2016)
|
||||
plt.plot(t, Y_test[216:216 + 2016], 'r--', label="True")
|
||||
plt.plot(t, y_value[216:216 + 2016], 'b', label="predicted")
|
||||
plt.legend(loc='upper right')
|
||||
ax = plt.gca() # grab the current axis
|
||||
ax.set_xticks(144 * np.arange(0, 14)) # choose which x locations to have ticks
|
||||
ax.set_xticklabels(["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat",
|
||||
"Sun"]) # set the labels to display at those ticks
|
||||
plt.title("Building electricity consumption")
|
||||
plt.show()
|
||||
print("Clean training completed!")
|
||||
print("Test percentage error:", np.mean(np.divide(abs(y_value - Y_test[0:5000]), Y_test[0:5000])))
|
||||
#model.save_weights('rnn_clean.h5')
|
||||
|
||||
# Optimization step starts here!
|
||||
|
||||
|
||||
X_new = []
|
||||
grad_new = []
|
||||
mpc_scope = seq_length
|
||||
X_train2 = np.copy(X_test)
|
||||
with sess.as_default():
|
||||
counter = 0
|
||||
# Initialize the SGD optimizer
|
||||
grad, sign_grad = scaled_gradient(x, predictions, target)
|
||||
for q in range(1000 - seq_length):
|
||||
if counter % 100 == 0 and counter > 0:
|
||||
print("Optimization time step " + str(counter))
|
||||
|
||||
# Define the control output target
|
||||
#Y_target = (0 * Y_test[counter:counter + mpc_scope]).reshape(-1, 1)
|
||||
Y_target = Y_test[counter:counter + mpc_scope].reshape(-1, 1)
|
||||
|
||||
# upper and lower bound for controllable features
|
||||
X_upper_bound = np.tile(temp_high, (mpc_scope, seq_length, 1))
|
||||
X_lower_bound = np.tile(temp_low, (mpc_scope, seq_length, 1))
|
||||
|
||||
# Define input: x_t, x_{t+1},...,x_{t+pred_scope}
|
||||
X_input = X_train2[counter:counter + mpc_scope]
|
||||
#X_input = check_control_constraint(X_input, controllable_dim, X_upper_bound, X_lower_bound)
|
||||
X_controllable = X_input[:, :, 0:controllable_dim]
|
||||
# the uncontrollable part needs to be replaced by prediction later!!!
|
||||
X_uncontrollable = X_input[:, :, controllable_dim:feature_dim - 1]
|
||||
|
||||
X_new_group = X_input
|
||||
#print("X_new_group shape", shape(X_new_group))
|
||||
gradient_value, signed_grad = sess.run([grad, sign_grad], feed_dict={x: X_new_group,
|
||||
target: Y_target,
|
||||
tset: X_controllable,
|
||||
keras.backend.learning_phase(): 0})
|
||||
#print("sign_grad", signed_grad)
|
||||
#print(np.shape(signed_grad))
|
||||
saliency_mat=np.abs(gradient_value)
|
||||
saliency_mat=(saliency_mat>np.percentile(np.abs(gradient_value),[gamma])).astype(int)
|
||||
random_num=np.random.randint(0,2)
|
||||
if random_num==0:
|
||||
X_new_group = X_new_group + np.multiply(eps * signed_grad, saliency_mat)
|
||||
#X_new_group = X_new_group + eps * signed_grad
|
||||
else:
|
||||
X_new_group = X_new_group - np.multiply(eps * signed_grad, saliency_mat)
|
||||
#X_new_group = X_new_group - eps * signed_grad
|
||||
|
||||
# check the norm constraints on input
|
||||
#X_new_group = check_control_constraint(X_new_group, controllable_dim, X_upper_bound, X_lower_bound)
|
||||
y_new_group = model.predict(X_new_group)
|
||||
|
||||
if len(X_new) == 0:
|
||||
X_new = X_new_group[0].reshape([1, seq_length, feature_dim - 1])
|
||||
grad_new = gradient_value[0]
|
||||
else:
|
||||
X_new = np.concatenate((X_new, X_new_group[0].reshape([1, seq_length, feature_dim - 1])), axis=0)
|
||||
grad_new = np.concatenate((grad_new, gradient_value[0]), axis=0)
|
||||
|
||||
# Update the x value in the training data
|
||||
X_train2[counter] = X_new_group[0].reshape([1, seq_length, feature_dim - 1])
|
||||
for i in range(1, seq_length):
|
||||
X_train2[counter + i, 0:seq_length - i, :] = X_train2[counter, i:seq_length, :]
|
||||
|
||||
# Next time step
|
||||
counter += 1
|
||||
|
||||
|
||||
X_new = np.array(X_new, dtype=float)
|
||||
print("Adversarial X shape", shape(X_new))
|
||||
dime=55
|
||||
y_new = model.predict(X_new, batch_size=64)* (max_value[dime] - min_value[dime]) + min_value[dime]
|
||||
y_val=model.predict(X_test[:1000], batch_size=32)* (max_value[dime] - min_value[dime]) + min_value[dime]
|
||||
y_orig=Y_test[:1000]* (max_value[dime] - min_value[dime]) + min_value[dime]
|
||||
plt.plot(y_new,'r')
|
||||
plt.plot(y_val, 'g')
|
||||
plt.plot(y_orig,'b')
|
||||
plt.show()
|
||||
|
||||
|
||||
print("Adversary Forecast Error:", np.mean(np.clip(np.abs(y_new - y_orig[:990])/y_orig[:990], 0,3)))
|
||||
with open('test_true.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(y_orig.reshape(-1,1))
|
||||
|
||||
with open('test_pred.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(y_val.reshape(-1,1))
|
||||
|
||||
with open('test_adversary.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(y_new.reshape(-1,1))
|
||||
|
||||
|
||||
#Observe the difference on input features and visualize
|
||||
deviation_all=0
|
||||
'''for dime in range(30):
|
||||
X_temp = rows[0:len(X_new), dime]
|
||||
X_temp_new = X_new[0:len(X_new), 0, dime] * (max_value[dime] - min_value[dime]) + min_value[dime]
|
||||
deviation=np.mean(np.abs(X_temp_new-X_temp)/(X_temp+0.0001))
|
||||
print(deviation)
|
||||
deviation_all+=deviation
|
||||
plt.plot(X_temp, 'r--', label="previous")
|
||||
plt.plot(X_temp_new, 'b', label="adversarial")
|
||||
plt.show()
|
||||
|
||||
print("The overall input features deviation: ", deviation_all/30.0)'''
|
||||
|
||||
dime=26
|
||||
X_temp = rows[0:len(X_new), dime].reshape(-1,1)
|
||||
X_temp_new = (X_new[0:len(X_new), 0, dime] * (max_value[dime] - min_value[dime]) + min_value[dime]).reshape(-1,1)
|
||||
plt.plot(X_temp, 'r--', label="previous")
|
||||
plt.plot(X_temp_new, 'b', label="adversarial")
|
||||
plt.show()
|
||||
with open('fea10_orig.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(X_temp)
|
||||
|
||||
with open('fea10_adv.csv', 'w') as f:
|
||||
writer = csv.writer(f)
|
||||
writer.writerows(X_temp_new)
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
@ -0,0 +1,37 @@
|
|||
#build the NN models: RNN module
|
||||
import tensorflow
|
||||
from tensorflow.python.ops import control_flow_ops
|
||||
from keras.models import Sequential
|
||||
from keras.layers import Dense, Dropout, Activation, Flatten
|
||||
from keras.layers import Convolution2D, MaxPooling2D
|
||||
from keras.layers import LSTM, Embedding,SimpleRNN
|
||||
from keras.utils import np_utils
|
||||
from tensorflow.python.platform import flags
|
||||
from numpy import shape
|
||||
import numpy as np
|
||||
from skimage import io, color, exposure, transform
|
||||
import os
|
||||
import glob
|
||||
import h5py
|
||||
import pandas as pd
|
||||
import numpy
|
||||
|
||||
|
||||
FLAGS = flags.FLAGS
|
||||
#tensorflow.python.control_flow_ops =control_flow_ops
|
||||
|
||||
|
||||
def rnn_model(seq_length, input_dim):
    # Simple recurrent regressor: one SimpleRNN layer followed by a small dense stack
    model = Sequential()
    model.add(SimpleRNN(64, input_shape=(seq_length, input_dim)))
    model.add(Dropout(0.2))
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dense(32))
    model.add(Activation('relu'))
    model.add(Dense(16))
    model.add(Activation('relu'))
    # Single linear output; `kernel_initializer` is the Keras 2 name for the old `init` argument
    model.add(Dense(1, kernel_initializer='normal'))
    return model
|
||||
|
||||
|
|
@ -0,0 +1,20 @@
|
|||
# Power_adversary
|
||||
The code repository for the paper "Is Machine Learning in Power Systems Vulnerable?"
|
||||
|
||||
Paper accepted to SmartGridComm2018, Workshop on AI in Energy Systems.
|
||||
|
||||
Authors: Yize Chen, Yushi Tan and Deepjyoti Deka
|
||||
|
||||
University of Washington and Los Alamos National Laboratory.
|
||||
|
||||
## Introduction
|
||||
|
||||
We examine the vulnerabilities of ML algorithms used in power systems and craft attacks against specific power system applications.
|
||||
|
||||
|
||||
|
||||
## Usage
|
||||
|
||||
To exploit the algorithmic vulnerabilities, we consider classification and forecasting cases in power systems. Run the Python files directly to test the model accuracy before and after the attack.
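
As a rough illustration of the before/after comparison, the sketch below applies a sign-gradient perturbation to a toy linear forecaster in NumPy. The linear model, the synthetic data, and the helper names `mse` and `sign_gradient_attack` are hypothetical stand-ins chosen for brevity; they are not the repo's RNN or dataset. Only the budget `eps = 0.03` mirrors the value used in the forecasting script.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained forecaster: y_hat = X @ w
n_samples, n_features = 200, 5
X = rng.uniform(0.0, 1.0, size=(n_samples, n_features))
w = rng.normal(size=n_features)
y = X @ w + rng.normal(scale=0.05, size=n_samples)  # noisy ground truth


def mse(X, y, w):
    """Mean squared forecast error of the linear model."""
    return float(np.mean((X @ w - y) ** 2))


def sign_gradient_attack(X, y, w, eps):
    """Shift every feature by eps in the direction that increases the squared error."""
    # d/dx (x @ w - y)^2 = 2 * (x @ w - y) * w, evaluated per sample
    residual = (X @ w - y)[:, None]
    grad = 2.0 * residual * w[None, :]
    return X + eps * np.sign(grad)


eps = 0.03  # assumed perturbation budget on normalized features
X_adv = sign_gradient_attack(X, y, w, eps)
mse_clean = mse(X, y, w)
mse_adv = mse(X_adv, y, w)
print("MSE before attack:", mse_clean)
print("MSE after attack:", mse_adv)
```

For a linear model this perturbation enlarges every residual by `eps` times the L1 norm of `w`, so the error after the attack is always larger than before; for the RNN in this repo the same comparison is only empirical.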
|
||||
|
||||
Contact: yizechen@uw.edu
|
|
@ -0,0 +1,38 @@
|
|||
# Defense strategies from literature
|
||||
|
||||
## Manipulating ML: poisoning attacks and countermeasures for regression learning
|
||||
|
||||
### System model
|
||||
|
||||
The regression function $f$ is chosen to minimize a regularized quadratic loss function:
|
||||
$$
\mathcal{L}(\mathcal{D}_{tr}, \mathsf{\theta}) = \frac{1}{n}\sum_{i = 1}^{n} (f(\mathsf{x}_i, \mathsf{\theta}) - y_i)^2 + \lambda \Omega(\mathsf{w})
$$
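
A minimal sketch of this loss, assuming a linear model $f(\mathsf{x}, \mathsf{\theta}) = \mathsf{w}^\top \mathsf{x} + b$ and a ridge regularizer $\Omega(\mathsf{w}) = \frac{1}{2}\|\mathsf{w}\|_2^2$. Both choices, and the function name `quadratic_loss`, are assumptions made for illustration; the notes above do not fix them.

```python
import numpy as np


def quadratic_loss(X, y, w, b, lam):
    """Mean squared error plus lam * Omega(w), with Omega(w) = 0.5 * ||w||_2^2 (assumed)."""
    residual = X @ w + b - y
    return float(np.mean(residual ** 2) + lam * 0.5 * np.dot(w, w))


# Tiny example in which the model fits the data exactly,
# so only the regularization term contributes to the loss.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])
loss = quadratic_loss(X, y, w, b=0.0, lam=0.1)
print(loss)  # 0.1 * 0.5 * (1 + 4) = 0.25
```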
|
||||
|
||||
### Adversarial modeling
|
||||
|
||||
The goal is to corrupt the learning model generated in the training phase so that its predictions on new data are modified in the testing phase. Two setups are considered: *white-box* and *black-box* attacks. In *black-box* attacks, the attacker has no knowledge of the training set $\mathcal{D}_{tr}$ but can collect a substitute data set $\mathcal{D}_{tr}^{\prime}$. The feature set and the learning algorithm are known, while the trained parameters are not.
|
||||
|
||||
The *white-box* attack can be modeled as the bilevel problem
|
||||
$$
\begin{aligned}
\arg \max_{\mathcal{D}_p} \quad & \mathcal{W}(\mathcal{D}^{\prime}, \mathsf{\theta}_p^{\ast}) \\
\text{s.t.} \quad & \mathsf{\theta}_p^{\ast} \in \arg \min_{\mathsf{\theta}} \mathcal{L}(\mathcal{D}_{tr} \cup \mathcal{D}_{p}, \mathsf{\theta})
\end{aligned}
$$
|
||||
In the *black-box* setting, the poisoned regression parameters $\mathsf{\theta}_{p}^{\ast}$ are estimated using the substitute data.
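
The bilevel structure can be made concrete with a small sketch that evaluates the attacker's objective $\mathcal{W}$ for one candidate poisoning set $\mathcal{D}_p$. It assumes ridge regression without an intercept as the learner and the validation mean squared error as $\mathcal{W}$. The helper names `fit_ridge` and `attacker_objective` and the synthetic data are hypothetical, and the sketch does not optimize over $\mathcal{D}_p$; it only scores a given set.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Inner problem: minimize (1/n) * ||X w - y||^2 + lam * ||w||^2 in closed form."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)


def attacker_objective(X_tr, y_tr, X_p, y_p, X_val, y_val, lam=1e-3):
    """Outer objective W: validation MSE of the model trained on D_tr united with D_p."""
    X_all = np.vstack([X_tr, X_p])
    y_all = np.concatenate([y_tr, y_p])
    w_p = fit_ridge(X_all, y_all, lam)
    return float(np.mean((X_val @ w_p - y_val) ** 2))


rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
X_tr = rng.uniform(0.0, 1.0, size=(50, 2))
y_tr = X_tr @ w_true
X_val = rng.uniform(0.0, 1.0, size=(50, 2))
y_val = X_val @ w_true

# Candidate poisoning points: valid-looking features with badly wrong labels
X_p = rng.uniform(0.0, 1.0, size=(10, 2))
y_p = -(X_p @ w_true) + 5.0

clean = attacker_objective(X_tr, y_tr, X_tr[:0], y_tr[:0], X_val, y_val)
poisoned = attacker_objective(X_tr, y_tr, X_p, y_p, X_val, y_val)
print("validation MSE without poisoning:", clean)
print("validation MSE with poisoning:", poisoned)
```

In the *black-box* variant the same function would be called with a substitute training set in place of `X_tr, y_tr`.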
|
||||
|
||||
### Attack Methods
|
||||
|
||||
|
||||
|
||||
### Comments
|
||||
|
||||
1. The attack model is somewhat different, if I understand correctly. The aim is to construct an additional data set so that the “optimized” parameters fail on any intact data set.
2. The setup resembles the breakdown point; the major difference is the evaluation. The breakdown point evaluates the parameters, whereas this setup evaluates the performance on the “test” set.
3. This setup intrinsically attacks the fitting strategy, rather than a specific model.
4. It is formulated as a bilevel Stackelberg game.
5. The defense strategy is still, more or less, the conventional trimmed loss.
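
To make the last comment concrete, here is a minimal sketch of a trimmed-loss fit: alternate between fitting on a subset and keeping the `n_keep` points with the smallest residuals. It is a generic illustration of the trimmed-loss idea under assumed settings (ordinary least squares without an intercept, synthetic data, hypothetical function name `trimmed_least_squares`), not the exact defense from the paper.

```python
import numpy as np


def trimmed_least_squares(X, y, n_keep, n_iter=20):
    """Alternate between a least-squares fit and keeping the n_keep smallest residuals."""
    idx = np.arange(len(y))  # start from all points
    for _ in range(n_iter):
        w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        residuals = (X @ w - y) ** 2
        new_idx = np.sort(np.argsort(residuals)[:n_keep])
        if np.array_equal(new_idx, idx):
            break  # the kept set is stable
        idx = new_idx
    # Final fit on the kept subset
    w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    return w, idx


rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
X_clean = rng.uniform(0.0, 1.0, size=(40, 2))
y_clean = X_clean @ w_true
X_poison = rng.uniform(0.0, 1.0, size=(5, 2))
y_poison = X_poison @ w_true + 10.0  # labels shifted far from the truth

X = np.vstack([X_clean, X_poison])
y = np.concatenate([y_clean, y_poison])

w_plain, *_ = np.linalg.lstsq(X, y, rcond=None)
w_trim, kept = trimmed_least_squares(X, y, n_keep=40)
print("plain fit:  ", w_plain)
print("trimmed fit:", w_trim)
```

The sketch assumes the defender knows roughly how many points are clean (`n_keep`); choosing that number is the main practical difficulty with this kind of defense.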
|
||||
|
||||
|
||||
|
||||
## The space of transferable adversarial examples
|
||||
|
|
@ -0,0 +1,28 @@
|
|||
from numpy import shape
|
||||
import numpy as np
|
||||
|
||||
|
||||
|
||||
def reorganize(X_train, Y_train, seq_length):
    # Organize the input and output to feed into the RNN model:
    # sample i is the window of seq_length consecutive rows X_train[i:i + seq_length]
    x_data = []
    for i in range(len(X_train) - seq_length):
        x_new = X_train[i:i + seq_length]
        x_data.append(x_new)

    # The label for window i is the response right after the window, Y_train[i + seq_length]
    y_data = Y_train[seq_length:]
    y_data = y_data.reshape((-1, 1))

    return x_data, y_data
|
||||
|
||||
|
||||
def check_control_constraint(X, dim, upper_bound, lower_bound):
    # Keep the first `dim` (controllable) features of X strictly inside the bounds.
    # X, upper_bound and lower_bound are indexed as [sample, time step, feature].
    for i in range(0, shape(X)[0]):
        for j in range(0, shape(X)[1]):
            for k in range(0, dim):
                if X[i, j, k] >= upper_bound[i, j, k]:
                    X[i, j, k] = upper_bound[i, j, k] - 0.01
                if X[i, j, k] <= lower_bound[i, j, k]:
                    X[i, j, k] = lower_bound[i, j, k] + 0.01
    return X
|