Fault detection

We'll consider a problem of identifying faults that have occurred in a system based on sensor measurements of system performance.

Topic references

Problem statement

Each of $n$ possible faults occurs independently with probability $p$. The vector $x \in \lbrace 0,1 \rbrace^{n}$ encodes the fault occurrences, with $x_i = 1$ indicating that fault $i$ has occurred. System performance is measured by $m$ sensors. The sensor output is \begin{equation} y = Ax + v = \sum_{i=1}^n a_i x_i + v, \end{equation} where $A \in \mathbf{R}^{m \times n}$ is the sensing matrix with column $a_i$ being the fault signature of fault $i$, and $v \in \mathbf{R}^m$ is a noise vector where $v_j$ is Gaussian with mean 0 and variance $\sigma^2$.

The objective is to guess $x$ (which faults have occurred) given $y$ (sensor measurements).

We are interested in the setting where $n > m$, that is, when we have more possible faults than measurements. In this setting, we can expect a good recovery when the vector $x$ is sparse. This is the subject of compressed sensing.

Solution approach

To identify the faults, one reasonable approach is to choose $x \in \lbrace 0,1 \rbrace^{n}$ to minimize the negative log-likelihood function

\begin{equation} \ell(x) = \frac{1}{2 \sigma^2} \|Ax-y\|_2^2 + \log(1/p-1)\mathbf{1}^T x + c. \end{equation}

However, this problem is nonconvex and NP-hard, due to the constraint that $x$ must be Boolean.

To make this problem tractable, we can relax the Boolean constraints and instead constrain $x_i \in [0,1]$.

The optimization problem

\begin{array}{ll} \mbox{minimize} & \|Ax-y\|_2^2 + 2 \sigma^2 \log(1/p-1)\mathbf{1}^T x\\ \mbox{subject to} & 0 \leq x_i \leq 1, \quad i=1, \ldots n \end{array}

is convex. We'll refer to the solution of the convex problem as the relaxed ML estimate.

By taking the relaxed ML estimate of $x$ and rounding the entries to the nearest of 0 or 1, we recover a Boolean estimate of the fault occurrences.

Example

We'll generate an example with $n = 2000$ possible faults, $m = 200$ measurements, and fault probability $p = 0.01$. We'll choose $\sigma^2$ so that the signal-to-noise ratio is 5. That is, \begin{equation} \sqrt{\frac{\mathbf{E}|Ax |^2_2}{\mathbf{E} |v|_2^2}} = 5. \end{equation}

In [1]:
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(1)

n = 2000
m = 200
p = 0.01
snr = 5

sigma = np.sqrt(p*n/(snr**2))
A = np.random.randn(m,n)

x_true = (np.random.rand(n) <= p).astype(np.int)
v = sigma*np.random.randn(m)

y = A.dot(x_true) + v

Below, we show $x$, $Ax$ and the noise $v$.

In [2]:
plt.plot(range(n),x_true)
Out[2]:
[<matplotlib.lines.Line2D at 0x110695dd0>]
In [3]:
plt.plot(range(m), A.dot(x_true),range(m),v)
plt.legend(('Ax','v'))
Out[3]:
<matplotlib.legend.Legend at 0x110ee63d0>

Recovery

We solve the relaxed maximum likelihood problem with CVXPY and then round the result to get a Boolean solution.

In [4]:
%%time
from cvxpy import *
x = Variable(n)
tau = 2*log(1/p - 1)*sigma**2
obj = Minimize(sum_squares(A*x - y) + tau*sum_entries(x))
const = [0 <= x, x <= 1]
Problem(obj,const).solve(verbose=True)

# relaxed ML estimate
x_rml = np.array(x.value).flatten()

# rounded solution
x_rnd = (x_rml >= .5).astype(int)
ECOS 1.0.4 - (c) A. Domahidi, Automatic Control Laboratory, ETH Zurich, 2012-2014.

It     pcost         dcost      gap     pres    dres     k/t     mu      step     IR
 0   +7.127e+03   -6.144e+04   +8e+05   8e+00   1e-01   1e+00   2e+02    N/A     1 1 -
 1   +7.014e+02   -1.137e+04   +4e+05   1e+00   2e-02   4e+01   1e+02   0.9899   1 1 1
 2   +5.300e+01   -1.510e+03   +9e+04   2e-01   2e-03   2e+01   2e+01   0.9406   2 1 1
 3   +1.140e+02   -7.533e+02   +5e+04   1e-01   1e-03   1e+01   1e+01   0.5426   2 2 2
 4   +1.378e+02   -3.905e+02   +3e+04   6e-02   8e-04   5e+00   8e+00   0.5017   2 2 2
 5   +1.391e+02   -2.656e+02   +3e+04   5e-02   6e-04   3e+00   7e+00   0.4344   2 2 1
 6   +1.645e+02   +8.938e+00   +1e+04   2e-02   2e-04   9e-01   3e+00   0.6950   2 2 2
 7   +1.740e+02   +7.476e+01   +6e+03   1e-02   2e-04   5e-01   2e+00   0.5070   2 2 2
 8   +1.739e+02   +7.682e+01   +6e+03   1e-02   2e-04   4e-01   2e+00   0.0978   3 2 1
 9   +1.844e+02   +1.482e+02   +2e+03   4e-03   6e-05   2e-02   6e-01   0.9899   2 2 2
10   +1.889e+02   +1.755e+02   +9e+02   2e-03   2e-05   9e-03   2e-01   0.7568   2 2 2
11   +1.907e+02   +1.864e+02   +3e+02   5e-04   7e-06   3e-03   7e-02   0.8071   2 2 2
12   +1.912e+02   +1.892e+02   +1e+02   2e-04   3e-06   1e-03   3e-02   0.8099   2 2 2
13   +1.914e+02   +1.906e+02   +6e+01   1e-04   1e-06   5e-04   1e-02   0.7158   3 2 2
14   +1.916e+02   +1.912e+02   +3e+01   4e-05   6e-07   2e-04   6e-03   0.8640   3 1 1
15   +1.916e+02   +1.916e+02   +4e+00   7e-06   9e-08   3e-05   1e-03   0.8722   3 2 2
16   +1.916e+02   +1.916e+02   +4e-01   7e-07   1e-08   4e-06   1e-04   0.9258   2 2 2
17   +1.916e+02   +1.916e+02   +6e-02   1e-07   2e-09   5e-07   2e-05   0.8804   3 2 2
18   +1.916e+02   +1.916e+02   +2e-02   4e-08   5e-10   2e-07   5e-06   0.7988   3 3 3
19   +1.916e+02   +1.916e+02   +3e-03   6e-09   8e-11   3e-08   8e-07   0.9092   3 3 3
20   +1.916e+02   +1.916e+02   +5e-04   9e-10   1e-11   5e-09   1e-07   0.9134   3 2 2
21   +1.916e+02   +1.916e+02   +1e-04   2e-10   2e-12   9e-10   3e-08   0.8726   3 2 1
22   +1.916e+02   +1.916e+02   +1e-05   2e-11   3e-13   1e-10   4e-09   0.9512   2 1 1

OPTIMAL (within feastol=2.5e-11, reltol=7.3e-08, abstol=1.4e-05).
Runtime: 4.225071 seconds.

CPU times: user 4.66 s, sys: 123 ms, total: 4.78 s
Wall time: 4.97 s

Evaluation

We define a function for computing the estimation errors, and a function for plotting $x$, the relaxed ML estimate, and the rounded solutions.

In [5]:
import matplotlib

def errors(x_true, x, threshold=.5):
    '''Return estimation errors.
    
    Return the true number of faults, the number of false positives, and the number of false negatives.
    '''
    n = len(x_true)
    k = sum(x_true)
    false_pos = sum(np.logical_and(x_true < threshold, x >= threshold))
    false_neg = sum(np.logical_and(x_true >= threshold, x < threshold))
    return (k, false_pos, false_neg)

def plotXs(x_true, x_rml, x_rnd, filename=None):
    '''Plot true, relaxed ML, and rounded solutions.'''
    matplotlib.rcParams.update({'font.size': 14})
    xs = [x_true, x_rml, x_rnd]
    titles = ['x_true', 'x_rml', 'x_rnd']

    n = len(x_true)
    k = sum(x_true)

    fig, ax = plt.subplots(1, 3, sharex=True, sharey=True, figsize=(12, 3))

    for i,x in enumerate(xs):
            ax[i].plot(range(n), x)
            ax[i].set_title(titles[i])
            ax[i].set_ylim([0,1])
            
    if filename:
        fig.savefig(filename, bbox_inches='tight')
        
    return errors(x_true, x_rml,.5)

We see that out of 20 actual faults, the rounded solution gives perfect recovery with 0 false negatives and 0 false positives.

In [6]:
plotXs(x_true, x_rml, x_rnd, 'fault.pdf')
Out[6]:
(20, 0, 0)