by Robert Gowers, Roger Hill, Sami Al-Izzi, Timothy Pollington and Keith Briggs
from Boyd and Vandenberghe, Convex Optimization, exercise 4.57 pages 207-8
Convex optimization can be used to find the channel capacity $C$ of a discrete memoryless channel. Consider a communication channel with input $X(t) \in \{1,2,...,n\}$ and output $Y(t) \in \{1,2,...m\}$. This means that the random variables $X$ and $Y$ can take $n$ and $m$ different values, respectively.
In a discrete memoryless channel, the relation between the input and the output is given by the transition probability:
$p_{ij} = \mathbb{P}(Y(t)=i | X(t)=j)$
These transition probabilities form the channel transition matrix $P$, with $P \in \mathbb{R}^{m\times n}$.
Assume that $X$ has a probability distribution denoted by $x \in \mathbb{R}^n$, meaning that:
$x_j = \mathbb{P}(X(t) = j) \quad j \in \{1,...,n\}$.
From Shannon, the channel capacity is given by the maximum possible mutual information $I$ between $X$ and $Y$:
$C = \sup_x I(X;Y)$
where,
$I(X;Y) = -\sum_{i=1}^{m} y_i \log_2y_i + \sum_{j=1}^{n}\sum_{i=1}^{m}x_j p_{ij}\log_2p_{ij}$
Given that $x\log x$ is convex for $x \geq 0$, we can formulate this as a convex optimization problem:
minimise $-I(X;Y)$
subject to $\sum_{i=1}^{n}x_i = 1 \quad x \succeq 0 \quad$ since $x$ describes a probability
Due to the entropy function in CVXPY, this can be written quite easily in DCP.
#!/usr/bin/env python3
# @author: R. Gowers, S. Al-Izzi, T. Pollington, R. Hill & K. Briggs
import cvxpy as cvx
import numpy as np
def channel_capacity(n,m,P,sum_x=1):
'''
Boyd and Vandenberghe, Convex Optimization, exercise 4.57 page 207
Capacity of a communication channel.
We consider a communication channel, with input X(t)∈{1,..,n} and
output Y(t)∈{1,...,m}, for t=1,2,... .The relation between the
input and output is given statistically:
p_(i,j) = ℙ(Y(t)=i|X(t)=j), i=1,..,m j=1,...,m
The matrix P ∈ ℝ^(m*n) is called the channel transition matrix, and
the channel is called a discrete memoryless channel. Assuming X has a
probability distribution denoted x ∈ ℝ^n, i.e.,
x_j = ℙ(X=j), j=1,...,n
The mutual information between X and Y is given by
∑(∑(x_j p_(i,j)log_2(p_(i,j)/∑(x_k p_(i,k)))))
Then channel capacity C is given by
C = sup I(X;Y).
With a variable change of y = Px this becomes
I(X;Y)= c^T x - ∑(y_i log_2 y_i)
where c_j = ∑(p_(i,j)log_2(p_(i,j)))
'''
# n is the number of different input values
# m is the number of different output values
if n*m == 0:
print('The range of both input and output values must be greater than zero')
return 'failed',np.nan,np.nan
# x is probability distribution of the input signal X(t)
x = cvx.Variable(rows=n,cols=1)
# y is the probability distribution of the output signal Y(t)
# P is the channel transition matrix
y = P*x
# I is the mutual information between x and y
c = np.sum(P*np.log2(P),axis=0)
I = c*x + cvx.sum_entries(cvx.entr(y))
# Channel capacity maximised by maximising the mutual information
obj = cvx.Minimize(-I)
constraints = [cvx.sum_entries(x) == sum_x,x >= 0]
# Form and solve problem
prob = cvx.Problem(obj,constraints)
prob.solve()
if prob.status=='optimal':
return prob.status,prob.value,x.value
else:
return prob.status,np.nan,np.nan
In this example we consider a communication channel with two possible inputs and outputs, so $n = m = 2$. The channel transition matrix we use in this case is:
$P = \pmatrix{0.75,0.25\\0.25,0.75}$
Note that the rows of $P$ must sum to 1 and all elements of $P$ must be positive.
np.set_printoptions(precision=3)
n = 2
m = 2
P = np.array([[0.75,0.25],
[0.25,0.75]])
stat,C,x=channel_capacity(n,m,P)
print('Problem status: ',stat)
print('Optimal value of C = %.4g'%(C))
print('Optimal variable x = \n', x)