next up previous
Next: Existence of step length Up: Interior Point Methods for Previous: Interior Point Methods for

Introduction and Definitions

Over the past few years, primal-dual interior point methods (IPM) have been developed for all kinds of nonlinear optimization problems. Second-order cone programming (SOCP) is one of these. SOCP addresses the problem of minimizing a linear objective function over the intersection of an affine set and the direct product of quadratic cones. Numerous applications have been discussed in [5,15]. Recently, many reseachers have shown that the primal-dual IPM has high theoretical efficiency for SOCP. In particular, Tsuchiya [13] and Monteiro and Tsuchiya [9] have proved the complexity of variants of IPMs based on different scaling directions. Specifically, Tsuchiya [13] shows that the long-step path-following algorithm using the Nesterov and Todd (NT) direction has $ O(k$log $ \varepsilon^{-1})$ iteration complexity which is also the best result for the scaling methods they considered.

In this section we introduce the second order problem (SOCP) and quote some basic definitions for the primal-dual interior point approach. The next section contains as on of the main results a proof of the existence of the step length for the long-step method. A number of applications and in each case their reformulation as SOCP is presented in section 3. Subsequently, the various algorithms we have implemented are given in detail and compared on a test set derived from the application problems. We conclude with a summary in the last section.

Second-order cone programming addresses the problem of minimizing a linear objective function over the intersection of an affine set and the direct product of quadratic cones.

\begin{displaymath}\begin{array}{rcl} \mbox{(SOCP)}&\mbox{min} &c^Tx\\  &\mbox{ s.t.} & Ax=b\\  & & x \in K, \end{array}\end{displaymath} (1-1)

where K is a closed convex cone. In addition to (1-1), we suppose $ A$ is a $ m\times n$ real matrix and of full rank. Subsequently, the dual problem is

\begin{displaymath}\begin{array}{rcl} \mbox{(DSOCP)}&\mbox{max} &b^Ty\\  &\mbox{...
...+s=c\\  & & s \in K^*=\{s\vert x^Ts\geq 0,x\in K\}. \end{array}\end{displaymath} (1-2)

The definition of the quadratic cone $ K$ is the following:

Definition 1.1   The second order cone $ K$ is defined as, $ K=K^1\times
K^2\times\cdots\times K^k$, i.e., the direct product of $ k$ cones, where $ K^i$ is one of the three cones below,
  1. $ {\bf R}^+=\{x^i\in{\bf R}\vert x\geq 0\}.$
  2. quadratic cone $ K^q=\{x^i\in{\bf R}^{n_i}\vert\Vert x^i_{2:n_i}\Vert^2\leq (x^i_1)^2,x^i_1\geq 0\}.$
  3. rotated quadratic cone $ K^r=\{x^i\in{\bf R}^{n_i}\vert\Vert x^i_{3:n_i}\Vert^2\leq
2x^i_1x^i_2,x^i_1,x^i_2\geq 0\}.$

Note that $ K=K^*.$ Moreover, there is a linear transformation $ T$ between $ K^q$ and $ K^r$ such that

$\displaystyle x^i\in K^r \Longleftrightarrow
T^ix^i\in K^q,$


$\displaystyle T^i=\begin{pmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}}&0&\cdo...
...0\cr & & &\ddots& \cr 0& 0 &0 &\cdots&1\end{pmatrix}\in{\bf R}^{n_i\times n_i}.$ (1-3)

Here, $ T^i$ is a $ n_i\times n_i$ identity matrix if $ x^i\in K^q.$ The existence of the linear transformation $ T^i$ not only extends the problem to rotated quadratic cones but also helps to apply the convergence theorem for $ K^q$ to $ K^r$. Next, we partition $ x$ according to the cones, i.e.,

$\displaystyle x=\begin{pmatrix}x^1\cr x^2\cr\vdots\cr x^k\end{pmatrix}.$

We introduce two nonnegative scalar variables $ \tau, \kappa$ and define the general Goldman-Tucker homogeneous model as follows:

\begin{displaymath}\begin{array}{rcl} -c^Tx+b^Ty-\kappa &=& 0,\\  Ax-b\tau &=& 0...
...  (x,\tau)\in \bar{K},& & (s,\kappa) \in \bar{K^*}, \end{array}\end{displaymath} (1-4)

where $ \bar{K}=K \times {\bf R}_+$, and $ \bar{K^*}=K^* \times {\bf R}_+$. In order to solve (1-4) with the complementarity conditions, the following definitions are necessary,

Definition 1.2   Given $ u \in R^n$, then $ U=mat(u)=\begin{pmatrix}u_1 &u_{2:n}^T\cr
u_{2:n}&u_1I\end{pmatrix}$, where $ u_{2:n}=\begin{pmatrix}u_2\cr\vdots\cr u_n\end{pmatrix}$ and $ I$ is the $ (n-1) \times (n-1)$ identity matrix.

Next, we define $ X^i$ for each type of cone:

Definition 1.3  
  1. $ x^i \in R_+$: $ X^i=x^i$, and $ S^i=s^i$,
  2. $ x^i \in K^q$: $ X^i=mat(x^i)$, and $ S^i=mat(s^i)$,
  3. $ x^i \in K^r$: $ X^i=mat(Tx^i)$, and $ S^i=mat(Ts^i)$, where $ T$ is in (1-3).

Subsequently, the complementarity conditions can be written as a nonlinear system.

Lemma 1.4   Let $ x,s\in K$, then $ x$ and $ s$ are complementary, i.e., $ x^Ts=0$, if and only if

$\displaystyle X^iS^ie^i=S^iX^ie^i=0, i=1:k,$

where $ X^i$ and $ S^i$ are given in Definition 1.3. $ e^i\in
R^{n_i}$ is the first unit vector.

See [1] for proof. An initial point $ (x^{(0)},\tau^{(0)},y^{(0)},s^{(0)},\kappa^{(0)})$ is given by

$\displaystyle (x^{(0)},\tau^{(0)}), (s^{(0)},\kappa^{(0)})\in

Thus, the central path can be defined as:

Definition 1.5   The points $ (x,\tau,y,s,\kappa)$ of the central path $ C$ are defined by the following nonlinear equations,

\begin{displaymath}\begin{array}{rcl} Ax-b\tau &=& \sigma (Ax^{(0)}-b\tau^{(0)})...
...sigma \mu^{(0)}e,\cr \tau\kappa&=&\sigma \mu^{(0)}, \end{array}\end{displaymath} (1-5)

where the centering parameter $ \sigma\in [0,1]$, the duality measure is

$\displaystyle \mu =\frac{x^Ts+\tau\kappa}{k+1},$

$ X=diag\{X^1,\cdots,X^k\}$, $ S=diag\{S^1,\cdots,S^k\}$, and $ e=\begin{pmatrix}e^1\cr\vdots\cr
e^k\end{pmatrix},$ $ e^i\in
R^{n_i}$ is the first unit vector.

There are different ways to define the neighborhood. We choose the one similar to $ N_{-\infty}(\gamma),$

$\displaystyle N(\gamma)=\{(x,\tau),(s,\kappa)\in int(\bar{K})\vert
\sqrt{(x^i)^TQ^ix^i(s^i)^TQ^is^i}\geq\gamma\mu,\forall i,$and $\displaystyle \tau\kappa\geq\gamma\mu\}.$

In the original infeasible interior point method, the residuals only appear in equations after applying Newton's method. It is easy to verify that the ratio between the new and old residual is $ (1-\alpha)$, where $ \alpha$ is the step length. It is faster than the rate at which the complementarity gap decreases. This central path gives the same rate for infeasibility and the complementary gap. In experiments, we see the latter version needs less iterations than the previous one.

The well-definedness of (1-5) is not guaranteed. One way to assure that the search direction is well-defined is applying scaling schemes before using Newton's method, then scaling back to the original space. Next, we give the definition of scaling matrices,[1].

Definition 1.6   $ W^i\in{\bf R}^{n_i\times n_i}$ is a scaling matrix if it satisfies
  1. $ W^i$ is a symmetric positive definite matrix, and
  2. $ W^iQ^iW^i=Q^i.$

The scaled point $ (\bar{x},\bar{s})$ is defined by the transformation

$\displaystyle \bar{x}=\Theta Wx$    and $\displaystyle \bar{s}=(\Theta W)^{-1}s,$


$\displaystyle W=\begin{pmatrix}W^1&\cdots&0\cr \vdots&\ddots&\vdots\cr
0&\cdots&W^k\end{pmatrix},$     $\displaystyle \Theta=\begin{pmatrix}\theta_1I_{n_1}&\cdots&0\cr
\vdots&\ddots&\vdots\cr 0&\cdots&\theta_kI_{n_k}\end{pmatrix},$

$ \theta_i\in{\bf R}$, and $ I_{n_i}$ is the identity matrix $ \in{\bf R}^{n_i\times n_i}$, for all $ i$. It is easy to see that $ \bar{x}^i=\theta_iW^ix$, and $ \bar{s}^i=(\theta_iW^i)^{-1}s^i$. Therefore, the linear system for the scaled search direction is defined by

\begin{displaymath}\begin{array}{rcl} A\triangle x-b\triangle\tau&=&(\sigma-1)(A...
...le\tau+\tau\triangle\kappa&=&-\tau\kappa+\sigma\mu. \end{array}\end{displaymath} (1-6)

The NT direction was proposed by Nesterov and Todd [10]. The paper [10] also shows that the primal-dual IPM maintains its efficiency when the nonnegative constraints in LP are replaced by the homogeneous self-dual cone. More recently, Tsuchiya [13] showed that the long-step algorithm for SOCP using NT direction has $ O(k$log$ \varepsilon^{-1})$, iteration-complexity, where $ k$ is the number of cones, which is the best result among several scaling schemes.

The NT direction defines $ \Theta$, and $ W$, such that

$\displaystyle \bar{x}=\bar{s}.$

Hence, $ s=(\Theta W)^2x.$ The definitions of $ \Theta$ and $ W$ are given in the following lemma. See [13] for the uniqueness.

To simplify notations through out this paper, we define the following symbols.

Definition 1.7   Define $ \phi=(x,\tau,y,s,\kappa)$, $ \chi=(x,\tau)$ and $ \varsigma=(s,\kappa).$

next up previous
Next: Existence of step length Up: Interior Point Methods for Previous: Interior Point Methods for
Hans D. Mittelmann 2003-09-10