next up previous contents
Next: Multi-objective optimization Up: Constrained problem Previous: Constrained problem   Contents

Lagrange multiplier and Penalty functions

The first method defines a Lagrangian function:

$\displaystyle L(\mathbf{x},\mathbf{\lambda})=f(\mathbf{x})+\sum_{i=1}^{m}\lambda_{i} g_i(\mathbf{x})$ (5.18)

If we define $ \mathbf{x}^\star$ as the solution that:

$\displaystyle \mathbf{x}^\star = \min_{x\in X} f(\mathbf{x}) \qquad g_i(\mathbf{x}^\star) \leq 0, \mathstrut i=1, 2, \dotsc, m$    

then we can write the necessary Kuhn-Tucker conditions for the existence of the minimum:

  $\displaystyle \nabla_x L(\mathbf{x}^\star,\mathbf{\lambda}^\star) = 0$ (5.19)
  $\displaystyle \nabla_{\lambda} L(\mathbf{x}^\star,\mathbf{\lambda}^\star) \leq 0$ (5.20)
  $\displaystyle (\mathbf{\lambda}^\star)^Tg(\mathbf{x}^\star)=0$ (5.21)
  $\displaystyle \mathbf{\lambda}^\star \geq 0$ (5.22)

In order to find out sufficient conditions, we define the saddle-point conditions:

Theorem 5.17   A point $ (\mathbf{x}^\star,\mathbf{\lambda}^\star)$ with $ \mathbf{\lambda}^\star \geq 0$ is a a saddle-point of the Lagrangian $ L(\mathbf{x},\mathbf{\lambda})$ iff
i)
$ \mathbf{x}^\star$ minimizes $ L(\mathbf{x},\mathbf{\lambda})$ over the whole $ X$
ii)
$ g_i(\mathbf{x}^\star) \leq 0, \mathstrut i=1, 2, \dotsc, m$
iii)
$ \mathbf{\lambda}_i^\star g_i(\mathbf{x}^\star) = 0, \mathstrut i=1, 2, \dotsc, m$

It can be proved that if the functions $ f,g$ are even not differentiable but are convex, then the saddle-point conditions are necessary and sufficient conditions. Although these conditions must hold at the minimum, they are not very useful in determining the optimum point. The determination of the optimum by direct solution of these equations is rarely practicable.

A more feasible way is to convert the constrained problem into an unconstrained one, by defining the new objective function:

$\displaystyle P(\mathbf{x},\mathbf{K})=f(\mathbf{x})+\sum_{i=1}^m K_i [g_i(\mathbf{x})]^2$ (5.23)

The sum added to the objective function is called penalty function, since it penalizes the objective function adding a positive quantities (recall that we want to minimize the cost function). The constants $ \mathbf{K}=[\mathstrut K_1, K_2, \dotsc, K_m]^T$ are weighting factors (positive) that define how strongly must be satisfied the $ i$-th constraint, and can also made it commensurable.

Wherever $ \mathbf{x}$ is inside the feasible region, we can ignore the constraints, so a new objective function can be defined as:

$\displaystyle P(\mathbf{x},\mathbf{K})=f(\mathbf{x})+\sum_{i=1}^m K_i [g_i(\mathbf{x})]^2u_i(g_i)$ (5.24)

where $ u_i(g_i)$ is the usual step function:

$\displaystyle u_i(g_i)= \begin{cases}0& \text{if $g_i(\mathbf{x}) \leq 0 $}\\  1& \text{if $g_i(\mathbf{x}) > 0 $} \\  \end{cases}$    

The introduction of the step function makes possible to relate the penalty function defined in (4.8) with the Lagrangian function of (4.2) (page [*]):

$\displaystyle P(\mathbf{x},\mathbf{K})=L(\mathbf{\lambda},\mathbf{K})$    

if we let $ \lambda_i=K_ig_i(\mathbf{x})u_i(g_i)$, so that all previous results valid for the Lagrangian function are valid for the penalty function.
Note that the solution $ \mathbf{x}^\star$ found optimizing the penalty function $ P(\mathbf{x},\mathbf{K})$ converges to $ (\mathbf{x}^\star,\mathbf{\lambda}^\star)$, defined by the Kuhn-Tucker conditions, only in the limit $ \mathbf{K}\rightarrow\infty$.


next up previous contents
Next: Multi-objective optimization Up: Constrained problem Previous: Constrained problem   Contents
marco+site@equars.com