next up previous contents
Next: Conclusion Up: Optimization examples Previous: Delay   Contents


Multi-objective optimizations

The multi-objective optimization means to optimize at the same time different target, that is, for example minimize contemporarily the delay and the power, or the power and the area, and so on. From §5.2.2.1, §5.2.2.2 and §5.2.2.3 we have seen that some of these goals clash. These clashes are briefly summarized in table 5.4.


Table 5.4: Agreements of targets 
  Area Delay Power
Area -- $ \spadesuit$ $ \heartsuit$
Delay $ \spadesuit$ -- $ \spadesuit$
Power $ \heartsuit$ $ \spadesuit$ --

So, for example, optimizing together delay and power, i.e. minimizing both, it is not possible: the power is minimized when all the transistors are at minimum width, while minimizing the delay involves to have some transistors (maybe all) at a width greater than the minimum.
This disagreement among some optimization targets leads to new possible definition(s) of ``multi-objective'' optimization:

i)
there is a primary target to be optimized, and one or more secondary targets to be taken into account: then we may define a threshold on the latter. The algorithm goes on optimizing the primary target, being careful on maintaining all the secondary targets below the threshold; or
ii)
there are only primary targets, and each target account into the total objective function with a relative weight, which indicates how much the final solution should depend on the corresponding target; or
iii)
both the previous definitions.

The most suitable policy is the second, because it gives to each target the same priority with different importance. The first alternatives leads to a sub-optimal optimization since: first, the designer must know which are the order of magnitude of the targets, in order to impose a limit on them; second, not the whole space of solutions may be explored with such constraints.

In the case of primary target with relative weights, we have chosen the sum of relative weights to represents the entire normalized objective function, that is the sum of relative weights must be equal to one.
Given the results of §4.1.2 (page [*]) then the total objective function to be minimized is a linear combination of the delay ( $ \mathcal{D}$), power ( $ \mathcal{P}$) and area ( $ \mathcal{A}$):

$\displaystyle O = \alpha \mathcal{D} + \beta \mathcal{P} + \gamma \mathcal{A}\, ,$ (6.31)

where $ O$ is the total objective function and where

$\displaystyle \alpha \geq 0,\; \beta\geq 0,\; \gamma\geq 0, \quad\alpha + \beta + \gamma = 1$    

From the point of view of the user of the optimizer, specifying this kind of weights means to have the possibility to see this weights as a measure of how much the corresponding target matters in the final solution: for example specifying $ \alpha=0.5$, $ \beta=0.5$ and $ \gamma=0$ means that we want to optimize the delay at the $ 50\%$ and the power at the $ 50\%$.

The subtle point in the eq. (5.4) is that the quantities $ \mathcal{D}$, $ \mathcal{P}$ and $ \mathcal{A}$ are not commensurable, that is order of magnitude of the quantities may not be same. Let's think only to the unit of measure: if, for example, the delay is measured in picosecond (e.g. 1000 ps), the power is measured in Joule (e.g. $ 10^{-13}$ J). When one quantity is very greater than the others, then all the changes in the latter quantities disappear in the total sum.

In order to overcome the problem of the non-commensurable quantities in eq. (5.4), all the terms comprising the sum should be normalized. The mathematical theory of optimization states that each term should be normalized dividing them by the optimum found optimizing only that particular term. This implies an a-priori knowledge of the optimum of each term, and so of the total weighted sum. At every moment of the optimization run is possible to know the distance between the actual solution and the optimum.
This is not practically feasible for a circuit optimization, since it would involve the run of mono-objective optimizations, one for each term of the sum, and then the run of the final multi-objective optimization. This would lead to a total session of the optimization unacceptable, both for the time it will takes and for the resources it will occupy.

Thus the normalization applied here is the division of each quantity for its corresponding maximum: a maximum of the delay occurs when all the transistors are at minimum width, while the maximum of the power and of the area is measured when all the transistors are at the maximum allowed width in the optimization session (being careful that choosing a too large maximum allowed width will result in a power and area term too little).

The total normalized optimization objective function becomes then:

$\displaystyle O = \alpha \frac{\mathcal{D}}{\mathcal{D}\rvert_{\textrm{min widt...
...ths}}} + \gamma \frac{\mathcal{A}}{\mathcal{A}\vert _{\textrm{max widths}}}\, .$ (6.32)

Choosing all the combinations of the parameters $ \alpha$, $ \beta$ and $ \gamma$ it is possible to obtain an optimized circuit in which the delay, the power consumption and the area occupancy account more or less.


Table 5.5: Full-adder: delay and power optimization 
Delay [ps] Energy [pJ] Area [ $ \boldsymbol{\mu}\mathbf{m^2}$]
Full-adder Pre-opt. Post-opt. Pre-opt. Post-opt. Pre-opt. Post-opt.
41.2inStatic $ \mathbf{0.7}\,\boldsymbol{\mu}\mathbf{m}$ technology
1781 1156 6.475 22.34 43 110.5
$ \mathbf{0.25}\,\boldsymbol{\mu}\mathbf{m}$
571.3 429.5 3.155 13.63 17 83.12
41.2inTSPC (one-stage) $ \mathbf{0.7}\,\boldsymbol{\mu}\mathbf{m}$ technology
930.6 744.1 2.168 3.921 26 62.8
$ \mathbf{0.25}\,\boldsymbol{\mu}\mathbf{m}$ technology
276.7 187.1 0.641 1.879 13 41.6


Table 5.6: Full-adder: optimizations comparison among two kinds of optimization and the minimum widths results. The number in the parentheses shows the worsening (if positive) or the improvement (if negative) of the power-delay optimization from the full-delay optimization. 
$ \boldsymbol{\Delta}$Delay $ \boldsymbol{\Delta}$Energy $ \boldsymbol{\Delta}$Area
Full-adder $ \boldsymbol{\alpha}\mathbf{=1}$ $ \boldsymbol{\alpha}\mathbf{=}\boldsymbol{\beta}\mathbf{=0.5}$ $ \boldsymbol{\alpha}\mathbf{=1}$ $ \boldsymbol{\alpha}\mathbf{=}\boldsymbol{\beta}\mathbf{=0.5}$ $ \boldsymbol{\alpha}\mathbf{=1}$ $ \boldsymbol{\alpha}\mathbf{=}\boldsymbol{\beta}\mathbf{=0.5}$
40.8inStatic $ \mathbf{0.7}\,\boldsymbol{\mu}\mathbf{m}$ technology
$ \div$1.65 $ \div$1.54 (+7%) $ \times$6.18 $ \times$3.45 (-44.1%) $ \times$5.75 $ \times$2.57 (-43.5%)
$ \mathbf{0.25}\,\boldsymbol{\mu}\mathbf{m}$ technology
$ \div$1.38 $ \div$1.33 (+3.44%) $ \times$35.25 $ \times$4.32 (-87.7%) $ \times$40.74 $ \times$4.89 (-88%)
40.8inTSPC (one-stage) $ \mathbf{0.7}\,\boldsymbol{\mu}\mathbf{m}$ technology
$ \div$2.33 $ \div$1.25 (+85.9%) $ \times$6.18 $ \times$1.81 (-70.7%) $ \times$5.83 $ \times$2.42 (-58.6%)
$ \mathbf{0.25}\,\boldsymbol{\mu}\mathbf{m}$ technology
$ \div$1.75 $ \div$1.48 (+18.2%) $ \times$5.65 $ \times$2.93 (-48.1%) $ \times$6.18 $ \times$3.20 (-48.3%)

If for, for example the same full-adder of table 5.3 (page [*]) are optimized both for delay and for power in the same measure, i.e. in equation (5.5) $ \alpha=0.5$, $ \beta=0.5$ and $ \gamma=0$, we obtain the results of table 5.5.
The comparison of the full delay optimization (mono-objective) and delay-power optimization (multi-objective) is sketched in table 5.6: as we can see between the full-delay optimization and the power-delay optimization (50%-50%) there is a slightly worsening in the delay of the final circuit (from 5.2% to 46.9%); at the same time there is an effective improvement in the power consumption: the power dissipation decreases from $ 5.8\%$ to $ 76.5\%$.

A more complete survey of the optimization results of the circuit presented in this chapter can be found in chapter 7 (page [*]).


next up previous contents
Next: Conclusion Up: Optimization examples Previous: Delay   Contents
marco+site@equars.com