Thursday, November 4, 2010

Which cones are needed to represent almost all convex optimization problems?

Recently I visited my Professor Yinyu Ye at Stanford university.  I worked with during my ph.d. studies where we did some nice work together. Now we hope we might have some new ideas to explore. More about that later. During my visit I also talked Stephen Boyd and Jacob Mattingley. Jacob also gave me a presentation of his interesting ph.d. work which he defended while I was at Stanford.

During a lunch with Yinyu, Stephen and some other Standford guys Stephen said something like: "Almost all convex optimization problem can be formulated using a combination of  linear, quadratic, semi-definite and the exponential cones". It is a view I am sharing. Given it is true it has an important practical ramification I will return to shortly. Stephen is aware that his statement is hard to prove but I think it is true like almost all large scale LPs are sparse. It is not something  that is not universally true but it is true in practice.

Please note that any convex problem can be formulated in conic form but Stephens postulate that only very few cone types are require. Moreover, currently it is only the exponential cone we do not know how to deal with in practice because it is a so-called nonsymmetric cone.

Occasionally at MOSEK support we get a question like: "I have a nonlinear model. How do I solve it with MOSEK?"  My first reply is always: "Is your model convex?" because MOSEK only deals with convex models. If the user knows what convexity is then user will usually reply yes. Maybe, the user will add that it looks complicated to solve general convex convex models with MOSEK. That is particularly true if you do not use a modeling language like AMPL or GAMS. Here comes Stephen Boyds observation handy because if your model does not include exponential terms (or logarithmic terms) or semi-definite terms then most likely the problem can be formulated as a conic quadratic optimization problem. That is fortunate because
  • conic quadratic optimization problems are as easy as linear problems to deal with software wise. The user does not have to bother with gradients and Hessians for instance.
  • the optimizer for conic quadratic optimization problems is much more robust than the optimizer for general convex optimization problems.
Recently after I came back from Stanford an user wrote to MOSEK  support: "I have this complicated convex problems that cannot be formulated as conic quadratic optimization problem. How should I solve it with MOSEK?" After some back and forth I got him to reveal that it essentially had nonlinear functions of the type


      max(x,0)^2 <= t

Now that is in fact conic quadratic representable as follows

     s^2 <= t
     x    <= s


So, it turned that the complicated convex model is a conic quadratic representable contrary to the initial statement.


The upshot from Stephen Boyds statement is that if your model is convex and does not include exponential or logarithmic terms then is very likely it can represented by conic quadratic or a semi-definite problem. I think that is a very useful observation in practice.

PS. Usually when reformulating a problem as conic quadratic optimization problem then the number of constraints and variables are expanded. Some thinks that inefficient. However, it should be noted that the structure introduced is very sparse and hence does not hurt performance much.  There even some cases (thinks QPs with a low rank dense Hessian) where the conic quadratic representation is bug but requires much less space when stored sparse.

Wednesday, August 25, 2010

Exploiting the Farkas certificate in Dantzig-Wolfe decomposition

Paul Rubin ask in the comments of his blog post how to exploit a Farkas certificate Dantzig-Wolfe decomposition. I will answer that question here since it is very simple and not well-known. It is also fun to answer.

So let us assume you do

min c_1' x_1 + c_2' x_2
st.  A_1 x_1 + A_2 x_2 = b
     B_1 x_1           = f_1              (P)
     B_2 x_2           = f_2
    x_1, x_2 >= 0

In DW the master problem looks like

min h_1' z_1  + h_2' z_2
st   H_1  z_1 + H_2 z_2 = b          [y]
      e' z_1                       = 1          [u_1]
                        e' z_2    = 1           [u_2]
      z_1, z_2 >= 0

e is the vector of all ones. y, u_1 and u_2 are dual variables associated with the constraints.        

So it DW you have master problem and I will *NOT* assume that it is feasible. On other hand I will assume
it is infeasible and the LP optimizer provides a Farkas certificate of infeasibility denoted y, u_1 and u_2 i.e.

b'y + u_1 + u_2 > 0,
H_1'y + e u_1 <=0,
H_2'y + e u_2 <= 0

So if you do DW you will have to add columns to the master problem that invalids infeasibility certificate of the master problem.. Finding such column you do using the sub problems

min (0-A_j^T y)' x_j - u_j
st. B_j x_j =f_j
    x_j  >= 0

An interpretation of the subproblem is that we find a new column to the master problem that invalidates the Fakas certificate to the master problem as much as possible.

Now it is easy to verify that

  • Assume the objective value to a sub problem is negative, then the optimal solution can be used to generate  a new (useful) column to the master problem.
  • Now if all the objective values to the problem are nonnegative then the problem (P) is infeasible. Hint: Can you specify a Farkas certificate to (P)?

Wednesday, July 21, 2010

A small excercise in conic quadratic optimization.

Assume we have the small conic quadartic problem

min c'x

st. ax=b
     x_1 = l_1
     x_2 = l_2
     x_1 >= sqrt(x_2^2 + x_3^2)

c and a are arbitrary 3 dimensional vectors.  l_1 and l_2 are scalar constants.

The above problem is in fact a 1 dimensional LP.  The exercise is:
  1. Write the 1 dimensional LP?
  2. State the corresponding dual problem.
  3. Show how to recover the dual solution to conic problem from the dual solution to the the LP.
  4. What if l_1 - |l_2| = 0? Does it give rise to problems?

Tuesday, April 20, 2010

A question about the conic representation of the power function.

My previous blog item discussed how to represent the set

x^5/3 <= t, x>=0

using linear and quadratic cones. Now the book by Ben-Tal and Nemirovski shows how to represent the set


x^p/q <= t , x>=0       (1)

using conic quadratic cones where p and q are integers and p>=q>=1 which much more general than my example of course. Now the method suggested by Ben-Tal and Nemirovsky is not optimal in a complexity sense. Hence, I do not think it leads to the minimal number of quadratic cones. Therefore, a natural question  is: What is the optimal conic quadratic representation of (1)?

Friday, April 16, 2010

Is x raised to the power 5/3 conic quadratic representable?

The set


   x^5/3 <= t, x,t>=0.0 

can be represented by

x^2    <= 2 s t,  s,t >= 0,
u         =     x,
v         =     s, 
z         =     v,
z^2   <= 2fg,  f,g > = 0,
f         =     0.5,
4 g      =     h,
h^2  <=  2uv,  u,v  >= 0.

I will leave it to reader to verify it is correct.

this particular set I have come up over again and again in financial applications. I suppose it has something to do with modeling of transactions costs.

An obvious question is why replace a simple problem but something that looks quite a bit complicated.
However, both in theory and practice the conic quadratic optimization problems is easier to solve then general convex problems.

Friday, March 19, 2010

A conic quadratic representable set.

Assume you have the problem

min v
st.   |x|^3 / y^2 <= v, y>=0

A customer asked me if that can be formulated as conic quadratic problem. Indeed it can and my solution is

z^2  <=  2ty
2t      <= a
a^2   <= 2 uv
z         =  2u
x       <= z
-x      <= z
0       <= y,t,u,v
 

Effective cuts for mixed integer goal programs

Assume you have the problem

min c'x + ||Fx-f||_1
st.   A x = b
       x binary

This is sort of a goal programming model I have seen frequently.  Now the typical LP formulation is


min c'x +e'(s+t)
st.   F x - f =  s-t
      A x = b
      s,t >= 0
       x binary

Note this model includes continuous variables too.  At least for an application I have seen then

  1. It easy to find a feasible solution.
  2. There is huge gap between the LP relaxation and the feasible solution.
  3. It is hard to close the gap. A huge number of nodes is required and the standard cuts works poorly.
Then leads to the question. What is good cuts for this type of problems?

Tuesday, February 2, 2010

Windows PowerShell

I have played a bit around with the Windows PowerShell but is now going back to the DOS command line. What I was looking for was something like Linux bash for Windows. PowerShell is not that and in my opinion it sucks.

Here are some problems
  • It has a history but works nowhere nearly as well on bash. For instance it does not save the history automatically.
  • The terminal is still lousy with respect to copy and paste.
  • If you C program sends a lot of output to the terminal then t is so slow.
  • If you press ctrl break then it closes down the whole console. That is not what the DOS command does and hence the close down too frequently.
So a good advice is. If you are looking for a replacement for Linux bash then do not waste time on Windows PowerShell. Maybe the Hotwire Shell (http://code.google.com/p/hotwire-shell/) is worth looking at. Let me know if you a better suggestion.

Monday, January 11, 2010

From Eudora to Gmail.

I have used the email client Eudora since about 1997 when I did a post doc at TU Delft.  However, now Eudora is no longer really support so I thought I should switch to another email client and decided to go for Googles Gmail.

I exported all my old emails Eudora to Gmail using the IMAP mode. It was fairly easy. However, it seems that Eudora adds custom headers to emails so not all old emails shows up properly in Gmail.  However, installing the script http://userscripts.org/scripts/show/62058 that works in Firefox, then most old emails shows up properly.

To summarize I am very happy about my switch from Eudora to Gmail. It is so nice to have all your emails available online everywhere.