Hi, what would be the best way (theoretically and practically) to decide whether a convex but (only) in 0 non-differentiable function attains its global minimum in zero? A possibility would be to use a subgradient method still needing the subgradients. Is there a way to easily test, if 0 is the global minimizer? For example, is it to easy decide if there is a point where 0 is in its subdifferential or to decide if no descent direction in this point exists? Is it sufficient to evaluate for example all points (0+e_i) and (0-e_i) for all unit vectors e_i and show that the objective function value is non-negative?

asked 20 Jan '15, 09:05

Long's gravatar image

accept rate: 0%

Check the following math carefully, since I last studied calculus at Newton's knee. The linear part of your objective function is harmless, so I'm going to ignore it for simplicity. Consider the function \(f(z)=\parallel z\parallel \), which is within a change of variables (\(z=Hx\) where \(Q = H'H\)) of your function. Pick a small \(\epsilon > 0\). Let \[g(x)=\frac{x'x}{4\epsilon}+\frac{3\epsilon}{4}, \]which is convex and smooth. Now define \[h(x)=\begin{cases}f(x) & \parallel x\parallel\ge\epsilon\\g(x) & \parallel x\parallel<\epsilon.\end{cases}\]At any \(x\) with \(\parallel x\parallel = \epsilon\), we have \[g(x)=\frac{\epsilon^{2}}{4\epsilon}+\frac{3\epsilon}{4}=\epsilon=f(x)\]and \[\nabla g(x) = \frac{1}{2\epsilon}x' = \nabla f(x) \] (I think). So you have a continuous first derivative, which should let you optimize \(h\) pretty easily once you confirm that \(h\) is in fact convex. (I'm pretty sure about that but too lazy to check it.)


answered 23 Jan '15, 11:46

Paul%20Rubin's gravatar image

Paul Rubin ♦♦
accept rate: 19%

edited 23 Jan '15, 11:50

Thank you Paul for your very useful comment! There is a small mistake, since the gradient of f is x'/eps, but nothing that cannot be fixed.

(26 Jan '15, 05:59) Long

<sigh>Two senior moments in one response.</sigh> You're right about the gradient of \(f\), which means I should have said \[g(z) = \frac{z'z}{2\epsilon}+\frac{\epsilon}{2}\] (barring more senior moments, of course). The other senior moment, which I realized after I posted, was that I shifted from \(z\) back to \(x\). Everything after "Let" should be in terms of \(z\), not ((x\).

(26 Jan '15, 10:46) Paul Rubin ♦♦

@Long: You mentioned "knapsack constraint" in a comment to my first response. Do you by any chance constrain \(x \ge 0\)?

(26 Jan '15, 10:57) Paul Rubin ♦♦

He probably means \(x \geq 0\) in combination with \(a^T x \leq b\). When the elements of a are nonnegative, it has the interpretation of a knapsack constraint.

(26 Jan '15, 14:50) optimizer

If so, a conceptually simpler solution than hacking the objective function is to keep the original objective and add the constraint \(\sum x_i \ge \epsilon\). Solve that (convex differentiable objective over a polytope) and compare the objective value to the value at 0.

(26 Jan '15, 15:07) Paul Rubin ♦♦

If you allow for a solve step, you could as well optimize the original problem (second order conic optimization problems can be solved almost as efficiently as linear optimization problems).

(26 Jan '15, 16:20) optimizer

Agreed, provided the lack of differentiability at zero does not bother the solver.

(26 Jan '15, 16:50) Paul Rubin ♦♦

You are both right, i assume the variables x to be non-negative. @Paul: what do you mean by keeping the original objective? The original one is not differentiable in zero, i think i might misunderstood something.

The reason why i'm still looking for a cheap (maybe analytical) alternative is that this kind of problem might has to be solved several times as a subproblem. From an implementation point of view i want to keep my overall code as simple as possible without the usage of one more solver if possible.

Doesn't it also depend on the solver if there exist non-differentiable points?

(26 Jan '15, 17:25) Long

If you add the constraint that the \(x\) variables sum to at least \(\epsilon > 0\), the differentiability issue goes away, because 0 is not in the modified feasible region. You just have to live with the possibility that the true optimum has variables summing to a positive value less than \(\epsilon\).

(26 Jan '15, 17:57) Paul Rubin ♦♦

Conic solvers are usually of the interior point type. Interior point solvers will also have no issues with the original formulation because they will never consider x=0.

(28 Jan '15, 08:23) optimizer

True. The added constraint I suggested also lets one try to solve the modified problem "by hand", using the KKT conditions, without bumping into the differentiability problem at the origin.

(28 Jan '15, 10:14) Paul Rubin ♦♦

What i thought is maybe it can be proven that the optimal solution is either 0 or "far away" from zero such that imposing your suggested constraint is even equivalent in case the optimal solution is not zero, implying that taking the minimum of f(0) and the optimal solution of the problem including your constraint gives the optimal solution

(28 Jan '15, 13:43) Long
showing 5 of 12 show 7 more comments

Your question is confusing, since you want to verify if 0 is the optimum but also mention a subgradient method as a possible technique. The latter is an iterative optimization technique that can find the global minimum of a convex function. Using that technique and comparing the solution with 0 is an inefficient way of verifying optimality.

It is not possible to efficiently determine the subgradient numerically. Evaluating all points for unit vectors is insufficient, e.g., take f(x,y) = (x-y)^2 + 0.5(x+y). Then f(0,0) = 0, while the four function values at unit vectors are 1.5, 0.5, 1.5 and 0.5, yet 0 is not optimal (since f(-1,-1) = -1).

Can't you determine the subgradient analytically? What is the function?

Otherwise, the only thing I can think of, is to check if the derivative is 0 at some point other than 0, because that point would be a local optimum. You can compare the function value at that point with the function value at 0.

On a sidenote, for constrained optimization, there are KKT conditions for convex functions that use the subgradient instead of the derivative. See, e.g,. Theorem 3.1 and Corollary 3.5 in this document. Another good reference is the book by Rockafellar on Convex Analysis. One of the criteria for optimality is that 0 is in the subgradient of the Lagrangean at 0, which shows that these conditions are not of any help for unconstrained optimization.


answered 20 Jan '15, 10:55

optimizer's gravatar image

accept rate: 6%

edited 20 Jan '15, 15:28

I'm not positive, but I think you should be able to approximate the original function \(f\) with a convex function \(g\) that is differentiable everywhere and satisfies \(\sup |f(x)-g(x)| \le \epsilon\) for an arbitrary (small) \(\epsilon > 0\). If \(f(0) \le \min_x g(x)\), then 0 is at least \(\epsilon\)-optimal for \(f\) (which is as much of a guarantee as you are going to get from any numerical method, due to rounding and convergence tolerance).


answered 22 Jan '15, 15:37

Paul%20Rubin's gravatar image

Paul Rubin ♦♦
accept rate: 19%

Thank you for your answers.

Actually, I have a linear constraint.

To be more precise, the problem I'm in interested in, is finding out if 0 is the optimizer of the following continuous optimization problem:

\[\min \sqrt{x'Qx} + c'x, s.t. a'x <= b\]

We have that Q is positive semidefinite, \(c \le 0\), \(a \ge 0\) and \(b \ge 0\), so that we have a convex function and a knapsack constraint.

@Paul, can you give me hint of how to compute such a function g?

Maybe a reference?

(22 Jan '15, 17:18) Long

Not enough space in a comment; I'll have to add another answer.

(23 Jan '15, 11:35) Paul Rubin ♦♦

If you decompose \(Q\) into \(PP^T\), the objective can be written as \(||P^T x||+c^Tx\). This means that you have a a second order conic optimization problem. These are easy to optimize, and most conic solvers will give you a primal-dual pair that allows you to verify optimality. You could try to find a dual solution yourself.


answered 23 Jan '15, 08:15

optimizer's gravatar image

accept rate: 6%

edited 23 Jan '15, 13:35

I understand that it can be solved by some SOCP solver. What i'm wondering is, if it can be decided more clever. Actually i don't want to find the optimal solution, but only decide if 0 is the optimal solution or is this question as hard to answer as the seeking for an optimal solution?

(23 Jan '15, 08:54) Long

No, the question becomes much easier. Let us first formulate the problem into standard form: \(\min\{t : v = [Px; t-c'x], a'x-b+s=0, x \in R^n, t \in R, s \ge 0, v \in K\}\). You can now easily write down the KKT conditions for x=0, and try to find a y that satisfies the conditions. Since you only have linear constraints, the Slater condition is satisfied, so the KKT conditions are necessary and sufficient for optimality.

(23 Jan '15, 09:29) optimizer

In general, the KKT system is a system of nonlinear equations, so it can also be solved only numerically by for example some Newton-type methods, right?

(26 Jan '15, 06:01) Long

In this case the primal solution x=0 is given (for the formulation in standard form, you also have values for s,t and v), so just a dual variable y needs to be found. The equations are nonlinear, but since this problem is so small, you may be able to analytically solve the KKT conditions (or show that no such y exists, which shows that 0 is not optimal). In another answer I have worked out these conditions.

(26 Jan '15, 09:08) optimizer

$$ \begin{pmatrix} P^T & 0 & 0 & -I & 0 \\ -c^T & 1 & 0 & 0 & -1 \\ a^T & 0 & 1 & 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ t \\ s \\ v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ b \end{pmatrix} \\ x \in \mathbb{R^n}, t \in \mathbb{R}, s \in \mathbb{R_+}, ||v_1|| \leq v_2 $$ The KKT conditions are:

  • primal feasibility (\(x=0, s = a^T x - b = -b, t = ||P^T x|| + c^T x = 0\); we assume that s is nonnegative)

  • dual feasibility: $$ (1) \quad P y_1 + y_3 a - y_2 c = 0 \\ (2) \quad y_2 = 1 \\ (3) \quad y_3 \geq 0 \\ (4) \quad ||y_1|| \leq y_2 $$

  • complementary slackness, which implies that when \(b=0\), (3) holds with equality (\(y_3 = 0\)).

It now seems to make sense to distinguish between \(b=0\) and \(b \neq 0\). When \(b=0\), obviously \(x=0\) is optimal. Otherwise, we need to find a y that satisfies 1-4 to show that x=0 is optimal; i.e., solve: $$ P y_1 + y_3 a = c \\ y_3 \geq 0 \\ ||y_1|| \leq y_2 $$ Note that the columns in P can be assumed to be linearly independent by construction. I don't see an easy way to conclude if this system is feasible. Maybe someone else can comment on it.


answered 26 Jan '15, 09:59

optimizer's gravatar image

accept rate: 6%

edited 26 Jan '15, 10:06

A related question: Assume that i know the optimal solution is NOT zero, is there a way (preferred analytically, or if not possible with as less effort as possible) to compute a point with objective function value < 0 ? My conjecture is that there is a vertex e that satisfies f(e) < 0 so that i can just enumerate all vertices. But i could not prove it so far nor did i find a counterexample.


answered 11 Mar '15, 09:43

Long's gravatar image

accept rate: 0%

You should post this as a separate question, so as not to confuse future readers with intermingled answers to two different questions.

(11 Mar '15, 10:30) Paul Rubin ♦♦

Hi Paul, thank you. By the way how can if write math code here?

(11 Mar '15, 10:42) Long
Your answer
toggle preview

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here



Answers and Comments

Markdown Basics

  • *italic* or _italic_
  • **bold** or __bold__
  • link:[text](http://url.com/ "Title")
  • image?![alt text](/path/img.jpg "Title")
  • numbered list: 1. Foo 2. Bar
  • to add a line break simply add two spaces to where you would like the new line to be.
  • basic HTML tags are also supported



Asked: 20 Jan '15, 09:05

Seen: 2,204 times

Last updated: 11 Mar '15, 10:42

OR-Exchange! Your site for questions, answers, and announcements about operations research.