source_localization.proximal_gradient module

source_localization.proximal_gradient.proximal_gradient(x0, fun1, proximal_fun2, args=(), options=None, callback=None)[source]

Minimize a function of the shape \(f(x)+g(x)\) using the proximal gradient method, with \(f\) a differentiable function and \(g\) a function whose proximal operator is known, e.g. a LASSO or group-LASSO regularization term).

The proximal gradient method, also refered to as ISTA, combines the gradient-descent step for the differentiable part with the proximal step for the non-differentiable part as following:

Initialization: Starting from an initial point \(x_0\).

For each iteration \(k\):

compute the gradient of the function \(\nabla f(x_k)\) at the current point \(x_k\),

Update the estimate of \(x\) using the rule \(x_{k+1} = \text{prox}^g_\alpha(x_k - \alpha \nabla f(x_k))\) where \(\text{prox}^g_\lambda\) is the proximal operator of \(g\) and \(\alpha\) is the step size of the gradient descent step.

Convergence Check: the algorithm stops when the maximum number of iterations is reached.

An accelerated version, called FISTA, incorporates an inertial term, that takes into account not only \(x_k\) but also \(x_{k-1}\), to speed up convergence.

Parameters:

x0 (ndarray) – Initial point \(x_0\) from which the iterative optimization algorithm starts.
fun1 (callable) – Method that, given \(x\), returns the evaluation of the function \(f(x)\) and its gradient \(\nabla f(x)\).
proximal_fun2 (callable) – Method that, given \(x\) and a parameter \(\lambda\), returns the evaluation of the function \(\text{prox}^g_\lambda(x)\).
args (tuple, optional) – Arguments that should be given to the callable fun1() besides the array containing the current value of \(x\).
options (dict, optional, default: None) –
Dictionary containing several options to customize the proximal gradient method and its stopping criteria. These options include:
- ’algorithm’: str, default: ‘ISTA’, the type of algorithm used (‘ISTA’ or ‘FISTA’).
- ’step_size’: float, default: 1, the step size of the gradient descent.
- ’nit_max’: int, default: 50, the maximal number of iterations.
callback (callable, optional) – Method called at each optimization iteration. The method should take as sole input the array containing the current value of \(x\).

Returns:

x (ndarray) – Optimal value \(x^*\).
f (float) – Evaluation of the function \(f\) at the optimal value \(f(x^*)\).
df (ndarray) – Evaluation of the gradient of the function \(f\) at the optimal value \(\nabla f(x^*)\).
nit (int) – The number of iterations needed to converge.