source_localization.proximal_gradient module

source_localization.proximal_gradient.proximal_gradient(x0, fun1, proximal_fun2, args=(), options=None, callback=None)[source]

Minimize a function of the shape \(f(x)+g(x)\) using the proximal gradient method, with \(f\) a differentiable function and \(g\) a function whose proximal operator is known, e.g. a LASSO or group-LASSO regularization term).

The proximal gradient method, also refered to as ISTA, combines the gradient-descent step for the differentiable part with the proximal step for the non-differentiable part as following:

  • Initialization: Starting from an initial point \(x_0\).

  • For each iteration \(k\):

    1. compute the gradient of the function \(\nabla f(x_k)\) at the current point \(x_k\),

    2. Update the estimate of \(x\) using the rule \(x_{k+1} = \text{prox}^g_\alpha(x_k - \alpha \nabla f(x_k))\) where \(\text{prox}^g_\lambda\) is the proximal operator of \(g\) and \(\alpha\) is the step size of the gradient descent step.

  • Convergence Check: the algorithm stops when the maximum number of iterations is reached.

An accelerated version, called FISTA, incorporates an inertial term, that takes into account not only \(x_k\) but also \(x_{k-1}\), to speed up convergence.

Parameters:
  • x0 (ndarray) – Initial point \(x_0\) from which the iterative optimization algorithm starts.

  • fun1 (callable) – Method that, given \(x\), returns the evaluation of the function \(f(x)\) and its gradient \(\nabla f(x)\).

  • proximal_fun2 (callable) – Method that, given \(x\) and a parameter \(\lambda\), returns the evaluation of the function \(\text{prox}^g_\lambda(x)\).

  • args (tuple, optional) – Arguments that should be given to the callable fun1() besides the array containing the current value of \(x\).

  • options (dict, optional, default: None) –

    Dictionary containing several options to customize the proximal gradient method and its stopping criteria. These options include:

    • ’algorithm’: str, default: ‘ISTA’, the type of algorithm used (‘ISTA’ or ‘FISTA’).

    • ’step_size’: float, default: 1, the step size of the gradient descent.

    • ’nit_max’: int, default: 50, the maximal number of iterations.

  • callback (callable, optional) – Method called at each optimization iteration. The method should take as sole input the array containing the current value of \(x\).

Returns:

  • x (ndarray) – Optimal value \(x^*\).

  • f (float) – Evaluation of the function \(f\) at the optimal value \(f(x^*)\).

  • df (ndarray) – Evaluation of the gradient of the function \(f\) at the optimal value \(\nabla f(x^*)\).

  • nit (int) – The number of iterations needed to converge.