Let $V$ be a finite-dimensional real inner product space (e.g., the space of $m \times n$ real matrices equipped with the Hilbert-Schmidt inner product $(A,B) \mapsto \mathrm{tr}(AB^\top)$), and let $f:V^2 \to \mathbb R$, $(x,y) \mapsto f(x,y)$, be a continuously differentiable function (say with Lipschitz-continuous gradient, if that helps).
What is a principled strategy for minimizing $f$ on $V^2$ subject to the constraint $x \perp y$?
Of course, one could try something like the following, repeated until convergence:
- Take a gradient descent step on $x$ and $y$.
- Project $y$ onto the orthogonal complement of $x$.
I don't know how rigorous this is, or whether it even converges.
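
To make the heuristic concrete, here is a minimal NumPy sketch of the two steps, for $V$ the space of $m \times n$ matrices. The helper names (`inner`, `project_onto_complement`, `heuristic_descent`), the step size, the iteration count, and the toy objective $f(x,y) = \tfrac12\|x-a\|^2 + \tfrac12\|y-b\|^2$ are all assumptions chosen for illustration. Note that the projection step only moves $y$ with $x$ held fixed, so it restores feasibility but is not claimed to be the nearest feasible pair, and the sketch says nothing about convergence.

```python
import numpy as np

def inner(a, b):
    # Hilbert-Schmidt inner product tr(a b^T), i.e. the sum of entrywise products.
    return float(np.sum(a * b))

def project_onto_complement(x, y):
    # Remove from y its component along x; y is returned unchanged if x is zero.
    xx = inner(x, x)
    if xx == 0.0:
        return y
    return y - (inner(x, y) / xx) * x

def heuristic_descent(grad_f, x0, y0, step=0.1, iters=500):
    # grad_f(x, y) must return the pair (df/dx, df/dy); step and iters are arbitrary.
    x, y = x0.copy(), y0.copy()
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x = x - step * gx                   # gradient step on x
        y = y - step * gy                   # gradient step on y
        y = project_onto_complement(x, y)   # restore x ⊥ y by moving y only
    return x, y

# Toy objective (an assumption): f(x, y) = 0.5*||x - a||^2 + 0.5*||y - b||^2.
rng = np.random.default_rng(0)
a = rng.standard_normal((3, 4))
b = rng.standard_normal((3, 4))

def grad_f(x, y):
    return x - a, y - b

x, y = heuristic_descent(grad_f, np.zeros((3, 4)), np.zeros((3, 4)))
print("constraint violation |<x, y>|:", abs(inner(x, y)))
```

Each iterate is feasible up to floating-point error because the projection is applied last; whether the iterates approach a constrained stationary point is exactly what is in question.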