Abstract

Negative distance kernels $$K(x,y) := -\Vert x-y\Vert$$ have been used in the definition of maximum mean discrepancies (MMDs) in statistics and lead to favorable numerical results in various applications. In particular, so-called slicing techniques for handling high-dimensional kernel summations profit from the simple, parameter-free structure of the distance kernel. However, due to its non-smoothness at $$x=y$$, most of the classical theoretical results, e.g., on Wasserstein gradient flows of the corresponding MMD functional, no longer hold true. In this paper, we propose a new kernel which keeps the favorable properties of the negative distance kernel, namely being conditionally positive definite of order one with a nearly linear increase towards infinity and a simple slicing structure, but which is additionally Lipschitz differentiable. Our construction is based on a simple 1D smoothing procedure of the absolute value function followed by a Riemann–Liouville fractional integral transform. Numerical results demonstrate that the new kernel performs comparably to the negative distance kernel in gradient descent methods, but now with theoretical guarantees.
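To make the setting concrete, here is a minimal Python sketch of the two standard facts the abstract builds on: the empirical squared MMD under the negative distance kernel (which coincides, up to a factor, with the energy distance and is therefore nonnegative), and the slicing identity $$\Vert z\Vert = c_d\, \mathbb{E}_{\xi}\vert\langle \xi, z\rangle\vert$$ for $$\xi$$ uniform on the unit sphere, which reduces the d-dimensional kernel to averages of the 1D kernel $$-\vert t\vert$$. This is an illustration of the background, not the paper's implementation; the function names are ours, and the paper's new smoothed kernel is deliberately not reproduced here, since its precise definition requires the Riemann–Liouville construction given in the text.

```python
import numpy as np
from scipy.special import gamma

def neg_distance_kernel(x, y):
    """Negative distance kernel K(x, y) = -||x - y||, non-smooth at x = y."""
    return -np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)

def mmd_squared(X, Y, kernel):
    """Empirical (biased, V-statistic) squared MMD between samples X (n, d) and Y (m, d)."""
    return kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()

rng = np.random.default_rng(0)
d = 10
X = rng.normal(0.0, 1.0, size=(400, d))
Y = rng.normal(0.5, 1.0, size=(400, d))
# Nonnegative since K is conditionally positive definite of order one;
# close to zero only when the two empirical measures agree.
print("MMD^2:", mmd_squared(X, Y, neg_distance_kernel))

# Slicing identity: ||z|| = c_d * E_xi |<xi, z>| with c_d = sqrt(pi) * Gamma((d+1)/2) / Gamma(d/2),
# so evaluating the d-dimensional distance kernel reduces to 1D projections.
c_d = np.sqrt(np.pi) * gamma((d + 1) / 2) / gamma(d / 2)
P = 2000                                          # number of random projection directions
xi = rng.normal(size=(P, d))
xi /= np.linalg.norm(xi, axis=1, keepdims=True)   # uniform directions on the unit sphere
z = X[0] - Y[0]
sliced = c_d * np.abs(xi @ z).mean()
print("||z|| exact:", np.linalg.norm(z), " sliced estimate:", sliced)
```

The second half of the sketch is what the abstract's "simple slicing structure" refers to: since the d-dimensional kernel is an average of the 1D profile $$-\vert t\vert$$ over random directions, it suffices to smooth that 1D profile, which is the route the paper takes.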
Published in: Advances in Computational Mathematics
Volume 52, Issue 2