SUMMARY & CONCLUSIONS

Chatterjee’s ξ-correlation is a rank-based dependence measure that captures a wide range of functional relationships between variables. While its statistical advantages over classical measures such as Pearson or Spearman correlation are well documented, the computational cost of calculating ξ, especially for large or streaming datasets, has hindered its use in real-time applications.

This paper introduces a sketch-based approximation framework for ξ-correlation that uses an adapted Greenwald-Khanna quantile sketch to reduce memory and computational requirements. The proposed method, called the ξ-sketch, enables efficient online correlation estimation with theoretical error guarantees. We prove that the sketch converges to the true ξ value with approximation error $O(\varepsilon)$ under dependence and $O(\sqrt{\varepsilon})$ otherwise, where ε is the sketch tolerance parameter.

Empirical evaluations on both synthetic and real-world datasets show that the ξ-sketch retains high accuracy across diverse settings while reducing computational cost by over 90% in large-scale regimes. The method is also robust to tied values and non-monotonic dependencies, and it remains interpretable, with no tuning parameters beyond the sketch tolerance ε.

The ξ-sketch makes it practical to use Chatterjee’s correlation in streaming analytics, anomaly detection, and IoT telemetry, where low latency and low memory use are essential. By tuning the sketch’s precision parameter, users can trade accuracy against efficiency to meet deployment constraints. The approach paves the way for scalable, nonparametric quantification of the strength of association between variables. Such quantification helps in understanding how uncertainties and variations in one parameter might affect another, which is critical in structural reliability and engineering.
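
As a point of reference for the approximation summarized above, the exact batch statistic that a sketch would be approximating can be written in a few lines. The following is a minimal illustration of Chatterjee's tie-aware estimator, not the ξ-sketch itself and not the authors' code. The function name `chatterjee_xi` is hypothetical, and ties in x are broken by input order here (an assumption made for determinism; the original definition breaks them at random).

```python
from bisect import bisect_left, bisect_right


def chatterjee_xi(x, y):
    """Exact (batch) Chatterjee xi of y on x, using the tie-aware formula.

    Illustrative baseline only: it sorts the full sample, so it needs
    O(n) memory and O(n log n) time, which is the cost a sketch aims to avoid.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    # Reorder y by increasing x. sorted() is stable, so ties in x keep
    # their input order (assumption: deterministic tie-breaking).
    y_by_x = [yi for _, yi in sorted(zip(x, y), key=lambda p: p[0])]
    y_sorted = sorted(y_by_x)

    # r_i = number of j with y_j <= y_i; l_i = number of j with y_j >= y_i.
    r = [bisect_right(y_sorted, v) for v in y_by_x]
    l = [n - bisect_left(y_sorted, v) for v in y_by_x]

    denom = 2 * sum(li * (n - li) for li in l)
    if denom == 0:
        raise ValueError("y is constant, so xi is undefined")

    num = n * sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1.0 - num / denom


if __name__ == "__main__":
    xs = list(range(-50, 51))
    # Noise-free but non-monotonic relationship (with tied y values).
    print(chatterjee_xi(xs, [v * v for v in xs]))
```

Without ties, the expression reduces to the familiar form 1 − 3 Σ|r_{i+1} − r_i| / (n² − 1). A quantile sketch would replace the exact ranks r and l with approximate ones maintained in bounded memory, which is where the tolerance ε enters.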