pyspark.sql.functions.regr_sxy

pyspark.sql.functions.regr_sxy(y, x)

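Aggregate function: returns REGR_COUNT(y, x) * COVAR_POP(y, x) computed over the pairs in a group where both y and x are non-null, where y is the dependent variable and x is the independent variable. This is equivalent to the sum of the products of the deviations of y and x from their respective means.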

New in version 3.5.0.

Parameters
y : Column or str

the dependent variable.

x : Column or str

the independent variable.

Returns
Column

REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group.

Examples

>>> from pyspark.sql import functions as sf
>>> x = (sf.col("id") % 3).alias("x")
>>> y = (sf.randn(42) + x * 10).alias("y")
>>> spark.range(0, 1000, 1, 1).select(x, y).select(
...     sf.regr_sxy("y", "x")
... ).show()
+----------------+
|  regr_sxy(y, x)|
+----------------+
|6696.93065315...|
+----------------+
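
To make the formula concrete, the following is a minimal pure-Python sketch of the quantity described above. It is not Spark's implementation. The helper name `regr_sxy_reference` is hypothetical, and returning `None` for a group with no complete pairs is an assumption of this sketch rather than documented behavior.

```python
from typing import Iterable, Optional, Tuple

Pair = Tuple[Optional[float], Optional[float]]


def regr_sxy_reference(pairs: Iterable[Pair]) -> Optional[float]:
    """Hypothetical reference helper: sum of products of deviations of y and x.

    Only pairs where both y and x are non-null contribute. The result is
    mathematically equal to REGR_COUNT(y, x) * COVAR_POP(y, x).
    """
    # Keep only complete (y, x) pairs.
    complete = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(complete)  # plays the role of REGR_COUNT(y, x)
    if n == 0:
        # Assumption of this sketch: no complete pairs yields no value.
        return None
    mean_y = sum(y for y, _ in complete) / n
    mean_x = sum(x for _, x in complete) / n
    # Summing the deviation products directly avoids dividing by n
    # (to get COVAR_POP) and then multiplying by n again.
    return sum((y - mean_y) * (x - mean_x) for y, x in complete)


# The last two pairs each contain a null and are ignored.
pairs = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, None), (None, 5.0)]
result = regr_sxy_reference(pairs)
print(result)  # 2.0
```

Here the three complete pairs have means of 2.0 for both variables, so the deviation products are 1.0, 0.0, and 1.0, which sum to 2.0.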