pyspark.sql.Column.substr

Column.substr(startPos, length)

Return a Column containing, for each value in the column, the substring that starts at startPos and spans length characters.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
startPos : Column or int

1-based start position of the substring.

length : Column or int

Length of the substring.

Returns
Column

Column containing the substring of each value in the original Column.

Examples

Example 1. Using integers for the input arguments.

>>> df = spark.createDataFrame(
...      [(2, "Alice"), (5, "Bob")], ["age", "name"])
>>> df.select(df.name.substr(1, 3).alias("col")).collect()
[Row(col='Ali'), Row(col='Bob')]

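Example 2. Using columns for the input arguments. The second argument is a length, not an end index, so the eidx column here supplies substring lengths.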

>>> df = spark.createDataFrame(
...      [(3, 4, "Alice"), (2, 3, "Bob")], ["sidx", "eidx", "name"])
>>> df.select(df.name.substr(df.sidx, df.eidx).alias("col")).collect()
[Row(col='ice'), Row(col='ob')]
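
Example 3. A plain-Python model of the indexing used above. This is only a sketch to make the 1-based start position concrete: the helper substr_model is hypothetical and not part of PySpark, and it assumes a start position of at least 1 and a non-negative length. Other inputs are deliberately not modeled.

```python
def substr_model(value: str, start_pos: int, length: int) -> str:
    """Model Column.substr on one Python string (hypothetical helper).

    Assumes start_pos >= 1 and length >= 0; other inputs are not modeled.
    """
    if start_pos < 1 or length < 0:
        raise ValueError("sketch only covers start_pos >= 1 and length >= 0")
    begin = start_pos - 1  # convert the 1-based position to a 0-based index
    # Python slicing stops at the end of the string, so a length that
    # overruns the value simply yields the remaining characters.
    return value[begin:begin + length]


# Same inputs as Example 1 and Example 2.
print(substr_model("Alice", 1, 3))  # Ali
print(substr_model("Bob", 1, 3))    # Bob
print(substr_model("Alice", 3, 4))  # ice
print(substr_model("Bob", 2, 3))    # ob
```

The last two calls mirror Example 2: a requested length longer than the remaining text returns whatever characters are left rather than raising an error.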