pyspark.sql.functions.date_format

pyspark.sql.functions.date_format(date, format)

Converts a date, timestamp, or string to a string in the format specified by the second argument.

For instance, the pattern dd.MM.yyyy could return a string like '18.03.1993'. All pattern letters of the datetime pattern can be used.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
date: Column or column name

input column of values to format.

format: literal string

format to use to represent datetime values.

Returns
Column

string value representing formatted datetime.

Notes

Whenever possible, use specialized functions like year.

Examples

Example 1: Format a string column representing dates

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('2015-04-08',), ('2024-10-31',)], ['dt'])
>>> df.select("*", sf.typeof('dt'), sf.date_format('dt', 'MM/dd/yyyy')).show()
+----------+----------+---------------------------+
|        dt|typeof(dt)|date_format(dt, MM/dd/yyyy)|
+----------+----------+---------------------------+
|2015-04-08|    string|                 04/08/2015|
|2024-10-31|    string|                 10/31/2024|
+----------+----------+---------------------------+

Example 2: Format a string column representing timestamp

>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('2015-04-08 13:08:15',), ('2024-10-31 10:09:16',)], ['ts'])
>>> df.select("*", sf.typeof('ts'), sf.date_format('ts', 'yy=MM=dd HH=mm=ss')).show()
+-------------------+----------+----------------------------------+
|                 ts|typeof(ts)|date_format(ts, yy=MM=dd HH=mm=ss)|
+-------------------+----------+----------------------------------+
|2015-04-08 13:08:15|    string|                 15=04=08 13=08=15|
|2024-10-31 10:09:16|    string|                 24=10=31 10=09=16|
+-------------------+----------+----------------------------------+

Example 3: Format a date column

>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (datetime.date(2015, 4, 8),),
...     (datetime.date(2024, 10, 31),)], ['dt'])
>>> df.select("*", sf.typeof('dt'), sf.date_format('dt', 'yy--MM--dd')).show()
+----------+----------+---------------------------+
|        dt|typeof(dt)|date_format(dt, yy--MM--dd)|
+----------+----------+---------------------------+
|2015-04-08|      date|                 15--04--08|
|2024-10-31|      date|                 24--10--31|
+----------+----------+---------------------------+

Example 4: Format a timestamp column

>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (datetime.datetime(2015, 4, 8, 13, 8, 15),),
...     (datetime.datetime(2024, 10, 31, 10, 9, 16),)], ['ts'])
>>> df.select("*", sf.typeof('ts'), sf.date_format('ts', 'yy=MM=dd HH=mm=ss')).show()
+-------------------+----------+----------------------------------+
|                 ts|typeof(ts)|date_format(ts, yy=MM=dd HH=mm=ss)|
+-------------------+----------+----------------------------------+
|2015-04-08 13:08:15| timestamp|                 15=04=08 13=08=15|
|2024-10-31 10:09:16| timestamp|                 24=10=31 10=09=16|
+-------------------+----------+----------------------------------+