pyspark.sql.functions.date_format
- pyspark.sql.functions.date_format(date, format)
Converts a date, timestamp, or string to a string in the format specified by the second argument.
For instance, the pattern dd.MM.yyyy could return a string like ‘18.03.1993’. All pattern letters of the datetime pattern can be used.
New in version 1.5.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- date: Column or column name
input column of values to format.
- format: literal string
format to use to represent datetime values.
- Returns
Column
string value representing formatted datetime.
Notes
Whenever possible, use specialized functions like year.
Examples
Example 1: Format a string column representing dates
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('2015-04-08',), ('2024-10-31',)], ['dt'])
>>> df.select("*", sf.typeof('dt'), sf.date_format('dt', 'MM/dd/yyyy')).show()
+----------+----------+---------------------------+
|        dt|typeof(dt)|date_format(dt, MM/dd/yyyy)|
+----------+----------+---------------------------+
|2015-04-08|    string|                 04/08/2015|
|2024-10-31|    string|                 10/31/2024|
+----------+----------+---------------------------+
Example 2: Format a string column representing timestamps
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([('2015-04-08 13:08:15',), ('2024-10-31 10:09:16',)], ['ts'])
>>> df.select("*", sf.typeof('ts'), sf.date_format('ts', 'yy=MM=dd HH=mm=ss')).show()
+-------------------+----------+----------------------------------+
|                 ts|typeof(ts)|date_format(ts, yy=MM=dd HH=mm=ss)|
+-------------------+----------+----------------------------------+
|2015-04-08 13:08:15|    string|                 15=04=08 13=08=15|
|2024-10-31 10:09:16|    string|                 24=10=31 10=09=16|
+-------------------+----------+----------------------------------+
Example 3: Format a date column
>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (datetime.date(2015, 4, 8),),
...     (datetime.date(2024, 10, 31),)], ['dt'])
>>> df.select("*", sf.typeof('dt'), sf.date_format('dt', 'yy--MM--dd')).show()
+----------+----------+---------------------------+
|        dt|typeof(dt)|date_format(dt, yy--MM--dd)|
+----------+----------+---------------------------+
|2015-04-08|      date|                 15--04--08|
|2024-10-31|      date|                 24--10--31|
+----------+----------+---------------------------+
Example 4: Format a timestamp column
>>> import datetime
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([
...     (datetime.datetime(2015, 4, 8, 13, 8, 15),),
...     (datetime.datetime(2024, 10, 31, 10, 9, 16),)], ['ts'])
>>> df.select("*", sf.typeof('ts'), sf.date_format('ts', 'yy=MM=dd HH=mm=ss')).show()
+-------------------+----------+----------------------------------+
|                 ts|typeof(ts)|date_format(ts, yy=MM=dd HH=mm=ss)|
+-------------------+----------+----------------------------------+
|2015-04-08 13:08:15| timestamp|                 15=04=08 13=08=15|
|2024-10-31 10:09:16| timestamp|                 24=10=31 10=09=16|
+-------------------+----------+----------------------------------+