================================================================================================
Benchmark to measure CSV read/write performance
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_372-b07 on Linux 5.15.0-1040-azure
Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Parsing quoted values:                    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
One quoted string                                 43827          44673         740          0.0      876536.0       1.0X

OpenJDK 64-Bit Server VM 1.8.0_372-b07 on Linux 5.15.0-1040-azure
Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Wide rows with 1000 columns:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 1000 columns                               93035          94150        1041          0.0       93035.3       1.0X
Select 100 columns                                34333          34440         185          0.0       34333.3       2.7X
Select one column                                 28763          28860         116          0.0       28763.1       3.2X
count()                                            7449           7665         300          0.1        7448.9      12.5X
Select 100 columns, one bad input field           50278          50458         175          0.0       50277.6       1.9X
Select 100 columns, corrupt record field          53481          53833         540          0.0       53480.7       1.7X

OpenJDK 64-Bit Server VM 1.8.0_372-b07 on Linux 5.15.0-1040-azure
Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Count a dataset with 10 columns:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Select 10 columns + count()                       13070          13085          19          0.8        1307.0       1.0X
Select 1 column + count()                         11406          11437          35          0.9        1140.6       1.1X
count()                                            2840           2873          30          3.5         284.0       4.6X

OpenJDK 64-Bit Server VM 1.8.0_372-b07 on Linux 5.15.0-1040-azure
Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Write dates and timestamps:               Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
Create a dataset of timestamps                     1150           1169          26          8.7         115.0       1.0X
to_csv(timestamp)                                  9488           9499          15          1.1         948.8       0.1X
write timestamps to files                          9194           9205          13          1.1         919.4       0.1X
Create a dataset of dates                          1497           1506          15          6.7         149.7       0.8X
to_csv(date)                                       6030           6041          18          1.7         603.0       0.2X
write dates to files                               5722           5729           7          1.7         572.2       0.2X

OpenJDK 64-Bit Server VM 1.8.0_372-b07 on Linux 5.15.0-1040-azure
Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Read dates and timestamps:                                             Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
-----------------------------------------------------------------------------------------------------------------------------------------------------
read timestamp text from files                                                  1528           1560          28          6.5         152.8       1.0X
read timestamps from files                                                     27594          27600           8          0.4        2759.4       0.1X
infer timestamps from files                                                    54923          54958          49          0.2        5492.3       0.0X
read date text from files                                                       1388           1389           2          7.2         138.8       1.1X
read date from files                                                           13358          13388          43          0.7        1335.8       0.1X
infer date from files                                                          27254          27304          46          0.4        2725.4       0.1X
timestamp strings                                                               2688           2698          11          3.7         268.8       0.6X
parse timestamps from Dataset[String]                                          30710          30731          21          0.3        3071.0       0.0X
infer timestamps from Dataset[String]                                          58123          58211         122          0.2        5812.3       0.0X
date strings                                                                    2804           2805           1          3.6         280.4       0.5X
parse dates from Dataset[String]                                               15409          15459          58          0.6        1540.9       0.1X
from_csv(timestamp)                                                            29102          29113          17          0.3        2910.2       0.1X
from_csv(date)                                                                 15682          15687           6          0.6        1568.2       0.1X
infer error timestamps from Dataset[String] with default format                17912          17926          12          0.6        1791.2       0.1X
infer error timestamps from Dataset[String] with user-provided format          17892          17911          26          0.6        1789.2       0.1X
infer error timestamps from Dataset[String] with legacy format                 17929          17935          10          0.6        1792.9       0.1X

OpenJDK 64-Bit Server VM 1.8.0_372-b07 on Linux 5.15.0-1040-azure
Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
Filters pushdown:                         Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
w/o filters                                       17003          17018          14          0.0      170025.5       1.0X
pushdown disabled                                 17092          17103          10          0.0      170919.6       1.0X
w/ filters                                         1340           1352          13          0.1       13395.9      12.7X
