Performance improvement hints [Python]

If a column, e.g. the timestamp, is frequently used to access DataFrame rows, set it as the index. If that column holds dates or times stored as strings, convert it from string to datetime format first.
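A minimal sketch of this hint. The sample data and the column names "timestamp" and "value" are assumptions made for illustration only:

```python
import pandas as pd

# Hypothetical sample data; the column names are assumptions for illustration.
df = pd.DataFrame(
    {
        "timestamp": [
            "2021-01-01 00:00:00",
            "2021-01-01 00:01:00",
            "2021-01-01 00:02:00",
        ],
        "value": [1.0, 2.5, 4.0],
    }
)

# Convert the string column to datetime before using it as the index.
df["timestamp"] = pd.to_datetime(df["timestamp"], format="%Y-%m-%d %H:%M:%S")

# Make the timestamp the index so that rows can be looked up by label.
df = df.set_index("timestamp")

# Label-based access through the index.
row = df.loc[pd.Timestamp("2021-01-01 00:01:00")]
```

Sorting the index (df.sort_index()) may speed up lookups further, depending on the data.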


NumPy vectorization. Call .values (or .to_numpy() in pandas 0.24 and later) on DataFrame columns to convert them to NumPy arrays, then iterate over the arrays rather than over the DataFrame, e.g. using list comprehensions.
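A small sketch of the idea, with made-up columns "a" and "b" (assumed names, not from the original):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Pull the columns out as NumPy arrays once, outside the loop.
a = df["a"].to_numpy()  # use df["a"].values on pandas older than 0.24
b = df["b"].to_numpy()

# Iterate over the plain arrays instead of the DataFrame rows.
sums = [x + y for x, y in zip(a, b)]

# When the operation is simple arithmetic, the arrays can be combined
# directly and no Python-level loop is needed.
vectorized = a + b
```

The list comprehension is useful when the per-element logic cannot easily be written as an array expression; otherwise the fully vectorized form is usually the faster choice.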

Pass the format parameter to the pandas to_datetime function to speed up parsing.
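For example, assuming the strings follow a day.month.year layout (the data and the format string here are illustrative assumptions):

```python
import pandas as pd

# Hypothetical strings; the layout "%d.%m.%Y %H:%M" is an assumption.
raw = pd.Series(["01.02.2021 10:30", "15.03.2021 23:59"])

# With an explicit format, pandas does not have to infer the layout
# of the strings before parsing them.
parsed = pd.to_datetime(raw, format="%d.%m.%Y %H:%M")
```

An explicit format also removes ambiguity between day-first and month-first dates.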

Pass a list of column names to operate on several DataFrame columns at once instead of looping over the columns. The list-based form could be dozens of times faster than the equivalent loop.
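The original snippets for this hint are not available, so the following is only one possible reading of it. The column names and the scaling operation are assumptions, and the actual speed-up depends on the data and the pandas version:

```python
import pandas as pd

# Hypothetical data and column names for illustration.
cols = ["a", "b", "c"]
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6], "label": ["x", "y"]})

# List-based form: one operation on all selected columns.
scaled = df.copy()
scaled[cols] = scaled[cols] * 2

# Loop-based form: the same result, computed one column at a time.
looped = df.copy()
for col in cols:
    looped[col] = looped[col] * 2
```

Both variants give the same result; timing them on real data (e.g. with timeit) shows whether the list-based form pays off in a given case.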

[.. to be continued ..]


