==== astype vs. to_numeric ====
Example 1:
Say we are trying to convert a column of strings to floats, and one of the values is an empty string:
In [1]:
import pandas as pd
df = pd.DataFrame({'foo': ['1', '1.52', '', '1.6']})
df
Out[1]:
    foo
0     1
1  1.52
2
3   1.6
astype('float') raises a ValueError because the empty string cannot be parsed as a float.
In [2]:
df['foo'].astype('float')
...
ValueError: could not convert string to float: ''
But pd.to_numeric() with errors='coerce' works: the empty string becomes NaN instead of raising an error.
In [3]:
pd.to_numeric(df['foo'], errors='coerce')
Out[3]:
0    1.00
1    1.52
2     NaN
3    1.60
Name: foo, dtype: float64
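One thing to watch with errors='coerce' is that it silently hides bad values. The sketch below is one possible way to list which entries were coerced to NaN; the extra 'n/a' value and the variable names are made up for illustration and are not part of the example above.

```python
import pandas as pd

# Hypothetical sample: same values as above plus one clearly invalid entry.
s = pd.Series(['1', '1.52', '', '1.6', 'n/a'], name='foo')

# astype('float') stops at the first value it cannot parse.
try:
    s.astype('float')
    astype_failed = False
except ValueError:
    astype_failed = True

# to_numeric with errors='coerce' replaces unparseable values with NaN.
converted = pd.to_numeric(s, errors='coerce')

# Entries that were present in the input but ended up as NaN.
coerced = s[converted.isna() & s.notna()]
print(coerced)
```

Checking `coerced` before overwriting the column makes it easy to see whether only empty strings were affected or whether there is other unexpected data.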
tags | astype float but set empty string to NaN
Ref:
* https://stackoverflow.com/questions/35465741/pandas-convert-column-with-empty-strings-to-float
* https://pandas.pydata.org/docs/reference/api/pandas.to_numeric.html
==== code snippets ====
Strip the trailing '%' character from each value and convert the result to a number. Empty strings are converted to NaN.
df["pctchange"] = (
    df["pctchange"].str.rstrip("%").apply(pd.to_numeric, errors="coerce")
)
Ref: http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.apply.html
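An alternative sketch: since pd.to_numeric() accepts a whole Series, the same conversion can probably be written without apply(), which avoids calling the function once per element. The sample values here are invented for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real column would come from your own DataFrame.
df = pd.DataFrame({"pctchange": ["1.5%", "-0.25%", "", "10%"]})

# Strip the trailing '%' and convert the whole column in one call.
# Values that cannot be parsed (here the empty string) become NaN.
df["pctchange"] = pd.to_numeric(
    df["pctchange"].str.rstrip("%"), errors="coerce"
)
print(df)
```

The result is still in percent units (1.5 means 1.5%); divide by 100 if fractions are needed.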