How to Correctly Read European Thousands and Decimals from a CSV File
Many European locales write numbers with a period (.) as the thousands separator and a comma (,) as the decimal mark, the reverse of the US convention. By default, pandas read_csv assumes a period as the decimal mark and no thousands separator, so a value such as 3.000,12 is not parsed as a number and the column is read as text. To fix this, pass both the thousands and decimal parameters to read_csv.
from io import StringIO
import pandas as pd
test_data = """
col1;col2
A;3.000,12
B;2.000,22
"""
# Only decimal is set: the "." in "3.000,12" is not recognized, so col2 is not parsed as a number.
wrong = pd.read_csv(StringIO(test_data), sep=';', decimal=",")
# If the numbers have both separators, make sure to specify both parameters.
correct = pd.read_csv(StringIO(test_data), sep=';', decimal=",", thousands=".")
print(wrong.dtypes)
print(correct.dtypes)
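
Beyond the dtypes, it can help to confirm that the parsed values are what you expect. The following is a minimal, self-contained sketch of such a check; the variable names (raw, partial, parsed) are illustrative and not part of the example above.

```python
from io import StringIO

import pandas as pd

# Same sample rows as above, written on one line (names are illustrative).
raw = "col1;col2\nA;3.000,12\nB;2.000,22\n"

# Only decimal is set: the "." in "3.000,12" blocks numeric parsing.
partial = pd.read_csv(StringIO(raw), sep=";", decimal=",")

# Both separators are set: col2 is parsed as floating point.
parsed = pd.read_csv(StringIO(raw), sep=";", decimal=",", thousands=".")

# Check whether each version of col2 ended up numeric.
print(pd.api.types.is_numeric_dtype(partial["col2"]))
print(pd.api.types.is_float_dtype(parsed["col2"]))

# Inspect the converted values themselves.
print(parsed["col2"].tolist())
```

Checking with pd.api.types rather than comparing against a specific dtype name keeps the check independent of how a given pandas version labels text columns.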