My Python/pandas code works fine on macOS, but after moving it to Windows it fails because of type differences, and I get an error when writing to GBQ (Google BigQuery).
The code is as follows:
import math

def formatNumber(x):
    # NaN becomes 0.0 (a float); every other value becomes a string rounded to 8 decimals
    if math.isnan(x):
        f_number = 0.0
    else:
        f_number = str(round(x, 8))
    return f_number
... <reading df from file> ...
print("A")
print(df.info())
df['Date'] = [x.date().strftime("%Y-%m-%d") for x in df['Date']]
df['A'] = [formatNumber(x) for x in df['A']]
# drop duplicates
print(df.shape)
df = df.drop_duplicates()
print(df.shape)
# upload to bigquery
print("B")
print(df.info())
table_schema = [{
    'name': 'Date',
    'type': 'date'
}, {
    'name': 'A',
    'type': 'numeric'
}, {
    'name': 'B',
    'type': 'string'
}]
df.to_gbq('tablename',
          'dbname',
          chunksize=None,
          if_exists='replace',
          table_schema=table_schema,
          credentials=credentials
          )
The output is:
A
<class 'pandas.core.frame.DataFrame'>
Int64Index: 82624 entries, 0 to 9
Data columns (total 13 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    82624 non-null  datetime64[ns]
 1   A       82624 non-null  float64
 2   B       80769 non-null  object
...
dtypes: datetime64[ns](1), float64(6), object(6)
memory usage: 8.8+ MB
None
(82624, 13)
(82624, 13)
[5 rows x 13 columns]
B
<class 'pandas.core.frame.DataFrame'>
Int64Index: 82624 entries, 0 to 9
Data columns (total 13 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Date    82624 non-null  datetime64[ns]
 1   A       82624 non-null  object
 2   B       80769 non-null  object
...
dtypes: datetime64[ns](1), float64(6), object(6)
memory usage: 8.8+ MB
Error message:
File "pyarrow\array.pxi", line 1044, in pyarrow.lib.Array.from_pandas
File "pyarrow\array.pxi", line 316, in pyarrow.lib.array
File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow\error.pxi", line 123, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Expected bytes, got a 'datetime.time' object
Another difference I've noticed between running it on macOS and Windows is that the index changes on macOS, whereas nothing changes on Windows.
MacOS:
- A --> Int64Index: 82624 entries, 0 to 1015
- B --> RangeIndex: 1016 entries, 0 to 1015
Windows:
- A and B --> Int64Index: 82624 entries, 0 to 9
try to change
to
The error you are receiving suggests a type incompatibility: pyarrow expected bytes (string data) for a column but found a datetime.time object in it. This may be caused by a difference in the behavior of the strftime() method of the datetime object on macOS and Windows.
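
One way to narrow this down, as a rough sketch rather than a confirmed fix: the traceback says pyarrow met a datetime.time value while converting a column as string data, so some object-dtype column (possibly one of the columns elided from the output above, not necessarily Date) may hold time values. The helper names find_time_columns and stringify_time_columns and the sample column T below are made up for illustration; they are not part of pandas or pandas-gbq.

```python
import datetime

import pandas as pd


def find_time_columns(df: pd.DataFrame) -> list:
    """Return names of object-dtype columns holding at least one datetime.time value."""
    time_cols = []
    for col in df.columns:
        if df[col].dtype != object:
            continue  # only object columns can hold raw datetime.time values
        if df[col].map(lambda v: isinstance(v, datetime.time)).any():
            time_cols.append(col)
    return time_cols


def stringify_time_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which datetime.time values are replaced by 'HH:MM:SS' strings."""
    out = df.copy()
    for col in find_time_columns(out):
        out[col] = out[col].map(
            lambda v: v.strftime("%H:%M:%S") if isinstance(v, datetime.time) else v
        )
    return out


# Hypothetical sample data; "T" stands in for whichever column holds the time values.
sample = pd.DataFrame({
    "B": ["x", None, "z"],
    "T": [datetime.time(9, 30), datetime.time(17, 0), None],
})
print(find_time_columns(sample))  # ['T']
fixed = stringify_time_columns(sample)
print(fixed["T"].tolist())
```

Running find_time_columns(df) just before df.to_gbq(...) on both machines would show whether the two platforms really produce different column contents. If a column is reported, converting it to strings (as above) or declaring a matching type for it in table_schema are two options to try.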