I decided to try odo for handling my CSV data because it is supposedly much faster than pandas, but I can't get it to work.
This is the example from their documentation for migrating a 30 GB CSV file to a MySQL database:
In [1]: %time t = odo('all.csv', 'mysql+pymysql://localhost/test::nyc')
CPU times: user 1.32 s, sys: 304 ms, total: 1.63 s
Wall time: 20min 49s
I tried the same with my local MySQL instance on OS X El Capitan, but it gives me the following error:
/Library/Python/2.7/site-packages/PyMySQL-0.7.1-py2.7.egg/pymysql/err.pyc in _check_mysql_exception(errinfo)
113
114 # couldn't find the right error number
--> 115 raise InternalError(errno, errorvalue)
116
117
InternalError: (pymysql.err.InternalError) (13, u"Can't get stat of '/path/to/test.csv' (Errcode: 13 - Permission denied)") [SQL: u'LOAD DATA INFILE %(path)s\n INTO TABLE test_file2\n CHARACTER SET %(encoding)s\n FIELDS\n TERMINATED BY %(delimiter)s\n ENCLOSED BY %(quotechar)s\n ESCAPED BY %(escapechar)s\n LINES TERMINATED BY %(lineterminator)s\n IGNORE %(skiprows)s LINES\n '] [parameters: {'escapechar': '\\', 'encoding': 'utf8', 'skiprows': 1, 'delimiter': ',', 'lineterminator': u'\n', 'quotechar': '"', 'path': '/path/to/test.csv'}]
At first I thought it was a file-permissions problem, but then I noticed that the table is created successfully, with the right column names, so odo itself is able to read the file. So I don't really understand the error (13, u"Can't get stat of '/path/to/test.csv' (Errcode: 13 - Permission denied)").
What else can I check?
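One thing worth checking: a plain LOAD DATA INFILE (without LOCAL), as in the SQL shown in the error, is executed by the MySQL server, so the file is opened by the server process rather than by the Python process running odo. The sketch below is a rough check of whether users other than the file's owner could reach the file. `blocked_for_others` is a hypothetical helper name, and it only looks at the "other" permission bits, so it ignores group membership, ACLs, and any sandboxing.

```python
import os
import stat


def blocked_for_others(path):
    """Return the path components that users other than the owner cannot use.

    Rough check only: inspects the "other" permission bits and nothing else.
    """
    problems = []
    path = os.path.abspath(path)

    # The file itself needs the read bit for "others".
    if not os.stat(path).st_mode & stat.S_IROTH:
        problems.append(path)

    # Every parent directory needs the execute (search) bit for "others".
    parent = os.path.dirname(path)
    while True:
        if not os.stat(parent).st_mode & stat.S_IXOTH:
            problems.append(parent)
        next_parent = os.path.dirname(parent)
        if next_parent == parent:  # reached the filesystem root
            break
        parent = next_parent

    return problems
```

Calling `blocked_for_others('/path/to/test.csv')` returns an empty list when nothing in the path looks closed to other users; any entries it returns are candidates for the "Permission denied" above.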
Well, it was because MySQL couldn't access the file. It seems the file needs to be inside the MySQL installation folder, which in my case was
/usr/local/mysql-5.6.20-osx10.7-x86_64/
I tried putting it inside the data folder,
/usr/local/mysql-5.6.20-osx10.7-x86_64/data/my_db
and that worked out.
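For reference, moving the CSV to a directory the server can read might be sketched like this. `stage_csv` is a hypothetical helper, not part of odo, and the destination directory and connection string in the usage comment are assumptions based on the paths above. Writing into the MySQL data folder will usually need elevated privileges.

```python
import os
import shutil


def stage_csv(src, server_dir):
    """Copy src into server_dir and make the copy readable by all users."""
    dest = os.path.join(server_dir, os.path.basename(src))
    shutil.copyfile(src, dest)
    # Owner read/write, everyone else read-only, so the server user can read it.
    os.chmod(dest, 0o644)
    return dest


# Hypothetical usage (database and table names are assumptions):
# from odo import odo
# staged = stage_csv('/path/to/test.csv',
#                    '/usr/local/mysql-5.6.20-osx10.7-x86_64/data/my_db')
# t = odo(staged, 'mysql+pymysql://localhost/my_db::test_file2')
```

The copy is made rather than a move so the original file stays where it was if the load has to be retried.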