I am trying to export data using sqoop from files stored in hdfs to vertica. For around 10k's of data the files get loaded within a few minutes. But when I try to run crores of data, it is loading around .5% within 15 mins or so. I have tried to increase the number of mappers, but they are not serving any purpose to improve efficienct. Even setting the chunk size to increase the number the mappers, does not increase the number.
Please help.
Thanks!
Most MPP/RDBMS have sqoop connectors to exploit the parallelism and increase efficiency in transfer of data between HDFS and MPP/RDBMS. However it seems the vertica has taken this approach: http://www.vertica.com/2012/07/05/teaching-the-elephant-new-tricks/ https://github.com/vertica/Vertica-Hadoop-Connector