Sqoop Incremental Import and CURRENT_TIMESTAMP

I am trying to run an incremental import from Teradata to Hadoop with Sqoop, but it is not working.

From the error, it seems that Sqoop internally generates SQL that is syntactically incorrect. I also tried the --verbose option, but it gave no useful information.

Here is the schema of the Teradata table that I am importing into Hadoop:

CREATE TABLE Employee ( EmpNo INT NOT NULL,  EmpName CHAR(30),  DOB DATE, Mob integer,  LastUpdated timestamp  );

Here is the import command:

sqoop import --connect jdbc:teradata://XXXXXXXX/Database=XXXXX  --driver com.teradata.jdbc.TeraDriver --username XXXXX --password XXXXXX  --table Employee --target-dir /user/hive/incremental_emp_table -m 1 --check-column LastUpdated --incremental lastmodified --last-value "2001-12-17 07:36:01.280000"

and I get the following:

Warning: /usr/share/hadoop_echosystem/sqoop-1.4.5//../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /usr/share/hadoop_echosystem/sqoop-1.4.5//../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
Note: /tmp/sqoop-cloud/compile/917cdf768aea5267d838a949502ed0d0/Employee.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/share/hadoop_echosystem/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/hadoop_echosystem/hbase-0.96.1-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/share/hadoop_echosystem/apache-hive-1.0.0-bin/lib/hive-jdbc-1.0.0-standalone.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/06/16 14:20:16 ERROR manager.SqlManager: SQL exception accessing current timestamp: com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.10.00.26] [Error 3706] [SQLState 42000] Syntax error: expected something between '(' and ')'.
com.teradata.jdbc.jdbc_4.util.JDBCException: [Teradata Database] [TeraJDBC 14.10.00.26] [Error 3706] [SQLState 42000] Syntax error: expected something between '(' and ')'.
    at com.teradata.jdbc.jdbc_4.util.ErrorFactory.makeDatabaseSQLException(ErrorFactory.java:307)
    at com.teradata.jdbc.jdbc_4.statemachine.ReceiveInitSubState.action(ReceiveInitSubState.java:109)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.subStateMachine(StatementReceiveState.java:314)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementReceiveState.action(StatementReceiveState.java:202)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementController.runBody(StatementController.java:123)
    at com.teradata.jdbc.jdbc_4.statemachine.StatementController.run(StatementController.java:114)
    at com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:384)
    at com.teradata.jdbc.jdbc_4.TDStatement.executeStatement(TDStatement.java:326)
    at com.teradata.jdbc.jdbc_4.TDStatement.doNonPrepExecuteQuery(TDStatement.java:314)
    at com.teradata.jdbc.jdbc_4.TDStatement.executeQuery(TDStatement.java:1091)
    at org.apache.sqoop.manager.SqlManager.getCurrentDbTimestamp(SqlManager.java:960)
    at org.apache.sqoop.tool.ImportTool.initIncrementalConstraints(ImportTool.java:328)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
15/06/16 14:20:16 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Could not get current time from database
    at org.apache.sqoop.tool.ImportTool.initIncrementalConstraints(ImportTool.java:330)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:488)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:601)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

I walked through the implementation of the method org.apache.sqoop.manager.SqlManager.getCurrentDbTimestamp() and found this:

   protected String getCurTimestampQuery() {
     return "SELECT CURRENT_TIMESTAMP()";
   }

SqlManager uses "SELECT CURRENT_TIMESTAMP();" to get the current timestamp, which is syntactically incorrect on Teradata.

For Teradata, it should be "SELECT CURRENT_TIMESTAMP;" (without the parentheses).

Please help me resolve this issue.

There is 1 best solution below

It's a bug. The current-timestamp query needs to be database-specific.

I have raised a JIRA issue for it:

https://issues.apache.org/jira/browse/SQOOP-2402
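
To illustrate what "database-specific" could mean here, below is a minimal, self-contained sketch. It is not Sqoop's actual code or the fix attached to the JIRA: the class names GenericManager and TeradataManager and the helpers managerFor and queryFor are hypothetical stand-ins, and only the getCurTimestampQuery() method and the two query strings come from the question. The idea is simply that a Teradata-aware manager overrides the hook and returns the form without parentheses.

```java
// Hypothetical stand-ins for illustration only; these are NOT Sqoop's real classes.
public class TimestampQueryDemo {

    /** Mirrors the default hook quoted in the question. */
    static class GenericManager {
        protected String getCurTimestampQuery() {
            return "SELECT CURRENT_TIMESTAMP()";
        }
    }

    /** Assumed Teradata-specific manager: Teradata rejects the empty parentheses. */
    static class TeradataManager extends GenericManager {
        @Override
        protected String getCurTimestampQuery() {
            return "SELECT CURRENT_TIMESTAMP";
        }
    }

    /** Illustrative dispatch on the JDBC URL prefix (an assumption, not Sqoop's logic). */
    static GenericManager managerFor(String jdbcUrl) {
        if (jdbcUrl.startsWith("jdbc:teradata:")) {
            return new TeradataManager();
        }
        return new GenericManager();
    }

    /** Returns the current-timestamp query that would be sent for the given URL. */
    static String queryFor(String jdbcUrl) {
        return managerFor(jdbcUrl).getCurTimestampQuery();
    }

    public static void main(String[] args) {
        System.out.println(queryFor("jdbc:teradata://host/Database=db"));
        System.out.println(queryFor("jdbc:mysql://host/db"));
    }
}
```

With a dispatch like this, only connections whose URL starts with jdbc:teradata: get the parenthesis-free query; every other database keeps the default.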