How to load a delimited file with Apache Druid


I would like to ingest a delimited file into Druid. However, nothing is ingested because my data is not parsed. Can someone tell me what I did wrong?

My TSV file looks like this:

1, 0
2.8736, 8.29
7.10, 8.83

My task spec is below:

{
  "type" : "index",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "test_append",
      "timestampSpec" : {
        "column" : "timestamp",
        "format" : "iso"
      },
      "dimensionsSpec" : {
        "dimensions" : [
          { "name" : "requested", "type" : "long" },
          { "name" : "bfee", "type" : "long" }
        ]
      }
    },
    "ioConfig" : {
      "type" : "index",
      "inputSource" : {
        "type" : "local",
        "baseDir" : "/tmp/test",
        "filter" : "test_append.csv"
      },
      "inputFormat" : {
        "type" : "tsv",
        "columns" : [ "requested", "bfee" ]
      },
      "appendToExisting" : true,
      "dropExisting" : false
    },
    "tuningConfig" : {
      "type" : "index_parallel",
      "maxRowsPerSegment" : 5000000,
      "maxRowsInMemory" : 25000
    }
  }
}

The log shows that the data is not parsed, so nothing can be published:

2023-12-04T21:40:43,048 INFO [[index_test_append_bnilolmo_2023-12-04T21:40:38.315Z]-appenderator-merge] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Preparing to push (stats): processed rows: [0], sinks: [0], fireHydrants (across sinks): [0]
2023-12-04T21:40:43,049 INFO [[index_test_append_bnilolmo_2023-12-04T21:40:38.315Z]-appenderator-merge] org.apache.druid.segment.realtime.appenderator.AppenderatorImpl - Push complete...
2023-12-04T21:40:43,057 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Nothing to publish, skipping publish step.
2023-12-04T21:40:43,058 INFO [task-runner-0-priority-0] org.apache.druid.indexing.common.task.IndexTask - Processed[0] events, unparseable[4], thrownAway[0].
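
A quick check outside Druid may help narrow this down. The sketch below is only a rough stand-in for a delimited-text reader, not Druid's actual parser, and the helper names (parse_line, summarize) are made up for illustration. It splits the three sample rows above on a tab and on a comma, maps the fields onto the "columns" list from the spec, and counts rows that end up without the "timestamp" column named in timestampSpec. Whether either observation is what produces unparseable[4] in the log is an assumption, not something the log confirms.

```python
# Rough stand-in for a delimited-text reader; this is NOT Druid's parser.
# Sample rows, column names and the timestamp column are copied from the
# question; everything else here is a hypothetical helper.
SAMPLE_LINES = ["1, 0", "2.8736, 8.29", "7.10, 8.83"]
COLUMNS = ["requested", "bfee"]
TIMESTAMP_COLUMN = "timestamp"


def parse_line(line, delimiter, columns):
    """Split one line on the delimiter and pair the fields with column names."""
    fields = [field.strip() for field in line.split(delimiter)]
    return dict(zip(columns, fields))


def summarize(lines, delimiter, columns, timestamp_column):
    """Return the parsed rows and how many of them lack the timestamp column."""
    rows = [parse_line(line, delimiter, columns) for line in lines]
    missing_ts = sum(1 for row in rows if timestamp_column not in row)
    return rows, missing_ts


tab_rows, tab_missing = summarize(SAMPLE_LINES, "\t", COLUMNS, TIMESTAMP_COLUMN)
comma_rows, comma_missing = summarize(SAMPLE_LINES, ",", COLUMNS, TIMESTAMP_COLUMN)

print(tab_rows[0])    # the whole line lands in "requested" when split on tab
print(comma_rows[0])  # two separate fields when split on comma
print(tab_missing, comma_missing)
```

Under these assumptions, splitting on a tab leaves each whole line in the first column, and with either delimiter no row has a "timestamp" field, so both the delimiter and the timestampSpec look worth checking.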
