I want to capture the difference between the number of records collected by the source and the number of records that are emitted.
When I use a Kafka source and send 4 records, 1 of which is corrupt (it causes a decoding failure), I get the correct values for Source__Kafka_collector.numRecordsIn (4) and Source__Kafka_collector.numRecordsOut (3).
However, when I send 10 records to a File source, 1 of which is corrupt (it causes a decoding failure), Source__File_Collector.numRecordsOut is correct (9), but Source__File_Collector.numRecordsIn is also reported as 9, which is incorrect.
The Source__File_Collector.numRecordsIn metric should be 10, since the source read 10 records from the file.
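To make the expectation concrete, here is a minimal Python sketch of the arithmetic I have in mind. The values are hard-coded from the runs described above rather than read from Flink, and `dropped_records` is a hypothetical helper for illustration, not part of any Flink API.

```python
def dropped_records(metrics, prefix):
    """Return numRecordsIn - numRecordsOut for the given metric prefix."""
    return metrics[f"{prefix}.numRecordsIn"] - metrics[f"{prefix}.numRecordsOut"]


# Values observed for the Kafka source (4 records sent, 1 corrupt).
kafka_observed = {
    "Source__Kafka_collector.numRecordsIn": 4,
    "Source__Kafka_collector.numRecordsOut": 3,
}

# Values observed for the File source (10 records sent, 1 corrupt).
file_observed = {
    "Source__File_Collector.numRecordsIn": 9,
    "Source__File_Collector.numRecordsOut": 9,
}

# Values I expected for the File source.
file_expected = {
    "Source__File_Collector.numRecordsIn": 10,
    "Source__File_Collector.numRecordsOut": 9,
}

print(dropped_records(kafka_observed, "Source__Kafka_collector"))  # 1
print(dropped_records(file_observed, "Source__File_Collector"))    # 0
print(dropped_records(file_expected, "Source__File_Collector"))    # 1
```

With the observed File source values the difference is 0, so the corrupt record is invisible in the metrics, whereas the Kafka source reports the 1 dropped record as expected.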
Is this a bug in FileSource, or is there a reason for the difference in behavior between the File and Kafka sources?