raise core._status_to_exception(e) from None tensorflow.python.framework.errors_impl.InvalidArgumentError:


I am trying to train an object detection model on the Citypersons dataset, following this tutorial: https://neptune.ai/blog/how-to-train-your-own-object-detector-using-tensorflow-object-detection-api

I run the following command:

python model_main_tf2.py --pipeline_config_path=models/MaskCNN/v1/pipeline.config --model_dir=models/MaskCNN/v1/ --checkpoint_every_n=4 --num_workers=2 --alsologtostderr

And I get the following error:

File "/Users/Desktop/TensorFlow/tf2_api_env/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 7215, in raise_from_not_ok_status
raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: {{function_node __wrapped__IteratorGetNext_output_types_18_device_/job:localhost/replica:0/task:0/device:CPU:0}} indices[0] = 0 is not in [0, 0)
     [[{{node GatherV2_7}}]]
     [[MultiDeviceIteratorGetNextFromShard]]
     [[RemoteCall]] [Op:IteratorGetNext]

I am fairly new to this and can't figure out the problem. Any help would be greatly appreciated. Thanks.

Answer by Musabbir Arrafi

I faced the same issue when I tried to train an EfficientDet object detection model on a dataset exported from Roboflow as TFRecord. Here's how I solved it:

In my case, this error meant the generated TFRecord files were corrupted. Use the following script to check the state of your TFRecords:

import tensorflow as tf

def is_tfrecord_corrupted(tfrecord_file):
    try:
        for record in tf.data.TFRecordDataset(tfrecord_file):
            # Attempt to parse the record
            _ = tf.train.Example.FromString(record.numpy())
    except tf.errors.DataLossError as e:
        print(f"DataLossError encountered: {e}")
        return True
    except Exception as e:
        print(f"An error occurred: {e}")
        return True
    return False

# Replace with your TFRecord file paths
tfrecord_files = ['your_test_record_fname', 'your_train_record_fname']

for tfrecord_file in tfrecord_files:
    if is_tfrecord_corrupted(tfrecord_file):
        print(f"The TFRecord file {tfrecord_file} is corrupted.")
    else:
        print(f"The TFRecord file {tfrecord_file} is fine.")
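One caveat with the check above: a file that contains zero records never enters the loop, so it is reported as fine. As a complement, here is a minimal sketch that counts records by walking the TFRecord framing directly, without TensorFlow. Each record is stored as an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. The function name count_tfrecord_frames is my own, the sketch assumes an uncompressed file, and it does not verify the CRCs, so it only catches empty or truncated files.

```python
import os
import struct


def count_tfrecord_frames(path):
    """Count records in an uncompressed TFRecord file by reading its framing.

    Assumption: CRC fields are skipped, not verified, so this detects only
    empty or truncated files, not bit-level corruption.
    """
    size = os.path.getsize(path)
    count = 0
    with open(path, "rb") as f:
        while f.tell() < size:
            # Header: uint64 payload length + uint32 CRC of the length.
            header = f.read(12)
            if len(header) < 12:
                raise ValueError(f"truncated header after {count} records")
            (length,) = struct.unpack("<Q", header[:8])
            # Skip the payload and its trailing uint32 CRC.
            end = f.tell() + length + 4
            if end > size:
                raise ValueError(f"truncated payload after {count} records")
            f.seek(end)
            count += 1
    return count
```

A count of 0 for a training or evaluation file means the input pipeline has nothing to read, even though the parse check passes.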

To fix the corrupted TFRecords, I exported the dataset in Pascal VOC format and wrote a script, hosted on GitHub, that generates new TFRecords from the Pascal VOC data.

Script to generate new tfrecords is here: https://github.com/arrafi-musabbir/license-plate-detection-recognition/blob/main/generate_tfrecord.py
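For orientation, the core of any such conversion is reading each Pascal VOC XML annotation and normalizing the box coordinates to the [0, 1] range that the Object Detection API expects in its TFRecords. The sketch below is not the linked script, just an illustration of that step under the standard VOC layout (size/width, size/height, and object/bndbox with xmin, ymin, xmax, ymax). The function name read_voc_boxes is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_voc_boxes(xml_text):
    """Parse one Pascal VOC annotation.

    Returns (filename, boxes), where each box is
    (class_name, xmin, ymin, xmax, ymax) with coordinates divided by the
    image width/height so they fall in [0, 1].
    """
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))

    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        xmin = float(bndbox.findtext("xmin")) / width
        ymin = float(bndbox.findtext("ymin")) / height
        xmax = float(bndbox.findtext("xmax")) / width
        ymax = float(bndbox.findtext("ymax")) / height
        boxes.append((obj.findtext("name"), xmin, ymin, xmax, ymax))
    return root.findtext("filename"), boxes
```

An annotation that yields an empty box list, or a class name missing from the label map, is worth inspecting before it is written into a record.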

  • Create your own label_map.pbtxt according to your dataset:
label_path = "your label_map.pbtxt path"

# modify according to your dataset class names
labels = [{'name':'license', 'id':1}]

with open(label_path, 'w') as f:
    for label in labels:
        f.write('item { \n')
        f.write('\tname:\'{}\'\n'.format(label['name']))
        f.write('\tid:{}\n'.format(label['id']))
        f.write('}\n')
  • Run the script as follows:
python generate_tfrecord.py -x {train_dir_path} -l {labelmap_path} -o {new_train_record_path}
python generate_tfrecord.py -x {valid_dir_path} -l {labelmap_path} -o {new_valid_record_path}
python generate_tfrecord.py -x {test_dir_path} -l {labelmap_path} -o {new_test_record_path}

Afterwards, run is_tfrecord_corrupted(tfrecord_file) again on the new files; the TFRecords should now be reported as fine.
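If you want to double-check the label map written earlier, the following sketch parses its item blocks with a regular expression and reports obvious problems. It assumes the simple name/id layout produced by the snippet above, not the full protobuf text format, and the function names are my own. Beyond uniqueness, the one rule it encodes is that ids should start at 1, since the Object Detection API reserves 0 for the background class.

```python
import re


def parse_label_map(text):
    """Extract {'name', 'id'} dicts from simple 'item { ... }' blocks."""
    items = []
    for block in re.finditer(r"item\s*\{(.*?)\}", text, re.S):
        body = block.group(1)
        name = re.search(r"name\s*:\s*['\"]([^'\"]*)['\"]", body)
        id_ = re.search(r"\bid\s*:\s*(-?\d+)", body)
        items.append({
            "name": name.group(1) if name else None,
            "id": int(id_.group(1)) if id_ else None,
        })
    return items


def find_label_map_problems(items):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    if not items:
        problems.append("no item blocks found")
    seen_ids = set()
    for item in items:
        if item["name"] is None or item["id"] is None:
            problems.append(f"incomplete item: {item}")
            continue
        if item["id"] < 1:
            problems.append(f"id {item['id']} for '{item['name']}' is below 1")
        if item["id"] in seen_ids:
            problems.append(f"duplicate id {item['id']}")
        seen_ids.add(item["id"])
    return problems
```

To use it, read the file and pass its contents through both functions, for example find_label_map_problems(parse_label_map(open(label_path).read())).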
