TypeError: '...' has type str, but expected one of: bytes

Question

18.6k Views Asked by djb At 25 September 2020 at 22:42

I'm trying to get basic Tensorflow bounding box object detection working on Open Images Dataset (v6)...

  File "/home/work/models/research/object_detection/dataset_tools/create_oid_tf_record.py", line 115, in main
    tf_example = oid_tfrecord_creation.tf_example_from_annotations_data_frame(
  File "/root/anaconda3/lib/python3.8/site-packages/object_detection/dataset_tools/oid_tfrecord_creation.py", line 71, in tf_example_from_annotations_data_frame
    dataset_util.bytes_feature('{}.jpg'.format(image_id)),
  File "/root/anaconda3/lib/python3.8/site-packages/object_detection/utils/dataset_util.py", line 33, in bytes_feature
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
TypeError: '000411001ff7dd4f.jpg' has type str, but expected one of: bytes

The relevant code seems to be here:

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

I thought maybe value=[value.encode()] might fix it, but then it says:

AttributeError: 'bytes' object has no attribute 'encode'

(Well which is it, TF? bytes or str?)

The row in the input file contains:

ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside,XClick1X,XClick2X,XClick3X,XClick4X,XClick1Y,XClick2Y,XClick3Y,XClick4Y
000411001ff7dd4f,xclick,/m/09b5t,1,0.1734375,0.46875,0.19791667,0.7916667,0,0,1,0,0

The feature map for the TFRecord:

feature_map = {
    standard_fields.TfExampleFields.object_bbox_ymin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_ymax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMax.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMax.to_numpy()),
    standard_fields.TfExampleFields.object_class_text:
        dataset_util.bytes_list_feature(
            filtered_data_frame_boxes.LabelName.to_numpy()),
    standard_fields.TfExampleFields.object_class_label:
        dataset_util.int64_list_feature(
            filtered_data_frame_boxes.LabelName.map(lambda x: label_map[x])
            .to_numpy()),
    standard_fields.TfExampleFields.filename:
        dataset_util.bytes_feature('{}.jpg'.format(image_id)),
    standard_fields.TfExampleFields.source_id:
        dataset_util.bytes_feature(image_id),
    standard_fields.TfExampleFields.image_encoded:
        dataset_util.bytes_feature(encoded_image),
}

Any idea? I installed with pip3 and had to fix a bunch of package deprecation bugs before it got to this point.

pip3 install tensorflow
pip3 install tensorflow-object-detection-api

EDITs:

Versions:

tensorflow                         2.3.1
tensorflow-object-detection-api    0.1.1

I tried

  standard_fields.TfExampleFields.filename:
      dataset_util.bytes_feature(bytes(('{}.jpg'.format(image_id)),'ascii')),

but it gets the following:

TypeError: '000411001ff7dd4f' has type str, but expected one of: bytes

(Where did the .jpg go?)

There are 2 best solutions below

**Vasili Syrakis** · Answer 1 · 2020-09-25T23:37:35.783000

The TypeError appears to refer to a filename created in this line:

dataset_util.bytes_feature('{}.jpg'.format(image_id)),

Perhaps this image name should be bytes, like so:

dataset_util.bytes_feature('{}.jpg'.format(image_id).encode()),
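The question's attempt to call value.encode() inside bytes_feature() presumably failed because that function receives both str values (the filename) and bytes values (the encoded image), and bytes has no encode() method. A minimal sketch of a helper that encodes only when needed, assuming UTF-8 is acceptable; to_bytes is a hypothetical name and not part of dataset_util:

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding it only if it is a str.

    Hypothetical helper for illustration; not part of the
    TensorFlow Object Detection API.
    """
    if isinstance(value, bytes):
        return value  # already bytes, e.g. raw JPEG data
    if isinstance(value, str):
        return value.encode(encoding)
    raise TypeError(
        "expected str or bytes, got {}".format(type(value).__name__))


image_id = "000411001ff7dd4f"  # ImageID from the CSV row in the question
filename = to_bytes("{}.jpg".format(image_id))
print(filename)  # b'000411001ff7dd4f.jpg'
```

The result could then be passed to dataset_util.bytes_feature() at the call site, which avoids touching the library function itself.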

**djb** · Answer 2 · 2020-09-26T21:13:20.443000

So ideally, Karl would be right and we "should definitely not be trying to solve the problem by editing the library code."

But it would seem my library code isn't working at the moment. I found a relevant GitHub issue and fixed it using .encode(). It seems the bytes() solution also worked: the '.jpg' was missing from the second error message because the filename line had succeeded, and the code moved on to the next line (source_id) and died there.

Fixed code:

  standard_fields.TfExampleFields.filename:
      dataset_util.bytes_feature(('{}.jpg'.format(image_id)).encode('utf-8')),
  standard_fields.TfExampleFields.source_id:
      dataset_util.bytes_feature(image_id.encode('utf-8')),

That just moved me on to the next bug:

TypeError: '/m/09b5t' has type str, but expected one of: bytes

Same error, but for the bytes_list_feature() call. Fixed by changing:

  standard_fields.TfExampleFields.object_class_text:
      dataset_util.bytes_list_feature(
          filtered_data_frame_boxes.LabelName.to_numpy()),

to:

  standard_fields.TfExampleFields.object_class_text:
      dataset_util.bytes_list_feature(
          filtered_data_frame_boxes.LabelName.map(lambda x: x.encode('utf8')).to_numpy()),

following a pull request: https://github.com/tensorflow/models/pull/4771/files (and also had to change as_matrix() to to_numpy())
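The same idea applies to a whole column of labels: every element has to be bytes before it reaches bytes_list_feature(). A small sketch using only the standard library, assuming UTF-8 labels; encode_labels is a hypothetical name and not part of the Object Detection API:

```python
def encode_labels(labels, encoding="utf-8"):
    """Return a list in which every str label has been encoded to bytes.

    Hypothetical helper for illustration. Labels that are already
    bytes are passed through unchanged.
    """
    return [
        label.encode(encoding) if isinstance(label, str) else label
        for label in labels
    ]


# '/m/09b5t' is the LabelName from the CSV row in the question.
labels = ["/m/09b5t", b"/m/09b5t"]
encoded = encode_labels(labels)
print(encoded)  # [b'/m/09b5t', b'/m/09b5t']
```

Because the helper accepts any iterable of labels, it should also work when given a pandas Series such as filtered_data_frame_boxes.LabelName, although that has not been verified here.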
