I am trying to implement a pre-labeler for Label Studio. The documentation (e.g. https://labelstud.io/guide/ml_create.html#Example-inference-call) is not very helpful here. I have imported images locally and need to read them from a task record to pass them to my model. When I print a task record, it looks like this:
{
  "id": 7,
  "data": {
    "image": "/data/upload/1/cfcf4486-0cdd1413486f7d923e6eff475c43809f.jpeg"
  },
  "meta": {},
  "created_at": "2022-12-29T00:49:34.141715Z",
  "updated_at": "2022-12-29T00:49:34.141720Z",
  "is_labeled": false,
  "overlap": 1,
  "inner_id": 7,
  "total_annotations": 0,
  "cancelled_annotations": 0,
  "total_predictions": 0,
  "comment_count": 0,
  "unresolved_comment_count": 0,
  "last_comment_updated_at": null,
  "project": 1,
  "updated_by": null,
  "file_upload": 7,
  "comment_authors": [],
  "annotations": [],
  "predictions": []
}
My labeler implementation at the moment is this:
class MyModel(LabelStudioMLBase):
    def __init__(self, **kwargs):
        super(MyModel, self).__init__(**kwargs)
        self.model = ...
        self.query_fn = ...

    def make_result_record(self, path: Path):
        # Reference: https://labelstud.io/tutorials/sklearn-text-classifier.html
        mask = self.query_fn(path)
        image = Image.open(path)
        result = [
            {
                "original_width": image.width,
                "original_height": image.height,
                "image_rotation": 0,
                "value": {"format": "rle", "rle": [mask], "brushlabels": ["crack"]},
                "id": uuid(),
                "from_name": "tag",
                "to_name": "image",
                "type": "brushlabels",
                "origin": "inference",
            }
        ]
        return {"result": result, "score": 1.0}

    def predict(self, tasks, **kwargs):
        predictions = []
        for task in tasks:
            logger.info("task:")
            logger.info("\n" + json.dumps(task, indent=2))
            result = self.make_result_record(Path(task["data"]["image"]))
            predictions.append(result)
        return predictions
So where is /data/upload/1/cfcf4486-0cdd1413486f7d923e6eff475c43809f.jpeg? It is inside some storage that Label Studio spins up, I suppose. How do I access this? (And why does the documentation not talk about this...)
Those are the files Label Studio creates when you upload data to it.
You will find the file on your system if you search for the filename "cfcf4486-0cdd1413486f7d923e6eff475c43809f.jpeg".
If you run Label Studio in Docker, you will find it under the same path.
In my case, on macOS, I found the .jpg file under: "/Users/user/Library/Application Support/label-studio/media/upload/1/1deaeb75-0f29e9df11dbc1cce55cb3529517dcd5.jpg"
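Going by the macOS path above, a task URL of the form /data/upload/... seems to correspond to a file under the media/upload/... directory of the Label Studio data folder. Here is a minimal sketch of that mapping. The helper name `resolve_upload_path` and the `MEDIA_ROOT` value are my own assumptions for illustration, not part of the Label Studio API, and the media directory will differ per installation:

```python
from pathlib import Path, PurePosixPath

# Assumption: the media directory from the macOS example above.
# Change this to wherever your installation stores its media files.
MEDIA_ROOT = Path.home() / "Library" / "Application Support" / "label-studio" / "media"


def resolve_upload_path(image_url: str, media_root: Path = MEDIA_ROOT) -> Path:
    """Map a task URL like '/data/upload/1/x.jpeg' to a file under media_root.

    Hypothetical helper: it only strips the leading '/data' segment and
    joins the remainder onto media_root.
    """
    try:
        relative = PurePosixPath(image_url).relative_to("/data")
    except ValueError:
        # The URL does not start with /data, so this mapping does not apply.
        raise ValueError(f"not a local upload URL: {image_url!r}")
    return media_root.joinpath(*relative.parts)
```

With something like this, `Path(task["data"]["image"])` in `predict` could become `resolve_upload_path(task["data"]["image"])`, assuming the ML backend process can read that directory.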
I think you don't need to read them from a task yourself. If you create an ML backend and connect your Label Studio instance to it, Label Studio will create and send the tasks for you.
Take a look at this backend for an example: https://www.kaggle.com/code/yujiyamamoto/semi-auto-annotation-label-studio-and-tf2-od/notebook