Document AI (GCP): training a custom processor fails with "Invalid dataset" because the maximum number of documents was reached


I am using Document AI on GCP to train two custom processors that extract information from images/PDFs, but one of them has started to fail with an error.

Error log for custom processor I:

{
  "name": "projects/93127239131/locations/us/operations/16208318867953",
  "done": true,
  "result": "error",
  "response": {},
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.uiv1beta3.TrainProcessorVersionMetadata",
    "commonMetadata": {
      "state": "FAILED",
      "createTime": "2023-12-01T17:39:42.787927Z",
      "updateTime": "2023-12-01T17:51:55.140592Z",
      "resource": "projects/931239131/locations/us/processors/e1de18466d94b/processorVersions/e1db22fb84f98"
    },
    "trainingDatasetValidation": {
      "datasetErrors": [
        {
          "code": 3,
          "message": "Invalid dataset.",
          "details": [
            {
              "@type": "type.googleapis.com/google.rpc.ErrorInfo",
              "reason": "INVALID_DATASET",
              "domain": "documentai.googleapis.com",
              "metadata": {
                "annotation_name": "DOCUMENTS_WITH_ENTITIES",
                "max_documents_allowed": "300",
                "num_documents_with_annotation": "346"
              }
            }
          ]
        }
      ],
      "datasetErrorCount": 1
    },
    "testDatasetValidation": {}
  },
  "error": {
    "code": 3,
    "message": "Invalid dataset. See operation metadata for specific errors.",
    "details": []
  }
}

Details of custom processor I (highlighted in yellow):

enter image description here

The error log for custom processor I shows that a maximum number of documents was reached: "max_documents_allowed": "300". This processor has 346 documents to train on. However, custom processor II has many more documents and trains without any error: it has 609 documents, 404 for training and 125 for testing (see the image below, details highlighted in yellow).
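To make the comparison concrete, here is a minimal sketch that reads the failed operation JSON and reports which limit was exceeded and by how much. It uses only the Python standard library and the field names that appear in the log above; the helper name `summarize_dataset_errors` is hypothetical, and the embedded JSON is a trimmed copy of the log rather than a live API response.

```python
import json


def summarize_dataset_errors(operation):
    """Collect document-limit violations from trainingDatasetValidation.

    Hypothetical helper: it only reads keys that appear in the error log
    posted in this question and makes no Document AI API calls.
    """
    validation = operation.get("metadata", {}).get("trainingDatasetValidation", {})
    findings = []
    for error in validation.get("datasetErrors", []):
        for detail in error.get("details", []):
            meta = detail.get("metadata", {})
            if "max_documents_allowed" not in meta:
                continue  # not a document-count violation
            allowed = int(meta["max_documents_allowed"])
            found = int(meta["num_documents_with_annotation"])
            findings.append({
                "annotation": meta.get("annotation_name"),
                "allowed": allowed,
                "found": found,
                "excess": found - allowed,
            })
    return findings


# Trimmed copy of the failed operation from the log above.
operation = json.loads("""
{
  "done": true,
  "metadata": {
    "trainingDatasetValidation": {
      "datasetErrors": [
        {
          "code": 3,
          "message": "Invalid dataset.",
          "details": [
            {
              "reason": "INVALID_DATASET",
              "metadata": {
                "annotation_name": "DOCUMENTS_WITH_ENTITIES",
                "max_documents_allowed": "300",
                "num_documents_with_annotation": "346"
              }
            }
          ]
        }
      ],
      "datasetErrorCount": 1
    }
  }
}
""")

for finding in summarize_dataset_errors(operation):
    print(
        f"{finding['annotation']}: {finding['found']} documents, "
        f"limit {finding['allowed']}, {finding['excess']} over"
    )
```

For this log, the reported violation is on `DOCUMENTS_WITH_ENTITIES`: 346 documents against a limit of 300. The sketch only surfaces what the operation metadata says; it does not explain why processor II is not subject to the same limit.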

enter image description here

Has anyone faced this problem before, or does anyone have a clue how to solve it? It makes no sense for one processor to have a maximum document limit and not the other when both are of the same type.
