MLKit Text Recognition: Text Not Being Detected


I am making an iOS app where a user takes a picture, and then I want to use Google's MLKit from Firebase to detect text in the picture. I have set up a custom camera UIViewController that we'll call CameraViewController. There is a simple button the user presses to take a picture. I have followed Firebase's documentation, here, but MLKit is not working for me. Here is the code I have, for your reference, and then we'll talk about what the problem is.

1. Here are my imports, class delegate, and outlets:

import UIKit
import AVFoundation
import Firebase

class CameraViewController: UIViewController, AVCapturePhotoCaptureDelegate {
    var captureSession: AVCaptureSession?
    var videoPreviewLayer: AVCaptureVideoPreviewLayer?
    var capturePhotoOutput: AVCapturePhotoOutput?
    @IBOutlet var previewView: UIView!
    @IBOutlet var captureButton: UIButton!
}
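
Side note: all of this assumes camera access has already been granted. I'm not actually running the helper below; it's just a sketch of the kind of check that could go before the session setup, and the function name is made up:

func requestCameraAccessIfNeeded(completion: @escaping (Bool) -> Void) {
    // Check the current authorization state for the camera
    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        completion(true)
    case .notDetermined:
        // Ask once; the completion handler may run off the main queue
        AVCaptureDevice.requestAccess(for: .video) { granted in
            DispatchQueue.main.async { completion(granted) }
        }
    default:
        // .denied or .restricted: the user has to change this in Settings
        completion(false)
    }
}

Camera access is clearly granted in my case, since the preview and capture both work, so this is just for completeness.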

2. In viewDidLoad, I set up previewView so that the user has a viewfinder:

override func viewDidLoad() {
    super.viewDidLoad()

    let captureDevice = AVCaptureDevice.default(for: .video)!
    do {
        // Attach the camera input and show the live preview
        let input = try AVCaptureDeviceInput(device: captureDevice)
        captureSession = AVCaptureSession()
        captureSession?.addInput(input)
        videoPreviewLayer = AVCaptureVideoPreviewLayer(session: captureSession!)
        videoPreviewLayer?.videoGravity = AVLayerVideoGravity.resizeAspectFill
        videoPreviewLayer?.frame = view.layer.bounds
        previewView.layer.addSublayer(videoPreviewLayer!)
        captureSession?.startRunning()

        // Add the photo output that captureButtonTapped will use
        capturePhotoOutput = AVCapturePhotoOutput()
        capturePhotoOutput?.isHighResolutionCaptureEnabled = true
        captureSession?.addOutput(capturePhotoOutput!)
    } catch {
        print(error)
    }
}
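
Aside: I know an AVCaptureSession is usually configured completely, between beginConfiguration() and commitConfiguration(), before startRunning() is called. A rearranged sketch of the same setup (not the code I'm actually running) would look roughly like this:

func configureSession(with captureDevice: AVCaptureDevice) throws {
    let session = AVCaptureSession()
    session.beginConfiguration()

    // Camera input
    let input = try AVCaptureDeviceInput(device: captureDevice)
    if session.canAddInput(input) { session.addInput(input) }

    // Photo output, added before the session starts running
    let photoOutput = AVCapturePhotoOutput()
    photoOutput.isHighResolutionCaptureEnabled = true
    if session.canAddOutput(photoOutput) { session.addOutput(photoOutput) }

    session.commitConfiguration()
    captureSession = session
    capturePhotoOutput = photoOutput

    // Preview layer sized to the view that hosts it
    let previewLayer = AVCaptureVideoPreviewLayer(session: session)
    previewLayer.videoGravity = .resizeAspectFill
    previewLayer.frame = previewView.layer.bounds
    previewView.layer.addSublayer(previewLayer)
    videoPreviewLayer = previewLayer

    session.startRunning()
}

I mention it only in case the ordering matters; the preview and the capture both work for me as posted.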

3. Here is my action for the button that takes the image:

@IBAction func captureButtonTapped(_ sender: Any) {
    guard let capturePhotoOutput = self.capturePhotoOutput else { return }
    let photoSettings = AVCapturePhotoSettings()
    photoSettings.isAutoStillImageStabilizationEnabled = true
    photoSettings.isHighResolutionPhotoEnabled = true
    photoSettings.flashMode = .off
    capturePhotoOutput.capturePhoto(with: photoSettings, delegate: self)
}
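
Just for completeness, a variation I considered (not what I'm running, and the selector name is made up): pinning the codec to JPEG when the output supports it, so the later buffer-to-Data step is unambiguous:

@IBAction func captureButtonTappedPinnedToJPEG(_ sender: Any) {
    guard let capturePhotoOutput = self.capturePhotoOutput else { return }

    // Request JPEG explicitly when the output supports it (sketch only)
    let photoSettings: AVCapturePhotoSettings
    if capturePhotoOutput.availablePhotoCodecTypes.contains(.jpeg) {
        photoSettings = AVCapturePhotoSettings(format: [AVVideoCodecKey: AVVideoCodecType.jpeg])
    } else {
        photoSettings = AVCapturePhotoSettings()
    }
    photoSettings.isHighResolutionPhotoEnabled = true
    photoSettings.flashMode = .off
    capturePhotoOutput.capturePhoto(with: photoSettings, delegate: self)
}

Again, I get valid JPEG data with the code as posted, so this is not the issue either.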

4. This is where I receive the captured picture in the didFinishProcessingPhoto delegate method and start using MLKit:

func photoOutput(_ captureOutput: AVCapturePhotoOutput,
                 didFinishProcessingPhoto photoSampleBuffer: CMSampleBuffer?,
                 previewPhoto previewPhotoSampleBuffer: CMSampleBuffer?,
                 resolvedSettings: AVCaptureResolvedPhotoSettings,
                 bracketSettings: AVCaptureBracketedStillImageSettings?,
                 error: Error?) {

    guard error == nil,
        let photoSampleBuffer = photoSampleBuffer else {
            print("Error capturing photo: \(String(describing: error))")
            return
    }
    guard let imageData = AVCapturePhotoOutput.jpegPhotoDataRepresentation(
        forJPEGSampleBuffer: photoSampleBuffer,
        previewPhotoSampleBuffer: previewPhotoSampleBuffer) else {
            return
    }
    let capturedImage = UIImage(data: imageData, scale: 1.0)
    captureNormal()
    DispatchQueue.main.asyncAfter(deadline: .now() + 0.1) {
        self.captureSession?.stopRunning()
        // Here is where I call the function processText where MLKit is run
        self.processText(with: capturedImage!)
    }
}
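
For reference, the sample-buffer form of this delegate is deprecated since iOS 11; a sketch of the AVCapturePhoto-based equivalent (not what I'm currently running) would be:

func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: Error?) {
    // Pull the encoded image data straight from the AVCapturePhoto
    guard error == nil,
          let imageData = photo.fileDataRepresentation(),
          let capturedImage = UIImage(data: imageData) else {
        print("Error capturing photo: \(String(describing: error))")
        return
    }

    DispatchQueue.main.async {
        self.captureSession?.stopRunning()
        self.processText(with: capturedImage)
    }
}

(captureNormal() from my callback is left out of the sketch because its definition isn't shown here.)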

5. Lastly, here is my processText(with: UIImage) function that uses MLKit:

func processText(with image: UIImage) {
    let vision = Vision.vision()
    let textRecognizer = vision.onDeviceTextRecognizer()
    let visionImage = VisionImage(image: image)

    textRecognizer.process(visionImage) { result, error in
        if error != nil {
            print("MLKIT ERROR - \(error)")
        } else {
            let resultText = result?.text
            print("MLKIT RESULT - \(resultText)")
        }
    }
}
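
To see what, if anything, is detected, a more verbose way to inspect the result would be to walk the blocks, lines, and elements of the VisionText (sketch only, the helper name is made up). I also understand that VisionImage(image:) reads its orientation from the UIImage's imageOrientation, so a mismatch there could in principle produce empty results, though I haven't confirmed that in my case:

func logVisionText(_ result: VisionText?) {
    guard let result = result, !result.blocks.isEmpty else {
        print("MLKIT RESULT - no text found")
        return
    }
    for block in result.blocks {
        // Each block is a paragraph-like region with its own bounding box
        print("Block: \(block.text) at \(block.frame)")
        for line in block.lines {
            print("  Line: \(line.text)")
            for element in line.elements {
                // Elements are roughly word-level pieces of the line
                print("    Element: \(element.text)")
            }
        }
    }
}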

Ok, that was a lot; thank you for reading all of it. So, the problem is that this does not work. I do get a proper UIImage in step 4, so that's not the issue. Here's a screenshot of an example of what I am trying to scan:

[example screenshot]

MLKit should be able to detect this text easily, but every time I try, result?.text always prints as nil. I'm out of ideas. Does anyone have an idea of how to fix this? If so, thanks a lot!
