Problem with running Azure spatial-analysis container

84 Views Asked by At

I need to create and run Azure spatial analysis container on my desktop machine using WSL. I followed this tutorial: https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/spatial-analysis-container?tabs=desktop-machine. Everything should be running fine according to IoT Hub. But i'm not getting any output and when looking to logs of the spatial-analysis module i saw this error a lot:

2024-03-06T19:46:45.429562642Z <warning> 93 [VIDEO_INGESTER-cognitiveservices_vision_spatialanalysis_1.store.spatialanalysisgraph.videosource] cognitiveservices_vision_spatialanalysis_1 Error: Failed to allocate shared buffer. Skipping frame. 

2024-03-06T19:46:45.501593175Z <warning> 93 [VIDEO_INGESTER-cognitiveservices_vision_spatialanalysis_1.store.spatialanalysisgraph.videosource] cognitiveservices_vision_spatialanalysis_1 Failed to get CUDA handle: cudaIpcGetMemHandle failed with error 2 

2024-03-06T19:46:45.502484545Z <error> 93 [VIDEO_INGESTER-cognitiveservices_vision_spatialanalysis_1.store.spatialanalysisgraph.videosource] cognitiveservices_vision_spatialanalysis_1 Cannot create cuda shared buffer. Size: 6684672

I have no idea what to do also this is report from nvidia-smi:


| NVIDIA-SMI 530.30.02              Driver Version: 527.99       CUDA Version: 12.0     |

|-----------------------------------------+----------------------+----------------------+

| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |

| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |

|                                         |                      |               MIG M. |

|=========================================+======================+======================|

|   0  NVIDIA GeForce RTX 3060 L...    On | 00000000:01:00.0  On |                  N/A |

| N/A   55C    P8               16W / 115W|   2609MiB /  6144MiB |     28%      Default |

|                                         |                      |                  N/A |

+-----------------------------------------+----------------------+----------------------+````

I tried restarting the modules and whole IoT edge. Also checked the connectivity to IoT hub and that should be also fine.

Thanks in advance for your help!
1

There are 1 best solutions below

0
Sampath On

The error is due to GPU memory, resulting in a buffer size of 6.6 MB. If your NVIDIA GeForce RTX 3060 has limited available memory, allocating this buffer could cause issues. Make sure you have sufficient space in your system and meet the Spatial Analysis container requirements.

The below are step-by-step guide to install and run the Spatial Analysis container:

Install NVIDIA CUDA Toolkit and Nvidia Graphics Drivers:

  • Run the provided bash script to install the required drivers and CUDA Toolkit. - Reboot the machine after installation.
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
sudo add-apt-repository "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ /"
sudo apt-get update
sudo apt-get -y install cuda
sudo reboot

Install Docker CE and nvidia-docker2:

  • Install Docker CE and nvidia-docker2 software package.
 sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg-agent software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
sudo apt-get update
sudo apt-get install -y docker-ce docker-ce-cli containerd.io
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y docker-ce nvidia-docker2
sudo systemctl restart docker
  • Enable NVIDIA MPS:

    • Configure GPU(s) for NVIDIA Multiprocess Service (MPS) for better performance.
      sudo nvidia-smi --compute-mode=EXCLUSIVE_PROCESS
    echo "SHELL=/bin/bash" > /tmp/nvidia-mps-cronjob
    
    sudo chown root:root /tmp/nvidia-mps-cronjob
    sudo mv /tmp/nvidia-mps-cronjob /etc/cron.d/
    
    sudo chown root:root /tmp/nvidia-mps.service
    sudo mv /tmp/nvidia-mps.service /etc/systemd/system/
    sudo systemctl --now enable nvidia-mps.service

Configure Azure IoT Edge on the host computer:

  • Create Azure IoT Hub instance:

    • Use Azure CLI to create an instance of Azure IoT Hub.
sudo az login
sudo az account set --subscription "<name or ID of Azure Subscription>"
sudo az group create --name "<resource-group-name>" --location "<your-region>"
sudo az iot hub create --name "<iothub-group-name>" --sku S1 --resource-group "<resource-group-name>"
sudo az iot hub device-identity create --hub-name "<iothub-name>" --device-id "<device-name>" --edge-enabled
  • Install Azure IoT Edge:

    • Download and install Azure IoT Edge version 1.0.9.
    sudo cp ./microsoft-prod.list /etc/apt/sources.list.d/
    curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg
    sudo cp ./microsoft.gpg /etc/apt/trusted.gpg.d/
    sudo apt-get update
    sudo apt-get install iotedge=1.1* libiothsm-std=1.1 
    
  • Register IoT Edge device:

    • Obtain the connection string from the IoT Edge device created earlier and update the configuration file.
    • Restart the IoT Edge service.
 sudo az iot hub device-identity connection-string show --device-id <device-id> --hub-name <hub-name>
sudo nano /etc/iotedge/config.yaml  # Replace ADD DEVICE CONNECTION STRING HERE with the connection string
sudo systemctl restart iotedge

enter image description here

Deploy the Spatial Analysis container:

  • Deploy the container:

    • Use Azure CLI to deploy the container as an IoT Edge Module on the host computer.
   sudo az login
    sudo az extension add --name azure-iot
    sudo az iot edge set-modules --hub-name "<iothub-name>" --device-id "<device-name>" --content DeploymentManifest.json --subscription "<name or ID of Azure Subscription>"