cannot import name 'dataproc_v1' from 'google.cloud' (unknown location)


I am trying to use Dataproc from a Jupyter Notebook on my own computer. I installed the required libraries with pip, but I get an error on import:

import google.cloud.dataproc_v1

Error is as follows:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-3-fc8862c62c75> in <module>
----> 1 import google.cloud.dataproc_v1

I also tried installing the package with python3 -m pip install google-cloud-dataproc. For reference, here is the output of pip list. Any suggestions or help would be appreciated!

Package                  Version
------------------------ ---------
cachetools               4.1.1
certifi                  2020.6.20
chardet                  3.0.4
google-api-core          1.22.2
google-auth              1.21.1
google-cloud-dataproc    2.0.0
googleapis-common-protos 1.52.0
grpcio                   1.32.0
idna                     2.10
libcst                   0.3.10
mypy-extensions          0.4.3
pip                      20.2.2
proto-plus               1.9.1
protobuf                 3.13.0
pyasn1                   0.4.8
pyasn1-modules           0.2.8
pytz                     2020.1
PyYAML                   5.3.1
requests                 2.24.0
rsa                      4.6
setuptools               45.0.0
six                      1.15.0
typing-extensions        3.7.4.3
typing-inspect           0.6.0
urllib3                  1.25.10
wheel                    0.35.1

There are 3 best solutions below


Can you confirm that you are running a Jupyter notebook on Dataproc, and which Dataproc version you are using?

I have tested the following code in a Dataproc notebook on Dataproc version 1.5:

from google.cloud import dataproc_v1

project_id = 'project'
region = 'us-central1'

# Use the regional endpoint that matches the clusters' region.
cluster_client = dataproc_v1.ClusterControllerClient(
    client_options={"api_endpoint": "{}-dataproc.googleapis.com:443".format(region)}
)

# Print the name of every cluster in the given project and region.
for cluster in cluster_client.list_clusters(request={"project_id": project_id, "region": region}):
    print(cluster.cluster_name)

If you are installing from within the notebook, run this in a cell. Note that pip expects the PyPI package name (google-cloud-dataproc), not the import path (google.cloud.dataproc_v1):

! pip install google-cloud-dataproc

If you get an error due to missing write access, try it with the --user option, i.e.

! pip install google-cloud-dataproc --user

Restart the kernel and try importing the library again.
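After restarting, it can help to check which interpreter the kernel is actually running and whether that interpreter can see the module. The snippet below is a minimal sketch of such a check; the helper name module_available is made up for illustration and is not part of any Google library.

```python
import importlib.util
import sys


def module_available(dotted_name: str) -> bool:
    """Return True if the current interpreter can locate the module."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except ModuleNotFoundError:
        # find_spec raises (instead of returning None) when a parent
        # package such as 'google.cloud' is itself missing.
        return False


# The path printed here is the Python that the notebook kernel uses.
print("Kernel interpreter:", sys.executable)
print("dataproc_v1 importable:", module_available("google.cloud.dataproc_v1"))
```

If the interpreter path differs from the python3 used on the command line, the package was probably installed into a different environment than the one the kernel imports from.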


Try installing the client library under its PyPI package name:

pip install google-cloud-dataproc
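A bare pip on the command line may belong to a different Python than the one the notebook kernel runs. One way to rule that out, sketched here under that assumption, is to build the command from sys.executable so pip targets the kernel's own interpreter. The helper name pip_install_command is hypothetical.

```python
import shlex
import sys


def pip_install_command(package, user=False):
    """Build a pip command that targets the interpreter running this kernel."""
    cmd = [sys.executable, "-m", "pip", "install", package]
    if user:
        # --user installs into the per-user site-packages directory.
        cmd.append("--user")
    return cmd


command = pip_install_command("google-cloud-dataproc")
# Show the exact command; nothing is installed by this snippet.
print(shlex.join(command))
# To run it from a notebook cell, pass the list to subprocess.check_call(command).
```

Restart the kernel after installing so the new package is picked up.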