I'm working on a Python script that uses the TMDB (The Movie Database) API to fetch movie information. However, I'm encountering a TypeError when trying to access the 'id' attribute from the API response. I'm using the TMDBv3 API wrapper in my script.
Here's the relevant code:
import pandas as pd
import numpy as np
import requests
import bs4 as bs
import urllib.request
## Extracting features of 2020 movies from Wikipedia
link = "https://en.wikipedia.org/wiki/List_of_American_films_of_2020"
source = urllib.request.urlopen(link).read()
soup = bs.BeautifulSoup(source,'lxml')
tables = soup.find_all('table',class_='wikitable sortable')
len(tables)
type(tables[0])
from io import StringIO
# Assuming 'tables' is a list containing HTML tables
df1 = pd.read_html(StringIO(str(tables[0])))[0]
df2 = pd.read_html(StringIO(str(tables[1])))[0]
df3 = pd.read_html(StringIO(str(tables[2])))[0]
# Replace "1" with '1"'
df4 = pd.read_html(StringIO(str(tables[3]).replace("'1\"\'",'"1"')))[0]
df = df1._append(df2._append(df3._append(df4,ignore_index=True),ignore_index=True),ignore_index=True)
df
df_2020 = df[['Title','Cast and crew']]
df_2020
!pip install tmdbv3api
from tmdbv3api import TMDb
import json
import requests
tmdb = TMDb()
tmdb.api_key = 'API_KEY'
import numpy as np
import requests
from tmdbv3api import Movie
tmdb_movie = Movie()
def get_genre(x):
    genres = []
    result = tmdb_movie.search(x)
    if not result or not hasattr(result[0], 'id'):
        return np.NaN
    movie_id = result[0].id
    response = requests.get('https://api.themoviedb.org/3/movie/{}?api_key={}'.format(movie_id, tmdb_movie.api_key))
    data_json = response.json()
    if 'genres' in data_json and data_json['genres']:
        genre_str = " "
        for i in range(0, len(data_json['genres'])):
            genres.append(data_json['genres'][i]['name'])
        return genre_str.join(genres)
    return np.NaN
df_2020['genres'] = df_2020['Title'].map(lambda x: get_genre(str(x)))
Error Message: ---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-56-ec558471f2dd> in <module>
----> 1 df_2020['genres'] = df_2020['Title'].map(lambda x: get_genre(str(x)))
c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\series.py in map(self, arg, na_action)
4542 dtype: object
4543 """
-> 4544 new_values = self._map_values(arg, na_action=na_action)
4545 return self._constructor(new_values, index=self.index, copy=False).__finalize__(
4546 self, method="map"
c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\base.py in _map_values(self, mapper, na_action, convert)
921 elif isinstance(arr, ExtensionArray):
922 # dispatch to ExtensionArray interface
--> 923 new_values = map_array(arr, mapper, na_action=na_action, convert=convert)
924
925 else:
c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\core\algorithms.py in map_array(arr, mapper, na_action, convert)
1814 return lib.map_infer(values, mapper, convert=convert)
1815 else:
-> 1816 return lib.map_infer_mask(
1817 values, mapper, mask=isna(values).view(np.uint8), convert=convert
1818 )
c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\pandas\_libs\lib.pyx in pandas._libs.lib.map_infer()
<ipython-input-56-ec558471f2dd> in <lambda>(x)
----> 1 df_2020['genres'] = df_2020['Title'].map(lambda x: get_genre(str(x)))
<ipython-input-53-5e28b0f3e7db> in get_genre(x)
7 if not result:
8 return np.NaN
----> 9 movie_id = result[0].id
10 response = requests.get('https://api.themoviedb.org/3/movie/{}?api_key={}'.format(movie_id,tmdb.api_key))
11 data_json = response.json()
c:\users\dell\appdata\local\programs\python\python39\lib\site-packages\tmdbv3api\as_obj.py in __getitem__(self, key)
47 return getattr(self, key)
48 else:
---> 49 return self._obj_list[key]
50
51 def __iter__(self):
TypeError: getattr(): attribute name must be string
I suspect that the structure of the response object might be causing the issue.
How can I modify the code to handle the TypeError and ensure I'm correctly accessing the 'id' attribute from the TMDB API response? Are there any additional checks I should perform on the response object to avoid such errors?
The issue seems to be with the way you're accessing the 'id' attribute from the TMDB API response. Instead of relying on the hasattr() function, you can check directly whether the 'id' key exists in the first search result.
To avoid the TypeError, change get_genre so that it tests for 'id' with the in operator and, if the key is present, reads it as result[0]['id'] rather than going through hasattr(). Note that the traceback points at result[0] itself: the error is raised inside the wrapper's __getitem__, and the message suggests that getattr() was called with the integer index 0 instead of a string. The indexing step therefore needs to be guarded as well, for example with a try/except around it that returns NaN when the first result cannot be read.
Similarly, check for the presence of the 'genres' key in the data_json object before accessing its value. This ensures that the code doesn't break if the 'genres' key is missing in the response.
With these modifications, you should be able to handle the TypeError and properly access the 'id' attribute from the TMDB API response.
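The checks described above can be sketched roughly as follows. This is a minimal illustration, not the wrapper's documented usage: the helper names first_movie_id, genres_to_string and _field are hypothetical, and the idea that some wrapper versions return a container with a "results" field rather than a plain list is an assumption. The search and details calls are passed in as functions so the logic can be tried without an API key.

```python
import math


def _field(obj, name):
    """Read `name` from a dict key or an object attribute; None if missing."""
    if isinstance(obj, dict):
        return obj.get(name)
    return getattr(obj, name, None)


def first_movie_id(search_result):
    """Return the id of the first search hit, or None if it cannot be read."""
    # Assumption: the hits may be wrapped in a "results" field; otherwise
    # treat the search result itself as the sequence of hits.
    hits = _field(search_result, "results")
    if hits is None:
        hits = search_result
    try:
        first = hits[0]
    except (TypeError, IndexError, KeyError):
        # Covers an empty result, None, and wrappers whose __getitem__
        # raises TypeError for integer indexes (as in the traceback).
        return None
    return _field(first, "id")


def genres_to_string(data_json):
    """Join genre names with spaces; NaN when 'genres' is missing or empty."""
    genres = data_json.get("genres") if isinstance(data_json, dict) else None
    if not genres:
        return math.nan
    return " ".join(g["name"] for g in genres if "name" in g)


def get_genre(title, search, fetch_details):
    """Look up `title` and return its genres as one string, or NaN."""
    movie_id = first_movie_id(search(title))
    if movie_id is None:
        return math.nan
    return genres_to_string(fetch_details(movie_id))
```

In the original script, search would be tmdb_movie.search and fetch_details would wrap the existing requests.get(...).json() call against the /3/movie/{id} endpoint. The column could then be filled with df_2020['Title'].map(lambda x: get_genre(str(x), search, fetch_details)).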