How can I get the timezone hours for each of 2000 rows in a dataframe?

I have tried many methods to get what I want, but none of them work.

I have a dataframe with 2000 rows that contains latitude and longitude columns. I want to get the timezone hours for each row in a new column, timezone_hours. The issue: I used TimezoneFinder, but my code runs for 10 minutes for just 2000 rows. That is not normal, and I don't know how to write the function so that it gets the timezone hours for each latitude and longitude in 2 minutes or less.

If you have any ideas or code, please share them with me. Thank you.

Example:
Input data (keep in mind that the real input has 2000 rows, not 4):

import pandas as pd

data = [['23', 35.5, 30.6],
        ['12', 30.6, 31.3444],
        ['15', 35.9, 24.5],
        ['14', 40.5, 38.6]]

# Dataframe columns
columns = ['key', 'latitude', 'longitude']

# Create the dataframe
df = pd.DataFrame(data, columns=columns)


# Get timezone
# (here I use TimezoneFinder and other code)

The output should be a dataframe like this (again, the real data has 2000 rows, not 4):

data = [['23', 35.5, 30.6, +3],
        ['12', 30.6, 31.3444, -6],
        ['15', 35.9, 24.5, +7],
        ['14', 40.5, 38.6, +5]]

columns = ['key', 'latitude', 'longitude', 'timezone_hours']

TheHungryCub On BEST ANSWER

This is what I have tried:

from timezonefinder import TimezoneFinder

# Create the finder once; building a new one for every row is slow
tf = TimezoneFinder()

# Function to get the timezone name for a single row
def get_timezone(lat, lon):
    return tf.timezone_at(lng=lon, lat=lat)

# Apply the function to each row
df['timezone'] = df.apply(lambda row: get_timezone(row['latitude'], row['longitude']), axis=1)

# Extract the number embedded in the timezone name, if there is one
# (names such as 'Africa/Cairo' contain no digits, so they get no value here)
df['timezone_hours'] = df['timezone'].str.extract(r'([+-]?\d+)', expand=False)

# Set a default timezone for rows where timezone is missing or has no offset
default_timezone = 'Etc/GMT'
df['timezone'] = df['timezone'].fillna(default_timezone)
df['timezone_hours'] = df['timezone_hours'].fillna(0)  # Set hours to 0 when no number was found

# Display the dataframe
print(df)

Output:

  key  latitude  longitude         timezone timezone_hours
0  23      35.5    30.6000        Etc/GMT-2             -2
1  12      30.6    31.3444     Africa/Cairo              0
2  15      35.9    24.5000        Etc/GMT-2             -2
3  14      40.5    38.6000  Europe/Istanbul              0
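
Note that the number extracted from the zone name is not the real UTC offset. Named zones such as Africa/Cairo and Europe/Istanbul carry no number at all, and the Etc/GMT names use an inverted sign, so Etc/GMT-2 is two hours ahead of UTC. A minimal sketch of one way to turn a zone name into offset hours, assuming Python 3.9+ with the standard-library zoneinfo module and an available tz database; the helper name utc_offset_hours and the reference instant are hypothetical choices for illustration, not part of the answer above:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library, Python 3.9+


def utc_offset_hours(tz_name, when=None):
    """Return the UTC offset of an IANA zone, in hours, at the instant `when`."""
    if when is None:
        # Offsets change with daylight saving time, so default to "now"
        when = datetime.now(timezone.utc)
    # Convert the instant into the zone, then read its offset from UTC
    offset = when.astimezone(ZoneInfo(tz_name)).utcoffset()
    return offset.total_seconds() / 3600


# Fixed reference instant so the result is reproducible (hypothetical choice)
reference = datetime(2024, 1, 15, 12, 0, tzinfo=timezone.utc)

for name in ["Etc/GMT-2", "Africa/Cairo", "Europe/Istanbul"]:
    print(name, utc_offset_hours(name, reference))
```

With the dataframe above, this could be applied as df['timezone_hours'] = df['timezone'].map(utc_offset_hours), once rows with a missing timezone have been filled or dropped. Because the offset depends on the date, passing an explicit instant gives stable results.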