Create URLs for different data frames

I have a data frame that I split into chunks of 100 rows each so that Python can process it. This gives me several data frames (df1 to df..). For each of them, I want to create a URL as shown below.

When I call type(df), it reports a data frame. However, when I run for j in dfs: print(type(j)), it reports a string. I need the data frame itself to build the URL.

Could you please show me what a loop that creates the URLs for all the data frames could look like?

Thank you so much for your help!

df = pd.DataFrame.from_dict(pd.json_normalize(tweets_data), orient='columns')


n = 100  #chunk row size
list_df = [df[i:i+n] for i in range(0,df.shape[0],n)]

dfs = {}
for idx,df in enumerate(list_df, 1):
    dfs[f'df{idx}'] = df

type(df1)

for j in dfs:
    print(type(j))

def create_url():
    url = "https://api.twitter.com/2/tweets?{}&{}".format("ids=" + (str(str((df1['id'].tolist()))[1:-1])).replace(" ", ""), tweet_fields)
    return url

Answer by furas:

dfs is a dictionary, so for j in dfs: gives you only the keys, which are strings.

You need .values()

for j in dfs.values(): 

or .items()

for key, j in dfs.items():

or you have to use dfs[j]

for j in dfs:
    print(type( dfs[j] ))
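
To make the difference concrete, here is a minimal sketch that builds the same kind of dictionary and compares what each way of iterating yields. The sample frame with 250 ids is only a stand-in for the real tweet data, and the names key_types and value_types are made up for this illustration.

```python
import pandas as pd

# Stand-in data (assumption): the real frame is built from tweets_data.
df = pd.DataFrame({'id': range(250)})

n = 100  # chunk row size

# Same structure as in the question: keys 'df1', 'df2', ... map to chunks.
dfs = {
    f'df{idx}': df[i:i + n]
    for idx, i in enumerate(range(0, df.shape[0], n), 1)
}

# Iterating over the dictionary itself yields only the keys (strings).
key_types = [type(key).__name__ for key in dfs]

# Iterating over .values() yields the data frames.
value_types = [type(chunk).__name__ for chunk in dfs.values()]

print(key_types)
print(value_types)
```

The first list contains only str and the second only DataFrame, which is why type(j) in the original loop reported a string.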

EDIT:

Frankly, you could do it all in one loop without list_df


import pandas as pd

#df = pd.DataFrame.from_dict(pd.json_normalize(tweets_data), orient='columns')
df = pd.DataFrame({'id': range(1000)})

tweet_fields = 'something'

n = 100  #chunk row size

for i in range(0, df.shape[0], n):
    ids = df[i:i+n]['id'].tolist()
    ids_str = ','.join(str(x) for x in ids)
    url = "https://api.twitter.com/2/tweets?ids={}&{}".format(ids_str, tweet_fields)
    print(url)

You can also use groupby on the index, provided the index is the default 0, 1, 2, ...

for i, group in df.groupby(df.index//100):
    ids = group['id'].tolist()
    ids_str = ','.join(str(x) for x in ids)
    url = "https://api.twitter.com/2/tweets?ids={}&{}".format(ids_str, tweet_fields)
    print(url)
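
If you prefer to keep the dfs dictionary and the create_url function from the question, one possible variant is to pass the chunk in as an argument instead of hard-coding df1. This is only a sketch: the sample data, the value of tweet_fields, and the urls dictionary are assumptions for illustration, not something taken from your code.

```python
import pandas as pd


def create_url(chunk, tweet_fields):
    """Build the lookup URL for the ids in one chunk (a data frame)."""
    ids_str = ','.join(str(x) for x in chunk['id'].tolist())
    return "https://api.twitter.com/2/tweets?ids={}&{}".format(ids_str, tweet_fields)


# Stand-in data (assumption): the real frame is built from tweets_data.
df = pd.DataFrame({'id': range(250)})

# Assumed value; use whatever query string you already have.
tweet_fields = 'tweet.fields=created_at'

n = 100  # chunk row size
dfs = {
    f'df{idx}': df[i:i + n]
    for idx, i in enumerate(range(0, df.shape[0], n), 1)
}

# One URL per chunk, stored under the same key as the chunk.
urls = {key: create_url(chunk, tweet_fields) for key, chunk in dfs.items()}

for key, url in urls.items():
    print(key, url[:60])
```

Passing the chunk as a parameter means the function no longer depends on a global variable name such as df1, so the same function works for every entry in dfs.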