How to fix: compairing result of a bigquery query to a list

125 Views Asked by At

I am a newbie in Python, I appreciate all help. I want a list of activities of a group, who purchased for 4$ this way: purchase_date(1,1,0,1,1,0,1) where purchase_date is the date of the purchase, and the arrays indeces+1 gives the days after the purchases. 1 means active day, 0 means not active day. E.g. 20190203(1,1,1,0,0,0,1) means the purchase was on 2019-02-03, and the user was active after that on 4th, 5th, 6th and 10th of February.

I tried the code below. Steps:

  1. Created a datatable with the purchases: four_dollar_buyers(user_pseudo_id,purchase_date). Queried it and loaded the result into the four_dollar_purchases list.
  2. Iterated over on four_dollar purchases
  3. Made 2 helper arrays: seven_days_date contains the dates after the purchases seven_days_number should contain ones and zeros (active or not in given day)
  4. Iterated over the seven_days_date, made a query from datatables of the given date, what gave the id-s of the active users on that day. Load the result of the query into a list named 'actives'
  5. If the user_id of the given purchase is in actives, then the seven_days_number array should change from 0 to 1 on the given index.
client = bigquery.Client(project="project")

QUERY = ('SELECT * FROM `project.four_dollar_buyers`')
query_job = client.query(QUERY)                             
four_dollar_purchases = list(query_job.result())                        

for row in four_dollar_purchases:                                       

  seven_days_date = ["","","","","","",""]                          
  seven_days_number = [0,0,0,0,0,0,0]                                   

  for i in range(7):
    date_time_obj = datetime.strptime(row[1], '%Y%m%d')                 
    date_time_obj = date_time_obj + timedelta(days=1)+timedelta(days=i)         
    seven_days_date[i] = date_time_obj.strftime("%Y%m%d")                   

  for idx, days in enumerate(seven_days_date):

    QUERY = ('''SELECT DISTINCT user_pseudo_id FROM 
    `project.events_'''+days+'''` WHERE event_name IN 
    ("activity_added")''')
    query_job = client.query(QUERY)
    actives = list(query_job.result())                          


  if row[0] in actives:                                 
    seven_days_number[idx] = 1                              


  print(row[1] + str(seven_days_number))

There is no error message anymore, but all result is like this 20181212(0,0,0,0,0,0,0). So for some reason the helper array does not change, after the purchase date it gives only zeros. I checked the variables row[0] and actives with pprint and both of them contains the right result.

2

There are 2 best solutions below

2
On BEST ANSWER
query_job = client.query(QUERY)
actives = list(query_job.result())

for dict in actives:
  if dict[0] == row[0]:
    seven_days_number[idx] = 1

print(row[1] + str(seven_days_number))
2
On

Days is not an integer type as ralaxpy has suggested. So, you can use enumeration or something else in order to modify the list using the index.