I have a list of strings called 'entries'. Each entry includes a date and time in a format like this: 'Mon Jun 15 17:52:03 2015'
I'm parsing the dates/times from each entry with regex, and then I need to put them into Python's datetime format and change the timezone to UTC (which is local time +4 hrs). Here's my code:
from datetime import datetime
import re
import pytz

local = pytz.timezone("Etc/GMT+4")
localdate = [None]*len(entries)
local_dt = [None]*len(entries)
utc_dt = [None]*len(entries)
utdate = [None]*len(entries)
for i in range(len(entries)):
    localdate[i] = datetime.strptime(re.search(r'\w{3}\s*?\w{3}\s*?\d{1,2}\s*?\d{1,2}:\d{2}:\d{2}\s*?\d{4}', entries[i]).group(0), "%c")
    local_dt[i] = local.localize(localdate[i], is_dst=None)
    utc_dt[i] = local_dt[i].astimezone(pytz.utc)
    utdate[i] = utc_dt[i].strftime("%c")
utdate = map(str, utdate)
print utdate
It seems to work well line-by-line if I go through and print each step, but once it gets to the last step it reverts back to the original format of the dates/times rather than the python datetime format of 'yyyy-mm-dd hh:mm:ss'. Anyone know what's wrong?
tl;dr
You're formatting the datetime object into a string with utdate[i] = utc_dt[i].strftime("%c"). The %c code formats the date according to the system's localization settings, not the format you're expecting.
The standard string representation of a datetime object will generate the format you're looking for: you can get a string from str(some_datetime), or use print(some_datetime) to print it to the console.
Timezones
Timezones are notoriously hard to keep track of, so it's worth double-checking which one you're using. You should know that the "Etc" timezones are labelled with the opposite sign from what most people expect (a POSIX convention): Etc/GMT+4 is 4 hours behind UTC, i.e. UTC-4. As is, your code will therefore take an input time and give an output time that's 4 hours later, which matches the "local time +4 hrs" you describe. If you ever need the other direction, the zone to use is Etc/GMT-4. It's a different question, but using a location-based timezone instead of a fixed UTC offset may be a good idea for things like DST support.
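Here's a rough sketch of the offset direction and the difference between str() and strftime(). To keep it runnable without extra packages, it assumes a fixed offset from the standard library (timezone(timedelta(hours=-4))) as a stand-in for pytz's "Etc/GMT+4". With pytz you would keep using local.localize(...) as in the question.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-4 offset stands in for pytz.timezone("Etc/GMT+4"),
# which is 4 hours *behind* UTC despite the "+4" in its name.
local = timezone(timedelta(hours=-4))

naive = datetime(2015, 6, 15, 17, 52, 3)      # 'Mon Jun 15 17:52:03 2015'
local_dt = naive.replace(tzinfo=local)        # attach the offset, no shift
utc_dt = local_dt.astimezone(timezone.utc)    # shift the clock time to UTC

print(str(utc_dt))                            # 2015-06-15 21:52:03+00:00
print(utc_dt.strftime("%Y-%m-%d %H:%M:%S"))   # 2015-06-15 21:52:03
print(utc_dt.strftime("%c"))                  # locale-dependent layout
```

Note that str() on a timezone-aware datetime appends the UTC offset, so an explicit strftime pattern is the safer choice if you want exactly 'yyyy-mm-dd hh:mm:ss'.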
Improvements
You can simplify and clarify what you're trying to do here with a few changes. It makes it a bit more "Pythonic" as well.
Changes
Use a strftime/strptime formatter. strftime and strptime are designed to format and parse date strings, so ordinarily regular expressions shouldn't be needed to process them first. The same goes for output formats: if a specific format is needed that's not provided by a built-in method like datetime.isoformat, use a formatter.
In Python there's no need to initialize a list to a given length ahead of time (or to fill it with None). list_var = [] or list_var = list() will give you an empty list that will expand on demand.
Typically it's best and simplest to just iterate over a list, rather than jump through hoops to get a loop counter. It's more readable, and ultimately less to remember.
for i, entry in enumerate(entries):
Use scoped variables. Temporary values like localdate and local_dt can just be kept inside the for loop. Technically, storing them in full-length lists wastes memory, but more importantly keeping them local makes the code simpler and more encapsulated.
If the values are needed for later, then do what I've done with the converted_entries list. Initialize it outside the loop, then just append the value to the list each time through.
No need for counter variables:
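Putting the changes above together, a minimal sketch could look like the following. It is an illustration, not a drop-in replacement. The sample entries, the INPUT_FORMAT/OUTPUT_FORMAT names and the fixed-offset LOCAL zone (a standard-library stand-in for pytz's "Etc/GMT+4") are all assumptions. It also assumes each entry is nothing but the timestamp. If your entries contain other text, you'd still need to extract the timestamp first, for example with the regex from the question.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical sample data; assumes each entry is exactly one timestamp.
entries = ["Mon Jun 15 17:52:03 2015", "Tue Jun 16 08:05:30 2015"]

INPUT_FORMAT = "%a %b %d %H:%M:%S %Y"    # explicit pattern instead of "%c"
OUTPUT_FORMAT = "%Y-%m-%d %H:%M:%S"      # the 'yyyy-mm-dd hh:mm:ss' layout
LOCAL = timezone(timedelta(hours=-4))    # stand-in for pytz "Etc/GMT+4"

converted_entries = []                   # grows on demand, no pre-sizing
for entry in entries:                    # iterate directly, no counter
    # Temporaries live only inside the loop body.
    local_dt = datetime.strptime(entry, INPUT_FORMAT).replace(tzinfo=LOCAL)
    utc_dt = local_dt.astimezone(timezone.utc)
    converted_entries.append(utc_dt.strftime(OUTPUT_FORMAT))

print(converted_entries)
# ['2015-06-15 21:52:03', '2015-06-16 12:05:30']
```

Parsing %a and %b depends on the current locale's day and month names, so this assumes the default English names.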
I hope that's helpful for you. The beauty of Python is that it can be pretty simple, so just embrace it.