Reading in lines of numeric strings into separate lists in python

Question

137 Views Asked by ASG At 28 May 2025 at 22:00

I have a CSV file containing multiple lines of numeric string values in the following format:

CSV sample of 2 lines:

[['ASA00211063', '2005'], [-0.434358, -0.793407, -1.070576, nan, nan,...(365 values)], [0.354615, -0.108102,nan,...(365 values)]]

[['AFR02516075', '1998'], [-0.434358, -0.7934039, -1.0705767, nan, nan,...(365 values)], [0.3546153, -0.1081022, nan,...(365 values)]]

How can I split and join the lines of the CSV file into lists, such that the output is:

list[0] = ['ASA00211063', '2005'], ['AFR02516075', '1998']...
list[1] = [-0.434358, -0.793407, -1.070576, nan, nan,..., 0.354615, -0.108102,nan,...(**730** values)]
list[2] = [-0.434358, -0.7934039, -1.0705767, nan, nan,..., 0.3546153, -0.1081022, nan,...(**730** values)]

There are 2 best solutions below

Alexander McFarlane On 07 June 2015 at 21:58

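To read a Python data structure from a text file, always use ast.literal_eval(). It only evaluates Python literals (strings, numbers, lists, tuples, dicts and similar), which prevents anyone from embedding anything nasty in an input file. A bare nan is a name rather than a literal, so it has to be quoted before the line is parsed.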
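
As a small illustration of that safety property, the sketch below feeds three strings to ast.literal_eval(). The input strings are made up for the example; the point is only that a plain literal is parsed, while a function call or a bare nan raises ValueError instead of being evaluated.

```python
import ast

# A plain literal (lists, strings, numbers) is parsed into Python objects.
parsed = ast.literal_eval("[['ASA00211063', '2005'], [-0.434358, -0.793407]]")

# Anything that is not a literal, such as a function call, is rejected.
try:
    ast.literal_eval("__import__('os').getcwd()")
    call_rejected = False
except ValueError:
    call_rejected = True

# A bare nan is a name, not a literal, so it is rejected as well.
try:
    ast.literal_eval("[0.5, nan]")
    nan_rejected = False
except ValueError:
    nan_rejected = True

print(parsed, call_rejected, nan_rejected)
```

This is why the nan tokens are wrapped in quotes before parsing: once quoted, they are ordinary string literals.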

The code below goes through each line of your input file and appends the parsed result to a list, which you can then process however you like.

import ast

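l = []
with open('inputfile.txt') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines, which literal_eval cannot parse
        # quote bare nan so that literal_eval accepts it
        edited_line = line.replace('nan', '"nan"')
        l.append(ast.literal_eval(edited_line))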

This will also replace all nan with numpy.nan objects:

import ast
from numpy import nan

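l = []
with open('inputfile.txt') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines, which literal_eval cannot parse
        # quote bare nan so that literal_eval accepts it
        edited_line = line.replace('nan', '"nan"')
        edited_line = ast.literal_eval(edited_line)
        # swap the 'nan' placeholder strings for numpy.nan
        edited_line = [[nan if v == 'nan' else v for v in vals] for vals in edited_line]
        l.append(edited_line)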

# combine elements [1] and [2] in the sublist to a list of len = 730
# element l[0] is list of ['code', 'yyyy']
# element l[1 ... n] is list of data by row of length 730
l = [[subl[0] for subl in l]] + [subl[1]+subl[2] for subl in l]

This gives the output:

>>> for row in l: print(row)
[['ASA00211063', '2005'], ['AFR02516075', '1998']]
[-0.434358, -0.793407, -1.070576, nan, nan, 0.354615, -0.108102, nan]
[-0.434358, -0.7934039, -1.0705767, nan, nan, 0.3546153, -0.1081022, nan]
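
The steps above can also be collected into one function. This is only a sketch under a few assumptions: parse_lines is a hypothetical helper name, it uses float('nan') from the standard library instead of numpy.nan, and it quotes nan with a word-boundary regex rather than str.replace so that an identifier that merely contains the letters "nan" is left alone. The sample lines are shortened stand-ins for the real 365-value rows.

```python
import ast
import math
import re

def parse_lines(lines):
    """Return [ids, row_1, row_2, ...] from an iterable of text lines."""
    ids, rows = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between records
        # Quote standalone nan tokens so literal_eval accepts them.
        quoted = re.sub(r'\bnan\b', '"nan"', line)
        key, first, second = ast.literal_eval(quoted)
        ids.append(key)
        # Join the two value lists and turn the 'nan' placeholders into floats.
        rows.append([float('nan') if v == 'nan' else v for v in first + second])
    return [ids] + rows

# Shortened, made-up sample in the same layout as the question's file.
sample = [
    "[['ASA00211063', '2005'], [-0.434358, nan], [0.354615, -0.108102]]",
    "",
    "[['AFR02516075', '1998'], [-0.7934039, nan], [nan, 0.3546153]]",
]
result = parse_lines(sample)
nan_count = sum(math.isnan(v) for v in result[1])
print(result[0], nan_count)
```

With a real file, the same function can be called as parse_lines(open('inputfile.txt')), since a file object is also an iterable of lines.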

**stevieb** · Accepted Answer

I think I satisfied your requirements with this code:

#!/usr/bin/python

import re

data = [[]]

with open('in') as f:
    for line in f:
        line = line.strip()
        # strip the outermost pair of brackets
        line = re.match(r'\[?(.*)\]', line).group(1)

        # split on the ", " that precedes each inner list
        res = re.split(r', (?=\[)', line)

        data[0].append(res[0])
        string = res[1] + res[2]
        data.append([string])

for i, v in enumerate(data):
    print("data[{}]:\n{}\n".format(i, v))

Input:

[['ASA00211063', '2005'], [-0.434358, -0.793407, -1.070576, nan, nan,...(365 values)], [0.354615, -0.108102,nan,...(365 values)]]
[['AFR02516075', '1998'], [-0.434358, -0.7934039, -1.0705767, nan, nan,...(365 values)], [0.3546153, -0.1081022, nan,...(365 values)]]
[['XXX02516075', '1998'], [-1.434358, -1.7934039, -1.1705767, nan, nan,...(365 values)], [0.7546153, -0.7081022, nan,...(365 values)]]

Output:

data[0]:
["['ASA00211063', '2005']", "['AFR02516075', '1998']", "['XXX02516075', '1998']"]

data[1]:
['[-0.434358, -0.793407, -1.070576, nan, nan,...(365 values)][0.354615, -0.108102,nan,...(365 values)]']

data[2]:
['[-0.434358, -0.7934039, -1.0705767, nan, nan,...(365 values)][0.3546153, -0.1081022, nan,...(365 values)]']

data[3]:
['[-1.434358, -1.7934039, -1.1705767, nan, nan,...(365 values)][0.7546153, -0.7081022, nan,...(365 values)]']
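
Note that the entries in data[1:] above are still strings, not lists of numbers. If numeric lists are needed, one possible follow-up step is sketched below. It assumes the real file contains only numbers and nan (no "...(365 values)" placeholders), and to_floats is a hypothetical helper name, not part of the answer's code.

```python
import math

def to_floats(joined):
    """Turn a string such as '[1.0, nan][2.0]' into a flat list of floats."""
    # Merge the two bracketed lists, then drop the outer brackets.
    flat = joined.replace('][', ',').strip('[]')
    # float() accepts surrounding whitespace and the string 'nan'.
    return [float(tok) for tok in flat.split(',') if tok.strip()]

# Shortened, made-up stand-in for the structure built by the loop above.
data = [
    ["['ASA00211063', '2005']"],
    ['[-0.434358, -0.793407, nan][0.354615, -0.108102,nan]'],
]
numeric = [to_floats(item[0]) for item in data[1:]]
nan_count = sum(math.isnan(v) for v in numeric[0])
print(numeric[0], nan_count)
```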
