sFrame into scipy.sparse csr_matrix

I have an SFrame like:

x = sf.SFrame({'users': [{'123': 1.0, '122': 5},
                         {'134': 3.0, '123': 10}]})

I want to convert it into a scipy.sparse csr_matrix without invoking GraphLab Create, using only sframe and Python.

How can I do this?

Best answer

Assuming you want the row number to be the row index in the output sparse matrix, the only tricky step is using SFrame.stack - from there you should be able to construct a csr_matrix directly.

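import sframe as sf
from scipy.sparse import csr_matrix

x = sf.SFrame({'users': [{'123': 1.0, '122': 5},
                         {'134': 3.0, '123': 10}]})
# Record each row's position before stacking, to use as the matrix row index.
x = x.add_row_number('row_id')
# Stack the dict column: one output row per (key, value) pair, in columns X2 and X3.
x = x.stack('users')
# The dict keys are strings, so cast them to int before using them as column indices.
A = csr_matrix((x['X3'], (x['row_id'], x['X2'].astype(int))),
               shape=(2, 135))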

I'm also hard-coding the dimensions of the matrix here, but that's probably something you'd want to figure out programmatically.
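As a rough sketch of how the shape could be derived rather than hard-coded, here is the same flattening done in plain Python on a list of dicts, without sframe. The helper name dicts_to_coo_triplets is hypothetical (it is not part of sframe or scipy), and it assumes every dict key is a string holding a non-negative integer, as in the question.

```python
def dicts_to_coo_triplets(rows):
    """Flatten a list of {column_key: value} dicts into COO-style triplets.

    Hypothetical helper for illustration only. Assumes each key is a string
    holding a non-negative integer column index.
    """
    data, row_idx, col_idx = [], [], []
    for i, row in enumerate(rows):
        for key, value in row.items():
            row_idx.append(i)            # position of the dict = matrix row
            col_idx.append(int(key))     # string key -> integer column
            data.append(float(value))
    n_rows = len(rows)
    # Columns are zero-based, so the width is the largest index plus one.
    n_cols = max(col_idx) + 1 if col_idx else 0
    return data, row_idx, col_idx, (n_rows, n_cols)


users = [{'123': 1.0, '122': 5}, {'134': 3.0, '123': 10}]
data, row_idx, col_idx, shape = dicts_to_coo_triplets(users)
print(shape)  # (2, 135)
```

With scipy available, csr_matrix((data, (row_idx, col_idx)), shape=shape) should then build the same matrix as above. In the SFrame version, the equivalent numbers would presumably come from x.num_rows() taken before stacking and x['X2'].astype(int).max() + 1 taken after it, though I haven't run that variant here.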