ValueError with file extension

221 Views Asked by CeDeR At 02 May 2023 at 05:42

I downloaded a raw data set from GEO (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE92332), which contains single-cell analysis data. It comes as three kinds of files: matrix.mtx.gz, barcodes.tsv.gz and genes.tsv.gz.

I then tried to load the data with this code:

# Load data
import scanpy as sc

data_file = "/Users/---/desktop/single-cell-tutorial/latest_notebook/GSE92332_RAW"
adata = sc.read(data_file, cache=True)
adata = adata.transpose()
adata.X = adata.X.toarray()

But I always get the following ValueError:

ValueError: Reading with filekey '/Users/---/desktop/single-cell-tutorial/latest_notebook/GSE92332_RAW/MTX/mtx.gv' failed, the inferred filename PosixPath('/Users/---/desktop/single-cell-tutorial/latest_notebook/GSE92332_RAW/MTX/mtx.gv.h5ad') does not exist. If you intended to provide a filename, either use a filename ending on one of the available extensions {'csv', 'data', 'tab', 'h5ad', 'anndata', 'h5', 'tsv', 'xlsx', 'loom', 'txt', 'mtx.gz', 'soft.gz', 'mtx'} or pass the parameter ext.

I understand that I need to add an extension, but whichever extension I add, I still get the same error.

I tried every extension that matches one of the file types (mtx.gz etc.), and I also created a separate folder containing only the MTX data and pointed the code at that, but nothing works.

There is 1 best solution below

merv On 02 May 2023 at 17:07 BEST ANSWER

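The scanpy.read function reads a single file (.h5ad by default; the other supported extensions are listed in the error message), not a folder of CellRanger output. To load raw CellRanger MTX data, use the scanpy.read_10x_mtx function instead, e.g.: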

import scanpy as sc

data_file = "path/to/GSE92332_RAW"
adata = sc.read_10x_mtx(data_file, cache=True)

As commented, the .mtx and .tsv files likely need to be unzipped (run gzip -d *.gz from the command line while in the folder). This is a quirk of scanpy: data with genes.tsv (pre-v3 CellRanger output) must be unzipped, whereas data with features.tsv (v3+ CellRanger output) can stay zipped. At least that's what the code shows.
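
If you would rather do the unzipping from Python than from the shell, here is a minimal sketch using only the standard library. The helper name gunzip_folder is made up for illustration. Unlike gzip -d, it leaves the original .gz files in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_folder(folder):
    """Decompress every *.gz file in `folder`, keeping the originals.

    Hypothetical helper (not part of scanpy). Returns the written paths.
    """
    written = []
    for gz_path in sorted(Path(folder).glob("*.gz")):
        # with_suffix("") drops only the trailing ".gz",
        # so "matrix.mtx.gz" becomes "matrix.mtx".
        out_path = gz_path.with_suffix("")
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(out_path)
    return written
```

Call it as gunzip_folder("path/to/GSE92332_RAW") before calling sc.read_10x_mtx on the same folder.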

Since this data set appears to contain many runs, you may also need the prefix argument to select the particular run you want to load.
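
To see which prefixes are available, something like the following sketch may help. It assumes each run's files share a common prefix in front of matrix.mtx (or matrix.mtx.gz), and I have not checked how the files in this particular download are named. The helper names find_run_prefixes and load_run are made up for illustration; prefix is a real parameter of sc.read_10x_mtx.

```python
from pathlib import Path


def find_run_prefixes(folder):
    """Return the sorted prefixes of files ending in matrix.mtx or matrix.mtx.gz.

    Hypothetical helper; assumes one shared prefix per run.
    """
    prefixes = set()
    for path in Path(folder).iterdir():
        for ending in ("matrix.mtx", "matrix.mtx.gz"):
            if path.name.endswith(ending):
                prefixes.add(path.name[: -len(ending)])
    return sorted(prefixes)


def load_run(folder, prefix):
    """Load a single run with scanpy (assumes scanpy is installed)."""
    import scanpy as sc  # imported here so find_run_prefixes works without scanpy

    return sc.read_10x_mtx(folder, prefix=prefix, cache=True)
```

For example, find_run_prefixes("path/to/GSE92332_RAW") lists the candidate prefixes, and load_run("path/to/GSE92332_RAW", prefix) loads the one you pick.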
