I'm reading single-cell data from a file in h5ad format.
I read the file and then apply a filter with hardcoded thresholds, which works.
My goal is to build a function where the user supplies an h5ad file as input and then specifies the number of genes and the number of cells to use as filtering parameters for the single-cell data.
My code so far is this:
import loompy
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scanpy as sc
sc.settings.verbosity = 3 # verbosity: errors (0), warnings (1), info (2), hints (3)
sc.settings.set_figure_params(dpi=80, color_map='Greys')
sc.logging.print_versions()
#Reading the data
adata = sc.read_h5ad('/Single_cell/cerebellar_development/508cf892-174c-45ab-b2dc-05f54f1ee7ed/aldinger20.processed.h5ad')
adata
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'sample_id', 'percent.mito', 'S.Score', 'G2M.Score', 'Phase', 'CC.Difference', 'nCount_SCT', 'nFeature_SCT', 'age', 'figure_clusters', 'sex', 'type', 'experiment', 'fig_cell_type', 'n_genes', 'n_genes_by_counts', 'total_counts', 'total_counts_mt', 'pct_counts_mt'
var: 'name', 'n_cells', 'mt', 'n_cells_by_counts', 'mean_counts', 'pct_dropout_by_counts', 'total_counts'
obsm: 'X_pca', 'X_umap', 'X_tsne'
#Next Step is where I define the filter
sc.pp.filter_cells(adata, min_genes=500)
sc.pp.filter_genes(adata, min_cells=10)
print(adata.n_obs, adata.n_vars)
Once the above steps have run, the next step is to save the filtered data as an h5ad file.
To keep it simple, I want this whole process to be automated: it should take the input file from the user, apply the filter the user defines, and save the filtered output.
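Something along these lines is what I have in mind. This is only a minimal sketch, not tested on my data: the names `filter_h5ad`, `parse_args`, `main` and the `--min-genes` / `--min-cells` options are ones I made up for illustration, and the defaults are just the hardcoded values from above. It relies on the same scanpy calls I already use, plus `adata.write_h5ad` for saving.

```python
import argparse


def filter_h5ad(input_path, output_path, min_genes=500, min_cells=10):
    """Read an .h5ad file, filter cells and genes, and save the result."""
    # Imported inside the function so the argument parsing below can be
    # used without loading scanpy (an illustrative choice, not a requirement).
    import scanpy as sc

    adata = sc.read_h5ad(input_path)
    print("Before filtering:", adata.n_obs, "cells,", adata.n_vars, "genes")

    # Keep cells with at least `min_genes` detected genes, then keep
    # genes detected in at least `min_cells` of the remaining cells.
    sc.pp.filter_cells(adata, min_genes=min_genes)
    sc.pp.filter_genes(adata, min_cells=min_cells)
    print("After filtering:", adata.n_obs, "cells,", adata.n_vars, "genes")

    # Save the filtered object in h5ad format.
    adata.write_h5ad(output_path)
    return adata


def parse_args(argv=None):
    """Parse the (hypothetical) command-line options for the filter."""
    parser = argparse.ArgumentParser(
        description="Filter a single-cell .h5ad file by gene and cell counts."
    )
    parser.add_argument("input", help="path to the input .h5ad file")
    parser.add_argument("output", help="path for the filtered .h5ad file")
    parser.add_argument("--min-genes", type=int, default=500,
                        help="minimum number of genes detected per cell")
    parser.add_argument("--min-cells", type=int, default=10,
                        help="minimum number of cells a gene is detected in")
    return parser.parse_args(argv)


def main(argv=None):
    """Entry point: parse the options, then filter and save the file."""
    args = parse_args(argv)
    return filter_h5ad(args.input, args.output, args.min_genes, args.min_cells)
```

The idea would be to call `main()` at the end of a script so the file paths and thresholds come from the command line, or to call `filter_h5ad(...)` directly from a notebook with whatever `min_genes` and `min_cells` the user wants.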