sceleto.network

Correlation-based Gene Network

sceleto.network.compute_corr(adata: AnnData, gene: str, label: str | None = None, layer: str | None = None, chunk_size: int = 4096) -> DataFrame

Pearson correlation of gene against all genes in adata.

Parameters

adata

Input AnnData.

gene

Gene of interest; must be in adata.var_names.

label

Column prefix for the output columns. If None, falls back to adata.uns["label"], then to "sample".

layer

Layer to use instead of adata.X.

chunk_size

Genes processed per chunk (memory control).

Returns

pd.DataFrame

Columns: gene, {label}_corr, {label}_pval.
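The chunking described for chunk_size can be illustrated with plain NumPy. This is a minimal sketch under stated assumptions, not the sceleto implementation: the helper name pearson_one_vs_all is hypothetical, the input is assumed to be a dense cells x genes array, and p-values are left out.

```python
import numpy as np

def pearson_one_vs_all(X, gene_idx, chunk_size=4096):
    """Pearson r between column `gene_idx` and every column of X (cells x genes).

    Hypothetical helper for illustration; columns are processed in chunks so
    that only `chunk_size` centered columns are held in memory at once.
    """
    X = np.asarray(X, dtype=np.float64)
    n_genes = X.shape[1]
    target = X[:, gene_idx] - X[:, gene_idx].mean()
    target_norm = np.sqrt((target ** 2).sum())
    out = np.full(n_genes, np.nan)
    for start in range(0, n_genes, chunk_size):
        block = X[:, start:start + chunk_size]
        centered = block - block.mean(axis=0)
        norms = np.sqrt((centered ** 2).sum(axis=0))
        denom = target_norm * norms
        r = np.full(block.shape[1], np.nan)
        ok = denom > 0  # constant columns have zero norm and stay NaN
        r[ok] = (target @ centered)[ok] / denom[ok]
        out[start:start + block.shape[1]] = r
    return out
```

A smaller chunk_size lowers peak memory at the cost of more loop iterations; the result does not depend on it.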

sceleto.network.build_corr_matrix(adatas: dict[str, AnnData], gene: str, layer: str | None = None, chunk_size: int = 4096) -> DataFrame

Compute per-condition correlation for gene across multiple AnnData objects.

Parameters

adatas

{label: AnnData} mapping. The key is used as the column prefix.

gene

Gene of interest.

layer

Layer to use instead of adata.X.

chunk_size

Passed to compute_corr().

Returns

pd.DataFrame

Wide table: gene + {label}_corr + {label}_pval per condition.
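The shape of the wide table can be sketched with pandas alone. The example below assumes the per-condition tables are combined with an outer join on the gene column; how sceleto actually aligns conditions is not stated here, and the helper name and toy values are hypothetical.

```python
from functools import reduce

import pandas as pd

def merge_condition_tables(tables):
    """Join per-condition tables on the shared `gene` column (outer join assumed)."""
    return reduce(lambda left, right: left.merge(right, on="gene", how="outer"), tables)

# Toy per-condition tables in the documented {label}_corr / {label}_pval layout.
ctrl = pd.DataFrame({"gene": ["A", "B"], "ctrl_corr": [0.9, 0.1], "ctrl_pval": [0.01, 0.80]})
stim = pd.DataFrame({"gene": ["B", "C"], "stim_corr": [0.4, 0.7], "stim_pval": [0.20, 0.03]})
wide = merge_condition_tables([ctrl, stim])
```

With an outer join, a gene missing from one condition keeps its row and gets NaN in that condition's columns.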

sceleto.network.select_top_genes(corr_df: DataFrame, top_n: int = 10, conditions: list[str] | None = None, exclude_gene: str | None = None) -> DataFrame

Select the top top_n positively correlated genes per condition.

Parameters

corr_df

Wide table from build_corr_matrix() or load_corr_db().

top_n

Number of top genes to keep per condition.

conditions

Subset of condition labels (column prefix, i.e. without _corr). If None, all *_corr columns are used.

exclude_gene

Gene name to exclude (typically the GOI itself). Removes rows where gene == exclude_gene or corr >= 1.0.

Returns

pd.DataFrame

Long-form: condition, gene, corr, pval.
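The selection rule described above can be sketched in pandas. This is an illustrative reimplementation under stated assumptions, not the library code: the function name top_genes_long is hypothetical, and ties and NaN handling may differ from sceleto.

```python
import pandas as pd

def top_genes_long(corr_df, top_n=10, conditions=None, exclude_gene=None):
    """Long-form table of the top_n most positively correlated genes per condition."""
    if conditions is None:
        # Derive condition labels from every *_corr column.
        conditions = [c[: -len("_corr")] for c in corr_df.columns if c.endswith("_corr")]
    parts = []
    for cond in conditions:
        sub = pd.DataFrame({
            "condition": cond,
            "gene": corr_df["gene"],
            "corr": corr_df[f"{cond}_corr"],
            "pval": corr_df[f"{cond}_pval"],
        })
        if exclude_gene is not None:
            # Drop the gene of interest and any perfect self-correlation.
            sub = sub[(sub["gene"] != exclude_gene) & (sub["corr"] < 1.0)]
        parts.append(sub.nlargest(top_n, "corr"))
    return pd.concat(parts, ignore_index=True)

corr_df = pd.DataFrame({
    "gene": ["GOI", "A", "B", "C"],
    "x_corr": [1.0, 0.8, 0.2, 0.5],
    "x_pval": [0.0, 0.01, 0.5, 0.04],
    "y_corr": [1.0, 0.1, 0.9, 0.3],
    "y_pval": [0.0, 0.7, 0.001, 0.2],
})
top = top_genes_long(corr_df, top_n=2, exclude_gene="GOI")
```

The same gene can appear under several conditions, which is why the output is long-form rather than one row per gene.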

sceleto.network.build_feature_matrix(top_genes_df: DataFrame, corr_df: DataFrame) -> DataFrame

Build a gene × conditions correlation matrix for network construction.

Parameters

top_genes_df

Long-form output of select_top_genes().

corr_df

Wide table from build_corr_matrix().

Returns

pd.DataFrame

Index = unique genes, columns = condition labels, values = corr (NaN filled with 0.0).
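A rough pandas equivalent of the documented output is shown below. It assumes that every *_corr column of corr_df becomes a condition column and that gene names in corr_df are unique; the helper name feature_matrix_from and the toy data are hypothetical.

```python
import pandas as pd

def feature_matrix_from(top_genes_df, corr_df):
    """Gene x condition matrix of correlations for the selected genes."""
    genes = pd.unique(top_genes_df["gene"])  # keeps first-seen order
    corr_cols = [c for c in corr_df.columns if c.endswith("_corr")]
    fm = corr_df.set_index("gene")[corr_cols].reindex(genes)
    fm.columns = [c[: -len("_corr")] for c in corr_cols]  # strip the suffix
    return fm.fillna(0.0)

top = pd.DataFrame({
    "condition": ["x", "y", "y"],
    "gene": ["A", "B", "A"],
    "corr": [0.8, 0.9, 0.4],
    "pval": [0.01, 0.001, 0.1],
})
corr = pd.DataFrame({
    "gene": ["A", "B", "C"],
    "x_corr": [0.8, float("nan"), 0.1],
    "x_pval": [0.01, float("nan"), 0.6],
    "y_corr": [0.4, 0.9, 0.2],
    "y_pval": [0.1, 0.001, 0.3],
})
fm = feature_matrix_from(top, corr)
```

Filling NaN with 0.0 treats a missing correlation as "no correlation", which keeps the distance computation in the next step well defined.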

sceleto.network.build_gene_network(feature_matrix: DataFrame, k: int = 5, metric: str = 'euclidean') -> Graph

Build a k-NN gene network from a feature matrix.

Parameters

feature_matrix

Gene × conditions matrix (output of build_feature_matrix()).

k

Number of nearest neighbours per gene.

metric

Distance metric passed to scipy.spatial.distance.pdist.

Returns

networkx.Graph

Nodes = gene names; edge attributes: dist, weight.
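The k-NN construction can be illustrated without networkx. The sketch below is a simplification under stated assumptions: it uses only the Euclidean metric, returns a plain dict of undirected edges instead of a networkx.Graph, and computes only the dist attribute, because how weight is derived from dist is not documented here. The function name knn_edges is hypothetical.

```python
import numpy as np

def knn_edges(features, names, k=5):
    """Return {(gene_a, gene_b): dist} linking each gene to its k nearest neighbours."""
    X = np.asarray(features, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # pairwise Euclidean distances
    np.fill_diagonal(dist, np.inf)            # a gene is never its own neighbour
    k = min(k, len(names) - 1)
    edges = {}
    for i, row in enumerate(dist):
        for j in np.argsort(row)[:k]:
            # Sort the pair so A-B and B-A collapse into one undirected edge.
            key = tuple(sorted((names[i], names[int(j)])))
            edges[key] = float(row[j])
    return edges

# Four genes with a single feature each, placed at 0, 1, 3 and 10.
edges = knn_edges([[0.0], [1.0], [3.0], [10.0]], ["A", "B", "C", "D"], k=1)
```

Because neighbour relations are not symmetric, a gene can end up with more than k edges in the undirected graph.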

sceleto.network.plot_network(G: Graph, feature_matrix: DataFrame | None = None, condition: str | None = None, pos: dict | None = None, seed: int = 3, figsize: tuple[int, int] = (15, 15), node_size_range: tuple[int, int] = (50, 600), cmap: str = 'coolwarm', ax: Axes | None = None) -> Figure

Draw a gene network with optional per-condition node coloring.

Parameters

G

networkx Graph from build_gene_network().

feature_matrix

Gene × conditions matrix. Required when condition is set.

condition

Column in feature_matrix to use for node color/size.

pos

Pre-computed layout positions. If None, spring layout is computed.

seed

Random seed for spring layout.

figsize

Figure size (width, height).

node_size_range

(min_size, max_size) of the nodes when coloring by condition.

cmap

Colormap name for condition coloring.

ax

Existing Axes to draw on.

Returns

matplotlib Figure

sceleto.network.plot_clustermap(feature_matrix: DataFrame, figsize: tuple[int, int] = (15, 35), cmap: str = 'coolwarm', max_genes: int = 96) -> ClusterGrid

Hierarchically clustered heatmap of the feature matrix.

Parameters

feature_matrix

Gene × conditions matrix.

figsize

Figure size (width, height).

cmap

Colormap name.

max_genes

If the matrix has more genes than this, only the top max_genes by mean |corr| are kept.

Returns

seaborn ClusterGrid
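The max_genes rule (keep the genes with the largest mean absolute correlation) can be sketched as follows. This is an illustration assuming the mean is taken across all condition columns; the helper name keep_top_by_mean_abs is hypothetical and the plotting itself is omitted.

```python
import pandas as pd

def keep_top_by_mean_abs(feature_matrix, max_genes=96):
    """Keep at most max_genes rows, ranked by mean |corr| across conditions."""
    if len(feature_matrix) <= max_genes:
        return feature_matrix
    score = feature_matrix.abs().mean(axis=1)
    return feature_matrix.loc[score.nlargest(max_genes).index]

fm = pd.DataFrame(
    {"x": [0.9, -0.8, 0.1], "y": [0.7, -0.6, 0.0]},
    index=["A", "B", "C"],
)
trimmed = keep_top_by_mean_abs(fm, max_genes=2)
```

Using the absolute value means that strongly anti-correlated genes, such as B in the toy data, are retained alongside strongly correlated ones.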

sceleto.network.corr_pangea(gene: str, data_dir: str, cell_types: list[str] | None = None, top_n: int = 10, k: int = 5) -> tuple[DataFrame, DataFrame, Graph]

Build a gene network in one call from the PANGEA pre-computed correlation database.

Parameters

gene

Gene of interest (e.g. "CD55").

data_dir

Directory containing pangea_corr_{CT}_v03.csv.gz files.

cell_types

Subset of cell types. If None, all 6 are used.

top_n

Number of top correlated genes per cell type.

k

Number of nearest neighbours for the kNN gene network.

Returns

corr_df : pd.DataFrame

Wide table (gene + per-cell-type corr/pval).

feature_matrix : pd.DataFrame

Gene × conditions correlation matrix.

G : networkx.Graph

kNN gene network.

Correlation Database

sceleto.network.list_cell_types(data_dir: str | Path | None = None, name: str = 'pangea', version: str = 'v03') -> list[str]

Return available cell types in a corr database.

Parameters

data_dir

Directory containing the DB. If None, returns the PANGEA defaults without touching disk.

name, version

DB identifiers (only used when data_dir is given).

Returns

list[str]

Cell type keys (read from {name}_n_obs_{version}.json).
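Reading the cell type keys from the {name}_n_obs_{version}.json file can be sketched with the standard library. The example assumes the JSON is a flat object mapping each cell type to its number of observations; that layout, the helper name cell_types_from_json, and the cell type names in the toy file are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

def cell_types_from_json(data_dir, name="pangea", version="v03"):
    """Return the top-level keys of {name}_n_obs_{version}.json in file order."""
    path = Path(data_dir) / f"{name}_n_obs_{version}.json"
    with path.open() as fh:
        return list(json.load(fh))

# Write a toy database file with hypothetical cell types, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    toy = {"T_cell": 1200, "B_cell": 800}
    (Path(tmp) / "pangea_n_obs_v03.json").write_text(json.dumps(toy))
    found = cell_types_from_json(tmp)
```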

sceleto.network.load_corr_db(gene: str, data_dir: str | Path, cell_types: list[str] | None = None, name: str = 'pangea', version: str = 'v03') -> DataFrame

Load pre-computed correlations for a gene of interest.

Uses memory-mapped npy files for fast random row access.

Parameters

gene

Gene name (must exist in the corr database).

data_dir

Directory containing {name}_corr_{CT}_{version}.npy, {name}_gene_names_{version}.npy, and {name}_n_obs_{version}.json.

cell_types

Subset of cell types to load. None = all (auto-discovered from {name}_n_obs_{version}.json).

name, version

DB identifiers. Defaults are PANGEA ("pangea" / "v03").

Returns

pd.DataFrame

Wide table: gene + {CT}_corr + {CT}_pval per cell type. Compatible with select_top_genes().
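The memory-mapped row access mentioned above can be illustrated with numpy.load and mmap_mode="r". The sketch assumes each {name}_corr_{CT}_{version}.npy file holds a square gene × gene matrix whose row order matches the gene names file; that layout, the helper name load_gene_row, and the toy values are assumptions, and p-values are not computed.

```python
import tempfile
from pathlib import Path

import numpy as np

def load_gene_row(data_dir, gene, cell_type, name="pangea", version="v03"):
    """Read one gene's correlation row without loading the whole matrix."""
    data_dir = Path(data_dir)
    genes = np.load(data_dir / f"{name}_gene_names_{version}.npy")
    idx = int(np.flatnonzero(genes == gene)[0])  # IndexError if gene is absent
    corr = np.load(data_dir / f"{name}_corr_{cell_type}_{version}.npy", mmap_mode="r")
    return genes, np.array(corr[idx])  # copy only the requested row into memory

# Build a toy database on disk, then fetch the row for gene "B".
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    np.save(tmp / "pangea_gene_names_v03.npy", np.array(["A", "B", "C"]))
    mat = np.array([[1.0, 0.5, -0.2], [0.5, 1.0, 0.3], [-0.2, 0.3, 1.0]])
    np.save(tmp / "pangea_corr_T_cell_v03.npy", mat)
    genes, row = load_gene_row(tmp, "B", "T_cell")
```

With mmap_mode="r" only the pages backing the requested row are read from disk, so lookup cost stays roughly constant as the matrix grows.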

Legacy

sceleto.network.network(adata, n_neighbor=10)
sceleto.network.get_grid(bdata, scale=1, border=2, expand=3, select_per_grid=5, min_count=2, n_neighbor=10)
sceleto.network.impute_neighbor(bdata, n_neighbor=10)
sceleto.network.new_exp_matrix(bdata, idata, select, n_min_exp_cell=10, min_mean=0, min_disp=0.1, ratio_expressed=0.1, example_gene='CDK1', show_filter=None, max_cutoff=0.2, tflist=None, calc_var=False)
sceleto.network.generate_gene_network(tfdata)
sceleto.network.impute_anno(bdata, select, anno_key, n_neighbor=10)
sceleto.network.draw_graph(tfdata, anno_key, anno_uniq, anno_ratio, factor0=100.0, adjust=False, text_fontsize=8, z_score_cut=0.4, color_dict=None, axis='X_draw_graph_kk')