sceleto.network¶

Correlation-based Gene Network¶

sceleto.network.compute_corr(adata: AnnData, gene: str, label: str | None = None, layer: str | None = None, chunk_size: int = 4096) → DataFrame[source]¶

Pearson correlation of gene against all genes in adata.

Parameters¶

adata: Input AnnData.
gene: Gene of interest; must be in adata.var_names.
label: Column prefix for output. If None, falls back to adata.uns["label"], then to "sample".
layer: Layer to use instead of adata.X.
chunk_size: Genes processed per chunk (memory control).

Returns¶

pd.DataFrame: Columns: gene, {label}_corr, {label}_pval.
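
The chunk_size parameter suggests that genes are processed in blocks so that only one block of centered values is held in memory at a time. The following is a minimal NumPy sketch of that idea, not sceleto's implementation. It assumes a dense cells × genes array; the name pearson_against_all is hypothetical, and p-values are omitted.

```python
import numpy as np

def pearson_against_all(X, gene_idx, chunk_size=4096):
    """Pearson r between column `gene_idx` and every column of X (cells x genes).

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=np.float64)
    n_genes = X.shape[1]

    # Center the gene of interest once; it is reused for every chunk.
    target = X[:, gene_idx]
    t_centered = target - target.mean()
    t_norm = np.sqrt((t_centered ** 2).sum())

    r = np.full(n_genes, np.nan)
    for start in range(0, n_genes, chunk_size):
        block = X[:, start:start + chunk_size]
        b_centered = block - block.mean(axis=0)
        b_norm = np.sqrt((b_centered ** 2).sum(axis=0))

        num = t_centered @ b_centered      # one covariance term per gene in the chunk
        denom = t_norm * b_norm
        out = np.full(block.shape[1], np.nan)
        ok = denom > 0                     # constant genes keep NaN
        out[ok] = num[ok] / denom[ok]
        r[start:start + block.shape[1]] = out
    return r
```

A smaller chunk_size lowers peak memory at the cost of more loop iterations; the result does not depend on it.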

sceleto.network.build_corr_matrix(adatas: dict[str, AnnData], gene: str, layer: str | None = None, chunk_size: int = 4096) → DataFrame[source]¶

Compute per-condition correlation for gene across multiple AnnData objects.

Parameters¶

adatas: {label: AnnData} mapping. The key is used as the column prefix.
gene: Gene of interest.
layer: Layer to use instead of adata.X.
chunk_size: Passed to compute_corr().

Returns¶

pd.DataFrame: Wide table: gene + {label}_corr + {label}_pval per condition.
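
To illustrate the shape of the wide table, the sketch below joins two hand-made per-condition tables on the gene column. The outer join, the helper name merge_condition_tables, and the toy values are assumptions; the source does not state how genes missing from one condition are handled.

```python
from functools import reduce

import pandas as pd

def merge_condition_tables(tables):
    """Outer-join per-condition tables on the shared 'gene' column (assumed behaviour)."""
    return reduce(lambda left, right: left.merge(right, on="gene", how="outer"), tables)

# Toy per-condition tables shaped like the documented compute_corr() output.
t_ctrl = pd.DataFrame({"gene": ["A", "B"], "ctrl_corr": [0.9, 0.1], "ctrl_pval": [0.001, 0.6]})
t_stim = pd.DataFrame({"gene": ["B", "C"], "stim_corr": [0.4, -0.2], "stim_pval": [0.03, 0.3]})

wide = merge_condition_tables([t_ctrl, t_stim])
```

With an outer join, a gene absent from one condition gets NaN in that condition's columns.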

sceleto.network.select_top_genes(corr_df: DataFrame, top_n: int = 10, conditions: list[str] | None = None, exclude_gene: str | None = None) → DataFrame[source]¶

Select the top_n most positively correlated genes per condition.

Parameters¶

corr_df: Wide table from build_corr_matrix() or load_corr_db().
top_n: Number of top genes to keep per condition.
conditions: Subset of condition labels (column prefix, i.e. without _corr). If None, all *_corr columns are used.
exclude_gene: Gene name to exclude (typically the GOI itself). Removes rows where gene == exclude_gene or corr >= 1.0.

Returns¶

pd.DataFrame: Long-form: condition, gene, corr, pval.
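
The selection described above can be approximated in pandas as follows. This is a sketch under assumptions, not the library's code: the helper name top_genes_long is hypothetical, and dropping rows with a missing correlation before ranking is a choice made here.

```python
import pandas as pd

def top_genes_long(corr_df, top_n=10, conditions=None, exclude_gene=None):
    """Long-form top-N table from a wide {label}_corr / {label}_pval table (illustrative)."""
    if conditions is None:
        # Derive condition labels from every *_corr column.
        conditions = [c[: -len("_corr")] for c in corr_df.columns if c.endswith("_corr")]

    frames = []
    for cond in conditions:
        sub = corr_df[["gene", f"{cond}_corr", f"{cond}_pval"]].copy()
        sub.columns = ["gene", "corr", "pval"]
        sub = sub.dropna(subset=["corr"])          # assumption: ignore missing values
        if exclude_gene is not None:
            # Drop the gene of interest and any perfect self-correlation.
            sub = sub[(sub["gene"] != exclude_gene) & (sub["corr"] < 1.0)]
        sub = sub.nlargest(top_n, "corr")
        sub.insert(0, "condition", cond)
        frames.append(sub)
    return pd.concat(frames, ignore_index=True)
```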

sceleto.network.build_feature_matrix(top_genes_df: DataFrame, corr_df: DataFrame) → DataFrame[source]¶

Build a gene × conditions correlation matrix for network construction.

Parameters¶

top_genes_df: Long-form output of select_top_genes().
corr_df: Wide table from build_corr_matrix().

Returns¶

pd.DataFrame: Index = unique genes, columns = condition labels, values = corr (NaN filled with 0.0).
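
A pandas sketch of how such a matrix could be assembled is shown below. It assumes that each selected gene receives its correlation from every condition in the wide table, not only from the condition in which it ranked; the helper name feature_matrix_from is hypothetical.

```python
import pandas as pd

def feature_matrix_from(top_genes_df, corr_df):
    """Gene x condition matrix for the selected genes (illustrative, assumed behaviour)."""
    # Unique genes, in order of first appearance in the long-form table.
    genes = top_genes_df["gene"].drop_duplicates().tolist()
    corr_cols = [c for c in corr_df.columns if c.endswith("_corr")]

    mat = corr_df.set_index("gene")[corr_cols].reindex(genes)
    mat.columns = [c[: -len("_corr")] for c in corr_cols]   # strip the suffix
    return mat.fillna(0.0)                                   # NaN filled with 0.0
```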

sceleto.network.build_gene_network(feature_matrix: DataFrame, k: int = 5, metric: str = 'euclidean') → Graph[source]¶

Build a k-NN gene network from a feature matrix.

Parameters¶

feature_matrix: Gene × conditions matrix (output of build_feature_matrix()).
k: Number of nearest neighbours per gene.
metric: Distance metric passed to scipy.spatial.distance.pdist.

Returns¶

networkx.Graph: Nodes = gene names; edge attributes: dist, weight.
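
The k-NN construction can be sketched with NumPy alone. This illustration computes Euclidean distances directly instead of calling scipy.spatial.distance.pdist and returns a plain dict of undirected edges instead of a networkx.Graph; the helper name knn_edges is hypothetical, and the weight attribute is not reproduced because the source does not define it.

```python
import numpy as np

def knn_edges(features, names, k=5):
    """Return {(gene_a, gene_b): dist} linking each gene to its k nearest neighbours."""
    features = np.asarray(features, dtype=float)

    # Pairwise Euclidean distances between rows (genes).
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)          # a gene is not its own neighbour

    k = min(k, len(names) - 1)
    edges = {}
    for i, row in enumerate(dist):
        for j in np.argsort(row)[:k]:
            # Sort the pair so that (a, b) and (b, a) map to one undirected edge.
            a, b = sorted((names[i], names[int(j)]))
            edges[(a, b)] = float(row[j])
    return edges
```

Because edges are undirected, a gene can end up with more than k neighbours when other genes select it.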

sceleto.network.plot_network(G: Graph, feature_matrix: DataFrame | None = None, condition: str | None = None, pos: dict | None = None, seed: int = 3, figsize: tuple[int, int] = (15, 15), node_size_range: tuple[int, int] = (50, 600), cmap: str = 'coolwarm', ax: Axes | None = None) → Figure[source]¶

Draw a gene network with optional per-condition node coloring.

Parameters¶

G: networkx Graph from build_gene_network().
feature_matrix: Gene × conditions matrix. Required when condition is set.
condition: Column in feature_matrix to use for node color/size.
pos: Pre-computed layout positions. If None, spring layout is computed.
seed: Random seed for spring layout.

figsize: Figure size.
node_size_range: (min_size, max_size) when coloring by condition.

cmap: Colormap name for condition coloring.
ax: Existing Axes to draw on.

Returns¶

matplotlib Figure

sceleto.network.plot_clustermap(feature_matrix: DataFrame, figsize: tuple[int, int] = (15, 35), cmap: str = 'coolwarm', max_genes: int = 96) → ClusterGrid[source]¶

Hierarchically clustered heatmap of the feature matrix.

Parameters¶

feature_matrix: Gene × conditions matrix.

figsize: Figure size.
cmap: Colormap name.
max_genes: If the matrix has more genes than this, keep the top max_genes by mean |corr|.

Returns¶

seaborn ClusterGrid
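
The max_genes rule (keep the genes with the largest mean |corr|) can be written in a few lines of pandas. This is an illustrative sketch with a hypothetical helper name, keep_top_by_mean_abs, and not the function's internal code.

```python
import pandas as pd

def keep_top_by_mean_abs(feature_matrix, max_genes=96):
    """Keep at most `max_genes` rows, ranked by mean absolute correlation."""
    if len(feature_matrix) <= max_genes:
        return feature_matrix
    score = feature_matrix.abs().mean(axis=1)   # mean |corr| per gene
    return feature_matrix.loc[score.nlargest(max_genes).index]
```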

sceleto.network.corr_pangea(gene: str, data_dir: str, cell_types: list[str] | None = None, top_n: int = 10, k: int = 5) → tuple[DataFrame, DataFrame, Graph][source]¶

One-shot gene network from PANGEA pre-computed correlation DB.

Parameters¶

gene: Gene of interest (e.g. "CD55").
data_dir: Directory containing pangea_corr_{CT}_v03.csv.gz files.
cell_types: Subset of cell types. None = all 6.
top_n: Number of top correlated genes per cell type.
k: Number of nearest neighbours for the kNN gene network.

Returns¶

corr_df (pd.DataFrame): Wide table (gene + per-cell-type corr/pval).
feature_matrix (pd.DataFrame): Gene × conditions correlation matrix.
G (networkx.Graph): kNN gene network.

Correlation Database¶

sceleto.network.list_cell_types(data_dir: str | Path | None = None, name: str = 'pangea', version: str = 'v03') → list[str][source]¶

Return available cell types in a corr database.

Parameters¶

data_dir: Directory containing the DB. If None, returns the PANGEA defaults without touching disk.
name, version: DB identifiers (only used when data_dir is given).

Returns¶

list[str]: Cell type keys (read from {name}_n_obs_{version}.json).

sceleto.network.load_corr_db(gene: str, data_dir: str | Path, cell_types: list[str] | None = None, name: str = 'pangea', version: str = 'v03') → DataFrame[source]¶

Load pre-computed correlations for a gene of interest.

Uses memory-mapped npy files for fast random row access.

Parameters¶

gene: Gene name (must exist in the corr database).
data_dir: Directory containing {name}_corr_{CT}_{version}.npy, {name}_gene_names_{version}.npy, and {name}_n_obs_{version}.json.
cell_types: Subset of cell types to load. None = all (auto-discovered from {name}_n_obs_{version}.json).
name, version: DB identifiers. Defaults are PANGEA ("pangea" / "v03").

Returns¶

pd.DataFrame: Wide table: gene + {CT}_corr + {CT}_pval per cell type. Compatible with select_top_genes().
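
Memory-mapped access means that only the requested row is read from disk. The sketch below shows the general technique with np.load(..., mmap_mode="r"). It assumes that each {name}_corr_{CT}_{version}.npy file holds a square genes × genes matrix and that the gene names are stored as a plain string array; neither detail is confirmed by the source, and the helper name load_row is hypothetical.

```python
from pathlib import Path

import numpy as np

def load_row(data_dir, gene, cell_type, name="pangea", version="v03"):
    """Read one gene's correlation row without loading the full matrix (illustrative)."""
    data_dir = Path(data_dir)
    genes = np.load(data_dir / f"{name}_gene_names_{version}.npy")

    hits = np.flatnonzero(genes == gene)
    if hits.size == 0:
        raise KeyError(f"{gene!r} not found in gene names")
    idx = int(hits[0])

    # mmap_mode="r" maps the file read-only; indexing pulls only the needed row.
    corr = np.load(data_dir / f"{name}_corr_{cell_type}_{version}.npy", mmap_mode="r")
    return genes, np.array(corr[idx])     # copy the row out of the memory map
```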

Legacy¶

sceleto.network.network(adata, n_neighbor=10)[source]¶

sceleto.network.get_grid(bdata, scale=1, border=2, expand=3, select_per_grid=5, min_count=2, n_neighbor=10)[source]¶

sceleto.network.impute_neighbor(bdata, n_neighbor=10)[source]¶

sceleto.network.new_exp_matrix(bdata, idata, select, n_min_exp_cell=10, min_mean=0, min_disp=0.1, ratio_expressed=0.1, example_gene='CDK1', show_filter=None, max_cutoff=0.2, tflist=None, calc_var=False)[source]¶

sceleto.network.generate_gene_network(tfdata)[source]¶

sceleto.network.impute_anno(bdata, select, anno_key, n_neighbor=10)[source]¶

sceleto.network.draw_graph(tfdata, anno_key, anno_uniq, anno_ratio, factor0=100.0, adjust=False, text_fontsize=8, z_score_cut=0.4, color_dict=None, axis='X_draw_graph_kk')[source]¶