Utilities

Dotplot

sceleto.dotplot(adata, var_names: Sequence[str] | Mapping[str, Sequence], groupby: str, *, groups: Sequence[str] | None = None, swap_axes: bool = False, use_raw: bool = True, dendrogram: bool = False, cmap: str = 'OrRd', figsize: Tuple[float, float] | None = None, save: str | None = None, show: bool = True, **kwargs)

Dotplot with per-gene max-normalized color, built on scanpy.pl.dotplot.

Size encodes fraction of cells expressing the gene (scanpy default). Color encodes group_mean(gene) / max_group(group_mean(gene)) per gene, so vmax=1 always corresponds to the highest-expressing group.
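The color scaling described above can be illustrated with a small NumPy sketch. It is not sceleto's implementation: the helper name max_normalize, the (groups x genes) layout, and the handling of genes with zero expression in every group are assumptions made for illustration.

```python
import numpy as np


def max_normalize(group_means: np.ndarray) -> np.ndarray:
    """Divide each gene (column) by its largest group mean.

    group_means has shape (n_groups, n_genes); hypothetical helper.
    """
    col_max = group_means.max(axis=0, keepdims=True)
    # Assumption: a gene that is zero in every group stays at 0
    # instead of triggering a division by zero.
    safe_max = np.where(col_max > 0, col_max, 1.0)
    return group_means / safe_max


# Rows are groups, columns are genes (toy values).
means = np.array([
    [2.0, 0.5, 0.0],
    [1.0, 1.0, 0.0],
    [0.5, 0.25, 0.0],
])
scaled = max_normalize(means)
print(scaled)
```

After scaling, the highest-expressing group of every expressed gene sits at exactly 1, which is why a fixed vmax=1 is meaningful across genes with very different absolute expression.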

Follows scanpy’s default axis orientation: genes on x-axis, groups on y-axis. Pass swap_axes=True to put genes on y-axis, groups on x-axis.

Parameters

adata

AnnData with log1p-normalized expression.

var_names

Gene list or {bracket_name: [gene, ...]} / {bracket_name: [(gene, score), ...]} mapping. Mappings render as bracket-grouped labels via scanpy.

groupby

Column in adata.obs to group cells by.

groups

Subset of groups to display. None shows all.

swap_axes

If True, genes on y-axis, groups on x-axis (swaps scanpy default).

use_raw

If True (default), read from adata.raw.X. If False, read from adata.X. Both sources are checked for log1p normalization.

cmap

Matplotlib colormap for color scale (default OrRd).

figsize

Manual (width, height) in inches.

save

Path to save figure (PDF, dpi=300).

show

Whether to call plt.show().

**kwargs

Forwarded to scanpy.pl.dotplot.
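The var_names parameter above accepts three shapes: a flat gene list, a mapping to gene lists, and a mapping to (gene, score) tuples. The sketch below shows one way such inputs could be reduced to plain gene names before plotting. The helper gene_names_only is hypothetical and is not part of sceleto; how the library treats the scores internally is not stated in this reference.

```python
from collections.abc import Mapping


def gene_names_only(var_names):
    """Reduce any accepted var_names shape to plain gene names (hypothetical helper)."""
    if isinstance(var_names, Mapping):
        # Keep the bracket names; drop the score from (gene, score) tuples.
        return {
            bracket: [g[0] if isinstance(g, tuple) else g for g in genes]
            for bracket, genes in var_names.items()
        }
    return list(var_names)


markers = {
    "T cell": [("CD3E", 2.1), ("CD2", 1.7)],
    "B cell": ["MS4A1", "CD79A"],
}
flat = gene_names_only(markers)
print(flat)
```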

Annotator

class sceleto.Annotator(adata, label_key: str, copy_from: str | None = None)

Bases: object

Build cell-type annotations incrementally on an AnnData object.

Parameters

adata

AnnData object.

label_key

Name of the new column in adata.obs.

copy_from

If given, initialize from an existing adata.obs column.

annotate(obs_key: str, select: str, label: str, unknown_only: bool = False)

Assign label to cells whose obs_key value equals select.

One call = one decision. select is a single string matched exactly against adata.obs[obs_key]; no list or comma splitting. To label multiple groups with the same label, call repeatedly (e.g. in a dict loop).

Parameters

obs_key

Column in adata.obs to match against (e.g. ‘leiden’, ‘leiden_R’).

select

Exact value to match in adata.obs[obs_key].

label

The annotation label to assign.

unknown_only

If True, only update cells still labeled ‘unknown’.

annotate_mask(mask, label: str)

Assign label to cells matching a boolean mask directly.

summary()

Print value counts of current annotations.
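The semantics documented for Annotator can be sketched with a plain-Python stand-in. MiniAnnotator below is a toy class written for illustration, not sceleto's code: it stores columns in a dict of lists instead of adata.obs, and it assumes that unlabeled cells start as 'unknown' (inferred from the unknown_only description).

```python
from collections import Counter


class MiniAnnotator:
    """Toy stand-in for the documented behavior (hypothetical, not sceleto's class)."""

    def __init__(self, obs, label_key, copy_from=None):
        self.obs = obs
        self.label_key = label_key
        n_cells = len(next(iter(obs.values())))
        # Assumption: labels start as 'unknown' unless copied from a column.
        obs[label_key] = list(obs[copy_from]) if copy_from else ["unknown"] * n_cells

    def annotate(self, obs_key, select, label, unknown_only=False):
        labels = self.obs[self.label_key]
        for i, value in enumerate(self.obs[obs_key]):
            if value != select:  # exact match only, no list or comma splitting
                continue
            if unknown_only and labels[i] != "unknown":
                continue
            labels[i] = label

    def annotate_mask(self, mask, label):
        labels = self.obs[self.label_key]
        for i, hit in enumerate(mask):
            if hit:
                labels[i] = label

    def summary(self):
        return Counter(self.obs[self.label_key])


obs = {"leiden": ["0", "1", "2", "1", "10"]}
ann = MiniAnnotator(obs, "celltype")

# One call per decision: loop over a dict to give several clusters a label.
for cluster, label in {"0": "T cell", "1": "B cell", "2": "T cell"}.items():
    ann.annotate("leiden", cluster, label)

ann.annotate_mask([False, False, False, False, True], "Myeloid")
# Cluster "1" is already labeled, so unknown_only=True leaves it untouched.
ann.annotate("leiden", "1", "Plasma", unknown_only=True)
counts = ann.summary()
print(counts)
```

Note that cluster "10" is not touched by the call for cluster "1": matching is on the whole value, not on a prefix.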

UMAP

sceleto.us(adata, gene, groups=None, show=False, exclude=None, figsize=None, **kwargs)
  • 03/10/2022

Create a UMAP using a list of genes.

Parameters

adata : AnnData, required

AnnData object.

gene : list or str, required

List of genes to use for UMAP. A comma-separated string can be used instead of a list.

groups : str, optional

Restrict to a few categories in categorical observation annotation.

show : bool, optional

Show the plot. Default False.

exclude : list, optional

List of genes to exclude.

figsize : float, optional

Figure size.
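Because gene may be either a list or a comma-separated string, and exclude removes genes from it, the input handling could look roughly like the sketch below. parse_genes is a hypothetical helper, not a sceleto function; stripping whitespace around each name is an assumption.

```python
def parse_genes(gene, exclude=None):
    """Normalize a list or comma-separated string of genes (hypothetical helper)."""
    if isinstance(gene, str):
        # Assumption: whitespace around each comma-separated name is ignored.
        gene = [name.strip() for name in gene.split(",")]
    excluded = set(exclude or [])
    # Drop empty entries (e.g. from a trailing comma) and excluded genes.
    return [name for name in gene if name and name not in excluded]


genes = parse_genes("CD3E, CD4,CD8A", exclude=["CD4"])
print(genes)
```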

Preprocessing

sceleto.sc_process(adata, steps: str = 'fspkuc', n_pcs: int = 50)

Scanpy preprocessing pipeline controlled by a step string.

Each letter in steps triggers one preprocessing step, executed in order:

n

normalize_total (1e4)

l

log1p + store .raw

f

highly_variable_genes + filter

r

remove cell-cycle genes

s

scale (max_value=10)

p

PCA

k

kNN neighbors

u

UMAP

c

leiden clustering

Parameters

adata : AnnData

AnnData object.

steps : str

Letters selecting which steps to run. Default "fspkuc".

n_pcs : int

Number of PCs for neighbor search. Default 50.
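The step string can be read as a small dispatch table. The sketch below only resolves letters to the step descriptions listed above and does not call scanpy. It assumes steps run in the order the letters appear in the string and that an unrecognized letter raises an error; neither detail is confirmed by this reference, and plan_steps is a hypothetical name.

```python
# Letter -> step description, copied from the table above.
STEP_NAMES = {
    "n": "normalize_total (1e4)",
    "l": "log1p + store .raw",
    "f": "highly_variable_genes + filter",
    "r": "remove cell-cycle genes",
    "s": "scale (max_value=10)",
    "p": "PCA",
    "k": "kNN neighbors",
    "u": "UMAP",
    "c": "leiden clustering",
}


def plan_steps(steps="fspkuc"):
    """Resolve a step string into an ordered list of step descriptions."""
    unknown = [letter for letter in steps if letter not in STEP_NAMES]
    if unknown:
        # Assumption: unknown letters are rejected rather than silently skipped.
        raise ValueError(f"unknown step letters: {unknown}")
    # Assumption: steps run in the order the letters are written.
    return [STEP_NAMES[letter] for letter in steps]


for description in plan_steps("nlfspkuc"):
    print(description)
```

Under this reading, the default "fspkuc" skips normalization and log1p, which fits input that is already log1p-normalized.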

sceleto.read_process(adata, version: str, *, species: str = 'human', sample: str | None = None, define_var: bool = True, call_doublet: bool = True, write: bool = True, min_n_counts: int = 1000, min_n_genes: int = 500, max_n_genes: int = 7000, max_pct_mito: float = 0.5)

QC filtering + optional doublet detection + write.

Parameters

adata : AnnData

Raw count matrix.

version : str

Version tag for the output filename.

species : str

"human" or "mouse" (determines mito gene prefix).

sample : str, optional

Sample name stored in adata.obs["Sample"].

define_var : bool

If True, copy gene names / Ensembl IDs into adata.var.

call_doublet : bool

If True, run scrublet for doublet detection (lazy import).

write : bool

If True, save filtered adata as h5ad.

min_n_counts, min_n_genes, max_n_genes : int

Cell-level count / gene number thresholds.

max_pct_mito : float

Maximum mitochondrial fraction (0–1).
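To make the thresholds concrete, here is a NumPy sketch of the kind of cell-level filter they describe. It is an illustration, not sceleto's code: the function names are hypothetical, the "MT-" / "mt-" prefixes are the conventional ones and are assumed here, and whether the library treats the bounds as inclusive is not stated, so inclusive bounds are an assumption as well.

```python
import numpy as np

# Assumption: conventional mitochondrial gene prefixes per species.
MITO_PREFIX = {"human": "MT-", "mouse": "mt-"}


def mito_fraction(counts, gene_names, species="human"):
    """Fraction (0-1) of each cell's counts that come from mitochondrial genes."""
    counts = np.asarray(counts, dtype=float)
    prefix = MITO_PREFIX[species]
    is_mito = np.array([name.startswith(prefix) for name in gene_names])
    totals = counts.sum(axis=1)
    mito = counts[:, is_mito].sum(axis=1)
    # Cells with zero total counts get a fraction of 0 instead of NaN.
    return np.divide(mito, totals, out=np.zeros_like(totals), where=totals > 0)


def qc_mask(n_counts, n_genes, pct_mito, *, min_n_counts=1000,
            min_n_genes=500, max_n_genes=7000, max_pct_mito=0.5):
    """Boolean mask of cells passing all thresholds (inclusive bounds assumed)."""
    n_counts = np.asarray(n_counts)
    n_genes = np.asarray(n_genes)
    pct_mito = np.asarray(pct_mito)
    return (
        (n_counts >= min_n_counts)
        & (n_genes >= min_n_genes)
        & (n_genes <= max_n_genes)
        & (pct_mito <= max_pct_mito)
    )


# Two cells x two genes (toy counts); the second gene is mitochondrial.
fractions = mito_fraction([[90, 10], [40, 60]], ["ACTB", "MT-CO1"])

# Four toy cells: too few counts, passing, too many genes, too much mito.
keep = qc_mask(
    n_counts=[500, 5000, 20000, 3000],
    n_genes=[300, 2000, 8000, 1500],
    pct_mito=[0.05, 0.10, 0.08, 0.70],
)
print(fractions, keep)
```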

sceleto.remove_geneset(adata, geneset)

Remove genes in geneset from adata and return a copy.
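The operation amounts to masking out columns by gene name and copying the result. The sketch below shows the idea on a plain NumPy matrix; drop_genes is a hypothetical helper, not the library function, and it stands in for the AnnData object with a (cells x genes) array plus a list of gene names.

```python
import numpy as np


def drop_genes(X, var_names, geneset):
    """Return copies of X and var_names without the genes in geneset (hypothetical)."""
    var_names = np.asarray(var_names)
    keep = ~np.isin(var_names, list(geneset))
    # Boolean indexing yields new arrays, so the inputs are left unchanged.
    return np.asarray(X)[:, keep].copy(), var_names[keep]


X = np.arange(6).reshape(2, 3)  # 2 cells x 3 genes
names = ["A", "B", "C"]
X_new, names_new = drop_genes(X, names, {"B"})
print(X_new, names_new)
```

On an AnnData object, a comparable effect would presumably come from boolean subsetting on adata.var_names followed by .copy(); whether sceleto does exactly this is not stated here.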