sceleto.markers

Entry Points

sceleto.markers.marker(adata, groupby: str, *, thres_fc: float | str = 'auto', specific_A: float = 1.0, specific_B: float = 0.5, specific_only_high_markers: bool = True, specific_score_col: str = 'specific_weight', specific_score_fn: Callable[[DataFrame], object] | None = None, use_raw: bool = True, k: int = 5, exclude: List[str] | None = None, min_cells_per_group: int = 0, min_expr_cells_per_gene: int = 0, eps: float = 0.001, min_mean_any: float = 0.05, min_mean_high: float = 0.5, min_frac_high: float = 0.2, max_mean_low: float = 0.2, min_nexpr_any: int = 0, fc_cutoff: float | None = None, label_k: float = 2.0, sigma_method: str = 'sd', min_gap: float = 0.2, min_margin: float = 0.0, level: int = 3, bidirectional: bool = True, node_size_scale: float = 10.0, batch_key: str | None = None, batch_min_cells: int = 5, batch_ttest_alpha: float = 0.05, batch_ttest_min_batches: int = 3)

Graph-based marker workflow (one-word entry point).

Wraps sceleto.markers.graph.run_marker_graph() with the same parameters. When batch_key is provided, a Welch’s t-test filter is applied automatically after fold-change (FC) filtering: genes that can be tested but do not pass the p-value threshold are dropped.
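
As a rough illustration of the statistic behind such a filter, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for two samples using only the standard library. The function name welch_t and the toy inputs are hypothetical, and how sceleto assembles its samples or derives the p-value is not shown here. In practice a p-value could be obtained with scipy.stats.ttest_ind(a, b, equal_var=False).

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard errors of the two sample means.
    va, vb = variance(a) / na, variance(b) / nb
    se2 = va + vb
    if se2 == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# Toy per-batch means for one gene in two clusters (made-up numbers).
high = [2.1, 1.9, 2.4, 2.0]
low = [0.3, 0.5, 0.2, 0.4]
t_stat, dof = welch_t(high, low)
```

A large positive t with few degrees of freedom, as in this toy case, indicates that the difference is consistent across batches rather than driven by a single batch.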

sceleto.markers.simple(adata, groupby: str, **kwargs) -> MarkersSimple

Simple (non-graph) marker workflow.

sceleto.markers.hierarchy(adata, marker_runs, **kwargs) -> HierarchyRun

Wrapper for the cross-resolution hierarchy workflow.

sceleto.markers.sweep_fc(adata, groupby: str, **kwargs)

Sweep FC thresholds to help determine thres_fc.

See sceleto.markers.graph.sweep_fc_threshold() for full docs.
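
To illustrate what a threshold sweep can look like, the sketch below counts how many fold-change values survive each candidate cutoff. It is a generic illustration and not the logic of sweep_fc_threshold(); the helper count_passing and the toy values are hypothetical.

```python
def count_passing(fc_values, thresholds):
    """Map each threshold to the number of fold-change values at or above it."""
    return {thr: sum(fc >= thr for fc in fc_values) for thr in thresholds}


# Made-up fold-change values for a handful of (edge, gene) pairs.
fc_values = [0.5, 1.2, 1.8, 2.5, 3.1, 4.0]
sweep = count_passing(fc_values, thresholds=[1.0, 2.0, 3.0, 4.0])
```

Inspecting how such counts change from one threshold to the next is one way a sweep can inform the choice of thres_fc.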

Results

class sceleto.markers.graph._onestep.MarkerGraphRun(ctx: Any, edge_gene_df: DataFrame, edge_fc: DataFrame, edge_delta: DataFrame, labels: Any, note_df: DataFrame, G: Any, pos: Any, gene_edge_fc: Dict[str, Dict[Tuple[object, object], float]], gene_to_edges: Dict[str, List[str]], viz: Any, specific_ranking_df: DataFrame, _marker_log: Dict[str, List[str]], batch_key: str | None = None, edge_metric: Literal['fc', 'delta'] = 'fc', sweep_df: DataFrame | None = None, suggested_thres_fc: float | None = None)

Bases: MarkersBase

Container for one-step marker-graph pipeline results.

Notes

  • Keeps intermediate artifacts for debugging and inspection.

ctx: Any
edge_gene_df: DataFrame
edge_fc: DataFrame
edge_delta: DataFrame
labels: Any
note_df: DataFrame
G: Any
pos: Any
gene_edge_fc: Dict[str, Dict[Tuple[object, object], float]]
gene_to_edges: Dict[str, List[str]]
viz: Any
specific_ranking_df: DataFrame
batch_key: str | None = None
edge_metric: Literal['fc', 'delta'] = 'fc'
sweep_df: DataFrame | None = None
suggested_thres_fc: float | None = None
plot_fc_threshold(**kwargs)

Plot threshold sweep results. Only available when the run used thres_fc="auto".

plot_gene_edges_fc(gene: str, **kwargs)
plot_gene_levels_with_edges(gene: str, level: int | None = None, **kwargs)
plot_highlight_edges(edges, **kwargs)
property adata
property groupby
property markers: Dict[str, List[str]]

Per-group marker gene lists, ranked by specificity score.

batch_mean_detail(adata, gene: str, group: str)

Return per-batch mean expression for a specific (gene, group).

Parameters

adata : AnnData

The same AnnData used in run_marker_graph().

gene : str

Marker gene name.

group : str

Cluster where the gene is highly expressed.

Returns

DataFrame with columns:

edge_start, edge_end, batch, mean_start, mean_end, n_cells_start, n_cells_end.

class sceleto.markers._simple.MarkersSimple(adata, groupby: str, *, single: bool = True, gap_thres: float = 0.2, min_mean: float = 0.2, min_frac: float = 0.2, min_count: int = 10, run_scanpy: bool = False, **kwargs)

Bases: MarkersBase

Simple (non-graph) marker workflow.

Detects markers by comparing mean expression and dropout ratio across clusters without requiring graph structure.

Parameters

adata

AnnData object (must have .raw populated).

groupby

Column in adata.obs to group cells by.

single

If True (default), use single-top mode; otherwise multi-top.

gap_thres, min_mean, min_frac, min_count

Thresholds for marker filtering.

run_scanpy

If True, also run scanpy’s rank_genes_groups for comparison.
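
The sketch below shows one plausible way thresholds such as gap_thres, min_mean, and min_frac could combine to pick a single top cluster for a gene. The rule itself, the helper single_top_marker, and the toy numbers are assumptions made for illustration; the library's actual criteria may differ.

```python
def single_top_marker(means, fracs, gap_thres=0.2, min_mean=0.2, min_frac=0.2):
    """Return the cluster a gene marks under the assumed rule, or None.

    means / fracs: dicts mapping cluster -> mean expression / fraction of
    expressing cells for one gene.
    """
    ranked = sorted(means, key=means.get, reverse=True)
    if len(ranked) < 2:
        return None
    top, runner_up = ranked[0], ranked[1]
    if means[top] < min_mean or fracs[top] < min_frac:
        return None  # too weakly or too rarely expressed in the best cluster
    if means[top] - means[runner_up] < gap_thres:
        return None  # not separated enough from the next-highest cluster
    return top


# Toy statistics for two genes across three clusters (made-up numbers).
gene_a = single_top_marker(
    {"T": 1.5, "B": 0.2, "NK": 0.4}, {"T": 0.8, "B": 0.1, "NK": 0.3}
)
gene_b = single_top_marker(
    {"T": 0.9, "B": 0.85, "NK": 0.1}, {"T": 0.6, "B": 0.6, "NK": 0.1}
)
```

Here gene_a is assigned to cluster "T", while gene_b is rejected because its two highest clusters are too close in mean expression.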

property markers: Dict[str, List[str]]

Per-group marker gene lists (gene names only, ranked by score).

property markers_scored: Dict[str, List[Tuple[str, float]]]

Per-group marker gene lists with scores.

property stats: Dict[str, Dict[str, ndarray]]

Access computed cluster statistics.

show_marker(celltype=None, n: int = 40, result: bool = False)
find_markers_groups(groups, **kwargs)

Find markers distinguishing specific groups from the rest.

find_markers_negative(**kwargs)

Find negative markers (lowest expression in one cluster).

class sceleto.markers._hierarchy.HierarchyRun(levels: List[str], params: Dict[str, Any], icls_full_dict: Dict[str, str], icls_path_df: DataFrame, marker_rank_df: DataFrame, full_gene_lists: Dict[str, List[str]], contexts: Dict[str, Any], batch_expression: Dict[str, BatchExpression] | None, batch_key: str | None)

Bases: object

levels: List[str]
params: Dict[str, Any]
icls_full_dict: Dict[str, str]
icls_path_df: DataFrame
marker_rank_df: DataFrame
full_gene_lists: Dict[str, List[str]]
contexts: Dict[str, Any]
batch_expression: Dict[str, BatchExpression] | None
batch_key: str | None
interactive_viewer(adata, mgr, *, save: str = 'interactive_viewer.html', n_top: int | None = None) -> None

Generate an interactive HTML viewer with edge-activation panel.

Layout: icls UMAP (left) + marker comparison heatmap (top-right) + per-gene edge-activation graph (bottom-right). In batch mode the heatmap shows per-batch expression strips instead of presence.

Parameters

adata

AnnData with obs['icls'] (set by hierarchy) and obsm['X_umap'].

mgr

A sceleto.markers.graph.MarkerGraphRun that drives the bottom-right edge-activation graph. Typically obtained with:

    mgr = scl.markers.marker(adata, "icls")

save

Output HTML file path.

n_top

Number of top markers per cluster. Defaults to the value used in the hierarchy run.

compare_markers(icls: str, *, figsize=None, gene_filter: GeneFilter | None = None, return_genes: bool = False)

Visualize top-N marker overlap across levels for a given icls.

compare_markers_batch(icls: str, *, figsize=None, gene_filter: GeneFilter | None = None, return_genes: bool = False)

Visualize top-N marker overlap with per-batch expression strips.

Each strip in a (level, gene) cell encodes one batch:

  • grey : batch has no cells in this cluster (no data)

  • white : batch has cells but mean expression is 0

  • red : colored by mean / cell_max, where cell_max is the maximum batch mean within that (cluster, gene) cell

The color scale is normalized within each (cluster, gene) heatmap cell to the range 0–1, so the brightest batch in a cell is always 1. This makes batch consistency visible regardless of absolute expression level.
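
The strip encoding described above can be sketched as follows. The helper strip_values is hypothetical and assumes non-negative batch means, with None standing in for a batch that has no cells in the cluster; it only mirrors the documented color rules and is not the plotting code itself.

```python
def strip_values(batch_means):
    """Encode per-batch means within one (cluster, gene) heatmap cell.

    batch_means: dict mapping batch -> mean expression (assumed >= 0), or
    None when the batch has no cells in the cluster.
    Returns a dict mapping batch -> "grey", "white", or a float in (0, 1].
    """
    observed = [m for m in batch_means.values() if m is not None]
    cell_max = max(observed, default=0.0)
    out = {}
    for batch, m in batch_means.items():
        if m is None:
            out[batch] = "grey"        # no cells from this batch
        elif m == 0:
            out[batch] = "white"       # cells present, mean expression is 0
        else:
            out[batch] = m / cell_max  # red intensity; brightest batch is 1.0
    return out


strips = strip_values({"b1": 2.0, "b2": 0.5, "b3": 0.0, "b4": None})
```

Because each value is divided by the maximum within its own cell, a weakly expressed gene with even batch support looks as uniform as a strongly expressed one.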

Gene Filter

class sceleto.markers.GeneFilter(exclude: Sequence[str] | None = None, include: Sequence[str] | None = None, name_exclude: Sequence[str] | None = None, dot_filter: bool = False)

Bases: object

Composable gene-name filter.

Parameters

exclude

Category names from gene_categories.json. A gene is dropped if it belongs to any of the listed categories.

include

Category names from gene_categories.json. When provided, only genes present in the union of these categories are kept (after exclude filtering).

name_exclude

List of name patterns to exclude. Multi-character entries use prefix matching (startswith). The special entry "." excludes any gene whose name contains a dot (e.g. AL035401.1).

dot_filter

Deprecated. Use name_exclude=["."] instead.

filter(genes: Sequence[str]) -> List[str]

Return the sublist of genes that pass the filter.
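
The parameter semantics described for GeneFilter can be sketched in plain Python as follows. The function apply_gene_filter and the categories dict (a stand-in for the contents of gene_categories.json) are hypothetical, and treating every name_exclude entry other than "." as a prefix is an assumption; this is not the class's implementation.

```python
def apply_gene_filter(genes, categories, exclude=(), include=(), name_exclude=()):
    """Filter gene names following the documented parameter semantics.

    categories: dict mapping category name -> set of gene names.
    """
    excluded = set().union(*(categories[c] for c in exclude))
    included = set().union(*(categories[c] for c in include)) if include else None
    drop_dotted = "." in name_exclude
    prefixes = tuple(p for p in name_exclude if p != ".")

    kept = []
    for gene in genes:
        if gene in excluded:
            continue  # belongs to an excluded category
        if included is not None and gene not in included:
            continue  # not in the union of included categories
        if drop_dotted and "." in gene:
            continue  # special "." entry: drop names containing a dot
        if prefixes and gene.startswith(prefixes):
            continue  # prefix match against name_exclude entries
        kept.append(gene)
    return kept


# Hypothetical categories and gene list for illustration only.
categories = {"mito": {"MT-CO1", "MT-ND1"}, "tf": {"GATA1", "SPI1"}}
genes = ["MT-CO1", "GATA1", "RPL3", "AL035401.1", "CD3E", "SPI1"]
result = apply_gene_filter(genes, categories, exclude=["mito"], name_exclude=["RPL", "."])
only_tf = apply_gene_filter(genes, categories, include=["tf"])
```

The input order of genes is preserved, so a ranked marker list stays ranked after filtering.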