sceleto.markers¶

Entry Points¶

sceleto.markers.marker(adata, groupby: str, *, thres_fc: float | str = 'auto', specific_A: float = 1.0, specific_B: float = 0.5, specific_only_high_markers: bool = True, specific_score_col: str = 'specific_weight', specific_score_fn: Callable[[DataFrame], object] | None = None, use_raw: bool = True, k: int = 5, exclude: List[str] | None = None, min_cells_per_group: int = 0, min_expr_cells_per_gene: int = 0, eps: float = 0.001, min_mean_any: float = 0.05, min_mean_high: float = 0.5, min_frac_high: float = 0.2, max_mean_low: float = 0.2, min_nexpr_any: int = 0, fc_cutoff: float | None = None, label_k: float = 2.0, sigma_method: str = 'sd', min_gap: float = 0.2, min_margin: float = 0.0, level: int = 3, bidirectional: bool = True, node_size_scale: float = 10.0, batch_key: str | None = None, batch_min_cells: int = 5, batch_ttest_alpha: float = 0.05, batch_ttest_min_batches: int = 3)[source]¶

Graph-based marker workflow (one-word entry point).

Wraps sceleto.markers.graph.run_marker_graph() with the same parameters. When batch_key is provided, a Welch’s t-test filter is applied automatically after FC filtering: genes that can be tested but fail the p-value threshold are dropped.
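The filter rule above can be read as "drop a gene only when a test was possible and it did not pass". The following is a minimal sketch of that reading, not sceleto's implementation. It assumes p-values have already been computed, that an untestable gene is represented by None, and that "passing" means p < alpha (with alpha playing the role of batch_ttest_alpha). The function name and the example gene names are hypothetical.

```python
from typing import Dict, List, Optional


def filter_by_batch_pvalue(
    pvalues: Dict[str, Optional[float]],
    alpha: float = 0.05,
) -> List[str]:
    """Keep untestable genes (p-value is None) and genes with p < alpha.

    Assumption: a gene "fails" when it was testable and p >= alpha.
    """
    kept = []
    for gene, p in pvalues.items():
        if p is None or p < alpha:
            kept.append(gene)
    return kept


# Hypothetical p-values; None marks a gene that could not be tested.
example = {"GENE_A": 0.001, "GENE_B": 0.30, "GENE_C": None}
kept_genes = filter_by_batch_pvalue(example, alpha=0.05)
```

Under these assumptions GENE_B is the only gene removed: it was testable and its p-value is above alpha, while GENE_C is kept because no test could be run.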

sceleto.markers.simple(adata, groupby: str, **kwargs) → MarkersSimple[source]¶: Simple (non-graph) marker workflow.

sceleto.markers.hierarchy(adata, marker_runs, **kwargs) → HierarchyRun[source]¶: Wrapper for cross-resolution hierarchy workflow.

sceleto.markers.sweep_fc(adata, groupby: str, **kwargs)[source]¶

Sweep FC thresholds to help determine thres_fc.

See sceleto.markers.graph.sweep_fc_threshold() for full docs.
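To illustrate the general idea of a threshold sweep, the sketch below counts how many fold-change values survive each candidate cutoff. It is an assumption-laden illustration only: the real sweep_fc_threshold() may compute different statistics, and the function name, inputs, and the ">=" comparison here are hypothetical.

```python
from typing import Dict, Iterable


def sweep_thresholds(
    fold_changes: Iterable[float],
    thresholds: Iterable[float],
) -> Dict[float, int]:
    """Count how many fold-change values are >= each candidate threshold."""
    fcs = list(fold_changes)
    return {t: sum(1 for fc in fcs if fc >= t) for t in thresholds}


# Hypothetical fold changes and candidate thresholds.
counts = sweep_thresholds([0.5, 1.2, 2.0, 3.5], [1.0, 2.0, 3.0])
```

Plotting such counts against the threshold is one common way to look for a cutoff where the number of surviving candidates stabilizes.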

Results¶

class sceleto.markers.graph._onestep.MarkerGraphRun(ctx: Any, edge_gene_df: DataFrame, edge_fc: DataFrame, edge_delta: DataFrame, labels: Any, note_df: DataFrame, G: Any, pos: Any, gene_edge_fc: Dict[str, Dict[Tuple[object, object], float]], gene_to_edges: Dict[str, List[str]], viz: Any, specific_ranking_df: DataFrame, _marker_log: Dict[str, List[str]], batch_key: str | None = None, edge_metric: Literal['fc', 'delta'] = 'fc', sweep_df: DataFrame | None = None, suggested_thres_fc: float | None = None)[source]¶

Bases: MarkersBase

Container for one-step marker-graph pipeline results.

Notes¶

Keeps intermediate artifacts for debugging and inspection.

ctx: Any¶

edge_gene_df: DataFrame¶

edge_fc: DataFrame¶

edge_delta: DataFrame¶

labels: Any¶

note_df: DataFrame¶

G: Any¶

pos: Any¶

gene_edge_fc: Dict[str, Dict[Tuple[object, object], float]]¶

gene_to_edges: Dict[str, List[str]]¶

viz: Any¶

specific_ranking_df: DataFrame¶

batch_key: str | None = None¶

edge_metric: Literal['fc', 'delta'] = 'fc'¶

sweep_df: DataFrame | None = None¶

suggested_thres_fc: float | None = None¶

plot_fc_threshold(**kwargs)[source]¶: Plot threshold sweep results. Only available if the run used thres_fc='auto'.

plot_gene_edges_fc(gene: str, **kwargs)[source]¶

plot_gene_levels_with_edges(gene: str, level: int | None = None, **kwargs)[source]¶

plot_highlight_edges(edges, **kwargs)[source]¶

property adata¶

property groupby¶

property markers: Dict[str, List[str]]¶: Per-group marker gene lists, ranked by specificity score.

batch_mean_detail(adata, gene: str, group: str)[source]¶

Return per-batch mean expression for a specific (gene, group).

Parameters¶

adata (AnnData): The same AnnData used in run_marker_graph().
gene (str): Marker gene name.
group (str): Cluster where the gene is highly expressed.

Returns¶

DataFrame with columns: edge_start, edge_end, batch, mean_start, mean_end, n_cells_start, n_cells_end.

class sceleto.markers._simple.MarkersSimple(adata, groupby: str, *, single: bool = True, gap_thres: float = 0.2, min_mean: float = 0.2, min_frac: float = 0.2, min_count: int = 10, run_scanpy: bool = False, **kwargs)[source]¶

Bases: MarkersBase

Simple (non-graph) marker workflow.

Detects markers by comparing mean expression and dropout ratio across clusters without requiring graph structure.

Parameters¶

adata: AnnData object (must have .raw populated).
groupby: Column in adata.obs to group cells by.
single: If True (default), use single-top mode; otherwise multi-top.
gap_thres, min_mean, min_frac, min_count: Thresholds for marker filtering.
run_scanpy: If True, also run scanpy’s rank_genes_groups for comparison.
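As a rough illustration of how thresholds like gap_thres, min_mean, and min_frac could combine in a single-top rule, here is a minimal sketch. It is not sceleto's algorithm: it assumes the gap is the difference between the highest and second-highest cluster means, and that min_frac applies to the fraction of expressing cells (one minus the dropout ratio) in the top cluster. The function name and the example cluster names are hypothetical.

```python
from typing import Dict, Optional


def single_top_marker(
    means: Dict[str, float],
    fracs: Dict[str, float],
    gap_thres: float = 0.2,
    min_mean: float = 0.2,
    min_frac: float = 0.2,
) -> Optional[str]:
    """Return the cluster a gene marks, or None if no cluster qualifies."""
    # Rank clusters by mean expression, highest first.
    ranked = sorted(means, key=means.get, reverse=True)
    top = ranked[0]
    second_mean = means[ranked[1]] if len(ranked) > 1 else 0.0
    # The top cluster must express the gene strongly and broadly enough.
    if means[top] < min_mean or fracs[top] < min_frac:
        return None
    # The top cluster must be separated from the runner-up (assumed gap rule).
    if means[top] - second_mean < gap_thres:
        return None
    return top


# Hypothetical per-cluster statistics for two genes.
clear_case = single_top_marker(
    {"T": 1.5, "B": 0.3, "NK": 0.2}, {"T": 0.8, "B": 0.1, "NK": 0.1}
)
ambiguous_case = single_top_marker({"T": 0.5, "B": 0.45}, {"T": 0.5, "B": 0.5})
```

In the first case the gene is assigned to "T"; in the second the two clusters are too close, so no marker is reported.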

property markers: Dict[str, List[str]]¶: Per-group marker gene lists (gene names only, ranked by score).

property markers_scored: Dict[str, List[Tuple[str, float]]]¶: Per-group marker gene lists with scores.

property stats: Dict[str, Dict[str, ndarray]]¶: Access computed cluster statistics.

show_marker(celltype=None, n: int = 40, result: bool = False)[source]¶

find_markers_groups(groups, **kwargs)[source]¶: Find markers distinguishing specific groups from the rest.

find_markers_negative(**kwargs)[source]¶: Find negative markers (lowest expression in one cluster).

class sceleto.markers._hierarchy.HierarchyRun(levels: 'List[str]', params: 'Dict[str, Any]', icls_full_dict: 'Dict[str, str]', icls_path_df: 'pd.DataFrame', marker_rank_df: 'pd.DataFrame', full_gene_lists: 'Dict[str, List[str]]', contexts: 'Dict[str, Any]', batch_expression: 'Optional[Dict[str, BatchExpression]]', batch_key: 'Optional[str]')[source]¶

Bases: object

levels: List[str]¶

params: Dict[str, Any]¶

icls_full_dict: Dict[str, str]¶

icls_path_df: DataFrame¶

marker_rank_df: DataFrame¶

full_gene_lists: Dict[str, List[str]]¶

contexts: Dict[str, Any]¶

batch_expression: Dict[str, BatchExpression] | None¶

batch_key: str | None¶

interactive_viewer(adata, mgr, *, save: str = 'interactive_viewer.html', n_top: int | None = None) → None[source]¶

Generate an interactive HTML viewer with edge-activation panel.

Layout: icls UMAP (left) + marker comparison heatmap (top-right) + per-gene edge-activation graph (bottom-right). In batch mode the heatmap shows per-batch expression strips instead of presence.

Parameters¶

adata

AnnData with obs['icls'] (set by hierarchy) and obsm['X_umap'].

mgr

sceleto.markers.graph.MarkerGraphRun driving the bottom-right edge-activation graph. Typically:

mgr = scl.markers.marker(adata, "icls")

save

Output HTML file path.

n_top

Number of top markers per cluster. Defaults to the value used in the hierarchy run.

compare_markers(icls: str, *, figsize=None, gene_filter: GeneFilter | None = None, return_genes: bool = False)[source]¶: Visualize top-N marker overlap across levels for a given icls.

compare_markers_batch(icls: str, *, figsize=None, gene_filter: GeneFilter | None = None, return_genes: bool = False)[source]¶

Visualize top-N marker overlap with per-batch expression strips.

Each strip in a (level, gene) cell encodes one batch:

grey : batch has no cells in this cluster (no data)

white : batch has cells but mean expression is 0

red : colored by mean / cell_max, where cell_max is the maximum batch mean within that (cluster, gene) cell

Color scale is per-cell normalized (0–1), so each cell’s brightest batch is always 1. This makes batch consistency visible regardless of absolute expression level.
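The strip encoding described above can be sketched as follows. This is an illustrative reading of the documented rules, not the plotting code itself: it assumes a batch with no cells is represented by None, and the function name and return representation (color names for grey and white, a float intensity for red) are hypothetical.

```python
from typing import List, Optional, Union

Strip = Union[str, float]


def encode_strips(batch_means: List[Optional[float]]) -> List[Strip]:
    """Encode one (cluster, gene) cell: one strip per batch."""
    # cell_max is the largest batch mean among batches that have cells.
    observed = [m for m in batch_means if m is not None]
    cell_max = max(observed, default=0.0)
    strips: List[Strip] = []
    for m in batch_means:
        if m is None:
            strips.append("grey")  # batch has no cells in this cluster
        elif m == 0 or cell_max == 0:
            strips.append("white")  # cells present, mean expression is 0
        else:
            strips.append(m / cell_max)  # red intensity in (0, 1]
    return strips


# Hypothetical batch means for one cell; the last batch has no cells.
strips = encode_strips([2.0, 0.5, 0.0, None])
```

Because each cell is divided by its own maximum, the brightest batch always has intensity 1, which is why the display emphasizes consistency across batches rather than absolute expression.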

Gene Filter¶

class sceleto.markers.GeneFilter(exclude: Sequence[str] | None = None, include: Sequence[str] | None = None, name_exclude: Sequence[str] | None = None, dot_filter: bool = False)[source]¶

Bases: object

Composable gene-name filter.

Parameters¶

exclude: Category names from gene_categories.json. A gene is dropped if it belongs to any of the listed categories.
include: Category names from gene_categories.json. When provided, only genes present in the union of these categories are kept (after exclude filtering).
name_exclude: List of name patterns to exclude. Multi-character entries use prefix matching (startswith). The special entry "." excludes any gene whose name contains a dot (e.g. AL035401.1).
dot_filter: Deprecated. Use name_exclude=["."] instead.

filter(genes: Sequence[str]) → List[str][source]¶: Return the sublist of genes that pass the filter.
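The name_exclude rules described above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the GeneFilter implementation: it treats every entry other than "." as a prefix (the documentation only specifies this for multi-character entries), and it ignores the category-based exclude and include options. The function name and the example patterns are hypothetical.

```python
from typing import List, Sequence


def apply_name_exclude(
    genes: Sequence[str],
    name_exclude: Sequence[str],
) -> List[str]:
    """Drop genes matching a prefix pattern or, if "." is listed, containing a dot."""
    drop_dotted = "." in name_exclude
    # Assumption: every entry other than "." is matched as a prefix.
    prefixes = tuple(p for p in name_exclude if p != ".")
    kept = []
    for gene in genes:
        if drop_dotted and "." in gene:
            continue
        if prefixes and gene.startswith(prefixes):
            continue
        kept.append(gene)
    return kept


# Hypothetical gene list and patterns.
kept = apply_name_exclude(
    ["CD3E", "MT-CO1", "AL035401.1", "RPL13"],
    ["MT-", "RPL", "."],
)
```

Going by the documented signature, the equivalent call with the library would presumably be GeneFilter(name_exclude=["MT-", "RPL", "."]).filter(genes), though that is not verified here.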