cellspec.pp.filter_to_snps
- cellspec.pp.filter_to_snps(adata, chrom_prefix=None, inplace=True)
Filter to only single nucleotide variants (SNPs).
Removes indels and multi-allelic sites, keeping only sites where both reference and alternate alleles are single nucleotides.
- Parameters:
adata (ad.AnnData) – AnnData object with variants
chrom_prefix (str, optional) – Chromosome prefix for parsing variant IDs. If None (default), accepts any chromosome naming (e.g., ‘chr1’, ‘I’, ‘1’). Use ‘chr’ for human/mouse data if you want to be strict.
inplace (bool, default True) – If True, modify adata in place; if False, return a filtered copy
- Return type:
ad.AnnData or None
- Returns:
Filtered AnnData (if inplace=False), otherwise None
Examples
>>> import cellspec as spc
>>> # Default: works with any chromosome naming
>>> adata = spc.pp.load_vcf("variants.vcf.gz")
>>> spc.pp.filter_to_snps(adata)
>>> print(f"Retained {adata.n_vars} SNPs")
>>>
>>> # For human/mouse data, can specify prefix
>>> spc.pp.filter_to_snps(adata, chrom_prefix="chr")
Notes
This is typically run before trinucleotide context annotation, because trinucleotide contexts are defined only for SNPs.
Works with any organism’s chromosome naming:
- Human/mouse: ‘chr1-12345-A>T’
- C. elegans: ‘I-12345-A>T’
- Drosophila: ‘2L-12345-A>T’
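The ID formats above suggest how a per-variant SNP check could work. The following is a minimal sketch, not cellspec's actual implementation: the helper `is_snp_id`, the regular expression, the restriction to the bases A/C/G/T, and the comma-separated form used for a multi-allelic site are all assumptions made for illustration. It assumes IDs of the form `<chrom>-<pos>-<ref>><alt>` and treats `chrom_prefix` as a required prefix of the chromosome name.

```python
import re

# Assumed ID layout: <chrom>-<pos>-<ref>><alt>, e.g. "chr1-12345-A>T".
# This pattern is illustrative only; it is not taken from cellspec.
_VARIANT_ID = re.compile(
    r"^(?P<chrom>.+)-(?P<pos>\d+)-(?P<ref>[^>]+)>(?P<alt>.+)$"
)
_BASES = frozenset("ACGT")  # assumption: only unambiguous bases count


def is_snp_id(variant_id, chrom_prefix=None):
    """Return True if both alleles in the ID are single nucleotides.

    Hypothetical helper. If chrom_prefix is given, the chromosome name
    must also start with it; if None, any chromosome name is accepted.
    """
    match = _VARIANT_ID.match(variant_id)
    if match is None:
        return False
    if chrom_prefix is not None and not match["chrom"].startswith(chrom_prefix):
        return False
    # Multi-base alleles (indels) and comma-joined alleles are not in _BASES.
    return match["ref"].upper() in _BASES and match["alt"].upper() in _BASES


ids = [
    "chr1-12345-A>T",   # SNP, human/mouse naming
    "I-12345-A>T",      # SNP, C. elegans naming
    "2L-12345-AT>A",    # deletion, rejected
    "chr2-500-G>C,T",   # multi-allelic (assumed comma form), rejected
]
kept_any = [v for v in ids if is_snp_id(v)]
kept_chr = [v for v in ids if is_snp_id(v, chrom_prefix="chr")]
```

With `chrom_prefix=None` both SNP IDs pass regardless of naming scheme; with `chrom_prefix="chr"` only the ‘chr’-prefixed one does, which is one reading of the "strict" behaviour described under Parameters.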