cycombinepy.io.read_fcs_dir

cycombinepy.io.read_fcs_dir(data_dir, pattern='*.fcs', metadata=None, filename_col='filename', sample_key=None, batch_key=None, condition_key=None, anchor_key=None, markers=None, transform=True, cofactor=5.0, derand=True, downsample=None, sampling_type='random', seed=None)

Read all FCS files in data_dir into a single AnnData.

Mirrors compile_fcs + convert_flowset + prepare_data from R/01_prepare_data.R. Metadata (a DataFrame or a path to a CSV/TSV/XLSX table) is joined on the basename of each FCS file via filename_col. A metadata column is renamed to batch / sample / condition / anchor if the corresponding *_key argument names it.
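The basename join and the renaming to canonical names can be pictured with a small standard-library sketch. It is an illustration only, not the library's implementation: the annotate helper, the file paths, and the Batch and Patient column names are all hypothetical, and it assumes the filename column stores basenames including the .fcs extension.

```python
from pathlib import PurePosixPath

# Hypothetical metadata rows; "Batch" and "Patient" are made-up column names.
metadata = [
    {"filename": "a.fcs", "Batch": "B1", "Patient": "P01"},
    {"filename": "b.fcs", "Batch": "B2", "Patient": "P02"},
]
# Hypothetical FCS paths, as a glob over data_dir might return them.
fcs_paths = ["/data/run1/a.fcs", "/data/run1/b.fcs"]


def annotate(paths, rows, filename_col="filename", batch_key=None, sample_key=None):
    """Attach one metadata row to each file, matched on the file's basename."""
    by_name = {row[filename_col]: row for row in rows}
    # Map user-supplied column names to the canonical names.
    pairs = ((batch_key, "batch"), (sample_key, "sample"))
    rename = {key: canon for key, canon in pairs if key}
    out = []
    for path in paths:
        row = by_name[PurePosixPath(path).name]  # directory part is ignored
        out.append({rename.get(col, col): value for col, value in row.items()})
    return out


annotated = annotate(fcs_paths, metadata, batch_key="Batch", sample_key="Patient")
```

Each entry of annotated now carries batch and sample keys in place of Batch and Patient, which is the shape the description above gives for adata.obs.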

Parameters:

data_dir (str | PathLike) – Directory containing FCS files.
pattern (str) – Glob pattern for selecting files.
metadata (DataFrame | str | PathLike | None) – DataFrame or path to a CSV/TSV/XLSX table. Must contain filename_col matching the FCS basenames.
filename_col (str) – Column in metadata holding the FCS filenames.
anchor_key (str | None) – Column of metadata to use as the anchor. As with sample_key, batch_key, and condition_key, the named column appears under its canonical name (here anchor) in the resulting adata.obs.
markers (Optional[Iterable[str]]) – Restrict to these var_names after loading (optional).
transform (bool) – If True, apply cycombinepy.transform_asinh() with cofactor.
cofactor (float) – Forwarded to transform_asinh.
derand (bool) – Forwarded to transform_asinh.
downsample (int | None) – If given, downsample each unit (defined by sampling_type) to this many cells.
sampling_type (Literal['random', 'per_batch', 'per_sample']) – How to downsample: uniformly at random, or per batch / per sample.
seed (int | None) – RNG seed.
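sample_key (str | None) – Column of metadata to use as the sample; appears as sample in adata.obs.
batch_key (str | None) – Column of metadata to use as the batch; appears as batch in adata.obs.
condition_key (str | None) – Column of metadata to use as the condition; appears as condition in adata.obs.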
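As a rough picture of what the transform and downsample parameters control, here is a standard-library sketch. It is not the code behind transform_asinh or read_fcs_dir. Two points are assumptions: that derand rounds values up before the transform (as the R cyCombine package does), and that sampling_type='random' draws from all cells pooled together. The helper names and the toy cells are hypothetical.

```python
import math
import random
from collections import defaultdict


def asinh_transform(values, cofactor=5.0, derand=True):
    """Arcsinh-transform raw intensities: asinh(x / cofactor)."""
    out = []
    for v in values:
        if derand:
            v = math.ceil(v)  # assumption: same rounding-up as R cyCombine
        out.append(math.asinh(v / cofactor))
    return out


def downsample(cells, n, sampling_type="random", seed=None):
    """Keep at most n cells per unit; the unit depends on sampling_type."""
    rng = random.Random(seed)
    if sampling_type == "random":
        groups = {"all": list(cells)}  # assumption: one pooled unit
    else:
        key = {"per_batch": "batch", "per_sample": "sample"}[sampling_type]
        groups = defaultdict(list)
        for cell in cells:
            groups[cell[key]].append(cell)
    kept = []
    for members in groups.values():
        kept.extend(rng.sample(members, min(n, len(members))))
    return kept


# Toy data: six cells in batch A, two cells in batch B.
cells = [{"batch": "A", "sample": "s1"}] * 6 + [{"batch": "B", "sample": "s2"}] * 2
transformed = asinh_transform([0.0, 4.2])
per_batch = downsample(cells, 3, sampling_type="per_batch", seed=0)
pooled = downsample(cells, 3, sampling_type="random", seed=0)
```

With per_batch, a batch smaller than the requested size is kept whole, so the toy data yields three cells from batch A and both cells from batch B.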

Return type:

AnnData
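A call using the parameters documented above might look like the following sketch. The directory, the metadata path, and the Batch and Patient column names are placeholders, not values the library prescribes. The wrapper function exists only so the example can be defined without the package installed.

```python
def load_fcs_example(data_dir="path/to/fcs_dir", metadata="path/to/metadata.csv"):
    """Call read_fcs_dir with placeholder paths and column names."""
    # Imported here so that defining this sketch does not require the package.
    from cycombinepy.io import read_fcs_dir

    adata = read_fcs_dir(
        data_dir,
        pattern="*.fcs",
        metadata=metadata,
        filename_col="filename",
        batch_key="Batch",       # placeholder metadata column
        sample_key="Patient",    # placeholder metadata column
        transform=True,          # arcsinh transform with the cofactor below
        cofactor=5.0,
        derand=True,
        downsample=10_000,       # at most this many cells per sample
        sampling_type="per_sample",
        seed=0,
    )
    return adata
```

The returned AnnData should then expose the joined metadata in adata.obs under the canonical names batch and sample.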