cycombinepy.io.read_fcs_dir

cycombinepy.io.read_fcs_dir(data_dir, pattern='*.fcs', metadata=None, filename_col='filename', sample_key=None, batch_key=None, condition_key=None, anchor_key=None, markers=None, transform=True, cofactor=5.0, derand=True, downsample=None, sampling_type='random', seed=None)

Read all FCS files in data_dir into a single AnnData.

Mirrors compile_fcs + convert_flowset + prepare_data from R/01_prepare_data.R. Metadata (a DataFrame or a path to a CSV/TSV/XLSX table) is joined on the basename of each FCS file via filename_col. Metadata columns named by the *_key arguments are renamed to batch / sample / condition / anchor.
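The basename join and the renaming to canonical names can be pictured with a small standard-library sketch. This is not the library's implementation: attach_metadata, key_map, and the Plate / Donor columns are hypothetical names used only for illustration, and raising KeyError for a file without a metadata row is an assumption, since the behaviour for unmatched files is not documented here.

```python
from pathlib import PurePath


def attach_metadata(fcs_paths, metadata_rows, filename_col="filename", key_map=None):
    """Pair each FCS path with the metadata row whose filename_col equals its basename.

    key_map maps canonical names (e.g. "batch") to metadata columns (e.g. "Plate").
    """
    key_map = key_map or {}
    # Index the metadata rows by the basename stored in filename_col.
    by_name = {PurePath(row[filename_col]).name: row for row in metadata_rows}
    joined = {}
    for path in fcs_paths:
        name = PurePath(path).name
        row = by_name[name]  # assumption: a missing row raises KeyError
        # Expose the selected columns under their canonical names.
        joined[name] = {canonical: row[col] for canonical, col in key_map.items()}
    return joined


fcs_paths = ["data/run1/a.fcs", "data/run1/b.fcs"]
metadata_rows = [
    {"filename": "a.fcs", "Plate": "P1", "Donor": "D1"},
    {"filename": "b.fcs", "Plate": "P2", "Donor": "D2"},
]
joined = attach_metadata(
    fcs_paths, metadata_rows, key_map={"batch": "Plate", "sample": "Donor"}
)
```

Here key_map plays the role of batch_key='Plate' and sample_key='Donor': the values are looked up in the metadata columns and exposed under the canonical names.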

Parameters:
  • data_dir (str | PathLike) – Directory containing FCS files.

  • pattern (str) – Glob pattern for selecting files.

  • metadata (DataFrame | str | PathLike | None) – DataFrame or path to a CSV/TSV/XLSX table. Must contain filename_col matching the FCS basenames.

  • filename_col (str) – Column in metadata holding the FCS filenames.

  • anchor_key (str | None) – Column of metadata identifying the anchor. Like the columns named by sample_key, batch_key, and condition_key, it appears in the resulting adata.obs under its canonical name (anchor).

  • markers (Optional[Iterable[str]]) – Restrict to these var_names after loading (optional).

  • transform (bool) – If True, apply cycombinepy.transform_asinh() with cofactor.

  • cofactor (float) – Forwarded to transform_asinh.

  • derand (bool) – Forwarded to transform_asinh.

  • downsample (int | None) – If given, downsample each unit (defined by sampling_type) to this many cells.

  • sampling_type (Literal['random', 'per_batch', 'per_sample']) – How to downsample: uniformly at random, or per batch / per sample.

  • seed (int | None) – RNG seed.

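  • sample_key (str | None) – Column of metadata identifying the sample; appears in adata.obs as sample.

  • batch_key (str | None) – Column of metadata identifying the batch; appears in adata.obs as batch.

  • condition_key (str | None) – Column of metadata identifying the condition; appears in adata.obs as condition.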
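To make the transform and downsample parameters more concrete, here is a minimal standard-library sketch of the two ideas. It assumes the conventional cytometry scaling asinh(x / cofactor) and sampling without replacement within each group; whether cycombinepy.transform_asinh() and read_fcs_dir behave exactly this way (including the effect of derand and the handling of groups smaller than downsample) is not stated on this page. The function names asinh_transform and downsample_per_group are hypothetical.

```python
import math
import random
from collections import defaultdict


def asinh_transform(values, cofactor=5.0):
    """Conventional arcsinh scaling: asinh(x / cofactor)."""
    return [math.asinh(v / cofactor) for v in values]


def downsample_per_group(groups, n, seed=None):
    """Keep at most n cell indices per group label, sampled without replacement."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for idx, label in enumerate(groups):
        by_group[label].append(idx)
    kept = []
    # Iterate in sorted label order so a fixed seed gives a reproducible result.
    for label in sorted(by_group):
        members = by_group[label]
        kept.extend(rng.sample(members, min(n, len(members))))
    return sorted(kept)


scaled = asinh_transform([0.0, 5.0, 50.0])
batches = ["A", "A", "A", "B", "B"]
kept = downsample_per_group(batches, n=2, seed=0)
```

With sampling_type='per_batch' the group labels would be the batch column, and with 'per_sample' the sample column; fixing seed makes the selection repeatable.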

Return type:

AnnData
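A call might look like the following sketch, which uses only keyword arguments from the signature above. The directory fcs_files, the file metadata.csv, and the metadata columns Plate, Donor, and Group are hypothetical; the wrapper load_panel is likewise just for illustration.

```python
from pathlib import Path

# Keyword arguments taken from the documented signature. The metadata path and
# the column names "Plate", "Donor", and "Group" are hypothetical.
read_kwargs = dict(
    pattern="*.fcs",
    metadata="metadata.csv",
    filename_col="filename",
    batch_key="Plate",
    sample_key="Donor",
    condition_key="Group",
    transform=True,
    cofactor=5.0,
    downsample=10_000,
    sampling_type="per_sample",
    seed=0,
)


def load_panel(data_dir):
    """Read every matching FCS file under data_dir into one AnnData."""
    # Imported lazily so the snippet can be read without cycombinepy installed.
    from cycombinepy.io import read_fcs_dir

    return read_fcs_dir(Path(data_dir), **read_kwargs)


# adata = load_panel("fcs_files")  # hypothetical directory of FCS files
```

If the call succeeds, the returned AnnData should carry the batch, sample, and condition columns in adata.obs, as described for the *_key parameters.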