tidypolars_extra.funs

Functions

between(x, left, right)

Test if values of a column are between two values

case_when(*args[, _default])

Return a value per row based on a series of conditions

coalesce(*args)

Coalesce missing values

if_else(condition, true, false)

Choose between two values based on a condition

is_finite(x)

Test if values are finite

is_in(x, values)

Test if values are in a list

is_infinite(x)

Test if values are infinite

is_not(x)

Negate a boolean expression

is_not_in(x, values)

Test if values are not in a list

is_not_null(x)

Test if values are not null

is_null(x)

Test if values are null

lead(x[, n, default])

Get leading values

map(cols, _fun)

Apply function by row

n_distinct(x)

Get number of distinct values in a column

n_missing(x)

Count the number of null/missing values in a column

pct_missing(x)

Compute the percentage of null/missing values in a column

rep(x[, times])

Replicate the values in x

replace_null(x[, replace])

Replace null values

round(x[, digits])

Round a column to the specified number of decimal places

row_number()

Return row number

Module Contents

tidypolars_extra.funs.between(x, left, right)[source]

Test if values of a column are between two values

Parameters:
  • x (Expr, Series) – Column to operate on

  • left (int) – Value to test if column is greater than or equal to

  • right (int) – Value to test if column is less than or equal to

Examples

>>> df = tp.tibble(x = range(4))
>>> df.filter(tp.between(col('x'), 1, 3))
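
The parameter descriptions state that both bounds are inclusive. The plain-Python sketch below only illustrates that comparison on the same data as the example; `between_sketch` is a hypothetical helper, not part of tidypolars_extra.

```python
def between_sketch(values, left, right):
    """Return one boolean per value: True when left <= value <= right."""
    return [left <= v <= right for v in values]


x = list(range(4))  # same data as the example above: 0, 1, 2, 3
mask = between_sketch(x, 1, 3)

# Keep only the values whose mask entry is True, as a filter would.
kept = [v for v, keep in zip(x, mask) if keep]
print(kept)
```

Both endpoints (1 and 3) survive the filter, while 0 is dropped.
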
tidypolars_extra.funs.case_when(*args, _default=None)[source]

Return a value per row based on a series of conditions

Parameters:
  • *args (Expr) – When called with a single expression, returns pl.when() for chaining (e.g., tp.case_when(cond).then(val).otherwise(val)). When called with paired args (condition, value, condition, value, …), builds the full case expression.

  • _default (optional) – Default value when no condition is met (used with paired args)

Examples

>>> df = tp.tibble(x = range(1, 4))
>>> # Chaining style
>>> df.mutate(case_x = tp.case_when(col('x') < 2).then(0)
...                     .when(col('x') < 3).then(1)
...                     .otherwise(0))
>>> # Paired args style
>>> df.mutate(
...    case_x = tp.case_when(col('x') < 2, 1,
...                          col('x') < 3, 2,
...                          _default = 0)
... )
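
The paired-argument form can be pictured with the plain-Python sketch below. It assumes the usual when/then convention in which the first matching condition wins and `_default` covers rows that match nothing. `case_when_sketch` is a hypothetical helper that takes predicates as callables; it is not the library's implementation.

```python
def case_when_sketch(values, *pairs, default=None):
    """Evaluate (condition, result) pairs per value; the first match wins."""
    if len(pairs) % 2 != 0:
        raise ValueError("expected condition/value pairs")
    conditions = pairs[0::2]
    results = pairs[1::2]
    out = []
    for v in values:
        for cond, res in zip(conditions, results):
            if cond(v):
                out.append(res)
                break
        else:
            # No condition matched this value.
            out.append(default)
    return out


x = [1, 2, 3]  # same data as tp.tibble(x = range(1, 4))
case_x = case_when_sketch(
    x,
    lambda v: v < 2, 1,
    lambda v: v < 3, 2,
    default=0,
)
print(case_x)
```

Under that assumption, the value 1 matches both conditions but takes the result of the first one.
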
tidypolars_extra.funs.coalesce(*args)[source]

Coalesce missing values

Parameters:

args (Expr) – Columns to coalesce

Examples

>>> df = tp.tibble(x = [None, None, 1], y = [2, None, 2])
>>> df.mutate(coalesce_x = tp.coalesce(col('x'), col('y')))
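
Coalescing conventionally means taking, row by row, the first value that is not null among the given columns. The plain-Python sketch below illustrates that convention with `None` standing in for null; `coalesce_sketch` and the sample columns are hypothetical and are not taken from the library.

```python
def coalesce_sketch(*columns):
    """Return, for each row, the first value that is not None."""
    out = []
    for row in zip(*columns):
        # next() falls back to None when every value in the row is None.
        out.append(next((v for v in row if v is not None), None))
    return out


x = [None, None, 1]
y = [2, None, 2]
z = [3, 3, 3]
result = coalesce_sketch(x, y, z)
print(result)
```
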
tidypolars_extra.funs.if_else(condition, true, false)[source]

Choose between two values based on a condition

Parameters:
  • condition (Expr) – A logical expression

  • true – Value if the condition is true

  • false – Value if the condition is false

Examples

>>> df = tp.tibble(x = range(1, 4))
>>> df.mutate(if_x = tp.if_else(col('x') < 2, 1, 2))
tidypolars_extra.funs.is_finite(x)[source]

Test if values are finite

Parameters:

x (Expr, Series) – Column to operate on

Examples

>>> df.mutate(finite = tp.is_finite('x'))
tidypolars_extra.funs.is_in(x, values)[source]

Test if values are in a list

Parameters:
  • x (Expr, Series) – Column to operate on

  • values (list) – List of values to check

Examples

>>> df.mutate(in_list = tp.is_in('x', [1, 2]))
tidypolars_extra.funs.is_infinite(x)[source]

Test if values are infinite

Parameters:

x (Expr, Series) – Column to operate on

Examples

>>> df.mutate(infinite = tp.is_infinite('x'))
tidypolars_extra.funs.is_not(x)[source]

Negate a boolean expression

Parameters:

x (Expr) – Boolean expression to negate

Examples

>>> df.mutate(not_finite = tp.is_not(tp.is_finite(col('x'))))
tidypolars_extra.funs.is_not_in(x, values)[source]

Test if values are not in a list

Parameters:
  • x (Expr, Series) – Column to operate on

  • values (list) – List of values to check

Examples

>>> df.mutate(not_in = tp.is_not_in('x', [1, 2]))
tidypolars_extra.funs.is_not_null(x)[source]

Test if values are not null

Parameters:

x (Expr, Series) – Column to operate on

Examples

>>> df.mutate(not_null = tp.is_not_null('x'))
tidypolars_extra.funs.is_null(x)[source]

Test if values are null

Parameters:

x (Expr, Series) – Column to operate on

Examples

>>> df.mutate(null = tp.is_null('x'))
tidypolars_extra.funs.lead(x, n: int = 1, default=None)[source]

Get leading values

Parameters:
  • x (Expr, Series) – Column to operate on

  • n (int) – Number of positions to lead by

  • default (optional) – Value used to fill the positions left empty by the shift

Examples

>>> df.mutate(lead_x = tp.lead(col('x')))
>>> df.mutate(lead_x = col('x').lead())
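
A lead shifts each value `n` positions toward the start of the column, so row i sees the value from row i + n. The plain-Python sketch below assumes the trailing positions are filled with `default`; `lead_sketch` is a hypothetical helper, not the library's implementation.

```python
def lead_sketch(values, n=1, default=None):
    """Shift values n positions toward the start, padding the end."""
    values = list(values)
    if n < 0:
        raise ValueError("n must be non-negative")
    k = min(n, len(values))  # never pad more rows than exist
    return values[k:] + [default] * k


x = [1, 2, 3]
lead_1 = lead_sketch(x)                 # next value in each row
lead_2 = lead_sketch(x, n=2, default=0)  # two rows ahead, padded with 0
print(lead_1, lead_2)
```
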
tidypolars_extra.funs.map(cols, _fun)[source]

Apply function by row

Parameters:
  • cols (list of str) – Names of the columns to apply the function to

  • _fun (a function) – The function to apply to the columns. The function is applied to each row separately
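
The idea of applying a function to each row separately can be sketched in plain Python as follows. The data layout (a dict of column lists) and the calling convention (row values passed as positional arguments) are assumptions made for illustration only; the source does not say how `_fun` receives a row, and `map_rows_sketch` is a hypothetical helper.

```python
def map_rows_sketch(data, cols, fun):
    """Apply fun once per row, using only the selected columns."""
    selected = [data[c] for c in cols]
    # zip(*selected) yields one tuple per row across the selected columns.
    return [fun(*row) for row in zip(*selected)]


data = {"x": [1, 2, 3], "y": [10, 20, 30], "z": ["a", "b", "c"]}
row_sums = map_rows_sketch(data, ["x", "y"], lambda x, y: x + y)
print(row_sums)
```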

tidypolars_extra.funs.n_distinct(x)[source]

Get number of distinct values in a column

Parameters:

x (Expr, Series) – Column to operate on

Examples

>>> df.summarize(n_x = tp.n_distinct('x'))
>>> df.summarize(n_x = tp.n_distinct(col('x')))
tidypolars_extra.funs.n_missing(x)[source]

Count the number of null/missing values in a column

Parameters:

x (Expr, str) – Column to operate on

Returns:

Count of null values.

Return type:

Expr

Examples

>>> df.summarize(missing = tp.n_missing('x'))
tidypolars_extra.funs.pct_missing(x)[source]

Compute the percentage of null/missing values in a column

Parameters:

x (Expr, str) – Column to operate on

Returns:

Percentage of null values (0 to 100).

Return type:

Expr

Examples

>>> df.summarize(pct = tp.pct_missing('x'))
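
The relationship between `n_missing` and `pct_missing` can be sketched in plain Python, with `None` standing in for null. The result is on a 0 to 100 scale, as stated above. The helper names are hypothetical, and returning `None` for an empty column is an assumption made for this sketch, not documented behavior.

```python
def n_missing_sketch(values):
    """Count the entries that are None."""
    return sum(v is None for v in values)


def pct_missing_sketch(values):
    """Share of None entries, expressed on a 0 to 100 scale."""
    values = list(values)
    if not values:
        return None  # assumption: an empty column has no defined percentage
    return 100 * n_missing_sketch(values) / len(values)


x = [1, None, 3, None]
missing = n_missing_sketch(x)
pct = pct_missing_sketch(x)
print(missing, pct)
```
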
tidypolars_extra.funs.rep(x, times=1)[source]

Replicate the values in x

Parameters:
  • x (const, Series) – Value or Series to repeat

  • times (int) – Number of times to repeat

Examples

>>> tp.rep(1, 3)
>>> tp.rep(pl.Series(range(3)), 3)
tidypolars_extra.funs.replace_null(x, replace=None)[source]

Replace null values

Parameters:
  • x (Expr, Series) – Column to operate on

  • replace (optional) – Value that replaces null values

Examples

>>> df = tp.tibble(x = [0, None], y = [None, None])
>>> df.mutate(x = tp.replace_null(col('x'), 1))
tidypolars_extra.funs.round(x, digits=0)[source]

Round a column to the specified number of decimal places

Parameters:
  • x (Expr, Series) – Column to operate on

  • digits (int) – Decimals to round to

Examples

>>> df.mutate(x = tp.round(col('x')))
tidypolars_extra.funs.row_number()[source]

Return row number

Examples

>>> df.mutate(row_num = tp.row_number())