{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pairwise Comparisons with `stat_pwc`\n",
    "\n",
    "This vignette demonstrates how to use `stat_pwc` to add **pairwise\n",
    "comparison p-values** to plots, similar to\n",
    "[ggpubr's `geom_pwc`](https://rpkgs.datanovia.com/ggpubr/reference/geom_pwc.html).\n",
    "\n",
    "`stat_pwc` performs pairwise statistical tests (t-test, Wilcoxon) between\n",
    "groups and displays the results as **bracket annotations** with p-values.\n",
    "It supports:\n",
    "\n",
    "- All pairwise combinations\n",
    "- Comparisons against a reference group\n",
    "- Explicit comparison pairs\n",
    "- Multiple p-value adjustment methods (Bonferroni, Holm, BH, etc.)\n",
    "- Various label formats (p-value, significance stars)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from plotnine import (\n",
    "    aes,\n",
    "    facet_wrap,\n",
    "    geom_boxplot,\n",
    "    geom_jitter,\n",
    "    ggplot,\n",
    "    labs,\n",
    "    scale_y_continuous,\n",
    "    theme_minimal,\n",
    ")\n",
    "from plotnine_extra import stat_pwc"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Preparation\n",
    "\n",
    "We simulate a dataset modelled on R's `ToothGrowth`, which records\n",
    "tooth length (`len`) across supplement types (`supp`) and dose levels\n",
    "(`dose`). Each supplement and dose combination gets 10 normally\n",
    "distributed observations, 60 rows in total."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "np.random.seed(42)\n",
    "\n",
    "# Create a ToothGrowth-like dataset\n",
    "df = pd.DataFrame({\n",
    "    \"dose\": np.tile(np.repeat([\"D0.5\", \"D1\", \"D2\"], 10), 2),\n",
    "    \"supp\": np.repeat([\"OJ\", \"VC\"], 30),\n",
    "    \"len\": np.concatenate([\n",
    "        # OJ groups\n",
    "        np.random.normal(13, 4, 10),  # OJ, dose 0.5\n",
    "        np.random.normal(23, 3, 10),  # OJ, dose 1\n",
    "        np.random.normal(26, 2, 10),  # OJ, dose 2\n",
    "        # VC groups\n",
    "        np.random.normal(8, 3, 10),   # VC, dose 0.5\n",
    "        np.random.normal(17, 4, 10),  # VC, dose 1\n",
    "        np.random.normal(26, 3, 10),  # VC, dose 2\n",
    "    ]),\n",
    "})\n",
    "\n",
    "df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic Usage: All Pairwise Comparisons\n",
    "\n",
    "By default, `stat_pwc` performs **all pairwise comparisons** between\n",
    "x-axis groups using the Wilcoxon rank-sum test (Mann-Whitney U). The\n",
    "default geom is `geom_bracket`, which draws brackets with p-value labels."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc()\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.15, 0))\n",
    "    + labs(\n",
    "        title=\"All Pairwise Comparisons (Wilcoxon test)\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
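  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the default behaviour concrete, the cell below sketches the\n",
    "same kind of computation outside the plot: a two-sided Mann-Whitney U\n",
    "test for every pair of groups, built from `itertools.combinations` and\n",
    "`scipy.stats.mannwhitneyu`. The helper `pairwise_mannwhitney` and the\n",
    "small `demo` frame are illustrative assumptions, not part of\n",
    "`stat_pwc`, and the p-values drawn on the plot may differ if `stat_pwc`\n",
    "uses other test options."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import combinations\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from scipy.stats import mannwhitneyu\n",
    "\n",
    "\n",
    "def pairwise_mannwhitney(data, group_col, value_col):\n",
    "    # Hypothetical helper: one two-sided test per pair of groups\n",
    "    groups = sorted(data[group_col].unique())\n",
    "    rows = []\n",
    "    for g1, g2 in combinations(groups, 2):\n",
    "        a = data.loc[data[group_col] == g1, value_col]\n",
    "        b = data.loc[data[group_col] == g2, value_col]\n",
    "        result = mannwhitneyu(a, b, alternative=\"two-sided\")\n",
    "        rows.append({\"group1\": g1, \"group2\": g2, \"p\": result.pvalue})\n",
    "    return pd.DataFrame(rows)\n",
    "\n",
    "\n",
    "# Small illustrative frame (made-up values, separate from `df` above)\n",
    "rng = np.random.default_rng(0)\n",
    "demo = pd.DataFrame({\n",
    "    \"dose\": np.repeat([\"D0.5\", \"D1\", \"D2\"], 10),\n",
    "    \"len\": np.concatenate([\n",
    "        rng.normal(10, 3, 10),\n",
    "        rng.normal(20, 3, 10),\n",
    "        rng.normal(26, 3, 10),\n",
    "    ]),\n",
    "})\n",
    "\n",
    "pwc_demo = pairwise_mannwhitney(demo, \"dose\", \"len\")\n",
    "pwc_demo"
   ]
  },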
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Using t-test Instead of Wilcoxon\n",
    "\n",
    "Set `method=\"t.test\"` to use the independent samples t-test."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(method=\"t.test\")\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.15, 0))\n",
    "    + labs(\n",
    "        title=\"All Pairwise Comparisons (t-test)\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Significance Stars\n",
    "\n",
    "Use `label=\"p.signif\"` to display significance stars instead of\n",
    "p-values. Each comparison gets the symbol for the smallest threshold\n",
    "its p-value meets:\n",
    "\n",
    "| Symbol | Meaning |\n",
    "|--------|----------|\n",
    "| `ns`   | p > 0.05 |\n",
    "| `*`    | p ≤ 0.05 |\n",
    "| `**`   | p ≤ 0.01 |\n",
    "| `***`  | p ≤ 0.001 |\n",
    "| `****` | p ≤ 0.0001 |"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(label=\"p.signif\")\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.15, 0))\n",
    "    + labs(\n",
    "        title=\"Significance Stars\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
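  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The mapping in the table above can be written as a short function.\n",
    "This is a minimal sketch of the convention only: `p_to_stars` is a\n",
    "hypothetical name, and it assumes the inclusive thresholds shown in\n",
    "the table rather than reproducing the internals of `stat_pwc`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def p_to_stars(p):\n",
    "    # Check the strictest threshold first so the most stars win\n",
    "    if p <= 0.0001:\n",
    "        return \"****\"\n",
    "    if p <= 0.001:\n",
    "        return \"***\"\n",
    "    if p <= 0.01:\n",
    "        return \"**\"\n",
    "    if p <= 0.05:\n",
    "        return \"*\"\n",
    "    return \"ns\"\n",
    "\n",
    "\n",
    "stars = [p_to_stars(p) for p in (0.2, 0.05, 0.003, 0.0005, 0.00001)]\n",
    "stars"
   ]
  },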
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Comparisons Against a Reference Group\n",
    "\n",
    "Use `ref_group` to compare each group against a reference. This is\n",
    "useful for comparing treatment groups against a control."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(\n",
    "        ref_group=\"D0.5\",\n",
    "        label=\"p.signif\",\n",
    "        method=\"t.test\",\n",
    "    )\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.12, 0))\n",
    "    + labs(\n",
    "        title=\"Comparisons vs Reference Group (D0.5)\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
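  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of what comparing against a reference means,\n",
    "the cell below tests each non-reference group against the reference\n",
    "with `scipy.stats.ttest_ind`. The helper `compare_to_reference` and the\n",
    "`ref_demo` frame are hypothetical. The sketch uses SciPy's default\n",
    "pooled-variance t-test; whether `stat_pwc` uses that variant or Welch's\n",
    "is not covered here, so treat the numbers as indicative only."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from scipy.stats import ttest_ind\n",
    "\n",
    "\n",
    "def compare_to_reference(data, group_col, value_col, ref_group):\n",
    "    # Hypothetical helper: one t-test per non-reference group\n",
    "    ref = data.loc[data[group_col] == ref_group, value_col]\n",
    "    rows = []\n",
    "    for group in sorted(data[group_col].unique()):\n",
    "        if group == ref_group:\n",
    "            continue\n",
    "        other = data.loc[data[group_col] == group, value_col]\n",
    "        result = ttest_ind(ref, other)  # pooled variance (SciPy default)\n",
    "        rows.append({\"group1\": ref_group, \"group2\": group, \"p\": result.pvalue})\n",
    "    return pd.DataFrame(rows)\n",
    "\n",
    "\n",
    "# Made-up values, separate from `df` above\n",
    "rng = np.random.default_rng(0)\n",
    "ref_demo = pd.DataFrame({\n",
    "    \"dose\": np.repeat([\"D0.5\", \"D1\", \"D2\"], 10),\n",
    "    \"len\": np.concatenate([\n",
    "        rng.normal(10, 3, 10),\n",
    "        rng.normal(20, 3, 10),\n",
    "        rng.normal(26, 3, 10),\n",
    "    ]),\n",
    "})\n",
    "\n",
    "vs_ref = compare_to_reference(ref_demo, \"dose\", \"len\", ref_group=\"D0.5\")\n",
    "vs_ref"
   ]
  },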
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Explicit Comparison Pairs\n",
    "\n",
    "Specify exactly which pairs to compare with the `comparisons` parameter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(\n",
    "        comparisons=[(\"D0.5\", \"D1\"), (\"D0.5\", \"D2\")],\n",
    "        label=\"p.signif\",\n",
    "    )\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.12, 0))\n",
    "    + labs(\n",
    "        title=\"Selected Comparisons\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## P-value Adjustment\n",
    "\n",
    "`stat_pwc` computes p-values adjusted for multiple comparisons, using\n",
    "the Holm method by default. Change the adjustment method with\n",
    "`p_adjust_method`.\n",
    "\n",
    "The default label (`\"p.format\"`) shows the raw p-value, so use\n",
    "`label=\"p.adj.format\"` to display adjusted p-values, or\n",
    "`label=\"p.adj.signif\"` for adjusted significance stars."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(\n",
    "        label=\"p.adj.signif\",\n",
    "        p_adjust_method=\"bonferroni\",\n",
    "        method=\"t.test\",\n",
    "    )\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.15, 0))\n",
    "    + labs(\n",
    "        title=\"Bonferroni-adjusted Significance Stars\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
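  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To show how the two adjustments used so far differ, the cell below\n",
    "works through the standard Holm step-down and Bonferroni formulas in\n",
    "NumPy. `holm_adjust` is a hypothetical helper written for this\n",
    "illustration and the three raw p-values are made up; it is a sketch of\n",
    "the textbook procedure, not the code `stat_pwc` runs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "\n",
    "def holm_adjust(pvalues):\n",
    "    p = np.asarray(pvalues, dtype=float)\n",
    "    m = p.size\n",
    "    order = np.argsort(p)\n",
    "    # Smallest p is multiplied by m, the next by m - 1, and so on\n",
    "    scaled = p[order] * (m - np.arange(m))\n",
    "    # Enforce monotonicity, then cap at 1\n",
    "    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)\n",
    "    adjusted = np.empty(m)\n",
    "    adjusted[order] = adjusted_sorted  # restore the input order\n",
    "    return adjusted\n",
    "\n",
    "\n",
    "raw_p = [0.01, 0.04, 0.03]  # made-up raw p-values\n",
    "holm_p = holm_adjust(raw_p)\n",
    "bonferroni_p = np.minimum(np.asarray(raw_p) * len(raw_p), 1.0)\n",
    "\n",
    "print(\"Holm:      \", holm_p)\n",
    "print(\"Bonferroni:\", bonferroni_p)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Holm never produces a larger adjusted p-value than Bonferroni, which\n",
    "is why it is a common default."
   ]
  },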
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Hiding Non-significant Comparisons\n",
    "\n",
    "Set `hide_ns=True` to remove non-significant comparisons (p > 0.05)\n",
    "from the plot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(\n",
    "        label=\"p.signif\",\n",
    "        hide_ns=True,\n",
    "    )\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.12, 0))\n",
    "    + labs(\n",
    "        title=\"Only Significant Comparisons Shown\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
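  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Box Plot with Jittered Points\n",
    "\n",
    "`stat_pwc` is added as its own layer, so it can sit alongside other\n",
    "geoms. Here we combine box plots with jittered data points for a richer\n",
    "visualization. Box-plot outliers are hidden (`outlier_shape=\"\"`) so\n",
    "that they are not drawn twice."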
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot(outlier_shape=\"\")\n",
    "    + geom_jitter(width=0.15, alpha=0.5)\n",
    "    + stat_pwc(\n",
    "        method=\"t.test\",\n",
    "        label=\"p.signif\",\n",
    "    )\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.15, 0))\n",
    "    + labs(\n",
    "        title=\"Box Plot with Jittered Points\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Customising Bracket Appearance\n",
    "\n",
    "You can adjust the bracket layout with:\n",
    "\n",
    "- `step_increase` – gap between stacked brackets (fraction of y-range)\n",
    "- `bracket_nudge_y` – vertical offset for all brackets\n",
    "- `tip_length` – length of bracket tips"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(\n",
    "        label=\"p.signif\",\n",
    "        step_increase=0.08,\n",
    "        bracket_nudge_y=0.02,\n",
    "        tip_length=0.01,\n",
    "    )\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.12, 0))\n",
    "    + labs(\n",
    "        title=\"Customised Bracket Spacing\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Faceted Plots\n",
    "\n",
    "`stat_pwc` works with faceted plots. Each facet panel gets its own\n",
    "set of pairwise comparisons."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "(\n",
    "    ggplot(df, aes(x=\"dose\", y=\"len\"))\n",
    "    + geom_boxplot()\n",
    "    + stat_pwc(\n",
    "        label=\"p.signif\",\n",
    "        method=\"t.test\",\n",
    "    )\n",
    "    + facet_wrap(\"supp\")\n",
    "    + scale_y_continuous(expand=(0.05, 0, 0.15, 0))\n",
    "    + labs(\n",
    "        title=\"Pairwise Comparisons in Faceted Plots\",\n",
    "        x=\"Dose\",\n",
    "        y=\"Length\",\n",
    "    )\n",
    "    + theme_minimal()\n",
    ")"
   ]
  },
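  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Per-panel comparisons amount to splitting the data by the facet\n",
    "variable and testing within each subset. The cell below sketches that\n",
    "idea with `groupby` and `scipy.stats.ttest_ind` (SciPy's default\n",
    "pooled-variance test). The `facet_demo` frame and its values are made\n",
    "up for illustration, and the sketch leaves out p-value adjustment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import combinations\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from scipy.stats import ttest_ind\n",
    "\n",
    "# Made-up values, separate from `df` above\n",
    "rng = np.random.default_rng(1)\n",
    "facet_demo = pd.DataFrame({\n",
    "    \"supp\": np.repeat([\"OJ\", \"VC\"], 30),\n",
    "    \"dose\": np.tile(np.repeat([\"D0.5\", \"D1\", \"D2\"], 10), 2),\n",
    "    \"len\": rng.normal(20, 4, 60),\n",
    "})\n",
    "\n",
    "rows = []\n",
    "for supp, panel in facet_demo.groupby(\"supp\"):\n",
    "    # Test every pair of doses within this panel only\n",
    "    for g1, g2 in combinations(sorted(panel[\"dose\"].unique()), 2):\n",
    "        a = panel.loc[panel[\"dose\"] == g1, \"len\"]\n",
    "        b = panel.loc[panel[\"dose\"] == g2, \"len\"]\n",
    "        p = ttest_ind(a, b).pvalue\n",
    "        rows.append({\"supp\": supp, \"group1\": g1, \"group2\": g2, \"p\": p})\n",
    "\n",
    "per_panel = pd.DataFrame(rows)\n",
    "per_panel"
   ]
  },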
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Available Label Formats\n",
    "\n",
    "| `label` value | Description |\n",
    "|---------------|-------------|\n",
    "| `\"p.format\"` | Formatted raw p-value (default) |\n",
    "| `\"p.signif\"` | Significance symbols (ns, \\*, \\*\\*, \\*\\*\\*, \\*\\*\\*\\*) |\n",
    "| `\"p.adj.format\"` | Formatted adjusted p-value |\n",
    "| `\"p.adj.signif\"` | Adjusted significance symbols |\n",
    "| `\"p.format.signif\"` | Raw p-value with significance symbol |\n",
    "\n",
    "## Available P-value Adjustment Methods\n",
    "\n",
    "| `p_adjust_method` | Description |\n",
    "|-------------------|-------------|\n",
    "| `\"holm\"` | Holm (default, step-down) |\n",
    "| `\"bonferroni\"` | Bonferroni |\n",
    "| `\"hochberg\"` | Hochberg |\n",
    "| `\"BH\"` or `\"fdr\"` | Benjamini-Hochberg (FDR) |\n",
    "| `\"BY\"` | Benjamini-Yekutieli |\n",
    "| `\"hommel\"` | Hommel |\n",
    "| `\"none\"` | No adjustment |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "`stat_pwc` makes it easy to add statistical comparison annotations to\n",
    "any plotnine chart. Key features:\n",
    "\n",
    "- **Automatic pairwise testing** between all groups, or against a\n",
    "  reference group\n",
    "- **Multiple test methods**: Wilcoxon rank-sum (default) and t-test\n",
    "- **P-value adjustment** for multiple comparisons\n",
    "- **Flexible labelling**: p-values, significance stars, or both\n",
    "- **Bracket customisation**: spacing, tips, and nudging\n",
    "- **Works with facets**: each panel gets its own comparisons\n",
    "\n",
    "For more details, see the\n",
    "[API reference](../api/stats.rst#stat-pwc)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}