Diligence

Responsibility, disclosure, and reproducibility when using AI

Note: Learning objectives
  • Apply norms of disclosure for AI use in research writing, code, and analysis.
  • Build a workflow that preserves provenance of AI-assisted outputs.
  • Reason about the ethics of AI use: authorship, privacy, reproducibility.

The core idea

Diligence is the downstream responsibility. You own the output, regardless of who or what produced it. This D connects AI fluency to scientific integrity.

Diligence is not fear. It is a set of practices that make your use of AI:

  • Disclosable. You can say what you did.
  • Reproducible. Someone else could audit your process.
  • Defensible. You can stand behind the result on technical and ethical grounds.

Disclosure norms

As of early 2026, the three major policy-setting bodies (the ICMJE, Nature Portfolio, and the Science family of journals) have converged on three shared principles: AI cannot be listed as an author (it has no accountability and cannot give consent); AI use must be disclosed in the methods or acknowledgments; and AI-generated text used verbatim should be marked as such. What differs across journals is the level of specificity required and the definition of “material use”.

| Policy source | Core requirement | Specificity expected | Where to verify |
| --- | --- | --- | --- |
| ICMJE (governs most major clinical and biomedical journals) | Disclose AI use in methods; AI may not be listed as an author | Tool name, version, and purpose; state what was generated | icmje.org/recommendations (a living document) |
| Nature Portfolio | Disclose large language model use in the methods section; AI may not be listed as an author or in the acknowledgments | Specific model, version, and how it was used | nature.com editorial policies on AI (current online version) |
| Science family | Disclose any AI-generated text; AI may not be listed as an author | Broad; current guidance specifies “detail in methods” | Science editorial guidelines (check the current online version before submission) |
Warning

These policies change faster than course materials. Always read the journal’s current submission guidelines before finalising a disclosure; do not rely on a cached or summarised version. The ICMJE and Nature Portfolio pages listed in the table above are the authoritative sources.

For this course, use the following template as your starting point:

Note: Template for an AI disclosure statement

I used [tool name, version or model, access tier] for the following purposes in this work: [list specific uses, e.g., “drafted the Scanpy QC scaffold (cells 3–9 of qc.ipynb); suggested candidate citations for the introduction (3 of 5 were verified and retained; 2 were not found in PubMed and were discarded); polished one paragraph of the discussion”]. I verified all code by running it on [describe test conditions]. I verified all citations against PubMed or the publisher’s DOI. I rejected [describe at least one AI suggestion and why]. The scientific interpretations, statistical decisions, and conclusions are my own.

Score your draft against the course disclosure rubric before submitting. The rubric has four scored dimensions: tools listed, use described, verification stated, and rejections noted.
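Before a manual pass against the rubric, a rough keyword screen can catch a dimension you forgot entirely. The sketch below is an illustrative heuristic only: the cue lists and the `self_check` name are assumptions for this example, not the official rubric criteria, and a hit on a keyword is no substitute for the scored rubric itself.

```python
import re

# Rough keyword cues for each of the four scored rubric dimensions.
# These cue lists are illustrative heuristics for a quick self-check,
# not the official rubric criteria.
RUBRIC_CUES = {
    "tools listed": r"\b(chatgpt|gpt|claude|gemini|copilot|version|tier)\b",
    "use described": r"\b(drafted|suggested|polished|generated|scaffold)\b",
    "verification stated": r"\b(verified|ran|tested|checked|resolved)\b",
    "rejections noted": r"\b(rejected|discarded|not found|overrode)\b",
}

def self_check(disclosure: str) -> dict:
    """Return which rubric dimensions a draft disclosure appears to address."""
    text = disclosure.lower()
    return {dim: bool(re.search(pattern, text)) for dim, pattern in RUBRIC_CUES.items()}
```

On a vague statement such as “We used ChatGPT to help with this analysis.”, this screen flags only “tools listed”, which matches the intuition that the other three dimensions demand concrete detail a one-liner cannot carry.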

Provenance practices

  • Save the prompts and outputs that shaped a deliverable, not just the final code.
  • Version your prompts alongside your code.
  • When an AI proposes a method, cite the method (paper, package), not the AI.
  • Treat AI like a very fast collaborator with unreliable memory and no accountability.
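One lightweight way to implement the first two practices is an append-only provenance log kept under version control next to the code. The sketch below is a minimal example under assumed conventions: the file name `ai_provenance.jsonl`, the function name, and the record fields are all illustrative choices, not a course-mandated format.

```python
import datetime
import hashlib
import json
from pathlib import Path

# Hypothetical log file, versioned alongside the code it documents.
LOG_PATH = Path("ai_provenance.jsonl")

def log_ai_interaction(tool, prompt, output, purpose, accepted, log_path=LOG_PATH):
    """Append one AI interaction to a JSON-lines provenance log.

    Stores a hash of the full output alongside the text, so later edits to
    the deliverable can be diffed against what the AI actually produced.
    """
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "tool": tool,          # e.g. "ChatGPT, GPT-4o, Pro tier, web UI"
        "purpose": purpose,    # what the output was used for
        "prompt": prompt,
        "output_sha256": hashlib.sha256(output.encode()).hexdigest(),
        "output": output,
        "accepted": accepted,  # False records a rejection, with the prompt as context
    }
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record
```

Note that rejections (`accepted=False`) are logged too: that is exactly the material the disclosure template asks for and the rubric dimension most often left empty.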

Privacy and confidentiality

  • Do not paste patient-identifiable data, embargoed manuscripts, or collaborator data into third-party LLMs without explicit permission.
  • Prefer on-device or enterprise-tier deployments for sensitive data.
  • Know your institution’s data-handling policy.

Exercises

  1. Draft a one-paragraph AI-disclosure statement for your most recent AI-assisted project.
  2. Reconstruct the prompt history for a recent AI-assisted task. What is missing from your records?
Review questions

  1. The three convergent disclosure principles say AI use must be disclosed, AI cannot be an author, and verbatim AI text must be marked. Why does this set converge on disclosure rather than prohibition?
  2. The course disclosure rubric scores four dimensions. Name them, and explain which is most often skipped on a first-draft disclosure.
  3. A colleague writes: “We used ChatGPT to help with this analysis.” Score this against the rubric. What dimensions does it satisfy, and what would a stronger version add?

Answers

  1. Disclosure preserves the human accountability chain: the human author remains answerable for every claim, and AI use is visible to readers and reviewers. Prohibition would push use underground and remove the audit signal that makes the practice safer. The norm is that AI is a collaborator with no accountability, the human stays accountable, and the relationship is on the record.
  2. Tools listed, use described, verification stated, and rejections noted. “Rejections noted” is the most often skipped: disclosing what you accepted feels natural, but disclosing what you rejected requires having actually evaluated the suggestions, and it is the strongest signal of real engagement.
  3. The colleague’s sentence partially satisfies “tools listed” (it names ChatGPT but no version, model, or tier) and weakly satisfies “use described” (a vague “help with analysis”). It fails “verification stated” and “rejections noted” outright. A stronger version names the model and tier (“ChatGPT, GPT-4o, Pro tier, web UI”), the concrete uses (“drafted the QC scaffold; suggested 4 citations, 3 verified and retained, 1 fabricated and discarded”), the verification (“ran code on a 10-cell test slice; resolved each DOI on doi.org and confirmed authors and title against PubMed”), and at least one rejection (“rejected the AI’s proposed mt threshold of 5% for our brain-tissue dataset; used 20% per Heumos et al. 2023”).

Further reading