Required
A laptop with a terminal
Any of macOS, Linux, or Windows with WSL works, as long as you have a terminal where you can run shell commands. On Windows, install WSL 2 first.
Git
We use git throughout the workshop. Install it if you do not have it already.
# macOS
xcode-select --install
# Ubuntu / Debian
sudo apt install git
# Check
git --version
An API key
You need access to at least one LLM. We recommend OpenRouter (one key, many models), or you can get a key directly from Anthropic or OpenAI.
Budget about $5–10 of credit.
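Coding agents usually pick up the key from an environment variable. The sketch below assumes the variable name `OPENROUTER_API_KEY` and uses a placeholder value; check your provider's and agent's documentation for the exact name each one expects. The helper `key_status` is a hypothetical name, and it reports only whether a variable is exported, so the key itself is never printed.

```shell
# Placeholder only: paste your real key here and never commit it to git.
# Assumption: OPENROUTER_API_KEY is the name your agent looks for.
export OPENROUTER_API_KEY="replace-with-your-key"

# key_status NAME: say whether NAME is exported and non-empty,
# without printing its value.
key_status() {
  if [ -n "$(printenv "$1")" ]; then
    echo "$1 is set"
  else
    echo "$1 is missing"
  fi
}

key_status OPENROUTER_API_KEY
```

Putting the `export` line in your shell profile keeps the key available in new terminals; keeping it out of the project directory keeps it out of the repo.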
Install one coding agent
Pick at least one. We demo with Claude Code, but OpenCode is a strong open-source alternative that works with any provider.
Claude Code
Anthropic's agentic coding tool. Requires Node.js 18+ and either an Anthropic API key or a Claude Pro/Max subscription.
# Install via npm
npm install -g @anthropic-ai/claude-code
# Verify
claude --version
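If the npm install fails, the Node.js version is a likely culprit. A minimal sketch for checking it against the 18+ requirement; the helper name `node_major` is hypothetical, and the script assumes `node --version` prints a string of the form `v20.11.1`.

```shell
# node_major VERSION: extract the major number from a string like "v20.11.1".
node_major() {
  v="${1#v}"        # drop the leading "v"
  echo "${v%%.*}"   # keep everything before the first dot
}

# Compare the installed Node.js against the 18+ requirement.
if command -v node >/dev/null 2>&1; then
  major="$(node_major "$(node --version)")"
  if [ "$major" -ge 18 ]; then
    echo "Node.js $major: OK"
  else
    echo "Node.js $major: too old, need 18 or newer"
  fi
else
  echo "Node.js not found: install it before running npm"
fi
```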
OpenCode
Open-source AI coding agent. Works with any LLM provider — OpenRouter, Anthropic, OpenAI, or local models.
# Install
curl -fsSL https://opencode.ai/install | bash
# Verify
opencode --version
Before you bring data
An LLM API is a third party. Anything you send leaves your machine. Decide what is safe to share before the workshop — especially if you work with restricted administrative or human-subjects data.
Can this data leave your machine?
Data use agreements, IRB protocols, and statistical-agency contracts often forbid sending microdata to an external service. If in doubt, treat it as a no — the exercises work fine on synthetic or public data.
Mock or anonymized data
Ask the agent to generate a synthetic dataset with the same schema, or anonymize a sample. You and the agent debug the code against the fake data; then you run it yourself, locally, on the real data.
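To make "same schema, fake values" concrete, here is a rough sketch of the kind of script an agent might produce. It is an illustration under strong assumptions, not a recommended anonymization method: the function name `make_synthetic` is hypothetical, the input is a plain comma-separated file with a header row and no quoted commas, and every column is filled with a random integer from 0 to 99 regardless of its real type. Only the header is read from the real file; no real values are copied.

```shell
# make_synthetic REAL OUT ROWS
# Write OUT with the header of REAL followed by ROWS rows of random integers.
make_synthetic() {
  # Copy only the header line (the schema) from the real file.
  head -n 1 "$1" > "$2"

  # Use the header's field count to decide how many columns to fill.
  awk -F, -v rows="$3" '
    BEGIN { srand() }
    NR == 1 {
      for (r = 1; r <= rows; r++) {
        line = ""
        for (c = 1; c <= NF; c++) {
          sep = (c > 1) ? "," : ""
          line = line sep int(rand() * 100)
        }
        print line
      }
      exit   # stop before reading any real data rows
    }' "$1" >> "$2"
}

# Example call (file names are placeholders):
#   make_synthetic real.csv synthetic.csv 100
```

A realistic version would respect column types and value ranges, but even this much is enough to exercise code that only depends on column names and row structure.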
Local models + guardrails
For restricted data, run an open-weight model locally (Ollama, LM Studio) or use your institution's private instance. Either way, set permissions so the agent cannot read or delete files outside its sandbox.
Optional but useful
DuckDB
We use DuckDB in the dodo demo and exercise. Install it to follow along hands-on in Act 3.
# macOS
brew install duckdb
# Check
duckdb --version
Make
GNU Make for running pipelines. Most Linux systems have it already; on macOS it comes with the Xcode Command Line Tools.
make --version
A project to work on
The exercises work best with a real research project — ideally a git repo with data-wrangling scripts (.do, .py, .R, or .jl). We provide a sample dataset if you do not have one.
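A quick way to see whether a project fits that description is to ask git what it tracks. The sketch below is one possible check, run from the project root; the helper name `count_scripts` is hypothetical, and it matches files by extension only, so it cannot tell data-wrangling scripts from other code.

```shell
# count_scripts DIR: count files tracked by git in DIR that end in
# .do, .py, .R, or .jl (subdirectories included).
count_scripts() {
  git -C "$1" ls-files -- '*.do' '*.py' '*.R' '*.jl' | wc -l | tr -d ' '
}

# Report on the current directory.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
  echo "git repo with $(count_scripts .) tracked scripts"
else
  echo "not a git repo: run 'git init' here or use the sample dataset"
fi
```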