name: r-anti-slop
description: >
  Enforce production-quality R code standards. Prevents generic AI patterns
  through namespace qualification, explicit returns, and tidyverse conventions.
  Use when writing or reviewing R code for data analysis or packages.
applies_to:
  - "**/*.R"
  - "**/*.Rmd"
  - "**/*.qmd"
tags: [r, tidyverse, code-quality, data-science]
related_skills:
  - quarto/anti-slop
  - text/anti-slop
version: 2.0.0

R Anti-Slop: Stop Writing `df <- data`

When to Use This

Use this for:

✓ Any R code leaving your machine (analysis, packages, scripts)
✓ AI-generated code review (catches df, result, missing ::)
✓ CRAN submissions (reviewers check documentation and examples, so vague ones cost you a resubmission)
✓ Team code standards

Skip for:

Quick console experiments (though habits form fast)
Legacy code you can't touch
Bioconductor or other style guides that override this

Quick Example

Before (AI Slop):

# Load the library
library(dplyr)

# Read the data
df <- read.csv("data.csv")

# Filter the data
result <- df %>% filter(x > 0)

After (Anti-Slop):

load_active_customers <- function(path) {
  customer_data <- readr::read_csv(path)

  active_customers <- customer_data |>
    dplyr::filter(status == "active", revenue > 0)

  return(active_customers)
}

active_customers <- load_active_customers("data/customers.csv")

What changed:

✓ Descriptive names (customer_data not df)
✓ Namespace qualification (dplyr::, readr::)
✓ Native pipe (|> not %>%)
✓ No obvious comments
✓ Explicit return (inside a function; return() at the top level of a script is an error)

When to Use What

| If you need to... | Do this | Details |
|---|---|---|
| Name variables | Use `snake_case`, no `df`/`data`/`result` | reference/naming.md |
| Call tidyverse functions | Always use `::` (e.g., `dplyr::filter()`) | reference/tidyverse.md |
| Return from function | Always explicit `return()` statement | reference/naming.md |
| Write pipe chains | Use `\|>`, break at 8+ operations | reference/tidyverse.md |
| Document functions | Specific `@param`, `@return`, no circular text | reference/documentation.md |
| Handle missing data | Explicit strategy + report data loss | reference/statistical-rigor.md |
| Validate data | Check assumptions with `stopifnot()` | reference/statistical-rigor.md |
| Format code | Use `styler::style_file()` | reference/tidyverse.md |
| Check code quality | Use `lintr::lint()` | reference/tidyverse.md |
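The "Handle missing data" and "Validate data" rows can be sketched together in base R. The data frame, its column names, and the complete-case strategy below are illustrative assumptions, not taken from reference/statistical-rigor.md.

```r
# Hypothetical input: column names and values are made up for illustration.
survey_responses <- data.frame(
  respondent_id = 1:6,
  age = c(34, NA, 51, 28, NA, 45),
  income = c(52000, 61000, NA, 48000, 75000, 58000)
)

# Explicit strategy: complete-case analysis on the two analysis variables.
is_complete <- complete.cases(survey_responses[, c("age", "income")])
complete_responses <- survey_responses[is_complete, ]

# Report the data loss instead of dropping rows silently.
rows_dropped <- nrow(survey_responses) - nrow(complete_responses)
message(sprintf(
  "Dropped %d of %d rows with missing age or income (%.0f%%)",
  rows_dropped, nrow(survey_responses),
  100 * rows_dropped / nrow(survey_responses)
))

# Check assumptions after the transformation.
stopifnot(
  nrow(complete_responses) > 0,
  !anyNA(complete_responses$age),
  all(complete_responses$income > 0)
)
```

Whatever strategy is chosen (dropping, imputing, modeling missingness), the point is that it is stated in code and the number of affected rows is reported.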

Core Workflow

5-Step Quality Check

Namespace qualification - All external functions use ::

# Good
dplyr::filter(customer_data, revenue > 0)
# Bad: resolves to stats::filter() unless dplyr is attached
filter(customer_data, revenue > 0)

Explicit returns - Every function has return()

# Good
add_one <- function(value) {
  incremented <- value + 1
  return(incremented)
}
# Bad
add_one <- function(value) {
  value + 1
}

Naming conventions - All objects use snake_case

# Good
customer_lifetime_value <- calculate_clv(customer_data)
# Bad
df <- calculate_clv(customer_data)
customerLifetimeValue <- calculate_clv(customer_data)

Documentation quality - No generic descriptions

# Good
#' @param deaths Data frame with `age_group` and `count` columns
# Bad
#' @param data The data

Code formatting - Run styler and lintr

styler::style_file("script.R")
lintr::lint("script.R")
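Several of the five checks can be enforced by lintr itself rather than by eye. The sketch below assumes lintr 3.2.0 or later (earlier releases lack `return_linter()`), so it is guarded and does nothing if that version is not installed. The snippet being linted is a made-up example, and none of this reflects what toolkit/scripts/detect_slop.R does internally.

```r
# A deliberately sloppy snippet to lint (hypothetical).
sloppy_code <- "
customerTotal <- function(x) {
  x %>% sum()
}
"

if (requireNamespace("lintr", quietly = TRUE) &&
    utils::packageVersion("lintr") >= "3.2.0") {
  project_linters <- lintr::linters_with_defaults(
    # Naming: snake_case only
    object_name_linter = lintr::object_name_linter(styles = "snake_case"),
    # Pipes: native |> only
    pipe_consistency_linter = lintr::pipe_consistency_linter(pipe = "|>"),
    # Returns: explicit return() required
    return_linter = lintr::return_linter(return_style = "explicit")
  )

  # lint() only parses the text; the code is never run.
  lint_findings <- lintr::lint(text = sloppy_code, linters = project_linters)
  print(lint_findings)
}
```

The same linters can be declared once per project in a `.lintr` file so that `lintr::lint_package()` picks them up.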

Quick Reference Checklist

Before committing R code, verify:

All external functions qualified with ::
All functions have explicit return()
All objects use snake_case
No generic names (df, data, result, temp)
Pipes (|>) are preceded by a space and end the line
Long pipelines (>8 ops) broken into named steps
Complex operations have WHY comments
Data validated after transformations
Seeds set before random operations
Uncertainty reported (SE, CI) for statistical models
No attach() calls
No right-hand assignment (->)
Roxygen documentation is specific
Examples are realistic and run
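Two checklist items, "Seeds set before random operations" and "Uncertainty reported (SE, CI)", are easy to show in one short base R sketch. The simulated dose-response data and all object names are assumptions made up for illustration.

```r
# Seed set before any random operation so the simulation is reproducible.
set.seed(2024)

# Simulated data (hypothetical): true slope is 2, noise SD is 1.
dose <- runif(50, min = 0, max = 10)
response <- 1 + 2 * dose + rnorm(50, sd = 1)
dose_response <- data.frame(dose = dose, response = response)

dose_model <- lm(response ~ dose, data = dose_response)

# Report estimates with standard errors and 95% confidence intervals,
# not point estimates alone.
coefficient_table <- cbind(
  estimate = coef(dose_model),
  std_error = sqrt(diag(vcov(dose_model))),
  confint(dose_model, level = 0.95)
)
print(round(coefficient_table, 3))
```

Rerunning the script with the same seed reproduces the same table, which is what makes the reported intervals checkable by a reviewer.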

Common Workflows

Workflow 1: Clean Up AI-Generated R Script

Context: AI generated an analysis script with generic patterns.

Steps:

Run detection script

Rscript toolkit/scripts/detect_slop.R analysis.R --verbose

Fix high-priority issues first

# Replace df, data, result with descriptive names
# Before
df <- readr::read_csv("data.csv")
result <- df %>% filter(x > 0)

# After
customer_data <- readr::read_csv("data/customers.csv")
active_customers <- customer_data |> dplyr::filter(status == "active")

Add namespace qualification

# Before
data %>% filter(x > 0) %>% summarize(mean(y))

# After
data |>
  dplyr::filter(x > 0) |>
  dplyr::summarize(mean_y = mean(y))

Add explicit returns

# Before
calculate_rate <- function(numerator, denominator) {
  numerator / denominator
}

# After
calculate_rate <- function(numerator, denominator) {
  rate <- numerator / denominator
  return(rate)
}

Break long pipes

# Before (12 operations in one chain)
result <- data |>
  filter(...) |> mutate(...) |> group_by(...) |>
  summarize(...) |> arrange(...) |> [7 more ops]

# After
clean_data <- data |>
  dplyr::filter(!is.na(value)) |>
  dplyr::mutate(category = categorize(value))

summary_stats <- clean_data |>
  dplyr::group_by(category) |>
  dplyr::summarize(mean_val = mean(value))

Format and validate

styler::style_file("analysis.R")
lintr::lint("analysis.R")

Expected outcome: The slop score reported by detect_slop.R drops from 60+ to below 20

Workflow 2: Fix Generic Package Documentation

Context: R package has generic roxygen documentation.

Steps:

Identify generic patterns

# Bad
#' Process Data
#'
#' @description This function processes the data.
#' @param data The data.
#' @return The result.

Make description specific

# Good
#' Calculate age-adjusted mortality rates
#'
#' Computes mortality rates per 100,000 population, standardized to the
#' 2000 US Census age distribution using direct standardization.

Describe parameter structure

# Good
#' @param deaths Data frame with columns `age_group` and `count`.
#' @param population Data frame with columns `age_group` and `pop_size`.

Specify return value

# Good
#' @return A tibble with columns:
#'   \describe{
#'     \item{county}{County FIPS code}
#'     \item{rate}{Age-adjusted rate per 100,000}
#'     \item{se}{Standard error of the rate}
#'   }

Add realistic examples

# Good
#' @examples
#' counties <- data.frame(
#'   county = c("A", "B"),
#'   deaths = c(150, 200),
#'   population = c(50000, 80000)
#' )
#'
#' adjust_rates(counties, rate_per = 100000)
#' #> # A tibble: 2 x 3
#' #>   county  rate    se
#' #> 1 A       312.  25.4
#' #> 2 B       258.  18.2

Expected outcome: Documentation that teaches, not restates

Workflow 3: Prepare Package for CRAN

Context: Final checks before CRAN submission.

Steps:

Run all quality checks

# Standard checks
devtools::check()

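# Anti-slop checks: run the detector on each R source file.
# system2() with shQuote() keeps paths containing spaces intact.
r_source_files <- list.files("R", pattern = "\\.[Rr]$", full.names = TRUE)
for (r_source_file in r_source_files) {
  system2("Rscript", c("toolkit/scripts/detect_slop.R", shQuote(r_source_file)))
}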

Fix documentation
- Check all @param descriptions are specific
- Verify @examples run and are realistic
- Ensure @return describes structure

Validate code quality

# Format all files
styler::style_dir("R/")

# Check lints
lintr::lint_package()

Check CRAN-specific requirements
- Validate URLs in DESCRIPTION and documentation
- Check examples run in < 5 seconds
- Verify package structure meets CRAN standards

Expected outcome: Clean R CMD check with no slop patterns

Mandatory Rules Summary

1. Namespace Qualification

ALWAYS use :: for external packages

Exceptions (don't need ::):

Base R: mean(), sum(), log(), etc.
stats: lm(), glm(), t.test(), etc.
utils: head(), tail(), str(), etc.
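One reason the rule matters: some tidyverse verbs share names with functions in the default-attached packages. In a fresh session with no extra packages attached, an unqualified `filter()` resolves to `stats::filter()`, a linear time-series filter, not `dplyr::filter()`. A minimal base R sketch, with made-up values:

```r
# Hypothetical daily counts.
daily_visits <- c(10, 12, 14, 16, 18)

# With only the default packages attached, `filter` is stats::filter():
# here it computes a centered 3-point moving average, not a row subset.
smoothed_visits <- filter(daily_visits, rep(1 / 3, 3))

# Qualifying the call makes the intent unambiguous regardless of
# which packages happen to be attached.
smoothed_explicit <- stats::filter(daily_visits, rep(1 / 3, 3))

print(as.numeric(smoothed_visits))
```

The first and last values are `NA` because the centered window runs off the ends of the series. Had dplyr been attached, the unqualified call would have dispatched to a different function entirely, which is the failure mode `::` prevents.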

2. Explicit Returns

ALWAYS use return() - never implicit
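The rule applies to every exit path, including guard clauses. A minimal sketch, where the function name and the choice to return `NA` for a non-positive population are illustrative assumptions:

```r
# Hypothetical helper: the name and the NA policy are illustrative.
calculate_rate_per_100k <- function(event_count, population_size) {
  stopifnot(is.numeric(event_count), is.numeric(population_size))

  # Early exit: a rate is undefined when there is no population at risk.
  if (any(population_size <= 0)) {
    return(rep(NA_real_, length(event_count)))
  }

  rate_per_100k <- event_count / population_size * 100000
  return(rate_per_100k)
}

# Rates of 300 and 250 per 100,000.
county_rates <- calculate_rate_per_100k(c(150, 200), c(50000, 80000))
```

With both exits written as `return()`, a reader scanning the function body can find every value it can produce without tracing the last evaluated expression.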

3. Naming: snake_case

All objects use snake_case

Variables: customer_data not customerData or df
Functions: calculate_rate not calculateRate
Arguments: input_data not inputData

4. Native Pipe

Prefer |> over %>% (unless R < 4.1)
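When migrating from `%>%`, two differences tend to surface: the right-hand side of `|>` must be a function call, and there is no `.` placeholder. A base R sketch with made-up values (requires R 4.1 or later):

```r
# Hypothetical measurements.
side_lengths <- c(3, 4, 12)

# The right-hand side must be a call: write sum(), not sum.
vector_norm <- (side_lengths^2) |> sum() |> sqrt()

# An anonymous function has to be wrapped in parentheses and then called.
scaled_lengths <- side_lengths |> (\(lengths) lengths / max(lengths))()

print(vector_norm)
print(scaled_lengths)
```

From R 4.2 onward, `_` can stand in for the piped value, but only as a named argument (for example `data = _`), so `%>%` code that passes `.` positionally needs rewriting rather than a search-and-replace.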

5. No Generic Names

Never use: df, data, result, temp, x, n (except standard math notation)

Tidyverse Philosophy

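Follow the Tidyverse Style Guide as the primary reference, with one deliberate exception: the style guide reserves return() for early returns, while this skill requires it on every exit path. The design principles below come from the tidy tools manifesto: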

Design for humans - Code should be readable and intuitive
Reuse existing data structures - Work with tibbles and data frames
Compose simple functions with pipes - Build complexity through composition
Embrace functional programming - Functions are first-class objects

See reference/tidyverse.md for complete tidyverse conventions.

Resources & Advanced Topics

Reference Files

reference/naming.md - Complete naming conventions and forbidden patterns
reference/tidyverse.md - Pipe conventions, formatting, ggplot2 standards
reference/documentation.md - Roxygen2, vignettes, README quality
reference/statistical-rigor.md - Validation, uncertainty, reproducibility
reference/forbidden-patterns.md - Complete antipattern catalog

Related Skills

text/anti-slop - For cleaning prose in documentation
quarto/anti-slop - For cleaning vignettes and documentation

Tools

styler::style_file() - Auto-format code
lintr::lint() - Check code quality
Rscript toolkit/scripts/detect_slop.R - Detect AI patterns

Integration with Posit Skills

This skill focuses on code quality and avoiding generic patterns.

Use together with Posit skills for complete coverage:

| Task | Use This Skill | + Posit Skill |
|---|---|---|
| Write error messages | r/anti-slop (quality) | + r-lib/cli (structure) |
| Write tests | r/anti-slop (code quality) | + r-lib/testing (test patterns) |
| Prepare for CRAN | r/anti-slop (no slop) | + r-lib/cran-extrachecks (requirements) |
| Document lifecycle | r/anti-slop (doc quality) | + r-lib/lifecycle (deprecation) |
