YepAPI

Data Pipelines Agent Skill

ETL patterns, cron scheduling, queue-based processing, and batch jobs with Inngest/BullMQ.

Tags: etl · pipelines · queues · batch

The Skill

Full content, every format. Copy it, download it, or install with one command.

SKILL.md
---
description: ETL patterns, cron scheduling, queue-based processing, and batch jobs with Inngest/BullMQ.
homepage: https://yepapi.com/skills/data-pipelines
metadata:
  tags: [etl, pipelines, queues, batch]
---

# Data Pipelines

## Rules

- Inngest for serverless event-driven pipelines: define functions that trigger on events, built-in retries and scheduling
- BullMQ for self-hosted queues: Redis-backed, supports delayed jobs, rate limiting, priorities, and concurrency control
- Cron scheduling: use Inngest `cron` trigger or `node-cron` — store schedule in config, not hardcoded
- ETL pattern: Extract (fetch from source) -> Transform (validate, clean, reshape) -> Load (write to destination) — each step is a separate function
- Idempotency: every pipeline step must be safe to retry — use unique job IDs and upserts, not inserts
- Batch processing: chunk large datasets (100-1000 items per batch), process in parallel with concurrency limits
- Error handling: dead letter queue for failed jobs — log the payload and error, alert after N failures
- Backpressure: limit concurrent workers — don't let a burst of events overwhelm downstream services
- Observability: log `jobId`, `step`, `duration`, `status` for every pipeline run — track success rate and p95 latency
- Graceful shutdown: handle `SIGTERM` — finish current job before exiting, don't accept new jobs
- Data validation: validate input schema at Extract, validate output schema at Load — fail fast on bad data
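
The batch-processing and backpressure rules above can be sketched framework-agnostically; `chunk` and `mapWithConcurrency` are illustrative helpers, not Inngest or BullMQ APIs:

```typescript
// Split a large dataset into fixed-size batches (the 100-1000 items rule).
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Process items with a concurrency cap so a burst of work cannot
// overwhelm downstream services (the backpressure rule).
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // index claimed synchronously, so no two workers share one
      results[i] = await fn(items[i]);
    }
  }
  const workers = Array.from(
    { length: Math.min(limit, items.length) },
    () => worker(),
  );
  await Promise.all(workers);
  return results;
}
```

In a real pipeline, `fn` would be one Load call per batch, and `limit` would be tuned to what the destination can absorb.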

## Avoid

- Long-running HTTP requests as pipeline steps — use queues for anything over 30 seconds
- Processing unbounded datasets without pagination — always paginate source data with cursor or offset
- Storing job payloads in the queue — store a reference ID, fetch data in the worker
- Cron jobs without distributed locks — use Redis locks or Inngest to prevent duplicate execution


Why Use the Data Pipelines Skill?

Without this skill, your AI guesses at data-pipeline patterns. It might hallucinate deprecated APIs, use outdated conventions, or miss best practices entirely. With it, your AI follows a proven ruleset — every suggestion aligns with current standards.

Drop this skill into your project and your AI instantly knows the rules. Better code suggestions, fewer errors, faster shipping.

Try These Prompts

These prompts work better with the Data Pipelines skill installed. Your AI knows the context and writes code that fits.

"Build an ETL pipeline with cron scheduling, queue-based processing, and error recovery"

"Create a data import system that handles CSV uploads, validation, and batch processing"

"Set up an Inngest-based pipeline for transforming and loading data from external APIs"

Data Pipelines skill — FAQ

What does this skill cover?

It covers ETL patterns, cron-based scheduling, queue processing, and batch jobs with Inngest or BullMQ. Your AI builds reliable data pipelines with proper error handling and monitoring.

How do I install it?

Run `npx skills add YepAPI/skills --skill data-pipelines` in your project root. This copies the skill file into your repo where your AI coding tool can read it automatically.

When should I use cron vs. event-driven processing?

Use cron for scheduled batch processing (daily reports, nightly imports). Use event-driven processing with Inngest or BullMQ for real-time work triggered by user actions or webhooks.
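
As a rough sketch, the two styles differ mainly in the trigger that starts the pipeline; the object shapes below are modeled on Inngest's function triggers, and the exact option names should be verified against current docs:

```typescript
// Cron trigger: the scheduler invokes the pipeline on a fixed schedule,
// e.g. a nightly import at 02:00. (Shape modeled on Inngest's
// createFunction triggers; treat the option names as assumptions.)
const nightlyImportTrigger = { cron: "0 2 * * *" };

// Event trigger: the pipeline runs each time this event is sent,
// e.g. from a webhook handler after a user uploads a file.
// "app/csv.uploaded" is a hypothetical event name.
const csvUploadedTrigger = { event: "app/csv.uploaded" };

// Standard 5-field cron expression:
// minute, hour, day-of-month, month, day-of-week.
const cronFields = nightlyImportTrigger.cron.split(" ");
```

Either way, keep the schedule string in config rather than hardcoded, per the rules above.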

Want more skills?

Browse all 110 free skills for builders.
