baoyu-image-gen

$npx skills add jimliu/baoyu-skills --skill baoyu-image-gen
SKILL.md

Baoyu Image Gen

Official API-based image generation. Supports OpenAI, Google and DashScope (阿里通义万象) providers. **Agent Execution**: 1. `SKILL_DIR` = this SKILL.md file's directory

Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI, Google and DashScope (阿里通义万象) providers.

Script Directory

Agent Execution:
  1. SKILL_DIR = this SKILL.md file's directory
  2. Script path = ${SKILL_DIR}/scripts/main.ts

Preferences (EXTEND.md)

Use Bash to check EXTEND.md existence (priority order):
# Check project-level first
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"

# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"
┌──────────────────────────────────────────────────┬───────────────────┐ │ Path │ Location │ ├──────────────────────────────────────────────────┼───────────────────┤ │ .baoyu-skills/baoyu-image-gen/EXTEND.md │ Project directory │ ├──────────────────────────────────────────────────┼───────────────────┤ │ $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md │ User home │ └──────────────────────────────────────────────────┴───────────────────┘
┌───────────┬───────────────────────────────────────────────────────────────────────────┐ │ Result │ Action │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Found │ Read, parse, apply settings │ ├───────────┼───────────────────────────────────────────────────────────────────────────┤ │ Not found │ Use defaults │ └───────────┴───────────────────────────────────────────────────────────────────────────┘
EXTEND.md Supports: Default provider | Default quality | Default aspect ratio

Usage

# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google multimodal only)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

# DashScope (阿里通义万象)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "一只可爱的猫" --image out.png --provider dashscope

Options

Option
Description
--prompt <text>, -p
Prompt text
--promptfiles <files...>
Read prompt from files (concatenated)
--image <path>
Output image path (required)
--provider google|openai|dashscope
Force provider (default: google)
--model <id>, -m
Model ID
--ar <ratio>
Aspect ratio (e.g., 16:9, 1:1, 4:3)
--size <WxH>
Size (e.g., 1024x1024)
--quality normal|2k
Quality preset (default: 2k)
--imageSize 1K|2K|4K
Image size for Google (default: from quality)
--ref <files...>
Reference images (Google multimodal only)
--n <count>
Number of images
--json
JSON output

Environment Variables

Variable
Description
OPENAI_API_KEY
OpenAI API key
GOOGLE_API_KEY
Google API key
DASHSCOPE_API_KEY
DashScope API key (阿里云)
OPENAI_IMAGE_MODEL
OpenAI model override
GOOGLE_IMAGE_MODEL
Google model override
DASHSCOPE_IMAGE_MODEL
DashScope model override (default: z-image-turbo)
OPENAI_BASE_URL
Custom OpenAI endpoint
GOOGLE_BASE_URL
Custom Google endpoint
DASHSCOPE_BASE_URL
Custom DashScope endpoint
Load Priority: CLI args > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env

Provider Selection

  1. --provider specified → use it
  2. Only one API key available → use that provider
  3. Multiple available → default to Google

Quality Presets

Preset
Google imageSize
OpenAI Size
Use Case
normal
1K
1024px
Quick previews
2k (default)
2K
2048px
Covers, illustrations, infographics
Google imageSize: Can be overridden with --imageSize 1K|2K|4K

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1
  • Google multimodal: uses imageConfig.aspectRatio
  • Google Imagen: uses aspectRatio parameter
  • OpenAI: maps to closest supported size

Generation Mode

Default: Sequential generation (one image at a time). This ensures stable output and easier debugging.
Parallel Generation: Only use when user explicitly requests parallel/concurrent generation.
Mode
When to Use
Sequential (default)
Normal usage, single images, small batches
Parallel
User explicitly requests, large batches (10+)
Parallel Settings (when requested):
Setting
Value
Recommended concurrency
4 subagents
Max concurrency
8 subagents
Use case
Large batch generation when user requests parallel
Agent Implementation (parallel mode only):
# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete

Error Handling

  • Missing API key → error with setup instructions
  • Generation failure → auto-retry once
  • Invalid aspect ratio → warning, proceed with default
  • Reference images with non-multimodal model → warning, ignore refs

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.