Generate images via OpenAI, Google, OpenRouter, DashScope, Jimeng, Seedream, and Replicate APIs with batch support.
references/config/first-time-setup.md
---
name: first-time-setup
description: First-time setup and default model selection flow for baoyu-image-gen
---

# First-Time Setup

## Overview

Triggered when:
1. No EXTEND.md found → full setup (provider + model + preferences)
2. EXTEND.md found but `default_model.[provider]` is null → model selection only

## Setup Flow

```
No EXTEND.md found          EXTEND.md found, model null
        │                            │
        ▼                            ▼
┌─────────────────────┐    ┌──────────────────────┐
│ AskUserQuestion     │    │ AskUserQuestion      │
│ (full setup)        │    │ (model only)         │
└─────────────────────┘    └──────────────────────┘
        │                            │
        ▼                            ▼
┌─────────────────────┐    ┌──────────────────────┐
│ Create EXTEND.md    │    │ Update EXTEND.md     │
└─────────────────────┘    └──────────────────────┘
        │                            │
        ▼                            ▼
    Continue                     Continue
```

## Flow 1: No EXTEND.md (Full Setup)

**Language**: Use user's input language or saved language preference.

Use AskUserQuestion with ALL questions in ONE call:

### Question 1: Default Provider

```yaml
header: "Provider"
question: "Default image generation provider?"
options:
  - label: "Google (Recommended)"
    description: "Gemini multimodal - high quality, reference images, flexible sizes"
  - label: "OpenAI"
    description: "GPT Image 2 - latest OpenAI image model, reference-image workflows"
  - label: "Azure OpenAI"
    description: "Azure-hosted GPT Image deployments with resource-specific routing"
  - label: "OpenRouter"
    description: "Router for Gemini/FLUX/OpenAI-compatible image models"
  - label: "DashScope"
    description: "Alibaba Cloud - Qwen-Image, strong Chinese/English text rendering"
  - label: "Z.AI"
    description: "GLM-image, strong poster and text-heavy image generation"
  - label: "MiniMax"
    description: "MiniMax image generation with subject-reference character workflows"
  - label: "Replicate"
    description: "Curated Replicate image families - nano-banana-2, Seedream, and Wan image models"
  - label: "Agnes"
    description: "Sapiens AI Agnes - optimized for high information density, complex layouts, reference-image support"
```

### Question 2: Default Google Model

Only show if user selected Google or auto-detect (no explicit provider).

```yaml
header: "Google Model"
question: "Default Google image generation model?"
options:
  - label: "gemini-3-pro-image (Recommended)"
    description: "Highest quality, best for production use"
  - label: "gemini-3.1-flash-image"
    description: "Fast generation, good quality, lower cost"
  - label: "gemini-3-flash-preview"
    description: "Fast generation, balanced quality and speed"
```

### Question 2b: Default OpenRouter Model

Only show if user selected OpenRouter.

```yaml
header: "OpenRouter Model"
question: "Default OpenRouter image generation model?"
options:
  - label: "google/gemini-3.1-flash-image (Recommended)"
    description: "Best general-purpose OpenRouter image model with reference-image workflows"
  - label: "google/gemini-2.5-flash-image-preview"
    description: "Fast Gemini preview model on OpenRouter"
  - label: "black-forest-labs/flux.2-pro"
    description: "Strong text-to-image quality through OpenRouter"
```

### Question 2c: Default Azure Deployment

Only show if user selected Azure OpenAI.

```yaml
header: "Azure Deploy"
question: "Default Azure image deployment name?"
options:
  - label: "gpt-image-2 (Recommended)"
    description: "Use if your Azure deployment uses the GPT Image 2 model name"
  - label: "gpt-image-1.5"
    description: "Previous GPT Image deployment name"
  - label: "gpt-image-1"
    description: "Earlier GPT Image deployment name"
```

### Question 2d: Default MiniMax Model

Only show if user selected MiniMax.

```yaml
header: "MiniMax Model"
question: "Default MiniMax image generation model?"
options:
  - label: "image-01 (Recommended)"
    description: "Best default, supports aspect ratios and custom width/height"
  - label: "image-01-live"
    description: "Faster variant, use aspect ratio instead of custom size"
```

### Question 2e: Default Z.AI Model

Only show if user selected Z.AI.

```yaml
header: "Z.AI Model"
question: "Default Z.AI image generation model?"
options:
  - label: "glm-image (Recommended)"
    description: "Best default for posters, diagrams, and text-heavy images"
  - label: "cogview-4-250304"
    description: "Legacy Z.AI image model on the same endpoint"
```

### Question 3: Default Quality

```yaml
header: "Quality"
question: "Default image quality?"
options:
  - label: "2k (Recommended)"
    description: "2048px - covers, illustrations, infographics"
  - label: "normal"
    description: "1024px - quick previews, drafts"
```

### Question 4: Save Location

```yaml
header: "Save"
question: "Where to save preferences?"
options:
  - label: "Project (Recommended)"
    description: ".baoyu-skills/ (this project only)"
  - label: "User"
    description: "~/.baoyu-skills/ (all projects)"
```

### Save Locations

| Choice | Path | Scope |
|--------|------|-------|
| Project | `.baoyu-skills/baoyu-image-gen/EXTEND.md` | Current project |
| User | `$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md` | All projects |

### EXTEND.md Template

```yaml
---
version: 1
default_provider: [selected provider or null]
default_quality: [selected quality]
default_aspect_ratio: null
default_image_size: null
default_image_api_dialect: null
default_model:
  google: [selected google model or null]
  openai: null
  azure: [selected azure deployment or null]
  openrouter: [selected openrouter model or null]
  dashscope: null
  zai: [selected Z.AI model or null]
  minimax: [selected minimax model or null]
  replicate: null
  agnes: null
---
```

If the user selects `OpenAI` but says their endpoint is only OpenAI-compatible and fronts another image model family, save `default_image_api_dialect: ratio-metadata` when they explicitly confirm the gateway expects aspect-ratio `size` plus metadata-based resolution. Otherwise leave it `null` / `openai-native`.

## Flow 2: EXTEND.md Exists, Model Null

When EXTEND.md exists but `default_model.[current_provider]` is null, ask ONLY the model question for the current provider.

### Google Model Selection

```yaml
header: "Google Model"
question: "Choose a default Google image generation model?"
options:
  - label: "gemini-3-pro-image (Recommended)"
    description: "Highest quality, best for production use"
  - label: "gemini-3.1-flash-image"
    description: "Fast generation, good quality, lower cost"
  - label: "gemini-3-flash-preview"
    description: "Fast generation, balanced quality and speed"
```

### OpenAI Model Selection

```yaml
header: "OpenAI Model"
question: "Choose a default OpenAI image generation model?"
options:
  - label: "gpt-image-2 (Recommended)"
    description: "Latest GPT Image model, flexible sizes up to 4K, high-fidelity image inputs"
  - label: "gpt-image-1.5"
    description: "Previous GPT Image model"
  - label: "gpt-image-1"
    description: "Earlier GPT Image model"
```

### Azure Deployment Selection

```yaml
header: "Azure Deploy"
question: "Choose a default Azure image deployment name?"
options:
  - label: "gpt-image-2 (Recommended)"
    description: "Use when your Azure deployment name matches the GPT Image 2 model"
  - label: "gpt-image-1.5"
    description: "Use when your Azure deployment name matches the GPT Image 1.5 model"
  - label: "gpt-image-1"
    description: "Use when your Azure deployment name matches GPT-image-1"
```

Notes for Azure setup:

- In `baoyu-image-gen`, Azure `--model` / `default_model.azure` should be the Azure deployment name, not just the underlying model family.
- If the deployment name is custom, save that exact deployment name in `default_model.azure`.

### OpenRouter Model Selection

```yaml
header: "OpenRouter Model"
question: "Choose a default OpenRouter image generation model?"
options:
  - label: "google/gemini-3.1-flash-image (Recommended)"
    description: "Recommended for image output and reference-image edits"
  - label: "google/gemini-2.5-flash-image-preview"
    description: "Fast preview-oriented image generation"
  - label: "black-forest-labs/flux.2-pro"
    description: "High-quality text-to-image through OpenRouter"
```

### DashScope Model Selection

```yaml
header: "DashScope Model"
question: "Choose a default DashScope image generation model?"
options:
  - label: "qwen-image-2.0-pro (Recommended)"
    description: "Best DashScope model for text rendering and custom sizes"
  - label: "qwen-image-2.0"
    description: "Faster 2.0 variant with flexible output size"
  - label: "qwen-image-max"
    description: "Legacy Qwen model with five fixed output sizes"
  - label: "qwen-image-plus"
    description: "Legacy Qwen model, same current capability as qwen-image"
  - label: "wan2.7-image-pro"
    description: "Wan 2.7 Pro - supports up to 4K text-to-image and reference-image editing"
  - label: "wan2.7-image"
    description: "Wan 2.7 base - faster generation, up to 2K, supports reference-image editing"
  - label: "z-image-turbo"
    description: "Legacy DashScope model for compatibility"
  - label: "z-image-ultra"
    description: "Legacy DashScope model, higher quality but slower"
```

Notes for DashScope setup:

- Prefer `qwen-image-2.0-pro` when the user needs custom `--size`, uncommon ratios like `21:9`, or strong Chinese/English text rendering.
- `qwen-image-max` / `qwen-image-plus` / `qwen-image` only support five fixed sizes: `1664*928`, `1472*1104`, `1328*1328`, `1104*1472`, `928*1664`.
- `wan2.7-image-pro` and `wan2.7-image` are the only DashScope models that accept `--ref`. Pick one of these when the user wants reference-image editing or multi-image fusion via DashScope.
- In `baoyu-image-gen`, `quality` is a compatibility preset. It is not a native DashScope parameter.

### Z.AI Model Selection

```yaml
header: "Z.AI Model"
question: "Choose a default Z.AI image generation model?"
options:
  - label: "glm-image (Recommended)"
    description: "Current flagship image model with better text rendering and poster layouts"
  - label: "cogview-4-250304"
    description: "Legacy model on the sync image endpoint"
```

Notes for Z.AI setup:

- Prefer `glm-image` for posters, diagrams, and Chinese/English text-heavy layouts.
- In `baoyu-image-gen`, Z.AI currently exposes text-to-image only; reference images are not wired for this provider.
- The sync Z.AI image API returns a downloadable image URL, which the runtime saves locally after download.

### Replicate Model Selection

```yaml
header: "Replicate Model"
question: "Choose a default Replicate image generation model?"
options:
  - label: "google/nano-banana-2 (Recommended)"
    description: "Current default for general Replicate image generation in baoyu-image-gen"
  - label: "bytedance/seedream-4.5"
    description: "Replicate Seedream 4.5 with validated local size/ref guardrails"
  - label: "bytedance/seedream-5-lite"
    description: "Replicate Seedream 5 Lite with validated local size/ref guardrails"
  - label: "wan-video/wan-2.7-image-pro"
    description: "Replicate Wan 2.7 Image Pro with 4K text-to-image support"
```

### MiniMax Model Selection

```yaml
header: "MiniMax Model"
question: "Choose a default MiniMax image generation model?"
options:
  - label: "image-01 (Recommended)"
    description: "Best general-purpose MiniMax image model with custom width/height support"
  - label: "image-01-live"
    description: "Lower-latency MiniMax image model using aspect ratios"
```

Notes for MiniMax setup:

- `image-01` is the safest default. It supports official `aspect_ratio` values and documented custom `width` / `height` output sizes.
- `image-01-live` is useful when the user prefers faster generation and can work with aspect-ratio-based sizing.
- MiniMax subject reference currently uses `subject_reference[].type = character`; docs recommend front-facing portrait references in JPG/JPEG/PNG under 10MB.

### Update EXTEND.md

After user selects a model:

1. Read existing EXTEND.md
2. If `default_model:` section exists → update the provider-specific key
3. If `default_model:` section missing → add the full section:

```yaml
default_model:
  google: [value or null]
  openai: [value or null]
  azure: [value or null]
  openrouter: [value or null]
  dashscope: [value or null]
  zai: [value or null]
  minimax: [value or null]
  replicate: [value or null]
  agnes: [value or null]
```

Only set the selected provider's model; leave others as their current value or null.

## After Setup

1. Create directory if needed
2. Write/update EXTEND.md with frontmatter
3. Confirm: "Preferences saved to [path]"
4. Continue with image generation
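
As a rough illustration of the trigger conditions in the Overview and the rule that only the selected provider's model is set, the decision logic might be sketched in Python as follows. This is not the skill's actual implementation: `choose_flow`, `set_default_model`, and `PROVIDERS` are hypothetical names, and the sketch assumes the EXTEND.md frontmatter has already been parsed into a plain dict (or `None` when no file was found).

```python
from typing import Optional

# Provider keys as they appear under default_model in the EXTEND.md template.
PROVIDERS = ["google", "openai", "azure", "openrouter", "dashscope",
             "zai", "minimax", "replicate", "agnes"]


def choose_flow(config: Optional[dict], provider: str) -> str:
    """Hypothetical helper: decide which setup flow applies.

    config is the parsed EXTEND.md frontmatter, or None if no file exists.
    """
    if config is None:
        return "full_setup"        # Flow 1: provider + model + preferences
    models = config.get("default_model") or {}
    if models.get(provider) is None:
        return "model_only"        # Flow 2: ask only the model question
    return "none"                  # nothing to ask, continue


def set_default_model(config: dict, provider: str, model: str) -> dict:
    """Hypothetical helper: return a copy with one provider's model set.

    Adds the full default_model section if it is missing and leaves every
    other provider at its current value or None.
    """
    models = {name: None for name in PROVIDERS}
    models.update(config.get("default_model") or {})
    models[provider] = model
    updated = dict(config)
    updated["default_model"] = models
    return updated


# Example: EXTEND.md exists, but no Google model has been chosen yet.
existing = {"version": 1, "default_provider": "google",
            "default_model": {"google": None, "openai": "gpt-image-2"}}
print(choose_flow(existing, "google"))  # prints model_only
existing = set_default_model(existing, "google", "gemini-3-pro-image")
```

Returning a new dict rather than mutating the input keeps the read, update, and write steps separate, which makes it easier to confirm the change before writing the file back.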