---
name: nanobanana
description: "Generate and edit images using Nano Banana (Gemini image generation). Use when users want to create images, generate visuals, edit photos, design mockups, produce thumbnails, create logos, make hero images, or integrate Nano Banana into their codebase."
metadata:
  author: mgiovani
  version: 1.0.0
  source: https://github.com/mgiovani/skills
allowed-tools:
  - Bash
  - Read
  - Glob
  - Grep
---
# Nanobanana — Nano Banana Image Generation
Generate and edit images using Google's Nano Banana (Gemini image generation API). This skill handles direct image generation, iterative editing, and expert guidance for integrating the API into codebases.
Core differentiator: A prompt enhancement system that analyzes user intent and project context to craft optimized prompts before calling the API.
## Phase 0: Environment Check

Before anything else, verify the environment is ready.

1. Check the API key:

```bash
echo "${GEMINI_API_KEY:0:10}..."  # Show first 10 chars only (security)
```
If `GEMINI_API_KEY` is empty or unset:
- Read `references/integration-guide.md` (the setup section)
- Present setup instructions to the user
- Stop here until the key is configured
2. Check that uv is available:

```bash
uv --version 2>&1
```

If uv is not installed, direct the user to https://docs.astral.sh/uv/getting-started/installation/ and stop. uv handles dependency installation automatically via PEP 723 inline metadata — no manual pip install needed.
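For reference, a PEP 723 inline-metadata header looks like the sketch below (the actual header in `scripts/generate.py` may differ; the dependency pin is taken from `scripts/requirements.txt`):

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
# ]
# ///
# With this header, `uv run scripts/generate.py` resolves and installs
# google-genai into an ephemeral environment before executing the script.
import os

key = os.environ.get("GEMINI_API_KEY", "")
# Mirror the Phase 0 check: never print the full key.
print(f"{key[:10]}..." if key else "GEMINI_API_KEY is not set")
```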
## Phase 1: Understand Intent & Detect Mode
Mine the conversation for:
- Subject/scene: What is the image of?
- Purpose: What is it for? (hero image, icon, mockup, blog post, etc.)
- Style: Photorealistic, illustration, minimalist, etc.
- Technical requirements: Aspect ratio, resolution, specific dimensions
- Mood/atmosphere: Energetic, calm, professional, playful, etc.
### Detect Mode
Expert Integration Mode — if the user wants to integrate Nano Banana into their codebase (e.g., "how do I add image generation to my app", "show me the API", "I'm building a feature that generates images"):
- Read `references/integration-guide.md`
- Provide SDK examples, authentication patterns, and production best practices
- Skip to guidance — do not call the API
Generation Mode — if the user wants an image generated now:
- Continue to Phase 2
### Analyze Project Context (Generation Mode Only)
If invoked within a project directory, gather context to improve prompts:
```bash
# Identify project type
ls package.json pyproject.toml README.md 2>/dev/null | head -5

# Find project description
head -20 README.md 2>/dev/null || head -20 pyproject.toml 2>/dev/null

# Find existing images (identify style conventions)
find . -name "*.png" -o -name "*.jpg" -o -name "*.svg" 2>/dev/null | grep -v node_modules | head -10

# Find color schemes (Tailwind, CSS variables, theme files)
grep -r "primary\|brand\|#[0-9a-fA-F]\{6\}" --include="*.css" --include="*.ts" --include="*.json" -l 2>/dev/null | head -5
```
Use this context to make the generated image fit the project's visual language.
### Classify Request Type
Choose the most fitting category:
- photorealistic — scenes, portraits, product photos, landscapes
- stylized — illustrations, art, cartoon, concept art
- text-heavy — posters, banners, infographics with text
- product-marketing — commercial product shots
- ui-mockup — app screens, website designs, wireframes
- diagram — technical illustrations, flowcharts, architecture
- minimalist — abstract, logos, icon concepts
### Ask Only for Missing Info
Only ask for information the conversation did not already provide. If the user said "a minimalist logo for my SaaS app", you already know: subject (logo), style (minimalist), purpose (SaaS branding). Don't ask for things you already know.
## Phase 2: Enhance Prompt

Read the relevant section from `references/prompt-engineering.md` based on the request category.
### Enhancement Process
Apply category-specific enhancements:
| Category | Add to Prompt |
|---|---|
| photorealistic | Camera angle, lens type, lighting setup, depth of field, atmosphere |
| stylized | Art style, quality level, shading approach, color palette reference |
| text-heavy | Exact text in quotes, font style, weight, color, placement |
| product-marketing | Studio lighting setup, surface material, background type |
| ui-mockup | Device frame, design language, project colors if known |
| diagram | Diagram type, color coding scheme, label style, clean lines |
| minimalist | Background color (exact), element positioning, size proportions |
Incorporate any project context found in Phase 1 (brand colors, design system, domain).
### Present Enhanced Prompt for Approval
ALWAYS show this before generating. Never skip this step.
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PROMPT REVIEW
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ORIGINAL: [user's original prompt]
ENHANCED: [improved prompt with additions]

CHANGES:
+ [what was added]
+ [why it was added]

MODEL: [Selected model name]
ASPECT: [e.g., 16:9]
RESOLUTION: [e.g., 2K]
EST. COST: ~$[estimate]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Proceed with enhanced prompt? (yes / modify / use original)
```
If the user wants modifications, update the enhanced prompt and show the review block again before proceeding.
## Phase 3: Select Model & Parameters

Default: Nano Banana 2 (`gemini-3.1-flash-image-preview`) at 2K resolution.

See `references/model-guide.md` for full details. Quick reference:
| Use Case | Model | Resolution |
|---|---|---|
| Quick drafts / iteration | gemini-2.5-flash-image | 512 or 1K |
| Most production assets (DEFAULT) | gemini-3.1-flash-image-preview | 2K |
| Text-heavy images | gemini-3-pro-image-preview | 2K–4K |
| Print / high-DPI | gemini-3-pro-image-preview | 4K |
Aspect ratio defaults by use case:

- Hero/banner: 16:9
- Profile/avatar: 1:1
- Stories/mobile: 9:16
- Portrait/pin: 2:3
- Standard web: 4:3
Always present the model and resolution choice to the user as part of the Phase 2 review block and allow them to override.
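The defaults above could be encoded roughly as follows. The use-case and layout keys are illustrative; the model IDs and resolutions come from the tables in this section:

```python
# Hypothetical defaults mirroring the quick-reference tables above.
MODEL_DEFAULTS = {
    "draft": ("gemini-2.5-flash-image", "1K"),
    "production": ("gemini-3.1-flash-image-preview", "2K"),
    "text-heavy": ("gemini-3-pro-image-preview", "2K"),
    "print": ("gemini-3-pro-image-preview", "4K"),
}

ASPECT_DEFAULTS = {
    "hero": "16:9",
    "avatar": "1:1",
    "stories": "9:16",
    "pin": "2:3",
    "web": "4:3",
}

def select_parameters(use_case: str = "production", layout: str = "web") -> dict:
    """Fall back to the production model and 4:3 when the keys are unknown."""
    model, resolution = MODEL_DEFAULTS.get(use_case, MODEL_DEFAULTS["production"])
    return {"model": model, "resolution": resolution,
            "aspect_ratio": ASPECT_DEFAULTS.get(layout, "4:3")}

print(select_parameters("draft", "hero"))
```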
## Phase 4: Generate Image

Determine the output path (default to `./generated-image.png` if not specified, or a contextually appropriate name like `./hero-image.png` or `./logo-concept.png`).
### Text-to-Image

```bash
uv run "$(dirname "$0")/scripts/generate.py" \
  --prompt "ENHANCED_PROMPT_HERE" \
  --model "MODEL_ID_HERE" \
  --aspect-ratio "ASPECT_RATIO_HERE" \
  --resolution "RESOLUTION_HERE" \
  --output "OUTPUT_PATH_HERE"
```
### Image Editing (when the user provides an existing image)

```bash
uv run "$(dirname "$0")/scripts/generate.py" \
  --prompt "EDIT_INSTRUCTION_HERE" \
  --input-image "INPUT_IMAGE_PATH_HERE" \
  --model "MODEL_ID_HERE" \
  --aspect-ratio "ASPECT_RATIO_HERE" \
  --resolution "RESOLUTION_HERE" \
  --output "OUTPUT_PATH_HERE"
```
### Parse the JSON Output
The script outputs a JSON object. Parse and handle each case:
Success:

```json
{"status": "success", "output_path": "/abs/path/image.png", "model_used": "...", "text_response": "...", "size_bytes": 245760}
```
→ Report the file path. Use Read on image files if the platform supports inline display.
Error cases:
| error_code | Meaning | Action |
|---|---|---|
| CONTENT_POLICY | Prompt blocked by safety filters | Suggest rephrasing; remove sensitive elements |
| RATE_LIMIT | API quota exceeded | Wait before retrying; suggest a lower-cost model |
| AUTH_ERROR | Invalid or missing API key | Direct user to the `references/integration-guide.md` setup section |
| NO_IMAGE_GENERATED | Model returned no image | Try rephrasing the prompt; try a different model |
| DEPENDENCY_ERROR | google-genai not installed | Ensure uv is available; `uv run` handles deps automatically via PEP 723 metadata |
| FILE_NOT_FOUND | Input image path invalid | Verify the path and re-run |
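A sketch of how this JSON contract might be consumed. Field names come from the examples above; the `RECOVERY_HINTS` wording is illustrative:

```python
import json

# Hypothetical dispatcher over the script's JSON output.
RECOVERY_HINTS = {
    "CONTENT_POLICY": "Rephrase the prompt; remove sensitive elements.",
    "RATE_LIMIT": "Wait before retrying; consider a lower-cost model.",
    "AUTH_ERROR": "Check GEMINI_API_KEY; see references/integration-guide.md.",
    "NO_IMAGE_GENERATED": "Rephrase the prompt or try a different model.",
    "DEPENDENCY_ERROR": "Ensure uv is installed; it resolves deps via PEP 723.",
    "FILE_NOT_FOUND": "Verify the --input-image path and re-run.",
}

def summarize_result(raw: str) -> str:
    """Turn the script's JSON output into a one-line status for the user."""
    result = json.loads(raw)
    if result.get("status") == "success":
        return f"Saved {result['output_path']} ({result['size_bytes']} bytes)"
    code = result.get("error_code", "UNKNOWN")
    return f"{code}: {RECOVERY_HINTS.get(code, 'Inspect the raw script output.')}"

print(summarize_result('{"status": "success", "output_path": "/tmp/x.png", "size_bytes": 245760}'))
```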
## Phase 5: Iterate (Optional)
After a successful generation, offer iteration options based on user feedback:
Minor tweaks (color, brightness, small compositional changes):
→ Use image editing mode — pass the previous output as `--input-image`

Major changes (completely different subject, style change):
→ Modify the enhanced prompt and regenerate from scratch
Rapid exploration (testing multiple concepts):
→ Use gemini-2.5-flash-image at 512 resolution for all iterations
→ Identify the winning concept, then regenerate with gemini-3.1-flash-image-preview at 2K
For iterative editing sessions, keep track of the prompt evolution so the user can revert to a previous version if needed.
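One way to track prompt evolution across an editing session (a hypothetical helper, not part of the skill's scripts):

```python
# Hypothetical prompt-history helper for iterative editing sessions.
class PromptHistory:
    def __init__(self) -> None:
        self._versions: list[str] = []

    def record(self, prompt: str) -> int:
        """Store a prompt version; returns its index for later revert."""
        self._versions.append(prompt)
        return len(self._versions) - 1

    def revert(self, index: int) -> str:
        """Return an earlier prompt and discard everything after it."""
        self._versions = self._versions[: index + 1]
        return self._versions[index]

history = PromptHistory()
history.record("a minimalist logo")
history.record("a minimalist logo, blue palette")
print(history.revert(0))
```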
## Expert Integration Mode
When the user wants to add image generation to their codebase:
- Read `references/integration-guide.md`
- Identify the user's tech stack (Python, JavaScript/TypeScript, REST API needed)
- Provide the relevant SDK example from the guide
- Tailor the example to their project structure:
- Python FastAPI/Flask → show as an endpoint
- Next.js → show as an API route
- Plain script → show standalone function
- Highlight critical production concerns from the guide:
- Never expose API key in frontend
- Implement rate limiting per user
- Cache by prompt hash
- Handle 429 with exponential backoff
- Suggest environment variable setup appropriate for their project type
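Two of the production concerns above — caching by prompt hash and exponential backoff on 429s — sketched together. This assumes a `call_api` wrapper you supply; a generic `RuntimeError` stands in for the SDK's actual rate-limit exception:

```python
import hashlib
import time

# Sketch only: in production, back the cache with Redis or object storage.
_cache: dict[str, bytes] = {}

def prompt_key(prompt: str, model: str) -> str:
    """Stable cache key: the same prompt + model can reuse a prior image."""
    return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

def generate_with_retries(prompt: str, model: str, call_api, max_attempts: int = 4) -> bytes:
    """Return cached bytes if available; otherwise call the API with backoff."""
    key = prompt_key(prompt, model)
    if key in _cache:
        return _cache[key]
    for attempt in range(max_attempts):
        try:
            image = call_api(prompt, model)  # hypothetical wrapper; raises on HTTP 429
            _cache[key] = image
            return image
        except RuntimeError:  # stand-in for the SDK's rate-limit error
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("rate-limited after retries")
```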
## Reference Files
- `references/prompt-engineering.md` — Photography terms, style guides, sparse→rich examples by category
- `references/model-guide.md` — Model comparison, pricing, rate limits, resolution options
- `references/integration-guide.md` — SDK examples (Python/JS/REST), setup, production best practices
- `scripts/generate.py` — Core API caller with retry logic and JSON output
- `scripts/requirements.txt` — `google-genai>=1.0.0`