---
name: nanobanana
description: "Generate and edit images using Nano Banana (Gemini image generation). Use when users want to create images, generate visuals, edit photos, design mockups, produce thumbnails, create logos, make hero images, or integrate Nano Banana into their codebase."
metadata:
  author: mgiovani
  version: 1.0.0
  source: https://github.com/mgiovani/skills
allowed-tools:
  - Bash
  - Read
  - Glob
  - Grep
---
# Nanobanana — Nano Banana Image Generation
Generate and edit images using Google's Nano Banana (Gemini image generation API). This skill handles direct image generation, iterative editing, and expert guidance for integrating the API into codebases.
Core differentiator: A prompt enhancement system that analyzes user intent and project context to craft optimized prompts before calling the API.
## Phase 0: Environment Check

Before anything else, verify the environment is ready.

1. Check the API key:

```bash
echo "${GEMINI_API_KEY:0:10}..."  # Show first 10 chars only (security)
```
If `GEMINI_API_KEY` is empty or unset:
- Read `references/integration-guide.md` (the setup section)
- Present setup instructions to the user
- Stop here until the key is configured
2. Check that uv is available:

```bash
uv --version 2>&1
```

If uv is not installed, direct the user to https://docs.astral.sh/uv/getting-started/installation/ and stop. uv handles dependency installation automatically via PEP 723 inline metadata — no manual pip install needed.
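For reference, a PEP 723 inline-metadata header looks like the sketch below (the actual header in `scripts/generate.py` may differ; the dependency pin is taken from `scripts/requirements.txt`):

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
# ]
# ///
# With this header, `uv run scripts/generate.py` resolves and installs
# google-genai into an ephemeral environment before executing the script.
import os

key = os.environ.get("GEMINI_API_KEY", "")
# Mirror the Phase 0 check: never print the full key.
print(f"{key[:10]}..." if key else "GEMINI_API_KEY is not set")
```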
## Phase 1: Understand Intent & Detect Mode
Mine the conversation for:
- Subject/scene: What is the image of?
- Purpose: What is it for? (hero image, icon, mockup, blog post, etc.)
- Style: Photorealistic, illustration, minimalist, etc.
- Technical requirements: Aspect ratio, resolution, specific dimensions
- Mood/atmosphere: Energetic, calm, professional, playful, etc.
### Detect Mode
Expert Integration Mode — if the user wants to integrate Nano Banana into their codebase (e.g., "how do I add image generation to my app", "show me the API", "I'm building a feature that generates images"):
- Read `references/integration-guide.md`
- Provide SDK examples, authentication patterns, and production best practices
- Skip to guidance — do not call the API
Generation Mode — if the user wants an image generated now:
- Continue to Phase 2
### Analyze Project Context (Generation Mode Only)
If invoked within a project directory, gather context to improve prompts:
```bash
# Identify project type
ls package.json pyproject.toml README.md 2>/dev/null | head -5

# Find project description
head -20 README.md 2>/dev/null || head -20 pyproject.toml 2>/dev/null

# Find existing images (identify style conventions)
find . -name "*.png" -o -name "*.jpg" -o -name "*.svg" 2>/dev/null | grep -v node_modules | head -10

# Find color schemes (Tailwind, CSS variables, theme files)
grep -r "primary\|brand\|#[0-9a-fA-F]\{6\}" --include="*.css" --include="*.ts" --include="*.json" -l 2>/dev/null | head -5
```
Use this context to make the generated image fit the project's visual language.
### Classify Request Type
Choose the most fitting category:
- photorealistic — scenes, portraits, product photos, landscapes
- stylized — illustrations, art, cartoon, concept art
- text-heavy — posters, banners, infographics with text
- product-marketing — commercial product shots
- ui-mockup — app screens, website designs, wireframes
- diagram — technical illustrations, flowcharts, architecture
- minimalist — abstract, logos, icon concepts
### Ask Only for Missing Info
Only ask for information the conversation did not already provide. If the user said "a minimalist logo for my SaaS app", you already know: subject (logo), style (minimalist), purpose (SaaS branding). Don't ask for things you already know.
## Phase 2: Enhance Prompt

Read the relevant section from `references/prompt-engineering.md` based on the request category.
### Enhancement Process
Apply category-specific enhancements:
| Category | Add to Prompt |
|---|---|
| photorealistic | Camera angle, lens type, lighting setup, depth of field, atmosphere |
| stylized | Art style, quality level, shading approach, color palette reference |
| text-heavy | Exact text in quotes, font style, weight, color, placement |
| product-marketing | Studio lighting setup, surface material, background type |
| ui-mockup | Device frame, design language, project colors if known |
| diagram | Diagram type, color coding scheme, label style, clean lines |
| minimalist | Background color (exact), element positioning, size proportions |
Incorporate any project context found in Phase 1 (brand colors, design system, domain).
### Present Enhanced Prompt for Approval
ALWAYS show this before generating. Never skip this step.
```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PROMPT REVIEW
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
ORIGINAL: [user's original prompt]
ENHANCED: [improved prompt with additions]

CHANGES:
+ [what was added]
+ [why it was added]

MODEL: [Selected model name]
ASPECT: [e.g., 16:9]
RESOLUTION: [e.g., 2K]
EST. COST: ~$[estimate]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Proceed with enhanced prompt? (yes / modify / use original)
```
If the user wants modifications, update the enhanced prompt and show the review block again before proceeding.
## Phase 3: Select Model & Parameters

Default: Nano Banana 2 (`gemini-3.1-flash-image-preview`) at 2K resolution.

See `references/model-guide.md` for full details. Quick reference:
| Use Case | Model | Resolution |
|---|---|---|
| Quick drafts / iteration | gemini-2.5-flash-image | 512 or 1K |
| Most production assets (DEFAULT) | gemini-3.1-flash-image-preview | 2K |
| Text-heavy images | gemini-3-pro-image-preview | 2K–4K |
| Print / high-DPI | gemini-3-pro-image-preview | 4K |
Aspect ratio defaults by use case:

- Hero/banner: 16:9
- Profile/avatar: 1:1
- Stories/mobile: 9:16
- Portrait/pin: 2:3
- Standard web: 4:3
Always present the model and resolution choice to the user as part of the Phase 2 review block and allow them to override.
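The defaults above could be encoded roughly as follows. The use-case and layout keys are illustrative; the model IDs and resolutions come from the tables in this section:

```python
# Hypothetical defaults mirroring the quick-reference tables above.
MODEL_DEFAULTS = {
    "draft": ("gemini-2.5-flash-image", "1K"),
    "production": ("gemini-3.1-flash-image-preview", "2K"),
    "text-heavy": ("gemini-3-pro-image-preview", "2K"),
    "print": ("gemini-3-pro-image-preview", "4K"),
}

ASPECT_DEFAULTS = {
    "hero": "16:9",
    "avatar": "1:1",
    "stories": "9:16",
    "pin": "2:3",
    "web": "4:3",
}

def select_parameters(use_case: str = "production", layout: str = "web") -> dict:
    """Fall back to the production model and 4:3 when the keys are unknown."""
    model, resolution = MODEL_DEFAULTS.get(use_case, MODEL_DEFAULTS["production"])
    return {"model": model, "resolution": resolution,
            "aspect_ratio": ASPECT_DEFAULTS.get(layout, "4:3")}

print(select_parameters("draft", "hero"))
```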
## Phase 4: Generate Image

Determine the output path (default to `./generated-image.png` if not specified, or a contextually appropriate name like `./hero-image.png` or `./logo-concept.png`).
### Text-to-Image

```bash
uv run "$(dirname "$0")/scripts/generate.py" \
  --prompt "ENHANCED_PROMPT_HERE" \
  --model "MODEL_ID_HERE" \
  --aspect-ratio "ASPECT_RATIO_HERE" \
  --resolution "RESOLUTION_HERE" \
  --output "OUTPUT_PATH_HERE"
```
### Image Editing (when the user provides an existing image)

```bash
uv run "$(dirname "$0")/scripts/generate.py" \
  --prompt "EDIT_INSTRUCTION_HERE" \
  --input-image "INPUT_IMAGE_PATH_HERE" \
  --model "MODEL_ID_HERE" \
  --aspect-ratio "ASPECT_RATIO_HERE" \
  --resolution "RESOLUTION_HERE" \
  --output "OUTPUT_PATH_HERE"
```
### Parse the JSON Output
The script outputs a JSON object. Parse and handle each case:
Success:

```json
{"status": "success", "output_path": "/abs/path/image.png", "model_used": "...", "text_response": "...", "size_bytes": 245760}
```
→ Report the file path. Use Read on image files if the platform supports inline display.
Error cases:
| error_code | Meaning | Action |
|---|---|---|
| CONTENT_POLICY | Prompt blocked by safety filters | Suggest rephrasing; remove sensitive elements |
| RATE_LIMIT | API quota exceeded | Wait before retrying; suggest a lower-cost model |
| AUTH_ERROR | Invalid or missing API key | Direct user to the `references/integration-guide.md` setup section |
| NO_IMAGE_GENERATED | Model returned no image | Try rephrasing the prompt; try a different model |
| DEPENDENCY_ERROR | google-genai not installed | Ensure uv is available; `uv run` handles deps automatically via PEP 723 metadata |
| FILE_NOT_FOUND | Input image path invalid | Verify the path and re-run |
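A sketch of how this JSON contract might be consumed. Field names come from the examples above; the `RECOVERY_HINTS` wording is illustrative:

```python
import json

# Hypothetical dispatcher over the script's JSON output.
RECOVERY_HINTS = {
    "CONTENT_POLICY": "Rephrase the prompt; remove sensitive elements.",
    "RATE_LIMIT": "Wait before retrying; consider a lower-cost model.",
    "AUTH_ERROR": "Check GEMINI_API_KEY; see references/integration-guide.md.",
    "NO_IMAGE_GENERATED": "Rephrase the prompt or try a different model.",
    "DEPENDENCY_ERROR": "Ensure uv is installed; it resolves deps via PEP 723.",
    "FILE_NOT_FOUND": "Verify the --input-image path and re-run.",
}

def summarize_result(raw: str) -> str:
    """Turn the script's JSON output into a one-line status for the user."""
    result = json.loads(raw)
    if result.get("status") == "success":
        return f"Saved {result['output_path']} ({result['size_bytes']} bytes)"
    code = result.get("error_code", "UNKNOWN")
    return f"{code}: {RECOVERY_HINTS.get(code, 'Inspect the raw script output.')}"

print(summarize_result('{"status": "success", "output_path": "/tmp/x.png", "size_bytes": 245760}'))
```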
## Phase 5: Iterate (Optional)
After a successful generation, offer iteration options based on user feedback:
Minor tweaks (color, brightness, small compositional changes):
→ Use image editing mode — pass the previous output as `--input-image`

Major changes (completely different subject, style change):
→ Modify the enhanced prompt and regenerate from scratch
Rapid exploration (testing multiple concepts):
→ Use gemini-2.5-flash-image at 512 resolution for all iterations
→ Identify the winning concept, then regenerate with gemini-3.1-flash-image-preview at 2K
For iterative editing sessions, keep track of the prompt evolution so the user can revert to a previous version if needed.
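One way to track prompt evolution across an editing session (a hypothetical helper, not part of the skill's scripts):

```python
# Hypothetical prompt-history helper for iterative editing sessions.
class PromptHistory:
    def __init__(self) -> None:
        self._versions: list[str] = []

    def record(self, prompt: str) -> int:
        """Store a prompt version; returns its index for later revert."""
        self._versions.append(prompt)
        return len(self._versions) - 1

    def revert(self, index: int) -> str:
        """Return an earlier prompt and discard everything after it."""
        self._versions = self._versions[: index + 1]
        return self._versions[index]

history = PromptHistory()
history.record("a minimalist logo")
history.record("a minimalist logo, blue palette")
print(history.revert(0))
```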
## Expert Integration Mode
When the user wants to add image generation to their codebase:
- Read `references/integration-guide.md`
- Identify the user's tech stack (Python, JavaScript/TypeScript, REST API needed)
- Provide the relevant SDK example from the guide
- Tailor the example to their project structure:
- Python FastAPI/Flask → show as an endpoint
- Next.js → show as an API route
- Plain script → show standalone function
- Highlight critical production concerns from the guide:
- Never expose API key in frontend
- Implement rate limiting per user
- Cache by prompt hash
- Handle 429 with exponential backoff
- Suggest environment variable setup appropriate for their project type
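Two of the production concerns above — caching by prompt hash and exponential backoff on 429s — sketched together. This assumes a `call_api` wrapper you supply; a generic `RuntimeError` stands in for the SDK's actual rate-limit exception:

```python
import hashlib
import time

# Sketch only: in production, back the cache with Redis or object storage.
_cache: dict[str, bytes] = {}

def prompt_key(prompt: str, model: str) -> str:
    """Stable cache key: the same prompt + model can reuse a prior image."""
    return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

def generate_with_retries(prompt: str, model: str, call_api, max_attempts: int = 4) -> bytes:
    """Return cached bytes if available; otherwise call the API with backoff."""
    key = prompt_key(prompt, model)
    if key in _cache:
        return _cache[key]
    for attempt in range(max_attempts):
        try:
            image = call_api(prompt, model)  # hypothetical wrapper; raises on HTTP 429
            _cache[key] = image
            return image
        except RuntimeError:  # stand-in for the SDK's rate-limit error
            time.sleep(2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("rate-limited after retries")
```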
## Reference Files
- `references/prompt-engineering.md` — Photography terms, style guides, sparse→rich examples by category
- `references/model-guide.md` — Model comparison, pricing, rate limits, resolution options
- `references/integration-guide.md` — SDK examples (Python/JS/REST), setup, production best practices
- `scripts/generate.py` — Core API caller with retry logic and JSON output
- `scripts/requirements.txt` — `google-genai>=1.0.0`