Per-Model Prompting Guides
Optimize prompts for Claude, GPT, Gemini & open-source models
The Problem: You write a great prompt that works perfectly on ChatGPT, but when you try it on Claude, the output is worse. When you try it on Llama, it completely ignores your system message. Why?
The Solution: Speak Each Model's Dialect
Each LLM is trained with different data, formats, and optimization techniques. Claude was trained with XML tags in its data, making <document>, <example> tags especially effective. GPT models prefer markdown and resolve conflicting instructions by prioritizing later ones. Gemini excels with multimodal inputs placed at the start. Open-source models need explicit chat templates. Using the right "dialect" for each model can improve output quality by 20-40% compared to generic prompts.
Think of it like speaking different dialects of the same language:
1. Learn the model's native format: Claude → XML tags, GPT → markdown/JSON, Gemini → structured templates, Open-source → chat templates with special tokens
2. Use model-specific features: Claude: prefilled assistant responses. GPT: function calling, JSON mode. Gemini: search grounding, image-first multimodal. Llama: LoRA adapters.
3. Adapt prompt length and structure: Claude and Gemini handle very long contexts well (200K-1M tokens). GPT works best with focused, concise prompts. Open-source models struggle beyond 8-32K tokens.
4. Test and compare: Run the same task on multiple models, compare outputs, then optimize the prompt for your chosen model's strengths
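The "dialect" idea in step 1 can be sketched with two small formatters that wrap the same task and context in each model's preferred structure. The helper names and the sample task below are illustrative, not part of any SDK:

```python
def claude_prompt(task: str, context: str) -> str:
    # Claude responds well to XML-tagged sections
    return (
        f"<instructions>\n{task}\n</instructions>\n"
        f"<context>\n{context}\n</context>"
    )

def gpt_prompt(task: str, context: str) -> str:
    # GPT models respond well to markdown headers
    return f"## Instructions\n{task}\n\n## Context\n{context}"

task = "Summarize the document in three bullet points."
context = "Q3 report text goes here."
print(claude_prompt(task, context))
print(gpt_prompt(task, context))
```

The payload is identical; only the structural markers change, which is exactly what "speaking the model's dialect" means in practice.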
Model-Specific Prompt Formats
- Claude (Anthropic): XML tags for structure, extended thinking, 200K context, prefilled responses, strong system prompt adherence
- GPT-4 / GPT-5 (OpenAI): JSON mode, function calling, markdown formatting preferred, instruction prioritization (later instructions win)
- Gemini (Google): True multimodal (images first in prompt), 1M+ token context, search grounding, structured prompt templates boost accuracy 40%
- Open-Source (Llama, DeepSeek, Qwen): Chat templates required (<|im_start|>/<|im_end|>), explicit formatting, shorter prompts work better, model-specific system prompt formats
- Common Pitfall: One Prompt Fits All: A prompt optimized for GPT-4 may underperform on Claude by 20-30% because Claude expects XML structure, not markdown headers. Always adapt to the model.
Fun Fact: Claude was specifically trained on XML-structured data, which is why wrapping your prompt sections in tags like <instructions>, <context>, <output_format> dramatically improves performance. GPT models, on the other hand, tend to perform better with markdown headers (## Instructions) — using XML on GPT actually hurts readability for the model.
Try It Yourself!
Explore the comparison below to see how the same task is prompted differently for each model, and to learn each model's unique features.
Per-Model Prompting Guide
Claude (Anthropic)
- Long context (200K tokens)
- XML-structured prompts
- Extended thinking
- Strong system prompt adherence
Prompt format: XML tags such as <instructions>, <context>, <examples>, <output_format>
Unique features:
- Prefilled assistant responses
- XML tag parsing trained into the model
- Chain-of-thought via extended thinking
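Prefilled assistant responses deserve a concrete sketch: in the Anthropic Messages API, ending the message list with a partial assistant turn makes Claude continue from that prefix. Here the prefill is a lone `{`, nudging the reply to be JSON; the helper name is ours, and no request is actually sent:

```python
def build_prefilled_messages(user_request: str) -> list:
    # Ending with a partial assistant turn = "prefill".
    # Claude continues from the "{" rather than starting fresh,
    # so the completion begins as a JSON object.
    return [
        {"role": "user", "content": user_request},
        {"role": "assistant", "content": "{"},  # prefilled prefix
    ]

messages = build_prefilled_messages("List three colors as JSON.")
```

This list would be passed as the `messages` argument of an Anthropic client call; the technique has no direct equivalent in the OpenAI chat API.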
GPT-4 / GPT-5 (OpenAI)
- Function calling & tool use
- JSON mode for structured output
- Strong instruction following
- Later instructions prioritized
Prompt format: markdown headers such as ## Role, ## Instructions, ## Examples, ## Output
Unique features:
- JSON mode (guaranteed valid JSON)
- Function/tool calling API
- Structured Outputs schema
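JSON mode is enabled per request via `response_format`. The sketch below only builds the request payload (no API call is made); the model name is an assumption, and note that OpenAI requires the word "JSON" to appear somewhere in the messages when JSON mode is on:

```python
import json

def build_json_mode_request(user_msg: str) -> dict:
    # response_format={"type": "json_object"} asks the API to
    # guarantee the reply parses as valid JSON. The prompt must
    # still mention JSON explicitly or the request is rejected.
    return {
        "model": "gpt-4o",  # assumed model name
        "response_format": {"type": "json_object"},
        "messages": [
            {"role": "system", "content": "Reply in JSON."},
            {"role": "user", "content": user_msg},
        ],
    }

payload = build_json_mode_request("Extract the due date from: 'Due May 3.'")
print(json.dumps(payload, indent=2))
```

For stricter guarantees, Structured Outputs goes further by validating against a supplied JSON Schema rather than only ensuring well-formed JSON.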
Gemini (Google)
- Massive context (1M+ tokens)
- Native multimodal (images, video, audio)
- Search grounding
- Structured templates +40% accuracy
Prompt format: structured templates with clear sections; images/media at the start of the prompt
Unique features:
- Search grounding (live web data)
- Image-first multimodal processing
- 1M+ context for entire codebases
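The media-first ordering can be sketched as a parts list of the kind Gemini's `generate_content` accepts, with the image before the text question. The helper name and the placeholder bytes are illustrative only:

```python
def build_gemini_parts(image_bytes: bytes, question: str) -> list:
    # Gemini guidance: put media before the text question.
    # Each part is either a dict with mime_type/data or a plain string.
    return [
        {"mime_type": "image/png", "data": image_bytes},  # media first
        question,                                          # text last
    ]

parts = build_gemini_parts(b"<png bytes here>", "What chart type is shown?")
```

Reversing the order still works, but Google's prompting guidance reports better results when the model sees the media before the instruction about it.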
Open-Source (Llama, DeepSeek, Qwen)
- Full local control, no API costs
- Fine-tuning with LoRA/QLoRA
- Custom deployment options
- No data leaves your servers
Prompt format: chat templates with special tokens: <|im_start|>system, <|im_start|>user, <|im_start|>assistant
Unique features:
- LoRA/QLoRA fine-tuning for custom tasks
- Quantization for edge deployment
- No rate limits or usage restrictions
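To make the special tokens concrete, here is the ChatML-style template used by several open models (e.g. Qwen) written out by hand. In practice you would call `tokenizer.apply_chat_template` from Hugging Face transformers instead of formatting strings yourself; this sketch just shows what that call produces:

```python
def apply_chatml_template(system: str, user: str) -> str:
    # ChatML layout: each turn is wrapped in <|im_start|>role ... <|im_end|>.
    # The prompt ends with an open assistant turn so the model
    # generates the reply as its continuation.
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = apply_chatml_template("You are concise.", "Define LoRA in one line.")
```

Note that other model families use different tokens (Llama's templates differ from ChatML), which is exactly why sending a bare string without the right template makes a model "ignore" your system message.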
This lesson is part of a structured LLM course.