DEV Community

Abhi

Advanced Prompt Engineering for goose: A Comprehensive Guide

Introduction

While basic prompting can get you started with goose, advanced prompt engineering techniques can dramatically improve the quality, accuracy, and efficiency of your AI-powered development sessions. goose is an open-source AI agent framework developed by Block that automates development tasks, from debugging to deployment, and mastering sophisticated prompting strategies will help you unlock its full potential.

This guide explores expert-level techniques for maximizing goose's capabilities through strategic prompt design, context management, and workflow optimization.

Understanding the goose Architecture

Before diving into advanced techniques, it's crucial to understand how goose works. goose operates through an interactive loop where it receives user input, passes it to an LLM for processing, executes tool calls based on the model's response, and manages context revision to optimize token usage.

Key architectural components:

  • Interface: Desktop app or CLI where you interact with goose
  • Agent: Core logic that manages the interactive loop
  • Extensions: Tools that enable specific capabilities (file management, shell commands, etc.)
  • Context Management: Automatic revision to keep relevant information and manage token costs

Refer to the goose architecture documentation for more information.
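In outline, that loop (receive input, call the LLM, execute any tool calls, repeat) looks something like the Python sketch below. Everything here, from the function name to the message shapes, is illustrative rather than goose's actual implementation.

```python
# Illustrative sketch of an agentic interactive loop like the one goose runs.
# All names and message shapes here are hypothetical, not goose's real code.

def run_agent_loop(llm, tools, user_input, max_steps=10):
    """Feed input to the LLM, execute any tool call it requests,
    and loop until the model produces a final answer."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        response = llm(messages)  # the model may answer or request a tool
        if response.get("tool_call"):
            name = response["tool_call"]["name"]
            args = response["tool_call"]["args"]
            result = tools[name](**args)  # execute the requested extension tool
            messages.append({"role": "tool", "name": name, "content": result})
        else:
            return response["content"]  # no tool requested: final answer
    return "max steps reached"
```

The important property is that tool results are fed back into the conversation, which is why context management matters: every loop iteration grows the message history.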

Core Principles of Advanced Prompting

1. Agentic Thinking: Task-Oriented vs. Chat-Oriented

goose is agentic, not conversational. Understanding this distinction is fundamental:

❌ Conversational approach:

"Can you help me with my tests?"

✅ Agentic approach:

goose, add comprehensive test coverage for src/services/paymentService.js

**Current coverage:** 45% (unacceptable)
**Target coverage:** 90%+

**Testing requirements:**
1. Unit tests for all public methods
2. Integration tests for payment gateway interactions
3. Edge cases and error scenarios
4. Mock external API calls (Stripe)
5. Test success and failure paths
6. Validate input handling
7. Check error message clarity

**Test structure to follow:**
- Use Jest as testing framework
- Follow AAA pattern (Arrange, Act, Assert)
- Group tests by method using describe blocks
- Use descriptive test names (it should...)
- Mock external dependencies properly
- Test one thing per test case

**Specific scenarios to cover:**
- Valid payment processing
- Invalid payment details
- Network failures
- Timeout scenarios
- Insufficient funds
- Currency validation
- Amount validation (positive, not zero, within limits)
- Idempotency (duplicate payment handling)

Key differences:

  • State the task clearly upfront
  • Provide measurable success criteria
  • Include structural requirements
  • Specify testing/validation approaches
  • List edge cases explicitly

2. Context Architecture: The .goosehints Strategy

.goosehints is a text file that provides additional context about your project to improve communication with goose. Advanced users leverage hierarchical hints files strategically.

Hierarchical Context Strategy

All .goosehints files from your current directory up to the root directory are automatically loaded and combined. Structure them by scope:

my-project/
├── .git/
├── .goosehints                 # Project-wide standards
├── frontend/
│   ├── .goosehints            # Frontend-specific hints
│   └── components/
│       ├── .goosehints        # Component-specific hints
│       └── Button.tsx
└── backend/
    ├── .goosehints            # Backend-specific hints
    └── api/
        └── routes.py
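The loading behavior described above can be sketched in a few lines of Python: walk from the current directory up to the root, collecting each .goosehints file. This illustrates the documented behavior; it is not goose's own loader.

```python
# Sketch of hierarchical .goosehints discovery: collect every .goosehints
# file from the current directory up to the filesystem root.
# This mirrors the documented behavior; it is not goose's implementation.
from pathlib import Path

def collect_goosehints(start_dir):
    """Return .goosehints paths from start_dir up to the root,
    ordered root-level first and most specific last."""
    start = Path(start_dir).resolve()
    hints = []
    for directory in [start, *start.parents]:
        candidate = directory / ".goosehints"
        if candidate.is_file():
            hints.append(candidate)
    return list(reversed(hints))  # project-wide first, most specific last
```

Ordering root-level first and most specific last means narrower hints can refine or override broader ones when the files are combined.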

Root-level .goosehints example:

# Project Standards

## Code Style
- Use TypeScript strict mode
- Follow Airbnb style guide
- Maximum function length: 50 lines
- Prefer functional components

## Testing Philosophy
- Minimum 80% coverage for new code
- Integration tests for all API endpoints
- Unit tests for business logic
- E2E tests for critical user flows

## Documentation Requirements
- JSDoc comments for all public functions
- README updates for new features
- API documentation in OpenAPI format

## Architecture Patterns
- Use repository pattern for data access
- Implement CQRS for complex domains
- Event-driven architecture for async operations

Component-level .goosehints example:

# Component Guidelines

## Button Component Patterns
- All buttons must support loading state
- Use semantic HTML (<button> not <div>)
- Include aria-label for icon-only buttons
- Follow design system color tokens

## Testing Requirements
- Snapshot tests for visual regression
- Interaction tests with Testing Library
- Accessibility tests with jest-axe

Advanced .goosehints Techniques

1. @-mentions for Dynamic Context Loading

For frequently-needed documentation, use @filename.md or @relative/path/testing.md to automatically include file content in context.

# API Development

When working with API endpoints, reference:
@docs/api-standards.md
@docs/authentication-flow.md

For database operations, see:
@backend/database/schema.sql
@backend/database/migration-guide.md
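Conceptually, @-mention expansion just inlines the referenced file's content into the hints text. The Python sketch below captures that idea; the regex and function are hypothetical illustrations, not goose's implementation.

```python
# Hypothetical sketch of @-mention expansion: replace "@relative/path.md"
# tokens with the referenced file's contents. goose's real loader may
# behave differently; this only illustrates the concept.
import re
from pathlib import Path

MENTION = re.compile(r"@([\w./-]+\.\w+)")

def expand_mentions(text, base_dir):
    """Inline the content of each @-mentioned file found under base_dir."""
    def replace(match):
        target = Path(base_dir) / match.group(1)
        if target.is_file():
            return target.read_text()
        return match.group(0)  # leave unresolved mentions untouched
    return MENTION.sub(replace, text)
```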

2. Environment-Specific Hints

# Development Environment

Current environment: ${ENVIRONMENT}

{% if ENVIRONMENT == "production" %}
⚠️ PRODUCTION ENVIRONMENT
- All changes require review
- Database migrations must be reversible
- Feature flags required for new features
{% else %}
Development mode active
- Auto-formatting enabled
- Verbose logging enabled
{% endif %}

3. Custom Command Definitions

You can define useful tasks in .goosehints that benefit from language model-based agent execution flow, like idea generation, summarization, or structured data extraction.

# Custom Commands

## /review
Perform comprehensive code review:
1. Check for security vulnerabilities
2. Verify test coverage
3. Validate documentation
4. Assess performance implications
5. Check accessibility compliance

## /optimize
Analyze and optimize:
1. Identify performance bottlenecks
2. Suggest caching strategies
3. Recommend database query improvements
4. Propose code splitting opportunities

## /summarize [timeframe]
Generate summary of changes:
- Scan git history for [timeframe]
- Group changes by category
- Highlight breaking changes
- Create changelog format
- Save to docs/changelog/YYYY-MM.md

3. Context vs. Memory: Strategic Information Management

Every line in your .goosehints file is sent to the LLM with every request, consuming input tokens. This is a critical consideration for advanced users.

Use .goosehints for:

  • Coding standards and conventions
  • Project architecture patterns
  • Testing requirements
  • Documentation standards
  • Technology stack information
  • File structure guidelines

Use Memory Extension for:

  • User-specific preferences
  • Dynamic project state
  • Learned patterns from past interactions
  • Frequently accessed but changing information
  • Personal coding habits
  • Context that needs tags/keywords for retrieval

Token Optimization Strategy:

# Minimal .goosehints (Optimized)

Stack: React 18, TypeScript 5, Node 20
Style: Airbnb, Prettier
Tests: Jest + RTL, 80% coverage
Docs: JSDoc + OpenAPI

[Store detailed standards in Memory Extension]
Use tags: #testing-patterns, #api-conventions, #component-structure

Instead of storing 200+ lines of detailed conventions in .goosehints (which are processed on every request), store them in the Memory Extension and retrieve them on demand via tags.

Advanced Prompting Patterns

Pattern 1: Structured Decomposition

Break complex tasks into explicit phases with validation gates.

Refactor the authentication system with the following phased approach:

PHASE 1 - ANALYSIS (Do not proceed to Phase 2 until this is complete)
1. Map current authentication flow
2. Identify all dependencies
3. List potential breaking changes
4. Create rollback strategy
5. Present findings for review

PHASE 2 - TEST COVERAGE (Requires Phase 1 approval)
1. Add tests for existing behavior
2. Achieve 95%+ coverage
3. Document test scenarios
4. Run full test suite
5. Present coverage report

PHASE 3 - IMPLEMENTATION (Requires Phase 2 passing tests)
1. Create feature branch
2. Implement new authentication layer
3. Maintain backward compatibility
4. Update integration tests
5. Run performance benchmarks

PHASE 4 - VALIDATION (Requires Phase 3 success)
1. Execute full test suite
2. Perform security audit
3. Check performance metrics
4. Validate error handling
5. Review documentation

PHASE 5 - DEPLOYMENT (Requires Phase 4 approval)
1. Create deployment plan
2. Set up monitoring
3. Configure feature flags
4. Document rollback procedure
5. Prepare incident response

**Validation Gate**: Pause after each phase and wait for explicit approval before proceeding.

Pattern 2: Perspective-Based Review

Review from multiple perspectives using different expert personas.

Review src/api/payment-processor.ts from multiple expert perspectives:

PERSPECTIVE 1 - SECURITY ANALYST
You are a security expert specializing in payment systems.
Focus on:
- Authentication and authorization
- Data encryption in transit and at rest
- PCI DSS compliance
- Input validation and sanitization
- SQL injection prevention
- XSS vulnerabilities
- Rate limiting
- Audit logging

PERSPECTIVE 2 - PERFORMANCE ENGINEER
You are a performance optimization specialist.
Focus on:
- Database query efficiency
- Caching opportunities
- Memory leaks
- Connection pooling
- Async operation handling
- Load testing considerations
- Scalability bottlenecks

PERSPECTIVE 3 - RELIABILITY ENGINEER
You are an SRE focused on system reliability.
Focus on:
- Error handling and recovery
- Retry logic and backoff strategies
- Circuit breaker patterns
- Timeout configurations
- Monitoring and alerting
- Graceful degradation
- Disaster recovery

PERSPECTIVE 4 - ACCESSIBILITY EXPERT
You are an accessibility specialist.
Focus on:
- Error message clarity
- User feedback mechanisms
- Progressive enhancement
- Screen reader compatibility
- Keyboard navigation
- WCAG compliance

For each perspective:
1. List specific findings (with line numbers)
2. Assess severity (Critical/High/Medium/Low)
3. Provide actionable recommendations
4. Suggest specific code changes

Save complete findings to docs/reviews/payment-processor-review.md

Pattern 3: Constraint-Driven Development

Define explicit constraints that guide implementation decisions.

Implement real-time notification system with these non-negotiable constraints:

PERFORMANCE CONSTRAINTS:
- Latency: < 100ms for notification delivery
- Throughput: Handle 10,000 concurrent connections
- Memory: < 512MB per instance
- CPU: < 40% average utilization

RELIABILITY CONSTRAINTS:
- Uptime: 99.95% SLA
- Message delivery: At-least-once guarantee
- Failover: < 30 seconds recovery time
- Data loss: Zero tolerance

SCALABILITY CONSTRAINTS:
- Horizontal scaling: Auto-scale from 2-20 instances
- Database connections: Pool size max 20
- WebSocket connections: Support 500 per instance
- Queue depth: Max 10,000 messages

SECURITY CONSTRAINTS:
- Authentication: JWT with 15-minute expiry
- Authorization: Role-based access control
- Encryption: TLS 1.3 for all connections
- Rate limiting: 100 requests/minute per user

OPERATIONAL CONSTRAINTS:
- Monitoring: Prometheus metrics exposed
- Logging: Structured JSON logs
- Tracing: OpenTelemetry integration
- Deployment: Zero-downtime deployments

Implementation approach:
1. Choose technology stack that meets ALL constraints
2. Create proof-of-concept demonstrating constraint adherence
3. Implement comprehensive benchmarking
4. Document constraint validation approach
5. Set up automated constraint testing

**Validation**: After implementation, provide evidence that each constraint is met, including benchmark results and test output.

Pattern 4: Error-Driven Refinement

Anticipate and specify error handling comprehensively.

Implement data import pipeline with exhaustive error handling:

ERROR CATEGORIES TO HANDLE:

1. INPUT VALIDATION ERRORS
   - Empty file
   - Wrong file format
   - Corrupted data
   - Encoding issues
   - Schema mismatches
   - Missing required fields
   - Invalid data types
   - Constraint violations

2. PROCESSING ERRORS
   - Timeout during processing
   - Memory overflow
   - Circular dependencies
   - Duplicate detection
   - Data transformation failures
   - Business rule violations

3. EXTERNAL SERVICE ERRORS
   - API rate limit exceeded
   - Service unavailable (503)
   - Authentication failures (401)
   - Authorization failures (403)
   - Network timeouts
   - SSL certificate errors

4. STORAGE ERRORS
   - Database connection lost
   - Transaction rollback needed
   - Constraint violations
   - Deadlock detection
   - Disk space exhausted
   - Write permission denied

For EACH error category:
- Define specific exception types
- Implement retry logic where appropriate
- Set up proper logging with context
- Create user-friendly error messages
- Design recovery strategies
- Add monitoring/alerting
- Write comprehensive tests

Error Response Format:
{
  "status": "error",
  "code": "ERR_VALIDATION_001",
  "message": "User-friendly message",
  "details": "Technical details",
  "timestamp": "ISO-8601",
  "trace_id": "uuid",
  "suggestions": ["Actionable", "recovery", "steps"]
}

Test coverage requirement: 100% for all error paths
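The "retry logic where appropriate" requirement usually means exponential backoff for transient errors (rate limits, network timeouts), while validation errors fail fast. A generic Python sketch, with illustrative delays and exception types:

```python
# Generic retry-with-exponential-backoff sketch for transient errors.
# The exception types and delays are illustrative; tune them per the
# error categories above. Not goose-specific code.
import time

def retry(operation, retryable=(TimeoutError, ConnectionError),
          attempts=4, base_delay=0.5):
    """Run operation, retrying transient failures with doubling delays.
    Non-retryable errors (e.g. validation) propagate immediately."""
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))
```

Pairing this with the structured error format above (code, trace_id, suggestions) gives callers both automatic recovery and a clear signal when recovery fails.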

Pattern 5: Documentation-First Development

Generate comprehensive documentation before implementation.

Design and document a feature before implementation:

FEATURE: Advanced search with filters

Step 1: Create comprehensive design document at docs/features/advanced-search.md

REQUIRED SECTIONS:

1. OVERVIEW
   - Business objective
   - User story
   - Success metrics
   - Timeline and milestones

2. TECHNICAL DESIGN
   - Architecture diagram
   - Data flow
   - API specifications (OpenAPI)
   - Database schema changes
   - Cache strategy
   - Security considerations

3. USER EXPERIENCE
   - Wireframes (describe in detail)
   - User flows
   - Error states
   - Loading states
   - Edge cases
   - Accessibility requirements

4. IMPLEMENTATION PLAN
   - Task breakdown
   - Dependencies
   - Testing strategy
   - Deployment approach
   - Rollback plan
   - Feature flag configuration

5. ACCEPTANCE CRITERIA
   - Functional requirements (Given/When/Then)
   - Non-functional requirements
   - Browser compatibility
   - Performance benchmarks
   - Security validation

6. RISKS AND MITIGATIONS
   - Technical risks
   - Business risks
   - Mitigation strategies
   - Contingency plans

After documentation is complete and approved:
- Generate test cases from acceptance criteria
- Implement features matching documentation exactly
- Keep documentation updated with any changes
- Create user-facing documentation

Advanced Workflow Patterns

Pattern 6: Iterative Refinement Loop

Build in explicit review cycles.

Implement OAuth2 authentication with iterative refinement:

ITERATION 1 - BASIC IMPLEMENTATION
Implement minimal OAuth2 flow:
- Authorization endpoint
- Token endpoint
- Basic token validation

After implementation, STOP and:
- Run security scan (npm audit, Snyk)
- Execute test suite
- Check performance
- Review code with OWASP guidelines
Present findings before continuing.

ITERATION 2 - ERROR HANDLING
Based on iteration 1 review:
- Add comprehensive error handling
- Implement retry logic
- Add request logging
- Set up monitoring

After implementation, STOP and:
- Test failure scenarios
- Verify error messages
- Check logging output
- Validate monitoring alerts
Present findings before continuing.

ITERATION 3 - SECURITY HARDENING
Based on iteration 2 review:
- Implement rate limiting
- Add CSRF protection
- Set up token rotation
- Configure secure headers

After implementation, STOP and:
- Run penetration tests
- Verify security headers
- Test rate limiting
- Review audit logs
Present findings before continuing.

ITERATION 4 - OPTIMIZATION
Based on iteration 3 review:
- Add caching layer
- Optimize database queries
- Implement connection pooling
- Configure CDN

After implementation:
- Run load tests
- Compare performance metrics
- Document optimizations
Final review and approval.

Pattern 7: Test-Driven Validation

Generate tests that validate requirements explicitly.

Create comprehensive test suite for user registration:

REQUIREMENTS-BASED TEST GENERATION:

For each requirement in docs/requirements/user-registration.md:
1. Generate test case name following pattern: "should [requirement] when [condition]"
2. Write test implementation
3. Add edge case variations
4. Include performance assertions where relevant

REQUIRED TEST CATEGORIES:

1. HAPPY PATH TESTS
   - Valid registration completes successfully
   - User receives confirmation email
   - User can log in with new credentials
   - Profile data is correctly stored

2. VALIDATION TESTS
   For each field (email, password, name, etc.):
   - Empty value rejection
   - Invalid format rejection
   - Boundary value testing
   - SQL injection attempt rejection
   - XSS attempt rejection
   - Unicode character handling

3. BUSINESS LOGIC TESTS
   - Duplicate email rejection
   - Password strength validation
   - Email verification requirement
   - Terms acceptance validation
   - Age verification (if applicable)

4. SECURITY TESTS
   - Password hashing validation
   - Token generation security
   - Session management
   - Rate limiting enforcement
   - CAPTCHA validation

5. INTEGRATION TESTS
   - Email service integration
   - Database transaction handling
   - External API interactions
   - Event publishing verification

6. PERFORMANCE TESTS
   - Registration completes in < 2 seconds
   - Database queries < 100ms
   - Email queuing < 50ms

TEST QUALITY REQUIREMENTS:
- Each test must be independent
- Use factories for test data
- Clean up after each test
- Mock external dependencies
- Include descriptive failure messages
- Assert all observable outcomes
- Achieve 100% branch coverage

Generate tests BEFORE implementation to serve as specification.

Pattern 8: Contextual Problem-Solving

Provide rich context for better solutions.

Debug production issue with full context:

PROBLEM STATEMENT:
Users report intermittent 500 errors during checkout process.

ENVIRONMENTAL CONTEXT:
- Environment: Production
- Region: US-East-1
- Instance count: 12
- Load: ~5,000 requests/minute
- Error rate: 0.5% (started 2 hours ago)
- No recent deployments

TECHNICAL CONTEXT:
- Stack: Node.js 20, Express 4.18, PostgreSQL 14
- Architecture: Microservices with API Gateway
- Cache: Redis 7.0
- Message Queue: RabbitMQ
- Monitoring: DataDog

ERROR DETAILS:
Error: Connection pool exhausted
  at Pool.connect (pg-pool:123)
  at PaymentService.processPayment (payment-service.js:45)
  at OrderController.checkout (order-controller.js:89)

AVAILABLE LOGS:
- Application logs: /var/log/app/*.log
- Database logs: CloudWatch PostgreSQL logs
- API Gateway logs: /var/log/nginx/access.log
- Queue logs: RabbitMQ management console

MONITORING DATA:
- CPU: 45% average (normal)
- Memory: 78% (slightly elevated)
- Database connections: 95/100 (suspicious)
- Cache hit rate: 85% (normal)
- Response time: 250ms avg (normal, except errors)

RECENT CHANGES:
- Marketing campaign launched 3 hours ago
- Traffic increased 30%
- New product category added yesterday

INVESTIGATION APPROACH:
1. Analyze connection pool configuration
2. Check for connection leaks
3. Review recent query patterns
4. Examine transaction handling
5. Check for long-running queries
6. Verify connection timeout settings
7. Review error correlation with traffic spikes

Provide:
- Root cause analysis
- Immediate mitigation steps
- Long-term solution
- Prevention strategy
- Monitoring improvements

Performance Optimization Techniques

Technique 1: Token-Aware Prompting

.goosehints files consume input tokens with every request, which impacts costs for users paying for LLM access. Optimize for token efficiency.

Before (Token-Heavy):

# Detailed Style Guide (1,500+ tokens)

## Function Naming Conventions
Functions should be named using camelCase...
[200 lines of detailed conventions]

## Variable Naming Conventions
Variables should follow these patterns...
[150 lines of detailed patterns]

## Code Organization
Files should be organized in the following manner...
[300 lines of structure details]

After (Token-Optimized):

# Style Guide (Core - 200 tokens)

Style: Airbnb + Prettier
Naming: camelCase functions, PascalCase classes
Structure: feature-based modules
Tests: co-located *.test.ts
Docs: JSDoc public APIs only

[Detailed conventions in Memory Extension]
Tags: #naming, #structure, #testing
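The savings are easy to estimate with the common rough heuristic of about four characters per token (actual counts depend on the model's tokenizer). A quick back-of-the-envelope helper:

```python
# Rough comparison of .goosehints token cost using the common
# ~4-characters-per-token heuristic. Actual counts depend on the
# model's tokenizer; this is an estimate, not goose's accounting.
def estimate_tokens(text):
    return max(1, len(text) // 4)

def monthly_cost(hints_text, requests_per_month, price_per_1k_tokens):
    """Estimate monthly input-token cost of sending hints with every request."""
    tokens = estimate_tokens(hints_text)
    return tokens * requests_per_month / 1000 * price_per_1k_tokens
```

At 2,000 requests a month, trimming a hints file from roughly 1,500 tokens to 200 cuts that file's recurring input cost by more than 85%.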

Technique 2: Progressive Context Loading

Use @-mentions for frequently-needed documentation to include file content in context, but reference without @ for optional or very large files.

# Project Context (Smart Loading)

## Always Available
@docs/api-core.md              (Essential - auto-load)
@docs/security-requirements.md (Critical - auto-load)

## Load On Demand
Reference docs/api-extended.md      (Use when working on API)
Reference docs/deployment-guide.md  (Use when deploying)
Reference docs/architecture.md      (Use for system design)

## Tags for Memory
Common patterns: #auth-patterns, #db-patterns, #error-handling
Debugging guides: #debug-auth, #debug-database, #debug-api

Technique 3: Session Management

LLMs have context windows, which are limits on how much conversation history they can retain; monitor token usage and start new sessions as needed.

Session Planning Strategy:

# Session 1: Architecture Design
- Design system architecture
- Create API specifications
- Document data models
Total estimated tokens: ~20,000
[Start new session after this]

# Session 2: Implementation (Backend)
- Implement API endpoints
- Write tests
- Set up database
Total estimated tokens: ~30,000
[Start new session after this]

# Session 3: Implementation (Frontend)
- Build UI components
- Integrate with API
- Write E2E tests
Total estimated tokens: ~25,000

# Session 4: Review & Refinement
- Code review
- Performance optimization
- Documentation updates

Advanced Extension Management

Selective Extension Enablement

Turning on too many extensions can degrade performance; enable only essential extensions and tools to improve tool selection accuracy and save context window space.

Strategic Extension Configuration:

# Task-Specific Extension Profiles

## Profile: Backend Development
Extensions:
- Developer (core)
- Database MCP
- Git MCP
- Testing tools
Disabled: Browser, Design tools

## Profile: Frontend Development
Extensions:
- Developer (core)
- Browser Controller
- Component library MCP
- Design system MCP
Disabled: Database tools, Infrastructure

## Profile: DevOps
Extensions:
- Developer (core)
- Docker MCP
- Kubernetes MCP
- Cloud provider MCP
Disabled: UI tools, Testing frameworks

Switch profiles based on task:
`goose configure` → Enable relevant extensions only

Custom MCP Integration

Build custom MCP servers for domain-specific tools.

Create custom MCP server for internal tools:

PURPOSE: Integrate proprietary deployment system

TOOLS TO EXPOSE:
1. deploy_to_staging(service, version, config)
2. promote_to_production(service, version, approvals)
3. rollback_deployment(service, target_version)
4. get_deployment_status(service, environment)
5. trigger_health_check(service, environment)

IMPLEMENTATION REQUIREMENTS:
- OAuth authentication with internal IdP
- Rate limiting (10 requests/minute)
- Comprehensive error handling
- Audit logging for all operations
- Dry-run mode for testing
- Approval workflow integration

INTEGRATION WITH goose:
- Add to .goose/config.yaml
- Document in .goosehints
- Create usage examples
- Set up monitoring

Enable goose to handle deployment workflows end-to-end:
"Deploy user-service v2.3.4 to staging, run tests, and promote to production if tests pass"
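At its core, an MCP server registers named tools and dispatches incoming calls to them. The plain-Python stand-in below illustrates only that shape: a real server would be built on an MCP SDK, and the tool body here is a hypothetical placeholder.

```python
# Simplified stand-in for an MCP-style tool registry. A real server would
# use an MCP SDK; this only shows the registration/dispatch pattern.
# The tool name mirrors the hypothetical deployment tools listed above.
class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def tool(self, name):
        """Decorator that registers a callable as a named tool."""
        def register(fn):
            self._tools[name] = fn
            return fn
        return register

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)

registry = ToolRegistry()

@registry.tool("get_deployment_status")
def get_deployment_status(service, environment):
    # Placeholder: a real tool would query the internal deployment system.
    return {"service": service, "environment": environment, "status": "healthy"}
```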

Security Best Practices

Pattern 9: .gooseignore for Protection

goose can be eager to make changes; you can stop it from changing specific files by creating a .gooseignore file.

# .gooseignore - Critical File Protection

# Secrets and credentials
.env
.env.*
**/secrets/**
**/credentials/**
**/*_key.pem
**/*_secret.*

# Configuration
**/production.config.*
**/prod.*.yaml
kubernetes/prod/**

# Database
**/migrations/*.sql
**/seeds/production/**

# CI/CD
.github/workflows/production.yml
.circleci/config.yml

# Documentation
docs/architecture/decisions/**
CHANGELOG.md
LICENSE

# Generated files
**/dist/**
**/build/**
**/coverage/**
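A simplified way to check whether a path is covered by such patterns is glob matching. Real .gooseignore semantics follow gitignore-style rules, so this fnmatch-based sketch only approximates the common cases shown above:

```python
# Simplified check of whether a path is protected by ignore patterns.
# Real .gooseignore matching follows gitignore-style rules; fnmatch here
# only approximates common cases like ".env", "**/secrets/**", "*_key.pem".
from fnmatch import fnmatch

def is_ignored(path, patterns):
    for pattern in patterns:
        if fnmatch(path, pattern):
            return True
        # "**/x" patterns should also match paths with no leading directory
        if pattern.startswith("**/") and fnmatch(path, pattern[3:]):
            return True
    return False
```

When in doubt, prefer the broader pattern: a false positive just means goose leaves a file alone that you must edit yourself.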

Pattern 10: Permission-Based Workflows

You can customize supervision levels - Auto Mode for full autonomy, or require approval before actions.

Risk-Based Permission Strategy:

High-Risk Operations (Require Approval):
- Database migrations
- Production deployments
- Dependency updates
- Security-related changes
- API contract modifications

Medium-Risk Operations (Notify):
- Test modifications
- Documentation updates
- Configuration changes
- Refactoring

Low-Risk Operations (Auto-Approve):
- Code formatting
- Comment additions
- Variable renaming
- Import organization
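This strategy is straightforward to encode as a lookup that defaults to the safest option. The categories mirror the lists above; the mechanism itself is illustrative, not a built-in goose feature.

```python
# Sketch of a risk-based permission gate: map operation categories to an
# action (require approval / notify / auto-approve). Categories mirror
# the lists above; this is illustrative, not a built-in goose feature.
RISK_POLICY = {
    "database_migration": "require_approval",
    "production_deploy": "require_approval",
    "dependency_update": "require_approval",
    "test_modification": "notify",
    "documentation_update": "notify",
    "code_formatting": "auto_approve",
    "import_organization": "auto_approve",
}

def permission_for(operation):
    """Unclassified operations default to the safest option."""
    return RISK_POLICY.get(operation, "require_approval")
```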

Measuring Success

Prompt Quality Metrics

Track improvement over time:

Evaluate prompt effectiveness:

METRIC 1: First-Try Success Rate
- Track: % of tasks completed without iteration
- Target: > 80% for routine tasks
- Measure: Weekly review of session logs

METRIC 2: Token Efficiency
- Track: Average tokens per completed task
- Target: Reduce by 20% over 3 months
- Measure: Monitor LLM API costs

METRIC 3: Context Relevance
- Track: % of .goosehints content used per session
- Target: > 60% utilization
- Measure: Review context window usage

METRIC 4: Error Recovery
- Track: Steps required to recover from errors
- Target: < 3 steps average
- Measure: Count correction cycles

METRIC 5: Task Complexity Handling
- Track: Successfully completed complex tasks
- Target: Increase by 50% over 6 months
- Measure: Define complexity rubric
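Metric 1, for example, reduces to a short calculation over your session records. The record format below is invented for illustration; adapt it to whatever your session logs actually capture.

```python
# Sketch of computing METRIC 1 (first-try success rate) from session
# records. The record format is invented for illustration.
def first_try_success_rate(sessions):
    """sessions: iterable of dicts with an 'iterations' count per task.
    A task counts as a first-try success when it needed one iteration."""
    sessions = list(sessions)
    if not sessions:
        return 0.0
    successes = sum(1 for s in sessions if s["iterations"] == 1)
    return successes / len(sessions)
```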

Conclusion

Advanced prompt engineering for goose requires understanding its agentic nature, strategically managing context, and designing prompts that guide rather than restrict. By implementing these techniques:

  1. Think in tasks, not conversations - Structure prompts as clear, actionable tasks with success criteria
  2. Architect your context - Use hierarchical .goosehints, Memory Extension, and @-mentions strategically
  3. Optimize for tokens - Balance comprehensive context with cost efficiency
  4. Build in validation - Include explicit review gates and multi-perspective analysis
  5. Iterate systematically - Use structured refinement loops for complex implementations
  6. Measure and improve - Track metrics to continuously enhance your prompting strategy

The most effective goose users treat prompting as a systematic engineering discipline, applying the same rigor to AI interaction as they do to software development. Start with these patterns, experiment with variations, and develop your own sophisticated techniques based on your specific workflows.


