Workflow 1: Evaluating a New Feature
Time: 30 minutes | Goal: Define quality for a new AI feature
Steps
1. Create Project (2 min)
- Name: “Feature Name - Evaluation”
- Write initial system prompt based on requirements
2. Add Scenarios (5 min)
- 15-20 real examples from product spec
- Include edge cases and boundary conditions (see the sketch after these steps)
3. Generate & Rate (15 min)
- Generate outputs
- Rate all outputs (5 min)
- Add feedback on low ratings
4. Extract Patterns (2 min)
- Run extraction
- Review quality patterns
5. Document Findings (5 min)
- Export golden examples
- Share with engineering team
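Scenario sets like the one in step 2 can be drafted as plain data before they go into the tool. A minimal sketch for a hypothetical support-bot feature; the field names are illustrative only, not the tool's import schema:

```python
# Hypothetical scenario set for a support-bot evaluation.
# Field names are illustrative only, not the tool's import schema.
scenarios = [
    # Happy path: typical requests lifted from the product spec
    {"input": "How do I reset my password?", "type": "happy_path"},
    {"input": "Where can I download last month's invoice?", "type": "happy_path"},
    # Edge cases: vague, hostile, or ambiguous requests
    {"input": "it doesnt work", "type": "edge_case"},
    {"input": "Cancel everything and refund me NOW.", "type": "edge_case"},
    # Boundary conditions: the limits of what the feature should handle
    {"input": "Summarize all 400 of my open tickets.", "type": "boundary"},
]
```

Aiming for 15-20 entries with a deliberate mix of types keeps the later pattern extraction meaningful.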
Output
Clear quality definition ready for implementation.
Workflow 2: Iterating on Existing Prompt
Time: 20 minutes per iteration | Goal: Improve an underperforming feature
Steps
1. Rate Current Outputs (5 min)
- Open existing project
- Rate all outputs if not already done
2. Extract Patterns (2 min)
- Identify failure clusters
- Review root causes
3. Apply Suggested Fix (3 min)
- Click “Apply Fix & Retest”
- Review prompt changes
- Confirm update
4. Rate New Outputs (5 min)
- Rate regenerated scenarios
- Check if quality improved
5. Iterate or Ship (5 min)
- If >90% success: Export and ship
- If <90%: Repeat steps 2-5
Expected Result
Success rate typically improves 10-20% per iteration.
Workflow 3: Collaborative Evaluation
Time: Ongoing | Goal: Team alignment on quality standards
Setup
- Create project (PM)
- Add scenarios (PM or team)
- Generate outputs (PM)
- Share project with team (PM)
- Each team member rates independently (all)
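Once everyone has rated, comparing scores per scenario shows where standards diverge. A minimal sketch, assuming each rater's scores can be collected as scenario-to-rating maps on a 1-5 scale (the data shape is an assumption, not the tool's export format):

```python
# Flag scenarios where team members' ratings diverge.
# The 1-5 scale and the data shape are assumptions, not the tool's format.
ratings = {
    "pm":     {"reset_password": 5, "refund_request": 2, "vague_bug_report": 4},
    "design": {"reset_password": 5, "refund_request": 4, "vague_bug_report": 2},
    "eng":    {"reset_password": 4, "refund_request": 2, "vague_bug_report": 2},
}

for scenario in next(iter(ratings.values())):
    scores = [scored[scenario] for scored in ratings.values()]
    if max(scores) - min(scores) >= 2:  # wide spread: likely a subjective standard
        print(f"Discuss {scenario!r}: ratings {scores}")
```

Scenarios everyone scores the same point to objective failures; wide spreads are the subjective calls to align on.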
Why It Works
- Multiple perspectives on quality
- Discover subjective vs. objective failures
- Align team on standards
Deliverable
- Consensus on quality definition
- Points of disagreement documented
- Clear behavioral spec
Workflow 4: Comparing Two Models
Time: 30 minutes | Goal: Decide between GPT-4 and Claude
Steps
1. Create Two Projects
- Project A: GPT-4
- Project B: Claude
- Same system prompt and scenarios
2. Generate Outputs (3 min)
- Run Generation on both
3. Rate Independently (15 min)
- Rate Project A outputs
- Rate Project B outputs
4. Compare Results (5 min)
- Check success rates
- Read actual outputs
- Assess quality differences
5. Document Decision (7 min)
- Model choice
- Success rate difference
- Quality rationale
Quick Decision Matrix (example)
| Metric | GPT-4 | Claude |
|---|---|---|
| Quality (%) | 92% | 88% |
| Speed | Slower | Faster |
With 15-20 scenarios per project, a four-point quality gap is roughly one differently rated scenario, so weigh the actual outputs as heavily as the percentages.
Workflow 5: Export for CI/CD Integration
Time: 10 minutes | Goal: Get test suite into engineering workflow
Prerequisites
- Achieved >90% success rate
- Have golden examples and failure patterns
Steps
- Go to Insights
- Click “Export”
- Choose “Test Suite (pytest)”
- Download JSON
- Share with engineering
Engineering Integration
Engineers can now run the exported cases as part of their test suite.
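A minimal sketch of what that could look like, assuming the exported JSON is saved as `exported_test_suite.json` and exposes a list of cases; the file name, JSON shape, and `generate()` helper are placeholders to adapt to the actual export, not the tool's documented format:

```python
# Hypothetical pytest suite replaying exported evaluation cases.
# File name, JSON shape, and generate() are placeholders, not the tool's format.
import json
import pytest

with open("exported_test_suite.json") as f:
    CASES = json.load(f)["cases"]

def generate(system_prompt: str, user_input: str) -> str:
    """Replace with a call to your production model."""
    raise NotImplementedError

@pytest.mark.parametrize("case", CASES)
def test_matches_golden_example(case):
    output = generate(case["system_prompt"], case["input"])
    # Minimal check: required phrases from the golden example must appear.
    # Real suites often use semantic similarity or rubric scoring instead.
    for phrase in case["required_phrases"]:
        assert phrase in output, f"missing: {phrase!r}"
```

Wired into CI, a failing case flags a prompt regression before it ships.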
Workflow 6: Testing Multiple Variations
Time: 40 minutes | Goal: A/B test different prompt versions
Steps
1. Create Base Project (5 min)
- Name: “Support Bot - Base”
- Initial system prompt
- Add 15 scenarios
2. Generate & Rate Base (15 min)
- Generate outputs
- Rate all outputs
- Note the success rate (e.g., 70%)
3. Create Variation Project (5 min)
- Clone scenarios
- Modify system prompt (e.g., “be more casual”)
- Generate
4. Rate Variation (10 min)
- Rate new outputs
- Compare success rates (see the sketch below)
5. Choose Winner (5 min)
- Which variation performed better?
- Update production prompt
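To pick the winner, compute each project's success rate from its ratings and compare. A minimal example assuming a 1-5 rating scale where 4 and above counts as success (both the scale and the threshold are assumptions to match to your rubric):

```python
# Compare two prompt variations by success rate.
# The 1-5 scale and the >=4 success threshold are assumptions.
def success_rate(ratings: list[int], threshold: int = 4) -> float:
    return sum(r >= threshold for r in ratings) / len(ratings)

base      = [5, 4, 2, 4, 5, 3, 4, 4, 2, 5, 4, 3, 4, 5, 4]  # "Support Bot - Base"
variation = [5, 5, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4, 4, 5, 4]  # casual-tone variation

rate_base, rate_var = success_rate(base), success_rate(variation)
print(f"Base: {rate_base:.0%}  Variation: {rate_var:.0%}")
print("Winner:", "variation" if rate_var > rate_base else "base")
```

Here the base lands near the 70% from step 2 and the variation clears 90%, so the variation would ship.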
Example Variations
- Version A: Formal tone
- Version B: Casual tone
- Version C: Balanced
Quick Reference: Common Scenarios by Role
Product Manager
- Daily: Rate outputs as they’re generated
- Weekly: Run extraction to find patterns
- Monthly: Export insights to engineering
Design Lead
- Discovery: Define tone and personality
- Evaluation: Rate based on brand alignment
- Feedback: Add feedback explaining brand misalignment
Engineering Lead
- Review: Examine extracted patterns
- Implement: Build bot using specifications
- Test: Run exported test suite in CI/CD
Customer Support Lead
- Input: Provide real support questions
- Rating: Rate responses from customer perspective
- Feedback: Explain what customers expect
Tips for Smooth Workflows
Naming Convention: Use a consistent project naming scheme (e.g., “Support Bot - Base” and “Support Bot - Casual” for a variation)
Troubleshooting Common Workflow Issues
Issue: Patterns Not Found
Cause: Fewer than 15 rated outputs, or all ratings are high
Fix: Add more scenarios and rate them; extraction needs a minimum of 15-30 ratings.
Issue: Team Has Different Standards
Cause: Subjective quality definitions
Fix:
- Compare ratings
- Discuss disagreements
- Document final standard
- Re-rate together if needed
Issue: Iterations Not Improving Quality
Cause: Root cause not properly addressed
Fix:
- Review failure cluster reasoning
- Make larger prompt changes
- Add more specific instructions