UX Meets AI:
Crafting Feedback for an AI Coach

[Overview]

The Context

Coachi is an AI-powered ski coaching app designed to help recreational skiers improve their technique through video-based analysis and personalised guidance. The app uses two forms of AI:

1. Body-Tracking AI – Analyses user-submitted ski videos to detect key movement metrics such as fore–aft balance, lateral balance, stance width, and arm position.

2. AI Coach (Chat AI) – Interprets the output from the body-tracking system, along with contextual inputs like slope difficulty, turn type, and confidence level, to generate natural-language coaching feedback — mimicking the tone, clarity, and instructional approach of a real ski instructor.

My role focused on testing and refining the AI Coach’s feedback system — ensuring it delivers advice that is technically accurate, emotionally intelligent, and easy for beginner and intermediate skiers to act on.

https://www.coachiapp.com/


Problem Statement

AI feedback needs to feel human, useful, and correct: Early versions of the AI Coach produced feedback that was overly technical, occasionally inaccurate, and difficult for beginners to follow.

Users need clarity, not complexity: Many learners were unfamiliar with ski jargon, and drills were sometimes too detailed or misaligned with their skill level.

The challenge: humanise and validate the AI’s responses: The AI needed to deliver concise, emotionally intelligent guidance that reflected real ski instruction — not just algorithmic logic.


Skills Demonstrated

UX Testing for AI

Content Design

Human-Computer Interaction (HCI)

Human-Centred AI

UX Research

[Impact]

Simplified Feedback Logic

Refined biomechanical metrics and feedback language to remove jargon, align with BASI standards, and improve learner comprehension.

Realistic, Actionable Drills

Tested and corrected AI-generated drills to ensure they matched real-world ski instruction and were safe, simple, and effective.

Human-Centred Coaching Tone

Redesigned the AI’s tone to feel supportive, confident, and natural — replicating the encouragement learners expect from instructors.

[My Process]

1. Designing a Realistic Test Framework

I collaborated with the lead developer to create a custom browser-based simulator that let me input mock skier profiles and test how the AI Coach generated feedback across different performance scenarios.

The simulator was built around three core input areas that the AI Coach uses to construct its response:

1. Skier Choices:
  • Slope gradient (e.g. green, blue)

  • Intended turn type (Snowplough or Parallel)

  • Confidence level (Likert scale)

2. Detected Properties:
  • Whether the Body-Tracking AI detected Snowplough or Parallel skis

3. Metrics:
  • Individual scores for four movement metrics: fore–aft balance, lateral balance, stance width, and arm position

  • An overall performance score

Once the inputs were set, the AI Coach would generate a text-based feedback response, structured into three sections:

  • Summary – A short headline outlining what the skier did well and what to improve

  • Comment – A more detailed explanation of the performance

  • Drill – A suggested practice task to help the skier improve

This simulation setup allowed me to rigorously test how the AI handled varied performance profiles — ensuring the feedback felt accurate, structured, and context-aware.
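The input areas and three-part response described above can be sketched as data structures. This is a minimal illustrative model, not the app's actual schema: the class and field names (`SkierProfile`, `CoachFeedback`, `mock_feedback`) are my own, and the rule-based generator is a toy stand-in for the real AI Coach, useful for shaping test cases.

```python
from dataclasses import dataclass, field

@dataclass
class SkierProfile:
    slope: str                 # skier choice: e.g. "green", "blue"
    intended_turn: str         # "snowplough" or "parallel"
    confidence: int            # Likert scale, e.g. 1-5
    detected_turn: str         # detected property from the Body-Tracking AI
    metrics: dict = field(default_factory=dict)  # four movement metrics, 0-100

    @property
    def overall_score(self) -> float:
        # Overall performance score: a plain average, for illustration only
        return sum(self.metrics.values()) / len(self.metrics)

@dataclass
class CoachFeedback:
    summary: str   # short headline: strength + area to improve
    comment: str   # more detailed explanation of the performance
    drill: str     # suggested practice task

def mock_feedback(profile: SkierProfile) -> CoachFeedback:
    """Toy rule-based stand-in for the AI Coach, used to shape test scenarios."""
    weakest = min(profile.metrics, key=profile.metrics.get)
    strongest = max(profile.metrics, key=profile.metrics.get)
    return CoachFeedback(
        summary=f"Strong {strongest}; next, work on your {weakest}.",
        comment=(f"On the {profile.slope} slope your {strongest} looked solid, "
                 f"but your {weakest} score suggests room to improve."),
        drill=f"Try a few relaxed {profile.intended_turn} turns focusing on {weakest}.",
    )

profile = SkierProfile(
    slope="blue", intended_turn="parallel", confidence=3,
    detected_turn="parallel",
    metrics={"fore-aft balance": 72, "lateral balance": 55,
             "stance width": 80, "arm position": 64},
)
fb = mock_feedback(profile)
print(fb.summary)  # → "Strong stance width; next, work on your lateral balance."
```

Modelling scenarios this way made it easy to vary one input at a time (e.g. a mismatch between intended and detected turn type) and observe how the response changed.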


2. Evaluating the AI Coach’s Responses

Once the test environment was in place, I began evaluating the AI Coach’s responses across three critical dimensions to ensure feedback was both credible and user-friendly:

Technical Accuracy:

I compared biomechanical advice (e.g. edge use, stance, arm position) against the BASI instructor manual and internal performance scoring logic, identifying any incorrect or misleading cues.

Clarity & Tone:

I assessed whether the feedback sounded human, supportive, and confidence-building — rewriting sections that used jargon or felt overly robotic or clinical.

Instructional Usefulness:

I reviewed drills to confirm they were realistic, safe to perform on the slope, and appropriate for the skier’s confidence level and ability.
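The three review dimensions lend themselves to a simple scorecard per test scenario. The sketch below is illustrative: the field names, 1-5 scale, and rework threshold are my own conventions, not the project's actual review instrument.

```python
from dataclasses import dataclass

@dataclass
class ResponseReview:
    scenario_id: str
    technical_accuracy: int        # 1-5: matches BASI guidance and scoring logic
    clarity_and_tone: int          # 1-5: human, supportive, jargon-free
    instructional_usefulness: int  # 1-5: drill is realistic, safe, level-appropriate
    notes: str = ""

    def needs_rework(self, threshold: int = 3) -> bool:
        # Any single dimension below the threshold sends the response back for rewriting
        return min(self.technical_accuracy,
                   self.clarity_and_tone,
                   self.instructional_usefulness) < threshold

reviews = [
    ResponseReview("low-confidence-beginner", 4, 2, 4,
                   notes="Uses 'angulation' without explanation"),
    ResponseReview("mismatched-turn-type", 5, 4, 4),
]
flagged = [r.scenario_id for r in reviews if r.needs_rework()]
print(flagged)  # → ['low-confidence-beginner']
```

Scoring each dimension separately, rather than giving one overall grade, made it clear whether a weak response needed a factual correction or only a rewrite of its tone.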


3. Prioritising and Actioning Fixes

Following initial testing, I prioritised key issues and made targeted refinements to improve clarity, tone, and instructional quality. The most critical fixes included:

A – Non-humanised Language: Reworked robotic phrasing to sound more natural, encouraging, and instructor-like.

B – Referencing Instructional Models: Removed confusing mentions of teaching frameworks (e.g. “based on the Central Theme”) that added cognitive load.

C – Prescribing Turn Counts: Replaced rigid instructions like “do 6 turns” with more flexible, intent-based guidance.

D – Overly Wordy Feedback: Tightened long blocks of text into concise, scannable coaching points.

Testing is ongoing, and improvements have been made across all four areas — the AI Coach now feels more natural, focused, and learner-friendly. However, it still occasionally references pedagogical models in its feedback, which remains a priority for refinement in future iterations.
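Some of these fix categories can be caught mechanically. The toy lint pass below mirrors categories B, C, and D with pattern and length checks; the regexes and word threshold are illustrative, not the project's actual rules, and tone issues (A) still needed human review.

```python
import re

# Pattern checks for two of the fix categories; tone (A) is not regex-detectable.
CHECKS = {
    "B: references instructional model": re.compile(r"central theme|teaching model", re.I),
    "C: prescribes rigid turn count": re.compile(r"\bdo \d+ turns\b", re.I),
}

def lint_feedback(text: str, max_words: int = 60) -> list[str]:
    """Return the issue labels a draft coaching response would be flagged for."""
    issues = [label for label, pattern in CHECKS.items() if pattern.search(text)]
    if len(text.split()) > max_words:  # D: overly wordy feedback
        issues.append("D: overly wordy feedback")
    return issues

draft = ("Based on the Central Theme, do 6 turns on the blue run "
         "while keeping your hands forward.")
print(lint_feedback(draft))
# → ['B: references instructional model', 'C: prescribes rigid turn count']
```

A check like this could run over every generated response in the simulator, turning one-off manual findings into a repeatable regression test.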


[Key Learnings]

Designing for AI Requires Instructor Empathy

Testing an AI coach means thinking like both a UX designer and a ski instructor — balancing accuracy with approachability.

Tone Matters as Much as Accuracy

Even correct feedback can fall flat if the tone isn’t encouraging. Language design is crucial when replicating human expertise.

Simulation is a Powerful Tool for UX Testing AI

Creating realistic test conditions allowed me to stress-test the system, spot logic failures, and iterate content without needing live users.

