Immersive Music Experience and Lyric Translation With Vision Pro
Domain
Vision Pro · Apple Music · VR
What Happened
I explored how Apple Vision Pro could transform music interaction by designing immersive experiences that go beyond listening. The concept introduced multimodal features such as humming-based song search, real-time lyric translation, immersive video playback, and an AI-powered music chatbot to expand how users discover and enjoy music.
Role
Product Design · UI/UX · Research
What I Did
Defined core music interaction scenarios and mapped user journeys in a spatial interface. Designed and prototyped flows for humming search, immersive video playback, and real-time lyric translation. Built a concept model to unify current and future music features, and created high-fidelity screens to demonstrate how Vision Pro can deliver a seamless and immersive music experience.

Overview
Eager to be surrounded by a space that matches the vibe of your favorite song?
Curious about the deeper meaning of foreign lyrics?
With Vision Pro, dive into an immersive music experience with real-time lyric translations, allowing you to feel every beat and word visually.
Problem
When listening to foreign music, constantly looking up lyric translations breaks the flow and diminishes immersion. Listeners need a seamless way to understand lyrics in real time while staying fully engaged in the music.


Platforms like Genius and Musixmatch highlight the growing demand for lyric access, allowing millions of users to better connect with foreign music.
Focus
How might we create a seamless and immersive music experience that visually connects lyrics to the song’s atmosphere while enhancing emotional and cultural understanding?
Solution
Step into the music, feel the meaning
Immersive and in real-time with Vision Pro
I designed an immersive music experience for Apple Vision Pro, prioritizing user needs and seamless interaction. This project revealed that users value immersive music enjoyment over standalone lyric translation. It allowed me to explore the mental models of new devices and adapt designs to Vision Pro's unique capabilities.
Competitor Analysis
I conducted a detailed analysis of key players in the lyric translation space, focusing on Musixmatch, Genius, and YouTube personal lyric videos. The goal was to identify strengths, weaknesses, and opportunities for innovation to create a more immersive and seamless experience for foreign music listeners.
Opportunities
Immersive Music Experience
Create a visually immersive experience that connects the music's atmosphere with real-time lyrics.
Seamless User Experience
Ensure all features are integrated within a single platform to maintain uninterrupted music flow.
Context-Rich Lyric Translation
Provide translations that convey emotional and cultural depth, enriching the music experience beyond simple text.
YouTube Lyric Videos
User-created videos featuring translated lyrics alongside the music, often with engaging visual elements.
Strength: Combines visuals and lyrics for an enhanced viewing experience.
Weakness: Quality and accuracy depend on the creator, and there are potential copyright issues.

Genius
Provides lyrics with in-depth annotations and explanations, integrated with Spotify to show lyrics and background insights simultaneously.
Strength: Rich explanations and background information for a deeper understanding of music.
Weakness: Limited support for non-English languages and inconsistent lyric translations.

Musixmatch
The world's largest lyrics platform, integrated with Spotify and Apple Music for real-time synced lyrics. Users can contribute by translating or editing lyrics, supporting over 62 languages.
Strength: Extensive language support and crowd-sourced accuracy.
Weakness: Requires a separate app and disrupts the listening experience by switching between platforms.
User Interviews
User interviews revealed that it’s not just about translating lyrics. Listeners are looking for a deeper, more immersive music experience where lyrics add to the mood without breaking the flow.
This shifted the focus to creating a seamless, visually engaging environment that lets users fully dive into the music.
🗣️
“When I'm curious about the meaning of lyrics while listening to foreign music, I have to leave the streaming site and search for a translation.”
Listening Disruption
“I prefer translated lyrics that match the mood of the music, ideally presented visually, rather than simple translations from a translator.”
Mood Alignment
“Lyrics are secondary; feeling the atmosphere of the song is more important for fully enjoying the music.”
Immersive Experience
Research Synthesis
I used an affinity diagram to organize and analyze user insights, identify key themes, and brainstorm feature ideas. From this process, I derived several essential components for the service.
Core Ideas
Seamless Music Experience
Features that ensure uninterrupted music flow, like integrated real-time lyric translations.
Immersive Visuals
Elements that visually connect lyrics to the music’s atmosphere, enhancing user engagement.



Choosing the Right Context
Platform and Device Decisions
Based on these insights, I determined that Apple Music would be the ideal streaming platform for integration, given its widespread use and compatibility.
To deliver the immersive experience I envisioned, I chose Apple Vision Pro as the primary device, leveraging its advanced visual and spatial capabilities.
User Scenario
In crafting this user scenario, I outlined how each feature contributes to a seamless and immersive music experience. I detailed the user’s journey from discovering songs through humming to engaging with real-time lyric translations and a visually rich environment.
#1 Melody Recall: The user remembers a catchy foreign song but can't recall the title.
#2 Humming Search: The user hums the melody, and the system quickly identifies the song.
#3 Song Match: The matched song is displayed, ready to be played.
#4 Playback: The user starts listening to the song.
#5 Lyric Sync: The lyrics appear in real time, perfectly synchronized with the music.
#6 Gesture Query: Curious about a specific lyric, the user uses a hand gesture to explore further.
#7 Chatbot Insight: The chatbot responds with an explanation of the selected lyric.
#8 Deeper Context: The chatbot provides cultural references and deeper meanings behind the lyrics.
#9 Mood Visuals: The immersive environment enhances the song's mood with matching visuals.
#10 Full Immersion: The user feels fully engaged, enjoying both the audio and visual experience.
#11 Seamless Switch: The user easily switches to another song, repeating the immersive experience.
#12 Experience Reflection: The user leaves feeling enriched, fully enjoying music from any language.
Concept Model
I created this concept model to capture the key features I designed, such as humming-based song search, real-time lyric translation, an AI-powered chatbot, and immersive video playback. It also maps future possibilities, showing my vision for a seamless, immersive music experience on Apple Vision Pro.

Task Flow
Seamless Music Interaction Journey
Users can hum a melody to discover songs effortlessly, view real-time lyric translations, and dive into immersive environments that match the music’s mood. The journey connects music discovery with deeper enjoyment, making listening more engaging and seamless.

Usability Testing
A heuristic evaluation revealed low scores in visibility of system status, logical task flow, and visual feedback during progress. To address this, I improved the chatbot and lyric selection UI to enhance the overall user experience.
As-is
Once users entered the chatbot after selecting a lyric, they could not change their selection, leading to frustration if they wanted to explore different lyrics.
To-be
I added an "Edit" button that allows users to change their selected lyrics even after entering the chatbot.

As-is
Users were uncertain how to enter the chatbot during the lyric selection process and unsure where to select lyrics, which made navigation unclear.
To-be
I introduced clear instructions to guide users through the lyric selection process. Additionally, I used a box format to highlight the selectable lyrics area and incorporated interactive elements to encourage users to make a selection.
Final Outcome
Humming Search
Users can find a song just by humming its melody, even when they don’t know the title or lyrics.


Can’t remember the title or lyrics?
Cover one ear and hum the melody, and we'll find the perfect match for you.
We provide clear and friendly guidance for using gestures to make your experience seamless.
As you hum, a sound wave visualization appears at the bottom, giving you real-time feedback on your input.
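As a rough illustration of how a humming search might rank candidates behind the scenes (a shipping version would rely on an audio recognition service; every song name and melody below is hypothetical sample data), a minimal pitch-contour comparison could look like this:

```python
from dataclasses import dataclass

@dataclass
class Song:
    title: str
    contour: list[int]  # relative pitch steps: +1 up, -1 down, 0 same

def contour_distance(a: list[int], b: list[int]) -> int:
    # Compare two pitch contours, padding the shorter with "same pitch" steps.
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return sum(abs(x - y) for x, y in zip(a, b))

def match_humming(hummed: list[int], catalog: list[Song]) -> Song:
    # Return the catalog song whose melodic contour is closest to the hum.
    return min(catalog, key=lambda s: contour_distance(hummed, s.contour))

catalog = [
    Song("Song A", [1, 1, -1, 0, 1]),
    Song("Song B", [-1, -1, 1, 1, 0]),
]
print(match_humming([1, 1, -1, 0, 0], catalog).title)  # prints Song A
```

The sound-wave visualization in the UI maps onto the same idea: the hum is reduced to a sequence of pitch movements that can be scored against a catalog.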
Final Outcome
Immersive Video Playback
Users can listen to music in visually immersive environments that match the song’s mood or setting, enhancing the overall experience.
Step into a world that matches the vibe of your music.
Experience your playlist in stunning, immersive environments.
Preview and choose your desired virtual space to match the mood of your music.

While listening to music, clench and release your fist facing forward to enter the immersive music experience space.
Final Outcome
Real-Time Lyric Translation
Users can see lyrics translated in real time, allowing them to understand and enjoy foreign songs without leaving the platform.

Pinch and drag real-time translated lyrics to explore deeper meanings with our chatbot.

Enjoy foreign music without language barriers.
See lyrics translated in real time, synced perfectly with the song.
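The design calls for translated lyrics that stay perfectly synced with playback. One common way to implement that lookup, sketched here with hypothetical timestamped data, is a binary search over pre-translated lines keyed by start time:

```python
import bisect

# Timestamped, translated lyric lines: (start_sec, original, translation).
lyrics = [
    (0.0,  "La vie en rose",            "Life in pink"),
    (12.5, "Quand il me prend dans ses bras", "When he takes me in his arms"),
    (24.0, "Il me parle tout bas",      "He speaks to me softly"),
]
times = [t for t, _, _ in lyrics]

def current_line(playback_time: float):
    # Find the last lyric line whose start time is <= the playhead position.
    i = bisect.bisect_right(times, playback_time) - 1
    return lyrics[max(i, 0)]

print(current_line(15.0)[2])  # prints the translation of the active line
```

Because the lookup is O(log n), it can run on every playback tick without the lag that would break the sense of lyrics appearing "perfectly synchronized."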
Final Outcome
AI-Powered Chatbot
Users can ask questions about lyrics and receive cultural, contextual, or detailed explanations through an interactive chatbot.

Auto-complete suggested questions are provided for a smoother experience.
Curious about a lyric’s meaning or context?
Highlight the text and let our chatbot provide the answers you need.
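The auto-complete suggestions above can be approximated by a simple case-insensitive prefix match over canned chatbot questions; the question list here is hypothetical sample data:

```python
SUGGESTIONS = [
    "What does this lyric mean?",
    "What is the cultural context of this line?",
    "Who wrote this song?",
]

def autocomplete(prefix: str, limit: int = 3) -> list[str]:
    # Return up to `limit` suggested questions starting with the typed prefix.
    p = prefix.lower()
    return [s for s in SUGGESTIONS if s.lower().startswith(p)][:limit]

print(autocomplete("what"))
```

A production version would rank suggestions by relevance to the highlighted lyric, but prefix filtering captures the "smoother experience" the design aims for.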

Reflection
This project underscored the critical role of user research. Initially, I assumed the main problem was the inconvenience of lyric translation, but research revealed that users were more interested in a fully immersive music experience. Additionally, designing for an emerging device like Vision Pro challenged me to understand new mental models and consider unique aspects of VR interface design. It was also a valuable opportunity to deepen my understanding of Apple’s design system and how to adapt existing guidelines for a cutting-edge platform.