Smarter Streaming
As streaming libraries grow, viewers increasingly search by remembering moments, scenes, or visual details rather than exact titles. This project explores how AI-driven search experiences can support natural discovery through Smart Search and frame-based exploration in OTT platforms.
Concept Project
Visual/UX Design
Duration - 1 Month
Tools Used - Figma, ChatGPT
From search to discovery
Traditional search feels like a rigid filing system; if you don't recall the exact title or actor, you're stuck. It fails to bridge the gap between a vivid memory of a scene and the actual movie
Searching Like We Remember: Search should speak the viewer’s language, prioritizing how we actually recall stories over how databases categorize them
The screen is currently a one-way street where spotting something you love leads to a dead end or a disruptive manual search
A Living, Interactive Frame: Turning every frame into a two-way conversation, allowing viewers to satisfy their curiosity instantly without ever breaking the story's flow
Design goals
Reduce search friction without interrupting the viewing experience
Support discovery through natural language and visual thinking
Introduce AI without disrupting familiar OTT behaviors
Keep AI optional, explainable, and non-intrusive
Smart Search

Understanding the flow
Search Input
Search starts with whatever the viewer remembers: a scene, a line, or just an idea, entered through text or voice
Example inputs are shown upfront to help viewers understand how to search, not what to search
Search Results
Results are grouped and ranked based on how closely they match the viewer’s input, from closest to more loosely related
Each result explains why it appeared, with confidence cues to set clear expectations
Follow Up
Follow-up suggestions appear only when they add clarity or help exploration
These suggestions build on the original input instead of replacing it, keeping the flow lightweight and familiar
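To make the results step concrete, here is a minimal sketch of the data a result card might carry and how cards could be ordered. Every name in it (SearchResult, ConfidenceBand, rankResults) is hypothetical; it illustrates the grouped-and-ranked behavior described above rather than a real implementation.

```ts
// Minimal sketch of a Smart Search result card's data (hypothetical names).

type ConfidenceBand = "close match" | "likely match" | "loosely related";

interface SearchResult {
  title: string;
  matchScore: number;         // 0..1 similarity between input and content
  confidence: ConfidenceBand; // the confidence cue shown on the card
  whyItAppeared: string;      // plain-language explanation for the viewer
}

// Group results into confidence bands, then rank within each band by score,
// so cards read from closest match to more loosely related.
function rankResults(results: SearchResult[]): SearchResult[] {
  const bandOrder: ConfidenceBand[] = ["close match", "likely match", "loosely related"];
  return [...results].sort(
    (a, b) =>
      bandOrder.indexOf(a.confidence) - bandOrder.indexOf(b.confidence) ||
      b.matchScore - a.matchScore
  );
}
```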
To show how Smart Search works in practice, I mapped viewer inputs into six intent types, helping the system understand whether someone is browsing broadly, recalling a moment, or searching with specific cues; a short sketch after the examples below shows one way each type could drive ranking.
Generic Input
These are broad, low-detail searches that express intent but lack specific context. They rely on the system to show a small set of top results first, then help viewers refine or explore
"A person standing alone"

Specific Visual Input
These are detailed recall inputs that describe distinct visual elements instead of facts or story themes. They help the system surface the closest scene matches with higher confidence
"A person standing alone in a bar leaning on the counter"

Dialogue Based Input
These are recall inputs where viewers search using memorable lines or spoken intent instead of titles
"I will look for you; I will find you and I will kill you"

Theme/Mood Based Input
These are abstract discovery inputs that express a story’s core idea or emotional tone. Results are grouped by how strongly they match the theme, then ranked by tone and relevance
"The protagonist loses everything and rebuilds"

Metadata Based Input
These are factual searches based on known attributes like actors, awards, year, language, or genre. Results start with factual data and ranking shifts only when intent is clear
"Oscar winning movies"

Hybrid Input
These combine two or more inputs in one search. Results are ranked by the strongest clear input first
"Oscar-winning movies about making it big in life"

Other Possible Inputs
These are imaginative inputs describing a vivid idea that may not exist in any real movie or show. Results lean on the closest conceptual parallels
"A movie where time runs backward while the character walks forward"

Real-world impact
We’ve all spent thirty minutes scrolling through menus only to give up; Smart Search fixes this by letting you find a movie based on a feeling, a half-remembered line of dialogue, or a visual description
By cutting out the "choice paralysis" that ruins movie nights, platforms keep us happy and engaged, ensuring that even the most hidden gems in their library get the spotlight they deserve
Scan The Frame

Understanding scanned information
In this interface, cognitive load is high because a single frame can contain dozens of details. That's why results are categorized into Scene, Style, and Stuff, so viewers always know exactly where to look
Scene
The 'who' and 'where'
Categorized into Cast - Identifies the actors and the characters they are playing in that moment, and Setting - Provides the location, giving geographical or narrative context to the action
Style
The characters' look
Categorized into Featured - Displays specific garments or accessories that have brand affiliations or paid placements, and Outfits and Accessories for every character
Stuff
The objects in the scene
Categorized into Featured - High-priority display for sponsored items, such as specific vehicles or tech gadgets with brand partnerships, and the rest of the items found in the frame
Non-featured results are sorted by how distinctly visible they are, ensuring the most prominent items appear first
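A minimal sketch of how a scanned frame's results might be modeled, assuming hypothetical field names (featured, visibility); only the Scene, Style, and Stuff split and the visibility ordering come from the design itself.

```ts
// Hypothetical model of one scanned frame's results.

interface ScannedItem {
  label: string;
  featured: boolean;  // sponsored or brand-partnered items surface first
  visibility: number; // 0..1 — how distinctly visible the item is in frame
}

interface FrameScan {
  scene: { cast: string[]; setting: string }; // the 'who' and 'where'
  style: ScannedItem[];                       // outfits and accessories
  stuff: ScannedItem[];                       // objects in the scene
}

// Featured items stay on top; everything else sorts by how distinctly
// visible it is, so the most prominent items appear first.
function orderItems(items: ScannedItem[]): ScannedItem[] {
  return [...items].sort(
    (a, b) => Number(b.featured) - Number(a.featured) || b.visibility - a.visibility
  );
}
```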

Style - Featured & Outfits and Accessories
To ensure a secure and ethical experience, the system automatically filters out sensitive items like personally identifiable information, hate symbols, and explicit imagery. It also restricts the display of content involving self-harm, medical gore, or instructional data on dangerous objects to protect viewers and prevent real-world harm
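One plausible shape for that filter pass, assuming each detected item arrives with moderation labels from an upstream classifier; the label names and filterSensitive are illustrative, not a real moderation API.

```ts
// Hypothetical safety pass: drop items carrying any blocked moderation label.

const blockedLabels = new Set([
  "personally-identifiable-information",
  "hate-symbol",
  "explicit-imagery",
  "self-harm",
  "medical-gore",
  "dangerous-instructional",
]);

interface TaggedItem {
  label: string;
  moderationTags: string[];
}

function filterSensitive(items: TaggedItem[]): TaggedItem[] {
  return items.filter((item) => !item.moderationTags.some((t) => blockedLabels.has(t)));
}
```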
Real-world impact
When you see a stunning dress or a cool gadget in a scene, Scan The Frame bridges the gap between "I want that" and "I own that"
This turns every frame into an interactive storefront, allowing brands to connect with us through the things we already love, without annoying commercial breaks, and creating a whole new way for platforms to grow beyond just monthly subscription fees