Evaluation of Buffering Events on Android devices

Like many others, you probably hate when your video stream is interrupted. Or even worse, when you need to get up from your couch to check if you have issues with your internet connection.

It’s frustrating to watch the buffering circle and wait for your favorite show to load or resume. That’s why we constantly fight buffering at Showmax, where buffering metrics are among the most important KPIs for the user experience.

Typology of buffering events

Each time a user interacts with our application or an action takes place, an event is generated in our analytics. A buffering event is an event where the stream is interrupted by loading data. Buffering events are broken down into two categories:

Natural buffering events occur when the buffer doesn’t load enough data while the user is watching and the video stops.
Seek-induced buffering events occur when the user seeks and wants to move to another timestamp in the video and the buffer must load more data.

As you can imagine, natural buffering is more annoying and has a greater negative impact on the user experience. It feels like the system is not ready to stream the video that the user wishes to watch. Seek-induced buffering is a bit more understandable, as it follows a user action that the system needs to respond to. While we are trying to minimize both types of buffering events, we focus more on the natural ones because they degrade the user experience more and we have more control over them.

Where to start

To tackle buffering issues, we must first learn more about them: at a minimum, which buffering events are most frequent and what their parameters are.
In late December, the Android team released a new flag called naturally in app version v75. This flag lets us distinguish the two types of buffering events on Android devices. It’s especially helpful for tracking buffering on smart TVs, where the large number of device types prevents us from implementing a unified solution. Since this newly released feature is only a proof of concept (POC), we also need to verify its accuracy.

Event Pipeline

But first, how do we track “events”, and what are “flags”?

As we said, each user action triggers an event. For example, when a user moves to a new point in the asset they are watching (e.g. movie, episode, trailer, etc.), a “Seek” event is generated. If the buffer starts to load while the user is streaming, a “BufferUnderrun” event is created.
All of these events carry additional parameters that help us analyze them. They are transmitted from the client device to our various databases in JSON format.
For instance, some of the parameters of a seek event may look like the following (many parameters have been omitted for this example):

{
    "event_data": {
      "subscription_status": "mobile_only",
      "device_code": "123456789ABCDEFGHIJ123456789ABCDEFGHIJ1234",
      "subsession_start": 1643899146806,
      "event_category": "Playback",
      "position": 1391,
      "playback_type": "STREAMING",
      "session_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
      "user_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
      "client_id": "123456789ABCDEFGHIJ123456789",
      "asset_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
      "version": "75.5.0d33e51344",
      "showmax_viewing_mode": "paid",
      "video_usage": "MAIN",
      "profile_showmax_rating": "18-plus",
      "connection_type": "mobile_4g",
      "event": "Seek",
      "prev_position": 1180,
      "asset_duration": 1391
    },
    "platform": "android",
    "environment": "production",
    "ua_os": "android",
    "client_timestamp": "2022-02-03T14:40:57.276000022+00:00"
}

From these parameters, we can get useful information about:

  • The event and its category:
    event: Seek and event_category: Playback tell us that the user performed a seeking action during Playback
  • The user, their session, and what they were watching:
    user_id, session_id, and asset_id identify the user, the streaming session (which lets us find other events from the same session), and the asset being watched, which we can look up in our internal library of assets
  • Where the user was in the video when they performed the seek, and where they moved to:
    prev_position: 1180 and position: 1391 tell us that the user was at the 1180th second of the video and moved to the 1391st second
  • The device of the user:
    platform: android tells us that this is an Android device
  • The time the user took that action:
    client_timestamp gives us the exact timestamp from the user’s device when that action was taken
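
To make the field walkthrough above concrete, here is a minimal Python sketch that reads a trimmed copy of the seek event and derives the seek distance and a parsed timestamp. The helper names (parse_client_timestamp, summarize_seek) are hypothetical and not part of our pipeline. The sketch assumes that client_timestamp always follows the format shown in the sample, and it truncates the nanosecond fraction to microseconds because Python’s strptime accepts at most six fractional digits.

```python
import json
import re
from datetime import datetime

# A trimmed copy of the seek event shown above (illustrative subset of fields).
RAW_EVENT = """
{
    "event_data": {
        "event": "Seek",
        "event_category": "Playback",
        "session_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
        "position": 1391,
        "prev_position": 1180
    },
    "platform": "android",
    "client_timestamp": "2022-02-03T14:40:57.276000022+00:00"
}
"""

# Assumed timestamp layout: date, time, optional fraction, numeric UTC offset.
_TIMESTAMP = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?([+-]\d{2}:\d{2})$"
)


def parse_client_timestamp(value):
    """Parse a client_timestamp string, truncating the fraction to microseconds."""
    match = _TIMESTAMP.match(value)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {value!r}")
    base, fraction, offset = match.groups()
    micros = (fraction or "").ljust(6, "0")[:6]
    return datetime.strptime(f"{base}.{micros}{offset}", "%Y-%m-%dT%H:%M:%S.%f%z")


def summarize_seek(event):
    """Extract the fields discussed above from a parsed seek event."""
    data = event["event_data"]
    return {
        "session_id": data["session_id"],
        "platform": event["platform"],
        # Positive values are forward seeks, negative values are backward seeks.
        "seek_distance_s": data["position"] - data["prev_position"],
        "timestamp": parse_client_timestamp(event["client_timestamp"]),
    }


summary = summarize_seek(json.loads(RAW_EVENT))
print(summary["seek_distance_s"])  # 211 seconds forward
```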

Similarly, a buffering event is generated every time the video is loading because there is not enough data in the buffer. Some of its parameters are:

{
    "event_data": {
      "asset_type": "episode",
      "event_category": "Playback",
      "event": "BufferUnderrun",
      "subscription_status": "mobile_only",
      "user_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
      "position": 1298,
      "trial_subscription": false,
      "playback_type": "STREAMING",
      "asset_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
      "asset_duration": 3382,
      "session_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
      "profile_showmax_rating": "18-plus"
    },
    "user_id": "deadbeaf-dead-beaf-dead-beafdeadbeaf",
    "client_timestamp": "2022-02-03T14:58:37.450000047+00:00",
    "platform": "ios"
}

These events, sent from the user devices, end up in the Elasticsearch database, as shown below in the partial snapshot of our data flow:

Figure: Data flow from user devices to the Elasticsearch database

The Android parameter

The Android team has implemented a solution that identifies buffering events as natural or seek-induced by adding a new parameter called naturally.

The value of the parameter depends on the “onPositionDiscontinuity” and “onPlaybackStateChanged” callbacks from ExoPlayer. The app checks whether the user performed a seek and which state the player is currently in, and then sets the naturally flag on buffering events accordingly.

Analysis of Buffering Events and Verification

To analyze the buffering events and verify the Android team’s solution against our results, we examined the buffering events in various 2-hour time ranges picked between January 8 and January 15, 2022.

In each of those time ranges, we classified the buffering events on Android devices as seek-induced or natural using our own definition and tracking, and compared the result with the Android naturally flag.

Our definition and approach

If a buffering event and a seek event occur at exactly the same timestamp, the buffering event is considered seek-induced. Similarly, if a buffering event (“BufferUnderrun”) occurs within a specific time window after a seek event (“Seek”), it is also classified as seek-induced.

As shown above, both events have parameters that allow us to determine the time they occurred (client_timestamp) and the session in which they occurred (session_id).
First, we match each buffering event with its corresponding seek event via the session_id parameter. This means these events occurred during the same streaming session.
Then, we check whether any seek event from the same session occurred at most X ms before the buffering event, where X is a time window that we vary from 10 ms to 10 s. If there is such a seek event, the buffering event is classified as seek-induced; otherwise, it is classified as natural.
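
The matching described above can be sketched in a few lines of Python. This is a minimal illustration rather than our production query: it assumes each event has already been reduced to a (session_id, timestamp in ms) pair, and the function name classify_buffering is hypothetical. Seek timestamps are sorted per session so that the most recent seek before a buffering event can be found with a binary search.

```python
from bisect import bisect_right
from collections import defaultdict


def classify_buffering(seeks, bufferings, window_ms):
    """Label each buffering event as 'seek-induced' or 'natural'.

    seeks, bufferings: iterables of (session_id, timestamp_ms) pairs.
    A buffering event is seek-induced if a seek from the same session
    happened at the same timestamp or at most window_ms before it.
    """
    seeks_by_session = defaultdict(list)
    for session_id, ts in seeks:
        seeks_by_session[session_id].append(ts)
    for times in seeks_by_session.values():
        times.sort()

    labels = []
    for session_id, ts in bufferings:
        times = seeks_by_session.get(session_id, [])
        # Index just past the last seek that happened at or before ts.
        i = bisect_right(times, ts)
        induced = i > 0 and ts - times[i - 1] <= window_ms
        labels.append("seek-induced" if induced else "natural")
    return labels


# Made-up events for illustration only.
seeks = [("s1", 1000), ("s2", 5000)]
bufferings = [("s1", 1000), ("s1", 1040), ("s1", 3000), ("s3", 1000), ("s2", 4990)]

labels_50 = classify_buffering(seeks, bufferings, window_ms=50)
labels_10 = classify_buffering(seeks, bufferings, window_ms=10)
print(labels_50)
print(labels_10)
```

Note that a seek that happens after the buffering event (the last made-up event) does not count, and that the same event can change category as the window grows, which is why we evaluate several windows.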

Tagging the events

For each time window, we determine the percentage of buffering events in each of the two categories and verify the results of the Android solution against our benchmarks.

By comparing our classification with the classification made by the Android algorithm, we tag each event as:

  • True Positive: The buffering event is natural and is correctly classified as natural by the Android algorithm
  • True Negative: The buffering event is not natural and is correctly classified as not natural by the Android algorithm
  • False Positive: The buffering event is not natural but is incorrectly classified as natural by the Android algorithm
  • False Negative: The buffering event is natural but is incorrectly classified as not natural by the Android algorithm
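
The four tags above translate directly into code. The following sketch treats “natural” as the positive class, as in the list above, and assumes the naturally flag arrives as a boolean; the function names and the sample pairs are hypothetical.

```python
from collections import Counter


def tag_event(reference_is_natural, android_naturally):
    """Tag one buffering event, with 'natural' as the positive class.

    reference_is_natural: label from our own time-window method.
    android_naturally: value of the Android naturally flag (assumed boolean).
    """
    if android_naturally:
        return "TP" if reference_is_natural else "FP"
    return "FN" if reference_is_natural else "TN"


def confusion_counts(pairs):
    """Count tags over (reference_is_natural, android_naturally) pairs."""
    return Counter(tag_event(ref, flag) for ref, flag in pairs)


# Made-up pairs for illustration only.
pairs = [(True, True), (True, False), (False, False), (False, False)]
counts = confusion_counts(pairs)
print(counts["TP"], counts["TN"], counts["FP"], counts["FN"])
```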

To check the effectiveness of the Android classification, we computed four metrics:

  1. Accuracy
    The simplest way of reporting the effectiveness of an algorithm is by calculating its accuracy. Accuracy is calculated by finding the total number of correctly classified points and dividing by the total number of points.
    In other words, accuracy can be defined as:
    accuracy = (True Positives + True Negatives) / (True Positives + True Negatives + False Positives + False Negatives)

  2. Recall
    However, since roughly 90% of our events belong to one category (seek-induced), accuracy alone is not a good measure of the algorithm’s quality. If the Android algorithm classified every event as seek-induced, it would still be about 90% accurate. Thus, we need to evaluate other metrics as well.
    Recall is the number of events that the Android classifier correctly flagged as natural divided by the total number of natural events.
    recall = True Positives / (True Positives + False Negatives)

  3. Precision
    Precision is the number of events the algorithm correctly predicted as natural divided by the total number of times it predicted an event as natural.
    precision = True Positives / (True Positives + False Positives)

  4. F1 Score
    Precision and recall are both useful, but we still lack a single number that describes how effective the algorithm is. This is the job of the F1 score, which is the harmonic mean of precision and recall. We use the harmonic mean rather than the arithmetic mean because we want the F1 score to be low whenever either precision or recall is close to 0.
    F1 = 2 * precision * recall / (precision + recall)
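
The four formulas above can be computed together from the tag counts. The sketch below uses made-up counts, not our measured data, and the function name is hypothetical. The second call shows why accuracy alone is misleading: a classifier that labels every event as seek-induced reaches 90% accuracy on a 90/10 split, yet its recall and F1 score are 0.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision, and F1 from the four tag counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts: 120 natural events, of which the flag catches 72.
example = classification_metrics(tp=72, tn=880, fp=0, fn=48)

# Hypothetical baseline: every event is labeled seek-induced (never natural).
baseline = classification_metrics(tp=0, tn=900, fp=0, fn=100)

print(example)
print(baseline)
```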

Results

The Android naturally flag classifies:

  • 92.8% of buffering events as seek-induced
  • 7.2% as natural

Below are the results of our method for the various time windows we examined, along with the effectiveness metrics of the Android solution.

Figure: Results for the various time windows and the effectiveness metrics of the Android solution

Our method shows:

Time window (ms)  Seek-induced events (%)  Accuracy (%)  Recall (%)  Precision (%)  F1 Score (%)
10                88                       95.2          59.7        100.0          74.8
50                92.2                     99.4          92.6        100.0          96.2
100               92.5                     99.7          95.4        100.0          97.7
500               92.6                     99.8          96.7        100.0          98.3
1000              92.6                     99.8          96.9        100.0          98.4

Conclusions

The most important conclusion, and a very relevant one for the user experience, is that with a time window of 1 second between a seek event and a buffering event, more than 90% of buffering events on Android devices (92.6% in our sample) are seek-induced.

The conclusion that will help us distinguish the two types of buffering events is that the Android team’s solution, the algorithm-based assignment of the naturally flag, appears to classify buffering events as seek-induced or natural accurately.

Future Work

The parameter introduced by the Android team could also be implemented by other teams, such as iOS and web.

We will also focus more on preventing seek-induced buffering events in the future. One of the projects at the latest Showmax Hackathon was about introducing thumbnails in the player, which could help. With thumbnails, the user can navigate to the desired point in the video more easily, which means fewer seek actions to process and fewer buffering events. We will definitely come up with more ideas, so stay tuned.

Please check the original version of this article at