Improving Player Performance with Low Latency as Evident from FPS Aim Trainer Experiments

We’ve been collaborating with The Meta, makers of the popular KovaaK’s FPS aim trainer game, for some time now to distribute experiments to their players. Our most recent set of experiments was designed to test a player’s aiming ability under changing latency and to give players a chance to compete for the top spot on the leaderboards, at both lower and higher latencies.

During the week-long promotional period in December 2021, players could get rewards for participating, and over 12,000 players tried one of these new latency experiments. This post uses the data provided by over 15,000 players, including results from the promotional period through April 17, 2022. We’d like to thank all of the players for their enthusiastic participation, and we hope you’ll enjoy seeing the results as much as we do.

To lead with our most interesting result, Figure 1 shows how the top 10% most-skilled players moved their entire score distribution higher and were better able to display their skill at the lowest latency.

A chart showing the distribution of Latency Flicking scores with median 31 for 25 milliseconds, 22 for 55 milliseconds and 15 for 85 milliseconds of latency. — *Figure 1. Latency Flicking top 10% score distribution. On this difficult task, the most skilled players increased their median score by more than 2x at 25 ms compared to 85 ms.*

The Experiments

We designed two experiments for this release, one meant to be fun and exciting and the other designed to be challenging and test the limits of the most skilled and capable players.

The purpose of these experiments was to highlight the importance of computer system latency and give players a chance to experience it for themselves at home without complicated equipment. To this end, both experiments vary the latency among a low-, middle-, and high-latency value.

You can find the description and short videos of these experiments later in this post, and you can go try them out in the game if you want to experience them for yourself. We plan to keep them available in the game for some time to come.

All participants in the NVIDIA experiments mode in KovaaK’s are required to submit informed consent before voluntarily participating and are welcome to stop their participation at any point.

Both experiments were structured to have a 15-second warm-up period followed by a 45-second experiment stage for each of the latency conditions that we tested. Only the 45-second experiment stage scores were used for entry on the leaderboards, and we only consider those results in our analysis. This is based on the well-known principle that people new to a task take some time to learn it. Thus, the warm-up period was intended to serve as the training period for the players.

Ideally, there would be a much longer training period and experiment period, but the durations were selected to balance the quality of data that we collected with the enjoyment of the players. One minute of gameplay per condition felt good in our testing, and we believe it has worked well for the players.

Controlling latency

The three latency conditions that we settled on were 25 ms, 55 ms, and 85 ms (Figure 2). These were selected to mirror the latency settings tested in our prior SIGGRAPH Asia publication, though the aiming tasks we used were different from those in the prior work. For more information, see Latency of 30 ms Benefits First Person Targeting Tasks More Than Refresh Rate Above 60 Hz.

Diagram showing known baseline of 25 milliseconds with Reflex and unknown baseline without Reflex. — *Figure 2. Conditions used depending on whether the GPU supports Reflex. For non-Reflex PCs, the baseline latency was unknown, so the results were omitted from leaderboards and analysis.*

In this experiment, we used the Reflex integration in KovaaK’s to measure and control the latency for each of the conditions. This means that full latency control was only available on systems with Reflex-capable GPUs, so non-Reflex results were omitted from the leaderboards and from the majority of our analysis.

For these non-Reflex systems, we still did our best to give the players a similar experience. We treated their system’s default latency as the baseline (displayed to the player as 25 ms) and set the other two conditions to effectively base+30 ms and base+60 ms.

We can’t be sure whether the baseline from one computer to another is similar without the markers we get from the Reflex integration. We also did a best-effort estimate of external latency contributions, including the mouse and monitor.
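As a rough illustration of how the three conditions relate to a system's baseline, the sketch below maps each displayed label to the latency a player would actually experience. The function names and the 40 ms baseline are hypothetical, chosen only for the example; the post does not describe how KovaaK's applies the offsets internally.

```python
LABELS_MS = (25, 55, 85)   # latency values displayed to the player
BASE_LABEL_MS = 25         # label attached to the baseline condition


def offset_ms(label_ms):
    """Extra latency of a condition relative to the baseline condition."""
    return label_ms - BASE_LABEL_MS


def effective_latencies(baseline_ms):
    """Map each displayed label to baseline + offset for a given system."""
    return {label: baseline_ms + offset_ms(label) for label in LABELS_MS}


# A system whose true baseline matches the 25 ms label.
reflex_like = effective_latencies(25)

# Hypothetical non-Reflex system with an (unknown in practice) 40 ms baseline.
non_reflex = effective_latencies(40)

print(reflex_like)
print(non_reflex)
```

On the hypothetical 40 ms system, every condition is shifted by the same 15 ms, which is why scores from such systems cannot be compared directly with scores from systems held at the labeled values.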

Experiment 1: Latency Frenzy

The first experiment was designed to be fun and accessible for nearly anyone, and we placed it at the top of the list. The majority of players (95%) tried it.

This experiment is based on popular frenzy modes where a set of targets spawn in a grid against a wall, and the user has to plan the order in which to shoot at each target. After a target is killed with a single click, a new target spawns somewhere else on the grid after a small pause.

This frenzy mode was set to have three simultaneous targets visible. The player’s score was equal to the number of targets that they destroyed within 45 seconds multiplied by their accuracy (hits/shots); we used that as the primary measure.

Leaderboard placement was determined based on the combined score across all three phases (25 ms, 55 ms, and 85 ms).
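The scoring rule can be sketched in a few lines, taking accuracy as hits divided by shots. The per-phase counts below are made up for illustration, and the function names are not from the game.

```python
def accuracy(hits, shots):
    """Fraction of shots that hit a target (0.0 if no shots were fired)."""
    return hits / shots if shots else 0.0


def phase_score(targets_destroyed, hits, shots):
    """Score for one 45-second phase: targets destroyed times accuracy."""
    return targets_destroyed * accuracy(hits, shots)


def combined_score(phases):
    """Leaderboard score: sum of the phase scores over all latency phases."""
    return sum(phase_score(*counts) for counts in phases.values())


# Hypothetical counts per latency phase: (targets destroyed, hits, shots).
# Targets die in one click here, so hits equals targets destroyed.
phases = {
    25: (100, 100, 125),
    55: (90, 90, 100),
    85: (80, 80, 100),
}

for latency_ms, counts in phases.items():
    print(latency_ms, round(phase_score(*counts), 2))
print("combined", round(combined_score(phases), 2))
```

Because accuracy multiplies the target count, a player who sprays extra shots to hit more targets can end up with a lower score than a slower, more precise player.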

Video 1. Latency Frenzy experiment. The task was completed in three phases with a randomized phase order per attempt.

As this experiment combines accuracy and planning, we expected latency to not be the only factor affecting the number of targets that a player can hit in quick succession. Skill level is obviously important, but so is the strategy that players employ to achieve the fastest, most accurate path through the targets.

Many players develop their aiming strategy over time. Thus they may quickly improve as they learn how to plan their path. The hope is that the warm-up period gives them a chance to select a strategy, though players who repeated the experiment may have adjusted their strategies.

Experiment 2: Latency Flicking

The second experiment was designed to be much more challenging. It highlighted a situation where computer system latency had a large impact on aiming performance.

As you can see from the results, we succeeded in crafting a challenging task, especially when playing with high latency. About 60% of the players who tried Latency Frenzy or Latency Flicking participated in the latter.

The flicking task is to start with the player’s aim at the center of the screen, where a dummy target is placed. When the player clicks on that target, a second target is spawned at a random place away from the center, and the player is given 600 ms in which to aim at that target and eliminate it.

The primary metric of success for this task was the number of these center-aim-kill target loops that the player was able to complete in the 45-second duration. Again, we used the number of targets killed as the score and placed players on the leaderboard based on their combined score across all three latency levels.

Video 2. Latency Flicking experiment. This task was completed in three phases with a randomized order per attempt.

While this was a fair task, as everyone had to play by the same rules, the actual number of attempts on target varied from person to person because the 45-second timer continued to run while the player reset their aim to the dummy target at the center. As a result, a player who got used to the 600 ms cadence and was skilled at returning to center got more attempts and had a higher maximum possible score.

In our initial analysis of these results, we haven’t looked at how many attempts each player could make, but we may run that analysis in the future.

Results

Since we first released our experiment mode in February 2021, over 45,000 people have tried one or more of our experiments, completing more than 470,000 experiment sessions. Between the release of these new latency experiments in December 2021 and April 17, 2022, over 15,000 players completed at least one of these new experiments.

We focus on these results in our analysis, though players like you can continue to play and contribute data for any future analysis. In any case, the experiments are available for anyone to try and compare results.

Because the control of latency was only considered completely fair and valid for systems with a Reflex-enabled GPU, only Reflex-enabled results were allowed to be posted to the leaderboards. We excluded the 15-second warm-up sessions, as they were intended to enable players to get familiar with the task.

Players were allowed to complete each experiment as many times as they wanted, and we included these repeat attempts in the analysis. This means that players who played more than one time were likely able to refine their strategies and improve their skill over time.

For the analysis of the results, we also excluded all results that showed indications of not being able to reach the targeted latency values. The following analysis represents a relatively high confidence of latency being controlled to within 500 microseconds of the target.
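A filter of this kind could look like the following sketch. The record fields and sample values are hypothetical, and the exact exclusion criterion used in the analysis is not given in this post; the sketch simply keeps sessions whose measured latency lies within the stated tolerance of the target.

```python
TOLERANCE_MS = 0.5  # 500 microseconds, the tolerance quoted in the text


def latency_on_target(target_ms, measured_ms, tolerance_ms=TOLERANCE_MS):
    """True if the measured latency is within tolerance of the target."""
    return abs(measured_ms - target_ms) <= tolerance_ms


# Hypothetical session records; field names are assumptions for this sketch.
sessions = [
    {"target_ms": 25, "measured_ms": 25.2, "score": 31},
    {"target_ms": 55, "measured_ms": 57.9, "score": 20},  # missed the target
    {"target_ms": 85, "measured_ms": 84.6, "score": 12},
]

kept = [
    s for s in sessions
    if latency_on_target(s["target_ms"], s["measured_ms"])
]

print([s["score"] for s in kept])
```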

The remaining confounding factors include the latency of mice and monitors. Such latency was only estimated in many cases, which is almost unavoidable on an open platform like PC gaming when conducting a large-scale distributed study like this one.

Skill levels

In addition to a general analysis of all participants, we also classified players by their skill level for each experiment. This is done by averaging each player’s total score across all runs, then ranking all of the players by this combined mean score.

While looking at various skill levels may be interesting, we decided to focus solely on the top 10% and top 1% player cohorts in the detailed skill-level analysis. You can think of these two cohorts as the highly skilled enthusiasts (top 10%) and the best of the best (top 1%) who are effectively the “esports professionals” of these KovaaK’s tasks.
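The cohort assignment can be sketched as follows: average each player's total score over their runs, rank by that mean, and take the top fraction. The run data is invented for the example, and the rounding rule at the cohort boundary (truncate, keep at least one player) is an assumption.

```python
from collections import defaultdict
from statistics import mean


def rank_players(runs):
    """Rank player ids by mean total score across all runs, best first.

    `runs` is an iterable of (player_id, total_score) pairs.
    """
    scores_by_player = defaultdict(list)
    for player_id, total_score in runs:
        scores_by_player[player_id].append(total_score)
    mean_scores = {p: mean(s) for p, s in scores_by_player.items()}
    return sorted(mean_scores, key=mean_scores.get, reverse=True)


def top_cohort(ranked, fraction):
    """Return the best `fraction` of ranked players (at least one)."""
    count = max(1, int(len(ranked) * fraction))
    return ranked[:count]


# Hypothetical runs: ten players with one run each, plus a repeat run for p3.
runs = [(f"p{i}", 10 * i) for i in range(10)] + [("p3", 100)]

ranked = rank_players(runs)
print(top_cohort(ranked, 0.10))
print(top_cohort(ranked, 0.40))
```

Note that ranking by the mean over all runs means a single excellent run (such as the repeat run for `p3`) lifts a player only partway up the ranking.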

Latency Frenzy results

The Latency Frenzy experiment results analyzed here include 27,032 complete experiments from 12,168 players, equaling 81,096 sessions of 45 seconds each.

The biggest result is that, across all attempts, both of the lower-latency conditions (25 ms and 55 ms) improved the number of targets eliminated (Figure 3), a difference that was found to be statistically significant in pairwise t-tests (p-value << 0.001). Players hit an average of 13.2 more targets at 55 ms system latency compared to 85 ms, and an additional 8.9 targets at 25 ms compared to 55 ms. Relative to 85 ms, that’s a 15% increase in targets hit at 55 ms and a 24% increase in targets hit at 25 ms.
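These figures follow from the mean scores reported in Figure 3. The short calculation below reproduces them; only the three means come from the post, and the variable names are ours.

```python
# Mean Latency Frenzy scores per latency condition, from Figure 3.
means = {25: 112.62, 55: 103.75, 85: 90.58}

gain_55_vs_85 = means[55] - means[85]   # extra targets at 55 ms vs 85 ms
gain_25_vs_55 = means[25] - means[55]   # additional targets at 25 ms vs 55 ms
gain_25_vs_85 = means[25] - means[85]   # total gain at 25 ms vs 85 ms

pct_55 = 100 * gain_55_vs_85 / means[85]
pct_25 = 100 * gain_25_vs_85 / means[85]

print(f"55 ms vs 85 ms: +{gain_55_vs_85:.2f} targets ({pct_55:.1f}%)")
print(f"25 ms vs 55 ms: +{gain_25_vs_55:.2f} targets")
print(f"25 ms vs 85 ms: +{gain_25_vs_85:.2f} targets ({pct_25:.1f}%)")
```

Both percentages are measured against the 85 ms mean, so the 24% figure covers the full step from 85 ms down to 25 ms.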

Bar chart of Latency Frenzy scores with means of 112.62 for 25 milliseconds, 103.75 for 55 milliseconds and 90.58 for 85 milliseconds. — *Figure 3. Latency Frenzy mean scores*

Figure 4 shows a quadratic fit to the raw data on a scatter plot. The fit line shows the likely mean score as the latency varies, crossing the clusters of scattered points at the mean of those distributions. Because there are so many points, they look like vertical lines in this plot.

Box and whisker plot with trend line for Latency Frenzy. — *Figure 4. Box and whisker plot and quadratic fit for Latency Frenzy as latency increases*
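With only three distinct latency values, an ordinary least-squares quadratic has exactly enough freedom to pass through the mean score at each latency, which is why the fit line crosses each cluster at its mean. Assuming the fit was a least-squares fit over the same runs summarized in Figure 3, the curve can be sketched as the quadratic through the three reported means. Values between the tested latencies are interpolations from this sketch, not measurements.

```python
def quadratic_through(points):
    """Return the unique quadratic passing through three (x, y) points,
    built with Lagrange interpolation."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def f(x):
        return (
            y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
            + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
            + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1))
        )

    return f


# Mean Latency Frenzy scores at the three tested latencies (Figure 3).
frenzy_fit = quadratic_through([(25, 112.62), (55, 103.75), (85, 90.58)])

for latency_ms in (25, 40, 55, 70, 85):
    print(latency_ms, round(frenzy_fit(latency_ms), 2))
```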

Looking at the distributions of scores in Figure 5, you can see even more interesting trends in the data. In particular, every percentile line moves up the score axis as the latency decreases. What’s even more exciting is that the entire distribution expands, allowing an easier chance to distinguish between players of similar skill levels.

A chart showing the overall distribution of scores for Latency Frenzy. — *Figure 5. Latency Frenzy overall score distributions*

We believe these summary results show a clear (though somewhat small in score) benefit to frenzy-type aiming tasks from reduced system latency. On average, players hit over 20% more targets in 45 seconds at 25 ms than at 85 ms.

Latency Flicking results

As described earlier, the flicking experiment is challenging. In fact, in our final data set, 595 runs (7.20%) and 421 players (7.55%) hit 0 targets at 85 ms. We often exclude 0 scores from analysis because they could indicate that a player walked away from the computer and their score may not be useful. However, these 0 scores are an important part of the player performance for this particular task.

Fortunately, by reducing the latency to 25 ms, a much smaller 327 runs (3.96%) and 230 players (4.12%) still hit 0 targets. In other words, reduced system latency made an impossibly hard task possible for 3.4% of these players.

Fewer players completed this task than the frenzy task, probably in part because frenzy is more fun and less difficult than this flicking task. Yet 5,576 players completed 8,265 experiments comprising 24,795 sessions.
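The zero-score percentages can be checked against the totals for this experiment (8,265 experiments from 5,576 players, three sessions each). The counts below are the ones reported in the text; the variable names are ours.

```python
TOTAL_RUNS = 8265
TOTAL_PLAYERS = 5576
SESSIONS_PER_RUN = 3  # one 45-second session per latency condition

# Runs and players that hit zero targets, by latency condition.
zero_scores = {
    85: {"runs": 595, "players": 421},
    25: {"runs": 327, "players": 230},
}


def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100 * part / whole


for latency_ms, counts in zero_scores.items():
    print(
        latency_ms,
        f"runs {pct(counts['runs'], TOTAL_RUNS):.2f}%",
        f"players {pct(counts['players'], TOTAL_PLAYERS):.2f}%",
    )

# Share of players who scored zero at 85 ms but not at 25 ms.
rescued = pct(
    zero_scores[85]["players"] - zero_scores[25]["players"], TOTAL_PLAYERS
)
print(f"rescued players: {rescued:.1f}%")
print("sessions:", TOTAL_RUNS * SESSIONS_PER_RUN)
```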

As in the frenzy results, the lower-latency conditions improved the average number of targets destroyed in 45 seconds, but with a greater magnitude of improvement (Figure 6). Again, pairwise t-tests show that these differences were statistically significant (p-value << 0.001). Impressively, players hit nearly twice as many targets on average at 25 ms as at 85 ms of system latency.
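For readers unfamiliar with the test, here is a minimal sketch of a t statistic for one pair of conditions. It assumes a paired design, since every attempt includes all three latency phases; the post does not state which variant of the t-test was used, and the scores below are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """t statistic for paired samples: mean difference divided by its
    standard error. Degrees of freedom are len(scores_a) - 1."""
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical scores from five attempts, same attempts in both lists.
scores_25ms = [30, 28, 35, 22, 40]
scores_85ms = [26, 22, 30, 15, 37]

t = paired_t_statistic(scores_25ms, scores_85ms)
print(f"t = {t:.3f} with {len(scores_25ms) - 1} degrees of freedom")
```

With thousands of attempts rather than five, even a modest mean difference yields a very large t statistic, which is consistent with the very small p-values reported here.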

Bar chart of Latency Flicking scores with means of 15.15 for 25 milliseconds, 11.20 for 55 milliseconds and 7.74 for 85 milliseconds. — *Figure 6. Latency Flicking mean scores*

Figure 7 shows that a quadratic fit to the flicking results suggests that this flicking task would become impossible for even the most skilled players with only a little more latency. This makes sense because the total of 600 ms of aiming time gets reduced by the computer system latency; the displayed location of the target isn’t seen by the player until after the full system latency amount. There is also less time to adjust the aim to be sure it hits the target.

In testing during the design of this task, we found that 450 ms was barely doable for highly skilled players, even at the minimum latency possible.
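The time-budget argument can be made concrete with a simple subtraction. This sketch assumes the system latency is deducted once from the 600 ms window, which is a simplification: as noted above, latency also delays the feedback a player needs to correct their aim.

```python
TIME_LIMIT_MS = 600  # time allowed to hit the flick target


def visible_aim_time_ms(latency_ms, limit_ms=TIME_LIMIT_MS):
    """Time left to aim once the target has actually appeared on screen."""
    return max(0, limit_ms - latency_ms)


budgets = {
    latency_ms: visible_aim_time_ms(latency_ms)
    for latency_ms in (25, 55, 85)
}

for latency_ms, budget in budgets.items():
    print(f"{latency_ms} ms latency leaves {budget} ms to aim")
```

Under this simplification, each 30 ms step in latency removes roughly 5% of the usable aiming window.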

Box and whisker plot with trend line for Latency Flicking. — *Figure 7. Box and whisker plot and quadratic fit for Latency Flicking results*

Another exciting aspect of this particular experiment can be highlighted by the histogram distributional plots in Figure 8. As with the frenzy results, we found that all percentiles increased their score at lower latencies, with the exception of the bottom 5%–10% who still weren’t able to complete such a difficult aiming task.

At the higher skill levels, the difference between scores becomes amplified even more. For instance, at 25 ms of latency, the top 25% of scores were above the top 10% line at 85 ms. The top 1% at 25 ms were higher than any score achieved at 55 ms.

A chart showing the overall distribution of scores for Latency Flicking. — *Figure 8. Latency Flicking overall score distribution*

Figure 9 shows the distribution of results for the top 10% most-skilled players in this experiment. As a reminder, this includes only players whose average score fell in the top 10% of scores, but we plotted all scores from those players. These players were more skilled than the general population, so there’s a fairly clear separation in the distributions between different latency conditions. In fact, the median score at 25 ms (31) was more than 2x as high as at 85 ms (15)!

A chart showing the distribution of scores for the top 10% most skilled at Latency Flicking with median 31 for 25 milliseconds, 22 for 55 milliseconds and 15 for 85 milliseconds of latency. — *Figure 9. Latency Flicking top 10% score distribution. Players hit twice as many targets at low latency.*

The top 1% of players demonstrate an even more telling change in scores at different latency levels. There remains some overlap in scores between latency levels, but the scores at 85 ms only reach as high as the bottom 25% of scores at 25 ms.

A chart showing the distribution of scores for the top 1% most skilled at Latency Flicking. — *Figure 10. Latency Flicking top 1% score distribution. For the most skilled players, every 30 ms reduction in latency produces a big improvement to their score.*

Conclusion

We’re grateful to our friends at The Meta for helping us put this experiment mode in their game, and enabling us to run experiments with players at home.

Prior research has shown that computer latency is important to minimize for many types of aiming tasks. However, the bulk of prior research has depended on small numbers of players with careful control of the experimental conditions. This study is the first conducted in the wild in which latency could be controlled well enough to be useful in analysis. Because the trends in these results reinforce prior findings, there is greater confidence in the importance of latency for competitive FPS gamers.

Perhaps the biggest new result we found is that lower latency is most important for the highest skilled players. Skill does make the difference between who wins and loses many times, but especially among the highest skill levels, latency has an increasingly essential role in who wins and loses.

We encourage all players to use technology like NVIDIA Reflex to have the best conditions for playing competitively. For players who are particularly interested in optimizing their PC and game settings for latency, G-SYNC monitors with a Reflex Latency Analyzer give you the chance to measure your latency directly.

The NVIDIA Reflex SDK is a tool for game developers looking to implement a low-latency mode that enables just-in-time rendering and optimizes system latency.

For more information, see the following research papers: