Fixing Common Android Lifecycle Issues in Games
Android application lifecycle events must be handled properly to avoid a bad user experience. This document will introduce you to some basic guidelines and techniques to ensure you can handle common lifecycle situations.
For the purposes of our discussion, we will make the assertion that "application lifecycle" and "Activity lifecycle" are one and the same, at least in how they apply to most Android games (and possibly other monolithic applications).
Note: This document is not a replacement for a basic understanding of Android application development and basic Android lifecycle concepts. For specific Android topics, visit:
- Application basics: http://developer.android.com/index.html
- Lifecycle topics: http://developer.android.com/training/basics/activity-lifecycle/starting.html
The basic lifecycle events for an Android application are a fixed, clear hierarchy of stages (as shown in Figure 1), with a well-defined sequence and flow from launch to running and to final destruction. These stages are always traversed in the given order; as each lifecycle event is received, the application moves on to the next matching stage.
Figure 1: Basic Android Lifecycle Hierarchy and Flow
In addition to the basic lifecycle events, there are other classes of Android events of interest to us, which occur at less strictly defined points in the lifecycle flow. We call these 'non-hierarchical' events since they do not sit in a particular spot in the normal hierarchy, nor are they received in any particular order with respect to the basic lifecycle stages. The events of interest include: window focus changes, visual surface changes, and device configuration changes.
Note that the ordering between focus and surface events is similarly unspecified; the window can gain focus before or after the surface is created, or lose focus before or after the surface is destroyed. None of the different classes of events should be assumed to come in any pre-determined sequence with respect to one another.
From years of experience seeing applications fail, we have learned that most application lifecycle problems are due to improper assumptions as to the order of the hierarchical and non-hierarchical lifecycle events, or the approach used to track and react to events. Most problems are easily addressed by tracking key application state as flags, and evaluating the flags as a whole -- and not assuming one particular event will follow another.
The three key states to track are "app resumed", "window focused", and "surface ready"; for ease of discussion we will call these the resumed flag, the focused flag, and the surface flag. Using such flags is the simplest and least error-prone way to determine the interactable and renderable state of the app. Set or clear a flag as each key event is received, and have a single check/update function that reviews the state of all flags and acts when important combinations of states occur.
The following is a list of events at which certain flags should be set or cleared:
All three state flags should be initialized to false/zero when the application first launches and receives the onCreate event. This ensures we start each run with a clean slate. From there, all other flag changes are defined by particular lifecycle events/messages:
- The resumed flag is set true when we receive the onResume event.
- The resumed flag is cleared to false when we receive the onPause event.
- The focus flag is set true when we receive a focus-gained event; in practice, this tends to come between onResume and onPause callbacks, but again is not required to.
- The focus flag is cleared to false when we receive a focus-lost event. Note focus-loss is more 'fluid', and may come before or after onPause.
- The surface flag is set true when we receive the surfaceCreated event (normally in the resumed state).
- The surface flag is cleared to false when we receive surfaceDestroyed.
The core engine in our GameWorks OpenGL samples hosted on GitHub (and offline as a component of NVIDIA CodeWorks for Android) shows this approach in working code. To simplify things, in the file NvAndroidNativeAppGlue.c we use a bitfield to track all application state in one variable, and an enum to declare the flags/masks.
In that file, you'll also see when/how we set our lifecycleFlags variable using different bit masks to capture the current state of the application.
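As a rough illustration of the bitfield idea (not the actual contents of NvAndroidNativeAppGlue.c), the sketch below keeps the three flags in a single variable and tests them as a group. All names here (LifecycleFlag, LifecycleState, kAllFlags) are hypothetical and chosen only for this example.

```cpp
#include <cstdint>

// Hypothetical flag masks; the names are illustrative, not taken from the samples.
enum LifecycleFlag : std::uint32_t {
    kResumed  = 1u << 0,  // between onResume and onPause
    kFocused  = 1u << 1,  // focus gained and not yet lost
    kSurface  = 1u << 2,  // between surfaceCreated and surfaceDestroyed
    kAllFlags = kResumed | kFocused | kSurface
};

struct LifecycleState {
    std::uint32_t flags = 0;  // everything starts cleared, as at onCreate

    void set(LifecycleFlag f)   { flags |= f; }
    void clear(LifecycleFlag f) { flags &= ~static_cast<std::uint32_t>(f); }

    // True only when all three flags are set, whatever order they arrived in.
    bool interactable() const { return (flags & kAllFlags) == kAllFlags; }
};
```

Each event handler would call set or clear for exactly one flag, and a single check function would call interactable to decide what to do next.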
Developers writing native C++ applications via NativeActivity and/or native_app_glue should not delude themselves: their applications are still based on a Java Android Activity; it just happens to be pre-built into the OS image. As a result, native applications must still handle all of the standard Android lifecycle cases. Fortunately, all of the Java lifecycle events are directly represented in both NativeActivity methods and native_app_glue messages. The following table lists the mappings between important Java callback names and their native variants:
| Java Activity member | NativeActivity callback | native_app_glue message |
Don't Render Until All Lifecycle Flags are True
Given the rules we have outlined, until all three state flags discussed earlier have been set true, we do not render. To reiterate, the three flags are:
- Resumed flag is true; app is 'active', between onResume and onPause
- Focus flag is true; window is focused, we had a focus-gain without subsequent focus-loss
- Surface flag is true; surface is ready, between surfaceCreated and surfaceDestroyed
Because one of the flags tells you whether your visual surface is ready, and because you do not render until all three flags are true, you can also hold off on creating EGL objects. Just before starting a frame's game logic/render processing, check the three lifecycle state flags. If all three are set true, continue on and check whether the EGL objects have been created and bound. If they aren't ready yet, create your EGL surface and context, and ensure they are bound.
When you have reached the point that the EGL objects are created and bound, the application has entered the 'meta' state we call 'ready to render'. Only when you have reached 'ready to render' are you actually okay to begin rendering.
In the GameWorks OpenGL sample engine for Android, if you look at the code in the file EngineAndroid.cpp, specifically the method shouldRender, you will see how it uses the mask NV_APP_STATUS_INTERACTABLE to test that all needed state flags are set, and then calls the isReadyToRender function to check the EGL state.
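The top-of-frame check described above might look roughly like the following. This is a minimal sketch, not the code from EngineAndroid.cpp: the Engine and EglStub types are hypothetical, and the EGL work is replaced by booleans so the example stands alone.

```cpp
// Stand-in for real EGL state. A real app would call eglCreateWindowSurface,
// eglCreateContext, and eglMakeCurrent where the booleans are flipped.
struct EglStub {
    bool created = false;
    bool bound = false;

    bool ensureReady() {
        if (!created) created = true;  // create surface and context on demand
        if (!bound) bound = true;      // make them current
        return created && bound;
    }
};

struct Engine {
    bool resumed = false, focused = false, surface = false;
    EglStub egl;

    // Called at the top of every frame, before game logic or rendering.
    bool shouldRender() {
        if (!(resumed && focused && surface)) return false;  // flags first
        return egl.ensureReady();                            // then EGL: 'ready to render'
    }
};
```

Note that EGL objects are only created once all three flags are true, so nothing is allocated for a surface that is not yet usable.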
What goes for rendering also goes for audio: we do not recommend starting any audio playback until the app has reached 'ready to render' state.
As an added safety net, we recommend that at the start of actual rendering of each frame, you should check if the cached size of the window/surface has changed since the prior frame; if it has, then update aspect ratio, size and viewport information for your rendering context/surface. This way, the impact of resize events received out of the usual sequence can be minimized or eliminated.
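One way to sketch that per-frame size check is shown below. The Viewport struct and syncViewport function are hypothetical names; in a real app the current width and height would be queried from the surface (for example with eglQuerySurface) and a change would also trigger a glViewport call.

```cpp
struct Viewport {
    int width = 0;
    int height = 0;
    float aspect = 1.0f;
};

// Compares the cached size with the current surface size and updates the
// cache if they differ. Returns true when an update happened.
bool syncViewport(Viewport& vp, int currentW, int currentH) {
    if (currentW == vp.width && currentH == vp.height) return false;
    vp.width = currentW;
    vp.height = currentH;
    vp.aspect = currentH > 0 ? static_cast<float>(currentW) / currentH : 1.0f;
    // A real renderer would also update its viewport and projection here.
    return true;
}
```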
In the sections that follow, we discuss the situations in which you need to stop rendering again, and then watch for the 'ready to render' state above to be reached once more.
Focus loss can occur in as simple a case as dragging down the notification drawer, or as part of a larger group of events when switching away to another app (from task switcher, or a notification, or an incoming call toast, etc.). For a brief loss like the notification drawer, or a system dialog, the application can lose and regain focus without ever being paused. There are also cases where the app lifecycle can go from resumed all the way to onDestroy without ever receiving focus loss. So you will generally react to focus loss independent of the other state flags.
When you lose focus, the user cannot provide game input. If your game supports pausing gameplay, you should automatically pause the game on focus loss, as the user would have lost the ability to control the game for some period of time.
NOTE: you should never automatically un-pause gameplay for the user. If you were to handle focus gain by simply unpausing, the user could be caught off guard and left to catch up (e.g., driving off-track in a racing game). So just show a pause screen, and allow the user to resume gameplay themselves.
Focus loss is also a good time to consider ducking your audio (i.e., temporarily reduce the volume), which helps reinforce that focus was lost, and gives whatever task now has focus the ability to play any audio without yours being overwhelming.
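The focus guidance above can be sketched as a pair of handlers. The Game struct, its member names, and the 0.25 ducking factor are all assumptions made for this illustration; the key point is that focus gain restores volume but leaves gameplay paused.

```cpp
struct Game {
    bool focused = false;
    bool gameplayPaused = false;
    float volume = 1.0f;

    void onFocusLost() {
        focused = false;
        gameplayPaused = true;  // the user can no longer provide input
        volume = 0.25f;         // duck audio; the factor is arbitrary
    }

    void onFocusGained() {
        focused = true;
        volume = 1.0f;          // undo the ducking
        // gameplayPaused is deliberately left alone: show the pause screen.
    }

    // Only an explicit user action un-pauses gameplay.
    void onUserPressedResume() { gameplayPaused = false; }
};
```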
Stop Rendering, Computation, Audio
When you receive the onPause event, you clear the resumed flag. This happens when another Activity comes to the front, whether hitting Home, bringing up the task switcher, jumping to an app from a notification, etc. As you are no longer the foreground Activity/app, your impact on the device should be minimized.
You should immediately stop all rendering and any other work that consumes CPU (e.g., physics, AI, all gameplay logic). This also means pausing worker threads so they don't continue to run off on their own. Also at this point, games should stop all audio/video playback.
You should also automatically pause gameplay for the user if you haven't already due to another event.
To be explicitly clear (as this has been a fairly frequent error), you should not touch the focus flag state as a side-effect of onPause; you only set/clear a flag when an explicit message for that specific flag is received.
This is also a good opportunity to auto-save user game progress (important if you aren't hooking the onSaveInstanceState callback). You can choose to save progress in onStop handling instead, but any later than that in the lifecycle and you risk losing state for the user. This is because once stopped, the process can be terminated without warning; you may never receive an onDestroy event.
Also note that surfaceDestroyed can occur after onPause, in rare cases even after onDestroy. As a result, applications may need to consider shutting down their EGL objects before the host surfaces are explicitly destroyed.
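A pause handler following the points above might be shaped like this. It is only a sketch: the App struct and its fields are hypothetical, and the "save" is a string standing in for whatever persistence the game actually uses. The handler clears the resumed flag and nothing else among the lifecycle flags.

```cpp
#include <string>

struct App {
    // Lifecycle flags
    bool resumed = true, focused = true, surface = true;
    // Work the app is doing
    bool workersRunning = true, audioPlaying = true, gameplayPaused = false;
    // Toy game state and save slot
    int level = 3;
    std::string savedProgress;

    void onPause() {
        resumed = false;         // the only lifecycle flag touched here
        workersRunning = false;  // stop physics, AI, and other gameplay threads
        audioPlaying = false;    // stop audio/video playback
        gameplayPaused = true;   // auto-pause if not already paused
        savedProgress = "level=" + std::to_string(level);  // auto-save
    }
};
```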
Don't Start Rendering or Audio (Until Focused)
From basic lifecycle flow, we see we reach the resumed state by way of normal app startup on a new launch (onCreate), returning to a running app instance (onRestart), or just waking a device from sleep (simply onResume). While an earlier section covered 'app startup', the guidance there completely applies to all three cases.
As your app comes to the foreground, you will receive the onResume event; the first thing you do for that event is set your resumed state flag to true. From there, it is up to other functional points in your code to evaluate how to react to the flag changing state.
In the app startup section, we learned that the app must not start rendering (or audio) until all three state flags are true, and you have reached 'ready to render'.
When a device has a lockscreen, you will get a loss of focus at some point. However, you can then wake the device, and sleep it again, and your app generally should not get a focus gain (or another loss). If you assume focus was gained without a focus event, then upon waking the device your app would always start rendering and music and sounds immediately -- all while the user is just quickly checking the time! Quite the bad user experience.
On a device with no lock screen, you may not receive any focus change events upon sleeping or waking. Thus any code that simply assumed a focus gain event would come after onResume would cause your app to loop forever, waiting for an event that would never be received. But if you track focus as a flag, and evaluate it at the top of each frame, you'll never just "wait for it." In fact, in the situations where you get a resume event and find the other two state flags are already true, your per-frame checks would notice that, ensure EGL was ready, and properly start up rendering and audio again.
So we don't restart audio or rendering before we are 'ready to render'. When the app receives onResume, simply set the resumed state flag true, and continue on.
Note again that when you do finally resume rendering, you must not automatically un-pause gameplay. Present the user with a game-paused screen, and leave it to them to un-pause.
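To make the lock-screen and no-lock-screen cases concrete, the sketch below replays a sequence of events against the three flags and reports whether rendering may start. The Event enum, Flags struct, and mayRenderAfter function are hypothetical names for this example; the event sequences used with it are simplified versions of the scenarios described above.

```cpp
#include <initializer_list>

enum class Event { Resume, Pause, FocusGained, FocusLost,
                   SurfaceCreated, SurfaceDestroyed };

struct Flags {
    bool resumed = false, focused = false, surface = false;

    // Each event touches exactly one flag; nothing is inferred from order.
    void apply(Event e) {
        switch (e) {
            case Event::Resume:           resumed = true;  break;
            case Event::Pause:            resumed = false; break;
            case Event::FocusGained:      focused = true;  break;
            case Event::FocusLost:        focused = false; break;
            case Event::SurfaceCreated:   surface = true;  break;
            case Event::SurfaceDestroyed: surface = false; break;
        }
    }

    bool allSet() const { return resumed && focused && surface; }
};

// Replays the events in order and reports whether the app may render
// at the top of the next frame.
bool mayRenderAfter(std::initializer_list<Event> events) {
    Flags f;
    for (Event e : events) f.apply(e);
    return f.allSet();
}
```

With no lock screen, a sequence ending in Pause then Resume (and no focus events) leaves all flags set, so rendering restarts without waiting. With a lock screen, the earlier FocusLost keeps the result false until a FocusGained event actually arrives.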
Check Other State Flags
As described earlier, window focus is non-hierarchical, and can come out of order with respect to Activity lifecycle events like pause and resume. You must check all three state flags upon focus gain to see if it is time to start things that may have been stopped before.
However, you might get a temporary focus loss and gain, due to certain system dialogs, without the application ever getting a pause event. In this case, you are still rendering and playing audio, and you likely auto-paused the game for the user, so the focus gain might have little impact.
With a major focus loss/gain sequence, such as the device being suspended (powered off), or the app being swapped out (phone call, IM, text, etc.), the focus gain itself simply flips the focus state flag back to true. Your check-state function will see that and if the other flags are also true it can start up rendering and audio again.
When certain device configuration changes occur, the application may be completely shut down and restarted. This is obviously quite confusing to the application (and developers). To prevent this app 'restart' process, you can use the android:configChanges attribute in the manifest with particular flags to tell Android that your application understands certain configuration changes natively, and wants to be notified when they occur. Our general recommended set of flags (for API 13 and above) is:
This will tell Android to send the app an onConfigurationChanged event for these classes of configuration changes. Many apps may not even need to override that callback explicitly, since the side effect is often a surface change event (which is already something we handle and track).
If you are interested in learning more about configuration changes, visit:
Native executables on most platforms signal exit by returning from their main function. However, since native code in Android is merely one piece of the application, this will not work for native Android applications.
Technically, a native Android application should not return from its main function until the framework -- whether NativeActivity or native_app_glue -- has requested it. If an application wishes to proactively exit, it must invoke the Java finish method. This can be done:
- Via JNI calls up to the Java of the application's activity to invoke Activity::finish
- When using NativeActivity, via a native call to ANativeActivity_finish
Exiting the native entrypoint (android_main) does not necessarily cause the top-level Java application to exit. Applications should keep their main loops running until requested to exit the thread by APP_CMD_DESTROY.
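The shape of such a main loop can be sketched without the Android headers. In the example below, the Cmd enum stands in for native_app_glue's APP_CMD_* messages and a std::queue simulates message delivery; both are assumptions made so the code is self-contained. A real loop would block waiting for events rather than stop when the queue runs dry.

```cpp
#include <queue>

// Stand-ins for native_app_glue's APP_CMD_* messages.
enum class Cmd { Resume, Pause, Destroy };

// Processes commands until Destroy arrives and returns the number of
// frames that were "rendered". The loop only exits early on an empty
// queue because this sketch has no real event source to wait on.
int runMainLoop(std::queue<Cmd>& pending) {
    bool destroyRequested = false;
    bool resumed = false;
    int frames = 0;
    while (!destroyRequested && !pending.empty()) {
        Cmd c = pending.front();
        pending.pop();
        switch (c) {
            case Cmd::Resume:  resumed = true;          break;
            case Cmd::Pause:   resumed = false;         break;
            case Cmd::Destroy: destroyRequested = true; break;
        }
        if (resumed && !destroyRequested) ++frames;  // one frame per iteration
    }
    return frames;
}
```

The loop keeps running through pause and resume, and only the destroy request ends it.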
One other issue with native Android applications is that even when the Java application has exited, the native process of the application may still be alive. This is a key point misunderstood by many app developers, and it causes bugs and crashes because any static/global variables will contain unexpected data/state on a subsequent restart of the Java app. So if you do use static data, make sure you clear it explicitly at every launch/creation of your app.
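A simple way to follow this advice is to gather static data into one struct and reassign it at launch. The sketch below assumes hypothetical names (GlobalState, g_state, onLaunch); in a real native app, onLaunch would be called at the top of the native entry point.

```cpp
// All static data lives in one struct so it can be reset in one place.
struct GlobalState {
    int score = 0;
    bool audioInitialized = false;
};

static GlobalState g_state;

// Call at every launch/creation: the process may have survived a previous
// run, so g_state can still hold values from that earlier session.
void onLaunch() {
    g_state = GlobalState{};  // back to default-initialized values
}
```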
In order to ensure your lifecycle handling is working as expected, here are the sorts of tests we do when testing games. You should perform these checks prior to every new private drop or public release:
- Suspend/Resume device (power off and back on) with a lock screen active.
- A badly-behaved app starts audio when the device is resumed, despite not having focus in this case.
- Suspend/Resume device (power off and back on) with no lock screen active.
- A badly-behaved app does not start rendering and sits hung, incorrectly and explicitly waiting for a window focus gained event (which never appears as the system may not report a focus loss/gain when there is no lock screen).
- Switch away from the app (e.g., Home button), then check CPU/GPU usage.
- A badly behaved app may not properly, immediately stop rendering (and other worker threads).
- Using shell tools like tegrastats or top can show the CPU (or GPU) is still heavily engaged despite being on an idle home screen, and show that if the process is forcibly killed off, that resource usage vanishes.
- You can also use Google's systrace tool to capture the state of your device over a period of time, and then visualize the results in a graphical environment.
- The 'double clutch' app suspend test: with a lock screen active, power off and back on, and, without unlocking, power back off; then check CPU/GPU usage.
- A badly behaved app may not properly stop rendering and other work, due to the quick sequence of major events, and focus events which come at odd points. You can use the tools above to also check device utilization for this case.