Tip of the Month
How to (and How NOT to) Reduce Lag
The DirectX API allows graphics drivers to buffer up to three frames in the GPU's command queue. Such a large buffer lets the CPU and GPU work in parallel even as the workload on each varies. If there were no buffer, the GPU would go idle as soon as the CPU reduced its graphics command output (for example, because it was solving physics equations). Conversely, the CPU would go idle whenever it wanted to send another graphics command while the GPU was still busy rendering a previously submitted one.
On the other hand, allowing the driver to buffer three frames' worth of data also means that lag (the time between a user giving input and seeing its effect on-screen) increases by up to three frames.
Several solutions exist to limit lag when it becomes problematic. Locking the back buffer is one of them, but it is a particularly bad one; see our tip in Developer Newsletter #7 for why it is inadvisable.
1. For games that mainly interact via a cursor, such as real-time strategy games, it is often sufficient to simply reduce the lag of the cursor. GPUs have specialized hardware-supported cursors that can be updated independently of (that is, more frequently than) the rendered scene. For more details, see the DirectX documentation for the methods:
IDirect3DDevice9::ShowCursor, IDirect3DDevice9::SetCursorPosition, and IDirect3DDevice9::SetCursorProperties.
2. Another solution is to use DirectX event queries: DirectX allows the application to insert tokens, called "events," into the command buffer and later check whether a given event has been processed. For example, at start-up time create an event query via
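IDirect3DQuery9 *pQuery = NULL;
// Check the returned HRESULT: it is D3DERR_NOTAVAILABLE if event queries are unsupported.
HRESULT hr = device->CreateQuery(D3DQUERYTYPE_EVENT, &pQuery);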
Then just before calling Present(), insert the event into the command buffer:
pQuery->Issue(D3DISSUE_END);
To limit the number of buffered frames to at most one, we check at the end of the next frame that the query has been processed. If it has not, we spin until it has:
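BOOL data;  // D3DQUERYTYPE_EVENT returns a BOOL, not a C++ bool
while (pQuery->GetData(&data, sizeof(data), D3DGETDATA_FLUSH) == S_FALSE)
    ;  // spin until the GPU has processed the event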
Because we can track multiple events in parallel, and because we can insert and query these events from anywhere in the frame, we can finely regulate the maximum number of buffered frames: anything from a fraction of a frame (a buffer of at most half a frame) to 1, 2, or 2.5 frames. The main disadvantage of this technique is that the application actively spins while waiting for an event to be processed (see the while loop above). Spinning like this can waste precious CPU cycles.
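Tracking several events in parallel can be sketched as a small ring of events, one per frame allowed in flight. What follows is an illustration under stated assumptions, not the definitive implementation: FrameLimiter and its Event interface (Issue() and IsDone()) are hypothetical names introduced here. With Direct3D 9, Issue() would wrap pQuery->Issue(D3DISSUE_END) and IsDone() would wrap the GetData() check shown above. The call to std::this_thread::yield() is also an addition; it only hints to the scheduler that other threads may run and does not remove the cost of waiting.

```cpp
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical limiter: Event is any type providing
//   void Issue();   // insert the event into the command buffer
//   bool IsDone();  // true once the GPU has processed the event
template <typename Event>
class FrameLimiter
{
public:
    // events must be non-empty; its size is the maximum number of
    // frames allowed in flight.
    explicit FrameLimiter(std::vector<Event*> events)
        : events_(std::move(events)), issued_(events_.size(), false) {}

    // Call once per frame, just before Present().
    void EndFrame()
    {
        Event* e = events_[next_];
        if (issued_[next_]) {
            // This slot was last issued events_.size() frames ago;
            // wait until the GPU has consumed that frame.
            while (!e->IsDone())
                std::this_thread::yield();  // assumption: yield instead of a bare spin
        }
        e->Issue();  // mark the end of the current frame
        issued_[next_] = true;
        next_ = (next_ + 1) % events_.size();
    }

private:
    std::vector<Event*> events_;
    std::vector<bool> issued_;
    std::size_t next_ = 0;
};
```

Constructing the limiter with a single event reproduces the one-frame limit described above; two events allow roughly two frames in flight. Fractional limits would need events issued partway through the frame, which this sketch does not cover.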