Tuesday, May 23, 2006

Data Access Latency

Ryan and I have talked about making a chart to easily visualize the relative costs of common computer operations. A significant part of this is data access latencies in modern computer hardware. Yesterday I stumbled across some numbers in Raymond Chen's PDC 05 Talk: Five Things Every Windows Programmer Should Know, and behold! a chart is born:



Main memory latency has become a crucial consideration in modern software development. As the processor (CPU in the chart) becomes faster, the gulf between the processor memory (registers and caches) and main memory latency widens, making cache misses increasingly more expensive with respect to processor cycles. An increasingly frequent software design decision is to use simple array based data structures as opposed to pointer based tree and list structures for many operations. Typically, pointer structures (linked lists, binary trees, etc.) are theoretically more efficient than their linear counterparts (for example, binary trees [O(lg n) vs. O(n)] vs. linear search). Given the high cost of cache misses in modern hardware and the good memory locality of array based approaches, pointer based structures may perform 25-100 times more poorly than their simpler counterparts. In developing for the Xbox 360 and PlayStation 3 with crazy powerful processors, linearizing data structures is a crucial optimization.

So instead of using that binary tree next time, consider using a sorted linear array with a binary search.

No comments: