GPU forms

Dedicated graphics cards

The most powerful class of GPUs typically interface with the motherboard by means of an expansion slot such as PCI Express (PCIe) or Accelerated Graphics Port (AGP) and can usually be replaced or upgraded with relative ease, assuming the motherboard is capable of supporting the upgrade. A few graphics cards still use Peripheral Component Interconnect (PCI) slots, but their bandwidth is so limited that they are generally used only when a PCIe or AGP slot is unavailable.

A dedicated GPU is not necessarily removable, nor does it necessarily interface with the motherboard in a standard fashion. The term "dedicated" refers to the fact that dedicated graphics cards have RAM that is dedicated to the card's use, not to the fact that most dedicated GPUs are removable. Dedicated GPUs for portable computers are most commonly interfaced through a non-standard and often proprietary slot due to size and weight constraints. Such ports may still be considered PCIe or AGP in terms of their logical host interface, even if they are not physically interchangeable with their counterparts.

Technologies such as SLI by NVIDIA and CrossFire by ATI allow multiple GPUs to be used to draw a single image, increasing the processing power available for graphics.

Integrated graphics solutions

Intel GMA X3000 IGP (under heatsink)

Integrated graphics solutions, or shared graphics solutions are graphics processors that utilize a portion of a computer's system RAM rather than dedicated graphics memory. Computers with integrated graphics account for 90% of all PC shipments[6]. These solutions are cheaper to implement than dedicated graphics solutions, but are less capable. Historically, integrated solutions were often considered unfit to play 3D games or run graphically intensive programs such as Adobe Flash[citation needed]. (Examples of such IGPs would be offerings from SiS and VIA circa 2004.)[7] However, today's integrated solutions such as the Intel's GMA X4500HD (Intel G45 chipset), AMD's Radeon HD 3200 (AMD 780G chipset) and NVIDIA's GeForce 8200 (NVIDIA nForce 730a) are more than capable of handling 2D graphics from Adobe Flash or low stress 3D graphics[8]. However, most integrated graphics still struggle with high-end video games. Chips like the Nvidia GeForce 9400M in Apple's new MacBook and MacBook Pro and AMD's Radeon HD 3300 (AMD 790GX) have improved performance, but still lag behind dedicated graphics cards.[citation needed] Some Integrated Graphics Modern desktop motherboards often include an integrated graphics solution and have expansion slots available to add a dedicated graphics card later.

As a GPU is extremely memory intensive, an integrated solution may find itself competing for the already slow system RAM with the CPU as it has minimal or no dedicated video memory. System RAM may be 2 Gbit/s to 12.8 Gbit/s, yet dedicated GPUs enjoy between 10 Gbit/s to over 100 Gbit/s of bandwidth depending on the model.

Older integrated graphics chipsets lacked hardware transform and lighting, but newer ones include it.[9]

Hybrid solutions

This newer class of GPUs competes with integrated graphics in the low-end desktop and notebook markets. The most common implementations of this are ATI's HyperMemory and NVIDIA's TurboCache. Hybrid graphics cards are somewhat more expensive than integrated graphics, but much less expensive than dedicated graphics cards. These also share memory with the system, but have a smaller dedicated amount of it than discrete graphics cards do, to make up for the high latency of the system RAM. Technologies within PCI Express can make this possible. While these solutions are sometimes advertised as having as much as 768MB of RAM, this refers to how much can be shared with the system memory.

Stream Processing and General Purpose GPUs (GPGPU)


A new concept is to use a modified form of a stream processor to allow a general purpose graphics processing unit. This concept turns the massive floating-point computational power of a modern graphics accelerator's shader pipeline into general-purpose computing power, as opposed to being hard wired solely to do graphical operations. In certain applications requiring massive vector operations, this can yield several orders of magnitude higher performance than a conventional CPU. The two largest discrete (see "Dedicated graphics cards" above) GPU designers, ATI and NVIDIA, are beginning to pursue this new market with an array of applications. Both nVidia and ATI have teamed with Stanford University to create a GPU-based client for the Folding@Home distributed computing project (for protein folding calculations). In certain circumstances the GPU calculates forty times faster than the conventional CPUs traditionally used in such applications.[10][11]

Recently NVidia began releasing cards supporting an API extension to the C programming language called CUDA ("Compute Unified Device Architecture"), which allows specified functions from a normal C program to run on the GPU's stream processors. This makes C programs capable of taking advantage of a GPU's ability to operate on large matrices in parallel, while still making use of the CPU where appropriate. CUDA is also the first API to allow CPU-based applications to access directly the resources of a GPU for more general purpose computing without the limitations of using a graphics API.

Since 2005 there has been interest in using the performance offered by GPUs for evolutionary computation in general and for accelerating the fitness evaluation in genetic programming in particular. There is a short introduction on pages 90–92 of A Field Guide To Genetic Programming. Most approaches compile linear or tree programs on the host PC and transfer the executable to the GPU to run. Typically the performance advantage is only obtained by running the single active program simultaneously on many example problems in parallel using the GPU's SIMD architecture[12][13]. However, substantial acceleration can also be obtained by not compiling the programs but instead transferring them to the GPU and interpreting them there[14][15]. Acceleration can then be obtained by either interpreting multiple programs simultaneously, simultaneously running multiple example problems, or combinations of both. A modern GPU (e.g. 8800 GTX or later) can readily simultaneously interpret hundreds of thousands of very small programs.

0 comments: