Wednesday, September 4, 2013

Episode 29: CUDA and OpenCL


Book of the Show
Tool of the Show

CUDA and OpenCL

  • GPU: Graphics Processing Unit
    • Vertex: A corner point of a polygon (usually a triangle)
      • Vertex Shader: Transforms properties of a vertex
    • Fragment: A candidate pixel produced when a polygon is rasterized
      • Fragment Shader: Sets the color & alpha of each fragment a polygon covers
    • GPUs are optimized for performing the same operation on many vertices / fragments.
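  • GPGPU: General-Purpose computing on Graphics Processing Units. The traditional approach goes through the graphics pipeline:
    • Encode input data as images (textures)
    • Turn off any special effects (fog, lighting, etc.)
    • Render exactly one frame
    • Read output data back from the rendered frame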
Common concepts
  • Both CUDA and OpenCL relieve the developer from having to know about vertex/fragment shaders & graphics processing.
    • Data is organized in n-D arrays
    • Memory access can be declared read-only (write-once), write-only (read-never), or read-write.
    • Can copy data to/from GPU with a familiar interface
  • High latency, but can pipeline operations
  • The GPU executes threads in warps (groups of threads; 32 on NVIDIA hardware). In each cycle, every thread in a warp must either perform the same operation or do nothing, so branching that splits a warp is bad.
  • Integer operations are comparatively slow; floating-point operations are fast
  • Debugging can be a nightmare (but improving)
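
The warp point above can be sketched with two small kernels, written here in CUDA syntax purely as an illustration. The kernel names and the `run_scale` host helper are hypothetical, and the block size of 64 threads (two warps) is an assumption. In the first kernel, neighbouring threads of one warp disagree on the branch; in the second, the condition is the same for every thread of a warp. For branches this short the compiler may avoid a real jump anyway, so treat this as a picture of the pattern rather than a benchmark.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Branch on the thread index: even and odd threads sit in the same warp,
// so the warp may have to step through both branches, with half of its
// threads idle each time.
__global__ void scale_per_thread(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (threadIdx.x % 2 == 0) {
        data[i] *= 2.0f;
    } else {
        data[i] *= 0.5f;
    }
}

// Branch on the warp index: all threads of a warp agree on the condition,
// so each warp executes only one of the two branches.
__global__ void scale_per_warp(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if ((threadIdx.x / warpSize) % 2 == 0) {
        data[i] *= 2.0f;
    } else {
        data[i] *= 0.5f;
    }
}

// Hypothetical host helper: fills n floats with 1.0, runs one of the
// kernels in blocks of 64 threads, and returns the result.
// Error checking is omitted to keep the sketch short.
std::vector<float> run_scale(bool per_warp, int n) {
    std::vector<float> host(n > 0 ? n : 0, 1.0f);
    if (n <= 0) return host;

    size_t bytes = host.size() * sizeof(float);
    float *dev = nullptr;
    cudaMalloc(&dev, bytes);
    cudaMemcpy(dev, host.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 64;
    int blocks = (n + threads - 1) / threads;
    if (per_warp) {
        scale_per_warp<<<blocks, threads>>>(dev, n);
    } else {
        scale_per_thread<<<blocks, threads>>>(dev, n);
    }

    // cudaMemcpy waits for the kernel to finish before copying back.
    cudaMemcpy(host.data(), dev, bytes, cudaMemcpyDeviceToHost);
    cudaFree(dev);
    return host;
}
```

The two kernels do not produce the same output: the first alternates 2.0 and 0.5 element by element, the second alternates in runs of 32. The point is only where the branch boundary falls relative to a warp.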
CUDA
  • Write .h and .cu files (C/C++ with CUDA extensions; device code is limited to a subset of the language)
  • Compile with NVIDIA's compiler (nvcc), then link with your C++ binary.
  • NVIDIA Libraries:
    • cuBLAS (CUDA Basic Linear Algebra Subprograms)
    • cuFFT (CUDA Fast Fourier Transform)
  • Language bindings for many languages (Python, Java, Lua, etc.)
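
As a rough sketch of what a .cu file might look like, here is a minimal vector addition that copies two arrays to the GPU, adds them there, and copies the result back. The file name `vector_add.cu`, the kernel name `vector_add`, and the host wrapper `add_on_gpu` are all hypothetical names chosen for this example, and the block size of 256 threads is an arbitrary assumption. Error checking is left out for brevity.

```cuda
// vector_add.cu (hypothetical file name)
#include <vector>
#include <cuda_runtime.h>

// Device code: each thread adds one pair of elements.
__global__ void vector_add(const float *a, const float *b, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i] + b[i];
    }
}

// Host code: assumes a and b have the same length.
std::vector<float> add_on_gpu(const std::vector<float> &a,
                              const std::vector<float> &b) {
    int n = static_cast<int>(a.size());
    std::vector<float> out(a.size());
    if (n == 0) return out;

    size_t bytes = a.size() * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;

    // Allocate GPU memory and copy the inputs over.
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    // Launch enough blocks of 256 threads to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);

    // Copy the result back; this also waits for the kernel to finish.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}
```

Called from a `main` function as `add_on_gpu({1.0f, 2.0f}, {3.0f, 4.0f})`, this should return 4.0 and 6.0 on a machine with a CUDA-capable GPU. The file can be built with something like `nvcc vector_add.cu -o vector_add`, or compiled to an object file with `nvcc -c` and linked into an existing C++ binary.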
OpenCL
  • An open standard (like OpenGL) with both CPU and GPU implementations
  • Easy to debug on the CPU before running on the GPU
  • Not limited to NVIDIA graphics cards (good for distribution)