GPU programming with Java

               Pramuditha Aravinda.
                 Kelum Senanayake.
Outline
   What is GPU.
   CPU vs. GPU Architecture.
   What is Stream processing.
   General Purpose GPU.
   CUDA.
   OpenCL
   Demo
What is GPU
   Graphics processing unit.
       Specialized microprocessor.
   Very efficient at manipulating computer graphics.
   Offloads and accelerates graphics rendering from the
    CPU.
   Dedicated to calculating floating point operations.
   Highly parallel structure.
       More effective for a range of complex algorithms.
   A GPU can be located on:
       Video card.
       Motherboard.
       CPU die (certain Intel Core CPUs).
CPU vs. GPU Architecture




       The GPU devotes more transistors to data processing.

     GPU : A Highly Parallel, Multithreaded, Manycore Processor
CPU vs. GPU contd…
What is Stream processing
   Is a computer programming paradigm, related to SIMD.
   Allows applications to easily exploit a limited form of
    parallel processing.
   Terminology
       Stream :- A set of data
       Kernel functions :- A series of operations
   Uniform streaming :- One kernel function is applied to all
    elements in the stream.
   Stream processing is driven by a data-centric model
       Image, video and digital signal processing
   Less efficient in general purpose processing with more
    randomized data access (such as databases)
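A minimal sketch of the terminology above, written as a CPU-side analogy in plain Java rather than GPU code. The class name `UniformStreamingSketch`, the method `applyKernel`, and the brightening function are hypothetical names chosen for illustration. The array plays the role of the stream and the lambda plays the role of the kernel function, applied uniformly to every element.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Illustrative only: a CPU-side analogy of uniform streaming, not GPU code. */
final class UniformStreamingSketch {

    /** Applies one kernel function to every element of the stream independently. */
    static double[] applyKernel(double[] stream, DoubleUnaryOperator kernel) {
        // parallel() lets the runtime split the elements across cores. This is safe
        // because each output element depends only on its own input element.
        return Arrays.stream(stream).parallel().map(kernel).toArray();
    }

    public static void main(String[] args) {
        double[] pixels = {0.0, 0.25, 0.5, 1.0};
        // Hypothetical kernel: double the brightness and clamp to 1.0.
        double[] brightened = applyKernel(pixels, p -> Math.min(1.0, p * 2.0));
        System.out.println(Arrays.toString(brightened)); // prints [0.0, 0.5, 1.0, 1.0]
    }
}
```

Because no element reads another element's result, the work can be divided in any order, which is the property stream processors rely on.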
General Purpose GPU
   The GPU is, by design, a stream processing system.
   GPGPU is a programming methodology.
       Modifying algorithms to run on existing GPU hardware
   Capable of performing simple operations on a stream of
    input data with amazing speed.
   Allows software developers to use stream processing on
    non-graphics data.
How hard is it?
   The languages are not very easy to use. Much GPU code is
    still written at a level close to assembly language.
   The process flow is unique. Simple branching statements,
    such as if statements, typically carry such a performance
    penalty that it is often faster to process both conditions.
   The unique Stream-In-Stream-Out design is not typically
    used in CPU programs.
   Mathematical inputs often have to be expressed as geometric
    primitives in order to push them to the system.
   Rapidly growing community.
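The point about branching can be sketched in plain Java. This is an illustration of the transformation only, run on the CPU; it makes no claim about speed on the JVM. The class and method names are hypothetical. The first method branches per element; the second computes both results for every element and selects one with a bit mask, which is the "process both conditions" style described above.

```java
/** Illustrative only: computing both sides of a branch and selecting one. */
final class BranchlessSelectSketch {

    /** Branching form: each element takes one of two code paths. */
    static int[] withBranch(int[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            if (in[i] < 0) {
                out[i] = -in[i];
            } else {
                out[i] = in[i] * 2;
            }
        }
        return out;
    }

    /** Branch-free form: every element runs the same instructions. */
    static int[] computeBoth(int[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            int whenNegative = -in[i];   // result of the "if" side
            int whenOther = in[i] * 2;   // result of the "else" side
            int mask = in[i] >> 31;      // all ones if negative, all zeros otherwise
            out[i] = (whenNegative & mask) | (whenOther & ~mask);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] data = {-3, 0, 4};
        System.out.println(java.util.Arrays.toString(withBranch(data)));  // prints [3, 0, 8]
        System.out.println(java.util.Arrays.toString(computeBoth(data))); // prints [3, 0, 8]
    }
}
```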
Programming GPUs!
   Plenty of interfaces
       Compute Unified Device Architecture (CUDA)
       OpenCL
       OpenGL Shader Language (GLSL)
       DirectX/DirectCompute/HLSL
       ATI Stream
CUDA
   Compute Unified Device Architecture.
   Parallel computing architecture developed by NVIDIA.
   Programmers use C for CUDA.
       C with NVIDIA extensions and certain restrictions.
   Third party wrappers are also available for Python, Perl,
    Fortran, Java, Ruby, Lua, MATLAB and IDL.
   Currently used in,
       SETI@Home
       Distributed Calculations, such as predicting the native
        conformation of proteins
       Accelerated interconversion of video file formats
       Physical simulations, in particular in fluid dynamics
CUDA Processing Flow
OpenCL
   Open Computing Language.
   Managed by the non-profit technology consortium
    Khronos Group
   Framework for writing programs that execute across
    heterogeneous platforms consisting of CPUs, GPUs, and
    other processors.
   Includes a language (based on C99) for writing kernels.
   APIs to define and then control the platforms.
   Supports both AMD/ATI and NVIDIA.
Programming GPU with Java and OpenCL
   We need to program against a GPGPU driver.
   However, most GPGPU drivers are available only as a native
    DLL.
   We need another layer to interface them to the Java runtime.
   This layer is called a Java binding.
Java Binding
   Layers, from top to bottom:
       UserProgram.class
       Jocl.jar
       Jocl.dll
       OpenCL.dll
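A small sketch of how the jar layer finds the native layer, using only the standard library. The short name "jocl" is an assumption used for illustration; the name an actual binding loads may differ. `System.mapLibraryName` shows the platform-specific file name the JVM would search for when a binding calls `System.loadLibrary`.

```java
/** Illustrative only: how a short library name maps to a platform-specific file. */
final class NativeLibraryNameSketch {

    /** Returns the file name the JVM looks for, given a library's short name. */
    static String nativeFileName(String shortName) {
        return System.mapLibraryName(shortName);
    }

    public static void main(String[] args) {
        // "jocl" is an assumed short name. On Windows this maps to jocl.dll,
        // on Linux to libjocl.so.
        System.out.println(nativeFileName("jocl"));
        // A binding's Java classes would typically call System.loadLibrary(shortName)
        // before the first native call. It is not called here because the native
        // library may not be installed on this machine.
    }
}
```

This is why the binding ships in two parts: the jar is the same everywhere, while the native file must match the operating system and CPU architecture.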
Prerequisites - Hardware
   OpenCL-capable graphics card.
   NVIDIA: all CUDA-enabled GPUs have OpenCL support.
       GeForce 8xxx or higher with 256MB minimum.
       http://www.nvidia.com/object/cuda_gpus.html
   AMD ATI Radeon™ HD 5400 or higher, AMD Radeon™
    HD 6800 series or higher.
       AMD x86 CPUs with SSE 2.x or later are also supported.
       http://developer.amd.com/gpu/AMDAPPSDK/pages/DriverCompatibility.aspx
Prerequisites - Software
   OpenCL driver. For NVIDIA GPUs, OpenCL drivers are usually
    distributed with the graphics card drivers.
       http://developer.nvidia.com/object/opencl-download.html
   Java bindings for OpenCL. There are usually two parts:
       Platform-dependent DLL, e.g. jocl-windows-x86.dll
       Platform-independent JAR file, Jocl.jar
       There are a few implementations.
        http://jogamp.org/deployment/webstart/archive/jocl-0.9-b1-20101213-windows-i586.zip
   JDK
       http://www.oracle.com/technetwork/java/javase/downloads/index.html
Demo Program
   Based on sample program available at
    http://jogamp.org/wiki/index.php/JOCL_Tutorial
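A hedged sketch of what such a demo computes, assuming it follows the common vector-addition shape (an assumption; the slides do not show the demo code). This is a CPU emulation in plain Java, not JOCL calls: `vectorAddWorkItem` stands in for the kernel body run once per global id, and `roundUp` pads the global work size to a multiple of the local work size. All names here are hypothetical.

```java
import java.util.Arrays;

/** Illustrative only: CPU emulation of a vector-add kernel launch. */
final class VectorAddSketch {

    /** Rounds globalSize up to the next multiple of groupSize. */
    static int roundUp(int groupSize, int globalSize) {
        int r = globalSize % groupSize;
        return (r == 0) ? globalSize : globalSize + groupSize - r;
    }

    /** Body of one work-item: what a kernel would do for a single global id. */
    static void vectorAddWorkItem(int gid, float[] a, float[] b, float[] c, int numElements) {
        if (gid >= numElements) {
            return; // padding work-items beyond the data do nothing
        }
        c[gid] = a[gid] + b[gid];
    }

    /** Emulates the launch: a GPU would run these iterations in parallel. */
    static float[] run(float[] a, float[] b, int localWorkSize) {
        int n = a.length;
        float[] c = new float[n];
        int globalWorkSize = roundUp(localWorkSize, n);
        for (int gid = 0; gid < globalWorkSize; gid++) {
            vectorAddWorkItem(gid, a, b, c, n);
        }
        return c;
    }

    public static void main(String[] args) {
        float[] c = run(new float[]{1f, 2f, 3f}, new float[]{10f, 20f, 30f}, 4);
        System.out.println(Arrays.toString(c)); // prints [11.0, 22.0, 33.0]
    }
}
```

The bounds check in the work-item is what makes the padded global size safe: with 3 elements and a local size of 4, the fourth work-item simply returns.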