English Wikipedia: CuPy


CuPy is an open-source library for GPU-accelerated computing with the Python programming language. It provides multi-dimensional arrays, sparse matrices, and a variety of numerical algorithms implemented on top of them.[1] CuPy shares the same API set as NumPy and SciPy, so it can serve as a drop-in replacement for running NumPy/SciPy code on a GPU. CuPy supports the NVIDIA CUDA GPU platform and, starting with v9.0, the AMD ROCm GPU platform.[2][3]

CuPy was initially developed as a backend for the Chainer deep learning framework and was established as an independent project in 2017.[4]

CuPy is part of the NumPy ecosystem of array libraries[5] and is widely used for GPU computing with Python,[6] especially in high-performance computing environments such as Summit,[7] Perlmutter,[8] EULER,[9] and ABCI.[10]

CuPy is a NumFOCUS affiliated project.[11]

Features

CuPy implements NumPy/SciPy-compatible APIs, as well as features for writing user-defined GPU kernels and accessing low-level APIs.[12][13]

NumPy-compatible APIs

The same set of APIs defined in the NumPy package (numpy.*) is available under the cupy.* package.
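Because the function names and semantics match, array code can often be written against a module alias and run with either library. The following is a minimal sketch of that pattern, not an official recipe: the alias xp and the helper normalize_rows are illustrative names, and the fallback to NumPy is an assumption added only so that the example also runs on a machine without a usable GPU.

```python
import numpy as np

try:
    import cupy as cp
    cp.arange(1).sum()  # touches the device; raises if no usable GPU is present
    xp = cp
except Exception:
    cp = None
    xp = np  # CPU fallback so the sketch stays runnable


def normalize_rows(a):
    """Scale each row of a 2-D array to unit Euclidean length."""
    norms = xp.linalg.norm(a, axis=1, keepdims=True)
    return a / norms


a = xp.asarray([[3.0, 4.0], [6.0, 8.0]])
unit = normalize_rows(a)

# Copy the result back to host memory when it lives on the GPU.
unit_host = cp.asnumpy(unit) if cp is not None else unit
print(unit_host)
```

The body of normalize_rows is identical for both libraries; only the module bound to xp changes.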

SciPy-compatible APIs

The same set of APIs defined in the SciPy package (scipy.*) is available under the cupyx.scipy.* package.
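As an illustration, the sketch below builds a compressed sparse row (CSR) matrix and multiplies it by a vector. It assumes that cupyx.scipy.sparse mirrors scipy.sparse for this small use case, and it falls back to SciPy on the CPU when no usable GPU is found; the fallback is an addition for portability, not something CuPy does on its own.

```python
try:
    import cupy as xp
    from cupyx.scipy import sparse
    xp.arange(1).sum()  # raises if no usable GPU is present
except Exception:
    import numpy as xp
    from scipy import sparse  # CPU fallback with the same interface

dense = xp.asarray([[1.0, 0.0, 2.0],
                    [0.0, 0.0, 3.0],
                    [4.0, 5.0, 6.0]])
m = sparse.csr_matrix(dense)      # only the non-zero entries are stored
v = xp.asarray([1.0, 1.0, 1.0])
row_sums = m.dot(v)               # sparse matrix-vector product

# CuPy arrays expose .get() to copy to host memory; NumPy arrays do not need it.
row_sums_host = row_sums.get() if hasattr(row_sums, "get") else row_sums
print(m.nnz, row_sums_host)
```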

User-defined GPU kernels

  • Kernel templates for element-wise and reduction operations
  • Raw kernel (CUDA C/C++)
  • Just-in-time transpiler (JIT)
  • Kernel fusion
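The first item in the list above can be sketched with cupy.ElementwiseKernel, which takes the input parameters, the output parameters, a per-element body written in CUDA C++, and a kernel name. The kernel name squared_diff and the pure-Python fallback are illustrative assumptions; the fallback exists only so that the example runs without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    cp.arange(1).sum()  # raises if no usable GPU is present
    HAVE_GPU = True
except Exception:
    cp = None
    HAVE_GPU = False

if HAVE_GPU:
    squared_diff = cp.ElementwiseKernel(
        'float32 x, float32 y',   # input parameters
        'float32 z',              # output parameter
        'z = (x - y) * (x - y)',  # body executed once per element
        'squared_diff')           # kernel name
else:
    def squared_diff(x, y):
        # CPU stand-in that computes the same values with NumPy
        return (x - y) * (x - y)

xp = cp if HAVE_GPU else np
x = xp.arange(6, dtype=xp.float32)
y = xp.full(6, 2.0, dtype=xp.float32)
z = squared_diff(x, y)

z_host = cp.asnumpy(z) if HAVE_GPU else z
print(z_host)
```

Indexing and broadcasting are handled by the template, so the body only describes what happens to a single element.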

Distributed computing

  • Distributed communication package (cupyx.distributed), providing collective and peer-to-peer primitives

Low-level CUDA features

  • Stream and event
  • Memory pool
  • Profiler
  • Host API binding
  • CUDA Python support[14]
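Two of the features listed above, streams and the memory pool, can be sketched as follows. The helper sum_of_squares is a hypothetical name, and the NumPy branch is an assumption added so that the example also runs without a GPU; on a GPU the work is queued on a non-default stream and the pool is asked to release its cached blocks afterwards.

```python
import numpy as np

try:
    import cupy as cp
    cp.arange(1).sum()  # raises if no usable GPU is present
except Exception:
    cp = None


def sum_of_squares(n):
    """Return 0^2 + 1^2 + ... + (n-1)^2, computed on the GPU when available."""
    if cp is None:
        return float((np.arange(n, dtype=np.float64) ** 2).sum())

    stream = cp.cuda.Stream(non_blocking=True)
    with stream:                      # work inside the block is queued on this stream
        x = cp.arange(n, dtype=cp.float64)
        total = (x ** 2).sum()
    stream.synchronize()              # wait for the queued work to finish

    pool = cp.get_default_memory_pool()
    print("bytes held by the pool:", pool.total_bytes())
    del x
    pool.free_all_blocks()            # return cached, unused blocks to the device
    return float(total)


print(sum_of_squares(4))
```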

Interoperability

Examples

Array creation

>>> import cupy as cp
>>> x = cp.array([1, 2, 3])
>>> x
array([1, 2, 3])
>>> y = cp.arange(10)
>>> y
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Basic operations

>>> import cupy as cp
>>> x = cp.arange(12).reshape(3, 4).astype(cp.float32)
>>> x
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)
>>> x.sum(axis=1)
array([ 6., 22., 38.], dtype=float32)

Raw CUDA C/C++ kernel

>>> import cupy as cp
>>> kern = cp.RawKernel(r'''
... extern "C" __global__
... void multiply_elemwise(const float* in1, const float* in2, float* out) {
...     int tid = blockDim.x * blockIdx.x + threadIdx.x;
...     out[tid] = in1[tid] * in2[tid];
... }
... ''', 'multiply_elemwise')
>>> in1 = cp.arange(16, dtype=cp.float32).reshape(4, 4)
>>> in2 = cp.arange(16, dtype=cp.float32).reshape(4, 4)
>>> out = cp.zeros((4, 4), dtype=cp.float32)
>>> kern((4,), (4,), (in1, in2, out))  # grid, block and arguments
>>> out
array([[  0.,   1.,   4.,   9.],
       [ 16.,  25.,  36.,  49.],
       [ 64.,  81., 100., 121.],
       [144., 169., 196., 225.]], dtype=float32)
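The example above launches exactly one thread per element (4 blocks of 4 threads for 16 elements). When the array length is not a multiple of the block size, a common approach is to pass the length to the kernel and guard the write. The following is a sketch under that assumption: the extra parameter n, the helper multiply, and the block size of 128 are illustrative choices, and the NumPy branch is only there so that the code runs without a GPU.

```python
import numpy as np

try:
    import cupy as cp
    cp.arange(1).sum()  # raises if no usable GPU is present
except Exception:
    cp = None

KERNEL_SRC = r'''
extern "C" __global__
void multiply_elemwise(const float* in1, const float* in2, float* out, int n) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    if (tid < n) {  // threads past the end of the arrays do nothing
        out[tid] = in1[tid] * in2[tid];
    }
}
'''


def multiply(a, b, threads_per_block=128):
    """Multiply two equal-length float32 vectors element by element."""
    if cp is None:
        return a * b  # CPU fallback, only so the sketch runs without a GPU
    kern = cp.RawKernel(KERNEL_SRC, 'multiply_elemwise')
    d_a, d_b = cp.asarray(a), cp.asarray(b)
    d_out = cp.empty_like(d_a)
    n = d_a.size
    blocks = (n + threads_per_block - 1) // threads_per_block  # ceiling division
    # The scalar is passed as a 32-bit integer to match `int n` in the kernel.
    kern((blocks,), (threads_per_block,), (d_a, d_b, d_out, np.int32(n)))
    return cp.asnumpy(d_out)


a = np.arange(10, dtype=np.float32)
print(multiply(a, a))
```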

Applications

See also

References


External links
