High-Level GPU-Accelerated Programming in Python

After some basic digging, I sadly found that, as of now (2015.12), there is (probably) no mature solution offering simple, high-level GPU programming in Python. The commercial NumbaPro is promising, especially considering the success of its free counterpart, numba. In my tests, however, the JIT-compiled program was still highly unstable, crashing easily when fed large data, and the acceleration fell far short of my expectations.
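
For reference, below is the kind of high-level interface I have in mind: a minimal sketch using numba's @vectorize with the CUDA target (the decorator options follow numba's documentation; exact spellings differ between numba and NumbaPro versions, and a CUDA-capable device is assumed):

```python
import numpy as np
from numba import vectorize

# Compile an elementwise ufunc for the GPU; numba handles the
# host-to-device copies, the kernel launch, and the copy back.
@vectorize(['float32(float32, float32)'], target='cuda')
def gpu_add(a, b):
    return a + b

x = np.arange(1000000, dtype=np.float32)
y = 2 * x
print(gpu_add(x, y)[:5])  # [ 0.  3.  6.  9.  12.]
```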

According to this Stack Overflow article (2015.6), the scikit-cuda package provides some frequently used functions such as matrix multiplication, though with some restrictions: (1) Python 2.7 only; (2) no customization ability (as far as I can tell).
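
For instance, a matrix product through scikit-cuda looks roughly like this (a sketch assuming pycuda and a working CUDA toolkit are installed; skcuda.linalg.dot delegates to cuBLAS):

```python
import numpy as np
import pycuda.autoinit            # noqa: F401, sets up the CUDA context
import pycuda.gpuarray as gpuarray
import skcuda.linalg as linalg

linalg.init()

a = np.random.rand(1024, 1024).astype(np.float32)
b = np.random.rand(1024, 1024).astype(np.float32)

a_gpu = gpuarray.to_gpu(a)        # host -> device transfer
b_gpu = gpuarray.to_gpu(b)
c_gpu = linalg.dot(a_gpu, b_gpu)  # cuBLAS sgemm on the GPU

print(np.allclose(c_gpu.get(), a.dot(b), atol=1e-3))
```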

numba.cuda is a relatively low-level package. It sounds interesting, but I just don't have the time to dig deep into it. The ideal outcome would be for the folks at Continuum to keep improving guvectorize in NumbaPro and make it truly usable.
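
To show what I mean by low-level: with numba.cuda, even an elementwise addition means writing the kernel and picking the launch configuration yourself (a sketch, again assuming a CUDA-capable device):

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_kernel(x, y, out):
    i = cuda.grid(1)          # absolute index of this thread
    if i < x.size:            # guard threads past the end of the array
        out[i] = x[i] + y[i]

n = 1000000
x = np.arange(n, dtype=np.float32)
y = 2 * x
out = np.empty_like(x)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
add_kernel[blocks, threads_per_block](x, y, out)  # implicit host<->device copies
print(out[:5])
```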
