Download: https://github.com/dimitrs/cpp-opencl
The cpp-opencl project aims to make GPU programming easier for the developer. It lets you implement data parallelism on a GPU directly in C++ instead of writing OpenCL yourself. See the example below. The code in the lambda passed to parallel_for_each is executed on the GPU, and everything else is executed on the CPU. More specifically, the “square” function is executed both on the CPU (via a call to std::transform) and on the GPU (via a call to compute::parallel_for_each). Conceptually, compute::parallel_for_each is similar to std::transform: both apply a function to every element of an input range and write the results to an output range, but compute::parallel_for_each runs that function on the GPU while std::transform runs it on the CPU.
#include <algorithm> // std::transform
#include <vector>
#include <stdio.h>
#include "ParallelForEach.h"
template<class T>
T square(T x)
{
return x * x;
}
void func() {
std::vector<int> In {1,2,3,4,5,6};
std::vector<int> OutGpu(6);
std::vector<int> OutCpu(6);
compute::parallel_for_each(In.begin(), In.end(), OutGpu.begin(), [](int x){
return square(x);
});
std::transform(In.begin(), In.end(), OutCpu.begin(), [](int x) {
return square(x);
});
//
// Do something with OutCpu and OutGpu ...
//
}
int main() {
func();
return 0;
}
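To make the std::transform analogy concrete, here is a minimal CPU-only sketch with the same call shape as the compute::parallel_for_each call in the example above (first, last, output iterator, unary function). The names parallel_for_each_cpu_sketch and squares_via_sketch are hypothetical and are not part of cpp-opencl. The sketch never touches the GPU and says nothing about how the real library is implemented. It only shows the element-wise contract that both calls appear to share.

```cpp
#include <algorithm>
#include <vector>

template <class T>
T square(T x) {
    return x * x;
}

// Hypothetical CPU-only stand-in, NOT the cpp-opencl implementation.
// It mirrors the argument order used in the example above.
template <class InputIt, class OutputIt, class UnaryOp>
OutputIt parallel_for_each_cpu_sketch(InputIt first, InputIt last,
                                      OutputIt out, UnaryOp op) {
    // Each output element depends only on the matching input element,
    // which is the property that makes the operation data parallel.
    return std::transform(first, last, out, op);
}

// Hypothetical helper that runs the sketch on the same input as the example.
std::vector<int> squares_via_sketch() {
    std::vector<int> in {1, 2, 3, 4, 5, 6};
    std::vector<int> out(in.size());
    parallel_for_each_cpu_sketch(in.begin(), in.end(), out.begin(),
                                 [](int x) { return square(x); });
    return out;
}
```

Because every element is computed independently, the order in which elements are processed does not affect the result, which is why this kind of loop is a natural fit for a GPU.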
Function Overloading
Additionally, it is possible to overload functions by execution target. The “A::GetIt” member function below is overloaded: the version marked “gpu” is executed on the GPU and the version marked “cpu” is executed on the CPU.
struct A {
int GetIt() const __attribute__((amp_restrict("cpu"))) {
return 2;
}
int GetIt() const __attribute__((amp_restrict("gpu"))) {
return 4;
}
};
compute::parallel_for_each(In.begin(), In.end(), OutGpu.begin(), [](int x){
A a;
return a.GetIt(); // returns 4
});
Build the Executable
The tool uses a special compiler based on Clang/LLVM.
cpp_opencl -x c++ -std=c++11 -O3 -o Input.cc.o -c Input.cc
The above command generates four files:
1. Input.cc.o
2. Input.cc.cl
3. Input.cc_cpu.cpp
4. Input.cc_gpu.cpp
Use the Clang C++ compiler directly to link:
clang++ ./Input.cc.o -o test -lOpenCL
Then just execute:
./test