An optimization for MegBrain based on just-in-time (JIT) compilation.
JIT reduces the number of global memory accesses by fusing elemwise kernels into a
single, larger fused kernel, which improves performance.
For some common expressions such as a * b + c and a * b + c * d, MegBrain
already applies the FMA3_FUSE and FMA4_FUSE optimizations. With JIT, MegBrain can now
speed up arbitrary elemwise expressions.
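To make the memory-access argument concrete, here is a minimal sketch in plain Python (not MegBrain code) that counts element loads and stores for a * b + c when it is evaluated as two separate elemwise kernels versus one fused kernel. The `Traffic` and `Buffer` classes and the `measure` helper are hypothetical names introduced only for this illustration; the counters model global memory traffic, not real GPU behaviour.

```python
class Traffic:
    """Counts simulated global-memory loads and stores."""
    def __init__(self):
        self.reads = 0
        self.writes = 0


class Buffer:
    """A list whose element accesses are recorded in a Traffic counter."""
    def __init__(self, data, traffic):
        self.data = list(data)
        self.traffic = traffic

    def load(self, i):
        self.traffic.reads += 1
        return self.data[i]

    def store(self, i, value):
        self.traffic.writes += 1
        self.data[i] = value


def unfused(a, b, c, traffic):
    """Two kernels: tmp = a * b, then out = tmp + c."""
    n = len(a.data)
    tmp = Buffer([0] * n, traffic)
    out = Buffer([0] * n, traffic)
    for i in range(n):  # kernel 1 writes the intermediate to memory
        tmp.store(i, a.load(i) * b.load(i))
    for i in range(n):  # kernel 2 reads the intermediate back
        out.store(i, tmp.load(i) + c.load(i))
    return out.data


def fused(a, b, c, traffic):
    """One kernel: the intermediate a * b never touches memory."""
    n = len(a.data)
    out = Buffer([0] * n, traffic)
    for i in range(n):
        out.store(i, a.load(i) * b.load(i) + c.load(i))
    return out.data


def measure(kernel, n=4):
    """Run a kernel on small inputs; return (result, reads, writes)."""
    traffic = Traffic()
    a = Buffer(range(n), traffic)
    b = Buffer([2] * n, traffic)
    c = Buffer([1] * n, traffic)
    result = kernel(a, b, c, traffic)
    return result, traffic.reads, traffic.writes


if __name__ == "__main__":
    print("unfused:", measure(unfused))
    print("fused:  ", measure(fused))
```

In this toy model the unfused version performs 4n loads and 2n stores for n elements, while the fused version performs 3n loads and n stores, because the intermediate product stays in a local variable.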
a * b * c
| | opt0 | opt2 | opt3 (with jit) |
|---|---|---|---|
| speed | 100% | 100% | 150% |
a * b + c
| | opt0 | opt2 (with fma3) | opt3 (with jit) |
|---|---|---|---|
| speed | 100% | 150% | 150% |
AlexNet with Adam
| | opt0 | opt2 | opt3 (with jit) |
|---|---|---|---|
| speed | 100% | 103% | 114% |
ResNet with Adam, training
| | opt0 | opt2 | opt3 (with jit) |
|---|---|---|---|
| speed | 100% | 122% | 124% |
Detecting the subgraphs that can be fused and compiling each subgraph into a fused
kernel are the two most important parts of JIT.
The detection is implemented in `impl/fusion_pass.cpp`;
the main detection logic is in the function `Fusion::Impl::on_opr`. Compared to nnvm
fusion, our fusion logic can fuse more operators into one fusion kernel.
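As a rough illustration of what detecting a fusible subgraph means, the sketch below groups elemwise operators in a small operator graph. It is not the algorithm in `impl/fusion_pass.cpp`: it uses a deliberately conservative rule (an elemwise operator joins its consumer's group only when it has exactly one consumer and that consumer is elemwise), and the graph encoding and the `detect_fusion_groups` name are assumptions made for this example.

```python
def detect_fusion_groups(ops):
    """Group elemwise operators that can be fused into one kernel.

    ops: list of (name, kind, inputs) tuples in topological order,
         where kind is "input", "elemwise", or "other".
    Returns a list of groups (lists of names) with more than one member.
    """
    # Map each operator to the set of operators that consume its output.
    consumers = {name: set() for name, _, _ in ops}
    for name, _, inputs in ops:
        for src in inputs:
            consumers[src].add(name)

    group_of = {}
    # Walk from outputs to inputs so that a consumer is assigned to a
    # group before any of its producers is visited.
    for name, kind, _ in reversed(ops):
        if kind != "elemwise":
            continue
        users = consumers[name]
        if len(users) == 1 and next(iter(users)) in group_of:
            # Single elemwise consumer: merge into the consumer's group.
            group_of[name] = group_of[next(iter(users))]
        else:
            # Otherwise this operator becomes the root of a new group.
            group_of[name] = name

    groups = {}
    for name, _, _ in ops:  # keep topological order inside each group
        if name in group_of:
            groups.setdefault(group_of[name], []).append(name)
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical graph: relu(conv(a * b + c))
EXAMPLE_OPS = [
    ("a", "input", []),
    ("b", "input", []),
    ("c", "input", []),
    ("mul", "elemwise", ["a", "b"]),
    ("add", "elemwise", ["mul", "c"]),
    ("conv", "other", ["add"]),
    ("relu", "elemwise", ["conv"]),
]

if __name__ == "__main__":
    print(detect_fusion_groups(EXAMPLE_OPS))
```

Here `mul` and `add` form one fusible group, while `relu` stays on its own because a non-elemwise operator separates it from the rest. The single-consumer rule keeps the sketch simple and guarantees that a fused group never creates a dependency cycle; a real pass can be more aggressive.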
For now, JIT only supports CUDA, but it reserves an interface for extending to other
platforms.
You can set `graph_opt_level` to 3 to enable JIT.
In Python:

    cg = mgb.comp_graph()
    cg.set_option('graph_opt_level', 3)

You can set the environment variable `MGB_JIT_BACKEND` to select the JIT backend.
Backend | Platforms | Reduction support | Kernel Binary Cache | Kernel Reuse | Noncontig Input |
---|---|---|---|---|---|
HALIDE | CUDA | Yes | No | Shape | No |
NVRTC | CUDA | No | Via PersistentCache | Bcast type | Monotone |
To enable fusion of Reduce oprs, set `graph_opt.jit = 2` in the graph options.
JIT may produce temporary files. The default working directory is
a temporary directory; it can be changed via the `MGB_JIT_WORKDIR`
environment variable. Set `MGB_JIT_KEEP_INTERM`
to keep intermediate files (such as generated sources and
object files) for debugging.
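The environment variables above can also be set from Python before the graph is compiled. The following is a minimal sketch: the helper name `configure_jit_env` is hypothetical, and it assumes that the accepted backend values are the names listed in the backend table (`HALIDE`, `NVRTC`) and that setting `MGB_JIT_KEEP_INTERM` to `1` is enough to turn it on. Neither assumption is confirmed by this README.

```python
import os
import tempfile


def configure_jit_env(backend="NVRTC", workdir=None, keep_intermediate=False):
    """Set the JIT-related environment variables for the current process.

    Assumes the backend names from the table are the accepted values and
    that "1" enables MGB_JIT_KEEP_INTERM (both are unverified assumptions).
    """
    if backend not in ("HALIDE", "NVRTC"):
        raise ValueError("unknown JIT backend: %r" % (backend,))
    os.environ["MGB_JIT_BACKEND"] = backend

    if workdir is not None:
        # Create the directory so generated files have somewhere to go.
        os.makedirs(workdir, exist_ok=True)
        os.environ["MGB_JIT_WORKDIR"] = workdir

    if keep_intermediate:
        os.environ["MGB_JIT_KEEP_INTERM"] = "1"

    # Report the JIT-related variables that are now set.
    names = ("MGB_JIT_BACKEND", "MGB_JIT_WORKDIR", "MGB_JIT_KEEP_INTERM")
    return {name: os.environ[name] for name in names if name in os.environ}


if __name__ == "__main__":
    debug_dir = os.path.join(tempfile.gettempdir(), "mgb_jit_debug")
    print(configure_jit_env("NVRTC", workdir=debug_dir, keep_intermediate=True))
```

Environment variables set this way only affect the current process and its children, so the call has to happen before MegBrain reads them.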
`MGB_HALIDE_DEBUG`: enables debug printing for Halide.