Megvii Engine Team
9b4b910dc1
feat(dnn/cuda): integrate cutlass operation table and replace all cutlass wrappers
GitOrigin-RevId: 2a70335441
3 years ago
Megvii Engine Team
b18feaab33
feat(dnn/cuda): use cutlass remove shared load imma conv kernel
GitOrigin-RevId: 0b5574f526
4 years ago
Megvii Engine Team
1af350c6d2
feat(dnn): add fill kernel
GitOrigin-RevId: d2cee3a7a0
3 years ago
Megvii Engine Team
3eb0505f9b
feat(imperative): add support for quantized conv transpose2d
GitOrigin-RevId: ffd6431299
3 years ago
Megvii Engine Team
c68e669530
feat(bazel/windows/xp/sp2/inference): implement inference on windows xp
(os version >= sp2) build with bazel
* bazel build support (define __DEPLOY_ON_XP_SP2__ when deploying on xp sp2):
(dbg)./bazel build //brain/megbrain:load_and_run --cpu='x86_windows_xp'
--compiler='clang_cl' -c dbg --copt "-D__DEPLOY_ON_XP_SP2__=1"
(opt)./bazel build //brain/megbrain:load_and_run --cpu='x86_windows_xp'
--compiler='clang_cl' -c opt --copt "-D__DEPLOY_ON_XP_SP2__=1"
* internal behavior:
MGB_HAVE_THREAD=0 is defined when __DEPLOY_ON_XP_SP2__ is enabled
* refer to
https://docs.microsoft.com/en-us/cpp/build/configuring-programs-for-windows-xp?view=msvc-160
xp sp2 (x86) does not fully support the vc runtime, because KERNEL32.dll does
not implement some base apis needed by c++ std functions, for example
std::mutex/std::thread/std::condition_variable. As a workaround, we
disable some MegEngine features on the xp sp2 env, for example multi-threading.
* about DNN_MUTEX/MGB_MUTEX: if your code will be built into inference
code (even CPU backends), please replace std::mutex with DNN_MUTEX/MGB_MUTEX
* about multi-thread: if your code needs multi-thread support, please
enable it only when MGB_HAVE_THREAD=1
* about test build env status
1: Visual Studio 2019(MSVC version <= 14.26.28801)---- pass
2: Visual Studio 2019(MSVC version > 14.26.28801) ---- failed
caused by this 'new' version making the VCR depend on the win7
KERNEL32.DLL; this may be fixed in a later Visual Studio 2019 version,
but we did not test that at this MR merge point
3: Visual Studio 2017 ---------- pass
4: Visual Studio 2014 ---------- pass
GitOrigin-RevId: 65ac48b95e
3 years ago
Megvii Engine Team
3b452d8c16
feat(mgb): cuda conv support nhwc format and fp16 dtype
GitOrigin-RevId: b8ddcd108a
3 years ago
Megvii Engine Team
10bcf75767
feat(dnn/x86): add algo for x86 max pooling for Window size bigger than 10 and S1 under NCHW88
GitOrigin-RevId: 613a18dd91
3 years ago
Megvii Engine Team
ddba5c9674
fix(core): fix nr_threads is zero
GitOrigin-RevId: 0ccbe3c69b
3 years ago
Megvii Engine Team
67f117882b
perf(arm_common): add elemwise unary multithread support
GitOrigin-RevId: 8eac123f67
3 years ago
Megvii Engine Team
3afa3893d7
perf(arm_common): optimize arm common pooling 9x9 and 13x13
GitOrigin-RevId: 33d5a62478
3 years ago
Megvii Engine Team
2c4ff5431b
fix(mgb): fix cudnn ConvolutionBackwardData
GitOrigin-RevId: 1fffc06eaa
3 years ago
Megvii Engine Team
287cab49c2
fix(mgb/sereg): fix rng operator compatibility
GitOrigin-RevId: 66d1694035
3 years ago
Megvii Engine Team
2aba0378b9
refactor(mgb/dnn): fix group conv is_available
GitOrigin-RevId: b279909168
3 years ago
Megvii Engine Team
4a92346b7a
refactor(mgb): refactor group conv3d
GitOrigin-RevId: 15360a3a41
3 years ago
Megvii Engine Team
6ce212d2e0
refactor(mgb): refactor group conv
GitOrigin-RevId: 7afd312690
4 years ago
Megvii Engine Team
f76a2cc2c6
feat(mge/opr): add silu and gelu
GitOrigin-RevId: 75aa42947e
3 years ago
Megvii Engine Team
f8b0f2cb91
build(dnn/cutlass): fix build for cutlass
GitOrigin-RevId: 9aa095fe84
3 years ago
Megvii Engine Team
869a03271b
perf(mgb): disable FoldingConvBiasDimshufflePass in cuda10 for performance
GitOrigin-RevId: d1b95a6f01
3 years ago
Megvii Engine Team
239916a997
fix(mgb/gopt): fix testcase for enable nchw64 pass
GitOrigin-RevId: 2ae8d1608d
4 years ago
Megvii Engine Team
4eda338876
feat(dnn/cuda): generate cutlass kimpls using cmake and bazel
GitOrigin-RevId: da3bcfb85a
4 years ago
Megvii Engine Team
8d248a6a9a
fix(dnn/cuda): fix testcase for fallback nchw qs8 conv
GitOrigin-RevId: 646440db59
4 years ago
Megvii Engine Team
894a2407c2
feat(dnn/cuda): add relayout format kernel for nchw <-> nhwc
GitOrigin-RevId: e11f3e5408
4 years ago
Megvii Engine Team
43c59204df
refactor(dnn/cuda): refactor relayout format kernels
GitOrigin-RevId: ab86e66533
4 years ago
Megvii Engine Team
f41a808694
feat(dnn/cuda): add nhwc int4 conv support
GitOrigin-RevId: 5236b235d0
4 years ago
Megvii Engine Team
5a14a89224
refactor(dnn/cuda): refactor cutlass kernel generator for gemm and gemv
GitOrigin-RevId: 11d78ab227
4 years ago
Megvii Engine Team
b33217d8f0
refactor(dnn/cuda): refactor cutlass kernel generator for deconv operation
GitOrigin-RevId: 88e962a912
4 years ago
Megvii Engine Team
4abf7bd36f
refactor(dnn/cuda): refactor kernel generator for cutlass convolution kernels
GitOrigin-RevId: 7882f9c68c
4 years ago
Megvii Engine Team
b4687ce8da
feat(dnn/cuda): add convolution with i8 input and u4 output
GitOrigin-RevId: 8be439abf1
4 years ago
Megvii Engine Team
00083d13b6
fix(dnn/cuda): fix recursive algo search for fallback_nchw_qs8
GitOrigin-RevId: 6be2991224
4 years ago
Megvii Engine Team
66f70578c2
feat(dnn/cuda): add convolution with i8 input and i4 output
GitOrigin-RevId: 10512645d5
4 years ago
Megvii Engine Team
7d3df995cb
feat(gopt/inference): allow Float32 output dtype in EnableNCHW4Pass
GitOrigin-RevId: 81100dbaf7
4 years ago
Megvii Engine Team
633016a962
fix(dnn/cuda): fix AlgoFallbackNCHWQS8 to support Float32 dst
GitOrigin-RevId: 06f90f5cf3
4 years ago
Megvii Engine Team
4e4497b903
refactor(mgb/dnn): x86 pooling rebase algochooser
GitOrigin-RevId: 96cdc57180
3 years ago
Megvii Engine Team
a33c3b73bd
refactor(mgb/dnn): arm pooling rebase algochooser
GitOrigin-RevId: 21d17e647a
3 years ago
Megvii Engine Team
ea70d99b4d
fix(mge/convbias): make fallback convbias support nhwcd4 layout
GitOrigin-RevId: 1c306f867d
4 years ago
Megvii Engine Team
43098fb8f1
feat(mge): add SlidingWindowTranspose opr
BREAKING CHANGE:
GitOrigin-RevId: 54d726d2fe
4 years ago
Megvii Engine Team
b078dda90b
feat(mge/random): add some random op and remove random/distrbution.py
GitOrigin-RevId: 4c05ebc266
4 years ago
Megvii Engine Team
83e4c9d7ab
fix(opencl): open opencl topk test when opencl beyond 2.0
GitOrigin-RevId: f2ad6b4af2
4 years ago
Megvii Engine Team
f30c0e06a6
feat(mgb/opr): add lsq opr
GitOrigin-RevId: 45494a2b57
4 years ago
Megvii Engine Team
25932352e9
refactor(mgb/dnn): rocm pooling rebase algochooser
GitOrigin-RevId: 95be929841
4 years ago
Megvii Engine Team
1cfdbc565c
feat(dnn): add deterministic max pooling
GitOrigin-RevId: 9ab4c7a748
4 years ago
Megvii Engine Team
20ab82d00c
fix(tee): fix tee crash
GitOrigin-RevId: 379f970c87
4 years ago
Megvii Engine Team
a5060a2bfe
feat(mgb/opr): add check_has_inf kernel and opr
GitOrigin-RevId: 0d042dbfce
4 years ago
Megvii Engine Team
3597a6dbd7
feat(dnn/arm): nchw_nchw44 conv support 1x1s1
GitOrigin-RevId: 8c8f7d7c76
4 years ago
Megvii Engine Team
d915c5a3fd
refactor(mgb): make convolution3D handle noncontiguous tensors
GitOrigin-RevId: 3d3c31b021
4 years ago
Megvii Engine Team
d04cd67faf
refactor(mgb): make conv-backward-filter handle noncontiguous tensors
GitOrigin-RevId: 44c586f912
4 years ago
Megvii Engine Team
44376f702a
refactor(mgb): make conv-backward-data handle noncontiguous tensors
GitOrigin-RevId: 0a8f66f9d3
4 years ago
Megvii Engine Team
7b2a76d1ee
refactor(mgb): make conv handle noncontiguous tensors
GitOrigin-RevId: 86282709b3
4 years ago
Megvii Engine Team
ca2828ddcb
fix(dnn/x86): fix x86 int8 matmul ldc bug
GitOrigin-RevId: 2502f99000
4 years ago
Megvii Engine Team
40085acbae
fix(mgb): remove unnecessary cudnn8 warning
GitOrigin-RevId: 04cf1bfca9
4 years ago