Megvii Engine Team
|
7b17c1180e
|
refactor(dnn): make cudnn_frontend work
GitOrigin-RevId: f089f93494
|
3 years ago |
Megvii Engine Team
|
35e9cc9845
|
feat(dnn/cuda): add cudnn frontend api
GitOrigin-RevId: 9b18a57893
|
3 years ago |
Megvii Engine Team
|
ab8f6398d9
|
fix(test): make test install
GitOrigin-RevId: e38d6c5e9f
|
3 years ago |
Megvii Engine Team
|
99cfefbfe0
|
fix(test): fix test copybara
GitOrigin-RevId: 19b7bdf377
|
3 years ago |
Megvii Engine Team
|
0d7ace15c8
|
fix(mgb/dnn): support fp16 for resize nhwc
GitOrigin-RevId: bb04d2a801
|
3 years ago |
Megvii Engine Team
|
f12b75c04b
|
perf(dnn/fallback): optimize some corner case in reduce
GitOrigin-RevId: 1185594301
|
3 years ago |
Megvii Engine Team
|
b55942a94d
|
feat(dnn/naive/norm,-dnn/cuda/norm,-dnn/test/norm): add norm dnn opr,
fwd only
GitOrigin-RevId: 989474168d
|
3 years ago |
Megvii Engine Team
|
5e306b756b
|
feat(x86): make conv1x1 and im2col available with x86-NCHW44
add AlgoF32GiMK4Pack4x12 matrix_mul algo
GitOrigin-RevId: 47cfe1d733
|
3 years ago |
Megvii Engine Team
|
5873d5f56f
|
feat(gi): add more gi api
GitOrigin-RevId: e2ae8c0873
|
3 years ago |
Megvii Engine Team
|
bbafe69974
|
feat(dnn): add elemwise COND_LT_MOV
GitOrigin-RevId: 444cd6825a
|
3 years ago |
Megvii Engine Team
|
0a266d7a1d
|
feat(riscv): speed up bazel build and fix rv64gc without rvv build
GitOrigin-RevId: 9bcbb4a9a0
|
3 years ago |
Megvii Engine Team
|
7d7cc3c8da
|
feat(gi/riscv): add gi support with risc-v
GitOrigin-RevId: a28fec3ce5
|
3 years ago |
Megvii Engine Team
|
4e66e0eb1f
|
feat(megdnn/softmax): add softmax operator in OpenCL
GitOrigin-RevId: e207d6ceb4
|
3 years ago |
Megvii Engine Team
|
96d90be1c6
|
feat(dnn): fallback support int4 relayout
GitOrigin-RevId: 3625f58470
|
3 years ago |
Megvii Engine Team
|
98b5ee78c1
|
feat(mge/dnn): add lamb optimizer
GitOrigin-RevId: 5a27157456
|
3 years ago |
Megvii Engine Team
|
9e0583e13a
|
feat(dnn/arm_common): add arm_common chanwise dot 11x11
GitOrigin-RevId: 84e0815a59
|
3 years ago |
Megvii Engine Team
|
c2500cdb7e
|
chore(license): apply change caused by bot forward rebase
GitOrigin-RevId: 2707bc03c9
|
3 years ago |
Megvii Engine Team
|
5f0e7ffb64
|
feat(fallback): add FB_GI_F32_4x12 benchmark
GitOrigin-RevId: cfacf31b28
|
3 years ago |
Megvii Engine Team
|
f249d387de
|
feat(fallback): imp gi matmul FB_GI_F32_4x12 algo
GitOrigin-RevId: 16255e7a72
|
3 years ago |
Megvii Engine Team
|
03f78547f7
|
feat(dnn/arm_common): add 9x9s1s2 dot chanwise kernel
GitOrigin-RevId: a28a97fcb5
|
3 years ago |
Megvii Engine Team
|
c2e9860feb
|
chore(license): remove all license in file header
GitOrigin-RevId: a0e31247a6
|
3 years ago |
Megvii Engine Team
|
e98049d77e
|
feat(fallback): move arm_common resize f32 algo to fallback gi
GitOrigin-RevId: 3370cdc57a
|
3 years ago |
Megvii Engine Team
|
91aaafd587
|
feat(fallback): move arm_common pooling f32 algo to fallback gi
GitOrigin-RevId: 1bddd6dc2c
|
3 years ago |
Megvii Engine Team
|
af6cdb2004
|
feat(fallback): fix ci
GitOrigin-RevId: b6e4e59553
|
3 years ago |
Megvii Engine Team
|
e4cc85e52c
|
feat(fallback): move arm_common f32 convbias to fallback gi
GitOrigin-RevId: ccf8b589be
|
3 years ago |
Megvii Engine Team
|
0f1afb0935
|
feat(fallback): imp gi matmul AlgoF32GiMK4_4x8 algo,
move AlgoF32GemvMK4 from arm_common to fallback
GitOrigin-RevId: 6c065abf99
|
3 years ago |
Megvii Engine Team
|
410dcb6c69
|
feat(fallback): add more gi api for conv, and add gi API test
GitOrigin-RevId: 24eb237502
|
3 years ago |
Megvii Engine Team
|
70209667e8
|
fix(dnn/test): fix some bug when force_deduce_layout is off
GitOrigin-RevId: d7ccc397df
|
3 years ago |
Megvii Engine Team
|
7dc347697a
|
feat(dnn/cuda): add typecvt uint16
GitOrigin-RevId: d1368c414e
|
3 years ago |
Megvii Engine Team
|
115c4592c0
|
fix(dnn/opencl): fix opencl elemwise tuning issue
GitOrigin-RevId: 317640547d
|
3 years ago |
Megvii Engine Team
|
ffbf8fad6c
|
feat(fallback): add general intrinsic to elemwise multitype
GitOrigin-RevId: fe7b335545
|
3 years ago |
Megvii Engine Team
|
4c0bff1dba
|
refactor(megdnn): refactor TEGRA_X1/X2 macro
GitOrigin-RevId: 1aa78712c6
|
3 years ago |
Megvii Engine Team
|
758549b936
|
feat(megengine): support tx2
GitOrigin-RevId: d1175a1f4a
|
3 years ago |
Megvii Engine Team
|
b6ad457269
|
feat(cuda): support int1 simplewq conv
GitOrigin-RevId: 9c37c41bc7
|
3 years ago |
Megvii Engine Team
|
331567af5d
|
fix(opencl/ci): misc opt and fix:
1: fix megbrain test failed on mali 2.1 devices
2: reduce ci time (by about 20 min)
GitOrigin-RevId: 4dcdcd48a6
|
3 years ago |
Megvii Engine Team
|
ff6a3bb819
|
fix(fallback): delete the repeat opcaller in fallback and arm_common
GitOrigin-RevId: 87046b8197
|
3 years ago |
Megvii Engine Team
|
547945e854
|
feat(fallback): support general intrinsic in elemwise in fallback
GitOrigin-RevId: 96ff2e88cc
|
3 years ago |
Megvii Engine Team
|
fd6f8e58b0
|
feat(mgb/dtype): add dtype qint1
GitOrigin-RevId: abe9fb68b1
|
3 years ago |
Megvii Engine Team
|
8c415f4ed7
|
feat(dnn): cuda nhwc nearest resize support not 1 or 3 channel
GitOrigin-RevId: 764504c341
|
3 years ago |
Megvii Engine Team
|
87de704a46
|
feat(gopt): fuse conv h_swish
GitOrigin-RevId: a3d12991fb
|
3 years ago |
Megvii Engine Team
|
04193e3bd1
|
feat(dnn): add nearest mode for remap and resize
GitOrigin-RevId: 31e7b72a78
|
3 years ago |
Megvii Engine Team
|
e34a642b31
|
feat(fallback): reduce support general intrinsic
GitOrigin-RevId: f250aa7b2a
|
3 years ago |
Megvii Engine Team
|
d7b0994a3e
|
feat(cuda): add fp16 compute 16 kernel
GitOrigin-RevId: e03435be02
|
3 years ago |
Megvii Engine Team
|
8a2e92bd6c
|
refactor(cuda): depthwise large kernel
GitOrigin-RevId: dade8710b4
|
3 years ago |
Megvii Engine Team
|
6b8a69d5b6
|
feat(cuda): float16 depthwise large kernel conv compute fp32
GitOrigin-RevId: 3050d48f26
|
3 years ago |
Megvii Engine Team
|
bc385b5374
|
feat(cuda): support float16 depthwise large kernel conv
GitOrigin-RevId: fdc1b15fbc
|
3 years ago |
Megvii Engine Team
|
7d2063e35a
|
perf(cuda): speedup conv backward data with small feature map and large filter size
GitOrigin-RevId: 85592bca6b
|
3 years ago |
Megvii Engine Team
|
72403e8929
|
perf(cuda): speedup chanwise conv with small feature map and large filter size
GitOrigin-RevId: e65b2ce856
|
3 years ago |
Megvii Engine Team
|
ab6d12caff
|
feat(mge): add conv padding mode
GitOrigin-RevId: 147ced856e
|
3 years ago |
Megvii Engine Team
|
47fe766310
|
feat(dnn/cuda): add implicit bmm kernels for large kernel depthwise convolution backward filter opr
GitOrigin-RevId: 932e7689e8
|
3 years ago |