Megvii Engine Team
|
bbafe69974
|
feat(dnn): add elemwise COND_LT_MOV
GitOrigin-RevId: 444cd6825a
|
3 years ago |
Megvii Engine Team
|
0a266d7a1d
|
feat(riscv): speed up bazel build and fix rv64gc without rvv build
GitOrigin-RevId: 9bcbb4a9a0
|
3 years ago |
Megvii Engine Team
|
7d7cc3c8da
|
feat(gi/riscv): add gi support with risc-v
GitOrigin-RevId: a28fec3ce5
|
3 years ago |
Megvii Engine Team
|
4e66e0eb1f
|
feat(megdnn/softmax): add softmax operator in OpenCL
GitOrigin-RevId: e207d6ceb4
|
3 years ago |
Megvii Engine Team
|
96d90be1c6
|
feat(dnn): fallback support int4 relayout
GitOrigin-RevId: 3625f58470
|
3 years ago |
Megvii Engine Team
|
98b5ee78c1
|
feat(mge/dnn): add lamb optimizer
GitOrigin-RevId: 5a27157456
|
3 years ago |
Megvii Engine Team
|
9e0583e13a
|
feat(dnn/arm_common): add arm_common chanwise dot 11x11
GitOrigin-RevId: 84e0815a59
|
3 years ago |
Megvii Engine Team
|
c2500cdb7e
|
chore(license): apply change caused by bot forward rebase
GitOrigin-RevId: 2707bc03c9
|
3 years ago |
Megvii Engine Team
|
5f0e7ffb64
|
feat(fallback): add FB_GI_F32_4x12 benchmark
GitOrigin-RevId: cfacf31b28
|
3 years ago |
Megvii Engine Team
|
f249d387de
|
feat(fallback): imp gi matmul FB_GI_F32_4x12 algo
GitOrigin-RevId: 16255e7a72
|
3 years ago |
Megvii Engine Team
|
03f78547f7
|
feat(dnn/arm_common): add 9x9s1s2 dot chanwise kernel
GitOrigin-RevId: a28a97fcb5
|
3 years ago |
Megvii Engine Team
|
c2e9860feb
|
chore(license): remove all license in file header
GitOrigin-RevId: a0e31247a6
|
3 years ago |
Megvii Engine Team
|
e98049d77e
|
feat(fallback): move arm_common resize f32 algo to fallback gi
GitOrigin-RevId: 3370cdc57a
|
3 years ago |
Megvii Engine Team
|
91aaafd587
|
feat(fallback): move arm_common pooling f32 algo to fallback gi
GitOrigin-RevId: 1bddd6dc2c
|
3 years ago |
Megvii Engine Team
|
af6cdb2004
|
feat(fallback): fix ci
GitOrigin-RevId: b6e4e59553
|
3 years ago |
Megvii Engine Team
|
e4cc85e52c
|
feat(fallback): move arm_common f32 convbias to fallback gi
GitOrigin-RevId: ccf8b589be
|
3 years ago |
Megvii Engine Team
|
0f1afb0935
|
feat(fallback): imp gi matmul AlgoF32GiMK4_4x8 algo,
move AlgoF32GemvMK4 from arm_common to fallback
GitOrigin-RevId: 6c065abf99
|
3 years ago |
Megvii Engine Team
|
410dcb6c69
|
feat(fallback): add more gi api for conv, and add gi API test
GitOrigin-RevId: 24eb237502
|
3 years ago |
Megvii Engine Team
|
70209667e8
|
fix(dnn/test): fix some bug when force_deduce_layout is off
GitOrigin-RevId: d7ccc397df
|
3 years ago |
Megvii Engine Team
|
7dc347697a
|
feat(dnn/cuda): add typecvt uint16
GitOrigin-RevId: d1368c414e
|
3 years ago |
Megvii Engine Team
|
115c4592c0
|
fix(dnn/opencl): fix opencl elemwise tuning issue
GitOrigin-RevId: 317640547d
|
3 years ago |
Megvii Engine Team
|
ffbf8fad6c
|
feat(fallback): add general intrinsic to elemwise multitype
GitOrigin-RevId: fe7b335545
|
3 years ago |
Megvii Engine Team
|
4c0bff1dba
|
refactor(megdnn): refactor TEGRA_X1/X2 macro
GitOrigin-RevId: 1aa78712c6
|
3 years ago |
Megvii Engine Team
|
758549b936
|
feat(megengine): support tx2
GitOrigin-RevId: d1175a1f4a
|
3 years ago |
Megvii Engine Team
|
b6ad457269
|
feat(cuda): support int1 simplewq conv
GitOrigin-RevId: 9c37c41bc7
|
3 years ago |
Megvii Engine Team
|
331567af5d
|
fix(opencl/ci): misc opt and fix:
1: fix megbrain test failed on mali 2.1 devices
2: reduce ci time (about reduce 20min)
GitOrigin-RevId: 4dcdcd48a6
|
3 years ago |
Megvii Engine Team
|
ff6a3bb819
|
fix(fallback): delete the repeat opcaller in fallback and arm_common
GitOrigin-RevId: 87046b8197
|
3 years ago |
Megvii Engine Team
|
547945e854
|
feat(fallback): support general intrinsic in elemwise in fallback
GitOrigin-RevId: 96ff2e88cc
|
3 years ago |
Megvii Engine Team
|
fd6f8e58b0
|
feat(mgb/dtype): add dtype qint1
GitOrigin-RevId: abe9fb68b1
|
3 years ago |
Megvii Engine Team
|
8c415f4ed7
|
feat(dnn): cuda nhwc nearest resize support not 1 or 3 channel
GitOrigin-RevId: 764504c341
|
3 years ago |
Megvii Engine Team
|
87de704a46
|
feat(gopt): fuse conv h_swish
GitOrigin-RevId: a3d12991fb
|
3 years ago |
Megvii Engine Team
|
04193e3bd1
|
feat(dnn): add nearest mode for remap and resize
GitOrigin-RevId: 31e7b72a78
|
3 years ago |
Megvii Engine Team
|
e34a642b31
|
feat(fallback): reduce support general intrinsic
GitOrigin-RevId: f250aa7b2a
|
3 years ago |
Megvii Engine Team
|
d7b0994a3e
|
feat(cuda): add fp16 compute 16 kernel
GitOrigin-RevId: e03435be02
|
3 years ago |
Megvii Engine Team
|
8a2e92bd6c
|
refactor(cuda): depthwise large kernel
GitOrigin-RevId: dade8710b4
|
3 years ago |
Megvii Engine Team
|
6b8a69d5b6
|
feat(cuda): float16 depthwise large kernel conv compute fp32
GitOrigin-RevId: 3050d48f26
|
3 years ago |
Megvii Engine Team
|
bc385b5374
|
feat(cuda): support float16 depthwise large kernel conv
GitOrigin-RevId: fdc1b15fbc
|
3 years ago |
Megvii Engine Team
|
7d2063e35a
|
perf(cuda): speedup conv backward data with small feature map and large filter size
GitOrigin-RevId: 85592bca6b
|
3 years ago |
Megvii Engine Team
|
72403e8929
|
perf(cuda): speedup chanwise conv with small feature map and large filter size
GitOrigin-RevId: e65b2ce856
|
3 years ago |
Megvii Engine Team
|
ab6d12caff
|
feat(mge): add conv padding mode
GitOrigin-RevId: 147ced856e
|
3 years ago |
Megvii Engine Team
|
47fe766310
|
feat(dnn/cuda): add implicit bmm kernels for large kernel depthwise convolution backward filter opr
GitOrigin-RevId: 932e7689e8
|
3 years ago |
Megvii Engine Team
|
6cefabe734
|
fix(dnn/cuda): fix ci
GitOrigin-RevId: 8267e5f9dd
|
3 years ago |
Megvii Engine Team
|
888f4e46ae
|
feat(dnn/cuda): add implicit bmm large kernel dwconv2d dgrad kernels
GitOrigin-RevId: fcb7974d62
|
3 years ago |
Megvii Engine Team
|
08d8635ff5
|
feat(dnn/cuda): add implicit bmm large kernel dwconv2d fprop impl
GitOrigin-RevId: feb09ebb58
|
3 years ago |
Megvii Engine Team
|
260923e11c
|
perf(aarch64): optimize aarch64 uint16 relayout with block_w==3
GitOrigin-RevId: fe6aaaac0c
|
3 years ago |
Megvii Engine Team
|
95ac055538
|
feat(dnn,mgb,imperative): add diag opr implement
GitOrigin-RevId: 43016ffa2b
|
3 years ago |
Megvii Engine Team
|
39d77fb55a
|
feat(arm): add arm rnn_cell/lstm_cell/lstm optimized kernel
GitOrigin-RevId: b9bb7352bc
|
3 years ago |
Megvii Engine Team
|
ee0b95e935
|
feat(dnn/elemwise/arm_common): support part of arm ternary elemwise multithread
BCAST111C_VEC_BCAST111C and BCAST101_VEC_BCAST101
GitOrigin-RevId: 0e26553c90
|
3 years ago |
Megvii Engine Team
|
cbbca5fb10
|
feat(mge): add softmax op use cudnn api
GitOrigin-RevId: 7734ebf8c4
|
3 years ago |
Megvii Engine Team
|
20b42a8c3b
|
fix(dnn): add naive lstm kernel
GitOrigin-RevId: f08ef810cf
|
3 years ago |