335 Commits (release-1.9)

Author SHA1 Message Date
  Megvii Engine Team 09dab38748 feat(cuda): support int1 simplewq conv 3 years ago
  Megvii Engine Team fd6f8e58b0 feat(mgb/dtype): add dtype qint1 3 years ago
  Megvii Engine Team 8c415f4ed7 feat(dnn): cuda nhwc nearest resize support not 1 or 3 channel 3 years ago
  Megvii Engine Team 87de704a46 feat(gopt): fuse conv h_swish 3 years ago
  Megvii Engine Team 04193e3bd1 feat(dnn): add nearest mode for remap and resize 3 years ago
  Megvii Engine Team e34a642b31 feat(fallback): reduce support general intrinsic 3 years ago
  Megvii Engine Team d7b0994a3e feat(cuda): add fp16 compute 16 kernel 3 years ago
  Megvii Engine Team 8a2e92bd6c refactor(cuda): depthwise large kernel 3 years ago
  Megvii Engine Team 6b8a69d5b6 feat(cuda): float16 depthwise large kernel conv compute fp32 3 years ago
  Megvii Engine Team bc385b5374 feat(cuda): support float16 depthwise large kernel conv 3 years ago
  Megvii Engine Team 7d2063e35a perf(cuda): speedup conv backward data with small feature map and large filter size 3 years ago
  Megvii Engine Team 72403e8929 perf(cuda): speedup chanwise conv with small feature map and large filter size 3 years ago
  Megvii Engine Team ab6d12caff feat(mge): add conv padding mode 3 years ago
  Megvii Engine Team 47fe766310 feat(dnn/cuda): add implicit bmm kernels for large kernel depthwise convolution backward filter opr 3 years ago
  Megvii Engine Team 6cefabe734 fix(dnn/cuda): fix ci 3 years ago
  Megvii Engine Team 888f4e46ae feat(dnn/cuda): add implicit bmm large kernel dwconv2d dgrad kernels 3 years ago
  Megvii Engine Team 08d8635ff5 feat(dnn/cuda): add implicit bmm large kernel dwconv2d fprop impl 3 years ago
  Megvii Engine Team 260923e11c perf(aarch64): optimize aarch64 uint16 relayout with block_w==3 3 years ago
  Megvii Engine Team 95ac055538 feat(dnn,mgb,imperative): add diag opr implement 3 years ago
  Megvii Engine Team 39d77fb55a feat(arm): add arm rnn_cell/lstm_cell/lstm optimized kernel 3 years ago
  Megvii Engine Team ee0b95e935 feat(dnn/elemwise/arm_common): support part of arm ternary elemwise multithread 3 years ago
  Megvii Engine Team cbbca5fb10 feat(mge): add softmax op use cudnn api 3 years ago
  Megvii Engine Team 20b42a8c3b fix(dnn): add naive lstm kernel 3 years ago
  Megvii Engine Team 2faa6ea5a9 Merge pull request #213 from kxz18:rnn 3 years ago
  Megvii Engine Team 82be0aaced test(dnn): fix compute capability requirement for NCHWX test 3 years ago
  Megvii Engine Team 1999307015 feat(mgb/opr): add dropout kernel 3 years ago
  Megvii Engine Team a93741815b feat(mgb/opr): add layernorm forward and backward kernel 3 years ago
  Megvii Engine Team c53cad2049 feat(cmake): format all cmake file 3 years ago
  Megvii Engine Team c90e0b54be perf(arm): optimize arm uint16 relayout with n=4 3 years ago
  Megvii Engine Team f6d9909460 feat(dnn): add elemwise multi type support i16xf32 and u8xf32 3 years ago
  kxz@thumt102-1 8f48da7ffe feat(mgb/opr): add cell level rnn/lstm and sequence level rnn/lstm 3 years ago
  Megvii Engine Team 6bb5409976 feat(dnn/src): add images2neibs kernel of opencl and related test 3 years ago
  Megvii Engine Team c96dbd29b8 fix(dnn/arm_common): support more monotonous case in arm typecvt for performance 3 years ago
  Megvii Engine Team 02d5f46d90 fix(mgb/x86): fix convbias crash on X86 3 years ago
  Megvii Engine Team 2696e4efaa feat(dnn): add float16 for remap backward 3 years ago
  Megvii Engine Team 11d75fecb5 feat(dnn/check_non_finite): add batch check_non_finite 3 years ago
  Megvii Engine Team 2318ea3f15 fix(dnn): fix naive average pooling overflow bug for int8 type 3 years ago
  Megvii Engine Team ba2f0c2e48 fix(dnn/cuda): fix cudnn_conv algo of conv_bias opr for fp16 add z cases 3 years ago
  Megvii Engine Team b59e8ccf24 fix(mgb): fix cambricon bangc copybara 3 years ago
  Megvii Engine Team 3116e128c5 fix(ci/integration_test): fix benchmark torch version 3 years ago
  Megvii Engine Team c85631aa77 feat(dnn): use ref ptr interface for all backends 3 years ago
  Megvii Engine Team 89186edc5d fix(dnn): correct reduce/argmxx/fakequant calculation with nan 3 years ago
  Megvii Engine Team 68cdabd288 feat(opr): indexing_multi_axis_vec support nd index 3 years ago
  Megvii Engine Team a1cba6cc27 fix(dnn): fix convbias crash on X86 3 years ago
  Megvii Engine Team 9b4cd92ba3 fix(mgb/dnn): fix cudnnConvBiasActivation crash on nchw32 int8 with oc > 256 3 years ago
  Megvii Engine Team c48d58daa8 feat(dnn/arm_common): add N1HW like elemwise broadcast mode 3 years ago
  Megvii Engine Team 26634db7a8 fix(dnn): support relayout for non-contiguous layout 3 years ago
  Megvii Engine Team 056fd6bc59 feat(dnn/arm64): support stride_m in arm64 relayout 3 years ago
  liuke b0ba6d3201 Merge pull request #207 from togetherwhenyouwant:feat-x86-matmul-6x16x2 3 years ago
  Megvii Engine Team 10af44abba fix(dnn/cuda): fix cudnn conv impl for nchw4_nchw hybrid layout 3 years ago