579 Commits (v1.8.1.m1)

Author SHA1 Message Date
  Megvii Engine Team 0708bc780c fix(dnn/cuda): disallow implicit dtype conversion in cublaslt matmul algos 3 years ago
  Megvii Engine Team 1e83ab638e feat(dnn): add channelwise conv for fp16 nchw88 3 years ago
  Megvii Engine Team 7b855dc64a fix(dnn/cuda): fix compilation for windows bazel 3 years ago
  Megvii Engine Team 3abe0b2462 fix(mgb): fix rocm pooling 3 years ago
  Megvii Engine Team 16678bb998 fix(dnn): fix_short_cutlass_name_gemm 3 years ago
  Megvii Engine Team 4c13bc7e1b feat(dnn/cuda): add nhwc int8 deconv 3 years ago
  Megvii Engine Team 11f022ff7c feat(dnn/cuda): add nhwc int8 imma conv and conv fuse typecvt 3 years ago
  Megvii Engine Team a0231a7920 fix(dnn/cuda): fix algo matmul for conv bwd filter 3 years ago
  Megvii Engine Team 56c1b626bf refactor(dnn): move arch-dependant code to arch.h 3 years ago
  Megvii Engine Team 67575d582c feat(mge/opr): add interpolate bilinear mode 3 years ago
  Megvii Engine Team 0558b2123d feat(mge/opr): add interpolate nearest mode 3 years ago
  Megvii Engine Team 127870a926 feat(dnn/opencl): add heuristic rule for batched matmul 3 years ago
  Megvii Engine Team c25125e3d2 perf(dnn/cuda): sass int8 epilogue remove shared load 3 years ago
  Megvii Engine Team 55efc8e197 feat(mgb/gopt): add reformat emitter 4 years ago
  Megvii Engine Team c9d060307f feat(dnn/common): add named tensor shape 4 years ago
  Megvii Engine Team ff0e6be7b9 fix(dnn/cuda): fix cutlass tensorop kernels 3 years ago
  Megvii Engine Team 336761253d feat(dnn/cuda): add tensorcore matmul for fp16 data type 3 years ago
  Megvii Engine Team 2c4ee99227 fix(dnn): short cutlass filename in windows 3 years ago
  Megvii Engine Team 432592374d build(dnn/cuda): fix cmake compile dependency for cutlass kernels 3 years ago
  Megvii Engine Team cc07b96f82 perf(dnn/relayout): disable copy_last_contiguous when contiguous_size is 3 years ago
  Megvii Engine Team d195fdec71 refactor(mgb): refactor has-usable-algo function for global optimizer 3 years ago
  Megvii Engine Team 604bb2a569 feat(mgb/dnn): add int atomic add for megdnn 3 years ago
  Megvii Engine Team eab6afab47 feat(mgb): add padding opr for megbrain 4 years ago
  Megvii Engine Team 66c18f6054 fix(ci): fix bazel compile error in new macos 3 years ago
  Megvii Engine Team c88a4e5b32 fix(mgb): fix get env macro 3 years ago
  Megvii Engine Team 9b4b910dc1 feat(dnn/cuda): integrate cutlass operation table and replace all cutlass wrappers 3 years ago
  Megvii Engine Team b18feaab33 feat(dnn/cuda): use cutlass remove shared load imma conv kernel 4 years ago
  Megvii Engine Team 1af350c6d2 feat(dnn): add fill kernel 3 years ago
  Megvii Engine Team 3eb0505f9b feat(imperative): add support for quantized conv transpose2d 3 years ago
  Megvii Engine Team c68e669530 feat(bazel/windows/xp/sp2/inference): implement inference on windows xp 3 years ago
  Megvii Engine Team 3b452d8c16 feat(mgb): cuda conv support nhwc format and fp16 dtype 3 years ago
  Megvii Engine Team 10bcf75767 feat(dnn/x86): add algo for x86 max pooling for Window size bigger than 10 and S1 under NCHW88 3 years ago
  Megvii Engine Team ddba5c9674 fix(core): fix nr_threads is zero 3 years ago
  Megvii Engine Team 67f117882b perf(arm_common): add elemwise unary multithread support 3 years ago
  Megvii Engine Team 3afa3893d7 perf(arm_common): optimize arm common pooling 9x9 and 13x13 3 years ago
  Megvii Engine Team 2c4ff5431b fix(mgb): fix cudnn ConvolutionBackwardData 3 years ago
  Megvii Engine Team 287cab49c2 fix(mgb/sereg): fix rng operator compatibility 3 years ago
  Megvii Engine Team 2aba0378b9 refactor(mgb/dnn): fix group conv is_available 3 years ago
  Megvii Engine Team 4a92346b7a refactor(mgb): refactor group conv3d 3 years ago
  Megvii Engine Team 6ce212d2e0 refactor(mgb): refactor group conv 4 years ago
  Megvii Engine Team f76a2cc2c6 feat(mge/opr): add silu and gelu 3 years ago
  Megvii Engine Team f8b0f2cb91 build(dnn/cutlass): fix build for cutlass 3 years ago
  Megvii Engine Team 869a03271b perf(mgb): disable FoldingConvBiasDimshufflePass in cuda10 for performance 3 years ago
  Megvii Engine Team 239916a997 fix(mgb/gopt): fix testcase for enable nchw64 pass 4 years ago
  Megvii Engine Team 4eda338876 feat(dnn/cuda): generate cutlass kimpls using cmake and bazel 4 years ago
  Megvii Engine Team 8d248a6a9a fix(dnn/cuda): fix testcase for fallback nchw qs8 conv 4 years ago
  Megvii Engine Team 894a2407c2 feat(dnn/cuda): add relayout format kernel for nchw <-> nhwc 4 years ago
  Megvii Engine Team 43c59204df refactor(dnn/cuda): refactor relayout format kernels 4 years ago
  Megvii Engine Team f41a808694 feat(dnn/cuda): add nhwc int4 conv support 4 years ago
  Megvii Engine Team 5a14a89224 refactor(dnn/cuda): refactor cutlass kernel generator for gemm and gemv 4 years ago