7 Commits (ff0e6be7b97f721e85be13996ceee0b8aec87cbb)

Author SHA1 Message Date
  Megvii Engine Team ff0e6be7b9 fix(dnn/cuda): fix cutlass tensorop kernels 3 years ago
  Megvii Engine Team 336761253d feat(dnn/cuda): add tensorcore matmul for fp16 data type 3 years ago
  Megvii Engine Team 9b4b910dc1 feat(dnn/cuda): integrate cutlass operation table and replace all cutlass wrappers 3 years ago
  Megvii Engine Team 33da8de12b build(dnn/cuda): split compilation for cutlass wrapper 4 years ago
  Megvii Engine Team b717606989 fix(dnn/cuda): add block size limit for cutlass gemm algo 4 years ago
  Megvii Engine Team 2de2222e46 feat(dnn/cuda): add cutlass batched gemv kernel for matmul operator 4 years ago
  Megvii Engine Team 03c921f7c4 feat(dnn/cuda): add cutlass matmul impls 4 years ago