5 Commits (c3a4b2225d21b40e16be3f77638a38d8f8ae11e4)

Author | SHA1 | Message | Date
Megvii Engine Team | c3a4b2225d | feat(dnn/cuda): add cutlass impls for fused convolution reformat operation | 4 years ago
Megvii Engine Team | 5f44203d7b | feat(dnn/cuda): add a cutlass impl for fusing convolution and dimshuffle | 4 years ago
Megvii Engine Team | 739f927c4c | feat(dnn/cuda): opt dp4a conv for small channel base on cutlass | 4 years ago
Megvii Engine Team | 76fa71573b | feat(dnn/cuda): add cutlass nchw4 convolution | 4 years ago
Megvii Engine Team | aeffcd5897 | feat(dnn/cuda): integrate cutlass nchw32 tensorcore convolution | 4 years ago