Megvii Engine Team | afe9c4b50d | feat(dnn/cuda): add implicit bmm kernels for large kernel depthwise convolution backward filter opr (GitOrigin-RevId: 932e7689e8) | 3 years ago
Megvii Engine Team | 1da58ae17a | feat(dnn/cuda): add implicit bmm large kernel dwconv2d dgrad kernels (GitOrigin-RevId: fcb7974d62) | 3 years ago
Megvii Engine Team | 96050073a2 | feat(dnn/cuda): add implicit bmm large kernel dwconv2d fprop impl (GitOrigin-RevId: feb09ebb58) | 3 years ago
Megvii Engine Team | 16678bb998 | fix(dnn): fix_short_cutlass_name_gemm (GitOrigin-RevId: cc0a2db9da) | 3 years ago
Megvii Engine Team | 4c13bc7e1b | feat(dnn/cuda): add nhwc int8 deconv (GitOrigin-RevId: ad361a0f81) | 3 years ago
Megvii Engine Team | 11f022ff7c | feat(dnn/cuda): add nhwc int8 imma conv and conv fuse typecvt (GitOrigin-RevId: 229e1eb4be) | 3 years ago
Megvii Engine Team | ff0e6be7b9 | fix(dnn/cuda): fix cutlass tensorop kernels; do not compile cutlass tensorop kernels, when using cuda version less than 10.2 (GitOrigin-RevId: d4c37d5f41) | 3 years ago
Megvii Engine Team | 336761253d | feat(dnn/cuda): add tensorcore matmul for fp16 data type (GitOrigin-RevId: 025c591f75) | 3 years ago
Megvii Engine Team | 2c4ee99227 | fix(dnn): short cutlass filename in windows (GitOrigin-RevId: 83a43fdf87) | 3 years ago
Megvii Engine Team | 432592374d | build(dnn/cuda): fix cmake compile dependency for cutlass kernels (GitOrigin-RevId: ebe71f5a12) | 3 years ago
Megvii Engine Team | 9b4b910dc1 | feat(dnn/cuda): integrate cutlass operation table and replace all cutlass wrappers (GitOrigin-RevId: 2a70335441) | 3 years ago
Megvii Engine Team | b18feaab33 | feat(dnn/cuda): use cutlass remove shared load imma conv kernel (GitOrigin-RevId: 0b5574f526) | 4 years ago
Megvii Engine Team | f8b0f2cb91 | build(dnn/cutlass): fix build for cutlass (GitOrigin-RevId: 9aa095fe84) | 3 years ago
Megvii Engine Team | 4eda338876 | feat(dnn/cuda): generate cutlass kimpls using cmake and bazel (GitOrigin-RevId: da3bcfb85a) | 4 years ago