Megvii Engine Team
26634db7a8
fix(dnn): support relayout for non-contiguous layout
GitOrigin-RevId: 44a0adddba
3 years ago
Megvii Engine Team
056fd6bc59
feat(dnn/arm64): support stride_m in arm64 relayout
GitOrigin-RevId: c74193a23d
3 years ago
liuke
b0ba6d3201
Merge pull request #207 from togetherwhenyouwant:feat-x86-matmul-6x16x2
GitOrigin-RevId: 148ae44ba0
3 years ago
Megvii Engine Team
10af44abba
fix(dnn/cuda): fix cudnn conv impl for nchw4_nchw hybrid layout
the conv_bias algos *_IMPLICIT_GEMM in cuDNN versions below 8.0.0 are disabled due to incorrect results for int8x4->f32 configs
GitOrigin-RevId: 7cc52d0a85
3 years ago
Megvii Engine Team
5885b137fa
feat(dnn/arm): support channel-like broadcast for NHWC layout on arm
GitOrigin-RevId: fb4300004c
3 years ago
Megvii Engine Team
369c2ccc5a
style(all): reformat c++ code
GitOrigin-RevId: 3ffd1b211f
3 years ago
zjl
d2184af3b2
feat(dnn/src/x86/matmul): add matmul_6x16 for x86
3 years ago
Megvii Engine Team
177dec94c5
feat(mgb/opr): add bgr2gray mode for cvtcolor opr
GitOrigin-RevId: d50415b236
3 years ago
Megvii Engine Team
f5cb21ed3a
fix(mgb/opr): add non finite check
GitOrigin-RevId: a9fcd0a350
3 years ago
Megvii Engine Team
bde5cf3564
feat(dnn): add resize linear for arm
GitOrigin-RevId: 14ac5bda3f
3 years ago
Megvii Engine Team
3344b580a9
feat(dnn): add elemwise for nchw88+fp16
GitOrigin-RevId: 63587975f8
3 years ago
Megvii Engine Team
682c74df27
feat(dnn): add direct nchw88 fp16 conv
GitOrigin-RevId: 44719e8b64
3 years ago
Megvii Engine Team
3d3666b6e0
test(dnn/bn): add compatible configs for NHWC BN
GitOrigin-RevId: ac757ca307
3 years ago
Megvii Engine Team
3977b7aa0b
feat(mgb/shuffle): add shuffle opr
GitOrigin-RevId: 80490a6f84
3 years ago
Megvii Engine Team
17371e79b9
fix(dnn/reduce): fix incorrect reduce_mean o16c32 results for large tensors
GitOrigin-RevId: ebf03d814a
3 years ago
Megvii Engine Team
c33126ab5c
feat(mgb/gopt): add reformat manager
GitOrigin-RevId: b9791b131a
3 years ago
Megvii Engine Team
8b40f57738
feat(mgb/dnn): add conv1x1 algo for matrix mul
GitOrigin-RevId: 585b2c045a
3 years ago
Megvii Engine Team
d69b59035d
feat(dnn): add a get_all_algorithms_safe interface
GitOrigin-RevId: e3734e4531
3 years ago
Megvii Engine Team
103d7f33ba
refactor(dnn/rocm): update hip license header
GitOrigin-RevId: 79d684755d
4 years ago
Megvii Engine Team
5aa52d3863
feat(dnn/rocm): add adaptive pooling opr
GitOrigin-RevId: e844b3e770
3 years ago
Megvii Engine Team
323a4642e6
feat(dnn/rocm): add topk opr
GitOrigin-RevId: 5ecb079854
3 years ago
Megvii Engine Team
f4784f4af1
feat(dnn/rocm): add argsort opr
GitOrigin-RevId: b4c3eb4707
3 years ago
Megvii Engine Team
8b94f49328
fix(dnn/cuda): fix elemwise and relayout int4 bug when the last dimension is 1
GitOrigin-RevId: e7d64c4987
3 years ago
Megvii Engine Team
bc9cfc277a
feat(mgb): add arm resize nchwxx and naive nearest interp
GitOrigin-RevId: d5fbd59a30
3 years ago
Megvii Engine Team
722aecd437
feat(mgb): support fp16 nhwc backward
GitOrigin-RevId: 954ac6405a
3 years ago
Megvii Engine Team
0708bc780c
fix(dnn/cuda): disallow implicit dtype conversion in cublaslt matmul algos
disable tensor op matmul kernels when input and output tensors are in the f32 data type, to avoid potential accuracy loss
GitOrigin-RevId: 36859cba5a
3 years ago
Megvii Engine Team
1e83ab638e
feat(dnn): add channelwise conv for fp16 nchw88
GitOrigin-RevId: 1bb64f82c5
3 years ago
Megvii Engine Team
4c13bc7e1b
feat(dnn/cuda): add nhwc int8 deconv
GitOrigin-RevId: ad361a0f81
3 years ago
Megvii Engine Team
11f022ff7c
feat(dnn/cuda): add nhwc int8 imma conv and conv fuse typecvt
GitOrigin-RevId: 229e1eb4be
3 years ago
Megvii Engine Team
67575d582c
feat(mge/opr): add interpolate bilinear mode
GitOrigin-RevId: f7023a3fd3
3 years ago
Megvii Engine Team
0558b2123d
feat(mge/opr): add interpolate nearest mode
GitOrigin-RevId: d384b87f50
3 years ago
Megvii Engine Team
c25125e3d2
perf(dnn/cuda): remove shared load from sass int8 epilogue
GitOrigin-RevId: 2b49f5069b
3 years ago
Megvii Engine Team
c9d060307f
feat(dnn/common): add named tensor shape
GitOrigin-RevId: 918928b8ba
4 years ago
Megvii Engine Team
ff0e6be7b9
fix(dnn/cuda): fix cutlass tensorop kernels
do not compile cutlass tensorop kernels when using CUDA versions below 10.2
GitOrigin-RevId: d4c37d5f41
3 years ago
Megvii Engine Team
336761253d
feat(dnn/cuda): add tensorcore matmul for fp16 data type
GitOrigin-RevId: 025c591f75
3 years ago
Megvii Engine Team
eab6afab47
feat(mgb): add padding opr for megbrain
GitOrigin-RevId: 490e0c5d5a
4 years ago
Megvii Engine Team
b18feaab33
feat(dnn/cuda): use cutlass remove shared load imma conv kernel
GitOrigin-RevId: 0b5574f526
4 years ago
Megvii Engine Team
1af350c6d2
feat(dnn): add fill kernel
GitOrigin-RevId: d2cee3a7a0
3 years ago
Megvii Engine Team
3eb0505f9b
feat(imperative): add support for quantized conv transpose2d
GitOrigin-RevId: ffd6431299
3 years ago
Megvii Engine Team
3b452d8c16
feat(mgb): cuda conv support nhwc format and fp16 dtype
GitOrigin-RevId: b8ddcd108a
3 years ago
Megvii Engine Team
10bcf75767
feat(dnn/x86): add algo for x86 max pooling with window size bigger than 10 and stride 1 under NCHW88
GitOrigin-RevId: 613a18dd91
3 years ago
Megvii Engine Team
ddba5c9674
fix(core): fix nr_threads being zero
GitOrigin-RevId: 0ccbe3c69b
3 years ago
Megvii Engine Team
67f117882b
perf(arm_common): add elemwise unary multithread support
GitOrigin-RevId: 8eac123f67
3 years ago
Megvii Engine Team
3afa3893d7
perf(arm_common): optimize arm common pooling 9x9 and 13x13
GitOrigin-RevId: 33d5a62478
3 years ago
Megvii Engine Team
2aba0378b9
refactor(mgb/dnn): fix group conv is_available
GitOrigin-RevId: b279909168
3 years ago
Megvii Engine Team
4a92346b7a
refactor(mgb): refactor group conv3d
GitOrigin-RevId: 15360a3a41
4 years ago
Megvii Engine Team
6ce212d2e0
refactor(mgb): refactor group conv
GitOrigin-RevId: 7afd312690
4 years ago
Megvii Engine Team
869a03271b
perf(mgb): disable FoldingConvBiasDimshufflePass on CUDA 10 for performance
GitOrigin-RevId: d1b95a6f01
3 years ago
Megvii Engine Team
8d248a6a9a
fix(dnn/cuda): fix testcase for fallback nchw qs8 conv
GitOrigin-RevId: 646440db59
4 years ago
Megvii Engine Team
894a2407c2
feat(dnn/cuda): add relayout format kernel for nchw <-> nhwc
GitOrigin-RevId: e11f3e5408
4 years ago