Megvii Engine Team
|
ffbf8fad6c
|
feat(fallback): add general intrinsic to elemwise multitype
GitOrigin-RevId: fe7b335545
|
3 years ago |
Megvii Engine Team
|
14e9ad625d
|
fix(megdnn): emit define-but-not-referenced and extra-;-ignored warning on cuda9.0~cuda9.1
GitOrigin-RevId: f6db42e395
|
3 years ago |
Megvii Engine Team
|
4c0bff1dba
|
refactor(megdnn): refactor TEGRA_X1/X2 macro
GitOrigin-RevId: 1aa78712c6
|
3 years ago |
Megvii Engine Team
|
758549b936
|
feat(megengine): support tx2
GitOrigin-RevId: d1175a1f4a
|
3 years ago |
Megvii Engine Team
|
c2435d1561
|
perf(imperative): specialize adaptive pooling
GitOrigin-RevId: 01e1418458
|
3 years ago |
Megvii Engine Team
|
39d98d4525
|
feat(fallback): add fallback typecvt with general intrinsic
GitOrigin-RevId: 1e6fcd929b
|
3 years ago |
Megvii Engine Team
|
d2278f02d2
|
perf(imperative): speed up conv_transpose3d
GitOrigin-RevId: e741305446
|
3 years ago |
Megvii Engine Team
|
3a5347ed21
|
perf(imperative): speed up pooling
GitOrigin-RevId: 9f60b45eeb
|
3 years ago |
Megvii Engine Team
|
c0b267fff6
|
refactor(cuda-stub): opt cuda-stub log
GitOrigin-RevId: 87dda08e1b
|
3 years ago |
Megvii Engine Team
|
d9c4ef59fe
|
perf(imperative): using simple hash key in heuristic cache
GitOrigin-RevId: 6fddd612e7
|
3 years ago |
Megvii Engine Team
|
26ea33c6a7
|
perf(imperative): improve convbwd performance
GitOrigin-RevId: cfc8623d7a
|
3 years ago |
Megvii Engine Team
|
3949d425fb
|
feat(core): always show MegEngine version and git commit id
GitOrigin-RevId: 4daa5be6d6
|
3 years ago |
Megvii Engine Team
|
b6ad457269
|
feat(cuda): support int1 simplewq conv
GitOrigin-RevId: 9c37c41bc7
|
3 years ago |
Megvii Engine Team
|
331567af5d
|
fix(opencl/ci): misc opt and fix:
1: fix megbrain test failed on mali 2.1 devices
2: reduce ci time (by about 20 min)
GitOrigin-RevId: 4dcdcd48a6
|
3 years ago |
Megvii Engine Team
|
ff6a3bb819
|
fix(fallback): delete the duplicated opcaller in fallback and arm_common
GitOrigin-RevId: 87046b8197
|
3 years ago |
Megvii Engine Team
|
547945e854
|
feat(fallback): support general intrinsic in elemwise in fallback
GitOrigin-RevId: 96ff2e88cc
|
3 years ago |
Megvii Engine Team
|
a017bed3aa
|
fix(fallback): rename general intrinsic type and add more intrinsic
GitOrigin-RevId: 37409bae9a
|
3 years ago |
Megvii Engine Team
|
fd6f8e58b0
|
feat(mgb/dtype): add dtype qint1
GitOrigin-RevId: abe9fb68b1
|
3 years ago |
Megvii Engine Team
|
2a900a69cb
|
perf(imperative): improve reduce op performance
GitOrigin-RevId: 26d982a7b8
|
3 years ago |
Megvii Engine Team
|
72a70dd6a7
|
perf(imperative): specialize convolution implementation
GitOrigin-RevId: 33634c550f
|
3 years ago |
Megvii Engine Team
|
2b80806f21
|
perf(imperative/src): improve dot performance
GitOrigin-RevId: 35b5bd164f
|
3 years ago |
Megvii Engine Team
|
3c3fc6f33c
|
refactor(imperative): move python code of elemwise/reduce/conv2d/bn to c++
GitOrigin-RevId: 01b5324392
|
3 years ago |
Megvii Engine Team
|
e400b7ffe5
|
perf(imperative): enable memory forwarding for imperative
GitOrigin-RevId: 7c1993979c
|
3 years ago |
Megvii Engine Team
|
1ce78aa09b
|
fix(imperative): destruct dnn handles at last
GitOrigin-RevId: 7a67c68c55
|
3 years ago |
Megvii Engine Team
|
3228fb75a5
|
fix(cuda): conv algo heuristic choose
GitOrigin-RevId: 95c5e7d627
|
3 years ago |
Megvii Engine Team
|
8c415f4ed7
|
feat(dnn): cuda nhwc nearest resize supports channel counts other than 1 or 3
GitOrigin-RevId: 764504c341
|
3 years ago |
Megvii Engine Team
|
6fb5a34360
|
build(flatbuffer/cx2): fix cx2 build and fix uclibc build flatbuffer
GitOrigin-RevId: af851e155f
|
3 years ago |
Megvii Engine Team
|
87de704a46
|
feat(gopt): fuse conv h_swish
GitOrigin-RevId: a3d12991fb
|
3 years ago |
Megvii Engine Team
|
3726f5cc92
|
feat(gopt): merge consecutive relayout and dimshuffle into one relayout to optimize CD4 performance
GitOrigin-RevId: a058776be3
|
3 years ago |
Megvii Engine Team
|
ac26bdcef5
|
fix(cuda): fix direct conv speed and memory problem
GitOrigin-RevId: 6faeeff3b8
|
3 years ago |
Megvii Engine Team
|
f7994683bd
|
feat(cuda): add large kernel direct conv to heuristic algo chooser
GitOrigin-RevId: bc927b6df7
|
3 years ago |
Megvii Engine Team
|
6dc0c0b9cc
|
fix(dnn): fix the sync problem in some kernels
GitOrigin-RevId: df3f7dc51b
|
3 years ago |
Megvii Engine Team
|
04193e3bd1
|
feat(dnn): add nearest mode for remap and resize
GitOrigin-RevId: 31e7b72a78
|
3 years ago |
Megvii Engine Team
|
93c7e45188
|
feat(arm): delete the redundant implementation
GitOrigin-RevId: ff32a3dc8b
|
3 years ago |
Megvii Engine Team
|
e34a642b31
|
feat(fallback): reduce support general intrinsic
GitOrigin-RevId: f250aa7b2a
|
3 years ago |
Megvii Engine Team
|
10f23778a8
|
feat(fallback): add simd general intrinsic
GitOrigin-RevId: ad78ba689f
|
3 years ago |
Megvii Engine Team
|
286051ede1
|
feat(dnn): differentiate sass kernel with cuda version
GitOrigin-RevId: 40bb4423b8
|
3 years ago |
Megvii Engine Team
|
f78b60ec10
|
feat(bazel): make bazel gensass depend on cuda toolchain version automatically
GitOrigin-RevId: 9433f21a91
|
3 years ago |
Megvii Engine Team
|
f48227c07d
|
feat(mgb): show more details for cuda driver api call
GitOrigin-RevId: 40e63d9dac
|
3 years ago |
Megvii Engine Team
|
d8bb3ff5b4
|
fix(cuda): fix fp16 tensorcore gemm split k workspace
GitOrigin-RevId: d04a0e0985
|
3 years ago |
Megvii Engine Team
|
d7b0994a3e
|
feat(cuda): add fp16 compute 16 kernel
GitOrigin-RevId: e03435be02
|
3 years ago |
Megvii Engine Team
|
8a2e92bd6c
|
refactor(cuda): depthwise large kernel
GitOrigin-RevId: dade8710b4
|
3 years ago |
Megvii Engine Team
|
6b8a69d5b6
|
feat(cuda): float16 depthwise large kernel conv compute fp32
GitOrigin-RevId: 3050d48f26
|
3 years ago |
Megvii Engine Team
|
bc385b5374
|
feat(cuda): support float16 depthwise large kernel conv
GitOrigin-RevId: fdc1b15fbc
|
3 years ago |
Megvii Engine Team
|
7d2063e35a
|
perf(cuda): speedup conv backward data with small feature map and large filter size
GitOrigin-RevId: 85592bca6b
|
3 years ago |
Megvii Engine Team
|
72403e8929
|
perf(cuda): speedup chanwise conv with small feature map and large filter size
GitOrigin-RevId: e65b2ce856
|
3 years ago |
Megvii Engine Team
|
ab6d12caff
|
feat(mge): add conv padding mode
GitOrigin-RevId: 147ced856e
|
3 years ago |
Megvii Engine Team
|
47fe766310
|
feat(dnn/cuda): add implicit bmm kernels for large kernel depthwise convolution backward filter opr
GitOrigin-RevId: 932e7689e8
|
3 years ago |
Megvii Engine Team
|
dcc9693582
|
feat(dnn/cuda): add heuristic rule for implicit batched gemm large kernel dwconv2d kernels
GitOrigin-RevId: 2d2c213bfd
|
3 years ago |
Megvii Engine Team
|
6cefabe734
|
fix(dnn/cuda): fix ci
GitOrigin-RevId: 8267e5f9dd
|
3 years ago |