Megvii Engine Team
|
7dc347697a
|
feat(dnn/cuda): add typecvt uint16
GitOrigin-RevId: d1368c414e
|
3 years ago |
Megvii Engine Team
|
409c988163
|
fix(imperative): add matmul apply_on_varnode
GitOrigin-RevId: 2cf6bf237c
|
3 years ago |
Megvii Engine Team
|
115c4592c0
|
fix(dnn/opencl): fix opencl elemwise tuning issue
GitOrigin-RevId: 317640547d
|
3 years ago |
Megvii Engine Team
|
73112558d0
|
feat(mge/dnn): support checknonfinite for fp16
GitOrigin-RevId: 83fa139ac0
|
3 years ago |
Megvii Engine Team
|
ed7fa10470
|
feat(fallback): move direct multi_thread_common helper to fallback
GitOrigin-RevId: 27ed93e4c1
|
3 years ago |
Megvii Engine Team
|
8871ad74af
|
refactor(fallback): opt gi naive reinterpret
GitOrigin-RevId: d27c6bccfe
|
3 years ago |
Megvii Engine Team
|
ffbf8fad6c
|
feat(fallback): add general intrinsic to elemwise multitype
GitOrigin-RevId: fe7b335545
|
3 years ago |
Megvii Engine Team
|
14e9ad625d
|
fix(megdnn): emit define-but-not-referenced and extra-;-ignored warning on cuda9.0~cuda9.1
GitOrigin-RevId: f6db42e395
|
3 years ago |
Megvii Engine Team
|
4c0bff1dba
|
refactor(megdnn): refactor TEGRA_X1/X2 macro
GitOrigin-RevId: 1aa78712c6
|
3 years ago |
Megvii Engine Team
|
758549b936
|
feat(megengine): support tx2
GitOrigin-RevId: d1175a1f4a
|
3 years ago |
Megvii Engine Team
|
c2435d1561
|
perf(imperative): specialize adaptive pooling
GitOrigin-RevId: 01e1418458
|
3 years ago |
Megvii Engine Team
|
39d98d4525
|
feat(fallback): add fallback typecvt with general intrinsic
GitOrigin-RevId: 1e6fcd929b
|
3 years ago |
Megvii Engine Team
|
d2278f02d2
|
perf(imperative): speed up conv_transpose3d
GitOrigin-RevId: e741305446
|
3 years ago |
Megvii Engine Team
|
3a5347ed21
|
perf(imperative): speed up pooling
GitOrigin-RevId: 9f60b45eeb
|
3 years ago |
Megvii Engine Team
|
c0b267fff6
|
refactor(cuda-stub): opt cuda-stub log
GitOrigin-RevId: 87dda08e1b
|
3 years ago |
Megvii Engine Team
|
d9c4ef59fe
|
perf(imperative): using simple hash key in heuristic cache
GitOrigin-RevId: 6fddd612e7
|
3 years ago |
Megvii Engine Team
|
26ea33c6a7
|
perf(imperative): improve convbwd performance
GitOrigin-RevId: cfc8623d7a
|
3 years ago |
Megvii Engine Team
|
3949d425fb
|
feat(core): always show MegEngine version and git commit id
GitOrigin-RevId: 4daa5be6d6
|
3 years ago |
Megvii Engine Team
|
b6ad457269
|
feat(cuda): support int1 simplewq conv
GitOrigin-RevId: 9c37c41bc7
|
3 years ago |
Megvii Engine Team
|
331567af5d
|
fix(opencl/ci): misc opt and fix:
1: fix megbrain test failure on mali 2.1 devices
2: reduce ci time (by about 20 min)
GitOrigin-RevId: 4dcdcd48a6
|
3 years ago |
Megvii Engine Team
|
ff6a3bb819
|
fix(fallback): delete the repeated opcaller in fallback and arm_common
GitOrigin-RevId: 87046b8197
|
3 years ago |
Megvii Engine Team
|
547945e854
|
feat(fallback): support general intrinsic in elemwise in fallback
GitOrigin-RevId: 96ff2e88cc
|
3 years ago |
Megvii Engine Team
|
a017bed3aa
|
fix(fallback): rename general intrinsic type and add more intrinsics
GitOrigin-RevId: 37409bae9a
|
3 years ago |
Megvii Engine Team
|
fd6f8e58b0
|
feat(mgb/dtype): add dtype qint1
GitOrigin-RevId: abe9fb68b1
|
3 years ago |
Megvii Engine Team
|
2a900a69cb
|
perf(imperative): improve reduce op performance
GitOrigin-RevId: 26d982a7b8
|
3 years ago |
Megvii Engine Team
|
72a70dd6a7
|
perf(imperative): specialize convolution implementation
GitOrigin-RevId: 33634c550f
|
3 years ago |
Megvii Engine Team
|
2b80806f21
|
perf(imperative/src): improve dot performance
GitOrigin-RevId: 35b5bd164f
|
3 years ago |
Megvii Engine Team
|
3c3fc6f33c
|
refactor(imperative): move python code of elemwise/reduce/conv2d/bn to c++
GitOrigin-RevId: 01b5324392
|
3 years ago |
Megvii Engine Team
|
e400b7ffe5
|
perf(imperative): enable memory forwarding for imperative
GitOrigin-RevId: 7c1993979c
|
3 years ago |
Megvii Engine Team
|
1ce78aa09b
|
fix(imperative): destruct dnn handles at last
GitOrigin-RevId: 7a67c68c55
|
3 years ago |
Megvii Engine Team
|
3228fb75a5
|
fix(cuda): fix conv algo heuristic choice
GitOrigin-RevId: 95c5e7d627
|
3 years ago |
Megvii Engine Team
|
8c415f4ed7
|
feat(dnn): cuda nhwc nearest resize supports channel counts other than 1 or 3
GitOrigin-RevId: 764504c341
|
3 years ago |
Megvii Engine Team
|
6fb5a34360
|
build(flatbuffer/cx2): fix cx2 build and fix uclibc build flatbuffer
GitOrigin-RevId: af851e155f
|
3 years ago |
Megvii Engine Team
|
87de704a46
|
feat(gopt): fuse conv h_swish
GitOrigin-RevId: a3d12991fb
|
3 years ago |
Megvii Engine Team
|
3726f5cc92
|
feat(gopt): merge consecutive relayout and dimshuffle into one relayout to optimize CD4 performance
GitOrigin-RevId: a058776be3
|
3 years ago |
Megvii Engine Team
|
ac26bdcef5
|
fix(cuda): fix direct conv speed and memory problem
GitOrigin-RevId: 6faeeff3b8
|
3 years ago |
Megvii Engine Team
|
f7994683bd
|
feat(cuda): add large kernel direct conv to heuristic algo chooser
GitOrigin-RevId: bc927b6df7
|
3 years ago |
Megvii Engine Team
|
6dc0c0b9cc
|
fix(dnn): fix the sync problem in some kernels
GitOrigin-RevId: df3f7dc51b
|
3 years ago |
Megvii Engine Team
|
04193e3bd1
|
feat(dnn): add nearest mode for remap and resize
GitOrigin-RevId: 31e7b72a78
|
3 years ago |
Megvii Engine Team
|
93c7e45188
|
feat(arm): delete the redundant implementation
GitOrigin-RevId: ff32a3dc8b
|
3 years ago |
Megvii Engine Team
|
e34a642b31
|
feat(fallback): support general intrinsic in reduce
GitOrigin-RevId: f250aa7b2a
|
3 years ago |
Megvii Engine Team
|
10f23778a8
|
feat(fallback): add simd general intrinsic
GitOrigin-RevId: ad78ba689f
|
3 years ago |
Megvii Engine Team
|
286051ede1
|
feat(dnn): differentiate sass kernel with cuda version
GitOrigin-RevId: 40bb4423b8
|
3 years ago |
Megvii Engine Team
|
f78b60ec10
|
feat(bazel): make bazel gensass depend on cuda toolchain version automatically
GitOrigin-RevId: 9433f21a91
|
3 years ago |
Megvii Engine Team
|
f48227c07d
|
feat(mgb): show more details for cuda driver api call
GitOrigin-RevId: 40e63d9dac
|
3 years ago |
Megvii Engine Team
|
d8bb3ff5b4
|
fix(cuda): fix fp16 tensorcore gemm split k workspace
GitOrigin-RevId: d04a0e0985
|
3 years ago |
Megvii Engine Team
|
d7b0994a3e
|
feat(cuda): add fp16 compute 16 kernel
GitOrigin-RevId: e03435be02
|
3 years ago |
Megvii Engine Team
|
8a2e92bd6c
|
refactor(cuda): depthwise large kernel
GitOrigin-RevId: dade8710b4
|
3 years ago |
Megvii Engine Team
|
6b8a69d5b6
|
feat(cuda): float16 depthwise large kernel conv compute fp32
GitOrigin-RevId: 3050d48f26
|
3 years ago |
Megvii Engine Team
|
bc385b5374
|
feat(cuda): support float16 depthwise large kernel conv
GitOrigin-RevId: fdc1b15fbc
|
3 years ago |