1628 Commits (4eda338876a0179580235bceded36957383c7a84)

Author SHA1 Message Date
  Megvii Engine Team 4eda338876 feat(dnn/cuda): generate cutlass kimpls using cmake and bazel 4 years ago
  Megvii Engine Team 8d248a6a9a fix(dnn/cuda): fix testcase for fallback nchw qs8 conv 4 years ago
  Megvii Engine Team 894a2407c2 feat(dnn/cuda): add relayout format kernel for nchw <-> nhwc 4 years ago
  Megvii Engine Team 43c59204df refactor(dnn/cuda): refactor relayout format kernels 4 years ago
  Megvii Engine Team f41a808694 feat(dnn/cuda): add nhwc int4 conv support 4 years ago
  Megvii Engine Team 5a14a89224 refactor(dnn/cuda): refactor cutlass kernel generator for gemm and gemv 4 years ago
  Megvii Engine Team b33217d8f0 refactor(dnn/cuda): refactor cutlass kernel generator for deconv operation 4 years ago
  Megvii Engine Team 4abf7bd36f refactor(dnn/cuda): refactor kernel generator for cutlass convolution kernels 4 years ago
  Megvii Engine Team b4687ce8da feat(dnn/cuda): add convolution with i8 input and u4 output 4 years ago
  Megvii Engine Team 00083d13b6 fix(dnn/cuda): fix recursive algo search for fallback_nchw_qs8 4 years ago
  Megvii Engine Team bba04f02e5 feat(mgb/gopt): add fusion support for conv, astype(s4) and reformat 4 years ago
  Megvii Engine Team 66f70578c2 feat(dnn/cuda): add convolution with i8 input and i4 output 4 years ago
  Megvii Engine Team 6d686ff26f feat(gopt/inference): allow Float32 output dtype in EnableNCHW64Pass 4 years ago
  Megvii Engine Team 7d3df995cb feat(gopt/inference): allow Float32 output dtype in EnableNCHW4Pass 4 years ago
  Megvii Engine Team 633016a962 fix(dnn/cuda): fix AlgoFallbackNCHWQS8 to support Float32 dst 4 years ago
  Megvii Engine Team e6caa9ff89 feat(opr): add bn backward for inference mode 4 years ago
  Xinda Huang c90fa087ea test(mge): delete test_external.py 3 years ago
  Megvii Engine Team b2944559a8 fix(imperative/module): remove ``__getattribute__`` method in module 4 years ago
  Megvii Engine Team 77ead9377b fix(src/serialization): fix compatibility error of oss model 3 years ago
  Megvii Engine Team 070c811732 fix(imperative): remove convert_inputs 3 years ago
  Megvii Engine Team f40df60242 docs(mge): refactor docs to remove warnings 3 years ago
  Megvii Engine Team 1040b77843 fix(mge/functional): fix F.topk(kth_only=True) 4 years ago
  Megvii Engine Team 551cc701c6 docs(distributed.functional): add return type for all_reduce_max (jira #MGE-2706) 3 years ago
  Megvii Engine Team 72ff7aeccb feat(docs): add docs for megengine.functional.ones_like(jira #MGE-2702) 3 years ago
  Megvii Engine Team 7c9569e4e5 fix(mge/random): fix random seed 3 years ago
  Megvii Engine Team 07de15713c fix(mgb): remove static mem record from tee 4 years ago
  Megvii Engine Team d7b6bfd56c test(mge/fakequant): use fixed input for lsq test to temperarily avoid precision error 3 years ago
  Megvii Engine Team 5cef74a77e feat(mge/amp): add GradScaler support 4 years ago
  Megvii Engine Team 1bf18252c4 feat(mge/amp): add mix precision autocast support 4 years ago
  Megvii Engine Team f12355f727 fix(imperative/grad): fix hardcode dtype in subtensor_grad_rule 4 years ago
  Megvii Engine Team 4e4497b903 refactor(mgb/dnn): x86 pooling rebase algochooser 3 years ago
  Megvii Engine Team a33c3b73bd refactor(mgb/dnn): arm pooling rebase algochooser 3 years ago
  Megvii Engine Team 8dea6b3c68 build(dnn): compat for more windows env 3 years ago
  Megvii Engine Team 56b94d89a2 feat(dtr): add sqrt sampling 3 years ago
  Megvii Engine Team 8a73193c2d feat(dtr): remove eviction threshold 3 years ago
  Megvii Engine Team 69d1fd0f32 refactor(opdef): split apply_on_physical_tensor into infer_output_mem_desc and execute 3 years ago
  Megvii Engine Team 75eb04c559 feat(mge/experimental): add WeightScaler support 4 years ago
  Megvii Engine Team dedecf6922 fix(imperative/utils): fix logical error of replace var 4 years ago
  Megvii Engine Team ea70d99b4d fix(mge/convbias): make fallback convbias support nhwcd4 layout 4 years ago
  Megvii Engine Team 497ef6c337 fix(mge/dist): fix gl oom error 3 years ago
  Megvii Engine Team 43098fb8f1 feat(mge): add SlidingWindowTranspose opr 4 years ago
  Megvii Engine Team df79334cae feat(mge/distributed): add user_pop function to save device memory 3 years ago
  Megvii Engine Team 1eaf32cd78 fix(mgb): fix typo in message 4 years ago
  Megvii Engine Team 7225b0f09f fix(mge/utils): use static infer manager to get value of network.varnode 4 years ago
  Megvii Engine Team ffe2bb2eb2 fix(mge): fix some errors caused by unknown shape when using symbolic trace or building graph 4 years ago
  Megvii Engine Team 2d42455fa8 fix(mge/utils): fix toposort to get definition order 4 years ago
  Megvii Engine Team 0c97b2a3ec fix(module): remove assert during forward 3 years ago
  Megvii Engine Team 427113088b fix(module/normalization): fix bug of LayerNorm and support input of any shape 4 years ago
  Megvii Engine Team a95f6d4f75 perf(trace): add fastpath for const value assert 3 years ago
  Megvii Engine Team 2cd9823210 fix(mgb/tensorrt): fix trt runtime, padding channel to a multiple of 4 when using kCHW4 IOFormat 3 years ago