
add Data Compliance SIG details

pull/201/head
Clement Li 3 years ago
parent
commit
5770cf2898
2 changed files with 46 additions and 1 deletion
  1. sigs/README.md (+2, -1)
  2. sigs/datacompliance/README.md (+44, -0)

sigs/README.md (+2, -1)

@@ -37,4 +37,5 @@ in the mailing list. SIG artifacts can be found in the current repository.
| [Parallel](parallel/README.md) | This SIG is responsible for the development of MindSpore's functionality of automatically finding the efficient parallel strategy for DNN training and inference. | [@dr-orange](https://gitee.com/dr-orange)(chengli7@ustc.edu.cn) |
| [AdaptiveTraining](adaptivetraining/README.md) | This SIG is to develop an adaptive distributed training system that can train the neural networks in elastic clusters without affecting the convergence. | [@luomai-edin](https://gitee.com/luomai-edin)(luo.mai@outlook.com) |
| [Serving](serving/README.md) | This SIG is responsible for the development of MindSpore Serving module. | [@xu-yfei](https://gitee.com/xu-yfei) |
| [DevelopereXperience](dx/README.md) | This SIG is responsible for improving the experience of those who upstream contribute or develop applications for MindSpore community. | [@jiancao81](https://gitee.com/jiancao81)(cao-jian@cs.sjtu.edu.cn) [@clement_li](https://gitee.com/clement_li) |
| [DevelopereXperience](dx/README.md) | This SIG is responsible for improving the experience of those who upstream contribute or develop applications for MindSpore community. | [@jiancao81](https://gitee.com/jiancao81)(cao-jian@cs.sjtu.edu.cn) |
| [DataCompliance](datacompliance/README.md) | This SIG aims to reduce license compliance risk and help developers use and share datasets legally. | [@gopikrishnanrajbahadur](https://gitee.com/gopikrishnanrajbahadur) [@clement_li](https://gitee.com/clement_li) |

sigs/datacompliance/README.md (+44, -0)

@@ -0,0 +1,44 @@
# MindSpore Data Compliance Special Interest Group (SIG)

The Data Compliance SIG aims to identify license compliance risks and help developers use and share datasets legally.

- List all the licenses of open datasets used in modelzoo

If we do not know which license a dataset carries, using it creates legal risk. We find out whether a dataset has a license by checking its original source. If there is no license, we do not recommend using the dataset. For datasets with a license, we must clearly identify the license and record it on the website.

- Categorize the dataset licenses into rights, obligations and limitations

From a legal standpoint, depending on the nature of the data, collating and unifying data in databases could arguably qualify (under certain legal systems) as copyright infringement or database right infringement (in jurisdictions such as the European Union where such a right exists). Without knowledge of copyright law, people can hardly tell which datasets can be used commercially.

In cooperation with lawyers, we categorize the license clauses: what we may do is a right, what we must do is an obligation, and what we may do only under restrictions is a limitation. All analysis results will be output as a risk matrix.
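
As a rough illustration of the categorization above, the sketch below stores license clauses under the three categories and flattens them into a simple matrix. The names `ClauseKind`, `LicenseProfile`, `risk_matrix`, and the license `Example-License-1.0` are hypothetical and are not part of any SIG artifact. The sample clauses are placeholders, not a legal reading of a real license.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClauseKind(Enum):
    RIGHT = "right"            # what we may do
    OBLIGATION = "obligation"  # what we must do
    LIMITATION = "limitation"  # what we may do only under restrictions


@dataclass
class LicenseProfile:
    """Hypothetical record of one license's categorized clauses."""
    name: str
    clauses: dict = field(default_factory=lambda: {k: [] for k in ClauseKind})

    def add(self, kind: ClauseKind, text: str) -> None:
        self.clauses[kind].append(text)


def risk_matrix(profiles):
    """Return {license name: {category: [clauses]}} as plain dicts."""
    return {
        p.name: {kind.value: list(p.clauses[kind]) for kind in ClauseKind}
        for p in profiles
    }


# Illustrative entries only; not an analysis of any real license.
example = LicenseProfile("Example-License-1.0")
example.add(ClauseKind.RIGHT, "redistribute")
example.add(ClauseKind.OBLIGATION, "give attribution")
example.add(ClauseKind.LIMITATION, "non-commercial use only")

matrix = risk_matrix([example])
```

Keeping the three categories as an enum means every license row has the same columns, which is one way such a matrix could stay comparable across licenses.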

- Build a process to review the risk of license compliance

Once we have a team and rules, we need a process that helps our development team use datasets more easily. Before we release a dataset to our community, we should go through the following steps:

Does the dataset have a license or terms of use?

Which license or terms of use does the dataset have?

Is it non-commercial or research-use-only?

Give feedback to the data development team.
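
The review steps above could be sketched as a small function that walks the questions in order and returns feedback text. This is only a minimal sketch: `DatasetRecord`, `review`, and the field names are hypothetical, and a real review would involve legal judgment that code cannot replace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical description of a dataset under review."""
    name: str
    license_name: Optional[str] = None  # None: no license or terms of use found
    non_commercial: bool = False
    research_only: bool = False


def review(dataset: DatasetRecord) -> str:
    """Answer the review questions in order and return feedback text."""
    # First two questions: is there a license or terms of use, and which one?
    if dataset.license_name is None:
        return f"{dataset.name}: no license or terms of use found; use is not recommended."
    notes = [f"license: {dataset.license_name}"]
    # Third question: is use restricted to non-commercial or research purposes?
    if dataset.non_commercial:
        notes.append("non-commercial use only")
    if dataset.research_only:
        notes.append("research use only")
    # Last step: the returned string is the feedback for the data development team.
    return f"{dataset.name}: " + "; ".join(notes)
```

For example, `review(DatasetRecord("demo"))` reports that no license was found, while a record with `license_name` set lists the license and any restriction flags.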

- Form a standard license schema to resolve conceptual ambiguities

As we gradually accumulate experience in data compliance, we will try to form a standard license language to help the entire industry reduce ambiguity. When the time is right, we will make it a standard.

## SIG Leads

- Gopi Krishnan Rajbahadur (Queen's University, Canada)
- Li Zi (Huawei)

## Logistics

- SIG leads will drive the meetings.
- Meeting announcements will be posted on our Gitee channel: https://gitee.com/mindspore/community/tree/master/sigs/datacompliance
- Feedback and topic requests are welcome from everyone.

## Discussion

- Documents and artifacts: https://gitee.com/mindspore/community/tree/master/sigs/datacompliance
