Challenge Overview
Neural network (NN) models have been applied successfully to many NLP tasks and have achieved strong results. However, NN models are black-box systems: their inner decision processes are opaque to users. This lack of transparency undermines trust in their outputs and hinders their large-scale deployment, especially in domains with high demands on reliability and safety, such as healthcare and law, where users often need to understand how an output was produced. Consequently, the interpretability and robustness of NN models have attracted extensive attention. To further advance research on interpretability, evaluation datasets with human-annotated rationales and corresponding evaluation metrics are required.
This challenge provides an evaluation dataset for the Chinese machine reading comprehension task with human-annotated rationales, together with corresponding evaluation metrics, to assess the interpretability of models and the accuracy of rationale extraction methods. The challenge also offers a platform for research and academic exchange on model interpretability, facilitating progress toward more trustworthy deep learning models and systems.
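The overview does not spell out the metrics themselves. As an illustration only, rationale extraction is commonly scored by token-level F1 overlap between the predicted rationale and the human-annotated one; the sketch below assumes that setup. The function name rationale_f1 and the sample tokens are hypothetical and are not the challenge's official metric.

```python
# A minimal sketch of token-level F1 between a predicted rationale and a
# human-annotated one -- a common way to score rationale extraction.
# Assumption: the challenge's official metrics are not specified in this
# overview; everything here is illustrative.
from collections import Counter


def rationale_f1(predicted: list[str], gold: list[str]) -> float:
    """Token-level F1 overlap between predicted and gold rationale tokens."""
    common = Counter(predicted) & Counter(gold)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


# Example: comparing an extracted rationale against a human annotation.
pred = ["患者", "出现", "发热", "症状"]
gold = ["患者", "发热", "症状"]
print(f"rationale F1 = {rationale_f1(pred, gold):.3f}")  # 0.857
```

Token-level overlap rewards extractors that recover exactly the evidence spans annotators marked, while penalizing both over-long rationales (low precision) and incomplete ones (low recall).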