sparseKT: Towards Robust Knowledge Tracing Models via k-Sparse Attention
We have added sparseKT to our pyKT package.
Title: sparseKT: Towards Robust Knowledge Tracing Models via k-Sparse Attention
Knowledge tracing (KT) is the problem of predicting students’ future performance based on their historical interaction sequences. Because it can capture long-term contextual dependencies, the attention mechanism has become an essential component of many deep learning based KT (DLKT) models. Despite the impressive performance of these attentional DLKT models, many of them are prone to overfitting, especially on small-scale educational datasets. In this paper, we therefore propose sparseKT, a simple yet effective framework for improving the robustness and generalization of attention-based DLKT approaches. To learn sparse attention from KT data, we incorporate a k-selection module into the standard attention function so that only the items with the highest attention scores are selected. We propose two sparsification heuristics: (1) soft-thresholding sparse attention and (2) top-K sparse attention. We show that sparseKT helps attentional KT models discard irrelevant or useless student interactions and improves predictive performance compared with 12 state-of-the-art KT models on three publicly available real-world educational datasets. To encourage reproducible research, we make our data and code publicly available at https://tinyurl.com/366fwdjn.
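To make the two sparsification heuristics more concrete, here is a minimal sketch in plain Python for a single query position. It is not the sparseKT implementation from pyKT: the function names, the `threshold` parameter, and the renormalization steps are assumptions chosen for illustration. Top-K is read here as "keep the k largest raw scores and re-apply softmax", and soft-thresholding as "keep the highest softmax weights until their cumulative mass reaches a threshold, then rescale".

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_sparse_attention(scores, k):
    """Keep the k largest raw scores, mask the rest, then renormalize.

    Assumed reading of "top-K sparse attention"; names are hypothetical.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    # Indices of the k highest-scoring past interactions.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    # Masked positions get -inf, so softmax assigns them zero weight.
    masked = [s if i in keep else -math.inf for i, s in enumerate(scores)]
    return softmax(masked)


def soft_threshold_sparse_attention(scores, threshold):
    """Keep the top weights whose cumulative mass first reaches `threshold`.

    Assumed reading of "soft-thresholding sparse attention"; the rescaling
    by the kept mass is an illustrative choice.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    weights = softmax(scores)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    keep, mass = set(), 0.0
    for i in order:
        keep.add(i)
        mass += weights[i]
        if mass >= threshold:
            break
    # Zero out the discarded positions and rescale the rest to sum to 1.
    return [w / mass if i in keep else 0.0 for i, w in enumerate(weights)]


# Hypothetical attention scores of one query over four past interactions.
scores = [2.0, 0.5, 1.0, -1.0]
print(top_k_sparse_attention(scores, k=2))
print(soft_threshold_sparse_attention(scores, threshold=0.8))
```

With these example scores, both calls keep only the interactions at positions 0 and 2 and give zero weight to the others. In a real attention layer the same selection would be applied row by row to the causally masked score matrix, typically with tensor operations rather than Python lists.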