Abstract
In this study, we estimate the coefficient σ used in the activation probability calculation for the topic diffusion prediction problem. In our previous studies, we proposed an aggregated activation probability that combines meta-path and textual information, in which σ is the characteristic coefficient of interest similarity based on textual content. σ controls how strongly the meta-path-based activation probability and the interest similarity each influence the aggregated activation probability. In a previous study, we assumed that meta-path and textual information are equally important, setting σ = 0.5. However, the appropriate value of this coefficient differs across datasets, depending on the meaning of the meta-path and of the textual information. Here, we continue to investigate how σ affects the effectiveness of topic diffusion prediction on a bibliographic network. We propose to use two of the most common feature selection methods, the ANOVA test and mutual information, to measure the significance of the two features: MP (meta-path) and IS (interest similarity based on textual information). The experimental results show that estimating σ with these feature selection methods is reliable and improves the predictive performance of topic diffusion compared with the default assignment of σ = 0.5.
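The idea of deriving σ from feature significance can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not state: that the aggregated probability is a convex combination (1 − σ)·P_MP + σ·P_IS, that σ is taken as the IS share of the two ANOVA F-statistics, and that the helper names and toy data below are purely hypothetical. It is not the authors' implementation, and it shows only the ANOVA variant.

```python
from statistics import mean

def anova_f(values, labels):
    """One-way ANOVA F-statistic of one feature across the label groups.

    Assumes at least two groups and non-zero within-group variance.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def estimate_sigma(mp_scores, is_scores, labels):
    """Assumed rule: sigma is the IS share of the total F-statistic."""
    f_mp = anova_f(mp_scores, labels)
    f_is = anova_f(is_scores, labels)
    return f_is / (f_mp + f_is)

def aggregated_probability(p_mp, p_is, sigma):
    """Assumed form: convex combination of the two activation probabilities."""
    return (1 - sigma) * p_mp + sigma * p_is

# Hypothetical toy data: 1 = the node became active on the topic, 0 = it did not.
labels = [0, 0, 0, 1, 1, 1]
mp_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.8]   # meta-path feature (MP)
is_scores = [0.2, 0.5, 0.4, 0.5, 0.6, 0.3]   # interest similarity feature (IS)

sigma = estimate_sigma(mp_scores, is_scores, labels)
print(f"estimated sigma = {sigma:.3f}")
```

In this toy data the MP feature separates the two classes far better than IS, so the estimated σ falls well below 0.5 and the aggregated probability leans on the meta-path term. A mutual information score could be substituted for the F-statistic in the same normalization.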
This work is licensed under a Creative Commons Attribution 4.0 International License.