Fund Project:Project supported by the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 61803264)
Received Date:05 May 2019
Accepted Date:17 July 2019
Available Online:01 October 2019
Published Online:05 October 2019
Abstract:The K-shell has important theoretical significance and application value in measuring the importance of nodes in complex networks. However, in the K-shell method, most of nodes possess an identical K-shell value so that the relative importance of the identical K-shell nodes cannot be further compared with each other. Therefore, based on the K-shell value of nodes in the complex network and the K-shell values of multi-order neighbors in complex networks, in this paper we use the vectors to represent the relative importance of node in each of complex networks, which is named multi-order K-shell vector. Multi-order K-shell vector centrality defines a vector indicating the number of multi-order neighbors with different K-shells and groups them into elements of the vector. Each vector infers to not only the original K-shell of the given node but also the number of its multi-order neighbors and their K-shell values, which indicates the propagation capability of the given node. An approach to comparing two multi-order K-shell vectors is also presented, which is used to sort the vectors to evaluate the node importance. The method is explored by comparing several existing centrality methods. Through the experiments of SI propagation and static attack experiments in seven real-world networks, it is found that multi-order K-shell vector centrality provides low computational complexity, effectively evaluates nodes with high propagation capability, which confirms the improved performance in susceptible infected model propagation experiments. On the other hand, the static attack experiments show that the multi-order K-shell vector tends to preferentially select the core structure with powerful propagation capability in the network. The multi-order K-shell vector greatly improves the difference rate of node centrality under the premise of preserving the K-shell structure information, as well as balancing the importance measure of nodes in the complex network and the structure evaluation of propagation capability. The multi-order K-shell vector is not appropriate for all types of networks when considering the result of network attacking. For the networks with low clustering coefficients and high average path lengths, multi-order K-shell vector method is dominant and the effect is relatively obvious. By contrast, multi-order K-shell vector surpasses most of centrality approaches when spreading information is our priority. In a few networks, eigenvector centrality presents a slightly better performance with a larger computational complexity. The proposed centrality measure is therefore of great theoretical and practical importance. Keywords:centrality/ multi-order K-shell vector/ susceptible infected model/ static attack
全文HTML
--> --> --> -->
2.1.基于多阶邻居壳数的向量中心性方法
本文的算法是基于K-壳分解法进行的, K-壳分解法是一个递归的过程, 具体的分解过程如下: 第一步, 找到网络G中所有度值为1的节点, 将度值为1的节点及其连边一并删除, 得到一个子网络G1, 再选取子网络G1中所有度值为1的节点, 将该类节点及其连边删除, 直至子网络G2中不包含度值为1的子节点. 将这一步骤中删除节点的壳值ks定义为1; 第二步, 删除子网络G2中度值为2的节点及其连边, 重复第一步的递归过程, 将这一步骤中删除节点的壳值ks定义为2. 重复上述过程, 直到所有的点都分解到某个壳中[12]. K-壳分解的示意图如图1所示. 图 1 K-shell分解法的示意图 Figure1. A diagram of the K-shell.
图 7 七个网络静态攻击重要节点的子团数量的统计情况(越大越好) (a) 子团数量平均值情况; (b) 子团数量最值情况 Figure7. The number of components during static attack experiments on the seven networks (the larger, the better)
本文采用SI(susceptible-infected)模型[36]来仿真网络中重要节点的传播行为, SI模型将传染范围内的人群划分为两类: 1) S类: 易感人群, 这类人是指还未染病但是与感染病患者接触后以一定的概率α受到感染; 2) I类: 染病人群, 这类人是指已经患病的人群. SI传染过程如图8所示. 图 8 SI传播模型图 Figure8. SI propagation model diagram.
计算每种中心性的度量值, 将得出的度量值按数值降序排列, 选取网络中每种中心性度量值最大的节点设置为感染节点I, 该节点以感染比率α去感染易感节点, 将该传染过程迭代M次, 统计感染节点的数量. 并用所求得的感染节点数量来度量所选初始节点在网络中的传播能力. 以美国电力网power为例, 选取复杂网络中按评估节点重要方法度量值最高的节点设置为感染节点I, 感染节点以α = 0.001的概率去感染易感节点S, 将这个过程迭代1000次, 统计出迭代过程结束后的感染节点数量, 选取部分迭代次数的感染节点的数量绘制统计图. 美国电力网power的不同迭代次数感染节点统计图和大肠杆菌代谢网络C.Elegans不同迭代次数感染节点统计图如图9和图10所示. 图 9 美国电力网power在SI传播模型下感染节点的变化情况 Figure9. Change of infected nodes in Power of American Electric Power network under SI propagation model.
图 10 大肠杆菌代谢网络C. Elegans在SI传播模型下感染节点的变化情况 Figure10. Changes of infection nodes in C. Elegans network under SI propagation model.
在SI模型的实验中, 传播感染的节点数量越多, 证明该节点在网络中的重要程度越高. 如图9中结果显示的那样, 多阶邻居壳数的向量中心性MKV在美国电力power网络中传播的效果是第二优的, 通过分析得到EC算法的时间复杂度为O(N2), 而MKV方法是时间复杂度为O(NlogN), 在相同的条件下, MKV的传播效果虽然比EC的传播效果差, 但是MKV方法要比EC的计算代价低. 而在图10大肠杆菌代谢网络C.Elegans的SI传播实验中, 迭代壳数的向量中心性MKV方法要优于其他四种方法. 因为本文的目的是验证哪个中心性参数在度量节点重要程度上有最好的结果, 所以本文试图在这几种中心性传播过程中能感染最多的节点. 为了量化这一目标, 本文从上述七个网络SI传播实验中选取不同迭代次数的感染节点数对其统计. ${r_{{\rm{si}}}} = {p}/{q}$, q表示K-shell方法感染节点数量, p表示其余中心性方法感染节点数量, rsi表示其他方法感染的节点数量要比K-shell方法感染节点的数量高多少, 分别求出这些时间段的平均值和最大值. 统计结果如图11所示. 图 11 七个网络传播感染节点的统计情况(越大越好) (a) SI传播节点数量比较平均值; (b) SI传播节点数量比较最值 Figure11. The statistical result of infected nodes of seven network (the larger, the better)