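Chao WANG¹,²,
Guoliang XU²,
Jiangtao LUO²,
Xuan ZHANG¹,²
1. Institute of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Electronic Information and Networking Research Institute, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funding: The Natural Science Foundation of Chongqing (cstc2018jcyjAX0587); New Sensing Technology, Information Fusion Processing and Its Application (A2017-10)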
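Author biographies: Wanlin LI: male, born in 1963, professor and doctoral supervisor; research interests include next-generation network technology, autonomous driving, the Internet of Vehicles, and mobile big data
Chao WANG: male, born in 1994, master's student; research interests include mobile big data and machine learning
Guoliang XU: male, born in 1973, professor and master's supervisor; research interests include optoelectronic sensing and detection, communication network design and planning, and big data analysis and mining
Jiangtao LUO: male, born in 1971, professor and doctoral supervisor; research interests include mobile big data, next-generation network technology, and communication network testing and optimization
Xuan ZHANG: male, born in 1991, master's student; research interests include mobile big data and machine learning
Corresponding author: Guoliang XU, xugl@cqupt.edu.cn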
CLC number: TN929.5
Publication history
Received: 2019-11-14
Revised: 2020-06-09
Published online: 2020-07-16
Issue date: 2020-12-08
Research of Track Resident Point Identification Algorithm Based on Signaling Data
Wanlin LI¹, Chao WANG¹,²,
Guoliang XU²,
Jiangtao LUO²,
Xuan ZHANG¹,²
1. Institute of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2. Electronic Information and Networking Research Institute, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funds: The Natural Science Foundation of Chongqing (cstc2018jcyjAX0587); The New Sensing Technology, Information Fusion Processing and its Application (A2017-10)
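Abstract
Abstract: Density-based clustering algorithms can identify only clusters of similar density and have high computational complexity. To address these problems, this paper proposes a clustering by fast search and find of density peaks algorithm based on the spatio-temporal trajectory information in signaling data (ST-CFSFDP). First, the signaling data, which have low sampling density, are pre-processed to eliminate trajectory oscillation. Then, building on the clustering by fast search and find of density peaks (CFSFDP) algorithm, a time-dimension constraint is added explicitly, extending the local density from two dimensions to three, and a high-density time interval is proposed to characterize cluster centers in the time dimension. Next, a screening strategy is designed to select the cluster centers. Finally, the resident points in a user's travel trajectory are identified and the travel chain is divided. Experimental results show that the proposed algorithm suits signaling data with low sampling density and poor positioning accuracy, and that it is better suited to spatio-temporal data than CFSFDP. Compared with the spatio-temporal density-based clustering algorithm (ST-DBSCAN), it improves the recall rate by 14% and the accuracy rate by 8% while reducing computational complexity.
Keywords: Signaling data/
Spatio-temporal clustering/
Clustering by fast search and find of density peaks/
Resident point identification/
Travel chain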
Abstract: Density-based clustering algorithms can identify only clusters of similar density and have high computational complexity. To address these problems, a Clustering by Fast Search and Find of Density Peaks algorithm based on Spatio-Temporal trajectory information in mobile phone signaling data, namely ST-CFSFDP, is proposed. First, the signaling data, which have low sampling density, are pre-processed to eliminate trajectory oscillation. Then, building on the Clustering by Fast Search and Find of Density Peaks (CFSFDP) algorithm, a time-dimension constraint is added explicitly, extending the local density from two dimensions to three, and the concept of a high-density time interval is defined to characterize cluster centers in the time dimension. Next, a screening strategy is designed to select appropriate cluster centers automatically. Finally, the resident points in an individual user's travel trajectory over a period of time are identified and the travel chains are divided. Experimental results show that the algorithm suits signaling data with low sampling density and poor positioning accuracy, and that it handles spatio-temporal data better than CFSFDP. Compared with the spatio-temporal density-based clustering algorithm ST-DBSCAN (Density-Based Spatial Clustering of Applications with Noise extended to Spatio-Temporal data), it improves the recall rate by 14% and the accuracy rate by 8% while reducing computational complexity.
Key words: Signaling data/
Spatio-temporal clustering/
Clustering by Fast Search and Find of Density Peaks (CFSFDP)/
Resident point identification/
Travel chain
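The abstract gives no formulas, so the following is only a minimal Python sketch of the general idea it describes: a CFSFDP-style density peak search in which a neighbour counts toward the local density only if it is close in both space and time. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the cutoff parameters `d_c` and `t_c`, the index-based tie-break, and the simple `rho_min`/`delta_min` thresholds that stand in for the paper's screening strategy. The high-density time interval is not modelled because the abstract does not define it.

```python
import math


def spatial_dist(p, q):
    """Planar distance between two (x, y, t) samples; time is ignored."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def st_local_density(points, d_c, t_c):
    """Count neighbours closer than d_c in space AND closer than t_c in time.

    Assumed reading of "extending the local density from 2D to 3D".
    """
    rho = []
    for i, p in enumerate(points):
        rho.append(sum(
            1 for j, q in enumerate(points)
            if j != i and spatial_dist(p, q) < d_c and abs(p[2] - q[2]) < t_c
        ))
    return rho


def nearest_denser(points, rho):
    """For each point, distance to (and index of) the nearest denser point.

    Ties in density are broken by index (assumption). The global peak has
    no denser point, so it gets its largest distance to any point.
    """
    n = len(points)
    delta, parent = [0.0] * n, [-1] * n
    for i in range(n):
        best, best_j = math.inf, -1
        for j in range(n):
            if rho[j] > rho[i] or (rho[j] == rho[i] and j < i):
                d = spatial_dist(points[i], points[j])
                if d < best:
                    best, best_j = d, j
        if best_j == -1:
            best = max(spatial_dist(points[i], q) for q in points)
        delta[i], parent[i] = best, best_j
    return delta, parent


def find_resident_points(points, d_c, t_c, rho_min, delta_min):
    """Return (center indices, per-point labels); label -1 means non-resident."""
    rho = st_local_density(points, d_c, t_c)
    delta, parent = nearest_denser(points, rho)
    # Hypothetical screening rule: both density and separation must be large.
    centers = [i for i in range(len(points))
               if rho[i] >= rho_min and delta[i] >= delta_min]
    labels = [-1] * len(points)
    for k, c in enumerate(centers):
        labels[c] = k
    # Visit points from dense to sparse so each parent is labelled first.
    for i in sorted(range(len(points)), key=lambda i: (-rho[i], i)):
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    # Sparse samples (e.g. points recorded while moving) are not resident.
    labels = [lab if rho[i] >= rho_min else -1 for i, lab in enumerate(labels)]
    return centers, labels


# Toy track: a stay near (0, 0), two samples in transit, a stay near (10, 10).
track = [
    (0.0, 0.0, 0), (0.1, 0.0, 1), (0.0, 0.1, 2), (0.1, 0.1, 3),
    (3.0, 3.0, 10), (6.0, 6.0, 20),
    (10.0, 10.0, 30), (10.1, 10.0, 31), (10.0, 10.1, 32),
    (10.1, 10.1, 33), (10.05, 10.05, 34),
]
centers, labels = find_resident_points(track, d_c=0.5, t_c=10,
                                       rho_min=2, delta_min=1.0)
print(centers, labels)
```

On this toy track the two stays come out as separate clusters and the two transit samples are labelled `-1`, so consecutive runs of the same non-negative label can be read as resident points that split the trajectory into trips. The units (for example kilometres and minutes) and all threshold values are arbitrary choices for the example.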
Full-text PDF download address:
https://jeit.ac.cn/article/exportPdf?id=6b76559f-196e-4c28-8242-e1bb4cd38133