


Discovering urban functional regions using latent semantic information: Spatiotemporal data mining of floating cars GPS data of Guangzhou

CHEN Shili1,2,3, TAO Haiyan1,2, LI Xuliang1,2, ZHUO Li1,2
1. Center of Integrated Geographic Information Analysis, School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China
2. Guangdong Provincial Key Laboratory of Urbanization and Geo-simulation, Guangzhou 510275, China
3. Urbanization Institute of Sun Yat-sen University, Guangzhou 510275, China
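Corresponding author: Tao Haiyan (1966-), female, born in Yangzhou, Jiangsu Province, associate professor, specializing in multi-agent geographic simulation and spatial data mining. E-mail: taohy@mail.sysu.edu.cn
Foundation: National High Technology Research and Development Program of China (863 Program), No. 2013AA122302; Natural Science Foundation of Guangdong Province, No. S2013010012554; National Natural Science Foundation of China, No. 41371499, No. 41271138
Author: Chen Shili (1990-), female, born in Dazhou, Sichuan Province, PhD, specializing in spatial data mining and urban spatial structure. E-mail: SLChen@126.com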



China has been urbanizing at an unprecedented rate, and the internal spatial structure of its cities has changed significantly as a result. Labeling the different functional regions (DFRs) inside a city is important for urban structure analysis, policy making, and resource allocation. DFRs include residential, industrial, educational, and administrative districts. This paper explores the characteristics and distribution of urban functional regions using big geographic data. Using the latest road network data, the study area (six districts of Guangzhou, Guangdong Province, China) was partitioned into 439 segments. We applied spatiotemporal semantic mining to one week of massive floating car GPS data and to point of interest data, using a Latent Dirichlet Allocation (LDA) model and a Dirichlet Multinomial Regression (DMR) model. The OPTICS clustering method was then applied to the LDA and DMR outputs to identify different functional zones. The current urban planning map of Guangzhou and residents' travel characteristics were used to verify the results. The results show that the method can identify urban functional areas with distinct characteristics, such as mature residential areas, science, education and culture areas, commercial areas, and development zones. They also show that residential and commercial areas are the dominant DFRs in Guangzhou and are surrounded by other types of functional regions. This paper offers a new perspective on using large-scale, high-quality individual space-time data to study human mobility and daily activities, and on exploring social space to reveal the formation and mechanisms of urban functional zones.

Keywords: latent Dirichlet allocation; functional regions; big geographic data; GPS data; point of interest; Guangzhou

CHEN Shili, TAO Haiyan, LI Xuliang, ZHUO Li. Discovering urban functional regions using latent semantic information: Spatiotemporal data mining of floating cars GPS data of Guangzhou [J]. Acta Geographica Sinica, 2016, 71(3): 471-483. https://doi.org/10.11821/dlxb201603010

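1 Introduction

Urbanization is an important global socioeconomic phenomenon, and the functional diversification that accompanies it is the foundation of urban development. The division of urban functions that arises from this diversification [1] supports residents' living, working, recreation, and transport. Traditional studies of urban functional zoning mostly rely on data such as current land-use maps and questionnaires, and delineate functional regions through clustering methods or index systems [2-12]. Acquiring such data is time-consuming, labor-intensive, and inefficient, and the determination of weight coefficients is highly subjective. With the development of information and communication technology, the many sensors deployed in cities, such as the Global Positioning System (GPS), the Global System for Mobile Communications (GSM), and smart card fare systems (Smart Card Data, SCD), can capture large-scale, high-quality individual spatiotemporal data. Such mobile positioning data are inexpensive to acquire, cover wide areas, and carry temporal information. Combining big data with data mining to infer urban functional zoning from people's daily behavior therefore offers a new approach for urban research.
In recent years, many researchers abroad have used mobile phone data, bus data [13-15], GPS data [16], and LBS (Location Based Service) data [17-21] to study land-use classification, urban functional zoning, and activity trajectories. Meanwhile, with the spread of smart city construction in China and the deepening of big-data research, researchers [16, 22-30] have used GPS data and bus data, together with Google software, data mining tools, and a variety of models, to study urban spatial structure.
Research on urban functional zoning helps cities plan their future development rationally and healthily. In recent years, researchers have used big data to study the patterns and characteristics of people's travel [21-25, 31], but relatively few studies have mined the massive latent semantic information in big data to support functional zoning within urban spatial structure. This paper therefore adopts topic models (the LDA and DMR models), which are used in text classification to quickly mine latent semantics from massive text collections. Taking Guangzhou as an example, we extract the latent semantic information of floating car trajectory data and point of interest data, cluster the outputs of the different models with the OPTICS clustering method, and then identify the resulting clusters. The results show that the method can effectively identify different types of functional regions in Guangzhou and can assist and guide urban planning, policy making, resource allocation, and other fields.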

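2 Research methods and approach

Latent Dirichlet Allocation (LDA) is a topic model proposed by Blei et al. in 2003 [32]. It is one of the current paradigms in text processing research and models the latent topics of text. It not only overcomes the shortcomings of traditional text-similarity measures in information retrieval, but is also better suited to finding semantic topics in large corpora (up to massive Internet data) [32-33]. The model estimates, for every document in a corpus, the probabilities of multiple topics, and can thus extract the semantic information of a sentence or a single word.
Fig. 1 Latent Dirichlet Allocation model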

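In Fig. 1, α and η are the input parameters of the Dirichlet priors on the document-topic distribution and the topic-word distribution, respectively. Assume there are K topics; β is a K×M matrix, where M is the number of words in the dictionary (all words in corpus D), and each row β_k is a distribution over the dictionary. The topic proportions of document d are θ_d, where θ_{d,k} is the proportion of topic k in document d. The topic assignments of document d are Z_d, where Z_{d,n} is the topic assignment of the n-th word in document d. The observed words of document d are W_d, where W_{d,n} is the n-th word in document d.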
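The generative process behind this notation can be illustrated with a minimal Python sketch. It uses only the standard library and assumes symmetric priors (a single scalar each for α and η). The function names and toy sizes are hypothetical and are not taken from the paper, which fits the model to data rather than sampling from it.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) sample by normalising Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [x / total for x in draws]


def sample_categorical(probs, rng):
    """Return an index drawn with the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_docs, doc_len, K, M, alpha, eta, seed=0):
    """Sample a toy corpus from the LDA generative process (illustrative only)."""
    rng = random.Random(seed)
    # beta is a K x M matrix: row beta[k] is topic k's distribution over the dictionary.
    beta = [sample_dirichlet([eta] * M, rng) for _ in range(K)]
    corpus = []
    for _ in range(num_docs):
        theta_d = sample_dirichlet([alpha] * K, rng)  # topic proportions of document d
        z_d, w_d = [], []
        for _ in range(doc_len):
            z = sample_categorical(theta_d, rng)   # topic assignment Z_{d,n}
            w = sample_categorical(beta[z], rng)   # observed word W_{d,n}
            z_d.append(z)
            w_d.append(w)
        corpus.append({"theta": theta_d, "z": z_d, "w": w_d})
    return beta, corpus


beta, corpus = generate_corpus(num_docs=3, doc_len=10, K=4, M=6, alpha=0.5, eta=0.5, seed=1)
```

In the setting of this paper, a "document" would correspond to a region and a "word" to an encoded mobility pattern; inference runs in the opposite direction, recovering θ and β from the observed words.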
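The Dirichlet Multinomial Regression (DMR) model takes prior data as input to its parameters, which brings the experimental results closer to real-world conditions and gives it an advantage over the basic LDA model [35]. In the DMR model, the prior α_i of each region i incorporates that region's vector of POI features, for example α_{i,k} = exp(X_i^T λ_k). Each distinct distribution of POI categories therefore yields its own distribution of α values, so the distribution of activity types is the combined result of POI features and mobility patterns. Finally, by applying the DMR model with mobility patterns and POI features as input, we obtain the activity distribution of each region and the mobility-pattern distribution of each activity.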
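As a small worked illustration of the relation α_{i,k} = exp(X_i^T λ_k), the sketch below computes the prior vector of one region from a POI feature vector. The feature values and the λ coefficients are made-up numbers chosen for illustration; in a fitted DMR model the λ_k would be learned from data.

```python
import math


def dmr_alpha(x_i, lambdas):
    """Return [alpha_{i,k} for each topic k], with alpha_{i,k} = exp(x_i . lambda_k)."""
    alphas = []
    for lam_k in lambdas:
        if len(lam_k) != len(x_i):
            raise ValueError("each lambda_k must have one coefficient per POI feature")
        alphas.append(math.exp(sum(x * l for x, l in zip(x_i, lam_k))))
    return alphas


# Hypothetical region with three POI features and two topics (activities).
x_i = [1.0, 0.0, 2.0]
lambdas = [
    [0.0, 0.0, 0.0],    # topic 0: all-zero coefficients give alpha = exp(0) = 1
    [0.5, -1.0, 0.25],  # topic 1: exponent is 0.5*1 + (-1)*0 + 0.25*2 = 1
]
alpha_i = dmr_alpha(x_i, lambdas)
```

Because the exponential is always positive, every α_{i,k} is a valid Dirichlet parameter whatever the sign of the coefficients.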
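For each trip record, O_i denotes the origin region, T_i the departure time from the origin region, D_j the destination region, and T_j the arrival time at the destination region. If a person leaves region S at time T and arrives at destination region X, the string "O_X_T" represents the mobility pattern of the origin region, where the character "O" stands for the origin region S; the string "S_D_T" represents the mobility pattern of the destination region, where the character "D" stands for the destination region X.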
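The string encoding described above can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' implementation: it assumes times are binned by hour, and it assumes the origin-side word uses the departure time while the destination-side word uses the arrival time. The function names are hypothetical.

```python
from collections import defaultdict


def encode_trip(origin, depart_hour, dest, arrive_hour):
    """Return the (origin-side, destination-side) mobility-pattern words of one trip."""
    origin_word = f"O_{dest}_{depart_hour}"   # "O" marks the region the word belongs to
    dest_word = f"{origin}_D_{arrive_hour}"   # "D" marks the region the word belongs to
    return origin_word, dest_word


def build_region_documents(trips):
    """Group mobility-pattern words by region, so each region becomes one 'document'."""
    docs = defaultdict(list)
    for origin, depart_hour, dest, arrive_hour in trips:
        origin_word, dest_word = encode_trip(origin, depart_hour, dest, arrive_hour)
        docs[origin].append(origin_word)
        docs[dest].append(dest_word)
    return dict(docs)


# Two toy trips between hypothetical regions "A" and "B".
docs = build_region_documents([("A", 8, "B", 9), ("B", 18, "A", 19)])
```

Each region's word list can then be fed to a topic model in the same way a bag-of-words document would be.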

3 Research data and study area

Tab. 1 Data specification

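Points of interest (POI) are important basic data for applications such as navigation, intelligent transportation, and location-based services. The POI data used in this paper are divided into 15 major categories and 65 medium categories. Each POI has an identifier composed of major category + medium category + serial number, and carries attributes such as name, longitude, and latitude. Based on the aims and needs of this study, and after repeated experiments, the POI data were merged into 29 categories. Note that this paper does not yet consider the possible effect on the results of POIs of different grades within the same category.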

4 Identification of urban functional regions in Guangzhou based on semantic information

4.1 Results of urban functional zoning

Fig. 2 The result of LDA model

Fig. 3 The result of DMR model

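4.2 Method for identifying urban functional regions

(1) Compute the FD (Frequency Density) of each type of functional region, that is, the density of each POI category within each type of functional region, and sort the densities to obtain the IR (Internal Ranking). This gives the distribution characteristics of POIs within each type of region, from which the functions the region can serve are inferred. The density p_j of the j-th POI category in each region is calculated as follows: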
Tab. 2 Overall POI density vector and ranking of functional zones (FD: Frequency Density, IR: Internal Ranking, F: Functional Zones)
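The FD and IR quantities reported in Tab. 2 can be illustrated with a short Python sketch. The density formula itself is not reproduced in this text, so the sketch assumes the common definition "number of POIs of a category divided by the area of the functional zone"; that definition, the tie-breaking rule, the function names, and the sample counts are all assumptions made for illustration.

```python
def frequency_density(poi_counts, area_km2):
    """Assumed FD: POIs of each category per square kilometre of the zone."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return {category: count / area_km2 for category, count in poi_counts.items()}


def internal_ranking(density):
    """IR: rank categories within one zone, 1 = densest (ties broken alphabetically)."""
    ordered = sorted(density, key=lambda category: (-density[category], category))
    return {category: rank for rank, category in enumerate(ordered, start=1)}


# Hypothetical zone of 10 square kilometres with three POI categories.
fd = frequency_density({"residence": 120, "restaurant": 300, "school": 30}, area_km2=10.0)
ir = internal_ranking(fd)
```

Comparing the top-ranked categories across zones is what allows a cluster to be labeled, for example, as residential or commercial.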

Fig. 4 Frequency of arrival/departure of functional regions on weekdays and weekends

Fig. 5 Weekday and weekend arrival/departure regions of F1 and F2

Fig. 6 Weekday and weekend arrival/departure regions of F3 and F4

Fig. 7 Weekday and weekend arrival/departure regions of F6 and F7


4.3 Identification results


5 Accuracy validation

Tab. 3 The comparisons between recognized results and realities


6 Conclusions and discussion

The authors have declared that no competing interests exist.

References

[1] Li Dehua. Principles of Urban Planning. Beijing: China Architecture & Building Press, 2001: 12. (in Chinese)
[2] Tian G, Wu J, Yang Z. Spatial pattern of urban functions in the Beijing metropolitan region. Habitat International, 2010, 34(2): 249-255. https://doi.org/10.1016/j.habitatint.2009.09.010
[3] Dou Zhi. Research on the spatial clustering algorithm for urban function zoning [D]. Chengdu: Sichuan Normal University, 2010. (in Chinese)
[4] Li Xinyun. Research on methods and application for urban spatial data mining [D]. Taian: Shandong University of Science and Technology, 2004. (in Chinese)
[5] Qu Guoqing, Jiang Yuchun. Cluster analysis and its application in land utilization classification. Resource Development & Market, 1999, 15(4): 4-7. (in Chinese)
[6] Shi Yufeng, Wang Yan. Study on urban function partition based on self-organizing neural network. Computer Engineering, 2006, 32(18): 206-207. (in Chinese)
[7] Wang Hui. Spatial impacts of new economies and the implications for city planning and decision-making. Geographical Research, 2007, 26(3): 577-589. (in Chinese)
[8] Wang Hui. Rise of new special development zones and polarization of socio-economic space in Xi'an. Acta Geographica Sinica, 2006, 61(10): 1011-1024. (in Chinese)
[9] Wang Hui, Tian Pingping, Liu Hong. Spatial structuring of the 'new economies' in Xi'an and its mechanisms. Geographical Research, 2006, 25(3): 539-550. (in Chinese)
[10] Wang Yan, Song Zhenbai, Wu Peilin. A study on spatial clustering of urban function partition. Areal Research and Development, 2009, 28(1): 27-31. (in Chinese)
[11] Wu Wenheng, Xu Zewei, Yang Xinjun. Quantitative research of spatial development differentiation in Xi'an from the perspective of urban functional zoning. Geographical Research, 2012, 31(12): 2173-2184. (in Chinese)
[12] Zhu Zhilin. Research on the urban function zoning divide based on fuzzy clustering. Dam and Safety, 2006(S): 28-31. https://doi.org/10.3969/j.issn.1671-1092.2006.z1.011 (in Chinese)
[13] Joh C H, Hwang C A. Time-geographic analysis of trip trajectories and land use characteristics in Seoul metropolitan area by using multidimensional sequence alignment and spatial analysis. Washington, DC: AAG Annual Meeting, 2010.
[14] Sun L, Lee D, Erath A, et al. Using smart card data to extract passenger's spatio-temporal density and train's trajectory of MRT system. Urban Computing, 2012: 142-148.
[15] Zhong C, Huang X, Arisona S M, et al. Inferring building functions from a probabilistic model using public transportation data. Computers, Environment and Urban Systems, 2014, 48(6): 124-137. https://doi.org/10.1016/j.compenvurbsys.2014.07.004
[16] Qi G, Li X, Li S, et al. Measuring social functions of city regions from large-scale taxi behaviors. Pervasive Computing and Communications Workshops, 2011: 384-388. https://doi.org/10.1109/PERCOMW.2011.5766912
[17] Cranshaw J, Schwartz R, Hong J I. The livehoods project: Utilizing social media to understand the dynamics of a city. ICWSM, 2012.
[18] Gaubatz P. Changing Beijing. Geographical Review, 1995: 79-96. https://doi.org/10.2307/215557
[19] Kling F, Pozdnoukhov A. When a city tells a story: Urban topic analysis. Advances in Geographic Information Systems, 2012: 482-485.
[20] Pozdnoukhov A, Kaiser C. Space-time dynamics of topics in streaming text. Location-based Social Networks, 2011: 1-8.
[21] Yin Z, Cao L, Han J, et al. Geographical topic discovery and comparison. World Wide Web, 2011: 247-256. https://doi.org/10.1145/1963405.1963443
[22] Liu Y, Wang F, Xiao Y, et al. Urban land uses and traffic 'source-sink areas': Evidence from GPS-enabled taxi data in Shanghai. Landscape and Urban Planning, 2012, 106(1): 73-87. https://doi.org/10.1016/j.landurbplan.2012.02.012
[23] Pulliam H R. Sources, sinks, and population regulation. The American Naturalist, 1988, 132(5): 652-661.
[24] Yuan J, Zheng Y, Xie X. Discovering regions of different functions in a city using human mobility and POIs. Knowledge Discovery and Data Mining, 2012: 186-194.
[25] Dong Xiaojing, Yu Zhiwei, Fu Weiwei. Data processing and analyzing system for bus IC card based on GIS. Geospatial Information, 2009, 7(5): 124-126. https://doi.org/10.3969/j.issn.1672-4623.2009.05.039 (in Chinese)
[26] Gao Lianxiong, Wu Jianping. An algorithm for mining passenger flow information from smart card data. Journal of Beijing University of Posts and Telecommunications, 2011, 34(3): 94-97. (in Chinese)
[27] Long Ying, Zhang Yu, Cui Chengyin. Identifying commuting pattern of Beijing using bus smart card data. Acta Geographica Sinica, 2012, 67(10): 1339-1352. (in Chinese)
[28]Ni Zhongyun, Lei Fanggui, Yang Wunian. Application of Google Earth in Chengdu functional zoning. Scientific and Technological Management of Land and Resources, 2007, 24(4): 121-124.
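After more than 20 years of reform and development, Chengdu has made great progress in infrastructure construction, industrial restructuring, and ecological and environmental protection. Alongside its political, economic, and cultural development, the city's land use has kept expanding and its functional zoning has grown more complex. Conflicts between people and land, and between functional and morphological zoning, have become increasingly prominent, and urban functional zoning has become one of the important factors constraining the city's development. Research on Chengdu's functional zoning is therefore increasingly important. Using the high-resolution remote sensing imagery available for online browsing in Google Earth, together with the software's spatial analysis and query functions, the paper examines the division and causes of Chengdu's functional zones and forecasts how urban functional zoning will develop.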
[倪忠云, 雷方贵, 杨武年. Google Earth在成都市功能分区研究中的应用. 国土资源科技管理, 2007, 24(4): 121-124.]
[29]Yu Xiang. Discovering zones of different functions using bus smart card data and points of interest: A case study of Beijing [D]. Hangzhou: Zhejiang University, 2014.
[于翔. 基于城市公交刷卡数据和兴趣点的城市功能区识别研究[D]. 杭州: 浙江大学, 2014.]
[30]Zhang Yu, Hu Xinhua. A detection method of expressway traffic congestion with probe car data. Journal of Transport Information & Safety, 2012, 30(6): 87-89.
https://doi.org/10.3963/j.issn1674-4861.2012.06.018
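Understanding the characteristics of residents' bus trips and the patterns of bus passenger flow is the basis for bus planning and operational decisions. To study the distribution of bus riding across different time periods, the paper uses IC card data from Beijing bus lines with distance-based fares, analyzes the riding distance of bus trips with data mining tools, and fits a curve to the riding-distance distribution. The results show that bus riding distance in Beijing follows a Weibull distribution: at a 95% confidence level, the sum of squared errors is below 0.01 and the goodness of fit is above 0.97.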
[章玉, 胡兴华. 基于IC卡数据的居民公交乘车距离研究. 交通信息与安全, 2012, 30(6): 87-89.]
[31]Liu Y, Liu X, Gao S. Social sensing: A new approach to understanding our socioeconomic environments. Annals of the Association of American Geographers, 2015, 105(3): 1-19.
https://doi.org/10.1080/00045608.2015.1018773
The emergence of big data brings new opportunities for us to understand our socioeconomic environments. We use the term social sensing for such individual-level big geospatial data and the associated analysis methods. The word sensing suggests two natures of the data. First, they can be viewed as the analogue and complement of remote sensing, as big data can capture well socioeconomic features while conventional remote sensing data do not have such privilege. Second, in social sensing data, each individual plays the role of a sensor. This article conceptually bridges social sensing with remote sensing and points out the major issues when applying social sensing data and associated analytics. We also suggest that social sensing data contain rich information about spatial interactions and place semantics, which go beyond the scope of traditional remote sensing data. In the coming big data era, GIScientists should investigate theories in using social sensing data, such as data representativeness and quality, and develop new tools to deal with social sensing data.
[32]Blei D M, Ng A Y, Jordan M I. Latent Dirichlet allocation. The Journal of Machine Learning Research, 2003, 3: 993-1022.
https://doi.org/10.1109/MLSP.2011.6064562
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
[33]Wei X, Croft W B. LDA-based document models for ad-hoc retrieval. Research and Development in Information Retrieval, 2006: 178-185.
https://doi.org/10.1145/1148170.1148204
Search algorithms incorporating some form of topic model have a long history in information retrieval. For example, cluster-based retrieval has been studied since the 60s and has recently produced good results in the language model framework. An approach to building topic models based on a formal generative model of documents, Latent Dirichlet Allocation (LDA), is heavily cited in the machine learning literature, but its feasibility and effectiveness in information retrieval is mostly unknown. In this paper, we study how to efficiently use LDA to improve ad-hoc retrieval. We propose an LDA-based document model within the language modeling framework, and evaluate it on several TREC collections. Gibbs sampling is employed to conduct approximate inference in LDA and the computational complexity is analyzed. We show that improvements over retrieval using cluster-based models can be obtained with reasonable efficiency.
[34]Li Wenbo, Sun Le, Zhang Dakun. Text classification based on labeled-LDA model. Chinese Journal of Computers, 2008, 31(4): 620-627.
https://doi.org/10.3321/j.issn:0254-4164.2008.04.008
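The Latent Dirichlet Allocation (LDA) model is an unsupervised learning model proposed in recent years that can extract latent topics from text. By incorporating text category information into the traditional LDA model, the paper proposes an LDA model with category labels (Labeled-LDA). With this model, the allocation of latent topics can be computed jointly across categories, which overcomes the traditional LDA model's drawback of forcibly allocating latent topics when used for classification. Experimental comparison with the traditional LDA model shows that the new Labeled-LDA text classification algorithm effectively improves classification performance: micro_F1 increases by about 5.7% on the Fudan University Chinese corpus and by about 3% on the comp subset of the English 20newsgroup corpus.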
[李文波, 孙乐, 张大鲲. 基于Labeled-LDA模型的文本分类新算法. 计算机学报, 2008, 31(4): 620-627.]
[35]Karlsson C. Clusters, functional regions and cluster policies. IBS and CESIS Electronic Working Paper Series, 2007, 84: 1-24.
This paper gives an overview of research on economic clusters and clustering and is motivated by the growing intellectual and political interest for the subject. Functional regions have the features that agglomeration of economic activities i.e. clusters, benefit from. Functional regions have low intra-regional transaction and transportation cost and has access to the local labour market. The features of spatial economic concentration were for a long time disregarded and it was first in the early 1990s that Krugman brought the subject into the stage light. The scientific interests of cluster and clustering phenomenon have after the “new” introduction rapidly increased in the last decade. Hence, the subject is being thought at various education levels. The importance of cluster and clustering has also been recognized at a national, regional and local level and cluster policies are becoming a major part of political thinking. These policies are however often based on a scarce analysis where no strict criterions are stated.
[36]Mimno D, McCallum A. Topic models conditioned on arbitrary features with Dirichlet-multinomial regression. arXiv preprint arXiv:1206.3278, 2012.
Although fully generative models have been successfully used to model the contents of text documents, they are often awkward to apply to combinations of text data and document metadata. In this paper we propose a Dirichlet-multinomial regression (DMR) topic model that includes a log-linear prior on document-topic distributions that is a function of observed features of the document, such as author, publication venue, references, and dates. We show that by selecting appropriate features, DMR topic models can meet or exceed the performance of several previously published topic models designed for specific data.
[37]