Using Basemap and GeonamesCache to Plot K-Means Clusters

This is the third of four stories that aim to address the issue of identifying disease outbreaks by extracting news headlines from popular news sources.

This article aims to determine an easy way to view the clusters determined in the second article on a global and US-level scale. First, a list of large cities is gathered, and each city is placed with its corresponding latitude and longitude inside a dataset. Next, a function is made that plots the cluster points on a map with a different color for each cluster. Lastly, the function is called for the points in the United States, the centers of the clusters in the United States, the points globally, and the centers of the clusters globally.

A detailed explanation is shown below for how this is implemented:

Step 1: Compiling a List of the Largest Cities in the US

First, the city name, latitude, longitude, and population are extracted from ‘largest_us_cities.csv’, a file containing the US cities with a population over 30,000. Cities with a population over 200,000 were added to a dictionary; Anchorage and Honolulu were excluded because they skewed the positioning of the map. Next, using the haversine formula, which gives the great-circle distance between a pair of cities, cities close to one another were deduplicated: for each close pair, a population heuristic determines which city should be kept.

from math import radians, sin, cos, asin, sqrt
import numpy as np

file2 = open('largest_us_cities.csv', 'r')
large_cities = file2.readlines()

large_city_data = {}
for i in range(1, len(large_cities)):
    large_city_values = large_cities[i].strip().split(';')
    lat_long = large_city_values[-1].split(',')
    if ((int(large_city_values[-2]) >= 200000)
            and (large_city_values[0] != "Anchorage")
            and (large_city_values[0] != "Honolulu")
            and (large_city_values[0] != "Greensboro")):
        large_city_data[large_city_values[0]] = [lat_long[0], lat_long[1],
                                                 large_city_values[-2]]

def haversine(point_a, point_b):
    # great-circle distance in kilometers between two (lon, lat) points
    lon1, lat1 = point_a[0], point_a[1]
    lon2, lat2 = point_b[0], point_b[1]
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    r = 6371  # Earth's radius in km
    return c * r

# for each pair of cities less than 80 km apart, keep only the more populous one;
# note the (lon, lat) argument order haversine expects, and the int() casts so
# populations are compared numerically rather than as strings
for i in list(large_city_data.keys()):
    for j in list(large_city_data.keys()):
        if ((i != j) and haversine(
                (float(large_city_data[i][1]), float(large_city_data[i][0])),
                (float(large_city_data[j][1]), float(large_city_data[j][0]))) < 80.0):
            if (int(large_city_data[j][2]) > int(large_city_data[i][2])):
                large_city_data[i] = [np.nan, np.nan, large_city_data[i][2]]
            else:
                large_city_data[j] = [np.nan, np.nan, large_city_data[j][2]]
large_city_data['Chicago'] = [41.8781136, -87.6297982, 2718782]
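As a quick sanity check, the haversine helper above can be run on a pair of city coordinates (a minimal sketch; the New York and Los Angeles coordinates below are approximate and are not taken from the original dataset):

```python
from math import radians, sin, cos, asin, sqrt

def haversine(point_a, point_b):
    # point_a and point_b are (longitude, latitude) pairs in degrees
    lon1, lat1 = map(radians, point_a)
    lon2, lat2 = map(radians, point_b)
    a = sin((lat2 - lat1) / 2)**2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2)**2
    return 2 * asin(sqrt(a)) * 6371  # Earth's radius in km

# approximate (lon, lat) of New York and Los Angeles
new_york = (-74.0060, 40.7128)
los_angeles = (-118.2437, 34.0522)
print(round(haversine(new_york, los_angeles)))  # roughly 3940 km
```

The result agrees with the commonly cited great-circle distance between the two cities, which suggests the formula and the (lon, lat) argument order are wired up correctly.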

Step 2: Plotting K-Means Clusters and Cluster Centers Using Basemap

First, a function is created with seven parameters: df1, num_cluster, typeof, path, size, add_large_city, and figsize. Using the Basemap library, a geographic model of either the US or the world is generated depending on the typeof parameter, and the figsize parameter toggles between a larger and a smaller figure. A dictionary is created whose keys are the cluster labels, subdivided by latitude and longitude, and whose values hold the latitude and longitude of each headline belonging to that cluster.

A list of colors is initialized, and a specific color is assigned to each cluster label. The latitude and longitude points are plotted in these colors on the geographic model built above. If the add_large_city parameter is true, the largest cities are also added to the plot. The figure is saved to a “.png” file at the location given by the path parameter.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap

def print_k_means(df1, num_cluster, typeof, path, size, add_large_city, figsize):
    if (typeof == "US"):
        map_plotter = Basemap(projection='lcc', lon_0=-95, llcrnrlon=-119,
                              llcrnrlat=22, urcrnrlon=-64, urcrnrlat=49,
                              lat_1=33, lat_2=45)
    else:
        map_plotter = Basemap()
    if (figsize):
        fig = plt.figure(figsize=(24, 16))
    else:
        fig = plt.figure(figsize=(12, 8))
    # group the latitude/longitude of each headline under its cluster label
    cluster_vals = {}
    for i in range(num_cluster):
        cluster_vals[str(i) + "_long"] = []
        cluster_vals[str(i) + "_lat"] = []
    for index in df1.index:
        label = str(df1['cluster_label'][index])
        cluster_vals[label + '_long'].append(float(df1['longitude'][index]))
        cluster_vals[label + '_lat'].append(float(df1['latitude'][index]))
    num_list = [i for i in range(num_cluster)]
    color_list = ['rosybrown', 'lightcoral', 'indianred', 'brown',
        'maroon', 'red', 'darksalmon', 'sienna', 'chocolate', 'sandybrown', 'peru',
        'darkorange', 'burlywood', 'orange', 'tan', 'darkgoldenrod', 'goldenrod', 'gold', 'darkkhaki',
        'olive', 'olivedrab', 'yellowgreen', 'darkolivegreen', 'chartreuse',
        'darkseagreen', 'forestgreen', 'darkgreen', 'mediumseagreen', 'mediumaquamarine',
        'turquoise', 'lightseagreen', 'darkslategrey', 'darkcyan',
        'cadetblue', 'deepskyblue', 'lightskyblue', 'steelblue', 'lightslategrey',
        'midnightblue', 'mediumblue', 'blue', 'slateblue', 'darkslateblue', 'mediumpurple', 'rebeccapurple',
        'thistle', 'plum', 'violet', 'purple', 'fuchsia', 'orchid', 'mediumvioletred', 'deeppink', 'hotpink',
        'palevioletred']
    colors = [color_list[i] for i in range(num_cluster)]
    for target, color in zip(num_list, colors):
        map_plotter.scatter(cluster_vals[str(target) + '_long'],
                            cluster_vals[str(target) + '_lat'],
                            latlon=True, s=size, c=color)
    map_plotter.shadedrelief()
    if (add_large_city):
        for name in list(large_city_data.keys()):
            # skip cities dropped by the dedup step (their coordinates are np.nan);
            # the original `!= np.nan` check is always True, so test with np.isnan
            if not np.isnan(float(large_city_data[name][1])):
                x, y = map_plotter(float(large_city_data[name][1]),
                                   float(large_city_data[name][0]))
                plt.plot(x, y, "ok", markersize=4)
                plt.text(x, y, name, fontsize=16)
    plt.show()
    fig.savefig(path)
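The hardcoded color list above caps the function at 55 clusters. As an alternative not used in the original article, evenly spaced hues for any cluster count can be generated with the standard-library colorsys module (the function name is my own):

```python
import colorsys

def distinct_colors(n):
    # n evenly spaced hues on the HSV wheel, as (r, g, b) tuples in [0, 1],
    # which matplotlib accepts anywhere it takes a color argument
    return [colorsys.hsv_to_rgb(i / n, 0.8, 0.9) for i in range(n)]

colors = distinct_colors(12)
print(len(colors))  # 12
```

Passing such a list to the loop over `zip(num_list, colors)` would remove the fixed upper bound on `num_cluster`.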

Step 3: Running the Function

The print_k_means function is run on the df_no_us dataframe to make a scatterplot of the latitudes and longitudes of headlines pertaining to the US. Next, the geographic center of each cluster is determined and stored in another dataframe called df_center_us. The print_k_means function is run on the df_center_us dataframe with large cities added, to determine the cities closest to the disease outbreak centers; the marker size is also increased for easier readability. A similar process is run for df_no_world. Each of the dataframes is stored in a “.csv” file.

import pandas as pd

print_k_means(df_no_us, us_clusters, "US", "corona_disease_outbreaks_us.png",
              50, False, False)
df_no_us.to_csv("corona_disease_outbreaks_us.csv")

df_center_us = {'latitude': [], 'longitude': [], 'cluster_label': []}
for i in range(us_clusters):
    df_1 = df_no_us.loc[df_no_us['cluster_label'] == i]
    df_1 = df_1.reset_index()
    del df_1['index']
    latitude = []
    longitude = []
    for index in df_1.index:
        latitude.append(float(df_1['latitude'][index]))
        longitude.append(float(df_1['longitude'][index]))
    df_1['latitude'] = latitude
    df_1['longitude'] = longitude
    sum_latitude = df_1['latitude'].sum()
    sum_longitude = df_1['longitude'].sum()
    if (len(df_1['latitude']) >= 20):  # keep only clusters with at least 20 headlines
        df_center_us['latitude'].append(sum_latitude / len(df_1['latitude']))
        df_center_us['cluster_label'].append(i)
        df_center_us['longitude'].append(sum_longitude / len(df_1['longitude']))
df_center_us = pd.DataFrame(data=df_center_us)
for index in df_center_us.index:
    df_center_us['cluster_label'][index] = index
print_k_means(df_center_us, len(df_center_us['latitude']), "US",
              "corona_disease_outbreaks_us_centers.png", 500, True, True)
df_center_us.to_csv("corona_disease_outbreaks_us_centers.csv")

df_center_world = {'latitude': [], 'longitude': [], 'cluster_label': []}
for i in range(world_clusters):
    df_1 = df_no_world.loc[df_no_world['cluster_label'] == i]
    df_1 = df_1.reset_index()
    del df_1['index']
    latitude = []
    longitude = []
    for index in df_1.index:
        latitude.append(float(df_1['latitude'][index]))
        longitude.append(float(df_1['longitude'][index]))
    df_1['latitude'] = latitude
    df_1['longitude'] = longitude
    sum_latitude = df_1['latitude'].sum()
    sum_longitude = df_1['longitude'].sum()
    if (len(df_1['latitude']) >= 10):  # keep only clusters with at least 10 headlines
        df_center_world['latitude'].append(sum_latitude / len(df_1['latitude']))
        df_center_world['cluster_label'].append(i)
        df_center_world['longitude'].append(sum_longitude / len(df_1['longitude']))
df_center_world = pd.DataFrame(data=df_center_world)
for index in df_center_world.index:
    df_center_world['cluster_label'][index] = index
print_k_means(df_center_world, len(df_center_world['latitude']), "world",
              "corona_disease_outbreaks_world_centers.png", 500, False, True)
# fixed: the original saved df_center_us here by mistake
df_center_world.to_csv("corona_disease_outbreaks_world_centers.csv")
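The per-cluster averaging loops above can also be expressed with a pandas groupby. A sketch on hypothetical headline data (the column names mirror df_no_us, but the values are made up for illustration):

```python
import pandas as pd

# hypothetical headline coordinates already labeled by cluster
df = pd.DataFrame({
    'latitude':      [40.7, 41.0, 40.9, 34.0, 34.1],
    'longitude':     [-74.0, -73.9, -74.1, -118.2, -118.3],
    'cluster_label': [0, 0, 0, 1, 1],
})

# the mean latitude/longitude per cluster is the geographic center computed above
centers = (df.groupby('cluster_label')[['latitude', 'longitude']]
             .mean()
             .reset_index())
print(centers)
```

Cluster-size filtering (the `>= 20` / `>= 10` thresholds) could then be done with a `groupby(...).size()` mask before taking the means.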

Click this link for access to the Github repository for a detailed explanation of the code: Github.

Translated from: https://medium.com/@neuralyte/using-basemap-and-geonamescache-to-plot-k-means-clusters-995847513fc2
