How to get lat/lon from shape files?


I am new to geospatial data. Is there a way to extract lat/lon points from a shapefile or from a CSV file (MULTILINE)?

I have tried using Geopy, Fiona, etc. to convert a government shapefile to lat/lon, but I am not getting anywhere. Any leads would be appreciated.

Ehsan Hamzei On

It depends on which latitude and longitude you want. As the figure below shows, for a multiline you can get the individual vertices (multiple points per record), the start and end points (for a multiline, again multiple starts and ends per record), or the centroid (a single point that may not lie on any of the lines but shows roughly where they are).

(Figure source: https://autogis-site.readthedocs.io/en/latest/lessons/lesson-1/geometry-objects.html)

To access the geometry you can do as follows:

LineStrings

LineStrings are single objects, so you can access their coordinates through the coords property and then its xy property.

Access the x and y coordinate arrays of a geometry object:

simple_line_geometry.coords.xy
# Example output for a single record from a dataset of rivers in the EU:
# (array('d', [58.24166666666531, 58.13541666666535, 58.131249999998715, 58.12291666666539, 58.11874999999873, 58.110416666665316, 58.10624999999868, 58.06041666666533]),
# array('d', [81.77708333333223, 81.77708333333223, 81.77291666666562, 81.77291666666562, 81.77708333333223, 81.77708333333223, 81.77291666666562, 81.77291666666562]))

As the result shows, we get a tuple of two coordinate arrays: the x values first, then the y values. For data stored in geographic coordinates, as in my case, x is longitude and y is latitude.
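To turn that tuple into (latitude, longitude) pairs, one option is to zip the two arrays and swap the order. The sketch below uses only the standard library and assumes x holds longitude and y holds latitude. The variable names are illustrative, and the values are the first three vertices from the output above.

```python
from array import array

# Same shape as the tuple returned by coords.xy: (x values, y values)
xs = array('d', [58.24166666666531, 58.13541666666535, 58.131249999998715])
ys = array('d', [81.77708333333223, 81.77708333333223, 81.77291666666562])

# Assumption: x is longitude and y is latitude, so swap to get (lat, lon)
lat_lon_pairs = [(y, x) for x, y in zip(xs, ys)]
print(lat_lon_pairs[0])
```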

Centroid of a line:

simple_line_geometry.centroid.xy
# (array('d', [58.15020268837722]), array('d', [81.77567515758616]))

This gives a single centroid point for the line.

Start and end of the line:

You can use the boundary property, which returns a MultiPoint object:

boundary_points = simple_line_geometry.boundary
for point in boundary_points.geoms:
    print(point.xy)

# (array('d', [58.24166666666531]), array('d', [81.77708333333223]))
# (array('d', [58.06041666666533]), array('d', [81.77291666666562]))
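If you want to try the three options without a data file, here is a minimal self-contained sketch. The line and its coordinates are made up for illustration; only the shapely calls come from the answer above.

```python
from shapely.geometry import LineString

# Hypothetical line with three vertices, given as (x, y) = (lon, lat)
line = LineString([(0.0, 0.0), (2.0, 0.0), (2.0, 2.0)])

vertices = list(line.coords)                     # every vertex as (x, y)
start, end = line.coords[0], line.coords[-1]     # first and last vertex
centroid = (line.centroid.x, line.centroid.y)    # length-weighted centre
boundary = [(p.x, p.y) for p in line.boundary.geoms]

print(vertices, start, end, centroid, boundary)
```

Note that a closed line (a ring whose first and last vertex coincide) has an empty boundary, so reading coords[0] and coords[-1] may be the safer way to get the start and end.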

MultiLineString

As with the MultiPoint object above, to access the coordinates of a MultiLineString you need to iterate over the .geoms of the multi-part geometry.

Access individual coordinates:

for simple_line_geometry in multi_line_geometry.geoms:
    print(simple_line_geometry.coords.xy)

Access centroids:

for simple_line_geometry in multi_line_geometry.geoms:
    print(simple_line_geometry.centroid.xy)

Or access a single centroid for all of the lines together:

print(multi_line_geometry.centroid.xy)

Starts and ends

boundary_points = multi_line_geometry.boundary
for point in boundary_points.geoms:
    print(point.xy)
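For the CSV case, a geometry column is often stored as WKT text. The sketch below assumes that format and uses a made-up MULTILINESTRING value; it parses the text with shapely and collects the vertices and the per-part start and end points.

```python
from shapely import wkt

# Hypothetical WKT value, e.g. one cell from a CSV geometry column
multi = wkt.loads("MULTILINESTRING ((10 50, 11 51), (20 60, 21 61, 22 62))")

# All vertices of all parts as (lat, lon), assuming x = lon and y = lat
lat_lon = [(y, x) for part in multi.geoms for x, y in part.coords]

# Start and end of each part, read straight from the coordinate sequence
ends = [(part.coords[0], part.coords[-1]) for part in multi.geoms]

print(lat_lon)
print(ends)
```

Reading the endpoints from coords rather than from boundary also keeps endpoints that two parts share, which boundary leaves out.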

What to do when you have a dataframe

Here are a few examples, as I cannot cover every approach here. The data can be found here: https://data.hydrosheds.org/file/HydroRIVERS/HydroRIVERS_v10_eu_shp.zip

Centroid in one column

import geopandas as gpd

df = gpd.read_file("HydroRIVERS_v10_eu_shp/HydroRIVERS_v10_eu_shp/HydroRIVERS_v10_eu.shp") 

df['centroid'] = df.centroid  # if you want the centroid geometry
df['centroid_coords'] = df['geometry'].apply(lambda x: x.centroid.xy)  # if you want the coordinate arrays

Centroid coordinates in two columns (x, y, i.e. longitude, latitude)

df['centroid_x'] = df['geometry'].apply(lambda x: x.centroid.xy[0].tolist()[0])
df['centroid_y'] = df['geometry'].apply(lambda x: x.centroid.xy[1].tolist()[0])
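These x/y values are only longitude and latitude if the layer is stored in a geographic CRS such as EPSG:4326. Government shapefiles are often in a projected CRS with coordinates in metres, in which case reprojecting first may help. A minimal sketch, using a made-up in-memory layer in EPSG:3857 in place of a real file:

```python
import geopandas as gpd
from shapely.geometry import LineString

# Hypothetical layer in Web Mercator (EPSG:3857), coordinates in metres
gdf = gpd.GeoDataFrame(
    {"name": ["demo"]},
    geometry=[LineString([(0.0, 0.0), (111319.49079327357, 0.0)])],
    crs="EPSG:3857",
)

print(gdf.crs)                      # shows which CRS the layer uses
gdf_ll = gdf.to_crs("EPSG:4326")    # reproject to longitude/latitude degrees

line = gdf_ll.geometry.iloc[0]
lon_end, lat_end = line.coords[-1]  # last vertex after reprojection
print(lon_end, lat_end)
```

With a real file, the same to_crs call would be applied to the dataframe returned by gpd.read_file before extracting any coordinates.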

All vertices of the line strings in two columns (x, y)

df['all_x'] = df['geometry'].apply(lambda x: x.coords.xy[0].tolist())
df['all_y'] = df['geometry'].apply(lambda x: x.coords.xy[1].tolist())
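The all_x and all_y columns hold one list per record, and x.coords raises an error on multi-part geometries. If you would rather have one row per vertex and also handle MultiLineStrings, something like the following may work. The two-row layer and the river_id column are hypothetical, and index_parts assumes a reasonably recent GeoPandas.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import LineString, MultiLineString

# Hypothetical two-row layer mixing a LineString and a MultiLineString
gdf = gpd.GeoDataFrame(
    {"river_id": [1, 2]},
    geometry=[
        LineString([(10, 50), (11, 51)]),
        MultiLineString([[(20, 60), (21, 61)], [(30, 70), (31, 71)]]),
    ],
    crs="EPSG:4326",
)

# Split multi-part geometries so every row holds a single LineString
parts = gdf.explode(index_parts=False).reset_index(drop=True)

# One row per vertex, keeping the attribute that identifies the record
rows = [
    {"river_id": rid, "lon": x, "lat": y}
    for rid, geom in zip(parts["river_id"], parts.geometry)
    for x, y in geom.coords
]
points = pd.DataFrame(rows)
print(points)
```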