OSM - Boundary Export from XML file - mapping

I've been attempting to export boundary information from an OSM file. My process is nearly there; however, the polygon I'm generating draws random lines.
I would appreciate some insight into where I may be going wrong.
Step 1: Export the OSM data into XML
osmfilter -v greater-london-latest.osm --keep="boundary= admin_level= place=" > b.txt
Step 2: Run a script to process the XML.
cycle each relation node
load the member ways
load the nodes from each specified way
record the lat/lon and build a poly set
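For reference, a minimal sketch of that processing script in Python (an illustration only, using just the standard library; it assumes the filtered file is plain OSM XML):

import xml.etree.ElementTree as ET

tree = ET.parse('b.txt')  # the osmfilter output from step 1
root = tree.getroot()

# Index node coordinates and each way's node references up front.
nodes = {n.get('id'): (float(n.get('lat')), float(n.get('lon')))
         for n in root.iter('node')}
ways = {w.get('id'): [nd.get('ref') for nd in w.findall('nd')]
        for w in root.iter('way')}

for rel in root.iter('relation'):
    poly = []
    for member in rel.findall('member'):
        if member.get('type') != 'way':
            continue
        for ref in ways.get(member.get('ref'), []):
            if ref in nodes:
                poly.append(nodes[ref])
    # 'poly' now holds lat/lon pairs in raw member order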
This produces a series of lat/lon pairs which, when I build them into a polygon, give the correct overall shape I'm looking for. However, there are issues with the connecting lines, I assume.
My polygon output
I'm actually looking for this, which is similar, but I'm obviously missing something.
Actual polygon I'm looking to generate
Again, thanks for any help.

Ways in relations are not necessarily sorted. See answers to this question on how to sort ways, especially the answer by user geocodezip.
Alternatively, you can make use of various tools/libraries to do the sorting for you. Unfortunately I can't point you directly to one, but there are various tools capable of sorting relation members, including the OSM website itself, JOSM, overpass turbo (I guess), some JS stuff, [...].
Maybe some other user can help out with pointing to some good examples?
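In the meantime, here is a rough illustration of the sorting step in Python (a minimal greedy sketch, not production code; it assumes each way is a list of node ids and that the ways really do form a single closed ring):

def order_ways(ways):
    # Greedily chain ways end-to-end by matching endpoint node ids.
    remaining = [list(w) for w in ways]
    ring = remaining.pop(0)
    while remaining:
        for i, way in enumerate(remaining):
            if way[0] == ring[-1]:       # way continues the ring
                ring.extend(way[1:])
            elif way[-1] == ring[-1]:    # way is oriented backwards
                ring.extend(list(reversed(way))[1:])
            else:
                continue
            remaining.pop(i)
            break
        else:
            raise ValueError('ways do not form a single closed ring')
    return ring

Feeding the ordered (and, where necessary, reversed) node list into your polygon builder should remove the stray connecting lines.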


Problems plotting time-series interactively with Altair

Description of the problem
My goal is quite basic: to plot time series in an interactive plot. After some research I decided to give Altair a try.
There are already QGIS plugins for time-series visualisation, but as far as I'm aware, none for plotting time-series at vector-level, interactively clicking on a map and selecting a Polygon. So that's why I decided to go for a self-made solution using Altair, maybe combining it with Folium to add functionalities later on.
I'm totally new to the Altair library (as well as Vega and Vega-Lite), and quite new to data science and data visualisation as well... so apologies in advance for my ignorance!
There are already well-explained tutorials on how to plot time series with Altair (for example here, or on the official website). However, my study case has some particularities that, as far as I've seen, have not yet been covered in combination.
The data is produced using the Python API for Google Earth Engine and preprocessed with Python and the pandas/geopandas libraries:
In Google Earth Engine, a vegetation index (NDVI in the current case) is computed at pixel-level for a certain region of interest (ROI). Then the function image.reduceRegions() is mapped across the ImageCollection to compute the mean of the ndvi in every polygon of a FeatureCollection element, which represent agricultural parcels. The resulting vector file is exported.
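For illustration, that GEE step might look roughly like this in the Python API (a hedged sketch, not the actual code: the asset name, date range, and scale are assumptions):

import ee

ee.Initialize()

parcels = ee.FeatureCollection('users/example/parcels')  # hypothetical asset
s2 = (ee.ImageCollection('COPERNICUS/S2_SR')
      .filterBounds(parcels)
      .filterDate('2019-01-01', '2019-12-31'))

def mean_ndvi_per_parcel(image):
    ndvi = image.normalizedDifference(['B8', 'B4']).rename('ndvi')
    # Mean NDVI within each parcel polygon for this acquisition.
    return ndvi.reduceRegions(collection=parcels,
                              reducer=ee.Reducer.mean(),
                              scale=10)

per_parcel = s2.map(mean_ndvi_per_parcel).flatten()  # FeatureCollection to export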
Under a JupyterLab environment, the data is loaded into a geopandas GeoDataFrame object and preprocessed, transposing the DataFrame and creating a datetime column, among other steps, in order to have the data well shaped for time-series representation with Altair.
Data overview after preprocessing:
My "final" goal would be to show, in the same graphic, an interactive line plot with a set of lines representing each one an agricultural parcel, with parcels categorized by crop types in different colours, e.g. corn in green, wheat in yellow, peer trees in brown... (the information containing the crop type of each parcel can be added to the DataFrame making a join with another DataFrame).
I am thinking of something looking more or less like the following example, but with the years in the legend replaced by parcels coloured by crop type:
But so far I haven't managed to make my data look this way... at all.
As you can see, there are many nulls in the data (this is due to the application of a cloud-masking function and to the fact that several Sentinel-2 orbits intersect the ROI). I would like to simply omit the null values for each column/parcel, but I don't know whether this data configuration can pose problems (any advice on that?).
So far I got:
The generation of the preceding graphic, for a single parcel, already takes around 23 seconds, which is something that maybe should/could be improved (how?).
And more importantly, the expected line representing the item/polygon/parcel's values (NDVI) is not even shown in the plot (note that I chose a parcel containing rather few non-null values).
For sure I am doing many things wrong; it would be great to get some advice on solving (some of) them.
Sample of the data and code to reproduce the issue
Here's a text sample of the data in JSON format, and the code used to reproduce the issue is the following:
import pandas as pd
import geopandas as gpd
import altair as alt

df = pd.read_json(r"path\to\json\file.json")
df['date'] = pd.to_datetime(df['date'])
print(df.dtypes)
df
Output:
lines = alt.Chart(df).mark_line().encode(
    x='date:O',
    y='17811:Q',
    color=alt.Color(
        '17811:Q', scale=alt.Scale(scheme='redyellowgreen', domain=(-1, 1)))
)
lines.properties(width=700, height=600).interactive()
Output:
Thanks in advance for your help!
If I understand correctly, it is mostly the format of your dataframe that needs to be changed from wide to long, which you can do either via .melt in pandas or .transform_fold in Altair. With melt, the default names for the melted columns are 'variable' (the previous column names) and 'value' (the value of each column):
alt.Chart(df.melt(id_vars='date'), width=500).mark_line().encode(
    x='date:T',
    y='value',
    color=alt.Color('variable')
)
The gaps come from the NaNs; if you want Altair to interpolate across the missing values, you can drop the NaNs:
alt.Chart(df.melt(id_vars='date').dropna(), width=500).mark_line().encode(
    x='date:T',
    y='value',
    color=alt.Color('variable')
)
If you want to do it all in Altair, the following is equivalent to the last pandas example above (the transform uses 'key' instead of 'variable' as the name for the former columns). I also use an ordinal instead of a nominal type for the color encoding, to show how to make the colors more similar to your example:
alt.Chart(df, width=500).mark_line().encode(
    x='date:T',
    y='value:Q',
    color=alt.Color('key:O')
).transform_fold(
    df.drop(columns='date').columns.tolist()
).transform_filter(
    'isValid(datum.value)'
)

Best Grasshopper plugin to analyse floor plans

I'm trying to figure out the best way to analyse a grasshopper/rhino floor plan. I am trying to create a room map to determine how many doors it takes to reach an exit in a residential building. The inputs are the room curves, names and doors.
I have tried to use Space Syntax and SYNTACTIC, but some of the components are missing. A lot of the plugins I have been looking at are good at creating floor plans but not at analysing them.
Your help would be greatly appreciated :)
You could create some sort of spine that goes through the rooms, passing only through doors, and do some path-finding across the topology, counting how many "hops" you need to reach the exit.
One way to get the topology is to create a data structure (a tuple, a KeyValuePair) that holds the curve (room) and a point (the door). Then loop each room against every other and see whether the door point of one room is closer to the other than some threshold; if it is, store the relationship as a graph (in the abstract sense: you don't really need to make lines out of it, although if you plan to use other plugins for path-finding this can be useful). Finally, run some path-finding (Dijkstra's, A*, etc.) over that graph to find the shortest route.
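For example, a minimal sketch of that idea in Python (the room names and adjacency here are hypothetical; in Grasshopper you would build the graph inside a GhPython component from the curve/door distance test described above):

from collections import deque

def door_hops(graph, start, exit_room):
    # Breadth-first search: counts the doors crossed from start to exit.
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        room, hops = queue.popleft()
        if room == exit_room:
            return hops
        for neighbour in graph.get(room, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, hops + 1))
    return None  # no route through doors

# Hypothetical adjacency produced by the door/threshold test:
graph = {'bedroom': ['hall'], 'hall': ['bedroom', 'kitchen', 'stair'],
         'kitchen': ['hall'], 'stair': ['hall', 'exit'], 'exit': ['stair']}
print(door_hops(graph, 'bedroom', 'exit'))  # -> 3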
As for SYNTACTIC: if copying the GHA (after unblocking it) from the installation path to the special components folder (or pointing to that folder from _GrasshopperDeveloperSettings) doesn't work, tick the "Memory load *.GHA assemblies using COFF byte arrays" option in _GrasshopperDeveloperSettings.
*Note that SYNTACTIC won't give you any automatic topology.
If you need some pseudo-code just write a comment and I'd be happy to help.

How to use external data in an OSRM profile

In this Mapbox blog post, Lauren Budorick shares how they got a routing engine working with OSRM that uses elevation data in order to give cyclists better routes... AMAZING!
I also want to explore the potential of OSRM's routing when plugging in external (user-generated) data, but I'm still having a hard time grasping how OSRM's profiles work. I think I get the main idea: every way (or node?) is piped into a few functions that, all together, score how good that path is.
But that's it; there are plenty of missing parts in my head, like what each of the functions Lauren uses in her profile does. If anyone could point me to some more detailed information on how all of this works, you'd make my next week much, much easier :)
Also, in Lauren's post, inside source_function she loads a ./srtm_bayarea.asc file. What does that .asc file look like? How would one generate such a file from, let's say, data stored in a PostgreSQL database? Can we use some other format, like GeoJSON?
Then, when in segment_function she uses things like source.lon and target.lat, do those refer to the raw data stored in the .asc file? Or is that file processed into some standard form first?
As you can see, I'm a complete newbie on routing and maybe GIS in general, but I'd love to learn more about the standards and tools that circle around the OSRM ecosystem. Can you share some tips with me?
I think I get the main idea: every way (or node?) is piped into a few functions that, all together, score how good that path is.
Right: every way and every node are scored as they are read from an OSM dump, to determine the passability of a node and the speed of a way (used as the scoring heuristic).
A basic description of the data format can be found here. As it says there, SRTM data is immediately available as ArcInfo ASCII grids. Currently, plaintext ASCII grids are the only supported format. There are several great Python tools that may help GIS developers convert other data types to ASCII grids - check out rasterio, for example. Here's an example of a really simple Python script to convert NED IMGs to ASCII grids:
import sys
import rasterio as rio
import numpy as np

args = sys.argv[1:]

with rio.drivers():  # older rasterio API; recent versions use rio.Env()
    with rio.open(args[0]) as src:
        elev = src.read()[0]
        profile = src.profile

def shortify(x):
    if x == profile['nodata']:
        return -9999
    elif x == np.finfo(x).tiny:
        return 0
    else:
        return int(round(x))

out_elev = [[shortify(x) for x in row] for row in elev]

# Append the rows of cell values; note that the Arc/Info header
# (ncols, nrows, xllcorner, ...) still has to be written separately,
# which is presumably why the file is opened in append mode.
with open(args[0] + '.asc', 'a') as dst:
    np.savetxt(dst, np.array(out_elev), fmt="%s", delimiter=" ")
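For reference, an Arc/Info ASCII grid is plain text: a six-line header followed by rows of cell values, along these lines (the numbers here are purely illustrative):

ncols        4
nrows        3
xllcorner    -122.60
yllcorner    37.20
cellsize     0.0008333
NODATA_value -9999
12 13 13 14
11 12 -9999 14
10 11 12 13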
Regarding source.lon and target.lat: source and target are nodes provided as arguments by the extraction process, and their coordinates are used to look up data at each location during extraction.
Make sure to read thoroughly through the relevant wiki page (already linked).
Alternatively, feel free to open a GitHub issue at https://github.com/Project-OSRM/osrm-backend/issues with OSRM questions.

Error detection on food packaging using OpenCV

I am trying to determine whether or not a food package has an error. For example, whether the "McDonald's" logo has misprints, such as a wrong label or a wrong colour (I can not post a picture).
What should I do? Please help me!
It's not a trivial task by any stretch of the imagination. Two images of the same identical object will always differ according to lighting conditions, perspective, shooting angle, etc.
Basically you need to:
1. Process the 2 images into "digested" data - dominant colours, shapes, etc.
2. Design and run your own similarity algorithm between the 2 objects
You may want to look at feature detectors in OpenCV: SURF, SIFT, etc.
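For instance, here is a minimal comparison sketch in Python using ORB (a patent-free alternative to SURF/SIFT); the file names, distance threshold, and score cut-off are illustrative assumptions, not a tuned pipeline:

import cv2

# Hypothetical file names for the reference artwork and a production shot.
ref = cv2.imread('reference_logo.png', cv2.IMREAD_GRAYSCALE)
test = cv2.imread('production_sample.png', cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()
kp_ref, des_ref = orb.detectAndCompute(ref, None)
kp_test, des_test = orb.detectAndCompute(test, None)

# Hamming distance suits ORB's binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des_ref, des_test)
good = [m for m in matches if m.distance < 40]  # illustrative threshold

score = len(good) / max(len(kp_ref), 1)
print('possible misprint' if score < 0.5 else 'looks OK')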
While searching for something else I just found your question, so I may be coming too late.
If not, I think your problem can easily be solved: a tool for this has existed for years and is called Sikuli.
While it's intended for testing purposes, I have been using it in the same way you need: comparing a reference image and a production image. Based on OpenCV, it does this very well.

GDAL library ogrinfo -spat option

Has anyone used the ogrinfo [-spat xmin ymin xmax ymax] option in the GDAL tools? I am able to run the -sql query on the shapefiles and get the answers/shapes; however, if I use lat/long values in -spat, I don't get any results (although I don't get an error either). I could not find an example at all.
This suggests one of three options:
Your file is not georeferenced as you expect
Your area of interest extent values are entered incorrectly
There are no features that both match your sql query AND lie within the extent as you have defined it.
You can test your understanding of the -spat values by creating a polygon to match and overlaying it in (say) QGIS to see how it overlaps with your data. This will quickly help you eliminate the options above. After that - you'll understand the feature for future use :)
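For reference, a minimal example of the option in use (the file and layer names are made up). Note that the -spat values must be given in the layer's own coordinate system, so lat/long values will only select features if the shapefile is actually stored in geographic coordinates:

ogrinfo -ro -spat -0.51 51.28 0.33 51.69 parcels.shp parcels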
