Count number of buffers that touch a (polygon) feature

I'm facing the following task in ArcGIS - I'm using ArcMap 10.2
I have a polygon shapefile with the counties of (say) a US state. From this shapefile, I create a layer that marks all counties in which there is at least one city of more than 50,000 inhabitants (I think of this as the treatment condition). Then I create buffers around the polygons in my layer of counties with those large cities, i.e. I draw a buffer of, say, 100 km around every county that has at least one city with more than 50,000 inhabitants.
So far so good!
The final step of this exercise should be to create, for every polygon, a count of the number of buffers touching that polygon. For instance, the buffers around counties B, C and D all touch county A. However, county A doesn't have a city of more than 50,000 inhabitants. Hence, I want the count for county A to be 3 (it's touched by B, C and D). I created the union of all my buffers, but I simply can't find the right way to create this count for every polygon.
I've done an extensive Google search and I apologize if I overlooked the obvious solution.
Any help is appreciated!
Michael Kaiser
[Staff Research Assistant UCSD]

If I understand what you want correctly, then creating the union of buffers won't help you, as it leaves you with a single object and you need the count of all buffered objects intersecting each object in the original table.
In SQL I would join the original (all counties) layer to your new (filtered, buffered) layer using the STIntersects() method. Something like the following:
DECLARE @original TABLE
(
    [Original_Id] INT NOT NULL,
    [Original_Geom] GEOGRAPHY NOT NULL
);
DECLARE @filtered TABLE
(
    [Buffered_Id] INT NOT NULL,
    [Buffered_Geom] GEOGRAPHY NOT NULL
);
-- We'll pretend the above tables are filled with data
SELECT
    ORIGINAL.[Original_Id],
    COUNT(FILTERED.[Buffered_Id]) AS [NumberOfIntersections]
FROM
    @original AS ORIGINAL
JOIN
    @filtered AS FILTERED ON ORIGINAL.[Original_Geom].STIntersects(FILTERED.[Buffered_Geom]) = 1
GROUP BY
    ORIGINAL.[Original_Id];
Explanation:
In this example, the @original table would contain all of the counties in your given state, as they were before you buffered them. [Original_Id] would contain something that you can use to relate back to your data, and [Original_Geom] would contain the county's boundary.
The @filtered table would contain a subset of @original - in your case, only those counties with at least 1 city of 50,000 inhabitants. [Buffered_Id] would match records in [Original_Id] (as an example, Orange County may have Id 32) and [Buffered_Geom] would contain the county's boundary, buffered by (as in your example) 100 km.
To use my example exactly you would need to get the required data out of your tables and into mine, but you should be able to adjust the query to reference your own tables instead.
NOTE: If you do not wish "Orange County" to count "Orange County (Buffered)" in the above query, you will need to add a WHERE clause to filter them out.
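For example, using the aliases from the query above, the filter would be something like this (placed before the GROUP BY):
WHERE ORIGINAL.[Original_Id] <> FILTERED.[Buffered_Id]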
I haven't the data to hand to test this, but it should be mostly there. Hope it helps.

Related

Why do rows get "duplicated" records in Tableau when the tables are integrated?

I cannot find any helpful information about this for Tableau. For your information, I have to work with generic tables.
I am using Snowflake as the database behind a Tableau data source. In the data source, I am able to edit the connection for each endpoint's environment, such as sandbox, production, etc.
In Snowflake, I ran some queries to check that the data is accurate. I made up the simple tables and query below; the USA table and the England table each have 300 rows (it's just an example).
SELECT name, gender, age
FROM USA a
LEFT JOIN England b ON (a.gender = b.gender)
WHERE gender = 'female';
The query above technically produces duplicated rows; it returns 200 rows. I need to avoid the duplicates, so I have to take care with the join and add a GROUP BY.
SELECT name, gender, age, COUNT(*)
FROM USA a
LEFT JOIN England b ON (a.gender = b.gender)
WHERE gender = 'female'
GROUP BY 1,2,3;
As a result, the data is accurate and returns 100 rows in Snowflake.
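(For what it's worth, assuming the duplication comes only from the one-to-many join, a SELECT DISTINCT over the same join returns the same 100 rows:)
SELECT DISTINCT name, gender, age
FROM USA a
LEFT JOIN England b ON (a.gender = b.gender)
WHERE gender = 'female';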
I created the dashboard in Tableau after integrating the two connections from different environments, such as Production and Sandbox. This screenshot may be a bit helpful; it shows the different colors in the connections section (its source is here). I see the inaccurate 200-row result whether I make logical or physical tables in the data source. I also created a relationship and an index for the USA and England tables.
My goal in Tableau is to get 100 rows for the dashboard. How can I fix this to avoid the duplicated rows in Tableau's data source?
Information:
Tableau version: 2021.4

Google Big Query: How to select subset of smaller table using nested fake join?

I would like to solve the problem related to How can I join two tables using intervals in Google Big Query? by selecting a subset of the smaller table.
I wanted to use the solution by @FelipeHoffa based on the ROW_NUMBER function from Row number in BigQuery?
I have created nested query as follows:
SELECT a.DevID DeviceId,
       a.device_make OS
FROM
  (SELECT device_id DevID, device_make, A, lat, long, is_gps
   FROM [Data.PlacesMaster]
   WHERE not device_id is null and is_gps is true) a
JOIN
  (SELECT ROW_NUMBER() OVER() row_number, top_left_lat, top_left_long,
          bottom_right_lat, bottom_right_long, A, count
   FROM
     (SELECT top_left_lat, top_left_long, bottom_right_lat, bottom_right_long, A, COUNT(*) count
      FROM [Karol.fast_food_box]
      GROUP BY (....?)
      ORDER BY COUNT DESC,
      WHERE row_number BETWEEN 1000 AND 2000)) b ON a.A=b.A
WHERE (a.lat BETWEEN b.bottom_right_lat AND b.top_left_lat)
  AND (a.long BETWEEN b.top_left_long AND b.bottom_right_long)
GROUP EACH BY DeviceId,
              OS
Could you help me finalise it, please? I cannot break up the smaller table with GROUP BY; I need consistency between the two tables, and I want to select only the items with lat/long from the MASTER table that fit into the given bounding box of the smaller table. I really need to match lat/long against a box; my solution from How can I join two tables using intervals in Google Big Query? works only for small tables (approx. 1000 to 2000 rows), hence this question. Thank you in advance.
It looks like you're applying two approaches at once: 1) split a table into chunks of rows, and run on each, and 2) include a field, "A", tagging your boxes and your points into 'regions', that you can equi-join on. Approach (1) just does the same total work in more pieces (also, it's adding complication), so I would suggest focusing on approach (2), which cuts the work down to be ~quadratic in each 'region' rather than quadratic in the size of the whole world.
So the key thing is what values your A takes on, and how many points and boxes carry each A value. For example, if A is a country code, that has the right logical structure, but it probably doesn't help enough once you get a lot of data in any one country. If it goes to the state or province code, that gets you one step farther. Quantized lat/long grid cells generalize better. Sooner or later you do have to deal with falling across a region edge, which can be somewhat tricky. I would use a lat/long grid myself.
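As a rough illustration of the quantized-grid idea (a sketch only: it uses standard SQL syntax rather than the legacy dialect above, the 0.1-degree cell size and the grid_cell name are made up, and the table/column names follow the question), each point would be tagged with a cell key, the boxes tagged the same way, and the two sides equi-joined on that key:
SELECT device_id,
       lat,
       long,
       -- assumed 0.1-degree grid; this cell key plays the role of the "A" tag
       FORMAT('%d:%d',
              CAST(FLOOR(lat / 0.1) AS INT64),
              CAST(FLOOR(long / 0.1) AS INT64)) AS grid_cell
FROM Data.PlacesMaster
WHERE device_id IS NOT NULL
  AND is_gps;
A box that straddles a cell boundary would need to be tagged once per cell it overlaps, which is the edge handling mentioned above.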
What A values are you using? In your data, what is the A value with the maximum (number of points * number of boxes)?

Given collection of points and polygons, determine which point lies in which polygon (or not)

My question is almost the same as this one. But in my case, the polygons are not necessarily touching/overlapping each other; they are spread all over the space.
I have a big set of such polygons and, similarly, a huge set of points. I am currently running a RoR module that takes one point at a time and checks the intersection against one polygon at a time. The database is PostGIS. The performance is quite slow.
Is there a faster or more optimal way of doing this?
This can be done as one SELECT statement, but for performance, look into a GiST index on your polygons. For simplicity, let's say I have a table with a polygon field (geom data type) and a table with a point field (geom data type). If you are checking a list of points against a list of polygons, do a cross join so each polygon and each point is compared.
select *
from t1 inner join t2 on 1=1
where st_contains(t1.poly,t2.point) = 't'
(modified to include the table join example. I'm using a cross join, which means every polygon will be joined to every point and compared. If we're talking a large record set, get those GIS tree indexes going)
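(A minimal sketch of that index, assuming the polygon table and column are named t1 and poly as in the example above:)
CREATE INDEX t1_poly_gist ON t1 USING GIST (poly);
ST_Contains starts with a bounding-box comparison that can use this index, so the exact containment test only runs on candidate pairs.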
I'm currently doing this to locate a few million points within a few hundred polygons. If you have overlapping polygons, this will return multiple rows for every point that's located in 2 or more polygons.
It may fail depending on the data type your points are stored as. If they are in a geom field, it'll flow fine. If you are using text values, you'll need to use st_geomfromtext to turn your characters into a point. This will look more like:
st_contains(poly, st_geomfromtext('POINT(' || lon || ' ' || lat || ')')) = 't'
I used a lat/lon example. The only thing to watch for here is that st_geomfromtext requires you to build the point string from your fields using ||. Let me know if you need assistance with the st_geomfromtext concept.

rails 3 + activerecord: is there a single query to count(field1) grouped by field2?

I'm trying to find the best way to summarize the data in a table.
I have a table Info with fields
id
region_number integer (NOT associated with another table)
member_name string
member_active T/F
Members belong to a region, have a name, and are either active or not.
I'm wondering if there is a single query that will create a table with 3 columns, and as many rows as there are unique region_numbers:
For each unique region_number:
region_number
COUNT of members in that region
COUNT of members in that region with active=TRUE
Suppose I have 50 regions: I can see how to do it with 2x50 queries, but that surely is not the right approach!
You can always group on several things if you're prepared to do a tiny bit of post-processing:
SELECT region_number, COUNT(*) AS instances, member_active
FROM infos
WHERE region_number IN (?)
GROUP BY region_number, member_active
This allows you to do one query for all region numbers at the same time. For each region there will be one row for the TRUE values and one for the FALSE values, but only where those values are present.
If you see a case where you're doing a lot of queries that differ only in identifiers, that's something you can usually execute in one shot like this.
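If you would rather get exactly the three columns from the question in a single row per region, a conditional aggregate is another option; a plain-SQL sketch, assuming the table is named infos (the Rails default for the Info model) and member_active is a boolean:
SELECT region_number,
       COUNT(*) AS member_count,
       COUNT(CASE WHEN member_active THEN 1 END) AS active_count
FROM infos
GROUP BY region_number;
COUNT skips the NULLs produced when the CASE has no match, so the third column counts only active members.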

IBM Informix using spatial datablade

I need to use IBM Informix for my project. I have point coordinates and I need to find which points lie inside a query rectangle.
Informix has a spatial DataBlade module with ST_POINT and ST_POLYGON data objects.
I know how to create tables with such objects, insert into them, and create an R-tree index on them.
But the problem is how to write a SELECT statement that lists all the points in a particular rectangular region.
You've got the Spatial Datablade documentation at your fingertips? It is available in the IDS 11.50 Info Centre.
For example, the section in Chapter 1 discusses performing spatial queries:
Performing Spatial Queries
A common task in a GIS application is to retrieve the visible subset of spatial data for display in a window. The easiest way to do this is to define a polygon representing the boundary of the window and then use the SE_EnvelopesIntersect() function to find all spatial objects that overlap this window:
SELECT name, type, zone FROM sensitive_areas
WHERE SE_EnvelopesIntersect(zone,
ST_PolyFromText('polygon((20000 20000,60000 20000,60000 60000,20000 60000,20000 20000))', 5));
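Applied to the point-in-rectangle case in the question, that might look like the following sketch (the sites table, its id, name and location columns, and the rectangle coordinates are placeholders; the SRID must match the one your geometries were created with):
SELECT id, name
FROM sites
WHERE SE_EnvelopesIntersect(location,
    ST_PolyFromText('polygon((20000 20000, 60000 20000, 60000 60000, 20000 60000, 20000 20000))', 5));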
Queries can also use spatial columns in the SQL WHERE clause to qualify the result set; the spatial column need not be in the result set at all. For example, the following SQL statement retrieves each sensitive area with its nearby hazardous waste site if the sensitive area is within five miles of a hazardous site. The ST_Buffer() function generates a circular polygon representing the five-mile radius around each hazardous location. The ST_Polygon geometry returned by the ST_Buffer() function becomes the argument of the ST_Overlaps() function, which returns t (TRUE) if the zone ST_Polygon of the sensitive_areas table overlaps the ST_Polygon generated by the ST_Buffer() function:
SELECT sa.name sensitive_area, hs.name hazardous_site
FROM sensitive_areas sa, hazardous_sites hs
WHERE ST_Overlaps(sa.zone, ST_Buffer(hs.location, 26400));
sensitive_area Summerhill Elementary School
hazardous_site Landmark Industrial
sensitive_area Johnson County Hospital
hazardous_site Landmark Industrial
