How to extract tweets with a specific location and a specific subject - twitter

Please let me know if anything is unclear.
I am extracting tweets using the Twitter API.
My final goal is to extract tweets from Chicago that are about pizza.
That is, I want to find all the tweets that are about pizza and whose user location is in Chicago.
I can think of two approaches.
1. I tried the first approach, applying geolocation in my search.
There is a problem with that: it seems it can't apply both filters (Chicago, pizza) together.
This is a screenshot of the result:
Maybe it is because users don't state their location when they tweet, so the filter can't be applied. I am not sure.
This is the code I am using:
FilterQuery fq = new FilterQuery();
String[] keywords = {"pizza"};
double lat = 41.793474;
// Chicago is in the western hemisphere, so its longitude is negative.
double lon = -87.984886;
double lon1 = lon - .5;
double lon2 = lon + .5;
double lat1 = lat - .5;
double lat2 = lat + .5;
// Bounding box: south-west corner first, each pair as {longitude, latitude}.
double[][] box = {{lon1, lat1}, {lon2, lat2}};
fq.locations(box);
fq.track(keywords);
twitterStream.addListener(listener);
twitterStream.filter(fq);
2. The other approach, and the better one, is to extract the location of the user and use that for filtering.
I have no idea how to apply this to my code.
So I have two questions:
What is wrong with the first approach I am using? (Do you think it is because users do not state their location?)
If the first approach can't work for my case, how can I use the second approach? That is, how can I extract the user's location from their profile and then apply it to the filter?
Many thanks for taking the time :)

When you stream and track both a word and a location, the two filters are OR'ed: you get results that match either filter. You want them to be AND'ed. The best approach is to filter only by location, then write your own code that finds the tweets in those results that match your word filter.
Another problem you will face is that very few users opt to include location information.
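The second step described above (matching the word yourself after the location-only stream) could be sketched roughly as follows. This is only an illustration in Python, not Twitter4J code: the tweets are plain dicts, and the field names and the helpers `matches_keyword` and `filter_by_keyword` are assumptions made for the example.

```python
# Hypothetical tweet records: plain dicts standing in for whatever the
# streaming client delivers. The field names are assumptions.
def matches_keyword(text, keywords):
    """Return True if any keyword occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_by_keyword(tweets, keywords):
    """Keep only the location-filtered tweets that also mention a keyword."""
    return [tweet for tweet in tweets if matches_keyword(tweet["text"], keywords)]


# Pretend these tweets already passed the location-only stream filter.
chicago_tweets = [
    {"user": "a", "text": "Deep dish PIZZA tonight"},
    {"user": "b", "text": "Stuck in traffic on I-90"},
]
pizza_tweets = filter_by_keyword(chicago_tweets, ["pizza"])
```

The substring match is deliberately simple; it would also match words that merely contain "pizza", which may or may not be what you want.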

Related

Twitter4j Exception with track and geolocation

I am trying to get tweets that either contain the word "python" or come from around my city.
This is my code:
StatusListener listener = new MyStatusListener(twitter);
twitterStream.addListener(listener);
FilterQuery query = new FilterQuery();
String[] arr = { "python" };
double lat = 18.5203;
double lon = 73.8567;
double[][] locations = { { lat, lon } }; // for Pune city
query.track(arr);
query.locations(locations);
twitterStream.filter(query);
When I run this I get the following exception:
Returned by the Streaming API when one or more of the parameters are not suitable for the resource. The track parameter, for example, would throw this error if:
The track keyword is too long or too short.
The bounding box specified is invalid.
No predicates defined for filtered resource, for example, neither track nor follow parameter defined.
Follow userid cannot be read.
Location track items must be given as pairs of comma separated lat/longs: [Ljava.lang.String;@405ef8c2
[Thu Jun 26 19:06:58 GMT+05:30 2014]Parameter not accepted with the role. 406:Returned by the Search API when an invalid format is specified in the request.
Returned by the Streaming API when one or more of the parameters are not suitable for the resource. The track parameter, for example, would throw this error if:
The track keyword is too long or too short.
The bounding box specified is invalid.
No predicates defined for filtered resource, for example, neither track nor follow parameter defined.
Follow userid cannot be read.
Location track items must be given as pairs of comma separated lat/longs: [Ljava.lang.String;@405ef8c2
I get the same message twice. If I remove the locations condition, the code works fine. I am not sure what the issue is here. Can someone help, please?
That was my mistake. I was assuming that if I don't provide a pair, Twitter would assume a radius (like it does in the web UI, near a city). But it seems you have to provide a bounding box. Giving a bounding box worked for me.
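For illustration, a bounding box around a single centre point can be built like this. It is a minimal Python sketch, not the poster's actual fix: `bounding_box` and the half-size of 0.5 degrees are assumptions. The one thing taken from the Streaming API itself is the ordering, where each corner is given as longitude then latitude, with the south-west corner first.

```python
def bounding_box(lat, lon, half_size_deg):
    """Return [[west, south], [east, north]] around a centre point.

    Longitude comes first in each pair, and the south-west corner comes
    first, which is the ordering the streaming `locations` filter expects.
    """
    south_west = [lon - half_size_deg, lat - half_size_deg]
    north_east = [lon + half_size_deg, lat + half_size_deg]
    return [south_west, north_east]


# Centre point for Pune taken from the question; 0.5 degrees is an
# arbitrary half-size chosen for the example.
pune_box = bounding_box(18.5203, 73.8567, 0.5)
```

The resulting two pairs correspond to what would be passed as the `double[][]` argument of `query.locations(...)`.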

Putting random latitude and longitude in google maps geolocation?

I have the latitude and longitude of some cities and I want to pass them to the geocoder at random. Can anyone please tell me how to do that? The values are floats and can be negative. For instance, Mexico is at lat/long 23.634501, -102.552784.
myLatlng = new google.maps.LatLng(1.352083,103.819836);
I want to pass values picked at random from a set of values to the LatLng function. Thanks.
You don't specify your programming language, so I'll answer in a language-agnostic way.
I'll assume you have a Point type that can take two double values, e.g.
class Point
{
    double Latitude;
    double Longitude;
}
and that you have a list or set of instances of Point representing the set of latitude/longitude pairs you want to select from.
List<Point> points;
Just generate a random integer between 0 and (set size - 1), inclusive. Use that random integer as an index into the list.
int index = Random(0, points.Length-1);
Point myRandomPoint = points[index];
Now use that Point in your call to Google
myLatlng = new google.maps.LatLng(myRandomPoint.Latitude, myRandomPoint.Longitude);
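As a concrete version of the pseudocode above, here is a small Python sketch. The list `points` and the helper `pick_random_point` are made up for the example; `random.choice` does the "random index into the list" step in one call.

```python
import random

# Hypothetical set of (latitude, longitude) pairs to choose from.
points = [
    (23.634501, -102.552784),  # Mexico, from the question
    (1.352083, 103.819836),    # the pair used in the question's LatLng call
]


def pick_random_point(candidates, rng=random):
    """Return one (latitude, longitude) pair chosen uniformly at random."""
    return rng.choice(candidates)


latitude, longitude = pick_random_point(points)
```

The two values would then be passed to `google.maps.LatLng` in that order, latitude first.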

finding locations within a particular distance using db2

I am using the HTML5 Geolocation API to get my position as latitude and longitude. I want to store positions in a table of locations and retrieve the locations within a particular distance.
My current latitude and longitude are stored in the variables "latval" and "longval", and the distance in "distance".
My table is "location".
Its columns are "location", "lat", and "long".
I am using DB2 Express-C as the database, and the latitude and longitude columns are currently of type double. What type should I use to store these values, and what would be the query to get the location names within a given distance?
Thank you.
It looks like there's an extension for Express C that includes Spatial processing. I've never used it (and can't seem to get access at the moment), so I can't speak to it. I'm assuming that you'd want to use that (find all locations within a radius is a pretty standard query).
If for some reason you can't use the extension, here's what I would do:
Keep your table as-is, or maybe use a float data-type, although please use full attribute names (there's no reason to truncate them). For simple needs, the name of the 'location' can be stored in the table, but you may want to give it a numeric id if more than one thing is at the same location (so actual points are only in there once).
You're also going to want indices covering latitude and longitude (probably one each way, or one covering each column).
Then, given a starting position and distance, use this query:
SELECT name, latitude, longitude
FROM location
WHERE (latitude >= :currentLatitude - :distance
AND latitude <= :currentLatitude + :distance)
AND (longitude >= :currentLongitude - :distance
AND longitude <= :currentLongitude + :distance)
-- The previous four lines reduce the points selected to a box.
-- This is, of course, not completely correct, but should allow
-- the query to use the indices to greatly reduce the initial
-- set of points evaluated.
-- You may wish to flip the condition and use ABS(), but
-- I don't think that would use the index...
AND POWER(latitude - :currentLatitude, 2) + POWER(longitude - :currentLongitude, 2)
<= POWER (:distance, 2)
-- This uses the Pythagorean theorem to find all points within the specified
-- distance. This works best if the points have been pre-limited in some
-- way, because an index would almost certainly not be used otherwise.
-- Note that, on a spherical surface, this isn't completely accurate
-- - namely, distances between longitude points get shorter the farther
-- from the equator the latitude is -
-- but for your purposes is likely to be fine.
EDIT:
Found this after searching for 2 seconds on Google, which also reminded me that :distance would be in the wrong units (a distance rather than degrees). The revised query is:
WITH Nearby (name, latitude, longitude, dist) as (
SELECT name, latitude, longitude,
ACOS(SIN(RADIANS(latitude)) * SIN(RADIANS(:currentLatitude)) +
COS(RADIANS(latitude)) * COS(RADIANS(:currentLatitude)) *
COS(RADIANS(:currentLongitude - longitude))) * :RADIUS_OF_EARTH as dist
FROM location
WHERE (latitude >= :currentLatitude - DEGREES(:distance / :RADIUS_OF_EARTH)
AND latitude <= :currentLatitude + DEGREES(:distance / :RADIUS_OF_EARTH))
AND (longitude >= :currentLongitude -
DEGREES(:distance / :RADIUS_OF_EARTH / COS(RADIANS(:currentLatitude)))
AND longitude <= :currentLongitude +
DEGREES(:distance / :RADIUS_OF_EARTH / COS(RADIANS(:currentLatitude))))
)
SELECT *
FROM Nearby
WHERE dist <= :distance
Please note that wrapping the distance function in a UDF marked DETERMINISTIC would allow it to be placed in both the SELECT and HAVING portions, but only actually be called once, eliminating the need for the CTE.
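To make the arithmetic of the revised query easier to check outside the database, here is a rough Python sketch of the same two steps: a bounding-box prefilter, then the spherical law of cosines. The function names and the value 6371 km for the Earth's radius (standing in for :RADIUS_OF_EARTH) are assumptions for the example, so distances here are in kilometres.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; an assumed value for :RADIUS_OF_EARTH


def great_circle_km(lat1, lon1, lat2, lon2):
    """Distance in km via the spherical law of cosines (inputs in degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_lon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(delta_lon))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cos_angle))) * EARTH_RADIUS_KM


def nearby(locations, lat, lon, distance_km):
    """Return (name, dist) pairs within distance_km, box prefilter first."""
    lat_margin = math.degrees(distance_km / EARTH_RADIUS_KM)
    lon_margin = math.degrees(
        distance_km / EARTH_RADIUS_KM / math.cos(math.radians(lat)))
    results = []
    for name, p_lat, p_lon in locations:
        if abs(p_lat - lat) > lat_margin or abs(p_lon - lon) > lon_margin:
            continue  # outside the bounding box, so skip the trigonometry
        dist = great_circle_km(lat, lon, p_lat, p_lon)
        if dist <= distance_km:
            results.append((name, dist))
    return results
```

In the database the prefilter matters because it lets the indexes do the work; in this sketch it only mirrors the structure of the SQL.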

Detecting Possible Duplicates in Rails

I have a Rails 3 application that has a model with a name and a geographic location (lat/lng). How would I go about searching for possible duplicates in my model? I want to create a cron job or something similar that checks whether two objects have a similar name and are less than 0.5 miles from each other. If so, we'll flag the objects or something.
I am using Ruby Geocoder and ThinkingSphinx in my application.
Levenshtein is as good a way as any of judging the similarity of two text strings, i.e. the names.
What I would suggest is to store the latitude and longitude separately (as well as, or instead of, the single "lat;long" string). Then you can run an SQL query to find other records within a certain distance, and THEN run Levenshtein on their names. You want to run Levenshtein as few times as possible, as it's slow.
Then you could do something like this. Let's say your model name is "Place":
class Place < ActiveRecord::Base
  def nearby_places
    range = 0.005 # adjust this to get the proximity you want
    # lat and long are fields holding the latitude and longitude as floats
    Place.find(:all, :conditions => [
      "id <> ? and lat > ? and lat < ? and long > ? and long < ?",
      self.id, self.lat - range, self.lat + range, self.long - range, self.long + range
    ])
  end

  def similars
    self.nearby_places.select do |place|
      # Levenshtein logic here: return true if self.name and place.name are similar according to your criteria
    end
  end
end
I've set range to 0.005, but I've no idea what it should be for half a mile. Let's work it out: Google says one degree of latitude is 69.13 miles, so half a mile in degrees would be 1/(69.13 * 2), which gives 0.0072, so not a bad guess :)
Note that my search logic would return places that are anywhere within a square which is a mile per side, with our current place in the centre. This would potentially include more places than a circle with 1/2 mile radius with our current place in the centre, but it's probably fine as a quick way of getting some nearby places.
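The part left as a placeholder in the Ruby code, the Levenshtein check on nearby candidates, might look something like the following. This is a Python sketch for illustration only: the dict-based places, the `lng` key, and the thresholds `degree_range=0.0072` and `max_edits=2` are all assumptions, not values the application is known to use.

```python
def levenshtein(a, b):
    """Edit distance between two strings (classic dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def possible_duplicates(place, others, degree_range=0.0072, max_edits=2):
    """Names of places inside the box around `place` with a similar name."""
    matches = []
    for other in others:
        close = (abs(other["lat"] - place["lat"]) < degree_range
                 and abs(other["lng"] - place["lng"]) < degree_range)
        # Run the slower string comparison only on nearby candidates.
        if close and levenshtein(place["name"].lower(),
                                 other["name"].lower()) <= max_edits:
            matches.append(other["name"])
    return matches
```

A fixed edit-distance threshold treats short and long names the same; scaling it by name length is one possible refinement.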

zipcode distance range formula. Which formula is correct?

I need to find all the zipcodes within a certain range of a given zipcode. I have all the zipcode lat/lon values in a DB.
I have found two formulas online that vary slightly from each other. Which one is correct?
Formula 1:
def latRange = range/69.172
def lonRange = Math.abs(range/(Math.cos(Math.toRadians(zip.latitude)) * 69.172));
def minLat = zip.latitude - latRange
def maxLat = zip.latitude + latRange
def minLon = zip.longitude - lonRange
def maxLon = zip.longitude + lonRange
Formula 2: (is identical to formula one except for the following:)
def lonRange = Math.abs(range/(Math.cos(zip.latitude) * 69.172));
(The second one does not have Math.toRadians )
After I get the min/max Lat/Lon, I intend to search a DB table using a between criteria. Which formula is correct?
It depends on the units of your lat/long. Math.cos expects radians, so if your values are in degrees, you need to convert them first (Formula 1).
I would suggest letting the DB do all the heavy lifting. Several DBs have geo add-ons; here are a couple of examples: MongoDB, PostgreSQL + PostGIS.
If your latitude and longitude data isn't already in radians then you'll need to use the one that converts. You should be aware though that the way things are set up now you'll end up with a square range.
You'll be better off doing the distance calculation in your MySQL query, using the Pythagorean theorem to find the distance so you end up with a circular range.
This question should help you get started with that: mySQL select zipcodes within x km/miles within range of y.
If you need to improve its performance you can index it using Sphinx.
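To illustrate the degrees-versus-radians point from the answers, here is a small Python sketch of Formula 1, next to what Formula 2 computes when it is given degrees. The function names are made up for the example; the constant 69.172 comes from the question.

```python
import math

MILES_PER_DEGREE = 69.172  # constant taken from the question's formulas


def search_box(lat_deg, lon_deg, range_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) for coordinates in degrees."""
    lat_range = range_miles / MILES_PER_DEGREE
    # math.cos expects radians, so convert the latitude first (Formula 1).
    lon_range = abs(
        range_miles / (math.cos(math.radians(lat_deg)) * MILES_PER_DEGREE))
    return (lat_deg - lat_range, lat_deg + lat_range,
            lon_deg - lon_range, lon_deg + lon_range)


def lon_range_without_conversion(lat_deg, range_miles):
    """Formula 2 applied to degree input: treats degrees as if they were radians."""
    return abs(range_miles / (math.cos(lat_deg) * MILES_PER_DEGREE))
```

At latitude 60 degrees the cosine is 0.5, so the longitude range from `search_box` is twice the latitude range; the unconverted version gives a noticeably different number for the same input. As the answers note, the result is still a square box, so a distance check is needed afterwards if a circular range matters.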
