Is there a way to visualize Google Keyword Planner historical search frequencies on a map (as some form of geodata)? - geolocation

I am rather new to the field and am now faced with the following challenge:
I would like to use historical and current Google search frequency data from Keyword Planner (e.g. how often do people in the UK, but also in different UK regions, search for mountain bikes) and then visualize this data on a global map as a means of visualizing market potential. Ideally this would be automated through an API, but it could also work through manual CSV import.
Additionally, I am thinking of visualizing the number and location of sports retailers by region (or city), as well as other indicators for which I can get geolocation data (income levels, population density, ...).
Any ideas on how to best approach this from a technical and tool side?
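One way to approach the manual-CSV route is to export regional search volumes from Keyword Planner, join them to region boundaries (GeoJSON), and render a choropleth. A minimal Python sketch using pandas and folium is below; the file names, column names, and the GeoJSON property key are assumptions rather than Keyword Planner's actual export format.

```python
# Sketch: plot Keyword Planner regional search volumes as a choropleth.
# File names, column names, and the GeoJSON key are hypothetical placeholders.
import pandas as pd
import folium

# Keyword Planner export, assumed columns: "region", "avg_monthly_searches"
df = pd.read_csv("keyword_planner_uk_regions.csv")

m = folium.Map(location=[54.5, -2.5], zoom_start=6)  # centred on the UK
folium.Choropleth(
    geo_data="uk_regions.geojson",            # region boundaries (hypothetical file)
    data=df,
    columns=["region", "avg_monthly_searches"],
    key_on="feature.properties.region",       # must match the GeoJSON property name
    fill_color="YlGnBu",
    legend_name="Avg. monthly searches: 'mountain bike'",
).add_to(m)

m.save("search_volume_map.html")
```

The same pattern extends to the other layers you mention (retailer counts, income, population density): each becomes another DataFrame keyed by region, rendered as an additional choropleth or marker layer on the same map.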

Related

Storing any number series data in a time-series database

I would like to use the time-series database InfluxDB to store data points indexed by another number instead of the time that each data point is normally stored against, so I can take advantage of all the features for a series of data points against this number.
For example, I have a rocket doing multiple launches, with several sensors recording temperature, air pressure, fuel level etc., and I want to graph these data points against elevation, not time.
I realise I could store elevation itself against time, then for, say, a temperature reading work out the elevation from its timestamp and project the results - but that working out would lose the performance characteristics of simply querying the data points indexed by elevation. Also, third-party tools which use the time-series database, e.g. Grafana, won't be able to simply fetch these data points against elevation rather than time to graph them, without me putting something in between to marry the data up.
One idea I had was to have a fake time where metres = seconds and store against this; then I would need to make that a composite with something else to differentiate rocket launches, e.g. incrementing the year by 1 starting at year 0, so that I don't see every launch starting at the same elevation and can separate the "number series" from each other - I guess I would have that problem anyway, and the proper way to handle it would be through tags.
What makes you believe that this approach would be more efficient than storing the elevation jointly with your other sensor data? Fetching data is pretty cheap, so the performance gain might be very slight compared to the added complexity of your keys. Not to mention that you would still need to make the time part of your elevation-based timestamp, otherwise you will end up with duplicate pseudo-timestamps and therefore incomplete data, as most time-series databases do not allow multiple values at the same timestamp for a given series.
I would also encourage you to have a look at other time-series databases which include elevation as part of their standard data model. Check out Warp 10 for that matter (standard disclaimer: I am the co-founder of SenX, maker of Warp 10).
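To illustrate the first suggestion (keeping elevation as an ordinary field next to the other readings, with launches separated by a tag), a minimal sketch assuming InfluxDB 2.x and the influxdb-client Python package might look like this; the bucket, token, and measurement names are placeholders.

```python
# Sketch: store elevation jointly with the other sensor readings, keyed by real
# time, and pull it back per launch to plot temperature vs. elevation client-side.
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
write_api = client.write_api(write_options=SYNCHRONOUS)

point = (
    Point("telemetry")
    .tag("launch", "flight-003")        # tags separate launches, as suggested above
    .field("elevation_m", 1523.0)       # elevation stored jointly with the readings
    .field("temperature_c", -12.4)
    .field("pressure_hpa", 842.0)
)
write_api.write(bucket="rocket", record=point)

# Flux query: fetch one launch and pivot fields into columns, so elevation can be
# used as the x-axis in the client (or by Grafana after the same pivot).
query = '''
from(bucket: "rocket")
  |> range(start: -30d)
  |> filter(fn: (r) => r._measurement == "telemetry" and r.launch == "flight-003")
  |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")
'''
frame = client.query_api().query_data_frame(query)
```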

Recommender system result curation

So I want to ask if there's some sort of curation algorithm that arranges/sends results from a recommender system to a user.
For example, how does Twitter recommend feeds to users? Is there some sort of algorithm that does that, or does Twitter just sort by the highest number of interactions with a tweet (also taking the time it was posted into account)?
No, there is nothing quite like that.
Recommendation system models are typically built so that they rank content using content-based filtering or collaborative filtering, driven by the user's viewing stats.
One family of algorithms calculates the correlation between a user's viewing stats and the content on Twitter, and then recommends accordingly.
Cosine similarity (or cosine distance) can also be used to measure how close a user's viewing stats are to a piece of Twitter content, as sketched below.
You should also explore other recommendation approaches based on algorithms such as Pearson correlation, weighted averages, etc.
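As an illustration of the cosine-similarity idea, here is a toy sketch; the user-profile and tweet feature vectors are made-up placeholders, not anything Twitter actually exposes.

```python
# Toy sketch: rank candidate tweets by cosine similarity between a user-profile
# vector and each tweet's feature vector (e.g. topic weights). All data invented.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

user_profile = np.array([0.8, 0.1, 0.5])   # e.g. interest in [sports, politics, tech]

candidates = {
    "tweet_1": np.array([0.9, 0.0, 0.2]),
    "tweet_2": np.array([0.1, 0.9, 0.1]),
    "tweet_3": np.array([0.4, 0.1, 0.8]),
}

ranked = sorted(candidates.items(),
                key=lambda kv: cosine_similarity(user_profile, kv[1]),
                reverse=True)
for tweet_id, vec in ranked:
    print(tweet_id, round(cosine_similarity(user_profile, vec), 3))
```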

Information retrieval feedback in practice

From the course "Text Retrieval and Search Engines" on Coursera I learnt some feedback algorithms in information retrieval system, like Rocchio. But I still can't understand how feedback is used in practical.
Why all feedback algo update the query vector instead of updating the document rank directly?
Are the document click through feedback stored in Postings list?
Thanks
But I still can't understand how feedback is used in practice.
Since you've studied Rocchio feedback, I'll try to explain with reference to this particular approach, although this applies to other feedback methods as well, e.g. relevance modeling.
The Rocchio algorithm first modifies the current query representation (by adding new terms and re-weighting the initial query terms). It then performs a second-pass retrieval and obtains a new ranked list.
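To make this concrete, here is a minimal sketch of the Rocchio update on TF-IDF vectors; the alpha/beta/gamma weights and the toy vectors are illustrative rather than tuned values.

```python
# Sketch of the Rocchio query update: q' = a*q + b*mean(relevant) - g*mean(non-relevant)
import numpy as np

def rocchio(query_vec, relevant_docs, nonrelevant_docs,
            alpha=1.0, beta=0.75, gamma=0.15):
    """Return the modified query vector from relevance-feedback judgements."""
    q_new = alpha * query_vec
    if len(relevant_docs) > 0:
        q_new = q_new + beta * np.mean(relevant_docs, axis=0)
    if len(nonrelevant_docs) > 0:
        q_new = q_new - gamma * np.mean(nonrelevant_docs, axis=0)
    return np.maximum(q_new, 0.0)   # negative term weights are usually clipped to 0

q = np.array([1.0, 0.0, 0.5, 0.0])              # original query over a 4-term vocabulary
rel = np.array([[0.9, 0.2, 0.4, 0.0],
                [0.8, 0.0, 0.6, 0.1]])          # vectors of judged-relevant documents
nonrel = np.array([[0.0, 0.9, 0.0, 0.7]])       # vectors of judged-non-relevant documents

q_prime = rocchio(q, rel, nonrel)
# q_prime is then used for the second-pass retrieval against the index.
```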
Why do all feedback algorithms update the query vector instead of updating the document ranking directly?
This is because if the initial query representation is not good enough, the initial ranked list won't have high recall. This means that even re-ranking the results won't be very useful (unless, of course, you're doing a highly precision-oriented task and all you care about is P@10). Additional terms in the query will often have a significant impact in retrieving more relevant documents in the top 1000.
Is document click-through feedback stored in the postings list?
No. A postings list may additionally contain per-document statistics for a particular term (the head of the list), e.g. term positions. Whether a document was clicked or not is global information, not pertaining to a specific term.
Also, user clicks are not used to modify the ranking of the current query. Rather, they could be used to build profiles of user interests.

Applying AI, recommendation or machine learning techniques to search feature

I'm new to the area of AI, machine learning, recommendation engines and data mining; however, I would like to find a way to get into the area.
I'm working on a conference room booking application which will recommend meeting rooms to employees at what it calculates to be the most suitable time and location. The recommendations are based on criteria which an employee will enter before submitting a search. The criteria can include meeting attendees (who can be in different locations and timezones), room capacity (based on attendees) and the types of equipment required.
The recommendation engine will take timezones and locations into consideration and recommend one or more meeting rooms, depending on whether employees are in different buildings/geographical regions.
Can anyone recommend recommendation engine, machine learning or AI techniques which I could apply to solving this problem? I'm new to this area, so all suggestions are greatly appreciated.
This looks more like an optimization problem. You have some hard constraints and some preferences. Look at linear programming. Also search for constraint-based scheduling; there are several tutorials.
Just a warning: this is in general an NP-hard problem, so unless you are trying to solve it for a small number of participants, you will need to use some heuristics and approximations. If you want to go a little bit overboard, there is a Coursera class on optimization running right now.
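To make the "hard constraints plus preferences" framing concrete, here is a toy sketch in plain Python that enumerates candidate (room, hour) pairs, discards those violating hard constraints, and ranks the rest by a simple preference score; all rooms, timezone offsets, and scoring choices below are invented.

```python
# Toy constraint-based scheduling sketch: hard constraints filter, preferences rank.
from itertools import product

rooms = [
    {"name": "London-A", "capacity": 8,  "equipment": {"vc", "whiteboard"}},
    {"name": "Berlin-B", "capacity": 12, "equipment": {"vc"}},
]
attendee_tz_offsets = [0, 1, -5]          # e.g. London, Berlin, New York attendees
required_capacity = 6
required_equipment = {"vc"}
candidate_hours_utc = range(24)

def within_working_hours(hour_utc, tz_offset, start=9, end=17):
    local = (hour_utc + tz_offset) % 24
    return start <= local < end

def preference_score(hour_utc):
    return -abs(hour_utc - 10)            # soft preference: slots near 10:00 UTC

feasible = []
for room, hour in product(rooms, candidate_hours_utc):
    if room["capacity"] < required_capacity:
        continue                          # hard constraint: capacity
    if not required_equipment <= room["equipment"]:
        continue                          # hard constraint: equipment
    if not all(within_working_hours(hour, tz) for tz in attendee_tz_offsets):
        continue                          # hard constraint: everyone's working hours
    feasible.append((preference_score(hour), room["name"], hour))

for score, name, hour in sorted(feasible, reverse=True)[:3]:
    print(f"{name} at {hour:02d}:00 UTC (score {score})")
```

For realistic sizes you would hand the same constraints to a proper solver (linear programming or constraint programming) rather than enumerating, which is where the NP-hardness warning above starts to matter.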

Factual API vs Google Places API in terms of Distance Matrix (distance and time)

I need enough accuracy in my app, but Google Places seems to be poorly accurate when filtering by category. So I'm considering migrating to the Factual API. Have you used it? What do you think about its accuracy?
On the other hand, I NEED to know the distance to a place and the estimated travel time. I'm getting this info with the Google Distance Matrix API, but I don't know whether Factual has this functionality or not.
Thanks in advance.
I used Factual's API for one app and the results were worse than Google Places', at least for the supermarket/grocery category.
If the Factual API allows you to display the data on a Google Map, you can use the Factual data with the Distance Matrix.
Factual provides distance in query results (in meters from the search center). It has a much better category tree system. Factual allows "IncludeAny(Category ids)" (Google only has single-level types and does not allow searching on multiple types). What I do is use Factual for the initial search and Google Places for detail on a particular place. Google Places has photo(s), reviews (3) and openNow (boolean).
The quality of the data is slightly better in Google. (Both need work.)
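For the distance and travel-time part, the Distance Matrix request is independent of where the place data came from; a minimal sketch of such a call (with placeholder coordinates and API key) might look like this.

```python
# Sketch: query the Google Distance Matrix API for distance and travel time
# between the user's location and a place returned by the POI search.
import requests

params = {
    "origins": "51.5074,-0.1278",        # user's location (lat,lng) - placeholder
    "destinations": "51.5155,-0.1419",   # the place of interest - placeholder
    "mode": "walking",
    "key": "YOUR_API_KEY",               # placeholder key
}
resp = requests.get(
    "https://maps.googleapis.com/maps/api/distancematrix/json", params=params
).json()

element = resp["rows"][0]["elements"][0]
print(element["distance"]["text"], element["duration"]["text"])
```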
