Slowness in the geolocation API

I'm working on a project that uses HERE's geolocation service.
The project is basically a feature in our system that will route a list of addresses. This routing will happen every day and will involve at least around 7,000 points.
Today we use the HERE service to geocode these addresses and send them to our routing service. However, we are facing a huge bottleneck in this implementation: of the 7,000 points we use for testing, we were only able to geocode about 200. If we send a larger number of points, we simply stop receiving responses, with no timeout or error of any kind.
About the implementation: we do not send all points in the same request; each point to be geocoded is sent in its own request. We adjusted our software to send only four requests per second, thinking there might be a QPS limit, but that did not solve the problem. We also considered implementing a message queue, but that could end up increasing the total time for geocoding + routing, which for us makes the solution unfeasible.
In the code, we have an array that stores the addresses to be geocoded, and for each position of the array we execute a GET request to the following URL: https://geocoder.ls.hereapi.com/6.2/geocode.json?apiKey=TOKEN&searchtext=ADDRESS
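For reference, this is roughly what our loop looks like, sketched here in TypeScript; the endpoint and the apiKey/searchtext parameters are the ones from the URL above, while everything else (names, error handling, the 250 ms delay that gives ~4 requests per second) is simplified.

```typescript
// Minimal sketch of the per-address geocode loop described above.
// Endpoint and query parameters are from the question; the rest is illustrative.
const API_KEY = "TOKEN";
const BASE_URL = "https://geocoder.ls.hereapi.com/6.2/geocode.json";

async function geocode(address: string): Promise<unknown> {
  const url = `${BASE_URL}?apiKey=${API_KEY}&searchtext=${encodeURIComponent(address)}`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Geocode failed (${res.status}) for: ${address}`);
  return res.json();
}

async function geocodeAll(addresses: string[]): Promise<unknown[]> {
  const results: unknown[] = [];
  for (const address of addresses) {
    results.push(await geocode(address));
    // Throttle to roughly four requests per second.
    await new Promise((resolve) => setTimeout(resolve, 250));
  }
  return results;
}
```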
I'd appreciate any help finding a solution.

For a large number of geocodes you may wish to consider the Batch Geocoder API:
https://developer.here.com/documentation/batch-geocoder/dev_guide/topics/quick-start-batch-geocode.html
I cannot replicate a problem with more than 200 Geocoder requests in a row, so we may need to see some code before we can help further.

Are you using our freemium service? Just so you know, version 6.2 of the Geocoder API no longer receives new feature development, so if you are still implementing this use case, please switch to v7. Do you mean that you cannot send all 7,000 addresses and fetch responses even in chunks? It could also be that your Linux system restricts the number of network connections that can be pooled at the same moment. Try sending the requests from a home endpoint (one not behind a firewall) and from a Windows system.

Related

Cloud Run: real world experience with custom domains and issue with latency

In researching Google's Cloud Run, one issue listed is:
High request latency with custom domains when invoking from some regions
https://cloud.google.com/run/docs/issues#latency-domains
Unfortunately it does not say which regions, only that invoking from some of them is worse. I assume very high latency here means on the order of many seconds.
So:
a) is it the region the app is in, or the region the user is in? So like if an app is in eu-west1, and the user is in us-east4, does that result in high latency?
b) have people got some real world examples of numbers?
c) is the only way to avoid it (given the need for a custom domain) to proxy the request to it from a CDN? Alas, Cloudflare does not let you send a custom hostname, but I assume others will, though I'm not sure whether they would permit e.g. POST requests.

GraphQL and round trips. Is this just an iOS app issue?

I'm taking another look at graphql, and I'm trying to understand why saving round trips is a benefit to developers. Is the cost of making a request so expensive? I'm coming from a web development background.
Let's compare a standard REST API with a GraphQL API.
I need to retrieve a user's personal info and a list of their friends. A traditional REST API might require two calls: one to get the personal info, and one to get the friends.
With GraphQL, I get the same results with one request.
But as a frontend developer, I want my page to have the shortest possible idle period. I would rather render a portion of the page as fast as possible than wait for all the information in one go before rendering.
Now, from my understanding, GraphQL was partly created to solve mobile application API issues. Is there something about iOS apps that makes it more beneficial to load all the data at once, as opposed to making parallel requests? Or is there something else that I'm missing?
So generally speaking, network traffic inside the system you control (i.e. your backend) is fast. A request that may take 205ms (200ms network, 5ms data) to the outside world, might take just 6ms (1ms network, 5ms data) internally.
If you want to build a screen of your app, and that requires making two REST requests because the 2nd depends on the result of the first, you're looking at (given these crude numbers) 410ms to get the data you need to render your screen.
With GraphQL (or any other gateway layer that consolidates the data), you'd get everything in slightly over 212ms (200ms latency to GraphQL server + 6ms for each internal call).
In scenarios where you could make all your requests in parallel (i.e. they don't depend on the data from other requests), the performance benefits aren't quite as apparent, but you'll find that these situations are actually pretty rare as the complexity of your app grows.
The general rule of thumb with GraphQL is that your initial query fetches enough data for the page to be functional, if there's less important content you can always fetch that with another query.
Alongside the performance benefits, letting mobile devices make fewer network requests is a big win when it comes to battery life. Network usage is expensive, and should be avoided as much as possible.
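As a rough sketch of that consolidation (the endpoint URL and field names below are invented, not from any particular schema), the user-plus-friends example from the question becomes a single round trip:

```typescript
// One round trip: ask for the user's info and their friends in a single query.
// The endpoint URL and field names are illustrative placeholders.
const query = `
  query {
    user(id: 42) {
      name
      email
      friends {
        name
      }
    }
  }
`;

async function loadProfile(): Promise<unknown> {
  const res = await fetch("https://api.example.com/graphql", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }),
  });
  if (!res.ok) throw new Error(`GraphQL request failed: ${res.status}`);
  const { data } = await res.json();
  return data; // personal info and friends list, fetched together
}
```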
Before using GraphQL instead of a REST API, one needs to understand the benefits of GraphQL over REST.
GraphQL is a query language, and it uses the type system you define: GraphQLInt, GraphQLString, custom types, and so on (see the schema sketch after this list).
REST:
multiple round trips - slow
overfetched data
many endpoints
GraphQL:
one endpoint
declarative: types, queries, mutations
the exact shape of the data you want in one call
no underfetching, no overfetching of data
query == result, better performance
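As a sketch of what those type definitions can look like, here is a minimal graphql-js schema in TypeScript; the User type, its fields, and the stubbed resolver are invented purely for illustration:

```typescript
import {
  GraphQLInt,
  GraphQLList,
  GraphQLObjectType,
  GraphQLSchema,
  GraphQLString,
} from "graphql";

// Hypothetical "User" type; field names are illustrative only.
const UserType: GraphQLObjectType = new GraphQLObjectType({
  name: "User",
  fields: () => ({
    id: { type: GraphQLInt },
    name: { type: GraphQLString },
    friends: { type: new GraphQLList(UserType) }, // self-referencing via a thunk
  }),
});

const QueryType = new GraphQLObjectType({
  name: "Query",
  fields: {
    user: {
      type: UserType,
      args: { id: { type: GraphQLInt } },
      // Resolver stubbed with in-memory data for the sketch.
      resolve: (_src, args) => ({ id: args.id, name: "Alice", friends: [] }),
    },
  },
});

export const schema = new GraphQLSchema({ query: QueryType });
```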

How can I check that a computer can access a certain URL and send and receive e-mail?

With Delphi XE2, what is the most reliable method to detect if the computer is able to do the following things?
reach a specific website with HTTP which does not have a fixed IP address
send and receive e-mail with any local or remote e-mail client
There are too many factors involved (type of Internet connection, firewall/router rules, proxies, etc.). The most reliable approach is not to try to determine the current state at all: just attempt the desired operation (perform the actual HTTP request, or the SMTP/POP3/IMAP operation, etc.) and be prepared to react to any errors. You can detect connection-related errors and prompt the user to check their Internet connection before retrying.
Use TIdHTTP.Get and try to download http://google.com.
Of course it depends on the definition of being connected to the internet. Sometimes web traffic (port 80) is blocked while other ports are open. Fortunately, nowadays most people are actually allowed to browse the web, since it also provides help with their daily activities. Google is probably one of the least firewalled websites with one of the highest uptimes.
But still, it's a lucky guess. Depending on what you need it for, you might as well just try your thing and see if it works. If not, apparently the computer was not properly connected, even if it could reach Google. :)
[edit]
Following up on the discussion: InternetCheckConnection is a good alternative too, but note that it, too, checks the connection by contacting an actual server (via ping).
MSDN says
Use the InternetCheckConnection function to check the connection to the Internet. It attempts to ping the server designated by the URL that is passed to the function. If the FLAG_ICC_FORCE_CONNECTION flag is set and the URL is NULL, the function checks to see if there is an entry in the server database for the nearest server. If one exists, the function pings that server.
But since this function uses ping, it may be a bit faster than actually retrieving content. On the other hand, many firewalls actively refuse pings.

Large number of WebSocket connections

I am writing an application that keeps track of content pushed around between users of a certain task. I am thinking of using WebSockets to send down new content as they are available to all users who are currently using the app for that given task.
I am writing this on Rails and the client side app is on iOS (probably going to be in Android too). I'm afraid that this WebSocket solution might not scale well. I am after some advice and things to consider while making the decision to go with WebSockets vs. some kind of polling solution.
Would Ruby on Rails servers (like Heroku) support a large number of WebSocket connections open at the same time? Let's say a million connections, for argument's sake. Can anyone point me to material on this kind of thing?
Would it cost a lot more on server hosting if I architect it this way?
Is it even possible to maintain millions of WebSockets simultaneously? I feel like this may not be the best design decision.
This is my first try at a proper Rails API. Any advice is greatly appreciated. Thx.
A million connections over WebSockets, using Ruby? I can't see that being realistic unless you use clustering to spread the connections between different instances to handle all the data processing.
The problem here is serializing and deserializing data.
You also have to research how often you will need to push data from the server to the client, and whether periodic checks using AJAX would be better than holding a connection open the whole time, because holding a connection you are not using is a waste of resources. WebSockets are built on top of TCP, and connections are not "cheap": going through the OS and repeatedly asking each one whether data is available is not a simple process, and with millions of connections it is nearly impossible without the most advanced technology available.
I have heard that Erlang is able to handle millions of connections, but I don't have details on it. Also, the connection itself is one thing; processing data and interaction between connections is another. You might want to check this, because if you have heavy processing algorithms, you will definitely need to look into horizontal scaling via clustering.
If you are implementing chat, use websockets.
If you are implementing 1 way messages in realtime use server sent events.
If you are implementing 1 way messages sent every few hours or so, use APNS.
The saying goes: phone in hand, use WebSockets / server-sent events.
Phone in pocket, use APNS.
APNS will alleviate wifi dips, TCP/IP socket hangs and many other issues. Really useful. There is a chance it may take a little time to get through, but then again, there is a chance that WebSockets will take a while too.
Recent versions of iOS let you send an APNS notification to the client without showing a popup, so the app can ask the server for more information. That, along with some backgrounding implementations, really improves things.
If possible, do not implement totally anonymous clients. It is very tricky to detect if a client reinstalls the app. So you'll end up sending duplicates to the client. Need to take that into account.
APNS looks trivial to implement in Ruby, but I'd suggest resisting the urge and going with an existing gem/service that supports both Google and Apple. It is much trickier to implement than it may seem at first.
If you decide to stick with websockets, it may make sense to just leverage websockets in nginx like https://github.com/wandenberg/nginx-push-stream-module
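To make the one-way, server-to-client case above concrete, here is a minimal server-sent events sketch. It is in Node/TypeScript rather than Rails, and the /events path and payload shape are just placeholders; the point is only that one-way push can be this simple compared to managing stateful WebSocket connections.

```typescript
// Minimal sketch of one-way push via server-sent events (SSE).
import { createServer, ServerResponse } from "node:http";

const clients = new Set<ServerResponse>();

const server = createServer((req, res) => {
  if (req.url === "/events") {
    // Keep the response open and stream events to this client.
    res.writeHead(200, {
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
      Connection: "keep-alive",
    });
    clients.add(res);
    req.on("close", () => clients.delete(res));
  } else {
    res.writeHead(404).end();
  }
});

// Broadcast new content to every connected client (server to client only).
export function broadcast(payload: unknown): void {
  const frame = `data: ${JSON.stringify(payload)}\n\n`;
  for (const res of clients) res.write(frame);
}

server.listen(3000);
```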
ASIDE:
Using SMS where speed is critical is very expensive. At $1/month per phone number, with a maximum rate of 1 message per second per number, sending 100 messages per second costs $100/month plus per-message fees. A rate of 50 messages/second only costs $50/month, but then sending 1,000 messages takes 20 seconds.
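A quick back-of-the-envelope check of that trade-off, assuming the $1/month-per-number and 1-message-per-second-per-number figures above (per-message fees ignored):

```typescript
// Numbers needed, monthly cost, and time to drain a burst of messages,
// under the pricing assumptions stated above.
function smsPlan(messagesToSend: number, perSecondRate: number) {
  const numbersNeeded = Math.ceil(perSecondRate);        // 1 msg/s per number
  const monthlyCost = numbersNeeded * 1;                 // $1/month per number
  const secondsToDrain = messagesToSend / perSecondRate; // burst duration
  return { numbersNeeded, monthlyCost, secondsToDrain };
}

console.log(smsPlan(1000, 100)); // { numbersNeeded: 100, monthlyCost: 100, secondsToDrain: 10 }
console.log(smsPlan(1000, 50));  // { numbersNeeded: 50,  monthlyCost: 50,  secondsToDrain: 20 }
```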
Good luck

Exactly how accurate is IP Geolocation?

I'm setting up an iPhone tracking system for my friends, so they can submit their location to my website from their iPhone, anywhere, anytime, over WiFi or cellular data.
The website will use Google Maps to display their coordinates so that my other friends can track where they are. What I'm concerned about is the accuracy of converting an IP address to coordinates for Google Maps: exactly how accurate is tracking down a location from an IP address?
I was thinking about 95%, but that was tested in a village, where it was fairly accurate. What happens in a city? Would that cause inaccurate locations?
Any kind help appreciated.
IP geolocation is really hit-or-miss, depending on both how the user's ISP assigns IPs and on the IP geolocation database you're using. For instance, I made a simple PHP script, IP2FireEagle, which looks up your IP. I found that the database kept placing me 10+ km to the west of where I really was. Updating my entry in Host IP wasn't the greatest either, as it soon got reverted, presumably by someone else who is occasionally assigned that IP by my ISP! That being said, I found that Clarke gives very accurate coordinates (though it isn't using IP geolocation per se, but rather Skyhook's API and their WiFi geolocation database).
If it's a website for your friends and you know they have iPhones, I would suggest using the browser's support for navigator.geolocation.getCurrentPosition(). That is, get the location via JavaScript and submit it to your server via an AJAX call. Even better, since you want to use Google Maps, they give you a short tutorial on how to get your friends' locations and then update a map.
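A minimal sketch of that suggestion; the /location endpoint and payload fields are placeholders for whatever your server expects:

```typescript
// Get the device's position in the browser and submit it to the server.
function submitLocation(): void {
  if (!("geolocation" in navigator)) {
    console.warn("Geolocation is not available in this browser.");
    return;
  }
  navigator.geolocation.getCurrentPosition(
    async (position) => {
      const { latitude, longitude, accuracy } = position.coords;
      await fetch("/location", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ latitude, longitude, accuracy }),
      });
    },
    (error) => console.error("Could not get position:", error.message),
    { enableHighAccuracy: true, timeout: 10_000 }
  );
}
```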
Excerpt From:
http://www.clickz.com/822881
IP targeting has been around since the early days of ad serving. It's not very hard to write code that will strip the IP address from a request, compare it to a database, and deliver an ad accordingly. The true difficulty, as we shall see, is building and maintaining an IP database.
One of the first applications of information in an IP database was targeting to specific geographic regions. Most commercial ad management systems have IP databases that can make geographic targeting possible. However, there are a couple weaknesses in this method. The first (and biggest) problem is that, for various reasons, not all IPs can be mapped to an accurate location.
Take all the IPs associated with AOL users, for instance. Anybody who has seen a WebTrends report knows that all AOL users appear to be coming from somewhere in Virginia. This is caused by AOL's use of proxy servers to handle their web requests.
In the interest of saving space, we won't get into the reasons why AOL makes use of proxy servers. The important thing is that AOL does use them, and as a result, all its users appear to be accessing the web from Virginia. Thus, it is impossible to attach meaningful geographic location data to an AOL IP, and those IPs must be discarded from any database that wants to maintain a reasonable degree of accuracy.
Other ISPs and networks may use a method known as dynamic IP allocation for its users. In other words, a user might have a different IP address every time he visits the Internet. You can see how this might affect the accuracy of a database.
But the real difficulty in discerning geography from an IP address has to do with the level of specificity that a media planner might expect from this targeting method. The first few geo-targeted campaigns that I put together early in my career had to be accurate to the ZIP code level. This level of specificity is not practical via IP targeting.
