Geo-location by Zip/Postal Code

I'm looking for a service to get a latitude/longitude dynamically by zip/postal code.
I also need this service to give me an address (city/state/country) by providing an IP Address.
I will be using this service on my website and don't want to download and maintain a database.
I looked at a few services; some were too expensive and some were free but capped at a maximum number of daily/monthly lookups.
What are some good free services and what are some good paid services (Not too expensive) that allow for a large amount of queries?
I am using asp.net c# with MS SQL Server 2008 or later.
Thanks in advance.

The US zip codes are free (maybe just not very well maintained):
http://www.census.gov/geo/www/tiger/tigermap.html#ZIP
Also see a (whole world) crowd sourcing project:
http://www.freethepostcode.org/
The UK postcode location database was leaked a year or so ago, or perhaps it was made public by some government scheme, I can't remember which. Either way, it is definitely available here: http://www.freepostcodes.org.uk/

For lat/long: Google and Yahoo allow for several thousand queries per day, at least the last time I used them.
For GeoIP lookup, I can't say. In the past, I've used aggregate data from Google AdWords. This may be true of other advertising networks, or some may give you info per user.
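The geocoding call itself is simple. Here is a rough sketch against Google's geocoding web service (Python shown for brevity; the same HTTP request can be made from ASP.NET, and the endpoint, parameters and API-key requirement are assumptions based on the current JSON API, so check the docs and your quota before relying on it):

    # Look up lat/long for a zip/postal code via Google's geocoding service.
    # Assumes the `requests` library and a valid API key.
    import requests

    def latlng_for_zip(zip_code, api_key):
        resp = requests.get(
            "https://maps.googleapis.com/maps/api/geocode/json",
            params={"address": zip_code, "key": api_key},
            timeout=10,
        )
        resp.raise_for_status()
        data = resp.json()
        if data.get("status") != "OK" or not data.get("results"):
            return None  # not found, or quota/key problem
        loc = data["results"][0]["geometry"]["location"]
        return loc["lat"], loc["lng"]

    print(latlng_for_zip("90210", "YOUR_API_KEY"))  # hypothetical key

Caching results by zip code on your side also keeps you well under the daily query limits, since zip-to-coordinate mappings rarely change.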

Related

Do Cloud Dataflow streaming workers get the Sustained Usage Discount?

On the old pricing page they mention that all the Google Compute instances used by Cloud Dataflow workers are billed based on sustained use price rules, but the new pricing page does not mention it anymore.
I assume that since it is internally using the same Compute Engine instances, the discount should probably apply, but since I couldn't find any mention of it anywhere, I would appreciate it if anyone is able to confirm this.
Old Pricing
New Pricing
In the new pricing model there is no sustained use discount.

Geodata Querying Optimisations

I am planning to write a Node.js-powered RESTful web service that I will use for a mobile application providing some sort of location-based features. The most basic use case is going to look something like this:
the user can create a resource by sending a request to the web service containing the resource's name and the user's current location (latitude and longitude)
the web service will store the metadata about this resource internally in some sort of collection
the user can query the web service for a list of resources within 5km of his current location
One of the first problems that came to mind was scalability. Let's suppose that at some point in the future the server will hold metadata for 1 million resources. When a user queries for nearby results, looping through 1 million entries and computing the distance to each one will take forever.
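For illustration, the naive version I want to avoid would be a linear scan with a great-circle (haversine) distance check against every stored resource (names here are made up; Python used just as pseudocode):

    # Naive O(n) scan: compute the haversine distance to every resource.
    from math import radians, sin, cos, asin, sqrt

    EARTH_RADIUS_KM = 6371.0

    def haversine_km(lat1, lng1, lat2, lng2):
        lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
        a = (sin((lat2 - lat1) / 2) ** 2
             + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
        return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

    def resources_within(resources, lat, lng, radius_km=5.0):
        # `resources` is an iterable of dicts with "lat" and "lng" keys
        return [r for r in resources
                if haversine_km(lat, lng, r["lat"], r["lng"]) <= radius_km]

With 1 million entries, every single query pays the full cost of that loop.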
There are many services out there that have the same flow, so I thought implementing something like this is not going to take me a lot of time. I might have been wrong.
I am now two days into researching proven methods and algorithms. By now I have read everything I could get my hands on about quadtrees, Geohashes, databases with spatial indexing support, formulas and so on. However, I still can't get the whole picture of how everything is going to work.
I was hoping that maybe someone who has worked on something similar could share his insight on what approach might be the most suitable considering this use case and the technologies that I am planning to use. Also, a short description of how it can be implemented would help me a lot!
For those who are also looking into this topic out of curiosity, my answer might not provide much clarity. However, some of the answers here might help you understand how you could achieve proximity searches using Geohashes.
My approach, after doing a little research on Redis, will be not to overcomplicate things and just use the tools that are already out there. Redis has out-of-the-box support for spatial indexing and will most probably meet all my persistence requirements for this project.
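A minimal sketch of the add/query flow with Redis's GEO commands (the key name and coordinates are made up; Python and a generic client are shown for brevity, but the raw commands are the same from any Node.js Redis client):

    # Store resources in a Redis geospatial index and query by radius.
    # Raw GEO commands via execute_command, so it works across client versions.
    import redis

    r = redis.Redis(host="localhost", port=6379)

    def add_resource(name, lat, lng):
        # Note: Redis expects longitude first, then latitude.
        r.execute_command("GEOADD", "resources", lng, lat, name)

    def resources_near(lat, lng, radius_km=5):
        # Members within the radius, closest first.
        return r.execute_command(
            "GEORADIUS", "resources", lng, lat, radius_km, "km", "ASC")

    add_resource("coffee-shop", 44.4325, 26.1039)
    print(resources_near(44.43, 26.10))

Under the hood Redis stores these as Geohash-encoded scores in a sorted set, so this is essentially the Geohash approach with the bookkeeping done for you.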
Apparently MongoDB also comes with built-in support for geodata. In fact, even relational databases like MySQL and SQLite offer such capabilities.

Using multiple MBaaS accounts for a business

Is it ethical (and also legal) to use Parse.com or another MBaaS in a way where I create apps for businesses by creating a separate account for each business? For example, to limit the number of requests that any one account makes, I wouldn't put 10 different business apps on the same Parse account; rather, I would create 10 separate accounts, one for each of the 10 businesses, and log into the respective one when I need to.
If not, what is the recommended way to build a scalable MBaaS setup that could handle such usage? I've heard Parse is a great solution for small apps, but when your requests start to build up (which they would if I have 50 businesses all going through me), the costs increase far more steeply than with other MBaaS providers.
I am looking for the most ethical and clean (and preferably low-cost) way to do this, just to get my business on its feet. I look forward to any suggestions! Thanks.
That's a great question. It would be annoying to have 10 separate logins, that's for sure. I'm not certain what the answer is, but I wouldn't suggest creating 10 different accounts.
If scalability and price are your major concerns I'd be happy to try and help you out. I work with another MBaaS, CloudMine. Costs are clearly important to you because you're just starting out but you also don't want to end up paying for something that you won't receive support on.
I have some great introductory pricing that may be a good fit for you. I can't exactly answer your question but I can offer a suggestion for a platform if you're open to other options. If you're interested in learning more I'm happy to provide some more info.
cheers
Just so I understand, you are looking to have a single BaaS account with a number of sub-app accounts under it... if so, there are a few multi-tenant mBaaS solutions out there that do that. Some also give you separate logins per app, so if you're building apps for others you can let them log in and see the data for their app.
Kumulos does this, it may work for you.
www.kumulos.com

Process Automation: for IT or each department to implement?

As an IT web developer I write mostly process automation code and reporting for all departments in the company (IT, Legal, HR, Engineering, Tech Writers, Finance & Accounting, Marketing, etc).
However, some other departments also have small programming teams (Engineering, HR and Marketing) which do some department specific work which is part of their "core job".
For instance marketing maintains our external website and therefore needs some graphic artists and HTML/CSS/JS developers to implement it. HR has a dedicated staff that only works with our salary/payment system as it's highly confidential. Engineers automate some debugging/testing with scripts that require advanced engineering knowledge to make.
How can you draw the line between which projects these small, expert, non-IT teams should handle and which IT should handle? Are there best practices or a list of criteria that could be used?
This issue is both political and technical, but I'm looking for best practices and the ideal way to draw the line, not political considerations.
You should draw the line based on the org chart and expected responsibilities; the more you can reference existing org documents the better. For example, Marketing does front-end work for the company website, but IT should be in charge of an internal intranet site.
Your org docs should already put IT in charge of internal information systems; perhaps HR is the exception because of the need for confidentiality. That exception would provide the boundary for you: anything not contained in the exception is the role of IT and not HR. HR works on its own code base and holds the keys to its database, but if the systems that code and database run on need tweaking, that should be IT and should be in line with company-wide standards.
Using this example, something like optimizing part of the network for the Engineering team is easy to answer: that is an IT job. Optimizing a test case falls to Engineering. Code for backing up and encrypting financial data is IT's responsibility; you don't need to know what the information really is, just its basic properties. Writing code to analyze financial documents would go to someone within Finance, because access to sensitive documents would be needed, and so on.

Suitability of Amazon SimpleDB for large temporal data sets emanating from thousands of separate devices

I'm trying to establish whether Amazon SimpleDB is suitable for a subset of data I have.
I have thousands of deployed autonomous sensor devices recording data.
Each sensor device essentially reports a couple of values four times an hour each day, over months and years. I need to keep all of this data for historic statistical analysis. Generally, it is write once, read many times. Server-based applications run regularly to query the data to infer other information.
The rows of data today, in SQL look something like this:
(id, device_id, utc_timestamp, value1, value2)
Our existing MySQL solution is not going to scale up much further, with tens of millions of rows. We run queries like "tell me the sum of all value1 readings from yesterday" or "show me the average of value2 over the last 8 hours". We do this in SQL but can happily change to doing it in code. SimpleDB's "eventual consistency" appears fine for our purposes.
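For concreteness, the "do it in code" version of one of those aggregates would be nothing more than this (a rough sketch; the row layout matches the simplified schema above):

    # Sum yesterday's value1 readings client-side, given raw rows
    # shaped like (id, device_id, utc_timestamp, value1, value2).
    from datetime import datetime, timedelta

    def sum_value1_yesterday(rows, now=None):
        now = now or datetime.utcnow()
        start = (now - timedelta(days=1)).replace(hour=0, minute=0,
                                                  second=0, microsecond=0)
        end = start + timedelta(days=1)
        return sum(value1 for (_id, _dev, ts, value1, _v2) in rows
                   if start <= ts < end)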
I'm reading up all I can and am about to start experimenting with our AWS account, but it's not clear to me how the various SimpleDB concepts (items, domains, attributes, etc.) relate to our domain.
Is SimpleDB an appropriate vehicle for this and what would a generalised approach be?
PS: We mostly use Python, but this shouldn't matter when considering this at a high level. I'm aware of the boto library at this point.
Edit:
Continuing to search for solutions to this, I did come across the Stack Overflow question "What is the best open source solution for storing time series data?", which was useful.
Just following up on this one many months later...
I did actually have the opportunity to speak to Amazon directly about this last summer, and eventually got access to the beta programme for what eventually became DynamoDB, but was not able to talk about it.
I would recommend it for this sort of scenario, where you need a primary (hash) key plus what might be described as a secondary index/range key, e.g. a timestamp. This gives you much greater confidence in your searches, e.g. "show me all the data for device X between Monday and Friday".
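A sketch of what that query looks like with boto3 (the newer Python SDK), assuming a hypothetical table named "sensor_data" with device_id as the hash key and an ISO-8601 utc_timestamp string as the range key:

    # Query all readings for one device within a time window.
    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource("dynamodb").Table("sensor_data")

    def readings_for_device(device_id, start_iso, end_iso):
        resp = table.query(
            KeyConditionExpression=(
                Key("device_id").eq(device_id)
                & Key("utc_timestamp").between(start_iso, end_iso)))
        return resp["Items"]

    # e.g. everything device X recorded Monday through Friday
    rows = readings_for_device("device-X",
                               "2012-03-05T00:00:00Z",
                               "2012-03-09T23:59:59Z")

Because device_id is the hash key, the query only touches that device's partition rather than scanning the whole table.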
We haven't actually moved to this yet for various reasons but do still plan to.
http://aws.amazon.com/dynamodb/
In my opinion, Amazon SimpleDB, like Microsoft Azure Tables, is a fine solution as long as your queries are quite simple. As soon as you try to do things that are a complete non-issue on relational databases, such as aggregates, you begin to run into trouble. So if you are going to do some heavy reporting it might get messy.
It sounds like your problem may be best handled by a round-robin database (RRD). An RRD stores time variable data in such a way so that the file size never grows beyond its initial setting. It's extremely cool and very useful for generating graphs and time series information.
I agree with Oliver Weichhold that a cloud-based database solution will handle the use case you described. You can spread your data across multiple SimpleDB domains (like partitions) and store your data in a way that lets most of your queries be executed within a single domain without having to traverse the entire database. Defining your partition strategy will be key to the success of moving towards a cloud-based DB. Data set partitioning is talked about here
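As a rough sketch of one such partitioning scheme using boto's SimpleDB support (domain and attribute names are invented; one domain per month, everything stored as strings since SimpleDB compares lexicographically, so timestamps go in as ISO-8601 and numbers get zero-padded):

    import boto

    sdb = boto.connect_sdb()  # reads AWS credentials from the environment

    def put_reading(device_id, utc_timestamp, value1, value2):
        domain_name = "sensor_" + utc_timestamp[:7].replace("-", "_")  # e.g. sensor_2012_03
        domain = sdb.create_domain(domain_name)  # no-op if it already exists
        item_name = "%s|%s" % (device_id, utc_timestamp)
        domain.put_attributes(item_name, {
            "device_id": device_id,
            "utc_timestamp": utc_timestamp,   # ISO-8601 sorts correctly as text
            "value1": "%012.3f" % value1,     # zero-padded so range queries work
            "value2": "%012.3f" % value2,
        })

    def value1_for_day(domain_name, day_iso):
        domain = sdb.get_domain(domain_name)
        query = ('select value1 from `%s` where utc_timestamp >= "%sT00:00:00Z" '
                 'and utc_timestamp <= "%sT23:59:59Z"'
                 % (domain_name, day_iso, day_iso))
        return [float(item["value1"]) for item in domain.select(query)]

The aggregation itself (sum, average) then happens in your application code, since SimpleDB has no aggregate functions.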
