How exactly do address lines work in QuickBooks?

Right now I'm only trying to read addresses and display them. Ignoring IPP for the moment and looking just inside QB, I don't understand the algorithm that manages the address lines.
Further, when accessing the customer address object via IPP, there are more differences, adding to my confusion. I'll call the three areas I'm looking at the freeform block, the field block, and the IPP object. Here's an example where I typed text into the field block, making each value match its field name.
The freeform block and IPP object took the City, State and Zip values and combined them into line 3. The IPP object has the Note value in Line 4. And the Country value ends up in the City field in both the IPP object and the field block.
Here's an example where I simply typed "line 1 ... line 5" into the freeform block.
Lines 1 - 4 look OK in the field block after the conversion, but "line 5" ends up in the City field. The IPP object is missing the Line 4 field and value altogether.
Can someone explain how this works? I'm trying to read these addresses and display them in my app in a consistent way.

I'm not familiar with QuickBooks, but I think what you're looking for is "address standardization," since you aren't sure what format the address will come in from QuickBooks.
Addresses are tricky (trust me, I work at SmartyStreets, where we have to be smart ... about streets) but there are services -- free and paid -- which will standardize addresses and put them into a consistent "componentized" format.
Take a look at LiveAddress API for starters... or you could use the batch/list service if you export your data into a file. Either way, it's free to use for a certain number of addresses.
(Tip: You can submit addresses for standardization and verification in two fields: "Street" and "Last Line" and still get good results -- so if you're not exactly sure where the city/state are, just put anything that's not the street address in the last line field.)
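For illustration only (the field names here are placeholders for whatever your validation service actually calls its "Street" and "Last Line" inputs), the two-field idea looks roughly like this:
// Illustration only: 'street' and 'lastline' stand in for the
// "Street" and "Last Line" fields described in the tip above.
var query = {
    street:   '1234 Apple Street',   // the part you're confident is the street address
    lastline: 'New York NY 10011'    // everything else: city/state/ZIP, in whatever order you have it
};
// Submit `query` to the standardization service and read the
// componentized address (city, state, ZIP, etc.) from the response.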

Related

Is there a common way to write addresses of facilities in HL7 SIU/ADT messages?

I've found several references to SAD/AD/XAD data types in the HL7 spec, but don't see anything about this info being used to describe facilities, such as those described in the AIL segment (e.g. "2^BLUE HILL FACILITY"). But where, if anywhere, can I expect to get data about that facility, such as name, street address, etc.?
In the PV1 segment definition for HL7 2.x there isn't a dedicated field for Facility Address specifically. If you are the receiving system in this case, you may have to request that data be sent or added in from the sending system. This could be as simple as placing it in the PV1-5.X field (X being whatever subfield the sending system decides on), or adding in a custom Z segment (ZPV for example) to have this data transmitted.
Since there is no standard field for this kind of data in the PV1 segment though, you will have to confirm that the sending system can in fact send this data.
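For example, a custom segment carrying the facility address might look something like this (the segment name and field layout are purely illustrative and would have to be agreed with the sending system; the address components follow the usual street^other^city^state^zip^country pattern of the XAD type):
ZPV|1|BLUE HILL FACILITY|123 Main Street^^Springfield^MA^01101^USA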
The facility name will be in a separate field. It varies from segment to segment. You just have to read the segment definition and pick the right field.

Parsing six figure Ordnance Survey Grid Reference input

An extensive search has produced no answer to the question, "Is there a class or function that parses input for soundness relating to UK Ordnance Survey Grid References?"
The United Kingdom is mapped by the Ordnance Survey, which produces detailed maps with many types of referencing. One of these is the commonly used six-figure Grid Reference; we, for example, are at SO896804.
We already use a postcode (zip) checker to make sure that the information entered into the postcode field is sound, but we can't find the same for the OS Grid Reference.
Does such a Grid Reference function exist, or do we down tools and write one?
Thank you.
Since you tagged this "parsing", rather than using some GIS-like tag, I'd say that a reasonably valid OS grid reference corresponds to the regular expression:
(H[PTUWXYZ]|N[ABCDFGHJKLMNORQSTUWXYZ]|OV|S[CDEHJKMNOPRSTUVWXYZ]|T[AFGLMQRV])([0-9]{6})
If you were prepared to accept 4-digit 100ha blocks as well as the six-digit 1ha blocks, you could replace the second parenthesized expression with ([0-9]{6}|[0-9]{4}).
Of course, some of the two-letter blocks are almost completely marine. You could almost certainly ignore OV, for example. NW contains a little bit of Dumfries & Galloway (Portpatrick, NW995545), and a much larger but irrelevant part of Northern Ireland.
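If it helps, here's a quick sketch of using that expression in JavaScript (anchored so the whole string has to match, and assuming the reference is already upper-cased with no spaces):
// Sketch: validate a six-figure OS grid reference with the regex above.
var osGridRef = /^(H[PTUWXYZ]|N[ABCDFGHJKLMNORQSTUWXYZ]|OV|S[CDEHJKMNOPRSTUVWXYZ]|T[AFGLMQRV])([0-9]{6})$/;
console.log(osGridRef.test('SO896804'));  // true
console.log(osGridRef.test('XX123456'));  // false: "XX" is not a valid letter pair
console.log(osGridRef.test('SO8968'));    // false: only the six-figure form is accepted here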

iOS - Check pasteboard for valid Mailing Address

I am looking for some guidance for how I could check the pasteboard in iOS for a valid mailing address.
If someone pastes
1234 Apple Street
New York, NY 10011
It should parse each part of the string to fill in Address, City, State and Zip. It could be any address, and it would be ideal if the address could be found inside a longer string.
For example
Meet me at 1234 Apple Street New York, NY 10011 See you there!
It should still parse out the correct Address, City, State and Zip.
Any help would be much appreciated!
-Wes
I'm a developer at SmartyStreets. We're kind of crazy about street addresses, and street addresses drive me crazy (especially parsing them). It's a two-way street. (Am I done with the street puns?)
First, let's talk about the case where the address is all by itself, because that's easier, albeit still difficult...
Please reference this other question and answer about the very same thing. I also strongly encourage you to follow the links to related questions in both the question and the answer. Parsing addresses is a can of worms, but it's not impossible. It's just really hard to do reliably.
Notice in the answer to that question how many different formats valid addresses can appear in. What guarantees do you have that the user will type it in any of those? And that's only a few; there are others. Consider military, PO box, rural route, and other "special" addresses that don't adhere to the typical format. What about addresses with a two- or three-word city name? What about addresses that use a grid system like 100 N 500 E, or secondary designators like suite, apartment, floor, etc.? What about addresses with "1/2", hyphens (as required punctuation), etc.? What about addresses missing ZIP codes or the city/state?
All of these and more could be valid. And that's only for US addresses.
If all of your addresses, or even most of them (which isn't the case), came in a form like the one you proposed above, for example:
[Primary Number] [Street Name] [Any of these street suffixes]
[City Name Followed by a Comma], [State Abbreviation] [5-digit ZIP code]
Then this would be quite easy. Wouldn't that be nice?
You could try to write a regular expression like this guy or that guy, but that only works if addresses form a regular language. They don't, and regular expressions are not the answer.
There are a few services which can do this for you because they have a master list (kind of), and the software has to meet rigorous certification standards.
Obviously, since I work at SmartyStreets, I'm prone to suggest starting your search for an answer there. You can try some freeform addresses on the homepage (just fill out the "Street" field). But be aware of a few things that will probably always be an issue. LiveAddress API will be able to parse street addresses for you, most of the time. Shop around, but this should give you an idea.
Now your second question: extract a street address from a string of text. This has been extensively covered elsewhere on S.O. and the interwebs, so I won't go into a lot of detail. Basically, to do this reliably, you'll probably need some Natural Language Processing and human interaction to confirm or correct the best guess.
Don't ever assume these things about un-standardized addresses:
Starts with a number
Ends with a number
Everything between the two numbers is an address
Has a ZIP code present
No more than 2 numbers will be in an address
It's unambiguous
It exists
A street suffix will always be present
It's spelled correctly
...etc.
Again, refer to some of the other linked posts about this issue. You can make guesses, but always, always, always have a human confirm the guess if you do that. (Some Mac apps do this: if they detect an address, it gets highlighted and you can add it to your contacts. Unfortunately, I've seen a lot of false positives, and they also miss real addresses a lot.)
Good luck!
I also work at SmartyStreets, and since I'm not a developer I'm not bound by any constraints such as "it can't be done" or "there's no way to do it reliably". In fact, the ideas I come up with may not even always be possible, but I'm a problem-solver, a solution-finder, and this particular problem absolutely has a solution.
You'll need the following: a little regex, knowledge of a scripting language (Python, PHP, whatever you prefer), and access to an address validation tool (required so that you know when you get it right).
So, let's start with the example sentence:
Meet me at 1234 Apple Street New York, NY 10011 See you there!
We can be sure that every address has a beginning and an end. (You can take that to the bank!)
So, if you run a regular expression that looks for the beginning of the address within the string you can eliminate everything before the address begins. Here's a regex that will do just that:
(^(.*(?=p\.?o\.? box|h\.?c\.?r\.? |c\.?m\.?r\.?)|^[^0-9]+))
This will give you back the following:
1234 Apple Street New York, NY 10011 See you there!
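In JavaScript, for instance, that first step is a one-liner (case-insensitive flag assumed):
var text = 'Meet me at 1234 Apple Street New York, NY 10011 See you there!';
// Strip everything before the address appears to start.
console.log(text.replace(/(^(.*(?=p\.?o\.? box|h\.?c\.?r\.? |c\.?m\.?r\.?)|^[^0-9]+))/i, ''));
// -> "1234 Apple Street New York, NY 10011 See you there!"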
Now you're halfway there, but you'll need to loop through the remaining string. Another assumption you can certainly make is that an address will never be longer than 328 characters (I made up that number, but you get the picture: an address has to have an end as well, and you can shorten the string by determining the maximum acceptable USPS address length).
You're going to loop through the address string until you get a valid address out of it. To do this, start at the beginning and move one word to the right with each additional permutation. This is where the address validation service comes in handy, because you have no idea where the address ends, and that's exactly what you need to know. So, each permutation you generate from the string (remember, you're starting from the left side) will be sent for validation. Since no valid address can have fewer than two words, you'll start there. Here are the permutations from the example address as well as the validation results (I'm trying each address by entering it in the address line of the address search box on smartystreets.com):
1234 Apple ==> fail
1234 Apple Street ==> fail
1234 Apple Street New ==> fail
1234 Apple Street New York ==> fail
1234 Apple Street New York, NY ==> Bingo, valid address match. No need to keep iterating.
Now, obviously this particular address is made up, but you can try the same thing with a real address and you'll get the same results. Obviously this isn't the most sophisticated method to extract a valid address from a string, but it certainly works. And, since SmartyStreets allows you to send up to 100 addresses per query, you could permute the address string up to 99 times and get the results back in under 300ms. This won't work with every address, as you'll certainly find out, but it can very easily handle a large majority of them, regardless of how obscured the address is within the text string.
So, we started with "Meet me at 1234 Apple Street New York, NY 10011 See you there!" and within less than half a second came up with "1234 Apple Street New York, NY 10011-1000".
Pretty cool huh? It even sounds really easy coming from a non-programmer.
Let's try it with a real address:
Meet me at 4219 jon young orlando fl 32839 See you there!
Apply regex and you get:
4219 jon young orlando fl 32839 See you there!
Permute, iterate, validate:
4219 jon ==> fail
4219 jon young ==> fail
4219 jon young orlando ==> fail
4219 jon young orlando fl ==> Bingo, valid address match.
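A minimal sketch of the whole flow in JavaScript, assuming a hypothetical validateAddress() function that wraps whichever validation service you use:
// Sketch only: validateAddress() is a placeholder for a call to your
// address validation service; it should return true when the guess validates.
function extractAddress(text, validateAddress) {
    // Step 1: strip everything before the address appears to start,
    // using the regex from above.
    var candidate = text.replace(
        /(^(.*(?=p\.?o\.? box|h\.?c\.?r\.? |c\.?m\.?r\.?)|^[^0-9]+))/i, '');
    // Step 2: walk word by word from the left, validating each permutation.
    // No valid address has fewer than two words, so start there.
    var words = candidate.split(/\s+/);
    for (var end = 2; end <= words.length; end++) {
        var guess = words.slice(0, end).join(' ');
        if (validateAddress(guess)) {
            return guess;   // first permutation that validates wins
        }
    }
    return null;            // nothing validated
}
In practice you'd batch all the permutations into a single request, as described above, rather than validating them one at a time.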

Localising postal / physical address display from database fields

Can anyone point me to a list of international postal / residential / delivery address format templates that use some kind of parseable standard vocabulary for address parts?
The ideal list contains a country code then a format using replaceable tokens so I can substitute database address fields into a template to produce something printable in the local format.
for example
NZ | [first_name] [family_name]\n[company_name]\n[street_address]\n[city] [post_code]\n[country]
AU | [first_name] [family_name]\n[company_name]\n[street_address]\n[city]\n[state] [post_code]\n[country]
US | etc
UK | etc
Background: I used to have a simple freetext field to accept addresses. We're moving to support vCard download, which requires addresses to be broken down into specific fields. That's all fine: we can do the migration. I'm looking for a way to display the fields in the "correct" order for each country. Thanks for your help!
This MSDN page has the information in the format you need and seems accurate, but it covers only 33 countries. Maybe those are enough.
The Universal Postal Union offers all the information you need for a lot of countries here. This is top quality information; however, it is split across as many PDF documents as there are countries and is not in the format you need.
This page provides the information in a slightly more accessible form. As far as I can judge it is accurate (and contains a lot of valuable info), but I can't vouch for its quality or how up to date it is.
Google has a JSON-based API, used by its Android address input field library, that contains this kind of formatting information.
The field you'd be interested in is fmt. There doesn't seem to be any formal documentation on the format they use, but a proposal to include this information as part of the Unicode CLDR has matching fields (scroll down to "Detailed Breakdown of elements"); there are also some clues in Google's libaddressinput source code.
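To illustrate how you might use it: as far as I can tell from the CLDR proposal (so treat the token meanings as an assumption rather than documented behaviour), the US entry's fmt is something like %N%n%O%n%A%n%C, %S %Z, and expanding such a template is a simple substitution:
// Sketch: expand a Google/CLDR-style fmt string from database fields.
// Token meanings (%N name, %O organization, %A street address, %C city,
// %S state, %Z postal code, %n newline) are my reading of the proposal,
// not official documentation.
function formatAddress(fmt, fields) {
    return fmt
        .replace(/%n/g, '\n')
        .replace(/%([NOACSZ])/g, function (m, token) {
            return fields[token] || '';
        });
}
var usFmt = '%N%n%O%n%A%n%C, %S %Z';   // assumed fmt value for the US
console.log(formatAddress(usFmt, {
    N: 'Jane Doe',
    O: 'Acme Corp',
    A: '1234 Apple Street',
    C: 'New York',
    S: 'NY',
    Z: '10011'
}));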

Parsing a full name into its constituents

We need to develop a back-end application that can parse a full name into:
Prefix (Dr. Mr. Ms. etc)
First Name
Last Name
Middle Name
etc
Challenge here is that it has to support names of multiple countries and languages. One assumption that we have is we will always get a country and language along with the full name as input.
The full name may come in any format. For the same country/language combination, it may come as first name then last name, or the reverse. A comma will not be part of the full name.
Is this feasible? We are also open to any commercially available software.
I think this is impossible. Consider Ralph Vaughan Williams. His family name is "Vaughan Williams" and his first name is "Ralph". Contrast this with Charles Villiers Stanford, whose family name is "Stanford", with first name "Charles" and middle name "Villiers".
Both are English-speaking composers from England, so country and language information is not sufficient to establish the correct parsing logic.
Since the OP was open to any commercially available offering...
The "IBM InfoSphere Global Name Analytics" appears to be a commercial solution satisfying the original request for the parsing of a [free-form unstructured] personal name [full name]; apparently with a degree of certainty in regards to resolving some of the name ambiguity issues alluded to in other responses.Note: I have no personal experience nor association with the product, I had merely encountered this discussion and the following reference links while re-investigating effectively the same concern as described by the OP. HTH.
A general product documentation link:
http://publib.boulder.ibm.com/infocenter/gnrgna/v4r1m0/topic/com.ibm.gnr.gna.ic.doc/topics/gnr_gna_con_gnaoverview.html
Refer to the "Parsing names using NameParser" at
http://publib.boulder.ibm.com/infocenter/gnrgna/v4r1m0/topic/com.ibm.gnr.gna.ic.doc/topics/gnr_np_con_parsingnamesusingnameparser.html
The NameParser is a component API for the product per
http://publib.boulder.ibm.com/infocenter/gnrgna/v4r1m0/topic/com.ibm.gnr.gna.ic.doc/topics/gnr_gnm_con_logicalarchitecturecapis.html
Refer to the "Parsing names using IBM NameWorks" at
http://publib.boulder.ibm.com/infocenter/gnrgna/v4r1m0/topic/com.ibm.gnr.gna.ic.doc/topics/gnr_gnm_con_parsingnamesusingnameworks.html
"IBM NameWorks combines the individual IBM InfoSphere Global Name Recognition components into a single, unified, easy-to-use application programming interface (API), and also extends this functionality to Java applications and as a Web service"
http://publib.boulder.ibm.com/infocenter/gnrgna/v4r1m0/topic/com.ibm.gnr.gna.ic.doc/topics/gnr_gnm_con_logicalarchitecturenwapis.html
To clarify why I think this answers the question, ameliorating some of the previously alluded-to difficulties in accomplishing the task: if I understood what I read correctly, the APIs use the "NameHunter Server" to search the "IBM InfoSphere Global Name Data Archive (NDA)", which is described as "a collection of nearly one billion names from around the world, along with gender and country of association for each name. This large repository of name information powers the algorithms and rules that IBM InfoSphere Global Name Recognition products use to categorize, classify, parse, genderize, and match names."
FWIW, I also ran across a "Name Parser" which uses a database of ~140K names, as noted at:
http://www.melissadata.com/dqt/websmart-web-services.htm
The only reasonable approach is to avoid having to do so in the first place. The most obvious (and common) way to do that is to have the user enter the title, first/given name, last/family name, suffix, etc., separately from each other, rather than attempting to parse them out of a single string.
Ask yourself: do you really need the different parts of a name? Parsing names reliably is inherently not doable, since different cultures use different conventions (e.g. "middle name" is a typical USA-ism) and some small percentage of names will always be treated wrongly.
It is much preferable to treat a name as an "atomic", non-splittable entity.
Here are two free PHP name parsing libraries for those on a budget:
https://code.google.com/p/php-name-parser/
http://jasonpriem.org/human-name-parse/
And here is a JavaScript library in the Node package manager:
https://npmjs.org/package/name-parser
I wrote a simple human name parser in JavaScript as an npm module:
https://www.npmjs.org/package/humanparser
humanparser
Parse a human name string into salutation, first name, middle name, last name, suffix.
Install
npm install humanparser
Usage
var human = require('humanparser');
var fullName = 'Mr. William R. Jenkins, III'
  , attrs = human.parseName(fullName);
console.log(attrs);
// produces the following output:
{ salutation: 'Mr.',
  firstName: 'William',
  suffix: 'III',
  lastName: 'Jenkins',
  middleName: 'R.',
  fullName: 'Mr. William R. Jenkins, III' }
A basic algorithm could do the following:
First, see if the incoming string starts with a title such as "Mrs" (checking against a fixed list of titles) and remove it if it does.
If exactly one space is left, assume the first word is the first name and the second word is the surname (which will sometimes be incorrect).
Going beyond that would be a lot of work; see "How to parse full names" to identify avenues for improvement, and see these involved IBM docs for further implementation clues.
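A rough sketch of those first two steps (the title list is deliberately incomplete):
// Very rough sketch: strip a known title, then split what's left.
// Only handles the trivial "First Last" case; see the caveats above.
var TITLES = ['mr', 'mrs', 'ms', 'dr', 'prof'];      // incomplete by design
function parseSimpleName(fullName) {
    var words = fullName.trim().split(/\s+/);
    var first = words[0].toLowerCase().replace(/\.$/, '');
    if (TITLES.indexOf(first) !== -1) {
        words.shift();                                // drop the title
    }
    if (words.length === 2) {
        return { firstName: words[0], lastName: words[1] };
    }
    return null;                                      // anything else needs the heavier approaches
}
console.log(parseSimpleName('Dr. Jane Smith'));
// -> { firstName: 'Jane', lastName: 'Smith' }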
"Ashton Jordan" "Jordan Ashton" -- u can't tell which is the surname and which is the give name.
Also people in South India apparently don't have a surname. The same with Sherpas in the Himalayas.
But say you have a huge list of all surnames (which are never used as given names) then maybe you can use that to identify other parts of the name (Salutations/Given/Middle/Jr/Sr/I/II/...) And if there is ambiguity your name-parser could ask for human input.
As others have explained, the problem is not solvable. The best approach to storing names that I can think of is to store the full name along with the start (and potentially also the end) offset of a "primary collating subfield", which the person entering the name could indicate by highlighting it or similar. For example
John Robert Miller, Jr.
where the boldface indicates what was marked as the "primary collating subfield". This range would then be moved to the beginning of the string when generating the collating key.
Of course, this approach alone may not be sufficient if you also want to support titles (and ignore them for collation purposes)...
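A sketch of generating that collating key (the offsets are whatever range the user highlighted; punctuation cleanup is ignored here):
// Sketch: store the full name plus the [start, end) offsets of the
// "primary collating subfield", and move that range to the front
// when building the collating key.
function collatingKey(fullName, start, end) {
    var primary = fullName.slice(start, end);
    var rest = fullName.slice(0, start) + fullName.slice(end);
    return (primary + ' ' + rest.trim()).trim();
}
// "John Robert Miller, Jr." with "Miller" highlighted (offsets 12-18):
console.log(collatingKey('John Robert Miller, Jr.', 12, 18));
// -> "Miller John Robert , Jr." (punctuation cleanup left as an exercise)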
