FTS4 indexing phone number - fts4

How can I index phone numbers of the form (001)12-345-6789 so that users can search by any part of the number, for example 567, without having to type the - sign?
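One possible approach, offered as a sketch under assumptions rather than as an accepted answer: FTS4 matches whole tokens or token prefixes (such as 567*), not arbitrary substrings, so a substring search can be emulated by storing every suffix of the digits-only number as its own token and then running a prefix query. The table name phones, its column names, and the helper functions below are hypothetical, and the sketch assumes the SQLite build behind Python's sqlite3 module has FTS4 enabled.

```python
import re
import sqlite3


def digits_only(raw):
    """Drop everything except digits: '(001)12-345-6789' -> '001123456789'."""
    return re.sub(r"\D", "", raw)


def suffix_tokens(digits):
    """Space-separated suffixes, so any substring is the prefix of some token."""
    return " ".join(digits[i:] for i in range(len(digits)))


def build_index(conn, numbers):
    # 'display' keeps the original formatting; 'suffixes' is what gets searched.
    conn.execute("CREATE VIRTUAL TABLE phones USING fts4(display, suffixes)")
    conn.executemany(
        "INSERT INTO phones (display, suffixes) VALUES (?, ?)",
        [(n, suffix_tokens(digits_only(n))) for n in numbers],
    )


def search(conn, user_input):
    query = digits_only(user_input)  # also removes FTS query syntax characters
    if not query:
        return []
    rows = conn.execute(
        "SELECT display FROM phones WHERE suffixes MATCH ?", (query + "*",)
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
build_index(conn, ["(001)12-345-6789", "(001)98-765-4321"])
print(search(conn, "567"))    # only the first number contains 567
print(search(conn, "45-67"))  # the dash is ignored, so this searches for 4567
```

The trade-off is index size: a number with n digits stores roughly n*n/2 characters of suffixes, which stays small for phone numbers.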

Related

How can I write a Rails query to find phone number which are all saved with various formats

I'm using Rails 5.2.0
I am trying to write a query to find a phone number based off user input. However, the format of phone numbers saved in the database varies (e.g. (123)-456-7890, +1(123)-456-7890, +1 123456789, and so on). Is there any way I can format the records saved in my database in this query? I've thought of adding a second column to the table that would simply be formatted_telephone, but I have tens of thousands of records. Can I add a method in the User controller to update these records when they are fetched?
Here is what I have so far:
User.where("REGEXP_REPLACE(telephone, '[^[:digit:]]', '') ~* ?", "%#{input}%")
Right now this is still only returning phone numbers with this format: 1234567890.
Am I on the right track with this? Or is it not possible to format columns when querying?
Normally, a where clause with a regexp asks something like "find me everything that matches this regexp". Here, though, you are asking the DB for a phone number that matches "12035551212", and you want the where clause to apply a regexp to every single phone number in the table in order to match it. I guess you could try something like this (it can be streamlined, but I'm breaking it down to make it easier to follow):
my_phone = '12035551212'
phone_arr = my_phone.split('')
#=> ["1", "2", "0", "3", "5", "5", "5", "1", "2", "1", "2"]
regx = '^\D?' + phone_arr[0] + '?' + '\D*' + phone_arr[1..-1].join('\D*') + '$'
#=> '^\D?1?\D*2\D*0\D*3\D*5\D*5\D*5\D*1\D*2\D*1\D*2$'
Now you have a regexp that matches only your phone number, regardless of format. So now you can try:
User.where('phone_number ~* ?', regx)
This asks Postgres to match your very specific regexp, built from the phone number you are searching for, which should get you what you need. But I would look at refactoring it.
In the long run I would standardize all numbers in the DB. You could add a phone_number_e164 column to Users and convert every one to the E164 format using a regexp. Then remove the old phone number column and rename the new one to phone_number. You would also need to add code to standardize any new numbers coming into the DB.
As a stop-gap measure you could also create a Postgres view that grabs the User records and applies regexp to the phone number to transform it to E164 format, and access that view instead of the User table.
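The standardization step can be sketched outside the database as well. The snippet below is in Python rather than Ruby purely for illustration; the function name to_e164, the default country code 1, and the 10-digit national length are assumptions that only fit North American numbers. Real data would likely be better served by a dedicated library such as libphonenumber.

```python
import re


def to_e164(raw, default_country_code="1", national_length=10):
    """Naively normalize a phone number string to '+<digits>'.

    Assumption: a number with exactly `national_length` digits has no
    country code, so `default_country_code` is prepended.
    """
    digits = re.sub(r"\D", "", raw)
    if not digits:
        return None  # nothing usable in the input
    if len(digits) == national_length:
        digits = default_country_code + digits
    return "+" + digits


for stored in ["(123)-456-7890", "+1(123)-456-7890", "1 123 456 7890"]:
    print(stored, "->", to_e164(stored))
```

Once every record holds a value in this form, a lookup becomes a plain equality comparison against the normalized user input instead of a regexp over the whole table.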

How to filter Quickblox users?

I want to filter my application's users by phone number or email, but I don't want an exact match; instead, every user whose email or number contains the search term should be returned in the response. Is there a way to do this in the Quickblox iOS SDK?
Suppose, I've some quickblox users like below :
ID / Name | Email Address | Mobile Number
User1 | yuyuqabc@somedomain.com | +91-12345-67890
User2 | qerqrorp@somedomain.com | +1-123-000-7891
User3 | xyzabcqry@somedomain.com | +64-123-456-78
Now the filter should work like this:
If I query for emails containing "abc", it should return the 1st and 3rd users.
If I query for phone numbers containing "23", it should return all users.
If I query for phone numbers containing "234", it should return the 1st and 3rd users.
Is it possible?
It is possible only if you use the CustomObjects module instead of Users.
You will need to create a User class in CustomObjects, where all the operators are available.
In Users it is impossible.

Solr Search Stripping Phone Number in Database before Comparison

I have a list of restaurants that I want to match with the restaurants in my database using phone number. The problem is that the numbers are formatted differently (i.e. (123)123-1234 or 123 123-123 or other combinations).
My current Solr search looks like this:
search = Restaurant.solr_search do
with(:phone, SunspotHelper.sanitize_term(pr.phone).gsub(/\s+/, ""))
paginate page: 1, per_page: 15
end
SunspotHelper.sanitize_term(pr.phone).gsub(/\s+/, "") will strip down my search query to just numbers. However, the values in my database still contain other non-numeric characters and thus, search.hits returns an empty array because I'm not getting any results.
Is there a way to strip my database values (:phone) before Solr does a search?
Thanks.
Configure WordDelimiterFilterFactory for your phone number field.
This will allow you to index phone data in various formats and make it searchable as well.
You would not need to make any changes to the database.

Difference between "{single_word}" and {single_word} searches on google docs

For some queries with the documentlist API (and also within the UI) I get different results for these two queries:
1. "single_word"
2. single_word
For example for this:
"mody" - I receive 69 results
mody - I receive >200 results (many of them don't contain this word)
(This also happens for combinations of words that contain this word. For example:
**"mody" company** and **mody company** return different results)
What is the difference between these searches? And what is the recommended way to search for the best exact-match results? I don't want to receive results that contain only words like mod, for example.
"mody" matches tokens exactly with only the characters m-o-d-y, and whitespace on either side. It only matches mody.
mody matches tokens containing the characters m-o-d-y. It would match mymody, mody, asdfmody, modyasdf, etc.

How to identify a country from a normalized phone number?

I have a list of international phone numbers and a List of Country calling codes.
I would like to identify the Country from the numbers but I can't find a fast and elegant way to do it.
Any ideas? The only one I've got is a hardcoded check (e.g. "look at the first number, look at the second number: if it's X then check the third number. If the second number is Y then the Country is Foo", etc.).
I'm using PHP and a DB (MySQL) for the lists, but I think that any pseudocode will help.
Alternatively, you could use a tool like Twilio Lookup.
The CountryCode property is always returned when you make an API request with Lookup.
https://www.twilio.com/docs/api/lookups#lookups-instance-properties
[Disclosure: I work for Twilio]
I was after something similar to this, but I also wanted to determine the region/state, if available. In the end I hacked up something based on a tree of the leading digits (spurred on by the description at Wikipedia).
My implementation is available as a gist.
I'm currently using an implementation of Google's libphonenumber in Node, which works fairly well. I suppose you could try a PHP implementation, e.g. libphonenumber-for-php.
The hard-coded check can be turned into a decision tree generated automatically from the list of calling codes. Each node of the tree defines the 'current' character, the list of possible following characters (tree nodes) or a country in case it's a terminal node. The root node will be for the leading '+' sign.
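A minimal sketch of such a generated tree, using nested dictionaries in Python. The function names build_trie and find_country and the three sample calling codes are illustrative assumptions, not a complete list; the "country" key marks a terminal node.

```python
def build_trie(calling_codes):
    """Build a nested-dict tree: one level per character of the calling code."""
    root = {}
    for code, country in calling_codes.items():
        node = root
        for ch in code:
            node = node.setdefault(ch, {})
        node["country"] = country  # terminal node
    return root


def find_country(trie, number):
    """Walk the tree along the number; stop at the first terminal node."""
    node = trie
    for ch in number:
        node = node.get(ch)
        if node is None:
            return None  # no calling code starts with these characters
        if "country" in node:
            return node["country"]
    return None


# Sample data only; a real table would come from the full calling-code list.
codes = {"+1": "USA or Canada", "+234": "Nigeria", "+34": "Spain"}
trie = build_trie(codes)
for number in ["+12345", "+23456", "+34567", "+45678"]:
    print(number, "->", find_country(trie, number))
```

Returning at the first terminal node relies on no calling code being a prefix of another; if the input list violated that, the walk would have to continue and remember the longest match instead.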
The challenge here is that some countries share the same phone country code. E.g. both Canada and the US have phone numbers starting with +1.
I'm using https://github.com/giggsey/libphonenumber-for-php as follows:
/**
 * Get country
 * @param string $phone
 * @param string $defaultCountry
 * @return string Country code, e.g. 'CA', 'US', 'DE', ...
 */
public static function getCountry($phone, $defaultCountry) {
    try {
        $PhoneNumberUtil = \libphonenumber\PhoneNumberUtil::getInstance();
        $PhoneNumber = $PhoneNumberUtil->parse($phone, $defaultCountry);
        $country = $PhoneNumberUtil->getRegionCodeForNumber($PhoneNumber);
        return $country;
    } catch (\libphonenumber\NumberParseException $e) {
    }
    return $defaultCountry;
}
You can easily do a simple lookup starting with the first number, then the second, and so on until you find it. This will work correctly because no calling code is a prefix of another code, i.e. the international calling codes form a "prefix code" (the phone system relies on this property).
I'm not any good at PHP, so here is a simple Python implementation; hopefully it is easy to follow:
>>> phone_numbers = ["+12345", "+23456", "+34567", "+45678"]
>>> country_codes = { "+1": "USA", "+234": "Nigeria", "+34" : "Spain" }
>>> for number in phone_numbers:
...     for i in [2, 3, 4]:
...         if number[:i] in country_codes:
...             print(country_codes[number[:i]])
...             break
...     else:
...         print("Unknown")
...
USA
Nigeria
Spain
Unknown
Essentially you have an associative array between prefixes and countries (which I assume you can easily generate from that Wikipedia article). You try looking up the first digit of the phone number in the associative array. If it's not in the array, you try the first two digits, then the first three. If there is no match after three digits, then the number doesn't start with a valid international calling code.

Resources