Thinking sphinx word match - ruby-on-rails

The data I have is:
kiran#test.com - first record
kiran1#test.com - second record
I need to search using the email address. I have forums and users indexed in my web app.
First scenario
I kept the '#' symbol in the charset table. This works for exact matches: searching for 'kiran#test.com' returns exactly the first record. The problem is that if I search for only 'test', no results are found.
Second scenario
If I leave the '#' symbol out of the charset table, searching for 'kiran#test.com' returns both records, and searching for 'test' also returns both records.
Expected Scenario
If I use the entire email 'kiran#test.com', I need to get only the first record.
If I use only 'test', I need to get both records.
In plain MySQL, something like "select * from users where email like '%search-key%'".
I use the following code for searching (I don't want '#' to be treated as a separator):
ThinkingSphinx.search params[:search_key], :star => Regexp.new('\w+#*\w+', nil, 'u')
Please suggest any options I can pass to achieve the expected result.
Thanks
Kiran

Take a look at blended character support (blend_chars):
http://sphinxsearch.com/docs/current.html#conf-blend-chars
Or, if you really want [email like '%search-key%'] style matching, consider min_infix_len (leaving '.' and '#' in the charset table).

To search for a full email address you could use the phrase search operator:
http://sphinxsearch.com/docs/current.html#extended-syntax
So, if you can determine that the search query is an email address, use phrase search; otherwise use a general search.
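The suggestion above can be sketched as a small query-preparation step. This is only an illustration in Python, not Thinking Sphinx code: `EMAIL_LIKE` and `build_query` are hypothetical names, the pattern is a deliberately loose guess at what "looks like an email" means (it accepts the '#' separator from the sample data as well as '@'), and it assumes the extended query syntax is enabled so that double quotes are read as the phrase operator.

```python
import re

# Hypothetical, deliberately loose pattern: local part, a '#' or '@' separator,
# then a dotted domain. Tighten it for real data.
EMAIL_LIKE = re.compile(r"^[\w.+-]+[#@][\w-]+(\.[\w-]+)+$")


def build_query(search_key: str) -> str:
    """Return the string to hand to the search engine.

    Email-like input is wrapped in double quotes (phrase search in Sphinx
    extended syntax); anything else is passed through as a general search.
    """
    key = search_key.strip()
    if EMAIL_LIKE.match(key):
        return '"' + key + '"'
    return key


if __name__ == "__main__":
    for key in ("kiran#test.com", "test"):
        print(repr(key), "->", build_query(key))
```

The returned string would then be passed to the search call in place of the raw `params[:search_key]`.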

Related

How to quote colons in graph query

I have some code which gets details of lists in a SharePoint site then later wants to find out if a list with the same name still exists. This works fine except for list names that contain a colon - I find Graph misinterprets the colon and 'corrupts' the URL.
For instance, in Graph Explorer when I give it the following query:
https://graph.microsoft.com/v1.0/sites('mysite.sharepoint.com,aa-aa-aa,bb-bb-bb')/lists('19:abcdef#thread.tacv2_wiki')
The error response contains the following in the 'message' property:
The expression \"sites('mysite.sharepoint.com,aa-aa-aa,bb-bb-bb')/lists('19')/abcdef#thread.tacv2_wiki\" is not valid.
Note that it's split the original URL, thinking the colon is the start of a new segment in the path, even though it's inside a quote.
I've tried all sorts of quoting of the colon (%3A and %253A and %25253A) and different styles of quote characters, but they all either return the same error or give a parsing error.
More information:
- I specifically want to search by name, not by the original id (which would be much easier).
- I'm actually using the Graph Managed API in code, but it generates the same error (you'd think it would know internally how to quote).
- The list is a hidden one created in a Teams site to manage channel information.
I was also able to reproduce your issue. As a workaround, you can use the $filter query parameter to get the list by its display name, using the query below.
https://graph.microsoft.com/v1.0/sites/soaadteam.sharepoint.com,c1178396-d845-46fa-bc0c-453d2951dad5,19ee9a1e-001d-48f1-9ee8-b0adfde54e45/lists?$filter=displayName eq '19:abcdef#thread.tacv2_wiki'
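If you build that URL in code, the filter value should be percent-encoded, otherwise the '#' in the name starts a URL fragment and the rest of the name is never sent. A minimal Python sketch, assuming the usual OData rule of doubling single quotes inside a string literal; `list_by_name_url` is a hypothetical helper and the site id is the placeholder from the question:

```python
from urllib.parse import quote

GRAPH_ROOT = "https://graph.microsoft.com/v1.0"


def list_by_name_url(site_id: str, display_name: str) -> str:
    """Build a /lists?$filter=displayName eq '...' URL for the given site."""
    # OData string literals escape a single quote by doubling it.
    escaped = display_name.replace("'", "''")
    filter_expr = f"displayName eq '{escaped}'"
    # Percent-encode everything in the filter, including ':', '#', and spaces.
    return f"{GRAPH_ROOT}/sites/{site_id}/lists?$filter={quote(filter_expr, safe='')}"


if __name__ == "__main__":
    print(list_by_name_url("mysite.sharepoint.com,aa-aa-aa,bb-bb-bb",
                           "19:abcdef#thread.tacv2_wiki"))
```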

Advanced site search with google

I am trying to find out which URLs exist in the Google index for a specific domain and a specific path. The URLs have the following schema:
https://example.org/path1/<keyword>/path2/
The following Google search works fine:
site:https://example.org/path1/*/path2/
but it delivers more than 40,000 results. So I tried to search for
https://example.org/path1/a*/path2/
but no results were found (which can't be right). What's wrong? Is there any way to get only results where the site URL contains keywords starting with an "a"?
Thank you,
Jan
You can try the following
https://example.org/path1/*a
This will search for all URLs which start with https://example.org/path1/ and also contain the keyword a.
You can refine your search by specifying multiple keywords:
https://example.org/path1/*a*/path2/
This will search for the same as the first example, but the results will contain the /path2/ part of the URL as well. However, it will match URLs where the keyword a appears either before or after the second path, /path2/.

How to retrieve hashtagged tweets from a list of users

Is there a way to retrieve all the tweets from a list of profiles (3) which are tagged with a certain #hashtag in a single call to the Twitter API, using version 1.1?
If not, obviously, I'd be retrieving a hundred tweets from each user, and filtering out those which do not have the #hashtag .. but it's not very efficient, right?
Thanks in advance
Note: I've updated the library so I suggest you grab the newly updated version before trying this - I've made it so you don't need to manually encode each individual character.
This page shows you how to use search, and has a number of operators down toward the bottom of the page. One of these is the OR operator:
Getting tweets for multiple users
OR - Either this or that
love OR hate - containing either "love" or "hate" (or both)
From - From this user
from:twitterapi - sent from the user #twitterapi
So, armed with the above knowledge, I'm guessing your query would look like this:
from:user1 OR from:user2
Translated into a GET request:
?q=from:user1+OR+from:user2
Getting tweets for specific hashtags
So that's how you get tweets for multiple users. The next step you want is for certain hashtags.
#haiku - containing the hashtag "haiku"
That translated individually into the correct format becomes:
?q=#haiku (or ?q=%23haiku, depending on the library / URL encoding you're using)
Combining the above
The standard AND operator looks like this:
twitter search - containing both "twitter" and "search". Default operator
For a get request:
?q=twitter+search
So let's combine all this!
?q=#hashtag1+#hashtag2+from:user1+OR+from:user2
And, because you're using my lib, here's the code for that:
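// Search endpoint of the Twitter REST API v1.1
$url = 'https://api.twitter.com/1.1/search/tweets.json';
$requestMethod = 'GET';
// Search operators combined into a single q parameter
$getfield = '?q=#hashtag1+OR+#hashtag2+from:username1+OR+from:username2';
// $settings is the credentials array passed to the library's constructor
$twitter = new TwitterAPIExchange($settings);
$json = $twitter->setGetfield($getfield)
    ->buildOauth($url, $requestMethod)
    ->performRequest();
var_dump(json_decode($json));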
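If you are not using that library and have to encode the query yourself, the same getfield string can be produced with a standard URL encoder. A rough Python sketch, assuming you want the same operator layout as the $getfield string above; `build_getfield` is a hypothetical helper, and it only builds the query string (no OAuth signing, no request):

```python
from urllib.parse import urlencode


def build_getfield(hashtags, users):
    """Build '?q=...' with hashtags OR-ed together, followed by OR-ed from: filters."""
    # Accept names with or without the leading '#' / '@'.
    tag_part = " OR ".join("#" + tag.lstrip("#") for tag in hashtags)
    user_part = " OR ".join("from:" + user.lstrip("@") for user in users)
    # urlencode turns spaces into '+', '#' into %23 and ':' into %3A.
    return "?" + urlencode({"q": f"{tag_part} {user_part}"})


if __name__ == "__main__":
    print(build_getfield(["hashtag1", "hashtag2"], ["username1", "username2"]))
```

Note that '#' comes out as %23 here, which matches the encoded form mentioned for the single-hashtag case.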

IMAP search on header not working as expected

I am building a library for IMAP. My search command works fine for the Inbox folder: it returns a number which I can use to fetch the mail. However, when I try to search Sent Items it does not work. It does not give an error, it just returns Search OK without any numbers. Can you please point out the reason for this behavior? I am connecting to Exchange 2010.
My search command is something like:
search all HEADER Message-ID "<cc6aed80-955b-4800-a3ac-6c3942ceecac>"
This is exactly how it is described in http://support.microsoft.com/kb/302965
Possibly of no use, but I ran into possibly the same problem.
In a mailbox with an email from "Bill Gates", a search with the expression '(FROM "billy#microsoft.com")' returned nothing; a search for '(FROM gates)' returned a hit.
I had to change my code to '(HEADER FROM "billy#microsoft.com")' to get it to work.
ALTERNATIVELY:
You may be able to use IMAP4.uid(command, arg[, ...])
See http://docs.python.org/2/library/imaplib.html#imaplib.IMAP4.uid
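For what it's worth, here is roughly how the HEADER form and the uid() call fit together with Python's imaplib. This is a sketch only: `header_criteria` and `find_uids` are hypothetical helpers, the mailbox name '"Sent Items"' is an assumption about how the server names the folder, and whether Exchange 2010 then returns matches is not something this code can guarantee.

```python
import imaplib


def header_criteria(field: str, value: str) -> str:
    """Build an IMAP search key of the form (HEADER <field> "<value>")."""
    # Inside an IMAP quoted string, backslash and double quote must be escaped.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'(HEADER {field} "{escaped}")'


def find_uids(conn: imaplib.IMAP4, mailbox: str, field: str, value: str) -> list:
    """Select a mailbox read-only and return the UIDs matching a header search."""
    status, _ = conn.select(mailbox, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot select {mailbox}")
    status, data = conn.uid("SEARCH", None, header_criteria(field, value))
    if status != "OK":
        raise RuntimeError("UID SEARCH failed")
    # data[0] is a space-separated byte string of UIDs, empty when nothing matched.
    return data[0].split() if data and data[0] else []


# Example call (needs a logged-in connection, so it is not executed here):
# uids = find_uids(conn, '"Sent Items"', "Message-ID",
#                  "<cc6aed80-955b-4800-a3ac-6c3942ceecac>")
```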

Can't parse new google urls - HTTP_REFERER doesn't contain parameters anymore

It seems a little odd to me, but although everybody knows about the new Google search URLs (see "Google using # instead of search? in URL. Why?"), no one seems to have a problem with the HTTP_REFERER.
I'm using the referrer to parse the search query (&q=) out of the Google URL, but as this is now all in the hash fragment, it won't be sent to the server, and all I get is "http://www.google.de/".
So do you know a way of getting the query the user searched for before landing on my site?
Due to late-2011 Google security changes, this is no longer possible when the search was performed by a signed-in Google user. See:
http://googleblog.blogspot.com/2011/10/making-search-more-secure.html
http://analytics.blogspot.com/2011/10/making-search-more-secure-accessing.html
Since there are multiple q's in the query string you have to match the "q" parameter globally and take the last one:
/[?|&|#]q=([^&|^#]+)/ig
Get rid of "site:" searches (there are others, but I haven't done them)
/[\+|?|&]?site:([^&|^#])+/g, '');
Then parse the results.
/[\w^'\(\)\{\}]+|"[^"]+"/g
This has been working well for me.
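The same "take the last q" idea can be written without regular expressions by letting a URL parser split the string. A Python sketch, assuming you do have the full URL including the part after '#' (for example, captured client-side); `search_terms_from_url` is a hypothetical name, and the "site:" clean-up step is left out:

```python
from urllib.parse import parse_qs, urlsplit


def search_terms_from_url(url: str):
    """Return the last q= value found in the query string or fragment, or None."""
    parts = urlsplit(url)
    # Query-string values come first, fragment values last, so the fragment
    # (the newer URL style) wins when both carry a q parameter.
    values = parse_qs(parts.query).get("q", []) + parse_qs(parts.fragment).get("q", [])
    return values[-1] if values else None


if __name__ == "__main__":
    print(search_terms_from_url("http://www.google.de/search?q=old&hl=de#q=new+query"))
    print(search_terms_from_url("http://www.google.de/"))
```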
