PIG field_name MATCHES string

PIG field_name MATCHES string - parsing

Say I have a filter such as field_1 MATCHES 'a_string'.
I would like to select every entry whose field_1 CONTAINS (if such an operator exists) 'string', so that an entry with field_1 equal to 'a_string' is included as well.
For example:
Entry  Field_1
1      'a_string'
2      'string'
3      'strang'
Entries 1 and 2 should be selected.
What is the most elegant way to do this?
The actual strings (chararray) I am dealing with are URLs with different levels of depth, i.e. www.abc.com/depth1/depth2/...
I was planning to parse the chararray using '/' as a delimiter, but that is just too ugly: I would need to change the number of columns whenever a deeper level appeared.
Your assistance is deeply appreciated!

I'm not sure I understand your question correctly, but I think you can use the FILTER operation with matches for this. The second parameter is a regular expression, so to keep every value that contains 'string':
X = FILTER A BY (field_1 matches '.*string.*');
See the Docs for more detail.
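Pig's matches operator follows Java regular expression semantics, where the pattern has to match the entire value; that is why the substring is wrapped in .* on both sides. A minimal Python sketch of the same behavior, using re.fullmatch as a stand-in for Pig (an assumption made for illustration; pig_like_matches is a hypothetical helper, not a Pig function):

```python
import re

# Sample values taken from the question.
values = ["a_string", "string", "strang"]


def pig_like_matches(value: str, pattern: str) -> bool:
    """Hypothetical helper: True only if the pattern matches the whole value."""
    return re.fullmatch(pattern, value) is not None


# Without wildcards only the exact value is kept.
exact = [v for v in values if pig_like_matches(v, "string")]

# Wrapping the substring in .* keeps every value that contains it.
containing = [v for v in values if pig_like_matches(v, ".*string.*")]

print(exact)
print(containing)
```

With the three sample values, the wildcard version keeps 'a_string' and 'string' and drops 'strang'.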

Related

Rails, Postgres 12, Query where pattern matches regex, and contains substring

I have a field in the database which contains strings that look like: 58XBF2022L1001390. I need to be able to query for results which match the last letter (in this case 'L') and match or resemble the last four digits.
The regular expression I've been using to find records which match the structure is: \d{2}[A-Z]{3}\d{4}[A-Z]\d{7}. So far I've tried using a scope to refine the results, but I'm not getting any results. Here's my scope:
def self.filter_by_shortcode(input)
  q = input
  starting = q.slice!(0)
  ending = q
  where("field ~* ?", "\d{2}[A-Z]{3}\d{4}/[#{starting}]/\d{3}[#{ending}]\g")
end
Here are some more example strings, and the substring that we would be looking for. Not every string stored in this database field matches this format, so we would need to be able to first match the string using the regex provided, then search by substring.
String             Substring
36GOD8837G6154231  G4231
13WLF8997V2119371  V9371
78FCY5027V4561374  V1374
06RNW7194P2075353  P5353
57RQN0368Y9090704  Y0704
edit: added some more examples as well as substrings that we would need to search by.

I do not know Rails, but the SQL for what you want is relatively simple. Since your string has a fixed format, once that format is validated, simple concatenation of substrings gives your desired result.
with base(target, goal) as
( values ('36GOD8837G6154231', 'G4231')
, ('13WLF8997V2119371', 'V9371')
, ('78FCY5027V4561374', 'V1374')
, ('06RNW7194P2075353', 'P5353')
, ('57RQN0368Y9090704', 'Y0704')
)
select substr(target,10,1) || substr(target,14,4) target, goal
from base
where target ~ '^\d{2}[A-Z]{3}\d{4}[A-Z]\d{7}$';
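The same validate-then-slice idea can be sketched outside the database. This is only an illustration in Python, assuming the fixed 17-character layout from the question; SHORTCODE_FORMAT and shortcode are hypothetical names, not part of any Rails or Postgres API:

```python
import re

# Two digits, three letters, four digits, one letter, seven digits.
SHORTCODE_FORMAT = re.compile(r"[0-9]{2}[A-Z]{3}[0-9]{4}[A-Z][0-9]{7}")


def shortcode(value):
    """Return the letter plus the last four digits, or None if the format does not match."""
    if SHORTCODE_FORMAT.fullmatch(value) is None:
        return None
    # Index 9 is the 10th character (the letter); 13..16 are the last four digits.
    return value[9] + value[13:17]


examples = {
    "36GOD8837G6154231": "G4231",
    "13WLF8997V2119371": "V9371",
    "78FCY5027V4561374": "V1374",
    "06RNW7194P2075353": "P5353",
    "57RQN0368Y9090704": "Y0704",
}

for target, goal in examples.items():
    print(target, shortcode(target), goal)
```

Strings that do not fit the format return None, which mirrors the rows that the WHERE clause filters out.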

GSheets - How to query a partial string

I am currently using this formula to get all the data for everyone whose first name is "Peter", but my problem is that if someone is called "Simon Peter", their data also shows up in the formula output.
=QUERY('Data'!1:1000,"select * where B contains 'Peter'")
I know that for other formulas this issue is resolved by adding an * to the string, but the same logic does not apply to the QUERY formula.
Does anyone know the correct syntax or a workaround?

How about classic SQL syntax
=QUERY('Data'!1:1000,"select * where B like 'Peter %'")
The LIKE keyword allows use of wildcard % to represent characters relative to the known parts of the searched string.

See the query reference: developers.google.com/chart/interactive/docs/querylanguage. You could split first name and last name into separate columns, then search only for first names exactly equal to 'Peter'. You may also want to handle differences in case (e.g., where lower(B) contains 'peter') and whitespace in unexpected places (e.g., trim()). You could also search only for values that start with Peter by using starts with instead of contains, or use a regular expression with matches. – Brian D
It seems that for my case using 'starts with' is a perfect fit. Thank you!
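The difference between the two operators can be sketched with plain string checks. The snippet below uses Python's in and str.startswith as stand-ins for QUERY's contains and starts with; the sample names are made up for illustration:

```python
# Illustrative names only; they are not taken from the asker's sheet.
names = ["Peter", "Peter Smith", "Simon Peter", "peter jones "]

# Like "contains": matches the text anywhere in the cell.
contains = [n for n in names if "Peter" in n]

# Like "starts with": matches only at the beginning of the cell.
starts_with = [n for n in names if n.startswith("Peter")]

# Trimming and lowercasing first also catches inconsistent case and stray spaces.
normalized = [n for n in names if n.strip().lower().startswith("peter")]

print(contains)
print(starts_with)
print(normalized)
```

Note that a prefix check would also accept a value such as "Peterson", so splitting first and last names into separate columns remains the stricter option.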

Denodo: How to aggregate varchar data types?

I'm creating an aggregate from an anstime column in a view table in Denodo, using a cast to convert it to float. It works only for numbers with a period (example: 123.123) and does not work for numbers without a period (example: 123). Here's my code, which only works for numbers with a period:
SELECT row_date,
case
when sum(cast(anstime as float)) is null or sum(cast(anstime as float)) = 0
then 0
else sum(cast(anstime as float))
end as xans
FROM table where anstime like '%.%'
group by row_date
Can someone please help me handle the values without a period?

My guess is that you've got values in anstime which are not numeric, which is why removing the where anstime like '%.%' predicate causes a failure, as has been mentioned in other comments.
You could try adding an intermediate view before this one which strips out any non-numeric values (leaving the decimal point character, of course). This might then allow you to drop the where anstime like '%.%' filter.
Perhaps the REGEXP function would help there.
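The idea of filtering out non-numeric text before casting can be sketched as follows. This is a Python illustration rather than Denodo VQL, and it skips non-numeric values instead of stripping characters from them; NUMERIC and sum_numeric are hypothetical names:

```python
import re

# Digits with an optional fractional part, e.g. "123" or "123.123".
NUMERIC = re.compile(r"[0-9]+(\.[0-9]+)?")


def sum_numeric(values):
    """Sum only the values that look like plain decimal numbers."""
    total = 0.0
    for raw in values:
        if raw is None:
            continue  # treat NULL as missing
        text = raw.strip()
        if NUMERIC.fullmatch(text):
            total += float(text)
    return total


print(sum_numeric(["123.5", "123", "n/a", "", None]))
```

Both "123.5" and "123" are included in the sum, while the non-numeric entries are ignored rather than causing the cast to fail.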

Your where anstime like '%.%' clause is going to restrict possible responses to places where anstime has a period in it. Remove that if you want to allow all values.

I appreciate those who responded to my concern. In the end we had to reach out to our developers to fix the data type of the column from varchar to float rather than doing a workaround.

Solr and Rails: [* TO *] value instead of nil (asterisk TO asterisk)

Inside my model, in the searchable block, I index a time field, added_at.
In the search block I added with(:added_at, nil), reindexed, and now the search object contains:
<Sunspot::Search:{:fq=>["-added_at_d:[* TO *]"]...}>
What is the meaning of this [* TO *]? Did something go wrong?

By adding with(:added_at, nil) you narrow the search results down to documents that have no value in the field added_at, so we might expect the corresponding filter query to be defined as:
fq=>["added_at_d:null"] # not valid
The problem is that the Solr Standard Query Parser does not support searching a field for an empty/null value. In this situation the filter needs to be negated (excluding documents that have any value in the field) so that the query remains valid.
The - operator can be used to exclude matches, and the wildcard character * can be used to match any value, so we can expect the filter query to look like:
fq=>["-added_at_d:*"]
However, although the above is valid for the query parser, a range query should be preferred to prevent inconsistent behavior when using wildcards within negative subqueries.
Range Queries allow one to match documents whose field(s) values are
between the lower and upper bound specified by the Range Query. Range
Queries can be inclusive or exclusive of the upper and lower bounds.
A * may be used for either or both endpoints to specify an open-ended range query.
In the end, there is nothing wrong with this filter, which ends up looking like:
fq=>["-added_at_d:[* TO *]"]
cf. Lucene Range Queries, Solr Standard Query Parser
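The mapping from a nil value to a negated open-ended range can be sketched as a small function. This is a hypothetical helper written in Python for illustration; it is not Sunspot's actual implementation, and the field name title_s in the usage lines is an assumption:

```python
def build_filter(field, value):
    """Hypothetical helper: build a Solr filter query string for one field."""
    if value is None:
        # No value wanted: exclude every document that has any value in the field.
        return f"-{field}:[* TO *]"
    # Otherwise match the value as a quoted phrase, escaping backslashes and quotes.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


print(build_filter("added_at_d", None))
print(build_filter("title_s", "abc"))
```

The first call produces the same filter string that appears in the search object above.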

Postgresql text searching, matching multiple words

I don't know the name for this kind of search, but I see that it's getting pretty common.
Let's say I have records with the following file names:
'order_spec.rb', 'order.sass', 'orders_controller_spec.rb'
If I search with the string 'oc', I would like the result to be 'orders_controller_spec.rb', because it matches the o in orders and the c in controller.
If the string is 'os', then I'd like all 3 to match: 'order_spec.rb', 'order.sass', 'orders_controller_spec.rb'.
If the string is 'oco', then I'd like 'orders_controller_spec.rb'.
What is the name for this kind of search and how would I go about getting this done in Postgresql?

This is called a subsequence search. One simple way to do it in Postgres is to use the LIKE operator (or several of the other options in those docs) and fill the gaps between your letters with a wildcard, which for LIKE is %. To match anything with an o followed by an s in the words column, that would look like this:
SELECT * FROM table WHERE words LIKE '%o%s%';
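This is a relatively expensive search. A varchar_pattern_ops or text_pattern_ops index supports faster pattern matching, but only for patterns anchored at the start of the string (such as 'o%s%'); a pattern that begins with % still requires scanning every row.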
CREATE INDEX pattern_index ON table (words varchar_pattern_ops);
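Building the LIKE pattern from the search string is mechanical, so it can be generated. Below is a minimal Python sketch, assuming the default backslash escape character for LIKE; to_like_pattern and is_subsequence are hypothetical helper names, and is_subsequence only mimics the match locally to show which file names qualify:

```python
def to_like_pattern(query):
    """Put % between, before and after the characters; escape LIKE metacharacters."""
    escaped = [("\\" + ch) if ch in "%_\\" else ch for ch in query]
    return "%" + "%".join(escaped) + "%"


def is_subsequence(query, text):
    """True if the characters of query appear in text in the same order."""
    remaining = iter(text)
    # Each membership test consumes the iterator up to the matching character.
    return all(ch in remaining for ch in query)


files = ["order_spec.rb", "order.sass", "orders_controller_spec.rb"]

for query in ("os", "oc", "oco"):
    hits = [f for f in files if is_subsequence(query, f)]
    print(query, to_like_pattern(query), hits)
```

One caveat this makes visible: a plain subsequence match for 'oc' also accepts 'order_spec.rb', because of the c in spec, so restricting matches to the first letters of words would need a stricter pattern than '%o%c%'.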
