XSLT: Test if string is contained in a sequence - xslt-2.0

I have two sequences of strings. I want either to get a sequence reduced by the items that are also in sequence 2, or to compare the two sequences and find out whether at least one item of sequence 1 is also in sequence 2.
A simple comparison ($seq1 = $seq2) works for me only with a sequence of numbers, or am I doing something wrong?
Glad about any help! :)

The = operator should suffice, see example http://xsltransform.net/gWmuiJ6 which does
<xsl:variable name="seq1" select="'foo', 'bar', 'foobar'"/>
<xsl:variable name="seq2" select="'a', 'foo', 'b'"/>
<xsl:variable name="seq3" select="'a', 'b', 'c'"/>
<xsl:value-of select="$seq1 = $seq2, $seq1 = $seq3"/>
and outputs true false.
If you want some value based intersection then see also http://www.xsltfunctions.com/xsl/functx_value-intersect.html.

Related

Find sum of all elements using ruby and selenium

In my web page there are 5 values given in the text field (like $10, $20, $30, $40 and $50), and I am trying to sum the values using Ruby and Selenium WebDriver.
Here is my code:
def get_sum_of_all_elements()
  @logger.info("Searching element #{value1, value2, value3, value4, value5}");
  allelements = @driver.find_elements(:id = "lbl_val_")
  @logger.info("Total Elements Found with locator #{locator} are : #{allelements.size}");
  if allelements.start_with?("$")
    allelements = "((allelements))".tr('$', '') #removing '$' sign from values
    iSum =0
    allelements.each do|i|
      iSum += i
    end
  end
end
I am expecting to see output as 150. Do I need to store values in an array?
Any help would be appreciated.
There are a couple of things you should modify in your code to make it work:
Fix how arguments are passed to find_elements; it should be id: "lbl_val_".
find_elements returns an array of WebDriver::Element objects, so you must check the value of each object.
The string "Searching element #{value1, value2, value3, value4, value5}" is not valid, since you are trying to interpolate five variables chained with commas. You either need to interpolate each variable separately (keeping the commas as part of the string) or use square brackets ([]) to interpolate an array.
Now your code should look something like this [1]:
def get_sum_of_all_elements
  @logger.info("Searching element #{[value1, value2, value3, value4, value5]}")
  allelements = @driver.find_elements(id: "lbl_val_")
  @logger.info("Total Elements Found with locator #{locator} are : #{allelements.size}")
  if allelements.all? { |elem| elem.value.start_with?("$") }
    elements = allelements.map { |elem| elem.value.tr('$', '').to_i }
    elements.reduce(:+)
  end
end
A few things to note:
Parentheses (()) were removed from the method definition; Ruby doesn't need them when no arguments are passed.
There is no longer any need to assign the final value to a variable (e.g. iSum), since Ruby returns the result of the last evaluated expression.
If any value doesn't start with "$", the method will return nil, because the if block is skipped. You could change this by adding a default value after the if block.
Semicolons (;) were removed; you don't need them in Ruby (unless you want to chain multiple statements on a single line).
One more thing: the variables value1, value2, value3, value4, value5 and locator don't seem to be set anywhere in your method; you must set them within your method (or pass them as arguments) or you will get an error.
[1] This follows the same logic that you seemed to be looking for in your code, that is, sum all values only if all of them start with "$".
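For what it's worth, here is a minimal sketch of that all-or-nothing logic with a default value, using plain strings instead of WebDriver elements so it can be run on its own. The method name sum_if_all_dollar and its default parameter are made up for illustration; they are not part of the code above.

```ruby
# Hypothetical helper: sums "$"-prefixed strings, or returns a default
# when at least one value does not start with "$".
def sum_if_all_dollar(values, default = 0)
  # Guard clause: bail out with the default instead of falling through to nil.
  return default unless values.all? { |v| v.start_with?("$") }

  # Drop the "$" sign, convert to integers, and add everything up.
  values.map { |v| v.tr('$', '').to_i }.reduce(0, :+)
end

all_dollars = sum_if_all_dollar(["$10", "$20", "$30", "$40", "$50"])
mixed       = sum_if_all_dollar(["$10", "20"])
puts all_dollars
puts mixed
```

Passing 0 as the initial value to reduce also keeps an empty array from producing nil.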
It's hard to say exactly what you are trying to do, but this might help. I assume you have an array of string values with dollar signs:
>> allelements = ["$10", "$20", "$30", "$40", "$50"]
=> ["$10", "$20", "$30", "$40", "$50"]
We can make a new array stripping out all non-numeric characters and transforming the string values to integers:
>> integers = allelements.map { |e| e.gsub(/[^\d]/, '').to_i }
=> [10, 20, 30, 40, 50]
Now use inject to sum the values:
>> integers.inject(:+)
=> 150

XSLT 2.0: select some @class names but not others

I have documents with indent names in a class, such as: <div class="if"> and <div class="i1 note">, and so on. (the indent classes are: if, i1, i2, i3, i4, i5, i6).
The non-indent CSS selectors vary widely. But the indent CSS selector names are consistent.
Objective: I'd like to store the non-indent classes in a variable for future use.
Here's the direction I was going in, without success:
<xsl:variable name="indent-names"
select="'if i1 i2 i3 i4 i5 i6'"/>
<xsl:variable name="non-indent-classes"
select="replace(@class,$indent-names,'')"/>
Suggestions? I'm thinking a <xsl:analyze-string select="@class" regex=" .... "> is the way to go. No success yet.
UPDATE
Based on Martin's answer, I did this:
<xsl:variable name="indent-names"
select="'if i1 i2 i3 i4 i5 i6'"/>
<xsl:variable name="non-indent-classes"
select="tokenize(@class, ' ')[not(. = tokenize($indent-names, ' '))]"/>
Works great. I must become more familiar with tokenize().
I would use tokenize(@class, ' ')[not(. = tokenize($indent-names, ' '))]. Obviously it is more efficient to do the inner tokenize once and store it in a variable <xsl:variable name="names" select="tokenize($indent-names, ' ')"/>, then use that variable: tokenize(@class, ' ')[not(. = $names)].

What does [ ... || ... <- ...] do in this snippet of code?

I need your help again, I am trying to understand this piece of erlang code.
Line="This is cool".
Lines = [Line || _Count <- lists:seq(1,5)].
output is
["This is cool","This is cool","This is cool","This is cool","This is cool"]
I don't understand the logic behind it printing the required number of times. What does Line || _***** mean?
Since the value of Line is not changed in the right hand side of the list comprehension, the value of each element is the same, the value of Line.
The right side of the list comprehension is just determining the number of elements.
Look at this piece of code:
Line = "This is cool".
Lines = [{Line, Count} || Count <- lists:seq(1, 5)].
Here you create a list of tuples of size 2 where first element is constant and the second is taken from the source list of list comprehension. And if you remove an element from the tuple it won't change list's structure.
It can be read like this: NewListe = [Dosomething || Element <- Liste]
Create NewListe this way: for each Element of Liste, build a new element with Dosomething.
Step by step: Liste = lists:seq(1,5) = [1,2,3,4,5];
for each Element, the value is simply discarded (which is why it is written as _Count), and
Dosomething just returns the value "This is cool",
so the result is a list containing "This is cool" 5 times:
["This is cool","This is cool","This is cool","This is cool","This is cool"]
<- is called a generator; after the sign || you may have generators or filters. For example if we imagine that you have a list of different elements and want to get only the printable list items, turned to upper case, you will need a generator:
X <- ["toto",5,"Hello",atom] to get each element
a filter:
io_lib:printable_list(X) to select only the printable lists
and a transformation:
string:to_upper(X) to turn to upper case
all together you have what is expected:
1> [string:to_upper(X) || X <- ["toto",5,"Hello",atom], io_lib:printable_list(X)].
["TOTO","HELLO"]
2>

Sorting an array in Ruby (Special Case)

I have an array in Ruby which has values as follows
xs = %w(2.0.0.1
2.0.0.6
2.0.1.10
2.0.1.5
2.0.0.8)
and so on. I want to sort the array such that the final result should be something like this :
ys = %w(2.0.0.1
2.0.0.6
2.0.0.8
2.0.1.5
2.0.1.10)
I have tried using the array.sort function, but it places "2.0.1.10" before "2.0.1.5". I am not sure why that happens
Using a Schwartzian transform (Enumerable#sort_by), and taking advantage of the lexicographical order defined by an array of integers (Array#<=>):
sorted_ips = ips.sort_by { |ip| ip.split(".").map(&:to_i) }
Can you please explain a bit more elaborately?
You cannot compare strings containing numbers: "2" > "1", yes, but "11" < "2" because strings are compared lexicographically, like words in a dictionary. Therefore, you must convert the ip into something that can be compared (an array of integers): ip.split(".").map(&:to_i). For example "1.2.10.3" is converted to [1, 2, 10, 3]. Let's call this transformation f.
You could now use Enumerable#sort: ips.sort { |ip1, ip2| f(ip1) <=> f(ip2) }, but always check whether the higher abstraction Enumerable#sort_by can be used instead. In this case: ips.sort_by { |ip| f(ip) }. You can read it as "take the ips and sort them by the order defined by the f mapping".
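To make this concrete, here is a small sketch run against the array from the question. The method name f simply follows the wording above; it is an illustrative helper, not something Ruby provides.

```ruby
# f maps a dotted string to an array of integers,
# e.g. "1.2.10.3" becomes [1, 2, 10, 3].
def f(ip)
  ip.split(".").map(&:to_i)
end

xs = %w(2.0.0.1 2.0.0.6 2.0.1.10 2.0.1.5 2.0.0.8)

# Plain string sort: "2.0.1.10" lands before "2.0.1.5".
by_string = xs.sort

# sort_by compares the integer arrays element by element (Array#<=>).
by_number = xs.sort_by { |ip| f(ip) }

puts by_string.inspect
puts by_number.inspect
```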
Split your data into chunks by splitting on '.'. There is no standard function to do it as such so you need to write a custom sort to perform this.
And the behaviour you said about 2.0.1.10 before 2.0.1.5 is expected because it is taking the data as strings and doing ASCII comparisons, leading to the result that you see.
arr1 = "2.0.0.1".split('.')
arr2 = "2.0.0.6".split('.')
Compare both arr1 and arr2 element by element, for all the data in your input.
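A minimal sketch of that element-by-element comparison could look like the following. The method name compare_versions is hypothetical, and the parts are converted to integers first (an addition to the snippet above), since comparing "10" and "5" as strings would run into the same ordering problem.

```ruby
# Hypothetical comparator: returns -1, 0 or 1, like <=>.
def compare_versions(a, b)
  arr1 = a.split('.').map(&:to_i)
  arr2 = b.split('.').map(&:to_i)

  arr1.zip(arr2).each do |x, y|
    return 1 if y.nil?            # b ran out of parts first, so a is larger
    cmp = x <=> y
    return cmp unless cmp.zero?   # first differing part decides the order
  end

  # All shared parts are equal: the shorter one (if any) sorts first.
  arr1.length <=> arr2.length
end

xs = %w(2.0.0.1 2.0.0.6 2.0.1.10 2.0.1.5 2.0.0.8)
sorted = xs.sort { |a, b| compare_versions(a, b) }
puts sorted.inspect
```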

Walking over strings to guess a name from an email based on dictionary of names?

Let's say I have a dictionary of names (a huge CSV file). I want to guess a name from an email that has no obvious parsable points (., -, _). I want to do something like this:
dict = ["sam", "joe", "john", "parker", "jane", "smith", "doe"]
word = "johnsmith"
x = 0
y = word.length-1
name_array = []
for i in x..y
  match_me = word[x..i]
  dict.each do |name|
    if match_me == name
      name_array << name
    end
  end
end
name_array
# => ["john"]
Not bad, but I want "John Smith" or ["john", "smith"]
In other words, I recursively loop through the word (i.e., unparsed email string, "johndoe@gmail.com") until I find a match within the dictionary. I know: this is incredibly inefficient. If there's a much easier way of doing this, I'm all ears!
If there's no better way of doing it, then show me how to fix the example above, for it suffers from two major flaws: (1) how do I set the length of the loop (see problem of finding "i" below), and (2) how do I increment "x" in the example above so that I can cycle through all possible character combinations given an arbitrary string?
Problem of finding the length of the loop, "i":
for an arbitrary word, how can we derive "i" given the pattern below?
for a (i = 1)
a
for ab (i = 3)
a
ab
b
for abc (i = 6)
a
ab
abc
b
bc
c
for abcd (i = 10)
a
ab
abc
abcd
b
bc
bcd
c
cd
d
for abcde (i = 15)
a
ab
abc
abcd
abcde
b
bc
bcd
bcde
c
cd
cde
d
de
e
r = /^(#{Regexp.union(dict)})(#{Regexp.union(dict)})$/
word.match(r)
=> #<MatchData "johnsmith" 1:"john" 2:"smith">
The regex might take some time to build, but it's blazing fast.
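Here is a self-contained sketch of the same idea wrapped in a method. The name split_two_names is made up for illustration, and it uses \A and \z instead of ^ and $ so the pattern is anchored to the whole string rather than to a line. It rebuilds the regex on every call, which is fine for a demo but worth caching for a huge dictionary.

```ruby
# Hypothetical helper: returns [first, last] when the word is exactly
# two dictionary names glued together, otherwise nil.
def split_two_names(word, dict)
  names = Regexp.union(dict)                    # sam|joe|john|...
  match = /\A(#{names})(#{names})\z/.match(word)
  match && match.captures
end

dict = %w(sam joe john parker jane smith doe)

two   = split_two_names("johnsmith", dict)
three = split_two_names("johnsmithdoe", dict)
puts two.inspect
puts three.inspect
```

Note that this only ever finds exactly two names; anything with one or three parts comes back as nil.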
I dare suggest a brute force solution that is not very elegant but still useful in case
you have a large number of items (building a regexp can be a pain)
the string to analyse is not limited to two components
you want to get all splittings of a string
you want only complete analyses of a string, that span from ^ to $.
Because of my poor English, I could not figure out a long personal name that can be split in more than one way, so let's analyse a phrase:
word = "godisnowhere"
The dictionary:
@dict = [ "god", "is", "now", "here", "nowhere", "no", "where" ]
@lengths = @dict.collect {|w| w.length }.uniq.sort
The array @lengths adds a slight optimization to the algorithm: we will use it to prune subwords of lengths that don't exist in the dictionary without actually performing a dictionary lookup. The array is sorted, which is another optimization.
The main part of the solution is a recursive function that finds the initial subword in a given word and restarts for the tail subword.
def find_head_substring(word)
  # boundary condition:
  # remaining subword is shorter than the shortest word in @dict
  return [] if word.length < @lengths[0]
  splittings = []
  @lengths.each do |len|
    break if len > word.length
    head = word[0,len]
    if @dict.include?(head)
      tail = word[len..-1]
      if tail.length == 0
        splittings << head
      else
        tails = find_head_substring(tail)
        unless tails.empty?
          tails.collect!{|tail| "#{head} #{tail}" }
          splittings.concat tails
        end
      end
    end
  end
  return splittings
end
Now see how it works
find_head_substring(word)
=>["god is no where", "god is now here", "god is nowhere"]
I have not tested it extensively, so I apologize in advance :)
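As a possible variation on the same recursion (a sketch, not a replacement for the code above): the hypothetical method splittings below takes the dictionary as an argument instead of relying on instance variables, keeps it in a Set for lookups, and returns each splitting as an array of words. It skips the length-pruning optimization for brevity.

```ruby
require 'set'

# Hypothetical variant: every way to cut word into dictionary entries,
# each returned as an array of words.
def splittings(word, dict)
  # Base case: there is exactly one way to split an empty string (no words).
  return [[]] if word.empty?

  (1..word.length).flat_map do |len|
    head = word[0, len]
    next [] unless dict.include?(head)   # this prefix is not a word
    # Split the remainder recursively and put head in front of each result.
    splittings(word[len..-1], dict).map { |tail| [head] + tail }
  end
end

dict = Set.new(%w(god is now here nowhere no where))
result = splittings("godisnowhere", dict)
result.each { |words| puts words.join(" ") }
```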
If you just want the hits of matches in your dictionary:
dict.select{ |r| word[/#{r}/] }
=> ["john", "smith"]
You run a risk of too many confusing subhits, so you might want to sort your dictionary so longer names are first:
dict.sort_by{ |w| -w.size }.select{ |r| word[/#{r}/] }
=> ["smith", "john"]
You will still encounter situations where a longer name has a shorter substring following it and get multiple hits so you'll need to figure out a way to weed those out. You could have an array of first names, and another of last names, and take the first returned result of scanning for each, but given the diversity of first and last names, that doesn't guarantee 100% accuracy, and will still gather some bad results.
This sort of problem has no real good solution without further hints to the code about the person's name. Perhaps scanning the body of the message for salutation or valediction sections will help.
I'm not sure what you're doing with i, but isn't it as simple as:
dict.each do |first|
  dict.each do |last|
    puts first,last if first+last == word
  end
end
This one bags all occurrences, not necessarily exactly two:
pattern = Regexp.union(dict)
matches = []
while match = word.match(pattern)
  matches << match.to_s # Or just leave off to_s to keep the match itself
  word = match.post_match
end
matches
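As for the loop-length part of the question: the counts in the pattern listed there (1, 3, 6, 10, 15) are the triangular numbers, so a word of length n has n(n+1)/2 contiguous substrings. A small sketch to check this, with the hypothetical helper names substrings and substring_count:

```ruby
# Lists every contiguous substring, in the same order as the question's pattern.
def substrings(word)
  (0...word.length).flat_map do |start|
    (start...word.length).map { |finish| word[start..finish] }
  end
end

# Closed form for the number of contiguous substrings of an n-character word.
def substring_count(n)
  n * (n + 1) / 2
end

abc = substrings("abc")
puts abc.inspect
puts substring_count("abcde".length)
```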

Resources