Sorting query results with German umlaut - neo4j

Is there a way to sort the results of a Cypher query that include German umlauts such as ä, ö, ü? At the moment I get an alphabetically sorted list in which nodes starting with an umlaut are put at the end. They should be sorted within the list instead, e.g. an 'Ö' should be treated like 'OE'.
Any ideas are appreciated, thanks.

Since you asked specifically about Cypher, the query below is an example of how to sort umlauted characters as if they were their two-letter transliterations (e.g., treating 'Ü' as if it were 'UE').
WITH ['Dorfer', 'Dörfener'] AS names
UNWIND names AS name
RETURN name
ORDER BY
REDUCE(s = name, x IN [
['ä', 'ae'], ['Ä', 'AE'],
['ö', 'oe'], ['Ö', 'OE'],
['ü', 'ue'], ['Ü', 'UE']] |
REPLACE(s, x[0], x[1]));
The above query will return 'Dörfener' first, and 'Dorfer' second.
However, the above approach is not very efficient, since it calls the REPLACE function six times for each name. It would be much more efficient to write a user-defined procedure in Java that took the entire names list as input and returned the sorted list in a single call.

Yes, if you sort on the client side in JavaScript, you can use localeCompare or Intl.Collator to achieve this:
// Default sort compares UTF-16 code units, so "ü" ends up after "x"
// Outputs ["u", "x", "ü", "ü", "ü"]
const input1 = ['ü','u','ü','x','ü'];
const output1 = input1.sort();
console.log(output1);
// Intl.Collator
// Outputs ["u", "ü", "ü", "ü", "x"]
const input2 = ['ü','u','ü','x','ü'];
const output2 = input2.sort(Intl.Collator().compare);
console.log(output2);
// localeCompare
// Outputs ["u", "ü", "ü", "ü", "x"]
const input3 = ['ü','u','ü','x','ü'];
const output3 = input3.sort(function (a, b) {
  return a.localeCompare(b);
});
console.log(output3);

Related

How do I sort a simple Lua table alphabetically?

I have already seen many threads with examples of how to do this, the problem is, I still can't do it.
All the examples have tables with extra data. For example, something like this
lines = {
  luaH_set = 10,
  luaH_get = 24,
  luaH_present = 48,
}
or this,
obj = {
  { N = 'Green1' },
  { N = 'Green' },
  { N = 'Sky blue99' }
}
I can code in a few languages but I'm very new to Lua, and tables are really confusing to me. I can't seem to work out how to adapt the code in the examples to be able to sort a simple table.
This is my table:
local players = {"barry", "susan", "john", "wendy", "kevin"}
I want to sort these names alphabetically. I understand that Lua tables don't preserve order, and that's what's confusing me. All I essentially care about doing is just printing these names in alphabetical order, but I feel I need to learn this properly and know how to index them in the right order to a new table.
The examples I see are like this:
local function cmp(a, b)
  a = tostring(a.N)
  b = tostring(b.N)
  local patt = '^(.-)%s*(%d+)$'
  local _,_, col1, num1 = a:find(patt)
  local _,_, col2, num2 = b:find(patt)
  if (col1 and col2) and col1 == col2 then
    return tonumber(num1) < tonumber(num2)
  end
  return a < b
end
table.sort(obj, cmp)
for i,v in ipairs(obj) do
  print(i, v.N)
end
or this:
function pairsByKeys (t, f)
  local a = {}
  for n in pairs(t) do table.insert(a, n) end
  table.sort(a, f)
  local i = 0 -- iterator variable
  local iter = function () -- iterator function
    i = i + 1
    if a[i] == nil then return nil
    else return a[i], t[a[i]]
    end
  end
  return iter
end
for name, line in pairsByKeys(lines) do
  print(name, line)
end
and I'm just absolutely thrown by this as to how to do the same thing for a simple 1D table.
Can anyone please help me to understand this? I know if I can understand the most basic example, I'll be able to teach myself these harder examples.
local players = {"barry", "susan", "john", "wendy", "kevin"}
-- sort ascending, which is the default
table.sort(players)
print(table.concat(players, ", "))
-- sort descending
table.sort(players, function(a,b) return a > b end)
print(table.concat(players, ", "))
Here's why:
Your table players is a sequence.
local players = {"barry", "susan", "john", "wendy", "kevin"}
Is equivalent to
local players = {
  [1] = "barry",
  [2] = "susan",
  [3] = "john",
  [4] = "wendy",
  [5] = "kevin",
}
If you do not provide keys in the table constructor, Lua will use integer keys automatically.
A table like that can be sorted by its values. Lua will simply rearrange the index-value pairs according to the return value of the compare function. By default this is
function (a,b) return a < b end
If you want any other order, you need to provide a function that returns true if element a comes before b.
Read this https://www.lua.org/manual/5.4/manual.html#pdf-table.sort
table.sort
Sorts the list elements in a given order, in-place, from list[1] to
list[#list]
This example is not a "list" or sequence:
lines = {
  luaH_set = 10,
  luaH_get = 24,
  luaH_present = 48,
}
Which is equivalent to
lines = {
  ["luaH_set"] = 10,
  ["luaH_get"] = 24,
  ["luaH_present"] = 48,
}
It only has strings as keys. It has no order. You need a helper sequence to map some order to that table's elements.
The second example
obj = {
  { N = 'Green1' },
  { N = 'Green' },
  { N = 'Sky blue99' }
}
which is equivalent to
obj = {
  [1] = { N = 'Green1' },
  [2] = { N = 'Green' },
  [3] = { N = 'Sky blue99' },
}
Is a list. So you could sort it. But sorting it by table values wouldn't make too much sense. So you need to provide a function that gives you a reasonable way to order it.
Read this so you understand what a "sequence" or "list" is in this regard. Those names are used for other things as well. Don't let it confuse you.
https://www.lua.org/manual/5.4/manual.html#3.4.7
It is basically a table that has consecutive integer keys starting at 1.
Understanding this difference is one of the most important concepts while learning Lua. The length operator, ipairs and many functions of the table library only work with sequences.
This is my table:
local players = {"barry", "susan", "john", "wendy", "kevin"}
I want to sort these names alphabetically.
All you need is table.sort(players)
I understand that Lua tables don't preserve order.
Order of fields in a Lua table (a dictionary with arbitrary keys) is not preserved.
But your Lua table is an array, it is self-ordered by its integer keys 1, 2, 3,....
To clear up the confusion regarding "not preserving order": what is not preserved is the order of the keys in the table, in particular string keys, i.e. when you use the table as a dictionary and not as an array. If you write myTable = {orange="hello", apple="world"}, then the fact that you defined key orange to the left of key apple isn't stored. If you enumerate keys/values using for k, v in pairs(myTable) do print(k, v) end, the order is unspecified: you may get apple world before orange hello or the other way around, regardless of how the constructor was written.
You don't have this problem with numeric keys, though, which is what the keys will be by default if you don't specify them: myTable = {"hello", "world", foo="bar"} is the same as myTable = {[1]="hello", [2]="world", foo="bar"}, i.e. it will assign myTable[1] = "hello", myTable[2] = "world" and myTable.foo = "bar" (same as myTable["foo"]). Here, even if pairs returned the numeric keys in an arbitrary order, it wouldn't matter, since you can still loop through them by incrementing an index.
You can use table.sort which, if no order function is given, will sort the values using < so in case of numbers the result is ascending numbers and in case of strings it will sort by ASCII code:
local players = {"barry", "susan", "john", "wendy", "kevin"}
table.sort(players)
-- players is now {"barry", "john", "kevin", "susan", "wendy"}
This will, however, fall apart if you have mixed lowercase and uppercase entries, because uppercase will go before lowercase due to having lower ASCII codes, and of course it also won't work properly with non-ASCII characters like umlauts (they will go last): the comparison is byte-wise, not a locale-aware alphabetical sort.
You can, however, supply your own ordering function, which receives arguments (a, b) and needs to return true if a should come before b. Here is an example that fixes the lowercase/uppercase issue by converting to uppercase before comparing:
table.sort(players, function (a, b)
  return string.upper(a) < string.upper(b)
end)

How to filter out Flux<Example> that don't contain some value from Flux<String>

So let's say I have a Flux<String> firstLetters containing "A", "B", "C", "D" and Flux<String> lastLetters containing "X", "Y", "Z"
And I have a Flux containing many:
data class Example(val name: String)
And from the whole Flux<Example> I want to split the elements into two variables: one Flux<Example> containing all elements whose name is IN ("A", "B", "C", "D") and a second Flux<Example> whose name is IN ("X", "Y", "Z").
Is it possible to do so in one flow, without applying the same logic first for firstLetters and then for lastLetters?
As the problem stands I don't believe so, as you'll have to process each element multiple times (one per each value on the list to see if it contains the value you need.) You can call cache() on the Flux though to ensure that the values are only retrieved once, or convert to another data structure entirely.
Given that you have to re-evaluate anyway, and assuming you still want to stick with raw Flux objects, filterWhen() and any() can be used quite nicely here:
Flux<Example> firstNames = names.filterWhen(e -> firstLetters.any(e.name::contains));
Flux<Example> lastNames = names.filterWhen(e -> lastLetters.any(e.name::contains));
You can of course pull the Predicate out into a separate method if you're concerned about code duplication there.
If Flux<String> firstLetters/lastLetters can be replaced with Set<String> firstLetters/lastLetters then you can easily leverage Flux::groupBy method on Flux<Example> to split it into different groups.
enum Group {
    FIRST, LAST, UNDEFINED
}
Group toGroup(Example example) {
    if (firstLetters.contains(example.name)) return Group.FIRST;
    else if (lastLetters.contains(example.name)) return Group.LAST;
    else return Group.UNDEFINED;
}
Flux<GroupedFlux<Group, Example>> group(Flux<Example> examples) {
    return examples.groupBy(example -> toGroup(example));
}
You can then get the group by calling GroupedFlux<K, V>::key.

Is this ordering guaranteed?

Is this script:
local data =
{
  { "data1", "1"},
  { "data5", "2"},
  { "3453453", "3"},
  { "zzz", "4"},
  { "222", "5"},
  { "lol", "6"},
  { "asdf", "7"},
  { "hello", "8"},
}
local function test()
  local count = #data
  for i = 1, count do
    print(data[i][1] .. " = " .. data[i][2])
  end
end
test()
Guaranteed to output:
data1 = 1
data5 = 2
3453453 = 3
zzz = 4
222 = 5
lol = 6
asdf = 7
hello = 8
If not, then why, and what is the best way, performance-wise, to make it so?
I read something about pairs vs. ipairs not returning a fixed order of results.
ipairs is an iterator of the array elements of a table, in order from first to last. "Array elements" being defined as the members of a table with keys that are numeric values on the range [1, #tbl], where #tbl is the length operator applied to the table.
pairs is an iterator over all of the elements of a table: array and non-array elements alike. Non-array elements of a table have no intrinsic order to Lua, so pairs will return them in any order. And even though the array elements do technically have an order, pairs will not make an exception for them; it always operates in an arbitrary order.
Your code works like ipairs: iterating over each of the numeric keys of the table from 1 to its length.

Intersection of 2 array of strings with different upper and lower case

I want to get the intersection of 2 arrays of strings. The first array has different upper and lower case. The resulting array I want should respect the first arrays casing, but the comparison between the 2 should ignore the upper/lower case. E.g.
letters = ['Aaa', 'BbB', 'CCC']
permitted = ['aaa', 'bbb']
The result should be:
['Aaa', 'BbB']
I'm doing:
letters.map(&:downcase) & permitted.map(&:downcase)
But this returns ['aaa', 'bbb']
What's a neat way of doing this? The longer way of doing it is:
letters.each { |letter|
  if permitted.include?(letter.downcase)
    accepted.push(letter)
  end
}
But is there a shorter/neater way?
You can use select:
search = permitted.map(&:downcase)
letters.select{|letter|
  search.include?(letter.downcase)
}
Or even neater (imho):
-> search {
  letters.select{|x| search.include?(x.downcase)}
}.call(permitted.map(&:downcase))
There's a method for comparing strings in a case-insensitive manner, String#casecmp:
letters = ['Aaa', 'BbB', 'CCC']
permitted = ['aaa', 'bbb']
letters.select{|l| permitted.detect{|p| p.casecmp(l) == 0 } } # => ["Aaa", "BbB"]
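You can also use regular expressions. :) The pattern is anchored with \A and \z so that only whole strings match, and Regexp.union escapes any special characters in the permitted values.
letters = ['Aaa', 'BbB', 'CCC']
permitted = ['aaa', 'bbb']
letters.grep(/\A(?:#{Regexp.union(permitted).source})\z/i) # => ["Aaa", "BbB"]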
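For larger lists, the include? lookup in the select approach scans the permitted array once per letter. A minimal sketch, assuming the same letters and permitted arrays as in the question, that puts the downcased permitted values into a Set from Ruby's standard library so each membership check is a hash lookup. The names permitted_keys and result are illustrative only:

```ruby
require "set"

letters = ["Aaa", "BbB", "CCC"]
permitted = ["aaa", "bbb"]

# Build the lookup set once; comparison keys are lowercased.
permitted_keys = permitted.map(&:downcase).to_set

# Keep the original casing from letters, compare case-insensitively.
result = letters.select { |letter| permitted_keys.include?(letter.downcase) }

p result
```

For two short arrays like the ones in the question, the plain Array#include? version is just as good; the Set only starts to pay off when permitted is long.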

Ruby array, convert to two arrays in my case

I have an array of string which contains the "firstname.lastname?some.xx" format strings:
customers = ["aaa.bbb?q21.dd", "ccc.ddd?ew3.yt", "www.uuu?nbg.xcv", ...]
Now, I would like to use this array to produce two arrays, with:
each element of the 1st array contains only the string before "?", with "." replaced by a space.
each element of the 2nd array is the string after "?", including the "?" itself.
That is, I want to produce the following two arrays from the customers array:
1st_arr = ["aaa bbb", "ccc ddd", "www uuu", ...]
2nd_arr = ["?q21.dd", "?ew3.yt", "?nbg.xcv", ...]
What is the most efficient way to do it if I use customers array as an argument of a method?
def produce_two_arr(customers)
  # What is the most efficient way to produce the two arrays?
  # What I did (Ruby identifiers cannot start with a digit, hence first_arr/second_arr):
  first_arr = Array.new
  second_arr = Array.new
  customers.each do |el|
    first_str, second_str = el.split('?')
    first_arr << first_str.gsub(/\./, " ")
    second_arr << "?" + second_str
  end
  p first_arr
  p second_arr
end
Functional approach: when you are generating results inside a loop but you want them to be split into different arrays, Array#transpose comes in handy:
ary1, ary2 = customers.map do |customer|
  a, b = customer.split("?", 2)
  [a.gsub(".", " "), "?" + b]
end.transpose
Anytime you're building an array from another, reduce (a.k.a. inject) is a great help:
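A minimal sketch of such a reduce-based version, assuming the customers array from the question. The accumulator names names_acc and suffixes_acc are made up for illustration:

```ruby
customers = ["aaa.bbb?q21.dd", "ccc.ddd?ew3.yt", "www.uuu?nbg.xcv"]

# The accumulator is a pair of arrays; the block must return it on every step.
names, suffixes = customers.inject([[], []]) do |(names_acc, suffixes_acc), customer|
  name, rest = customer.split("?", 2)   # split only at the first "?"
  names_acc << name.tr(".", " ")        # "aaa.bbb" becomes "aaa bbb"
  suffixes_acc << "?#{rest}"            # re-attach the "?" that split removed
  [names_acc, suffixes_acc]
end

p names
p suffixes
```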
But sometimes, a good ol' map is all you need (in this case, either one works because you're building an array of the same size):
a, b = customers.map do |customer|
  a, b = customer.split('?')
  [a.tr('.', ' '), "?#{b}"]
end.transpose
This is very efficient since you're only iterating through customers a single time and you are making efficient use of memory by not creating lots of extraneous strings and arrays through the + method.
Array#collect is good for this type of thing:
arr1 = customers.collect{ |c| c.split("?").first.tr( ".", " " ) }
arr2 = customers.collect{ |c| "?" + c.split("?").last }
But you have to do the initial c.split("?") twice. So it's efficient in terms of the amount of code, but more CPU-intensive.
first_arr = customers.collect{ |name| name.gsub(/\?.*\z/,'').gsub(/\./,' ') }
second_arr = customers.collect{ |name| name.match(/\?.*\z/)[0] }
array1, array2 = customers.map{|el| el.sub('.', ' ').split(/(?=\?)/, 2)}.transpose
Based on @Tokland's code, but it avoids the extra variables (by using 'sub' instead of 'gsub') and the re-attaching of '?' (by splitting on a zero-width lookahead, which keeps the '?' at the start of the second part).

Resources