I have a neo4j database which has nearly 500k CK_ITEM nodes defined as follows:
CK_ITEM: {
id (String),
name (String),
description (String)
}
Suppose we have this sample data:
+--------+----+-----------------+
| name | id | description |
+--------+----+-----------------+
| Mark | 1 | A lot of things |
| Gerald | 9 | Coff2e |
| Carl | 2 | 1 mango |
| James | 3 | 5 lemons |
| Edward | 4 | Coffee |
+--------+----+-----------------+
I need to order the data by description ASC. This is my query:
MATCH (n:CK_ITEM)
RETURN n
ORDER BY
n.description ASC
This results in:
+--------+----+-----------------+
| name | id | description |
+--------+----+-----------------+
| Carl | 2 | 1 mango | <-- '1' < '5'
| James | 3 | 5 lemons | <-- '5' < 'A'
| Mark | 1 | A lot of things | <-- 'A' < 'C'
| Gerald | 9 | Coff2e | <-- '2' < 'e'
| Edward | 4 | Coffee |
+--------+----+-----------------+
Now, the customer asked me to order the results so that they are still in ascending order, but with digits sorted after letters.
Basically, he wants the results to be:
+--------+----+-----------------+
| name | id | description |
+--------+----+-----------------+
| Mark | 1 | A lot of things |
| Edward | 4 | Coffee |
| Gerald | 9 | Coff2e | <-- Coff2e after Coffee
| Carl | 2 | 1 mango | <-- 1 and 5 ASC after everything
| James | 3 | 5 lemons |
+--------+----+-----------------+
Translated to a pseudo-query, it would be something like this:
MATCH CK_ITEM ORDER BY letters(description) ASC numbers(description) ASC
Is it possible to have this kind of sorting (letters first ascending, numbers last ascending) in a single query? How?
The following is a Cypher query that will perform a sort where digits come last (at every character position).
NOTE: THIS APPROACH IS NOT EFFICIENT, but it is presented as an example of how to do this in Cypher if you absolutely need to.
The query splits every description value into single-character strings and tests each character to see whether it is a digit. It then builds a new string, character by character, replacing every digit with the corresponding character in the hex range FFF6 to FFFF. These are the highest 16-bit UTF-16 code units, and your raw data is unlikely to already be using them. The new string is used only for sorting purposes.
WITH {`0`:'\uFFF6',`1`:'\uFFF7',`2`:'\uFFF8',`3`:'\uFFF9',`4`:'\uFFFA',`5`:'\uFFFB',`6`:'\uFFFC',`7`:'\uFFFD',`8`:'\uFFFE',`9`:'\uFFFF'} AS big
MATCH (n:CK_ITEM)
WITH n, SPLIT(n.description, '') AS chars, big
RETURN n
ORDER BY
REDUCE(s='', i IN RANGE(0, SIZE(chars)-1) |
CASE WHEN '9' >= chars[i] >= '0'
THEN s + big[chars[i]]
ELSE s + chars[i]
END)
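To make the remapping idea easier to check outside the database, here is a minimal Python sketch of the same sort key, run against the sample descriptions from the question. The names `DIGITS_LAST` and `digits_last_key` are made up for illustration. Python compares strings by code point, which I am assuming matches the database's ordering for this sample data.

```python
# Hypothetical helper mirroring the `big` map in the Cypher query:
# '0'..'9' are replaced by U+FFF6..U+FFFF so digits compare after letters.
DIGITS_LAST = {str(d): chr(0xFFF6 + d) for d in range(10)}


def digits_last_key(description):
    """Build the sort key by swapping each digit for its high code point."""
    return "".join(DIGITS_LAST.get(ch, ch) for ch in description)


# Sample descriptions taken from the question.
descriptions = ["A lot of things", "Coff2e", "1 mango", "5 lemons", "Coffee"]
ordered = sorted(descriptions, key=digits_last_key)
print(ordered)
```

The key is only used for comparison; the original descriptions are returned unchanged, just as the Cypher query returns `n` rather than the rewritten string.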
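You can order this way. In Cypher, false sorts before true, so descriptions that start with a digit come last. Note that this only looks at the first character:
MATCH (n:CK_ITEM)
RETURN n
ORDER BY
substring(n.description,0,1) in ['0','1','2','3','4','5','6','7','8','9'], n.description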
You can use a regular expression and UNION:
MATCH (n:CK_ITEM) WHERE NOT n.description =~ '[0-9].*'
RETURN n
ORDER BY
n.description ASC
UNION
MATCH (n:CK_ITEM) WHERE n.description =~ '[0-9].*'
RETURN n
ORDER BY
n.description ASC
Let's say I want to create a grammar that is similar to Lisp, where all expressions are between open and close parentheses.
For example:
(+ 1 2)
I also want the grammar to be able to parse the string ('(def foo)) to a parse tree which is similar to (expression ( literal '(def foo) )).
That means it should successfully associate the parentheses in the literal expression to the literal.
Well, Lisp in general lets users extend its grammar heavily, so I don't know how feasible it would be to capture it in any BNF(+) form. Here is a discussion about it; I'm sure there are more if you search for it.
But for toy examples, this will probably be fine:
<s_expression> ::= <atomic_symbol>
| "(" <s_expression> "." <s_expression> ")"
| <list> .
<_list> ::= <s_expression> <_list>
| <s_expression> .
<list> ::= "(" <s_expression> <_list> ")" .
<atomic_symbol> ::= <letter> <atom_part> | "'" <s_expression> .
<atom_part> ::= <empty> | <letter> <atom_part> | <number> <atom_part> .
<letter> ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j"
| "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t"
| "u" | "v" | "w" | "x" | "y" | "z" .
<number> ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" | "0" .
<empty> ::= " ".
modified from here
I modified the grammar in a hurry, so please tell me if you see any problems with it.
Also, I haven't used ANTLR in a long time, so I don't know if it's exactly in the format it expects. It should be trivial to reformat it, though.
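To show how the quote rule ties the following parenthesised expression to the literal, here is a minimal recursive-descent sketch in Python. It is not ANTLR output and does not implement the grammar above rule for rule: it treats any non-parenthesis token as an atom and has no dotted pairs. The names `tokenize` and `parse` and the `("literal", ...)` tuple shape are assumptions made for illustration.

```python
def tokenize(source):
    """Split the source into '(', ')', "'" and atom tokens."""
    for mark in ("(", ")", "'"):
        source = source.replace(mark, " " + mark + " ")
    return source.split()


def parse(tokens, pos=0):
    """Parse one expression starting at tokens[pos].

    Returns (tree, next_pos). A quote wraps whatever expression follows it,
    so the parentheses after a quote belong to the literal.
    """
    token = tokens[pos]
    if token == "'":
        quoted, pos = parse(tokens, pos + 1)
        return ("literal", quoted), pos
    if token == "(":
        items = []
        pos += 1
        while tokens[pos] != ")":
            item, pos = parse(tokens, pos)
            items.append(item)
        return items, pos + 1  # skip the closing parenthesis
    if token == ")":
        raise SyntaxError("unexpected ')'")
    return token, pos + 1  # any other token is treated as an atom


tree, _ = parse(tokenize("('(def foo))"))
print(tree)
```

Because the quote branch recurses into `parse` before returning, the whole `(def foo)` list ends up inside the literal node instead of being split at its opening parenthesis.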
I have created a BNF for a certain language and want to check if a certain input is valid for that BNF. For instance, if I have a BNF like
<palindrome> ::= a <palindrome> a | b <palindrome> b |
c <palindrome> c | d <palindrome> d |
e <palindrome> e | ...
| z <palindrome> z
<palindrome> ::= <letter>
<letter> ::= a | b | c | ... | y | z
The strings 'bcdcb' and 'hannah' should return true.
The string 'joe' should return false.
Can someone describe an algorithm that can do this?
This grammar rejects 'joe' because it checks whether the first and last letters are the same; it only matches palindromes. 'joe' is not a palindrome, so it is correct that it does not pass.
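One way to check input against this particular grammar is a small recursive recognizer that follows the two productions directly. This is a sketch for this grammar only, not a general BNF checker. As written, the grammar derives only odd-length strings, so 'hannah' is accepted only if an empty production is added; the `allow_empty` flag below is a hypothetical switch for that extra rule, and `matches_palindrome` is an illustrative name.

```python
import string

LETTERS = set(string.ascii_lowercase)  # <letter> ::= a | b | ... | z


def matches_palindrome(text, allow_empty=False):
    """Return True if text can be derived from <palindrome>.

    allow_empty=True adds a hypothetical empty production, which is what
    even-length inputs such as 'hannah' would require.
    """
    if len(text) == 0:
        return allow_empty
    if len(text) == 1:
        # <palindrome> ::= <letter>
        return text in LETTERS
    first, last = text[0], text[-1]
    if first in LETTERS and first == last:
        # <palindrome> ::= x <palindrome> x
        return matches_palindrome(text[1:-1], allow_empty)
    return False


print(matches_palindrome("bcdcb"))
print(matches_palindrome("joe"))
print(matches_palindrome("hannah", allow_empty=True))
```

For an arbitrary BNF, a general parsing algorithm would be needed instead; this sketch works only because each production here fixes the first and last character.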
Is there any formal algorithm, or set of steps, to rewrite a grammar so that it has no left recursion and has the right precedence, such as the simple algorithm for eliminating left recursion described in Wikipedia?
For example, given the following grammar:
1 <goal> ::= <expr>$
2 <expr> ::= <expr><op><expr>
3 | num
4 | id
5 <op> ::= +
6 |-
7 |*
8 |/
The desired output should be:
1. <expr> ::= <term><expr'>
2. <expr'> ::= +<term><expr'>
3. | epsilon
4. | -<term><expr'>
5. <term> ::= <factor><term'>
6. <term'> ::= *<factor><term'>
7. | epsilon
8. | /<factor><term'>
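For reference, the left-recursion step mentioned above (rewriting A ::= A α | β as A ::= β A' and A' ::= α A' | epsilon) can be sketched in Python as follows. This handles immediate left recursion only and does not by itself introduce the <term> and <factor> precedence levels shown in the desired output. The function name and the list-of-symbols representation are assumptions made for illustration; an empty list stands for epsilon.

```python
def eliminate_immediate_left_recursion(nonterminal, productions):
    """Rewrite A ::= A alpha | beta into A ::= beta A', A' ::= alpha A' | epsilon.

    Each production is a list of symbols; the empty list means epsilon.
    """
    tail = nonterminal + "'"
    # Left-recursive alternatives, with the leading nonterminal stripped.
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    # Alternatives that do not start with the nonterminal itself.
    others = [p for p in productions if not p or p[0] != nonterminal]
    if not recursive:
        return {nonterminal: productions}
    return {
        nonterminal: [beta + [tail] for beta in others],
        tail: [alpha + [tail] for alpha in recursive] + [[]],
    }


# <expr> ::= <expr><op><expr> | num | id, from the grammar above.
result = eliminate_immediate_left_recursion(
    "expr", [["expr", "op", "expr"], ["num"], ["id"]]
)
print(result)
```

Applied to the <expr> rule, this removes the left recursion but keeps every operator at a single level, which is why the desired output needs a separate layering step for precedence.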
I'm attempting to come up with an unambiguous grammar for arithmetic expressions to make an Earley parser faster, but I seem to be having trouble.
This is the given ambiguous grammar:
S -> E | S,S
E -> E+E | E-E | E*E | (E) | -E | V
V -> a | b | c
This is my attempt at making it unambiguous:
S -> S+E | S-E | E | (S+E) | (S-E) | (E)
E -> E*T | E
T -> -V | V
V -> a | b | c
It parses everything fine, but there isn't any significant speedup compared with using the ambiguous one.