Custom order on string - ruby-on-rails

I have a project model. Projects have a code attribute, which is in AAXXXX-YY format like "AA0001-18", "ZA0012-19", where AA is two characters, XXXX is a progressive number, and YY is the last two digits of the year of its creation.
I need to define a default scope that orders projects by code so that the year takes precedence over the rest of the code. Given the codes "ZZ0001-17", "AA0001-18", and "ZZ0002-17", the result should be "ZZ0001-17" first, "ZZ0002-17" second, and "AA0001-18" third.
I tried:
default_scope { order(:code) }
but I get "AA0001-18" first.

Short answer
order("substring(code from '..$') ASC, code ASC")
Wait but why?
As you said, you basically want to sort by two things:
the last two characters of the code string (YY)
the rest of the code (AAXXXX-)
First things first: per the Rails documentation, the order method takes the arguments you pass and uses them in the ORDER BY clause of the query.
Next, according to the PostgreSQL documentation, the substring function has this form:
substring(string from pattern)
To take two characters (..) anchored at the end of the string ($), we use the pattern ..$
Hence, substring(code from '..$')
For more information about pattern matching, refer to the PostgreSQL documentation.
Finally, the second part of the ordering, code ASC, sorts rows that share the same year by the full code, which within one year amounts to sorting by the preceding characters AAXXXX-.
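To check the intended ordering without a database, here is a minimal plain-Ruby sketch of the same two-key sort. It only mimics what the ORDER BY clause does; the helper name sort_codes and the sample codes are assumptions made for illustration.

```ruby
# Sort project codes by their two-character year suffix first,
# then by the full code, mirroring
#   ORDER BY substring(code from '..$') ASC, code ASC
# sort_codes is a hypothetical helper name used only in this sketch.
def sort_codes(codes)
  codes.sort_by { |code| [code[-2, 2], code] }
end

codes = ["ZZ0001-17", "AA0001-18", "ZZ0002-17"]
puts sort_codes(codes)
```

Ruby compares strings byte by byte, while PostgreSQL follows the column's collation, so the two may disagree on mixed-case or non-ASCII codes.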

Related

Denodo: How to aggregate varchar data types?

I'm creating an aggregate from an anstime column in a view table in Denodo, and I'm using a cast to convert it to float. It works only for numbers with a period (for example 123.123) but not for numbers without a period (for example 123). Here's my code, which only works for numbers with a period:
SELECT row_date,
case
when sum(cast(anstime as float)) is null or sum(cast(anstime as float)) = 0
then 0
else sum(cast(anstime as float))
end as xans
FROM table where anstime like '%.%'
group by row_date
Can someone please help me handle those without a period?
My guess is that you've got values in anstime which are not numeric, which is why removing the where anstime like '%.%' predicate causes a failure, as has been mentioned in other comments.
You could try adding an intermediate view before this one which strips out any non-numeric values (leaving the decimal point character, of course). This might then let you drop the where anstime like '%.%' filter.
Perhaps the REGEXP function would help there.
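The clean-up idea can be sketched outside Denodo. The following plain-Ruby example is only an illustration of "strip everything except digits and the decimal point, then sum"; it is not Denodo VQL, and the helper names clean_number and sum_anstime are assumptions. It also drops minus signs, which may or may not be acceptable for your data.

```ruby
# Keep only digits and the decimal point, then convert to Float.
# clean_number is a hypothetical helper; values left empty count as 0.0.
def clean_number(raw)
  raw.to_s.gsub(/[^0-9.]/, "").to_f
end

# Sum a list of raw string values after cleaning each one.
def sum_anstime(values)
  values.sum { |raw| clean_number(raw) }
end

puts sum_anstime(["123", "123.5", "n/a", " 7 "])
```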
Your where anstime like '%.%' clause is going to restrict possible responses to places where anstime has a period in it. Remove that if you want to allow all values.
I appreciate those who responded to my concern. In the end we had to reach out to our developers to fix the data type of the column from varchar to float rather than doing a workaround.

Rails query by number of digits in field

I have a Rails app with a table: "clients". The clients table has a field, phone, whose data type is string. I'm using PostgreSQL. I would like to write a query that selects all clients whose phone value contains more than 10 digits. phone does not have a specific format:
+1 781-658-2687
+1 (207) 846-3332
2067891111
(345)222-777
123.234.3443
etc.
I've been trying variations of the following:
Client.where("LENGTH(REGEXP_REPLACE(phone,'[^\d]', '')) > 10")
Any help would be great.
You almost have it but you're missing the 'g' option to regexp_replace, from the fine manual:
The regexp_replace function provides substitution of new text for substrings that match POSIX regular expression patterns. [...] The flags parameter is an optional text string containing zero or more single-letter flags that change the function's behavior. Flag i specifies case-insensitive matching, while flag g specifies replacement of each matching substring rather than only the first one.
So regexp_replace(string, pattern, replacement) behaves like Ruby's String#sub whereas regexp_replace(string, pattern, replacement, 'g') behaves like Ruby's String#gsub.
You'll also need to get a \d through your double-quoted Ruby string all the way down to PostgreSQL so you'll need to say \\d in your Ruby. Things tend to get messy when everyone wants to use the same escape character.
This should do what you want:
Client.where("LENGTH(REGEXP_REPLACE(phone, '[^\\d]', '', 'g')) > 10")
# --------------------------------------------^^---------^^^
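Both points can be checked in plain Ruby without touching the database. This is a small sketch using one of the sample phone values from the question; the variable names are only illustrative.

```ruby
phone = "+1 781-658-2687"

# sub replaces only the first non-digit, like regexp_replace without 'g'.
first_only = phone.sub(/[^\d]/, "")
# gsub replaces every non-digit, like regexp_replace with the 'g' flag.
all_removed = phone.gsub(/[^\d]/, "")

puts first_only          # only the leading "+" has been removed
puts all_removed.length  # number of digits in the phone value

# In a double-quoted Ruby string, "\d" loses its backslash,
# so the backslash must be doubled to reach PostgreSQL intact.
puts "[^\d]"   # the backslash is dropped
puts "[^\\d]"  # the backslash survives
```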
Try this:
phone_number.gsub(/[^\d]/, '').length

Rails – assign an order number to each record

So I am importing passages from a book into my application. I am giving all the passages in a given book the class Passage. i.e. Passage.all
I do have many books so I also have a class Book. Therefore, when I am finding all the passages from one book I call:
Passage.where(book_id: self.book_id)
When I use the where method, does it preserve the "natural order" that Passage.all would generally return? If not, I could change the code to:
Passage.where(book_id: self.book_id).order("created_at ASC")
Anyway, I then proceed to write this code:
a = Passage.where(book_id: self.book_id)
b = a.index(self)+1
self.passage_number = b
[first line: returns all the passages in the book]
[second line: returns their number in the array + 1 to account for the 0 starting value thing (pardon the colloquialism)]
[third line: assigns that index value to the passage number]
Ultimately, I am trying to compute passage numbers, without having to hard code them.
SO WHAT'S MY ISSUE? Right now I am getting three passage #3's, and two passage #4's. My last passage is this:
Passage.last.passage_number = 217
Passage.where(book_id: 5).count = 241
It is skipping numbers and incorrectly indexing, so I think I need to code a better method! What's a better way to index an array in this context?
There is no such thing as "natural order": without an order clause Passage.all may return things in any order the database wants (which could depend on things like location of items on disk, query plan etc).
The first and last methods are special in that they order by id if your relation does not already have an order applied to it.
If you need things in a specific order then add an order clause.
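As a rough sketch of numbering with an explicit order, the example below uses a plain Struct in place of the ActiveRecord model so it runs on its own. The helper name number_passages, the integer created_at values, and the id tie-breaker are assumptions for illustration, not part of the original code.

```ruby
# Passage here is a plain Struct standing in for the ActiveRecord model.
Passage = Struct.new(:id, :book_id, :created_at, :passage_number)

# Number the passages of one book in an explicit, stable order
# (created_at, with id as a tie-breaker), starting at 1.
# number_passages is a hypothetical helper for this sketch.
def number_passages(passages, book_id)
  in_book = passages.select { |passage| passage.book_id == book_id }
  ordered = in_book.sort_by { |passage| [passage.created_at, passage.id] }
  ordered.each.with_index(1) do |passage, number|
    passage.passage_number = number
  end
  ordered
end

passages = [
  Passage.new(3, 5, 30),
  Passage.new(1, 5, 10),
  Passage.new(2, 7, 20),
  Passage.new(4, 5, 10)
]
number_passages(passages, 5).each do |passage|
  puts "#{passage.id}: #{passage.passage_number}"
end
```

In the real app the equivalent would be an order clause on the relation, such as the order("created_at ASC") variant from the question, ideally with a tie-breaker column so equal timestamps cannot swap places.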

How do I repeat a capturing group?

I have an input string that looks something like this:
HLI6Ch60000Ch500C0Ch46400Ch30000Ch21888Ch10E79CS07LCU3Ch37880Ch27800Ch16480CS8CA00000000000000000000
Now I don't care about the part that follows the last letter A, it'll always be A and exactly 20 numbers that are of no use to me. I do, however, need the part before the last letter A, and ideally, I'd need it to be separated into two different captures, just like this:
1: HLI6Ch60000Ch500C0Ch46400Ch30000Ch21888Ch10E79CS07
2: LCU3Ch37880Ch27800Ch16480CS8C
The only way to identify these matches is that they end with characters CS followed by two hexadecimal characters. I thought that a regular expression like (.+?CS.{2})+ (or (.+?CS[[:xdigit:]]{2})+) would do the job but when tried on www.regex101.com, it only captures the last group and gives the following warning:
Note: A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
Which I thought suggests that I should use regular expression like ((.+?CS.{2})+) instead and I mean – sure, now I get two captures, but they look like this:
1: HLI6Ch60000Ch500C0Ch46400Ch30000Ch21888Ch10E79CS07LCU3Ch37880Ch27800Ch16480CS8C
2: LCU3Ch37880Ch27800Ch16480CS8C
Meaning the first one is… slightly longer than I'd like it to be. If it helps in any way, I should point out that the final regular expression will be part of an iOS application, so an instance of the NSRegularExpression class will be used. I'm not sure whether that's helpful information, but I know that NSRegularExpression doesn't support every regular expression feature.
(.+?CS.{2})
You can directly use this. See the demo. Grab the group or capture.
https://regex101.com/r/vD5iH9/68
It doesn't seem like you need a capturing group at all:
(?:(?!CS[0-9A-F]{2}).)+CS[0-9A-F]{2}
will match all strings that end in CS + 2 hex digits.
Test it live on regex101.com.
Explanation:
(?: # Start a group.
(?!CS[0-9A-F]{2}) # Make sure we can't match CSff here,
. # if so, match any character.
)+ # Do this at least once.
CS[0-9A-F]{2} # Then match CSff.
Change your regex to,
(.+?CS[[:xdigit:]]{2})
You don't need to wrap the regex in another capturing group and repeat it one or more times. Just print group 1 of each match to get your desired output.
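The "match repeatedly instead of repeating the group" idea can be tried in Ruby, whose String#scan returns every non-overlapping match. This is only a sketch in a different language from the iOS target; the input is rebuilt from the two expected pieces plus the trailing A and twenty zeros described in the question.

```ruby
# The sample input, assembled from its parts so each piece stays readable.
first  = "HLI6Ch60000Ch500C0Ch46400Ch30000Ch21888Ch10E79CS07"
second = "LCU3Ch37880Ch27800Ch16480CS8C"
input  = first + second + "A" + "0" * 20

# Lazily match up to each "CS" followed by two hex digits;
# scan returns every non-overlapping match in order.
parts = input.scan(/.+?CS[[:xdigit:]]{2}/)
puts parts
```

The trailing A and zeros contain no CS, so they are simply left unmatched.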

How to sort a list of 1million records by the first letter of the title

I have a table with 1 million+ records that contain names. I would like to be able to sort the list by the first letter in the name.
.. ABCDEFGHIJKLMNOPQRSTUVWXYZ
What is the most efficient way to set up the db table to allow searching by the first character of the table.name field?
The best idea right now is to add an extra field which stores the first character of the name as an observer, index that field and then sort by that field. Problem is it's no longer necessarily alphabetical.
Any suggestions?
You said in a comment:
so lets ignore the first letter part. How can I all records that start with A? All A's no B...z ? Thanks – AnApprentice Feb 21 at 15:30
I assume you meant "How can I RETURN all records...".
This is the answer:
select * from t
where substr(name, 1, 1) = 'A'
I agree with the comments above asking why you would want to do this: a regular index on the whole field is functionally equivalent. PostgreSQL has some rather powerful indexing capabilities for special cases (with some new ones in version 9), which you might want to read about here: http://www.postgresql.org/docs/9.1/interactive/sql-createindex.html
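As an analogy for why ordering on the whole field already covers the first letter: in sorted data, all names sharing a first letter sit next to each other, so finding them is one seek plus a contiguous read. The plain-Ruby sketch below uses a sorted array as a stand-in for an index. It says nothing about how PostgreSQL's planner actually treats a given query, and names_starting_with is a hypothetical helper.

```ruby
# A sorted array stands in for an index on the name column.
# names_starting_with is a hypothetical helper for this sketch.
def names_starting_with(sorted_names, letter)
  # Binary search for the first name that sorts at or after the letter.
  start = sorted_names.bsearch_index { |name| name >= letter }
  return [] if start.nil?

  # Names sharing the prefix are contiguous, so read until it stops matching.
  sorted_names.drop(start).take_while { |name| name.start_with?(letter) }
end

names = %w[Carol Adam Bob Alice Aaron].sort
puts names_starting_with(names, "A")
```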

Resources