How to sort a list of 1million records by the first letter of the title - ruby-on-rails

I have a table with 1 million+ records that contain names. I would like to be able to sort the list by the first letter in the name.
.. ABCDEFGHIJKLMNOPQRSTUVWXYZ
What is the most efficient way to setup the db table to allow for searching by the first character in the table.name field?
The best idea right now is to add an extra field which stores the first character of the name as an observer, index that field and then sort by that field. Problem is it's no longer necessarily alphabetical.
Any suggestions?

You said in a comment:
so lets ignore the first letter part. How can I all records that start with A? All A's no B...z ? Thanks – AnApprentice Feb 21 at 15:30
I issume you meant How can I RETURN all records...
This is the answer:
select * from t
where substr(name, 1, 1) = 'A'

I agree with the questions above as to why you would want to do this -- a regular index on the whole field is functionally equivalent. PostgreSQL (with some new ones in v. 9) has some rather powerful indexing capabilities for special cases which you might want to read about here http://www.postgresql.org/docs/9.1/interactive/sql-createindex.html

Related

How do i remove rows based on comma-separated list of values in a Power BI parameter in Power Query?

I have a list of data with a title column (among many other columns) and I have a Power BI parameter that has, for example, a value of "a,b,c". What I want to do is loop through the parameter's values and remove any rows that begin with those characters.
For example:
Title
a
b
c
d
Should become
Title
d
This comma separated list could have one value or it could have twenty. I know that I can turn the parameter into a list by using
parameterList = Text.Split(<parameter-name>,",")
but then I am unsure how to continue to use that to filter on. For one value I would just use
#"Filtered Rows" = Table.SelectRows(#"Table", each Text.StartsWith([key], <value-to-filter-on>))
but that only allows one value.
EDIT: I may have worded my original question poorly. The comma separated values in the parameterList can be any number of characters (e.g.: a,abcd,foo,bar) and I want to see if the value in [key] starts with that string of characters.
Try using List.Contains to check whether the starting character is in the parameter list.
each List.Contains(parameterList, Text.Start([key], 1)
Edit: Since you've changed the requirement, try this:
Table.SelectRows(
#"Table",
(C) => not List.AnyTrue(
List.Transform(
parameterList,
each Text.StartsWith(C[key], _)
)
)
)
For each row, this transforms the parameterList into a list of true/false values by checking if the current key starts with each text string in the list. If any are true, then List.AnyTrue returns true and we choose not to select that row.
Since you want to filter out all the values from the parameter, you can use something like:
= Table.SelectRows(#"Changed Type", each List.Contains(Parameter1,Text.Start([Title],1))=false)
Another way to do this would be to create a custom column in the table, which has the first character of title:
= Table.AddColumn(#"Changed Type", "FirstChar", each Text.Start([Title],1))
and then use this field in the filter step:
= Table.SelectRows(#"Added Custom", each List.Contains(Parameter1,[FirstChar])=false)
I tested this with a small sample set and it seems to be running fine. You can test both and see if it helps with the performance. If you are still facing performance issues, it would probably be easier if you can share the pbix file.
This seems to work fairly well:
= List.Select(Source[Title], each Text.Contains(Parameter1,Text.Start(_,1))=false)
Replace Source with the name of your table and Parameter1 with the name of your Parameter.

GSheets - How to query a partial string

I am currently using this formula to get all the data from everyone whose first name is "Peter", but my problem is that if someone is called "Simon Peter" this data is gonna show up on the formula output.
=QUERY('Data'!1:1000,"select * where B contains 'Peter'")
I know that for the other formulas if I add an * to the String this issue is resolved. But in this situation for the QUERY formula the same logic do not applies.
Do someone knows the correct syntax or a workaround?
How about classic SQL syntax
=QUERY('Data'!1:1000,"select * where B like 'Peter %'")
The LIKE keyword allows use of wildcard % to represent characters relative to the known parts of the searched string.
See the query reference: developers.google.com/chart/interactive/docs/querylanguage You could split firstname and lastname into separate columns, then only search for firstnames exactly equal to 'Peter'. Though you may want to also check if lowercase/uppercase where lower(B) contains 'peter' or whitespaces are present in unexpected places (e.g., trim()). You could also search only for values that start with Peter by using starts with instead of contains, or a regular expression using matches. – Brian D
It seems that for my case using 'starts with' is a perfect fit. Thank you!

Issue regarding fetching only those records where field value contains only one occurrence of a particular character in rails

I need to write a rails active record where clause where I have to fetch those rows where name (name is a column in my table) contains only one occurrence of the character '.'
For example, if there is two rows in the table where name is "a.b" and "a.b.c", then my query should return the row having name "a.b" only.
Please help me to solve this.
Thanks in advance!
You can for example remove the dots and compare length.
SELECT * FROM table WHERE (char_length(name) - char_length(replace(name, '.', '')))=1
This is not very efficient though, because indexes can't be utilized.
To make things smoother, you could store the number of dots (depth?) in its own column with an index and query based on that. This could be done in insert/update trigger or in application layer, whatever suits your situation.
dbfiddle
with regexp ? you can try this :
select * from "table" where "name" ~ '^[^\.]*\.[^\.]*$'

Custom order on string

I have a project model. Projects have a code attribute, which is in AAXXXX-YY format like "AA0001-18", "ZA0012-19", where AA is two characters, XXXX is a progressive number, and YY is the last two digits of the year of its creation.
I need to define a default scope that orders projects by code in a way that the year takes precedence over the other part. Supposing I have the codes "ZZ0001-17", "AA0001-18", and "ZZ002-17", "ZZ001-17" is first, "ZZ002-17" is second, and "AA001-18" is third.
I tried:
default_scope { order(:code) }
but I get "AA001-18" first.
Short answer
order("substring(code from '..$') ASC, code ASC")
Wait but why?
So as you said, you want to basically sort by 2 things:
the last 2 characters in the code string. YY
the rest of the code AAXXXX-
So first things first,
the order function as per Rails documentation will take the arguments you added and use them in the ORDER BY clause of the query.
Then, the substring function according to the documentation of PostgreSQL is:
substring(string from pattern)
If we want 2 characters .. from the end of the string $ we use ..$
Hence, substring(code from '..$')
For more information about pattern matching please refer to the documentation here.
Now finally, with the second part of our ordering the code which already will act as a sorter for all the preceding characters AAXXXX-.

Rails – assign an order number to each record

So I am importing passages from a book into my application. I am giving all the passages in a given book the class Passage. i.e. Passage.all
I do have many books so I also have a class Book. Therefore, when I am finding all the passages from one book I call:
Passage.where(book_id: self.book_id)
When I use the where method, does it preserve the "natural order", which Passage.all would generally return. If not, I could change the code to:
Passage.where(book_id: self.book_id).order("created_at ASC")
Anyway, I then proceed to write this code:
a = Passage.where(book_id: self.book_id)
b = a.index(self)+1
self.passage_number = b
[first line: returns all the passages in the book]
[second line: returns their number in the array + 1 to account for the 0 starting value thing (pardon the colloquialism)]
[third line: assigns that index value to the passage number]
Ultimately, I am trying to compute passage numbers, without having to hard code them.
SO WHAT'S MY ISSUE? Right now I am getting three passage #3's, and two passage #4's. My last passage is this:
Passage.last.passage_number = 217
Passage.where(book_id: 5).count = 241
It is skipping numbers and incorrectly indexing, so I think I need to code a better method! What's a better way to index an array in this context?
There is no such thing as "natural order": without an order clause Passage.all may return things in any order the database wants (which could depend on things like location of items on disk, query plan etc).
The first and last methods are special in that they order by id if your relation does not already have an order applied to it.
If you need things in a specific order then add an order clause.

Resources