Implement autocomplete on MongoDB


Say I have a collection of users and want to implement autocomplete on the usernames of those users. I looked at the mongodb docs and $regex seems to be one way to do this. Is there a better way? By better I mean more performant/better practice.

As suggested by @Thilo, there are several approaches, including prefix matching.
The most important thing is that the request is very quick (autocomplete should feel instantaneous), so the query has to make proper use of an index.
With a regexp: use /^prefix/ (the important part is the ^, which anchors the match at the beginning of the string; without it the query cannot use the index efficiently).
A range query works well too: { $gte : 'jhc', $lt : 'jhd' }
More complicated but faster : you can store prefix-trees in mongo (aka tries) with entries like :
{usrPrefix : "anna", compl : ["annaconda", "annabelle", "annather"]}
{usrPrefix : "ann", compl : ["anne", "annaconda", "annabelle", "annather"]}
This last solution is very fast (with an index on usrPrefix, the field you look up, of course) but not space-efficient at all. You have to choose the trade-off.
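A minimal sketch of how such prefix entries could be generated, in plain Ruby with no database involved. The helper name prefix_entries is made up for illustration; only the usrPrefix and compl field names come from the entries above.

```ruby
# Hypothetical helper: map every prefix of every username to the
# usernames that start with it, shaped like the documents above.
def prefix_entries(usernames)
  entries = Hash.new { |hash, key| hash[key] = [] }
  usernames.each do |name|
    # Register the name under each of its prefixes: "a", "an", "ann", ...
    (1..name.length).each do |len|
      entries[name[0, len]] << name
    end
  end
  entries.map { |prefix, names| { "usrPrefix" => prefix, "compl" => names.sort } }
end

docs = prefix_entries(%w[anne annaconda annabelle annather])
anna = docs.find { |doc| doc["usrPrefix"] == "anna" }
ann  = docs.find { |doc| doc["usrPrefix"] == "ann" }
```

Every username is stored once per prefix, which is where the space cost comes from: four short names already produce 19 entries here.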

We do it using a regex, and it's fast as long as you have an index and use /^value/.
Be aware that a case-insensitive regex cannot use the index efficiently, so you may want to store a lower-case version of the string in another field of the document and use that for the autocomplete.
I've tested this with 3 million+ documents and it still appears instantaneous.
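A rough sketch of the lower-case-field idea in plain Ruby. The field name username_lower and both helper names are assumptions made for illustration, not anything from the driver or from our actual code.

```ruby
# Hypothetical: add a lower-cased copy of the username before saving the document.
def with_search_key(doc)
  doc.merge("username_lower" => doc.fetch("username").downcase)
end

# Hypothetical: build the anchored, case-sensitive pattern to run against
# the lower-cased field. Regexp.escape keeps characters such as "." literal.
def prefix_pattern(input)
  /^#{Regexp.escape(input.downcase)}/
end

doc = with_search_key("username" => "JHCarter")
pattern = prefix_pattern("JHc")
```

The user's input is lower-cased the same way as the stored field, so the regex itself can stay case-sensitive and anchored.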

If you are looking for prefixes, you could use a range query (not sure about the exact syntax):
db.users.find({ username: { $gte: 'jhc', $lt: 'jhd' } })
Using $gte rather than $gt also matches a username that is exactly 'jhc'. You will want an index on the username field.
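A small sketch of how the two bounds could be derived from whatever the user has typed, in plain Ruby. The helper name prefix_range is made up; it bumps the last character by one code point to get the exclusive upper bound, and uses $gte for the lower bound so that a username equal to the prefix is included.

```ruby
# Hypothetical helper: build a range filter that matches every string
# starting with `prefix`.
def prefix_range(prefix)
  raise ArgumentError, "prefix must not be empty" if prefix.empty?
  # "jhc" -> "jhd": increment the last character by one code point.
  upper = prefix[0...-1] + (prefix[-1].ord + 1).chr(Encoding::UTF_8)
  { '$gte' => prefix, '$lt' => upper }
end

range = prefix_range("jhc")

# The same comparison applied in memory, to show what the filter selects.
usernames = %w[jhb jhc jhcarter jhcz jhd ji]
matches = usernames.select { |u| u >= range['$gte'] && u < range['$lt'] }
```

The resulting hash has the same shape as the filter on the username field in the query above.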

Related

Directly return a table entry from a (simplest) function in Lua

I wanted to write the simplest possible function which let me return the desired value in a nameless table and, ideally, it should be something like this:
function RL_MyTool:Version(n)
return {"0.4.0", "20221003-0230", "13.5.5"}[n]
end
But, of course, that's not allowed in Lua...
So, off the top of my head, I can think of these two other possibilities:
1:
function RL_MyTool:Version(n)
local t = {"20221003-0230", "13.5.5"}
return t[n] or "0.4.0"
end
2:
function RL_MyTool:Version(n)
local n, t = n or 1, {"0.4.0", "20221003-0230", "13.5.5"}
return t[n]
end
The two differ slightly from each other but do the same thing, with the advantage of returning a default value when no argument is given, which is good. BUT... is there still a way to write it in the very simplest form shown above? Basically, I'd like to avoid any variable or table declaration in the function and still return the specified table entry when it is called.
Well, that's all. Of course, if it's not possible after all (as I'm afraid) it won't be the end of the world 🙄, but I wanted to be sure I wasn't missing any Lua trick that would let me do it more like I first imagined... Thanks!
P.S. Oh, I don't see how, but of course if it could be achieved without using a table at all, that would be equally valid or even better.
EDIT: BTW, for the record, and based on @Piglet's (great!) answer, I managed to reduce it even further this way:
function RL_MyTool:Version(n)
return ({"0.4.0", "20221003-0230", "13.5.5"})[n or 1]
end
Improving code usability/maintenance a bit at the same time by avoiding duplicated values... Kind of a win-win-win 😁

Just put the table in parentheses.
function RL_MyTool:Version(n)
return ({"0.4.0", "20221003-0230", "13.5.5"})[n] or "0.4.0"
end
But what is the purpose of this? Code should be easy to read and easy to work on. There is absolutely no reason to not use a local table. You don't have to pay a dollar for each line of code.

Cypher query with literal map syntax & dynamic keys

I'd like to make a cypher query that generates a specific json output. Part of this output includes an object with a dynamic amount of keys relative to the children of a parent node:
{
...
"parent_keystring" : {
child_node_one.name : child_node_one.foo
child_node_two.name : child_node_two.foo
child_node_three.name : child_node_three.foo
child_node_four.name : child_node_four.foo
child_node_five.name : child_node_five.foo
}
}
I've tried to create a cypher query but I do not believe I am close to achieving the desired output mentioned above:
MATCH (n)-[relone:SPECIFIC_RELATIONSHIP]->(child_node)
WHERE n.id='839930493049039430'
RETURN n.id AS id,
n.name AS name,
labels(n)[0] AS type,
{
COLLECT({
child.name : children.foo
}) AS rel_two_representation
} AS parent_keystring
I had planned for children.foo to be a count of how many times each particular relationship/child occurs under the parent. Is there a way to make use of the reduce function, so that a report is generated by analyzing the collected array? That is, the report would be a JSON object where each key is a distinct RELATIONSHIP and each value is the number of times that relationship stems from the parent node.
Thank you greatly in advance for guidance you can offer.

I'm not sure that Cypher will let you use a variable to determine an object's key. Would using an Array work for you?
COLLECT([child.name, children.foo]) AS rel_two_representation
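If the array form works for you, the pairs can be folded into an object on the client side. A minimal sketch in Ruby, assuming the query hands back a list of [name, value] pairs; the helper name and the sample data are made up for illustration.

```ruby
require "json"

# Hypothetical post-processing step: turn [[key, value], ...] pairs into
# the JSON object the question asks for.
def pairs_to_report(pairs)
  { "parent_keystring" => pairs.to_h }
end

pairs = [["child_one", 3], ["child_two", 1]]
report_json = JSON.generate(pairs_to_report(pairs))
```

Note that Array#to_h keeps the last value if the same key appears twice, so this assumes the child names are distinct.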

I think the Neo4j Server API output should be treated like any other database output (MySQL, for example). Even if the desired output can be achieved with the default functionality, it is not a natural fit for a database.
You should probably look into creating your own server plugin, which lets you implement any custom logic and produce the desired output.

Search records having comma-separated values that contain any element from the given list

I have a domain class Schedule with a property 'days' holding comma separated values like '2,5,6,8,9'.
class Schedule {
String days
...
}
Schedule schedule1 = new Schedule(days :'2,5,6,8,9')
schedule1.save()
Schedule schedule2 = new Schedule(days :'1,5,9,13')
schedule2.save()
I need to get the list of the schedules having any day from the given list say [2,8,11].
Output: [schedule1]
How do I write the criteria query or HQL for this? We can prefix and suffix the days with a comma, like ',2,5,6,8,9,', if that helps.
Thanks,

Hope you have a good reason for such denormalization - otherwise it would be better to save the list to a child table.
Otherwise, querying would be complicated. Like:
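def days = [2,8,11]
// note: check for empty days first
Schedule.withCriteria {
    // a single or-block, so that a match on ANY of the days is enough
    or {
        days.each { day ->
            eq('days', day.toString())  // "$day" is the only value
            like('days', "$day,%")      // starts with "$day"
            like('days', "%,$day,%")    // "$day" somewhere in the middle
            like('days', "%,$day")      // ends with "$day"
        }
    }
}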
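To make the matching rule explicit: a schedule should match when at least one of the wanted days appears as a whole element of its comma-separated string, not merely as a substring. A small Ruby sketch of that rule, useful for checking expectations in memory; the helper name is made up, and this is not a replacement for the database query.

```ruby
# Hypothetical in-memory check: does the comma-separated `days_csv`
# contain any of the wanted days as a whole element?
def matches_any_day?(days_csv, wanted)
  stored = days_csv.split(",").map(&:strip)
  (stored & wanted.map(&:to_s)).any?
end

wanted = [2, 8, 11]
schedule1_match = matches_any_day?("2,5,6,8,9", wanted)
schedule2_match = matches_any_day?("1,5,9,13", wanted)
# "12" and "18" contain the digits 2 and 8 but are different days.
substring_match = matches_any_day?("12,18", wanted)
```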

In MySQL there is a SET datatype and FIND_IN_SET function, but I've never used that with Grails. Some databases have support for standard SQL2003 ARRAY datatype for storing arrays in a field. It's possible to map them using hibernate usertypes (which are supported in Grails).
If you are using MySQL, FIND_IN_SET query should work with the Criteria API sqlRestriction:
http://grails.org/doc/latest/api/grails/orm/HibernateCriteriaBuilder.html#sqlRestriction(java.lang.String)
Using SET+FIND_IN_SET makes the queries a bit more efficient than like queries if you care about performance and have a real requirement to do denormalization.

How to sum all properties of a nested collection?

Given User.attachments, where each Attachment has an integer visits attribute holding its visit count,
how can I easily sum the visits of all of that user's images?

Use ActiveRecord's sum calculation (ActiveRecord::Calculations#sum):
user.attachments.sum(:visits)
This should generate an efficient SQL query like this:
SELECT SUM(attachments.visits) FROM attachments WHERE attachments.user_id = ID

user.attachments.map{|a| a.visits}.sum

There's also inject:
user.attachments.inject(0) { |sum, a| sum + a.visits }
People generally (and quite rightly) hate inject, but since the two other main ways of achieving this have been mentioned, I thought I may as well throw it out there. :)

The following works with Plain Old Ruby Objects, and I suspect it is marginally faster than using count += a.visits; plus it has an emoticon in it:
user.attachments.map(&:visits).inject(:+)
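For comparison, here is a small self-contained sketch that runs the in-memory variants on plain Ruby objects. The Struct stand-ins are assumptions for illustration; with real ActiveRecord models the SQL-backed sum is the one that avoids loading every attachment.

```ruby
# Stand-ins for the models (hypothetical, no ActiveRecord involved).
attachment_class = Struct.new(:visits)
user_class = Struct.new(:attachments)

user = user_class.new([3, 5, 7].map { |n| attachment_class.new(n) })

via_map_sum   = user.attachments.map(&:visits).sum
via_block_sum = user.attachments.sum(&:visits)
via_inject    = user.attachments.inject(0) { |sum, a| sum + a.visits }
via_symbol    = user.attachments.map(&:visits).inject(:+)

# One difference worth knowing: without an initial value,
# inject(:+) returns nil for an empty collection, while sum returns 0.
empty_inject = [].inject(:+)
empty_sum    = [].sum
```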

Ruby: Case-Insensitive Array Comparison

Just found out that this comparison is actually case-sensitive. Does anyone know a case-insensitive way of accomplishing the same comparison?
CardReferral.all.map(&:email) - CardSignup.all.map(&:email)

I don't think there is any "direct" way like the minus operator, but if you don't mind getting all your results in lowercase, you can do this:
CardReferral.all.map(&:email).map(&:downcase) - CardSignup.all.map(&:email).map(&:downcase)
Otherwise you'll have to manually do the comparison using find_all or reject:
signups = CardSignup.all.map(&:email).map(&:downcase)
referrals = CardReferral.all.map(&:email).reject { |e| signups.include?(e.downcase) }
I'd suggest that reading a reference of Ruby's standard types might help you come up with code like this. For example, "Programming Ruby 1.9" has all methods of the Enumerable object explained starting on page 487 (find_all is on page 489).
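A self-contained sketch of the reject approach on plain arrays, assuming the emails have already been loaded. The method name is made up; it uses a Set so that each lookup does not scan the whole signup list.

```ruby
require "set"

# Hypothetical helper: referral emails with no case-insensitive match
# among the signups, keeping the referral's original casing.
def emails_not_signed_up(referrals, signups)
  seen = signups.map(&:downcase).to_set
  referrals.reject { |email| seen.include?(email.downcase) }
end

referrals = ["Ann@Example.com", "bob@example.com", "CAROL@example.com"]
signups   = ["ann@example.com", "carol@EXAMPLE.com"]
remaining = emails_not_signed_up(referrals, signups)
```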
