How can I calculate successive elements in a database using Rails? - ruby-on-rails

Is there a way I can query the database to return only the elements that have happened successively (in time) according to a set variable?
Specifically, I need to find out how many times the user has won a game in a row. The games are stored in the database with an attribute win_loss (1 is win, 0 is loss).
I would like to see if the user has won 3 games in a row. Is there a way to do this in the database?
If not, what would it look like in the application?
I apologize ahead of time if this is confusing. Please ask questions and I will try to clear it up. I'm using Ruby on Rails.

I would take a different approach: add a column consecutive_wins to your User model (negative numbers could represent consecutive losses if you need that as well).
EDIT: if you really need to query it...
Two queries:
select the newest win and newest loss for the given user (this could be done in one query; in SQL: SELECT max(created_at) AS newest, win_loss FROM games WHERE user_id = ? GROUP BY win_loss; I can't recall how to do grouping functions in Rails right now, but you should get the idea)
and then, regarding the result of query (1):
if both a newest win and a newest loss are found, count the games with created_at > newest_loss (every game after the most recent loss is a win, so that count is the current streak)
if only a newest win or only a newest loss is found, the user has always won or always lost, so just count the number of their games
in any other case there were no games
I still recommend adding the column to the User model; it will be better when you need to display a whole table of players and their consecutive wins/losses.
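For the query route, a rough sketch of the idea above (this assumes a Game model with user_id, win_loss (1 = win, 0 = loss) and created_at columns, and Rails 3+ query syntax; adjust to your schema):

def consecutive_wins(user_id)
  newest_loss = Game.where(user_id: user_id, win_loss: 0).maximum(:created_at)

  streak = Game.where(user_id: user_id, win_loss: 1)
  streak = streak.where("created_at > ?", newest_loss) if newest_loss
  streak.count  # every win after the most recent loss belongs to the streak
end

If the user's most recent game was a loss, the count is 0; if they have never lost, it simply counts all of their wins.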

Related

How to implement saving of order of todos / list?

I am developing an application which is very similar to todo list in its nature, except order of todos matters and can be changed by user.
What's a good way to save this order in the DB without having to re-save the whole todo list every time the order changes?
I am developing in Rails, Postgres and React, newest versions.
I am thinking of saving it as an array on the user's todos (there can be multiple users of the application), but that could complicate things a little, as every time I create a todo I would also have to save the list.
You can look into the acts_as_list gem; for this you'll have to add an additional position column to your table. It does mass updates on the records when the order changes, but the gem is frequently updated.
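A minimal sketch of the acts_as_list route, assuming you add an integer position column to todos and that todos belong to a user (current_user here stands in for whatever your auth layer provides):

class Todo < ApplicationRecord
  belongs_to :user
  acts_as_list scope: :user  # keep positions per user
end

# Reordering: the gem shifts the affected neighbours' positions for you.
todo = current_user.todos.find(params[:id])
todo.insert_at(3)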
If you want an optimised solution that minimises the number of updates when the list changes, then you should check out the ranked_model gem, although that one is not frequently updated. Here is a brief description of how it works:
This library is written using ARel from the ground-up. This leaves the code much cleaner than many implementations. ranked-model is also optimized to write to the database as little as possible: ranks are stored as a number between -2147483648 and 2147483647 (the INT range in MySQL). When an item is given a new position, it assigns itself a rank number between two neighbors. This allows several movements of items before no digits are available between two neighbors. When this occurs, ranked-model will try to shift other records out of the way. If items can't be easily shifted anymore, it will rebalance the distribution of rank numbers across all members of the ranked group.
You can refer to this gem and make your own implementation, as it only supports Rails 3 & 4.
This was a bit of a head scratcher but here is what I figured:
create table orderedtable (
pk SERIAL PRIMARY KEY,
ord INTEGER NOT NULL,
UNIQUE(ord) DEFERRABLE INITIALLY DEFERRED
)
DEFERRABLE INITIALLY DEFERRED is important so that intermediate states don't cause constraint violations during reordering.
INSERT INTO orderedtable (ord) VALUES (1),(2),(3),(4),(5),(10),(11)
Note that when inserting in this table it would be more efficient to leave gaps between ord values so as to minimize the amount of order values that need to be shifted when inserting or moving rows later. The consecutive values are for demonstration purposes.
Here's the trick: You can find a consecutive sequence of values starting at a particular value using a recursive query.
So for example, let's say you wanted to insert or move a row just above position 3. One way would be to move rows currently at position 4 and 5 up by one to open up position 4.
WITH RECURSIVE consecutives(ord) AS (
SELECT ord FROM orderedtable WHERE ord = 3+1 --start position
UNION ALL
SELECT orderedtable.ord FROM orderedtable JOIN consecutives ON orderedtable.ord=consecutives.ord+1 --recursively select rows one above, until there is a hole in the sequence
)
UPDATE orderedtable
SET ord=orderedtable.ord+1
FROM consecutives
WHERE orderedtable.ord=consecutives.ord;
The above renumbers the ord from 1,2,3,4,5,10,11 to 1,2,3,5,6,10,11 leaving a hole at 4.
If there was already a hole at ord=4, the above query wouldn't have done anything.
Then just insert or move another row by giving it the now free ord value of 4.
You could push rows down instead of up by changing the +1s to -1s.

change a sort order field in a table using entity framework 6

I have a table with three fields: Id, location, sortorder.
Id  location  sortorder
--  --------  ---------
1   a         1
2   b         2
3   c         3
4   d         4
I want the user to be able to amend the sort order of the items in the table. I'm using EF to write to the database; is there any way of amending the sort order on the table without making loads of calls to the database?
If I move an item to the top of the list from the bottom I would need to update all the rows that were underneath that new row, to move them down the order. If possible I would like to avoid n updates to the database, and just do it in the least number possible.
Is this possible?
I believe Gert's suggestion of using floats for sort order is probably the best one to go with. Drupal uses weights on menu items for the same purpose, but inserts at increments of 100 or 1000 so you can fit things in between. I think it can also run a cron job to re-space the ordering so you don't run out of room in a more efficiently stored data type, but that sounds like a holdover from my BASIC days in middle school, where you had to do the same with line numbers.
Also, I would wager that it isn't actually as awful as running n updates because it's instead doing one update that affects n rows. Yes, at the end of the day it does have to change n rows but that's on the DB side so there are tons of efficiencies that can be implemented to speed it up.
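To make the gap idea concrete, here is a small illustration (plain Ruby rather than EF, and the numbers are arbitrary): the sort key is kept sparse, so a move rewrites only the row that moved, with an occasional re-spacing pass.

# Illustration only: sparse numeric sort keys mean a reorder touches one row.
items = [
  { id: 1, sort: 1000.0 },
  { id: 2, sort: 2000.0 },
  { id: 3, sort: 3000.0 },
]

# Move item 3 between items 1 and 2: its new key is the midpoint of its new
# neighbours; nothing else changes.
items[2][:sort] = (items[0][:sort] + items[1][:sort]) / 2.0  # => 1500.0

# Occasionally (e.g. from a scheduled job) re-space the keys so the gaps
# never shrink to nothing.
items.sort_by { |i| i[:sort] }.each_with_index do |item, idx|
  item[:sort] = (idx + 1) * 1000.0
end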

How to efficiently fetch n most recent rows with GROUP BY in sqlite?

I have a table of event results, and I need to fetch the most recent n events per player for a given list of players.
This is on iOS so it needs to be fast. I've looked at a lot of top-n-per-group solutions that use subqueries or joins, but these run slow for my 100k row dataset even on a macbook pro. So far my dumb solution, since I will only run this with a maximum of 6 players, is to do 6 separate queries. It isn't terribly slow, but there has to be a better way, right? Here's the gist of what I'm doing now:
results_by_pid = {}
player_ids = [1, 2, 3, 4, 5, 6]
n_results = 6
for pid in player_ids:
    results_by_pid[pid] = exec_sql("SELECT *
                                    FROM results
                                    WHERE player_id = #{pid}
                                    ORDER BY event_date DESC
                                    LIMIT #{n_results}")
And then I go on my merry way. But how can I turn this into a single fast query?
There is no better way.
SQL window functions, which might help, are not implemented in SQLite.
SQLite is designed as an embedded database where most of the logic stays in the application.
In contrast to client/server databases where network communication should be avoided, there is no performance disadvantage to mixing SQL commands and program logic.
A less dumb solution requires you to do some SELECT player_id FROM somewhere beforehand, which should be no trouble.
To make the individual queries efficient, ensure you have one index on the two columns player_id and event_date.
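If it helps, that index could be created through the same exec_sql helper used in the question (the index name here is made up):

# Composite index so each per-player query can read its newest rows directly
# instead of scanning the whole results table.
exec_sql("CREATE INDEX IF NOT EXISTS idx_results_player_date
          ON results (player_id, event_date)")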
This won't be much of an answer, but here goes...
I have found that making things really quick can involve ideas from the nature of the data and schema themselves. For example, searching an ordered list is faster than searching an unordered list, but you have to pay a cost up front - both in design and execution.
So ask yourself if there are any natural partitions on your data that may reduce the number of records SQLite must search. You might ask whether the latest n events fall within a particular time period. Will they all be from the last seven days? The last month? If so then you can construct the query to rule out whole chunks of data before performing more complex searches.
Also, if you just can't get the thing to work quickly, you can consider UX trickery! Soooooo many engineers don't get clever with their UX. Will your query be run as the result of a view controller push? Then set the thing going in a background thread from the PREVIOUS view controller, and let it work while iOS animates. How long does a push animation take? .2 seconds? At what point does your user indicate to the app (via some UX control) which playerids are going to be queried? As soon as he touches that button or TVCell, you can prefetch some data. So if the total work you have to do is O(n log n), that means you can probably break it up into O(n) and O(log n) pieces.
Just some thoughts while I avoid doing my own hard work.
More thoughts
How about a separate table that contains the ids of the previous n inserts? You could add a trigger to delete old ids if the size of the table grows above n. Say:
CREATE TABLE IF NOT EXISTS recent_results
(result_id INTEGER PRIMARY KEY, event_date DATE);
-- SQLite has no strict DATE type (columns just have affinity), but you get the point

CREATE TRIGGER IF NOT EXISTS optimizer
AFTER INSERT ON recent_results
WHEN (SELECT COUNT(*) FROM recent_results) > N  -- replace N with the history size you want to keep
BEGIN
  DELETE FROM recent_results
  WHERE result_id = (SELECT result_id
                     FROM recent_results
                     ORDER BY event_date ASC
                     LIMIT 1);
END;
-- or something like that. I have no idea if this will work, I just threw it together.
Or you could just create a temporary memory-based table that you populate at app load and keep up to date as you perform transactions during app execution. That way you only pay the steep price once!
Just a few more thoughts for you. Be creative, and remember that you can usually define what you want as a data structure as well as an algorithm. Good luck!

How to group similar items in an activity feed

For a social network site, I have an activity feed of events from people you follow, and I'd like to group similar types of events made within a short timeframe together, for a more compact feed. Imagine how Facebook displays a comma separated list when you 'like' several things in rapid succession: 'Joe likes beer, football and chips.'
I understand using the group_by method on ActiveRecord Enumerable results, but there needs to be some initial work done populating a property that I can group by later. My questions deal with both storing activity data in a way that these groupings can be marked, and then later retrieving them again.
Right now I have an Activity model, which is a join association between the user that committed the activity and the item that it's linked to (in my example above, assume 'beer', 'football' and 'chips' are records of a Like model). There are other activity types aside from 'likes' too (events, saving favorites, etc). What I'm considering is: as this association is created, a check is made of when the last association of that type was created, and if that was more than a certain time period ago, an 'activity block' counter that is part of the Activity model is incremented. Later, when rendering this activity feed, I can group by user, then type, then this activity block counter.
Example: Let's say 2 blocks of updates are made within the same day. A user likes 2 things at 2:05 and later 3 more things at 5:45. After the third update (the start of the 2nd block) happens at 5:45, the model detects too much time has passed and increments its activity block counter by 1, thus forcing any following updates into a new block when they are rendered via a group_by call:
2:05 Joe likes beer nuts and Hooters.
5:45 Joe likes couches, chips and salsa.
7:00 Joe is attending the Football Viewing Party At Joe's
My first question: What's an efficient way to increment a counter like this? It's no longer auto_increment, so the easiest thing I can think of is looking at the counter for the last record as a reference point. However, this couldn't be from the same query that checked for when the last update of that type was made, since a later update of another type could have already received the next counter value. They don't have to be globally unique, but that would be nice.
The other overall strategy I thought of was another model Called ActivityBlock, that joins groups of similar activities together. In many cases, updates will be isolated by themselves though, so this seems a little inefficient to have one record for each individual activity.
Do either of these seem like a solid strategy?
My final question revolves around pagination. Now that we're dealing with blocks, it's harder to always display exactly a certain number of entries before pagination kicks in. Either an individual (isolated) Activity update or a block of them should count as just 1, so at the lowest layer of my group_by I can incorporate a counter to track how many rows I've displayed, but this means I can't just make one DB query anymore and simply specify a limit statement. Is there any way I could still do this without repeatedly performing additional SQL queries until I've reached my page limit?
This would be one advantage of the ActivityBlock model approach, since I could easily apply a limit call to that, and blocks could contain an auto increment counter as well.
Check out http://railscasts.com/episodes/406-public-activity
He also posted one on how to do it from scratch in episode 407 (it's a Pro episode though).
You could use the epoch time, or a variation of it, as the counter, since that's semi-unique and deterministic.
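One way to read that suggestion is to derive the block key from fixed windows of epoch time rather than a stored counter. A rough sketch of the idea (the model and column names here are assumptions, not from the question):

WINDOW_SECONDS = 15 * 60  # 15-minute blocks; tune to taste

def grouped_feed(activities)
  # Activities by the same user, of the same type, created in the same
  # 15-minute window collapse into one group.
  activities.group_by do |a|
    [a.user_id, a.trackable_type, a.created_at.to_i / WINDOW_SECONDS]
  end
end

Each group then renders as a single line ("Joe likes beer, football and chips"), and because the key is derived from created_at, nothing extra has to be stored or incremented.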

Does Ruby on Rails "has_many" array provide data on a "need to know" basis?

On Ruby on Rails, say, if the Actor model object is Tom Hanks, and the "has_many" fans is 20,000 Fan objects, then
actor.fans
gives an Array with 20,000 elements. Probably, the elements are not pre-populated with values? Otherwise, getting each Actor object from the DB can be extremely time consuming.
So it is on a "need to know" basis?
So does it pull data when I access actor.fans[500], and again when I access actor.fans[0]? If it jumps from record to record, it won't be able to optimize performance by doing sequential reads, which can be faster on a hard disk because those records could be in nearby sectors / platter layers. For example, if the program touches 2 random elements, it is faster to read just those 2 records; but if it touches all elements in random order, it may be faster to read all records sequentially and then process them. But how will RoR know whether I am touching only a few random elements or all of them?
Why would you want to fetch 50,000 records if you only use 2 of them? Fetch only those two from the DB. If you want to list the fans, then you will probably use pagination, i.e. use limit and offset in your query, or some pagination gem like will_paginate.
I see no logical explanation for why you would go the way you're trying to. Explain a real situation so we can help you.
However, there is one thing you need to know while loading many associated objects from the DB: use :include, like
Actor.all(:include => :fans)
this will eager-load all the fans, so there will be only 2 queries instead of N+1, where N is the number of actors
Look at the SQL which is spewed out by the server in development mode, and that will tell you how many fan records are being loaded. In this case actor.fans will indeed cause them all to be loaded, which is probably not what you want.
You have several options:
Use a paginator as suggested by Tadas;
Set up another association with the fans table that pulls in just the ones you're interested in. This can be done either with a :conditions option on the has_many statement, e.g.
has_many :fans, :conditions => "country_of_residence = 'UK'"
or by specifying the full SQL to narrow down the rows returned with the :finder_sql option,
or by specifying the :limit option, which will, well, limit the number of rows returned.
All depends on what you want to do.
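For the association-with-conditions and :limit options combined, a small example in the same old-style syntax this answer uses (the column name and limit are made up):

class Actor < ActiveRecord::Base
  has_many :fans
  # A narrower association: only UK fans, and at most 100 of them.
  has_many :uk_fans, :class_name => "Fan",
                     :conditions => "country_of_residence = 'UK'",
                     :limit => 100
end

actor.uk_fans  # loads up to 100 Fan rows instead of all 20,000 fans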
