I'm building a Rails application with a dashboard composed of a sorted collection of cells. The ultimate goal is to allow the user to arrange the cells and have that persisted to the database, but I'm unable to fathom the architecture required to make this happen.
I'm less concerned about the UI/UX of dragging and dropping cells, and more concerned about the models required to represent this in a SQL database with ActiveRecord.
Any help would be appreciated. Thanks!
This is a pretty well-solved problem; there are numerous gems that will handle it for you.
Typically you'd add a "position" integer column to the table and sort by it when you select records. When you want to move an item A to a new position just after item B, you first add 1 to the position of all records sorted after B to open a space for A, and then set A's position to B.position + 1. This way, a reorder involves only two UPDATE statements.
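The two-statement move described above can be sketched like this. This is a minimal, hedged illustration using SQLite from Python; the "cells" table and its columns are made up for the example, not taken from the asker's schema:

```python
import sqlite3

# Hypothetical "cells" table with a position column (illustrative names only).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE cells (id INTEGER PRIMARY KEY, position INTEGER)")
con.executemany("INSERT INTO cells (id, position) VALUES (?, ?)",
                [(1, 1), (2, 2), (3, 3), (4, 4)])

def move_after(con, moved_id, target_id):
    """Move one cell so it sits immediately after another, in two UPDATEs."""
    (target_pos,) = con.execute(
        "SELECT position FROM cells WHERE id = ?", (target_id,)).fetchone()
    # Statement 1: open a gap after the target...
    con.execute("UPDATE cells SET position = position + 1 WHERE position > ?",
                (target_pos,))
    # Statement 2: ...and drop the moved cell into it.
    con.execute("UPDATE cells SET position = ? WHERE id = ?",
                (target_pos + 1, moved_id))

move_after(con, 1, 3)  # move cell 1 to just after cell 3
order = [r[0] for r in con.execute("SELECT id FROM cells ORDER BY position")]
print(order)  # [2, 3, 1, 4]
```

Note that positions don't need to stay contiguous for the ORDER BY to work; the gap left behind at the old position is harmless.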
Related
Ok, so I have like 7 cost tables. The idea would be to create a flat table that is essentially all of those costs on a single table. At which point we can feed that to a front end, and when a person picks a specific item, they can see all costs associated with that item.
I have an ItemInfo table, which defines all potential items that may have costs. I then have 7 cost tables that define the individual costs incurred during 7 different phases of that item's production.
So, just starting with two of those tables, I have joined the ItemInfo table to the Cost1 table on ItemID. If I execute that SQL, I get a new table that shows each cost accrued in the first phase, along with the relevant bits of data from both tables.
My issue is when I bring the next table in, Cost2.
The Cost1 table has 6,999 entries.
The Cost2 table has 13,743 entries.
When I join the ItemInfo table to the Cost2 table as well, the resulting table is massive.
I have tried inner joins, left joins, right joins, outer joins, etc. Regardless of the type of join, I do not get 20,742 entries, which would be the accurate number if both cost tables were fully represented. I have not even attempted moving on to Cost3 through Cost7, as I can't get the first two to display properly.
I suspect the answer may lie in grouping, but I'm not sure how to do that in a way that retains the individual cost items from each phase.
I thought I understood joins fairly well, and I think I do when it's just 2 tables. What I don't understand is this: if I tell the first 2 tables to only grab the matching items between them, and then tell the second pair of tables to do the same thing, why does it then seem to match Cost1 to Cost2, even though the ItemInfo table is the only one I am trying to link them to?
I would like to create a simple app in Xcode with two UIPickerViews that reference a data set where the second UIPickerView is dependent on the first one. I want to create an app where the user can select the manufacturer of a vehicle; Chevrolet, Dodge, Ford, etc. Then, the user can select the vehicle based on the first choice. For example if "Ford" was selected in the first UIPickerView, then only Ford vehicles show up in the second - F150, Focus, Mustang etc. After selecting both values, the user can search for the average price where the prices are kept in a data set. I found many examples with one UIPickerView referencing arrays, but I want to reference a much larger data set. How would I go about doing this? I am fairly new to Xcode, but I write SAS and SQL code daily.
I am assuming you have all of the records saved in the database. I did something similar with 250k+ records.
Do not fetch your models' full representation into memory; fetch only one property (the string column needed for the current picker) with a DISTINCT on it - both SQLite and Core Data allow this.
Your subsequent pickers (2nd, 3rd, and so on) will automatically see less data because of the filters already applied (only Ford vehicles remain as possible options).
Rule #1 applies to all of your pickers: pull only the relevant field into memory as a String, with the right filters.
I had no issues at all with the above approach on my dataset. Not sure how big yours is.
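The three queries behind this approach can be sketched as follows. This is a language-neutral SQLite sketch (the asker would express the same queries in Swift via SQLite or Core Data fetch requests); the vehicles table, its columns, and the sample data are assumptions, not the real dataset:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vehicles (manufacturer TEXT, model TEXT, price REAL)")
con.executemany("INSERT INTO vehicles VALUES (?, ?, ?)", [
    ("Ford", "F150", 35000), ("Ford", "F150", 37000),
    ("Ford", "Mustang", 42000), ("Chevrolet", "Silverado", 39000),
])

# Picker 1: only the distinct manufacturer strings, never whole rows.
makers = [r[0] for r in con.execute(
    "SELECT DISTINCT manufacturer FROM vehicles ORDER BY manufacturer")]

# Picker 2: distinct models, filtered by the first selection.
models = [r[0] for r in con.execute(
    "SELECT DISTINCT model FROM vehicles WHERE manufacturer = ? ORDER BY model",
    ("Ford",))]

# Final lookup: average price for the chosen pair.
(avg_price,) = con.execute(
    "SELECT AVG(price) FROM vehicles WHERE manufacturer = ? AND model = ?",
    ("Ford", "F150")).fetchone()
print(makers, models, avg_price)
```

Each picker only ever holds a short list of strings in memory, which is why this scales to large datasets.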
Here's the situation, in the source database we have more than 600K active rows for a dimension but in reality the business only uses 100 of them.
Unfortunately the list of values that they might use is not known and we can't manually filter on those values to populate the dimension table.
I was thinking: what if I include the dimension columns for that table in the fact table, and then when we send that to the staging area, just separate it from the fact and send it to its own table?
This way, I will only capture the values that are actually used.
P.S. They have a search function in the application that helps users navigate through the 600K values; it's not a drop-down field!
Do you have a better recommendation?
Yes - you could build the Dimension from the fact staging table. A couple of things to consider:
If the only attribute for the Dimension is the field in the fact staging table then you can keep this as a degenerate dimension in the fact table; no need to build a dimension table for it - unless you have other requirements that require a standalone dimension table, such as your BI tool needs it.
If there are other attributes you need to include in the dimension then you are still going to need to bring in the source dimension table - but you can filter it using the values in the fact staging table and only load the used values into your dimension.
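The second option can be sketched as a single filtered INSERT...SELECT. The table and column names below are illustrative, not from the actual warehouse:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE src_dim (dim_key INTEGER PRIMARY KEY, attr TEXT);
CREATE TABLE fact_staging (fact_id INTEGER, dim_key INTEGER);
CREATE TABLE dim (dim_key INTEGER PRIMARY KEY, attr TEXT);
INSERT INTO src_dim VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
INSERT INTO fact_staging VALUES (100, 2), (101, 2), (102, 4);
""")

# Load only the dimension members that actually appear in the fact staging
# table, pulling their full attributes from the 600K-row source dimension.
con.execute("""
    INSERT INTO dim (dim_key, attr)
    SELECT dim_key, attr FROM src_dim
    WHERE dim_key IN (SELECT DISTINCT dim_key FROM fact_staging)
""")
loaded = con.execute("SELECT dim_key FROM dim ORDER BY dim_key").fetchall()
print(loaded)  # [(2,), (4,)]
```

Run incrementally, this keeps the dimension limited to the ~100 values the business actually uses while still carrying the extra attributes.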
I'm trying to design my first data mart with a star schema, starting from an Excel sheet containing information about Help Desk service calls. The sheet contains 33 fields with different kinds of information, and I can't identify the fact table, because I want to do the reporting later based on different KPIs.
I want to know how to identify the fact table measures easily, and I have another question: can a fact table contain only foreign keys of dimensions and no measures? Thanks in advance guys, and sorry for my bad English.
You can have more than one fact table.
A fact table represents an event or process that you want to analyze.
The structure of the fact tables depends on the process or event that you are trying to analyze.
You need to tell us the events or processes that you want to analyze before we can help you further.
Can a fact table contain only foreign keys of dimensions and no measures?
Yes. This is called a factless fact table.
Let's say you want to do a basic analysis of calls:
Your full table might look like this
CALL_ID
START_DATE
DURATION
AGENT_NAME
AGENT_TENURE (how long worked for company)
CUSTOMER_NAME
CUSTOMER_TENURE (how long a customer)
PRODUCT_NAME (the product the customer is calling about)
RESOLVED
You would turn this into a fact table like this:
CALL_ID
START_DATE_KEY
AGENT_KEY
CUSTOMER_KEY
PRODUCT_KEY
DURATION (measure)
RESOLVED (quasi-measure)
And you would have a DATE dimension table, AGENT dimension table, CUSTOMER dimension table and PRODUCT dimension table.
Agile Data Warehouse Design is a good book, as are the ones by Kimball.
In general, the way I've done it (and there are a number of ways to do anything) is that categorical data is referenced with a foreign key in the fact table, but anything you want to perform aggregations on (typically money, integer, or double columns) can live in the fact table as well. For example, a fact table might reference a hierarchy of types, such as product_category >> product_name, and it usually contains a time and/or location field as well; all of these would be referenced by a foreign key to a lookup table. The measure columns are usually integer or money data, and are used in aggregate functions grouped by the other fields, like this:
select sum(measureOne) as sum, product_category from facttable
where timeCol between X and Y group by product_category...etc
At one time a few years ago, I did have a fact table that had no measure column... because the only measure I had was based on count, which I would do dynamically by grouping different dimensions in the fact table.
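A factless fact table like the one described can still drive useful analysis through COUNT alone. Here is a minimal sketch with an invented schema (the real table and keys would differ):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fact_call (date_key INTEGER, agent_key INTEGER, product_key INTEGER);
INSERT INTO fact_call VALUES (20240101, 1, 10), (20240101, 1, 11),
                             (20240101, 2, 10), (20240102, 1, 10);
""")

# No measure column at all: the "measure" is produced dynamically by
# counting rows grouped by whichever dimension keys you care about.
calls_by_agent = con.execute("""
    SELECT agent_key, COUNT(*) AS calls
    FROM fact_call
    GROUP BY agent_key
    ORDER BY agent_key
""").fetchall()
print(calls_by_agent)  # [(1, 3), (2, 1)]
```

Swapping `agent_key` for `date_key` or `product_key` in the GROUP BY gives the other counts, with no change to the table itself.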
In my application, users can rearrange their favorite books in whatever order they choose.
I have a "books" table in my database with a row for each book. Currently, there's an integer column called "position" that stores the position of each book: 1 for the top book, 2 for the next one, etc.
The problem is that if someone drags a book from, say, position #11000 to position #1, I then have to make 11,000 updates to the database. This seems inefficient. Is there a better way to do this?
One idea I've had would be just to have another table called "book_sort_orderings" or something, with a row for each user. And one column would be a huge text column that stores a sorted list of book ids. Then when the user rearranges the books, I can pull this value out into my code, perform the rearrangement there, and update the database row. Of course, any time a book is added or deleted I'd have to update this array as well. Is this the "right" way to go about things? Or is there something clever I can do to speed things up without changing my current setup?
You'd be surprised how fast a decent DBMS can update 11,000 rows, assuming you do it in a "bulk" fashion (as opposed to making a separate database round-trip for each row).
But if you want to avoid that, use the old BASIC trick (from the time BASIC still had line numbers): leave gaps!
Instead of using positions: 1, 2, 3, 4, 5 etc... use 10, 20, 30, 40, 50 etc....
So when you need to move (say) the first item to the next-to-last place, just change 10 to 41 and you end up with: 20, 30, 40, 41, 50, etc. Obviously, you'll need to do some fiddling when a gap gets completely filled, but this strategy should almost eliminate massive UPDATEs.
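The gap strategy can be sketched in a few lines. This is plain Python standing in for the UPDATEs you would run against the database; the item names and gap size of 10 are illustrative:

```python
# Positions are spaced out so a move usually touches a single row.
positions = {"a": 10, "b": 20, "c": 30, "d": 40, "e": 50}

def move_between(positions, item, before, after):
    """Place `item` between two neighbours by picking a midpoint position."""
    lo, hi = positions[before], positions[after]
    if hi - lo < 2:
        # Gap exhausted: renumber everything with fresh gaps (the rare case).
        for i, key in enumerate(sorted(positions, key=positions.get), start=1):
            positions[key] = i * 10
        lo, hi = positions[before], positions[after]
    positions[item] = (lo + hi) // 2  # single-row update in the common case

move_between(positions, "a", "d", "e")  # move "a" to next-to-last place
order = sorted(positions, key=positions.get)
print(order)  # ['b', 'c', 'd', 'a', 'e']
```

The renumbering branch is the "fiddling" mentioned above; it runs rarely, so the amortized cost per move stays close to one write.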
The other possibility is to implement a doubly-linked list: instead of a position, keep the IDs of the previous and next items. Reordering is then a matter of "re-linking" IDs, much as you would in an in-memory list. Unfortunately, this also prevents the DBMS from sorting the items directly (at least without awkward and probably inefficient recursive queries) - you'd have to do the sorting at the application level - so I'd recommend against it.
And one column would be a huge text column that stores a sorted list of book ids.
Please don't do that. You'd be violating the 1NF and there are very good reasons not to do that, including data consistency and performance (you'd have to rewrite the whole field for any single change to any portion of it).
Your current solution does not seem to work for multiple users' settings. If the books' order is set in the Book table, wouldn't it be the same for all users?
As others have mentioned, it's typically best to keep your data normalized, which would require adding another table like you're suggesting. You could have a new BookOrdering table with book_id, user_id, and position columns. That way, for every user and every book there is an assigned position.
So there would be a default ordering (which would not be stored in this table), but users would have the ability to change the order. The table would only record changes from default. When you want to load the user's books, you'd first check this table for a certain user_id, and then shift/adjust the order accordingly.
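The override scheme can be sketched like this. The table names, the `id + 1000` default-ordering trick, and the sample data are all assumptions for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE book_orderings (user_id INTEGER, book_id INTEGER, position INTEGER,
                             PRIMARY KEY (user_id, book_id));
INSERT INTO books VALUES (1, 'A'), (2, 'B'), (3, 'C');
-- User 7 has moved book 3 to the top; books 1 and 2 keep the default order.
INSERT INTO book_orderings VALUES (7, 3, 0);
""")

# COALESCE falls back to a default sort key (here: book id, offset so
# overridden positions always sort first) when no per-user row exists.
titles = [r[0] for r in con.execute("""
    SELECT b.title
    FROM books b
    LEFT JOIN book_orderings o ON o.book_id = b.id AND o.user_id = ?
    ORDER BY COALESCE(o.position, b.id + 1000)
""", (7,))]
print(titles)  # ['C', 'A', 'B']
```

Users who never reorder anything contribute zero rows to the overrides table, which keeps it small.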
It's not really hard to update all those rows in your example with a couple of SQL statements. You don't need to fire 11,000 separate updates at your DBMS (which I assume is what you were trying to say).
First, update all the books that are being shuffled forward one position:
UPDATE book
SET position = position + 1
WHERE position < 11000
AND position >= 1
...and then set the position of the book you're moving:
UPDATE book
SET position = 1
WHERE id = whatever