I have data like this:
Sorry, I can't share the data because it's PHI, but it's a pretty basic data frame with those column headers.
Following this guy's video almost exactly, I've created two drop-down lists on this page:
In that drop-down list you can, for instance, select "Facial Trauma" in the Category 1 drop-down, and all the rows with "Facial Trauma" in Category 1 will show up.
Works great. My problem is that some rows might have "Facial Trauma" in Category 1, and some rows might have "Facial Trauma" in Category 2, and I'd love for the Sorter to show it either way.
Examples include:
Study is "Facial Trauma" category 1 and "NeuroPlastics" category 2. When I pick "Facial Trauma" in the sorter it shows up.
Study is reversed of that, Facial Trauma is Category 2. Still shows up.
You would think I could maybe just drop one of the sorter category variables, but I'd still love to be able to filter specifically to studies that were tagged with BOTH Facial Trauma AND Neuroplastics if I'd like.
So picking "Facial Trauma" in the first sorter would still get studies where Facial Trauma was in the second column. Does that make sense?
Edited to include a link to dummy data:
https://docs.google.com/spreadsheets/d/1WrMF8DVnVt186-OJ-X_maeq3EOZeX_8jXqu4aMTj1MM/edit?usp=sharing
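For reference, one way to express the "either column" match is a FILTER formula that ORs the two category columns against the selected value. This is only a sketch, assuming the data lives on a tab called Data with Category 1 in column C, Category 2 in column D, and the first drop-down in cell B1 (the real layout is in the dummy sheet):

    =FILTER(Data!A2:F, ((Data!C2:C = B1) + (Data!D2:D = B1)) > 0)

The + between the two comparisons acts as an OR, so a row passes if the selection appears in either category column; the second drop-down would just be another condition of the same shape inside the same FILTER call.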
I have a post table and I need to be able to assign multiple filters to it, such as
Style:
Type:
Category:
I don't want to create separate tables for each; that would overcomplicate everything. I was thinking of having one category table and maybe creating entries with prefixes, like
Table [Category]
cat-website
cat-ios
style-dark
style-simple
type-portfolio
...
so I could keep everything in one table and just strip the cat/style/type part when I display the filters. Plus it may make multiple selections easier (I hope), such as www.website.com/posts?category=cat1+cat18+cat53
🤔 Does this approach make sense at all?
I would not recommend the approach you outline. Having to strip those cat/style/type prefixes sounds like pure yuck.
Why not use something like acts_as_taggable_on? I am not affiliated with the gem, but use it in all of my projects and find it very useful for categorizing and filtering.
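As a minimal sketch (the Post model, its title column, and the tag names are just stand-ins for whatever you actually have), the gem's tag "contexts" can replace the cat-/style-/type- prefixes:

    # Gemfile
    gem 'acts-as-taggable-on'

    # app/models/post.rb
    class Post < ActiveRecord::Base
      # one tag context per filter type, instead of prefixed rows in a single table
      acts_as_taggable_on :categories, :styles, :types
    end

    # Tagging
    post = Post.create!(title: "My portfolio site")
    post.category_list.add("website")
    post.style_list.add("dark", "simple")
    post.type_list.add("portfolio")
    post.save!

    # Filtering
    Post.tagged_with("dark", on: :styles)                     # by style only
    Post.tagged_with(%w[website portfolio], match_all: true)  # must carry all of these tags
    Post.tagged_with(%w[website dark], any: true)             # carries any of them

Your multi-select URL idea then boils down to a single tagged_with call on the parsed parameter.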
I'm working on a project built on Rails 4, ActiveRecord and PostgreSQL, and I'm faced with a performance dilemma:
For brevity, let's say I have Category & Item models. Category has_many items.
Let's take the example where category 'Furniture' has 'bed, large mattress, small mattress, armchair', etc. While displaying these items under the category, we would intuitively want to see all kinds of mattresses and bed frames together, instead of having them lexicographically ordered. Also, let's assume the total number of items under any category is on the order of < 100 (mostly about 10-15 per category), so naturally the number of items falling in the same 'group' under a category would be much lower than that.
To achieve this grouping, one way is to create a SubCategory model and associate items through it, so we can add items of a certain group later on and still be able to show them together by grouping on the category and sub-category.
The other way I'm thinking of, since the total number of items is so small, is to add an order field (float type) to the Item model so we can still group them together (Bed = 5.01, Mattress = 5.02, Chair = 6.01, Bed Cover = 5.03 and so on).
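Concretely, the second route would look something like this (the position field name and the 'Furniture' lookup are just illustrative):

    class Category < ActiveRecord::Base
      has_many :items
    end

    class Item < ActiveRecord::Base
      belongs_to :category
      # `position` is the float ordering field: 5.01, 5.02, 5.03 share a group
    end

    furniture = Category.find_by(name: 'Furniture')
    items     = furniture.items.order(:position)             # sorting is done by Postgres
    grouped   = items.group_by { |item| item.position.floor }
    # => { 5 => [bed, mattress, bed cover], 6 => [armchair], ... }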
The only reason I'm considering the second option is that we're confident the number of items won't go beyond even 100 within our application's scope, so the SubCategory route (creating a new model and persisting many columns vs. one) seems like overkill for this particular case.
So my question (finally!) is this -
What kind of pitfalls might I run into if I went the second route? Moreover, is sorting on a float field in Postgres an overall better tradeoff in speed and memory than adding a new model to simulate sub-groupings as in the example above?
I'm developing a Rails app that will display foods with their nutrients. I want to show only the nutrients that the user wants to see.
So, I have the models:
Food:
Nutrient:
FoodNutrient: Specifies the quantity of each nutrient in each food
UserNutrient: Specifies which nutrients the user wants to see
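Roughly, the associations would be something like this (just a sketch of the models described above):

    class Food < ActiveRecord::Base
      has_many :food_nutrients
      has_many :nutrients, through: :food_nutrients
    end

    class Nutrient < ActiveRecord::Base
      has_many :food_nutrients
      has_many :user_nutrients
    end

    class FoodNutrient < ActiveRecord::Base
      belongs_to :food
      belongs_to :nutrient
      # `quantity` column: how much of the nutrient the food contains
    end

    class UserNutrient < ActiveRecord::Base
      belongs_to :user
      belongs_to :nutrient
      # one row per nutrient the user wants to see
    end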
I will have thousands of foods and more than 100 nutrients.
I saw several sources that give hints on how to deal with this type of complexity (for now I'm considering trying Arel). However, these sources usually provide neither examples nor hints on how to deal with this in the views. I found this one, but I would love more opinions on the issue, especially concerning the large amount of data involved.
So, what is the best way to deal with this in my index view?
Another doubt I have is whether it is better for performance to have the FoodNutrient model, or to add columns to the Food model where each new column represents a nutrient. I suppose the FoodNutrient approach is better, since the user will choose which nutrients to see, but I'm not sure.
I would appreciate any example, explanation, advice, feedback or reference that may help me.
Edited
As there were some comments from people who didn't understand my question, I will try to summarize it in other words.
I want to get data from the first 3 models, and the last one (UserNutrient) I would use to reduce the number of rows shown to the user.
As I want to show something like:
Food Name | Nutrient 1 | Nutrient 2 | Nutrient 3
Food 1    | 10         | 40         | 7.3
Food 2    | 9          | 4.4        | 9.1
I understand that I would have one loop over Food, iterating once per row shown above, and inside that loop I would also have to iterate over the nutrients the user has chosen (UserNutrient) to show the quantity of each nutrient in each food (the quantities live in FoodNutrient). The main question is how to do these loops, especially considering that the tables will have lots of data. This one seems to be a little similar, although I didn't understand it well.
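To make the loops concrete, this is roughly what I imagine (the names are made up, and whether this is efficient enough is exactly my question):

    # nutrients this user wants to see (the columns of the table)
    wanted_nutrients = Nutrient.joins(:user_nutrients)
                               .where(user_nutrients: { user_id: current_user.id })

    # preload quantities to avoid one query per food; in practice this would be
    # paginated or limited, since there are thousands of foods
    foods = Food.includes(:food_nutrients)

    rows = foods.map do |food|
      by_nutrient = food.food_nutrients.index_by(&:nutrient_id)
      [food.name] + wanted_nutrients.map { |n| by_nutrient[n.id].try(:quantity) }
    end
    # each element of `rows` corresponds to one line of the table above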
My other doubt is whether the structure is the best one; the FoodNutrient and Food tables could be merged.
I have researched this problem and for now I have decided to merge the FoodNutrient and Food tables/models into Food.
I believe that a FoodNutrient table with lots of rows would be worse, as it would have a huge index; worse than a Food table with lots of columns.
This article helped me to decide:
http://rails-bestpractices.com/posts/58-select-specific-fields-for-performance
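Following that article, after the merge the index query can select only the columns the user asked for; the column names below are just examples of how nutrients might map onto Food columns:

    wanted_columns = ["protein_g", "fat_g"]             # derived from the user's UserNutrient rows
    @foods = Food.select(*(["name"] + wanted_columns))  # fetch only what will be displayed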
If you have something to add, please answer the question too or add a comment.
I am building an app that has the following requirements:
-> A User can be a player on different teams.
-> A Team can be of a sport type.
My question is:
-> Since for each sport type I want to store different information about a Player, what would be the best way to model that?
I have thought about having separate models (and tables) for each kind of Sport, for example:
Basketball_Players, Football_Players and so on, but I am not sure that would be a good approach. How do you usually do this in RoR?
I'd say you have two options, and I don't know that it's really possible to say which is the "most correct" way to do it without knowing the details of the requirements of your application.
What's a given is that you'll have a sport table and a player table. I can say that for sure. The question is how you connect the two.
Option 1: a single join table
You could have a table called player_sport (or whatever) with a player_id column, a sport_id column, and a serialized_player_data column or something like that, where you'd keep serialized player data (JSON, perhaps) depending on the sport. Pros: simple schema. Cons: not properly normalized, and therefore subject to inconsistencies.
Option 2: a separate join table for each sport
This is what you alluded to in your question, where you have a basketball_player, football_player, etc. Here you'd also have a player_id column but probably not a sport_id column because that would be redundant now that you're specifying the sport right in the table name. The need to have a serialized_player_data column would go away, since you'd now be free to store the needed attributes directly in columns, e.g. wrestling_player.weight_class_id or whatever. Pros: proper normalization. Cons: more complex schema, and therefore more work in your application code.
There's actually a third option as well:
Option 3: a combination of 1 and 2
Here you might do everything you would do in Option 2, except you'd move the common player attributes to the player_sport table and save basketball_player, etc. for the sport-specific attributes. So weight_class_id would stay in wrestling_player but player_sport would have height, weight, and other columns that are relevant to all sports.
If you're looking for a recommendation, I would probably do Option 2, or, if it looks like there's enough overlap for it to make sense, Option 3.
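A rough ActiveRecord sketch of Option 3, with every name here being a placeholder for whatever fits your domain:

    class Player < ActiveRecord::Base
      has_many :player_sports
      has_many :sports, through: :player_sports
    end

    class PlayerSport < ActiveRecord::Base
      belongs_to :player
      belongs_to :sport
      has_one :wrestling_player          # present only when the sport is wrestling
      # shared columns live here: height, weight, ...
    end

    class WrestlingPlayer < ActiveRecord::Base
      belongs_to :player_sport
      belongs_to :weight_class
      # sport-specific columns only
    end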
I'm programming a website that allows users to post classified ads with detailed fields for the different types of items they are selling, and I have a question about the best database schema.
The site features many categories (e.g. Cars, Computers, Cameras) and each category of ads has its own distinct fields. For example, Cars have attributes such as number of doors, make, model, and horsepower, while Computers have attributes such as CPU, RAM, motherboard model, etc.
Now, since they are all listings, I was thinking of a polymorphic approach: a parent LISTINGS table and a different child table for each of the categories (COMPUTERS, CARS, CAMERAS). Each child table will have a listing_id that links back to the LISTINGS table. So when a listing is fetched, it would pull the row from LISTINGS joined with the linked row in the associated child table (see the sketch after the schema below).
LISTINGS
-listing_id
-user_id
-email_address
-date_created
-description
CARS
-car_id
-listing_id
-make
-model
-num_doors
-horsepower
COMPUTERS
-computer_id
-listing_id
-cpu
-ram
-motherboard_model
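To make the intended fetch concrete, here it is in ActiveRecord-style syntax purely as an illustration (I'm not tied to any particular framework; the names mirror the schema above):

    class Listing < ActiveRecord::Base
      self.primary_key = "listing_id"   # keeping the column names from the schema above
      has_one :car
      has_one :computer
    end

    class Car < ActiveRecord::Base
      belongs_to :listing               # cars.listing_id -> listings.listing_id
    end

    listing = Listing.includes(:car).find(42)   # LISTINGS row plus its linked CARS row
    listing.car.horsepower                      # child-table attributes hang off the parent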
Now, is this schema a good design pattern or are there better ways to do this?
I considered single table inheritance but quickly brushed off the thought because the table would get too large too quickly. But then another dilemma came to mind: if the user does a global search on all the listings, that means I will have to query each child table separately. What happens if I have over 100 different categories? Wouldn't that be inefficient?
I also thought of another approach where there is a master table (meta table) that defines the fields in each category and a field table that stores the field values of each listing, but would that go against database normalization?
How would sites like Kijiji do it?
Your database design is fine. No reason to change what you've got. I've seen the search done a few ways. One is to have your search stored procedure join all the tables you need to search across and index the columns to be searched. The second way I've seen it done, which worked pretty well, was to have a table that is used only for search and gets a copy of whatever fields need to be searched. Then you put triggers on those fields to keep the search table updated.
They both have drawbacks but I preferred the first to the second.
EDIT
You need the following tables.
Categories
- Id
- Description
CategoriesListingsXref
- CategoryId
- ListingId
With this cross-reference model you can join all your listings for a given category during search. Then add a little dynamic SQL (because it's easier to understand), build up your query to include the field(s) you want to search against, and execute it.
That's it.
EDIT 2
This seems to be a bigger discussion than we can fit in these comment boxes, but anything we would discuss is covered in the following post.
http://www.sommarskog.se/dyn-search-2008.html
It is really thorough and shows you more than one way of doing it, with pros and cons.
Good luck.
I think the design you have chosen will work well for the scenario you described, though I'm not sure the subclass tables should have their own IDs. Since a CAR is a LISTING, it makes sense for their keys to come from the same "domain", i.e. the child table could simply use listing_id as its primary key.
In the typical classified ads site, the data for an ad is written once and is then basically read-only. You can exploit this and store the data in a second set of tables that are optimized for searching in exactly the way you want users to search. Also, the search problem only really exists for a "general" search; once the user picks a certain type of ad, you can switch to the subclass tables to do a more advanced search (RAM > 4 GB, CPU = overpowered).