Core Data Model Design - ios

Let's assume I have an app about cooking recipes with two fundamental features:
The first one involves the CURRENT recipe that I'm preparing
The second one stores the recipes that I've decided to save
STANDARD SCENARIO
My current recipe is "Cheese Cake" and in RecipeDetailViewController I can see the current ingredients I've added for this recipe:
Sugar
Milk
Butter
etc.
Well, let's say that I'm satisfied with the final result and I decide to save (to log) the recipe I've just prepared.
* click save *
The recipe is now saved (is now logged) and in RecipesHistoryViewController I can see something like this:
Nov 15, 2013 - Cheese Cake
Nov 11, 2013 - Brownie
etc.
Now if I want I can edit the recipe in the history and change Milk to Soy Milk, for example.
The issue is that editing the recipe in the history SHOULDN'T edit the recipe (and its ingredients) in my current recipe, and vice versa. If I edit the current recipe and replace Butter with Peanut Butter, it must not edit any of the recipes stored in history. I hope I explained myself.
CONSEQUENCES
What does this scenario imply? It implies that currently, to satisfy these features, I'm duplicating the recipe and every sub-relationship (ingredients) every time the user taps the "Save Recipe" button. It works, but I feel there must be something cleaner. With this implementation I end up with TONS of duplicate Core Data objects (SQLite rows) like these:
Object #1, name: Butter, recipe: 1
Object #2, name: Butter, recipe: 4
Object #3, name: Butter, recipe: 3
etc.
Ideas? How can I optimize this model structure?
EDIT 1
I've already thought of creating a RecipeHistory object with an NSString attribute where I could store a JSON dictionary, but I don't know whether that's better or not.
EDIT 2
Currently a RecipeHistory object contains this:
+-- RecipeHistory --+
| |
| attributes: |
| - date |
+-------------------+
| relationships: |
| - recipes |
+-------------------+
+----- Recipe ------+
| relationships: |
| - recipeInfo |
| - recipeshistory |
| - ingredients |
+-------------------+
+-- RecipeInfo ----+
| |
| attributes: |
| - name |
+-------------------+
+--- Ingredient ----+
| |
| attributes: |
| - name |
+-------------------+
| relationships: |
| - recipe |
+-------------------+
paulrehkugler is right when he says that duplicating every Recipe object (and its relationships RecipeInfo and Ingredients) when I create a RecipeHistory is going to fill the database with tons of data, but I can't find another solution that gives me flexibility for the future. Maybe in the future I would like to create stats about recipes and history, and having Core Data objects could prove useful. What do you think? I think this is a common scenario in many apps that store history and allow editing of history items.
BIG UPDATE
I have read the answers from some users and I want to explain better the situation.
The example I stated above is just an example; my app isn't actually about cooking or recipes, but I used recipes because I think they map well to my real scenario.
That said, I want to explain that the app NEEDS two sections:
- First: where I can see the CURRENT recipe with related ingredients
- Second: where I can see the recipe I decided to save by tapping a button 'Save Recipe' in the first section
The current recipe found in the first section and any recipe X found in the 'history' section have NOTHING in common. However, the user can edit any recipe saved in the 'history' section (the name, the ingredients, anything; a recipe found in the history section is completely editable).
This is the reason why I ended up duplicating all the NSManagedObjects. However, this way the database will grow like mad, because every time the user saves the current recipe, the object representing the recipe (Recipe) is duplicated along with its relationships (ingredients). So there will be TONS of ingredients named 'Butter', for example. You might ask: why the hell do you need TONS of 'Butter' objects? Well, I need them because ingredients have, for example, a 'quantity' attribute, so every recipe has ingredients with different quantities.
Anyhow, I don't like this approach, even if it seems to be the only one. Ask me whatever you want and I'll try to explain every detail.
PS: Sorry for my basic English.
EDIT

Since you must deal with history, and because the events are generated manually by end users, consider changing the approach: rather than storing the current view of the model entities (i.e. recipes, ingredients, and the connections among them) store the individual events initiated by the user. This is called Event Sourcing.
The idea is to record what user does, rather than recording the new state after the user's action. When you need to get the current state, "replay" the events, applying the changes to in-memory structures. In addition to letting you implement the immediate requirements, this would let you restore the state as of a specific date by "replaying" the events up to a certain date. This helps with audits.
You can do it by defining events like this:
CreateIngredient - Adds new ingredient, and gives it a unique ID.
UpdateIngredient - Changes an attribute of an existing ingredient.
DeleteIngredient - Deletes an ingredient from the current state. Deleting an ingredient deletes it from all recipes and recipe histories.
CreateRecipe - Adds a new recipe, and gives it a unique ID.
UpdateRecipeAttribute - Changes an attribute of an existing recipe.
AddIngredientToRecipe - Adds an ingredient to an existing recipe.
DeleteIngredientFromRecipe - Deletes an ingredient from an existing recipe.
DeleteRecipe - Deletes a recipe.
CreateRecipeHistory - Creates a new recipe history from a specific recipe, and gives the history a new ID.
UpdateRecipeHistoryAttribute - Updates an attribute of a specific recipe history.
AddIngredientToRecipeHistory - Adds an ingredient to a recipe history.
DeleteIngredientFromRecipeHistory - Deletes an ingredient from a recipe history.
You can store the individual events in a single table using Core Data APIs. Add a class that processes events in order, and creates the current state of the model. The events will come from two places - the event store backed by Core Data, and from the user interface. This would let you keep a single event processor, and a single model with the details of the current state of recipes, ingredients, and recipe histories.
Replaying the events should happen only when the user consults the history, right?
No, that is not what happens: you read the whole history on start-up into the current "view", and then you send the new events both to the view and to the DB for persistence.
When users need to consult the history (specifically, when they need to find out how the model looked as of a specific date in the past) you need to replay the events partially, up until the date of interest.
Since the events are generated by hand, there wouldn't be too many of them: I would estimate the count in the thousands at most, and that's for a list of 100 recipes with 10 ingredients each. Processing an event on modern hardware should take microseconds, so reading and replaying the entire event log should take milliseconds.
Furthermore, do you know any link that shows an example of how to use Event Sourcing in a Core Data application? [...] For example, should I need to get rid of RecipeHistory NSManagedObject?
I do not know of a good reference implementation for event sourcing on iOS. That wouldn't be too different from implementing it on other systems. You would need to get rid of all the tables that you currently have, replacing them with a single table that looks like this:
The attributes would be as follows:
EventId - Unique ID of this event. This is assigned automatically on insertion, and never changes.
EntityId - Unique ID of the entity created or modified by this event. This ID is assigned automatically by a Create... processor, and never changes.
EventType - A short string representing the name of this event type.
EventTime - The time the event has happened.
EventData - A serialized representation of the event - this can be binary or textual.
The last item can be replaced with a "denormalized" group of columns representing a superset of the attributes used by the 12 event types above. This is entirely up to you: this table is merely one possible way of storing your events. It does not have to be Core Data. In fact, it does not even need to be in a database (although that makes things a little easier).
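The replay idea described above can be sketched in a few lines. This is a language-agnostic illustration written in Ruby rather than Core Data code; the `Event` struct, the `EventProcessor` class, the integer timestamps, and the shape of `event_data` are all assumptions made for the example, and only a subset of the 12 event types is handled.

```ruby
# Hypothetical event record; the fields mirror the attribute list above.
Event = Struct.new(:event_id, :entity_id, :event_type, :event_time, :event_data)

# Rebuilds the in-memory state by applying events in order.
class EventProcessor
  attr_reader :recipes

  def initialize
    @recipes = {} # entity_id => { name: String, ingredients: [String] }
  end

  # Applies one event to the current state.
  def apply(event)
    case event.event_type
    when "CreateRecipe"
      @recipes[event.entity_id] = { name: event.event_data[:name], ingredients: [] }
    when "UpdateRecipeAttribute"
      @recipes.fetch(event.entity_id).merge!(event.event_data)
    when "AddIngredientToRecipe"
      @recipes.fetch(event.entity_id)[:ingredients] << event.event_data[:ingredient]
    when "DeleteIngredientFromRecipe"
      @recipes.fetch(event.entity_id)[:ingredients].delete(event.event_data[:ingredient])
    when "DeleteRecipe"
      @recipes.delete(event.entity_id)
    end
    self
  end

  # Replays events up to and including `as_of` (all events when as_of is nil).
  def self.replay(events, as_of: nil)
    processor = new
    events.sort_by(&:event_id).each do |e|
      next if as_of && e.event_time > as_of
      processor.apply(e)
    end
    processor
  end
end

# Event times are plain integers here to keep the sketch short.
log = [
  Event.new(1, 10, "CreateRecipe", 1, { name: "Cheese Cake" }),
  Event.new(2, 10, "AddIngredientToRecipe", 2, { ingredient: "Milk" }),
  Event.new(3, 10, "AddIngredientToRecipe", 3, { ingredient: "Butter" }),
  Event.new(4, 10, "DeleteIngredientFromRecipe", 4, { ingredient: "Milk" }),
  Event.new(5, 10, "AddIngredientToRecipe", 5, { ingredient: "Soy Milk" })
]

current = EventProcessor.replay(log).recipes[10]            # full replay
earlier = EventProcessor.replay(log, as_of: 3).recipes[10]  # state as of time 3
```

A full replay gives the current view, while the `as_of` variant corresponds to consulting the history as of a past date. The same `apply` method would also receive new events coming from the user interface.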

I think that when a row in RecipesHistoryViewController is selected for modification, we can optimize the save process with two options:
Let the user choose whether a new row must be saved or an update should happen: a Save New button creates a new row in Recipe, and an Update button updates the currently selected row.
To trace the changes that have been made to a recipe (when an update happens), I would log only the changes to the recipe. Using the EAV pattern is an option.
As a hint: comma-separated values of ingredient names could be used as the old and new values when inserting a row into the RecipeHistory table; the sample may help.
About the BIG UPDATE:
Assuming that the real application have a database for persistent operation, some suggestions may be helpful.
The current recipe found in the first section and a X recipe found in
the 'history' section doesn't have NOTHING in common
This leads naturally to having no relation between the current and in-history recipes, so trying to create a relation would be in vain. With no relation the design will not be in normal form, and redundancy will be inevitable. Following this approach there will be many records; in that case:
We can limit each user's saved recipes to a predefined number.
Another solution to optimize the performance of the recipe table would be range partitioning the table based on the creation date field (let a database administrator be involved).
Another suggestion is to have a separate table for ingredient
concept. Having ingredient, recipe, recipe-ingredient
tables will reduce redundancy.
Using NoSql
If relations are not an important part of the application's logic, I mean if you are not going to end up with complex queries like "Which ingredients have been used more than X times in recipes that have fewer than Y ingredients in total and Milk is not one of them" or analytical procedures, then have a look at NoSQL databases and a comparison of them.
They are typically non-relational, distributed, open-source, schema-free and horizontally scalable, and they offer easy replication support, a simple API, and the ability to handle huge amounts of data.
For a basic example of a document-based database: with CouchDB installed on my local machine (port 5984), creating a recipe database (table) in CouchDB is done by sending a standard HTTP request (using curl) like:
curl -X PUT http://127.0.0.1:5984/recipe
Dropping recipe table:
curl -X DELETE http://127.0.0.1:5984/recipe
Adding a recipe:
curl -X PUT http://127.0.0.1:5984/recipe/myFirstRecipe -d \
'{"name":"Cheese Cake","description":"i am using couchDB for my recipes",
"ingredients": [
"Milk",
"Sugar"
]}'
Getting myFirstRecipe record(document)
curl -X GET http://127.0.0.1:5984/recipe/myFirstRecipe
There is no need for classical server-side machinery like object-relational mapping, database drivers, etc.
BTW, using NoSQL has shortcomings you need to consider, like here and here.

As I see it, your problem is more conceptual than model structure related.
My idea for your model is:
+*******+
Recipe
-----------------
-----------------
properties:
-----------------
- isDraft - BOOL
- name - NSString
- creationDate - NSDate
-----------------
-----------------
relationships:
-----------------
- ingredients - to-many with Ingredient
-----------------
+*******+
+*******+
Ingredient
-----------------
-----------------
properties:
-----------------
- name - NSString
-----------------
-----------------
relationships:
-----------------
- recipes - to-many with Recipe
-----------------
+*******+
Now, let's call your "current" recipe a draft (a user may have many drafts).
As you can see, you can now display your recipes with a single fetched results controller (FRC)
The fetch request will look like this:
NSFetchRequest* r = [NSFetchRequest fetchRequestWithEntityName:@"Recipe"];
[r setFetchBatchSize:25];
NSSortDescriptor* sortCreationDate = [NSSortDescriptor sortDescriptorWithKey:@"creationDate" ascending:NO];
[r setSortDescriptors:@[sortCreationDate]];
you can section your data on the isDraft property:
NSFetchedResultsController* frc = [[NSFetchedResultsController alloc] initWithFetchRequest:r
managedObjectContext:context
sectionNameKeyPath:@"isDraft"
cacheName:nil];
Remember to give appropriate titles to your sections as to not confuse the user.
Now, all you have left is add some specific functionality like:
create new recipe
save
save draft
edit recipe (draft or not)
if draft offer to save as complete recipe
else, save the actual recipe
if you like, you might add a "save as" option
create copy (the user is aware that he might introduce redundant data if he saves the same recipe more than once)
In any case the user experience should be consistent.
Meaning:
While the user is editing/adding an object, this object should not change "under his feet".
If a user is adding a new recipe, he then might wish to save it as draft, or as a complete recipe.
When he saves, in either case, he might still wish to continue editing it, so no new object need be created.
If you would like to add versioning for your recipes, you will need to add an entity like RecipeHistory related to a single recipe. This entity will record changes on each committed change to a complete recipe object (use changedValues of NSManagedObject or check against the existing/committed values).
You may serialise and store the data as you see fit.
So you can see, it's more of a conceptual issue (how you access your data) than it is a modelling issue.

There are a few questions that need to be answered:
Is there a limit to the number of "history items" for a recipe or is it really necessary to keep all the versions of a recipe around?
When is a modification just a change of an existing recipe and when does the change result in a new recipe? For example, should the user be allowed to change a "cheese cake" recipe into a "meat loaf" recipe by completely replacing every ingredient and the title?
The answers to these questions are important when planning your data model. For example, ask yourself if this would be a valid use case for your app: The user creates a "Basic Cake" recipe that contains sugar, flour and eggs. The user now wants to take this "Basic Cake" recipe as a template to create a "Cheese Cake", a "Pound Cake" and a "Carrot Cake" recipe. Is that a valid use case?
If so, every time you save a recipe, it basically creates a completely new, independent recipe because the user is allowed to change everything and thus turn a cheese cake into a meat loaf.
However, I think that would be unexpected behavior for the user. In my opinion the user creates a "Cheese Cake" recipe and then might want to trace the changes to that one recipe and not turn it into something completely different.
This is what I would suggest:
Instead of a RecipeHistory owning Recipes, change your data model so that Recipes have multiple RecipeVersions. That way, users can explicitly create new recipes and then track the changes to that one recipe. Also, users would not be allowed to edit a RecipeVersion directly, but instead could "revert" their recipe to a specific version and then edit that.
Make Ingredients unique: "Butter", "Milk" and "Flour" exist exactly once in the database and are only referenced by the different recipes. That way, you will not have duplicates in your database, and saving just the reference will take up less disk space than saving the name of the ingredient again and again.
Allow your users to create a new recipe based on an existing Recipe(Version). That way you give your users the ability to "base" a new recipe on an existing one without complicating your app and your data model.
This is my suggested data model:
+----- Recipe ------+
| attributes: |
| - name |
| relationships: |
| - recipeVersions |
+-------------------+
+-- RecipeVersion ----+
| attributes: |
| - timestamp |
+----------------------+
| relationships: |
| - recipe |
| - ingredients |
+----------------------+
+--- Ingredient ----+
| attributes: |
| - name |
+-------------------+
| relationships: |
| - recipeVersions |
+-------------------+
Enjoy.

You don't need to duplicate all of the ingredient objects. Instead, just change the relationships so that recipes have many ingredients and ingredients can be in many recipes. Then when you create a duplicate recipe you just connect to the existing ingredients.
This would also make it easier to list the recipes that use an (or some combination of) ingredients.
You should also consider your UI/UX - should it be a full duplicate? Or should you allow the user to create 'alternatives' within each recipe (which just list a set of replacement ingredients).

It's a tradeoff between storage size and retrieval time.
If you duplicate each recipe every time the user clicks the "Save Recipe" button, you duplicate a lot of data in the database.
If you create a RecipeHistory object that has a Recipe and a list of changes, it takes longer to retrieve the data and populate your View Controllers, because you have to reconstruct a full Recipe in memory.
I'm not sure which is easier - whichever suits your use case is probably best.

Not sure I am clear on the problem you are trying to solve but I would start by modelling the Recipe and Ingredients and keep them separate from the actual mix and method which may change as the cook experiments. With some smart application logic you could only track the changes in each version rather than make a new copy. For example if the user decides to try a new version of a recipe then by default show the previous versions (or allow the user to select a version) Method and RecipeIngredients and if any changes are made save these changes as new Method and RecipeIngredient associated with the RecipeVersion.
This approach will use less storage but requires much more complicated application logic; for example, swapping an ingredient would mean setting the quantity to 0 for the ones being replaced and adding new records for the new ones. Simply duplicating the previous (or user-selected) version is not going to use much space, since these are small records, and will be much, much simpler to implement.

I believe it would be better to define an ingredient table with ingredientID and ingredientDisplayName, and in the recipe history table store RecipeID, HistoryDate, IngredientArray.
if in ingredient table,
id:1 is Butter
id:2 is Milk
id:3 is cheese
id:4 is Sugar
id:5 is Soymilk
then in history table
for recipe 1: Cheese Cake, data Nov 15, IngredientArray: {1,2,3,4}
If on Nov 16 Cheese Cake changes to have soy milk instead of milk, then on that date IngredientArray is {1,5,3,4}. Many databases have an array column option; alternatively it could be a comma-separated string or a JSON document.
It's better to keep the ingredient list in memory for fast lookup of ingredient names from the list.
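A minimal sketch of this layout in plain Ruby, assuming the ingredient table is loaded into an in-memory hash. `INGREDIENTS`, `HistoryEntry`, and `ingredient_names` are hypothetical names used only for illustration.

```ruby
# In-memory copy of the ingredient table: id => display name.
INGREDIENTS = {
  1 => "Butter", 2 => "Milk", 3 => "Cheese", 4 => "Sugar", 5 => "Soymilk"
}.freeze

# One history row: the recipe, the date, and an array of ingredient ids.
HistoryEntry = Struct.new(:recipe_id, :history_date, :ingredient_ids)

history = [
  HistoryEntry.new(1, "Nov 15", [1, 2, 3, 4]),
  HistoryEntry.new(1, "Nov 16", [1, 5, 3, 4]) # Milk (2) replaced by Soymilk (5)
]

# Resolves the ids of a history row to ingredient names via the lookup hash.
def ingredient_names(entry)
  entry.ingredient_ids.map { |id| INGREDIENTS.fetch(id) }
end

nov15 = ingredient_names(history[0])
nov16 = ingredient_names(history[1])
added = nov16 - nov15 # names present on Nov 16 but not on Nov 15
```

Each history row stores only small integers, and the names are resolved from the shared lookup when the row is displayed.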

Maybe I did not understand your question, but do you need to change the name of butter by editing? Why not just delete butter from that one recipe and add peanut butter to it? That way you do not change butter to peanut butter for all your other recipes that are linked to it. And with new recipes you can select peanut butter or butter.

Just to be clear, we are talking about frontend?
First, as suggested by Mohsen Heydari, on an SQL RDBMS you should create a join table for many-to-many connections, turning them into two one-to-many relationships, for performance.
So you want a history:
+-- RecipeHistory --+
| |
| attributes: |
| - id |
| - date |
| - new name? |
| - notes ?? |
| - recipe-id |
+-------------------+
| relationships: |
| - recipes |
+-------------------+
+----- Recipe ------+
| attributes: |
| - id |
| - name |
| - description |
| - date |
| - notes | #may be useful?
| - Modifiable | #this field is false if in history, else true,
+-------------------+
| relationships: |
| recipe-ingredient |
+-------------------+
+-Recipe-ingredient-+
| attributes: |
| id |
| recipe-id |
| ingredient-id |
| quantity |
+-------------------+
+--- Ingredient ----+
| |
| attributes: |
| - id |
| - name |
+-------------------+
| relationships: |
| -recipe-ingredient|
+-------------------+
Now if the Modifiable field on Recipe is true, it belongs on the main page.
If it's false, it belongs on the history page.
After finding a recipe you want, you can query the ingredients by its recipe-id using the Recipe-Ingredient table, or Recipe by Ingredients the same way.
Another, less space-hungry option would be to create a Recipe history plus a Modified recipe table, which takes a base recipe ID
and maps it to: Main Recipe ID, Discarded Ingredients and New Ingredients. If you want this solution explained, just ask.

Related

Rails - Associating database/models with Modules that aren't database tables

I want to create "associations" (or an equivalent concept with similar methods available from having associations). They would be between this table of information, which does NOT need to be updated whatsoever, and other tables that DO involve CRUD.
This is my non-updated table of information:
Table name: Personalities
personality_type | alternate_name | CF1 | CF2 | CF3 | CF4 | CF5 | CF6 | CF7 | CF8
----------------------------------------------------------------------------------
ENTj | ENTJ | Te | Ni | Se | Fi | etc | etc | |
INTp | INTJ | (more data values)
ISFj | ISFP | (more data values)
ESFp | ESFP | (more data values)
So it seems to me that making this non-updated into a database table and performing queries on it would be a silly and pointless way of designing my code, since that would entail all of the query loading time overhead.
So I was thinking of something like making a separate Ruby module, but wasn't sure how to "associate" it with other tables that would be full-fledged database tables with models.
1) How do I associate a non-database class instance with one based on ActiveRecord::Base?
2) Which format/data type should I put my non-updated table of information in? class, module, multiple class instances, a 2 dimensional array, or 2 dimensional hash?
My goal in sorting out this decision is to be able to use notation similar to the methods that come with associating database models (e.g. two tables called "Personality" and "User" would allow Rails/Ruby code like @user.alternate_name and @personality.user.email).
3) Does the fact that rails uses hidden :id, and timestamp columns affect this in any way?
(If this question is a bit broad, feel free to ignore answering it).
Much thanks!
-A user can have only one personality type.
-Other database models need to refer to personality type information independent of the user model.
Presumably only the User model can have a personality type. Why not create an array of these types as a constant in the User model, which you can then refer to in forms etc for selection using User::PERSONALITY_TYPES.
For example:
class User
PERSONALITY_TYPES = %w{ ENTJ INTJ ISFP ESFP }
# ... other model code
end
Then simply store the index of the personality type within the array as the user's personality_type_index.
Perhaps I'm oversimplifying your needs, but this is the approach I would start with.
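A slightly fuller sketch of the same idea, for the case where each type carries more than a name: the static table lives in a plain Ruby class, and the user stores only an index into it. The `Personality` class and its `find` method are assumptions for illustration, and `User` is a plain Ruby stand-in for the ActiveRecord model (in Rails, `personality_type_index` would be an integer column).

```ruby
# Static, never-updated lookup data kept in memory instead of a table.
class Personality
  attr_reader :personality_type, :alternate_name

  def initialize(personality_type, alternate_name)
    @personality_type = personality_type
    @alternate_name = alternate_name
  end

  # Rows taken from the table in the question (CF columns omitted).
  ALL = [
    new("ENTj", "ENTJ"),
    new("INTp", "INTJ"),
    new("ISFj", "ISFP"),
    new("ESFp", "ESFP")
  ].freeze

  # Looks a personality up by its position in ALL.
  def self.find(index)
    ALL.fetch(index)
  end
end

# Stand-in for the ActiveRecord User model.
class User
  attr_accessor :personality_type_index

  # Association-like reader that resolves the stored index.
  def personality
    Personality.find(personality_type_index)
  end

  # Delegates to the personality, giving user.alternate_name.
  def alternate_name
    personality.alternate_name
  end
end

user = User.new
user.personality_type_index = 1
name = user.alternate_name
```

Other models can call `Personality.find` directly, so the lookup data stays independent of `User`. Note that reordering `ALL` would change the meaning of stored indexes.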

Return only results based on current object for dynamic menus

If I have an object that has_many - how would I go about getting back only the results that are related to the original results related ids?
Example:
tier_tbl
| id | name
1 low
2 med
3 high
randomdata_tbl
| id | tier_id | name
1 1 xxx
2 1 yyy
3 2 zzz
I would like to build a query that returns only, in the case of the above example, rows 1 and 2 from tier_tbl, because only 1 and 2 exist in the tier_id data.
Im new to activerecord, and without a loop, don't know a good way of doing this. Does rails allow for this kind of query building in an easier way?
The reasoning behind this is so that I can list only menu items that relate to the specific object I am dealing with. If the object i am dealing with has only the items contained in randomdata_tbl, there is no reason to display the 3rd tier name. So i'd like to omit it completely. I need to go this direction because of the way the models are set up. The example im dealing with is slightly more complicated.
Thanks
Lets call your first table tiers and second table randoms
If tier has many randoms and you want to find all tiers whose id is present in table randoms, you can do it this way:
# database query only
Tier.joins(:randoms).uniq
or
# with some ruby code
Tier.select{ |t| t.randoms.any? }

How to select table column names in a view and pass to controller in rails?

So I am new to Rails, and OO programming in general. I have some grasp of the MVC architecture. My goal is to make a (nearly) completely dynamic plug-and-play plotting web server. I am fairly confused with params, forms, and select helpers.
What I want to do is use Rails drop downs to basically pass parameters as strings to my controller, which will use the params to select certain column data from my database and plot it dynamically. I have the latter part of the task working, but I can't seem to pass values from my view to controller.
For simplicity's sake, say my database schema looks like this:
--------------Plot---------------
|____x____|____y1____|____y2____|
| 1 | 1 | 1 |
| 2 | 2 | 4 |
| 3 | 3 | 9 |
| 4 | 4 | 16 |
| 5 | 5 | 25 |
...
and in my Model, I have dynamic selector scopes that will let me select just certain columns of data:
in Plot.rb
class Plot < ActiveRecord::Base
scope :select_var, lambda {|varname| select(varname)}
scope :between_x, lambda {|x1,x2| where("x BETWEEN ? and ?","#{x1}","#{x2}")}
So this way, I can call:
irb>>@p1 = Plot.select_var(['x','y1']).between_x(1,3)
and get in return a class where @p1.x and @p1.y1 are my only attributes, only for values from x=1 to x=3, which I dynamically plot.
I want to start off in a view (plot/index), where I can dynamically select which variable names (table column names), and which rows from the database to fetch and plot. The problem is, most select helpers don't seem to work with columns in the database, only rows.
So to select columns, I first get an array of column names that exist in my database with a function I wrote.
Plots Controller
def index
d=Plot.first
@tags = d.list_vars
end
So @tags = ['x','y1','y2']
Then in my plot/index.html.erb I try to use a drop down to select which variables I send back to the controller.
index.html.erb
<%= select_tag( :variable, options_for_select(@plots.first.list_vars,:name,:multiple=>:true) )%>
<%= button_to 'Plot now!', :controller =>"plots/plot_vars", :variable => params[:variable]%>
Finally, in the controller again
Plots controller
...
def plot_vars
@plot_data=Plot.select_vars([params[:variable]])
end
The problem is that every time I try this (or one of a hundred variations thereof), params[:variable] is nil.
How can I use a drop down to pass a parameter with string variable names to the controller?
Sorry its so long, I have been struggling with this for about a month now. :-( I think my biggest problem is that this setup doesn't really match the Rails architecture. I don't have "users" and "articles" as individual entities. I really have a data structure, not a data object. Trying to work with the structure in terms of data object speak is not necessarily the easiest thing to do I think.
For background: My actual database has about 250 columns and a couple million rows, and they get changed and modified from time to time. I know I can make the database smarter, but its not worth it on my end. I work at a scientific institute where there are a ton of projects with databases just like this. Each one has a web developer that spends months setting up a web interface and their own janky plotting setups. I want to make this completely dynamic, as a plug-and-play solution so all you have to do is specify your database connection, and this rails setup will automatically show and plot which data you want in it. I am more of a sequential programmer and number cruncher, as are many people here. I think this project could be very helpful in the end, but its difficult to figure out for me right now.

Rails custom meta model?

I'd like to be able to add "meta" information to a model, basically user-defined fields. So, for instance, let's imagine a User model:
I define fields for first name, last name, age, gender.
I would like users to be able to define some "meta information", basically to go in their profile page and share other information. So one user might want to add "hobbies", "occupation", and "hometown", and another might want to add "hobbies", and "education".
So, I'd like to be able to have a standard view for this kind of stuff, so for instance in the view I might do something like (in HAML):
- for item in @meta
%li
%strong= item.key + ":"
= item.value
This way I can ensure that the information is consistently displayed, rather than just providing a user with a markdown textbox that they may format all different ways.
I'd also love to be able to click on meta and see other users who have given the same thing, so in the example above both users defined "hobbies", it would be nice to be able to say I want to see users who have shared hobbies -- or even better I want to see users whose hobbies are ___.
So, since I don't know what fields users will want to define in advance, what kind of options are there for providing that kind of functionality?
Is there a gem that handles custom meta information on a model like this, or at least sort of similarly? Has anyone had experience with this kind of problem? If so, how did you solve it?
Thanks!
The dynamic field implementation depends upon following factors:
1. Ability to dynamically add attributes
2. Ability to support new data types
3. Ability to retrieve the dynamic attributes without an additional query
4. Ability to access dynamic attributes like regular attributes
5. Ability to query the objects based on dynamic attributes (e.g. find the users with skiing hobbies)
Typically, a solution doesn't address all the requirements. Mike's solution addresses 1 and 5 elegantly. You should use his solution if 1 and 5 are important for you.
Here is a long solution that addresses 1,2,3, 4 and 5
Update the users table
Add a text field called meta to the users table.
Update your User model
class User < ActiveRecord::Base
serialize :meta, Hash
def after_initialize
self.meta ||= {} if new_record?
end
end
Adding a new meta field
u = User.first
u.meta[:hobbies] = "skiing"
u.save
Accessing a meta field
puts "hobbies=#{u.meta[:hobbies]}"
Iterating the meta fields
u.meta.each do |k, v|
puts "#{k}=#{v}"
end
To address the 5th requirement you need to use a full-text search engine such as Solr or Sphinx. They are more efficient than relying on the DB for LIKE queries.
Here is one approach if you use Solr through Sunspot gem.
class User
searchable do
integer(:user_id, :using => :id)
meta.each do |key, value|
t = solr_type(value)
send(t, key.to_sym) {value} if t
end
end
def solr_type(value)
return nil if value.nil?
return :integer if value.is_a?(Integer)
return :float if value.is_a?(Float)
return :string if value.is_a?(String)
return :date if value.is_a?(Date)
return :time if value.is_a?(Time)
end
def similar_users(*args)
keys = args.empty? ? meta.keys : [args].flatten.compact
User.search do
without(:user_id, id)
any_of do
keys.each do |key|
value = meta[key]
with(key, value) if value
end
end
end
end
end
Looking up similar users
u = User.first
u.similar_users # matching any one of the meta fields
u.similar_users :hobbies # with matching hobbies
u.similar_users :hobbies, :city # with matching hobbies or the same city
The performance gain here is significant.
If each user is allowed to define their own attributes, one option might be to have a table with three columns: user_id, attribute_name, attribute_value. It might look like:
| user_id | attribute_name | attribute_value |
| 2 | hobbies | skiing |
| 2 | hobbies | running |
| 2 | pets | dog |
| 3 | hobbies | skiing |
| 3 | colours | green |
This table would be used for finding other users who have the same hobbies/pets/etc.
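As a rough illustration of that lookup, the sketch below models the three-column table as an array of hashes in plain Ruby; in a real app this would be a database query. `ROWS` and `users_sharing` are hypothetical names, not part of any gem.

```ruby
# The example rows from the table above.
ROWS = [
  { user_id: 2, attribute_name: "hobbies", attribute_value: "skiing" },
  { user_id: 2, attribute_name: "hobbies", attribute_value: "running" },
  { user_id: 2, attribute_name: "pets",    attribute_value: "dog" },
  { user_id: 3, attribute_name: "hobbies", attribute_value: "skiing" },
  { user_id: 3, attribute_name: "colours", attribute_value: "green" }
].freeze

# Returns the ids of other users who share at least one value of
# `attribute_name` with `user_id`.
def users_sharing(rows, user_id, attribute_name)
  same_attribute = rows.select { |r| r[:attribute_name] == attribute_name }
  own_values = same_attribute.select { |r| r[:user_id] == user_id }
                             .map { |r| r[:attribute_value] }
  same_attribute.select { |r| r[:user_id] != user_id && own_values.include?(r[:attribute_value]) }
                .map { |r| r[:user_id] }
                .uniq
end

shared_hobbies = users_sharing(ROWS, 2, "hobbies")
shared_pets    = users_sharing(ROWS, 2, "pets")
```

User 3 shares "skiing" with user 2, so it shows up for hobbies, while nobody else has a dog.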
For performance reasons (this table is going to get large) you may want to maintain multiple places that the info is stored -- different sources of info for different purposes. I don't think it's bad to store the same info in multiple tables if absolutely necessary for performance.
It all depends on what functionality you need. Maybe it will end up making sense that each user has their key/value pairs serialized into a string column on the users table (Rails provides nice support for this type of serialization), so when you display info for a particular user you don't even need to touch the huge table. Or maybe you will end up having another table that looks like this:
| user_id | keys | values |
| 2 | hobbies, pets | skiing, running, dog |
| 3 | hobbies, colours | skiing, green |
This table would be useful if you need to find all users that have hobbies (run LIKE sql against the keys column), or all users that have anything to do with a dog (run LIKE sql against the values column).
That's the best answer I can give with the requirements you gave. Maybe there is a third-party solution available, but I'm skeptical. It's not really a "pop in a gem" type of problem.
In this case, I would at least consider a document database like MongoDB or CouchDB, which can deal with this type of scenario much more easily than an RDBMS.
If that isn't the case, I would probably end up doing something along the lines of what Mike A. described.

Using SharePoint's Data Query Webpart to link two lists

I have two SharePoint Lists: A & B. List A has a column where the user can add multiple references (displayed as hyperlinks) for each entry to entries in B
A: B:
... | RefB | ... Name | OtherColumns....
----------------- -----------------------
... | B1 | ... B1 |
... | B2,B3 | ... B2 |
... | B1,B3 | ... B3 |
Now I want to display all entries from list B that are referenced by an (specific) entry in A. I.e: I set the filter to [Entry 2] and the Web part displays all the stuff from entries B2 and B3. Is this even possible?
I think the problem you've got, which is ruining some of the ways I'm thinking of solving it, is that the RefB column is multi-valued. You may have some joy doing filtering with the DataView, but it might get messy fast as you try to split RefB on the comma and compare against the resulting array of values.
I think the problem could be made easier by having only a single value in the RefB column.
Three solutions come to mind.
Have only one value in RefB per item in Table A and repeat the other fields in Table A. You'd have to accept some data redundancy and would need to be careful with data entry.
The normal relational database way of solving your data redundancy problem would be to have a 3rd table joining table A to table B. If you're not familiar with relational database techniques, there are lots of straightforward tutorials on data normalisation on the net. While there's some more work, it may lead to a cleaner solution. Be careful when trying to fake a relational database within SharePoint though, as it's not meant for relational data. You may be better off using a SQL database.
Put everything in one table, though I think you've already ruled this one out.
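To make the second option concrete, here is a small sketch of the joining-table idea in plain Ruby, outside SharePoint. The `LINKS` array plays the role of the 3rd table, with one row per (A entry, B entry) pair; the entry numbers and the `referenced_by` helper are assumptions for illustration.

```ruby
# List B, keyed by name; the other columns are placeholders.
LIST_B = {
  "B1" => { name: "B1" },
  "B2" => { name: "B2" },
  "B3" => { name: "B3" }
}.freeze

# Joining table: one single-valued row per reference from A to B.
# Entry 1 -> B1, entry 2 -> B2 and B3, entry 3 -> B1 and B3.
LINKS = [
  [1, "B1"],
  [2, "B2"], [2, "B3"],
  [3, "B1"], [3, "B3"]
].freeze

# Returns all B entries referenced by the given A entry.
def referenced_by(a_entry)
  LINKS.select { |a, _b| a == a_entry }
       .map { |_a, b| LIST_B.fetch(b) }
end

entry2_refs = referenced_by(2).map { |row| row[:name] }
```

Because each link row holds a single value, filtering on the A entry becomes a simple equality match instead of splitting a comma-separated field.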
