How does revision control work on Quora? Database design - ruby-on-rails

I've seen some plugins that create a versions table to keep track of modifications to specific models, but I can't easily reproduce what Quora shows.
What I have so far is a table like this:
id
item_type: specifies which model the revision refers to, e.g. "Topic"
item_id
event: what happened: "edited", "added", "reverted" or "removed"
who: who triggered the event
column: which column in "Topic" changed, e.g. "topic.photo_url"
new: the new value, e.g. "http://s3.amazonaws.../pic.png"
old: the old value, e.g. "http://s3.amazonaws.../oldpic.png"
revision_rel: points to the past revision
timestamp
Could someone give me some help and guidelines with this design? I'm worried about performance, wrong columns, missing columns, etc.
id | item_type | item_id | event | who | column | new | old | revision_rel | date
________________________________________________________________________________________________________
1 | Topic | 2 | edit | Luccas | photo | pic.png | oldpic.png | null | m:d:y
2 | Topic | 2 | revert | Chris | photo | oldpic.png | pic.png | 1 | m:d:y
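For concreteness, the layout above can be sketched as follows. This is only an illustration, assuming SQLite through Python's sqlite3 module instead of a Rails migration. The names col, new_value, old_value and created_at, the 'add' and 'remove' event spellings, and the index are assumptions, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE revisions (
        id           INTEGER PRIMARY KEY,
        item_type    TEXT NOT NULL,     -- model the revision refers to, e.g. 'Topic'
        item_id      INTEGER NOT NULL,
        event        TEXT NOT NULL CHECK (event IN ('add', 'edit', 'revert', 'remove')),
        who          TEXT NOT NULL,
        col          TEXT,              -- called "column" in the question
        new_value    TEXT,
        old_value    TEXT,
        revision_rel INTEGER REFERENCES revisions (id),
        created_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
""")
# Assumed index: history is normally read per item, oldest to newest.
conn.execute("CREATE INDEX idx_revisions_item ON revisions (item_type, item_id, id)")


def record_edit(item_type, item_id, who, col, new, old):
    """Append an 'edit' row and return its id."""
    cur = conn.execute(
        "INSERT INTO revisions (item_type, item_id, event, who, col, new_value, old_value) "
        "VALUES (?, ?, 'edit', ?, ?, ?, ?)",
        (item_type, item_id, who, col, new, old),
    )
    return cur.lastrowid


def revert(revision_id, who):
    """Append a 'revert' row that swaps old and new of an earlier revision."""
    item_type, item_id, col, new, old = conn.execute(
        "SELECT item_type, item_id, col, new_value, old_value FROM revisions WHERE id = ?",
        (revision_id,),
    ).fetchone()
    cur = conn.execute(
        "INSERT INTO revisions "
        "(item_type, item_id, event, who, col, new_value, old_value, revision_rel) "
        "VALUES (?, ?, 'revert', ?, ?, ?, ?, ?)",
        (item_type, item_id, who, col, old, new, revision_id),
    )
    return cur.lastrowid


first = record_edit("Topic", 2, "Luccas", "photo", "pic.png", "oldpic.png")
second = revert(first, "Chris")
history = conn.execute(
    "SELECT id, event, who, new_value, old_value, revision_rel "
    "FROM revisions WHERE item_type = 'Topic' AND item_id = 2 ORDER BY id"
).fetchall()
```

Rows are only ever appended, so a revert is just another revision that points back at the one it undoes.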

There are some gems available that already do what you are looking for. Take a look at the existing ones:
https://www.ruby-toolbox.com/categories/Active_Record_Versioning
I am using audited (previously acts_as_audited) for something very similar:
https://github.com/collectiveidea/audited

Related

Database design question - repeating duplicate values [closed]

The application I am working on has a hierarchy to its data tables, which is mostly straightforward. However, there is one field that is shared between most of the tables and acts like a secondary pk for them. Searching through Google, this site and other places, I don't see any similar examples of this design. It appears that this design pattern is uncommon, but does its use create a problem that should be resolved?
Parent Table - Top level data object (ex: Product)
There are at least 10 sub tables (e.g., Manufacturer, Materials, SalesRep, Vendor, etc.)
Each of the sub tables may or may not have other dependent tables.
The parent table, and some (but not all) of the dependent tables have a field called "Type" (saved as an integer). (ex: Physical, Electronic, Both)
The issue is that, when selecting the data, the type_id is passed into all of the retrievals, for all of the tables. Doing so allows a "Product" (e.g., a "Book") to have one complete set of data (manufacturers, materials, reps, vendors, etc.) for one type of "Product" (e.g., electronic book) and that same "Product" to have a completely different (or the same) set of data for a different type of "Product" (e.g., physical printed book).
Repeating the type_id through all of the tables duplicates the same data throughout the tables, resulting in essentially a two field pk for each record.
Currently:
--// Table: product
+------+-------------+----------------+
| id | date_issued | product_fields |
+------+-------------+----------------+
| 1 | 2010-08-20 | Book 1 |
| 2 | 2010-08-20 | Book 2 |
| 3 | 2010-08-20 | Book 3 |
+------+-------------+----------------+
--// Table: manufacturer
+------+------------+----------+-------------------+
| id | product_id | type_id | name |
+------+------------+----------+-------------------+
| 1 | 1 | 1 | Digital Printers |
+------+------------+----------+-------------------+
| 2 | 1 | 2 | Physical Printers |
+------+------------+----------+-------------------+
From what I can see, one alternative is to make the "Type" relation a sub table under "Product", and then make every other table a dependent of the product/type association. However, implementing such a design change would require a great deal of refactoring of both the database and the code. While it is an alternative, is that the way others would do this?
Resulting in something like this:
--// Table: product
+------+-------------+----------------+
| id | date_issued | product_fields |
+------+-------------+----------------+
| 101 | 2010-08-20 | Book 1 |
| 102 | 2010-08-20 | Book 2 |
| 103 | 2010-08-20 | Book 3 |
+------+-------------+----------------+
--// Table: product_type_assoc
+------+-------------+-------------+
| id | product_id | type_id |
+------+-------------+-------------+
| 5 | 101 | 1 |
+------+-------------+-------------+
| 6 | 101 | 2 |
+------+-------------+-------------+
| 7 | 102 | 1 |
+------+-------------+-------------+
--// Table: manufacturer
+------+---------------------+-------------------+
| id | assoc_id | name |
+------+---------------------+-------------------+
| 1 | 5 | Digital Printers |
+------+---------------------+-------------------+
| 2 | 6 | Physical Printers |
+------+---------------------+-------------------+
An interim step seems to be keeping the current "type" in the product table and passing that to the sub queries:
SELECT vendor.* FROM vendor JOIN product ON product.id = vendor.product_id WHERE product.id = 1 AND vendor.type_id = product.current_type;
What do you think - Is this the preferred way, or is there a more commonly accepted design that does the same thing?
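To make the second layout easier to reason about, here is a minimal sketch of it, assuming SQLite through Python's sqlite3 module. The foreign keys, the UNIQUE constraint on (product_id, type_id) and the helper manufacturers_for are assumptions added for illustration; the sample rows are the ones from the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE product (
        id             INTEGER PRIMARY KEY,
        date_issued    TEXT,
        product_fields TEXT
    );
    CREATE TABLE product_type_assoc (
        id         INTEGER PRIMARY KEY,
        product_id INTEGER NOT NULL REFERENCES product (id),
        type_id    INTEGER NOT NULL,
        UNIQUE (product_id, type_id)  -- one row per product/type pair
    );
    CREATE TABLE manufacturer (
        id       INTEGER PRIMARY KEY,
        assoc_id INTEGER NOT NULL REFERENCES product_type_assoc (id),
        name     TEXT
    );
    INSERT INTO product VALUES
        (101, '2010-08-20', 'Book 1'),
        (102, '2010-08-20', 'Book 2'),
        (103, '2010-08-20', 'Book 3');
    INSERT INTO product_type_assoc VALUES (5, 101, 1), (6, 101, 2), (7, 102, 1);
    INSERT INTO manufacturer VALUES (1, 5, 'Digital Printers'), (2, 6, 'Physical Printers');
""")


def manufacturers_for(product_id, type_id):
    """Names of the manufacturers attached to one product/type pair."""
    rows = conn.execute(
        "SELECT m.name FROM manufacturer AS m "
        "JOIN product_type_assoc AS a ON a.id = m.assoc_id "
        "WHERE a.product_id = ? AND a.type_id = ? ORDER BY m.id",
        (product_id, type_id),
    ).fetchall()
    return [name for (name,) in rows]


digital = manufacturers_for(101, 1)
physical = manufacturers_for(101, 2)
```

The pair (product_id, type_id) is stored once in product_type_assoc, and every dependent table carries a single assoc_id instead of repeating both values.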

How to get review cycle duration information from gerrit?

I'm trying to get some data on how long it takes for reviews to go through Gerrit on average.
Looking at some open source code, I see stuff like
reviewCreateTime = moment(mergedReviewsList[review].created);
reviewUpdateTime = moment(mergedReviewsList[review].updated);
interval = reviewUpdateTime.diff(reviewCreateTime, TIME_PERIOD_TYPE);
But with experimentation I don't think this logic is correct because adding a comment to a merged CR changes the updated timestamp.
I know this is possible because at the time of merge, Gerrit prints "Change has been successfully merged by XXX" to the UI.
I've been digging around in the MySQL database but haven't found anything useful. I notice that changes that have been submitted have a submission_id, but I haven't found a table that stores submission information.
After a bunch of digging around, I have come up with one rather ugly but workable solution.
There is a table change_messages
mysql> describe change_messages;
+-----------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+-------------+------+-----+---------+-------+
| author_id | int(11) | YES | | NULL | |
| written_on | timestamp | NO | | NULL | |
| message | text | YES | | NULL | |
| patchset_change_id | int(11) | YES | MUL | NULL | |
| patchset_patch_set_id | int(11) | YES | | NULL | |
| change_id | int(11) | NO | PRI | 0 | |
| uuid | varchar(40) | NO | PRI | | |
+-----------------------+-------------+------+-----+---------+-------+
7 rows in set (0.00 sec)
This basically stores stuff like "XXXX has been successfully merged by YYYY" and "XXXX has been successfully cherry-picked as YYYY by ZZZZ".
You can then join this table with changes and datediff on change_messages.written_on and changes.created_on, e.g.
SELECT changes.change_id,
       created_on,
       written_on,
       DATEDIFF(written_on, created_on) AS diff
FROM change_messages
INNER JOIN changes
        ON change_messages.change_id = changes.change_id
WHERE message LIKE 'Change has been successfully merged by %'
ORDER BY written_on;
Now this includes any time the CR was in draft mode. I'll edit this question if I get around to excluding that time.
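MySQL's DATEDIFF only returns whole days, so short reviews all come out as 0. If finer resolution is needed, the same filter can be applied client-side. The following is a minimal sketch, assuming the rows have already been fetched from changes and change_messages; the function names and the sample data are made up for illustration.

```python
from datetime import datetime, timedelta

MERGE_PREFIX = "Change has been successfully merged by "


def review_durations(created_on_by_change, messages):
    """Map change_id to (merge time - creation time).

    created_on_by_change: {change_id: created_on}
    messages: iterable of (change_id, written_on, message) rows
    """
    durations = {}
    for change_id, written_on, message in messages:
        # message is nullable in change_messages, so guard against None.
        if message and message.startswith(MERGE_PREFIX) and change_id in created_on_by_change:
            durations[change_id] = written_on - created_on_by_change[change_id]
    return durations


def average_duration(durations):
    """Mean review duration as a timedelta."""
    return sum(durations.values(), timedelta()) / len(durations)


# Hypothetical sample rows.
created = {
    1: datetime(2015, 1, 1, 9, 0),
    2: datetime(2015, 1, 2, 9, 0),
}
rows = [
    (1, datetime(2015, 1, 1, 15, 0), "Change has been successfully merged by Alice"),
    (1, datetime(2015, 1, 5, 10, 0), "Patch Set 1: a late comment after the merge"),
    (2, datetime(2015, 1, 3, 9, 0), "Change has been successfully merged by Bob"),
]
durations = review_durations(created, rows)
mean = average_duration(durations)
```

Comments added after the merge are ignored because only the merge message is matched, which avoids the problem with the updated timestamp.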

Designing a Core Data managed object model for an iOS app that creates dynamic databases

I'm working on an iPhone app for users to create mini databases. The user can create a custom database schema and add columns with the standard data types (e.g. string, number, boolean) as well as other complex types such as objects and collections of a data type (e.g. an array of numbers).
For example, the user can create a database to record his meals.
Meal database:
[
{
"timestamp": "2013-03-01T13:00:00",
"foods": [1, 2],
"location": {
"lat": 47.253603,
"lon": -122.442537
}
}
]
Meal-Food database:
[
{
"id": 1,
"name": "Taco",
"healthRating": 0.5
},{
"id": 2,
"name": "Salad",
"healthRating": 0.8
}
]
What is the best way to implement a database for an app like this?
My current solution is to create the following database schema for the app:
When the user creates a new database schema as in the example above, the definition table will look like this:
+----+-----------+--------------+------------+-----------------+
| id | parent_id | name | data_type | collection_type |
+----+-----------+--------------+------------+-----------------+
| 1 | | meal | object | |
| 2 | 1 | timestamp | timestamp | |
| 3 | 1 | foods | collection | list |
| 4 | 1 | location | location | |
| 5 | | food | object | |
| 6 | 5 | name | string | |
| 7 | 5 | healthRating | number | |
+----+-----------+--------------+------------+-----------------+
When the user populates the database, the record table will look like this:
+----+-----------+---------------+------------------------+-----------+-----+
| id | parent_id | definition_id | string_value | int_value | ... |
+----+-----------+---------------+------------------------+-----------+-----+
| 1 | | 1 | | | |
| 2 | 1 | 2 | 2013-03-01T13:00:00 | | |
| 3 | 1 | 3 | | 1 | |
| 4 | 1 | 3 | | 2 | |
| 5 | 1 | 4 | 47.253603, -122.442537 | | |
+----+-----------+---------------+------------------------+-----------+-----+
More details about this approach:
Values for different data types are stored in different columns in the record table. It is up to the app to parse values correctly (e.g. converting timestamp int_value into a date object).
Constraints and validation must be performed on the app as it is not possible on the database level.
What are other drawbacks with this approach and are there better solutions?
First of all, your Record table is very inefficient and somewhat hard to work with. Instead, you can have a separate record table for each record type you need to support. It will simplify everything a lot and add some flexibility, because introducing support for a new record type will not be a problem.
With that being said, basic table management should be enough to make your system functional. Naturally, there is the ALTER TABLE command, but in some cases it might be very expensive, and some engines have various limitations. For example:
SQLite supports a limited subset of ALTER TABLE. The ALTER TABLE
command in SQLite allows the user to rename a table or to add a new
column to an existing table.
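As a rough illustration of the per-type table idea, here is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than Core Data. The helper names (quoted, create_table, add_column) and the mapping from the app's data types to SQLite types are assumptions made up for this example.

```python
import re
import sqlite3

# Assumed mapping from the app's data types to SQLite column types.
TYPE_MAP = {"string": "TEXT", "number": "REAL", "boolean": "INTEGER", "timestamp": "TEXT"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quoted(name):
    """Accept only plain identifiers, because the names come from the user."""
    if not IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsupported name: {name!r}")
    return f'"{name}"'


def create_table(conn, table, columns):
    """Create one table per user-defined record type (needs at least one column)."""
    cols = ", ".join(f"{quoted(name)} {TYPE_MAP[data_type]}" for name, data_type in columns)
    conn.execute(f"CREATE TABLE {quoted(table)} (id INTEGER PRIMARY KEY, {cols})")


def add_column(conn, table, name, data_type):
    """Extend an existing record type with a new column."""
    conn.execute(
        f"ALTER TABLE {quoted(table)} ADD COLUMN {quoted(name)} {TYPE_MAP[data_type]}"
    )


conn = sqlite3.connect(":memory:")
create_table(conn, "food", [("name", "string")])
add_column(conn, "food", "healthRating", "number")
# PRAGMA table_info returns one row per column; index 1 is the column name.
column_names = [row[1] for row in conn.execute('PRAGMA table_info("food")')]
```

Adding a column is cheap here, but table and column names cannot be bound as query parameters, which is why the sketch validates them before building the statement.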
Another approach might be to use BLOBs with type tags to store record values.
This reduces the need to maintain separate tables, and it leads to a schemaless approach.
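The tagged-value idea could look roughly like this. It is a sketch only, assuming SQLite through Python's sqlite3 module and JSON text in place of a binary BLOB; the table name record, the tag names and the {"t": ..., "v": ...} envelope are all made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE record (id INTEGER PRIMARY KEY, schema_name TEXT NOT NULL, body TEXT NOT NULL)"
)


def encode(value):
    """Wrap a value in a {"t": type tag, "v": value} envelope, recursively."""
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return {"t": "boolean", "v": value}
    if isinstance(value, (int, float)):
        return {"t": "number", "v": value}
    if isinstance(value, str):
        return {"t": "string", "v": value}
    if isinstance(value, list):
        return {"t": "list", "v": [encode(item) for item in value]}
    if isinstance(value, dict):
        return {"t": "object", "v": {key: encode(item) for key, item in value.items()}}
    raise TypeError(f"unsupported value: {value!r}")


def decode(tagged):
    """Inverse of encode: strip the envelopes again."""
    tag, value = tagged["t"], tagged["v"]
    if tag == "list":
        return [decode(item) for item in value]
    if tag == "object":
        return {key: decode(item) for key, item in value.items()}
    return value


def save(schema_name, record):
    cur = conn.execute(
        "INSERT INTO record (schema_name, body) VALUES (?, ?)",
        (schema_name, json.dumps(encode(record))),
    )
    return cur.lastrowid


def load(record_id):
    (body,) = conn.execute("SELECT body FROM record WHERE id = ?", (record_id,)).fetchone()
    return decode(json.loads(body))


meal = {
    "timestamp": "2013-03-01T13:00:00",
    "foods": [1, 2],
    "location": {"lat": 47.253603, "lon": -122.442537},
}
meal_id = save("meal", meal)
restored = load(meal_id)
```

The database no longer knows anything about the user's columns, so validation and querying by field have to happen in the app.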
Do you absolutely have to use CoreData for this?
It might make more sense to use a schema-less solution, such as http://developer.couchbase.com/mobile/develop/references/couchbase-lite/release-notes/iOS/index.html

Specflow Feature-level Templates

I'm trying to execute an entire SpecFlow Feature using three different UserID/Password combinations. I'm struggling to find a way to do this in the .feature file without having to introduce any loops in the MSTest.
On the Scenario level I'm doing this:
Scenario Template: Verify the addition functionality
Given the value <x>
And the value <y>
When I add the values together
Then the result should be <z>
Examples:
|x|y|z|
|1|2|3|
|2|2|4|
|2|3|5|
Is there a way to do a similar table at the feature level that will cause the entire feature to be executed for each row in the table?
Is there other functionality available to do the same thing?
I don't think the snippet you have is working, is it? I've updated it below with the corrections I think you need (as Fresh also points out) and a couple of possible improvements.
With this snippet, you'll see that the scenario is run for each line in the table of examples. So, the first test will connect with 'Bob' and 'password', ask your tool to add 1 and 2 and check that the answer is 3.
I've also added an ID column - that is optional but I find it much easier to read the results with an ID number.
Scenario Outline: Verify the addition functionality
Given I am connecting with <username> and <password>
When I add <x> and <y> together
Then the result should be <total>
Examples:
| ID | username | password | x | y | total |
| 1 | Bob | password | 1 | 2 | 3 |
| 2 | Helen | Hello123 | 1 | 2 | 3 |
| 3 | Dave | pa£sword | 1 | 2 | 3 |
| 4 | Bob | password | 2 | 3 | 5 |
| 5 | Helen | Hello123 | 2 | 3 | 5 |
| 6 | Dave | pa£sword | 2 | 3 | 5 |
| 7 | Bob | password | 2 | 2 | 4 |
| 8 | Helen | Hello123 | 2 | 2 | 4 |
| 9 | Dave | pa£sword | 2 | 2 | 4 |
"Is there a way to do a similar table at the feature level that will
cause the entire feature to be executed for each row in the table?"
No, SpecFlow (and indeed the Gherkin language) doesn't have a concept of a "Feature Outline", i.e. a way of specifying a collection of features which should be run in their entirety.
You could possibly achieve what you are looking for by using SpecFlow tags to tag related scenarios. You could then use your test runner to trigger all the scenarios with that tag, e.g.
@related
Scenario: A
Given ...etc...
@related
Scenario: B
Given ...etc.
SpecFlow+ Runner (aka SpecRun, http://www.specflow.org/plus/) provides infrastructure (called test targets) for running the same test suite (or selected scenarios) with different settings. With this you can solve problems like the one you have mentioned. It can also be used to run the same web tests with different browsers, etc. Check this screencast for details: http://www.youtube.com/watch?v=RZYV4Dvhw3w

testing foreign keys with cucumber

I'm trying to set up the background for a cucumber Feature. Ideally I want to be able to do:
Given the following folders exist:
| id | parent_id | name |
| 1 | nil | folder1 |
| 2 | nil | folder2 |
| 3 | 2 | folder3 |
| 4 | 1 | folder4 |
| 5 | 1 | folder5 |
| 6 | 5 | folder6 |
However I can't do this as I can't set the ID of a particular model and so the first row may be created with an ID of 7 and therefore none of the other "child" rows can access it. Name is not unique so I can't do a find_by_name in the step definition. I've got a feeling it's gonna be some ugly nested array solution.
Any ideas how to achieve this?
I don't understand why you can't choose unique names for the purpose of configuring the test?
The way I ended up doing it in my step definitions:
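Given /^the following folders exist:$/ do |table|
  table.hashes.each do |f|
    folder = Folder.new(f)
    folder.save
    # Rewrite the auto-assigned primary key to the id given in the table,
    # so that later rows can reference it through parent_id.
    # to_i keeps the interpolated values numeric.
    ActiveRecord::Base.connection.execute(
      "UPDATE folders SET id = #{f['id'].to_i} WHERE id = #{folder.id.to_i}"
    )
  end
end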
