Normalization - 2NF and 3NF - normalization

https://dba.stackexchange.com/questions/98427/normilsation-2nf-and-3nf
I've been through several questions and YouTube tutorials. I understand that 2NF removes partial dependencies and 3NF removes transitive ones, but I can't get my head around what the following example should look like in 2NF.
studentID | studentName | courseCode | courseTitle | modCode | modTitle | credits | resultCode
My attempt is the following for 2NF:
Student
studentID | studentName | courseCode | modCode | resultCode
Course
courseCode | courseTitle
Module
modCode | modTitle | credits
Is this correct? If not, where am I going wrong, and why?
The following is the 3NF:
Student
studentID | studentName | courseCode
Course
courseCode | courseTitle
Module
modCode | modTitle | credits | courseCode
Results
studentID | modCode | resultCode
The same goes for this: is it correct? If not, where and why?

Ok, let's have a look at your 2NF attempt:
Student
studentID | studentName | courseCode | modCode | resultCode
Course
courseCode | courseTitle
Module
modCode | modTitle | credits
Let's talk about Student first.
studentID alone cannot be your key, because resultCode depends on both studentID and courseCode (you have one result for every course). studentName, however, depends only on studentID, which is just a part of the key (studentID, courseCode). That is a partial dependency, so 2NF is violated. You need to do something like this:
Student
studentID | studentName | courseCode | modCode
Course
courseCode | courseTitle
Module
modCode | modTitle | credits
Result
studentID | courseCode | resultCode
Keep in mind that although this satisfies 2NF, it still does not look right, because modules and courses are now completely unrelated. So try this:
Student
studentID | studentName | modCode
Module
modCode | modTitle | credits
Course
courseCode | courseTitle | modCode
Result
studentID | courseCode | resultCode
From my point of view, a course belongs to a module (a module has many courses). This is automatically in 2NF: the key of every relation except Result is a single attribute, so nothing can depend on a "part" of the key, and Result has just one non-key attribute, which depends on both key attributes. It is in 3NF too, because every "physical" entity has its own logical representation in the data model (more a rule of thumb than a formalism).
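The partial-dependency argument made above for the original Student relation can be checked mechanically. The following is a minimal sketch, not a full normalization tool: it assumes only the two functional dependencies stated in this answer (studentID determines studentName, and studentID plus courseCode determine resultCode), and the helper names closure and partial_dependencies are made up for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of a composite key to the non-key
    attributes it determines on its own (each one breaks 2NF)."""
    key = set(key)
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            dependents = closure(subset, fds) - key
            if dependents:
                found[subset] = dependents
    return found


# Assumed functional dependencies, taken from the reasoning in this answer.
FDS = [
    (("studentID",), ("studentName",)),
    (("studentID", "courseCode"), ("resultCode",)),
]
KEY = ("studentID", "courseCode")

violations = partial_dependencies(KEY, FDS)
print(violations)
```

With these assumptions the only violation reported is studentName hanging off studentID alone, which is exactly why studentName has to move out of the relation keyed by (studentID, courseCode).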
Now let's have a look at your 3NF attempt. I think you've got the module<->course dependency the wrong way round, but let's just concentrate on the normalization.
Student
studentID | studentName | courseCode
Course
courseCode | courseTitle
Module
modCode | modTitle | credits | courseCode
Results
studentID | modCode | resultCode
This is correct 3NF, because there is simply no other key candidate than the key itself. So there can't be any transitive dependency.
To clarify this: a key candidate is one of possibly many minimal sets of attributes that can serve as a key. In each of your relations you have found at least one key candidate with one element (except for the Results relation). So any other key candidate cannot have more than one element. That means you can simply look at every single attribute and decide "can this be a key or not?" In your example you find that no other attribute can be a key, so it's automatically in 2NF and 3NF.
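To see that the 3NF decomposition loses nothing, here is a small sketch that builds the four relations from the question in an in-memory SQLite database and joins them back into the original wide row. The column types and the sample rows (Alice, the CS course, the two modules) are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Course (
    courseCode  TEXT PRIMARY KEY,
    courseTitle TEXT NOT NULL
);
CREATE TABLE Student (
    studentID   INTEGER PRIMARY KEY,
    studentName TEXT NOT NULL,
    courseCode  TEXT NOT NULL REFERENCES Course(courseCode)
);
CREATE TABLE Module (
    modCode    TEXT PRIMARY KEY,
    modTitle   TEXT NOT NULL,
    credits    INTEGER NOT NULL,
    courseCode TEXT NOT NULL REFERENCES Course(courseCode)
);
CREATE TABLE Results (
    studentID  INTEGER NOT NULL REFERENCES Student(studentID),
    modCode    TEXT NOT NULL REFERENCES Module(modCode),
    resultCode TEXT NOT NULL,
    PRIMARY KEY (studentID, modCode)
);
-- Hypothetical sample data.
INSERT INTO Course  VALUES ('CS', 'Computer Science');
INSERT INTO Student VALUES (1, 'Alice', 'CS');
INSERT INTO Module  VALUES ('DB1', 'Databases', 20, 'CS');
INSERT INTO Module  VALUES ('OS1', 'Operating Systems', 10, 'CS');
INSERT INTO Results VALUES (1, 'DB1', 'PASS');
INSERT INTO Results VALUES (1, 'OS1', 'FAIL');
""")

# Rebuild the original unnormalized column list from the four relations.
flat_rows = conn.execute("""
    SELECT s.studentID, s.studentName, c.courseCode, c.courseTitle,
           m.modCode, m.modTitle, m.credits, r.resultCode
    FROM Results r
    JOIN Student s ON s.studentID = r.studentID
    JOIN Module  m ON m.modCode = r.modCode
    JOIN Course  c ON c.courseCode = s.courseCode
    ORDER BY s.studentID, m.modCode
""").fetchall()

for row in flat_rows:
    print(row)
```

Each fact (a student's name, a course title, a module's credits) is stored once, and the join reproduces one wide row per result.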

Related

Do my tables in the db "HAVE" to be plural?

I would like to display Snort IPS events in my dashboard app; these events are written into a database via barnyard2. I'm going through Rails for Zombies, and it says that to access data from database tables, the table name should be plural and the model singular. You would then expect the table with the IPS events to be called events, not event.
Barnyard2 comes with a create_postgres schema script. Below are the tables created by the script. They are all singular.
csdashboard=# \dt
              List of relations
 Schema |       Name        | Type  |    Owner
--------+-------------------+-------+-------------
 public | data              | table | csdashboard
 public | detail            | table | csdashboard
 public | encoding          | table | csdashboard
 public | event             | table | csdashboard
 public | icmphdr           | table | csdashboard
 public | iphdr             | table | csdashboard
 public | opt               | table | csdashboard
 public | reference         | table | csdashboard
 public | reference_system  | table | csdashboard
 public | schema            | table | csdashboard
 public | sensor            | table | csdashboard
 public | sig_class         | table | csdashboard
 public | sig_reference     | table | csdashboard
 public | signature         | table | csdashboard
 public | tcphdr            | table | csdashboard
 public | udphdr            | table | csdashboard
I do not need or want my app to enter anything into these tables, just find and display the info.
My questions are:
Do I need to create a model per table that I will be getting info from?
Since they should be plural, will this create a problem?
You do not have to stick with all conventions and there is an easy way to alter this one.
You can set table name in your model:
class Event < ActiveRecord::Base
  self.table_name = 'event'
end
Yes. Yes.
You need a model per table. Without that you'll have big problems with ActiveRecord.
You'll have some other problems if your table name is not a plural of your model, once again with ActiveRecord.

Database Design for state, cities and districts

I have users represented in a user table and need to design a model to associate them with state/cities/districts that they choose:
On the database side,
Each user will be associated with 1 state, 1 city and a number of districts within that state/city combination. For instance, User A can choose to be associated with "NY" and "Brooklyn" and any X number of districts in "Brooklyn" (or none).
On the view side,
I'd like to present the district choices with checkboxes so they should be able to be pulled from the database field with simple_form in Rails pretty easily.
The design of the database should make it easy to query for the user and get the associated state / city and district relations that the user has chosen.
One idea I have is to simply have a one-to-many field for districts and a district table listing all the different districts. However, is there a way to enforce that the districts have to be valid for the city/state combination on the backend using validate?
Any tips would be appreciated.
Below I have outlined the database schema I would use based on the information you have given.
Every city belongs to exactly one state.
cities
id unsigned int(P)
state_id unsigned int(F states.id)
name varchar(50)
+----+----------+---------------+
| id | state_id | name |
+----+----------+---------------+
| 1 | 33 | New York City |
| .. | ........ | ............. |
+----+----------+---------------+
See ISO 3166 for more information. You didn't ask for countries but it's trivial to add them...
countries
id char(2)(P)
iso3 char(3)(U)
iso_num char(3)(U)
name varchar(45)(U)
+----+------+---------+---------------+
| id | iso3 | iso_num | name |
+----+------+---------+---------------+
| ca | can | 124 | Canada |
| mx | mex | 484 | Mexico |
| us | usa | 840 | United States |
| .. | .... | ....... | ............. |
+----+------+---------+---------------+
Every district belongs to exactly one city.
districts
id unsigned int(P)
city_id unsigned int(F cities.id)
name varchar(50)
+----+---------+-----------+
| id | city_id | name |
+----+---------+-----------+
| 1 | 1 | The Bronx |
| 2 | 1 | Brooklyn |
| 3 | 1 | Manhattan |
| .. | ....... | ......... |
+----+---------+-----------+
See ISO 3166-2:US for more information. Every state belongs to exactly one country.
states
id unsigned int(P)
country_id char(2)(F countries.id)
code char(2)
name varchar(50)
+----+------------+------+----------+
| id | country_id | code | name |
+----+------------+------+----------+
| 1 | us | AL | Alabama |
| .. | .......... | .... | ........ |
| 33 | us | NY | New York |
| .. | .......... | .... | ........ |
+----+------------+------+----------+
Based on your information a user belongs to exactly one city. In the example data Bob is associated with New York City. By joining tables you can very easily find that Bob is in New York state and the country of United States.
users
id unsigned int(P)
username varchar(255)
city_id unsigned int(F cities.id)
...
+----+----------+---------+-----+
| id | username | city_id | ... |
+----+----------+---------+-----+
| 1 | bob | 1 | ... |
| .. | ........ | ....... | ... |
+----+----------+---------+-----+
Users can belong to any number of districts. In the example data Bob belongs to The Bronx and Brooklyn. user_id and district_id together form the primary key, which ensures a user cannot be associated with the same district more than once.
users_districts
user_id unsigned int(F users.id) \_(P)
district_id unsigned int(F districts.id) /
+---------+-------------+
| user_id | district_id |
+---------+-------------+
| 1 | 1 |
| 1 | 2 |
| ....... | ........... |
+---------+-------------+
My database model does NOT enforce the rule that the districts a user belongs to must be in the city that user belongs to; in my opinion that logic should be done at the application level. If Bob moves from New York City to Baltimore, I think all of his records should be deleted from the users_districts table and new ones added for his new city.
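As a rough illustration of what "at the application level" could look like, the sketch below rebuilds a trimmed version of the schema above in SQLite and validates district choices before saving them. The function name set_user_districts, the omission of the states and countries tables, and the Baltimore sample district are assumptions made for the example, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE districts (
    id      INTEGER PRIMARY KEY,
    city_id INTEGER NOT NULL REFERENCES cities(id),
    name    TEXT NOT NULL
);
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    city_id  INTEGER NOT NULL REFERENCES cities(id)
);
CREATE TABLE users_districts (
    user_id     INTEGER NOT NULL REFERENCES users(id),
    district_id INTEGER NOT NULL REFERENCES districts(id),
    PRIMARY KEY (user_id, district_id)
);
-- Sample data; the Baltimore district is hypothetical.
INSERT INTO cities VALUES (1, 'New York City'), (2, 'Baltimore');
INSERT INTO districts VALUES
    (1, 1, 'The Bronx'), (2, 1, 'Brooklyn'), (3, 1, 'Manhattan'),
    (4, 2, 'Fells Point');
INSERT INTO users VALUES (1, 'bob', 1);
""")


def set_user_districts(conn, user_id, district_ids):
    """Replace a user's districts, rejecting any outside the user's city."""
    row = conn.execute(
        "SELECT city_id FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise ValueError(f"unknown user {user_id}")
    valid = {
        r[0]
        for r in conn.execute(
            "SELECT id FROM districts WHERE city_id = ?", (row[0],)
        )
    }
    invalid = set(district_ids) - valid
    if invalid:
        raise ValueError(f"districts not in the user's city: {sorted(invalid)}")
    with conn:  # delete and re-insert in a single transaction
        conn.execute("DELETE FROM users_districts WHERE user_id = ?", (user_id,))
        conn.executemany(
            "INSERT INTO users_districts (user_id, district_id) VALUES (?, ?)",
            [(user_id, d) for d in sorted(set(district_ids))],
        )


set_user_districts(conn, 1, [1, 2])  # Bob picks The Bronx and Brooklyn
```

Calling set_user_districts(conn, 1, [4]) raises ValueError here, because district 4 belongs to a different city than Bob's, and his existing rows are left untouched.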
As for the user interface, I would have the user:
Select a country - this will auto-populate a drop down list of associated states.
Select a state - this will auto-populate a drop down list of associated cities.
Select a city - this will auto-populate a list of associated districts.
Allow the user to select any number of districts.
You will need some combination of database and application-level logic.
Here is how I would build the database fields:
users = id, <other user fields>, city_id
districts = id, <other district fields>, city_id
cities = id, name, state_id
states = id, name
And then in the application, set it up so that the user can type in one city and multiple districts, and can not edit the state (view only):
When the user types in a city, perhaps through an autocomplete field, it automatically updates the read-only state field with the state of the city
When the user types in a district, list only the districts that have district.city_id == cities.id
If you don't want to restrict the district selection in the UI, you will need to enforce the district.city_id == cities.id check in your application, though I personally think that's less intuitive than doing it right in the front-end UI.
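If you would rather have the database reject mismatched districts itself, one possible approach (not proposed in either answer, so treat it as an alternative sketch) is to carry city_id into the join table and use composite foreign keys. The example uses SQLite, where foreign keys must be switched on per connection; the extra city_id column, the UNIQUE (id, city_id) constraints, and the helper link_district are assumptions introduced for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE districts (
    id      INTEGER PRIMARY KEY,
    city_id INTEGER NOT NULL REFERENCES cities(id),
    name    TEXT NOT NULL,
    UNIQUE (id, city_id)          -- target for the composite foreign key
);
CREATE TABLE users (
    id       INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    city_id  INTEGER NOT NULL REFERENCES cities(id),
    UNIQUE (id, city_id)          -- target for the composite foreign key
);
CREATE TABLE users_districts (
    user_id     INTEGER NOT NULL,
    district_id INTEGER NOT NULL,
    city_id     INTEGER NOT NULL,
    PRIMARY KEY (user_id, district_id),
    -- Both the user and the district must share this row's city_id.
    FOREIGN KEY (user_id, city_id)     REFERENCES users (id, city_id),
    FOREIGN KEY (district_id, city_id) REFERENCES districts (id, city_id)
);
-- Sample data; the Baltimore district is hypothetical.
INSERT INTO cities VALUES (1, 'New York City'), (2, 'Baltimore');
INSERT INTO districts VALUES (2, 1, 'Brooklyn'), (4, 2, 'Fells Point');
INSERT INTO users VALUES (1, 'bob', 1);
""")


def link_district(conn, user_id, district_id, city_id):
    """Try to link a user to a district; return False if the DB rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO users_districts (user_id, district_id, city_id) "
                "VALUES (?, ?, ?)",
                (user_id, district_id, city_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(link_district(conn, 1, 2, 1))  # Brooklyn is in Bob's city
print(link_district(conn, 1, 4, 1))  # district 4 is not in city 1
print(link_district(conn, 1, 4, 2))  # Bob is not in city 2
```

The trade-off is a redundant city_id column in the join table, and a user's district rows have to be removed before that user's city_id can change.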
Indian States AND UT MySQL QUERY
INSERT INTO `states`
VALUES
(1,'Andhra Pradesh'),
(2,'Telangana'),
(3,'Arunachal Pradesh'),
(4,'Assam'),
(5,'Bihar'),
(6,'Chhattisgarh'),
(7,'Chandigarh'),
(8,'Dadra and Nagar Haveli'),
(9,'Daman and Diu'),
(10,'Delhi'),
(11,'Goa'),
(12,'Gujarat'),
(13,'Haryana'),
(14,'Himachal Pradesh'),
(15,'Jammu and Kashmir'),
(16,'Jharkhand'),
(17,'Karnataka'),
(18,'Kerala'),
(19,'Madhya Pradesh'),
(20,'Maharashtra'),
(21,'Manipur'),
(22,'Meghalaya'),
(23,'Mizoram'),
(24,'Nagaland'),
(25,'Orissa'),
(26,'Punjab'),
(27,'Pondicherry'),
(28,'Rajasthan'),
(29,'Sikkim'),
(30,'Tamil Nadu'),
(31,'Tripura'),
(32,'Uttar Pradesh'),
(33,'Uttarakhand'),
(34,'West Bengal'),
(35,'Lakshadweep'),
(36,'Ladakh');

select distinct records based on one field while keeping other fields intact

I've got a table like this:
table: searches
+------------------------------+
| id | address | date |
+------------------------------+
| 1 | 123 foo st | 03/01/13 |
| 2 | 123 foo st | 03/02/13 |
| 3 | 456 foo st | 03/02/13 |
| 4 | 567 foo st | 03/01/13 |
| 5 | 456 foo st | 03/01/13 |
| 6 | 567 foo st | 03/01/13 |
+------------------------------+
And want a result set like this:
+------------------------------+
| id | address | date |
+------------------------------+
| 2 | 123 foo st | 03/02/13 |
| 3 | 456 foo st | 03/02/13 |
| 4 | 567 foo st | 03/01/13 |
+------------------------------+
But ActiveRecord seems unable to achieve this result. Here's what I'm trying:
Model has a 'most_recent' scope: scope :most_recent, order('date_searched DESC')
Model.most_recent.uniq returns the full set (SELECT DISTINCT "searches".* FROM "searches" ORDER BY date DESC) -- obviously the query is not going to do what I want, but neither is selecting only one column. I need all columns, but only rows where the address is unique in the result set.
I could do something like Model.select('distinct(address), date, id'), but that feels...wrong.
You could do a
select max(id), address, max(date) as latest
from searches
group by address
order by latest desc
According to sqlfiddle that does exactly what I think you want.
It's not quite the same as your required output, which doesn't seem to care about which ID is returned. Still, the query needs to specify something, which is done here with the "max" aggregate function.
I don't think you'll have any luck with ActiveRecord's autogenerated query methods for this case. So just add your own query method using that SQL to your model class. It's completely standard SQL that'll also run on basically any other RDBMS.
Edit: One big weakness of the query is that it doesn't necessarily return actual records. If the highest ID for a given address doesn't correlate with the highest date for that address, the resulting "record" will be different from the one actually stored in the DB. Depending on the use case, that may or may not matter. For MySQL, simply changing max(id) to id would fix that problem, but IIRC Oracle has a problem with that.
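One way around the weakness described above is to pick, for each address, the id of its most recent row with a correlated subquery, so every returned row is a real record. The sketch below runs that idea against the question's sample data in SQLite. It is an alternative to the GROUP BY query, not the answerer's method. It assumes the dates are stored in a sortable ISO format and that ties on date are broken by the lowest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE searches (id INTEGER PRIMARY KEY, address TEXT, date TEXT);
-- Sample rows from the question, with dates rewritten as ISO strings.
INSERT INTO searches VALUES
    (1, '123 foo st', '2013-03-01'),
    (2, '123 foo st', '2013-03-02'),
    (3, '456 foo st', '2013-03-02'),
    (4, '567 foo st', '2013-03-01'),
    (5, '456 foo st', '2013-03-01'),
    (6, '567 foo st', '2013-03-01');
""")

# For each address keep the single row with the latest date
# (lowest id wins when two rows share that date).
latest_per_address = conn.execute("""
    SELECT s.id, s.address, s.date
    FROM searches s
    WHERE s.id = (
        SELECT s2.id
        FROM searches s2
        WHERE s2.address = s.address
        ORDER BY s2.date DESC, s2.id ASC
        LIMIT 1
    )
    ORDER BY s.date DESC, s.id
""").fetchall()

for row in latest_per_address:
    print(row)
```

On the sample data this yields ids 2, 3 and 4, matching the result set asked for in the question.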
To show unique addresses:
Searches.group(:address)
Then you can select columns if you want:
Searches.group(:address).select('id,date')

SelfJoin using Symfony 1.4/propel 1.4

I need to do self join using Symfony 1.4/Propel 1.4. My tables/db are too big to put here but an example table is given below to replicate the issue I'm facing.
Consider following example table with example data
Table Employee
----------------------------------------
|id | name | mid |
----------------------------------------
|1 | CEO |NULL |
|2 | CTO |1 |
|3 | CFO |1 |
|4 | PM1 |2 |
|5 | TL1 |4 |
----------------------------------------
Here the first column is the employee id, the second is the employee name, and the third is the manager id. mid is a link to another row in the same table. For example, CTO (2) reports to CEO (1), so mid in the second row is 1.
I need following output:
---------------------
|ename | manager |
---------------------
|CTO | CEO |
|CFO | CEO |
|PM1 | CTO |
|TL1 | PM1 |
---------------------
The SQL query will be:
SELECT e.name,m.name
FROM employee e, employee m
WHERE e.mid=m.id
AND e.mid IS NOT NULL;
My problem is: how do I write the same query in Symfony/Propel 1.4? I tried the following:
$c = new Criteria();
$c->clearSelectColumns();
$c->addSelectColumn(EmployeePeer::NAME.' as ename');
$c->addSelectColumn(EmployeePeer::NAME.' as manager');
$c->setPrimaryTableName(EmployeePeer::TABLE_NAME);
$c->addJoin(EmployeePeer::MID, EmployeePeer::ID, Criteria::INNER_JOIN);
$c->add(EmployeePeer::MID, NULL, Criteria::EQUAL);
I know this query does not make any sense, and as expected I got a PropelException.
But a self join is a common database operation, and I'm sure Propel must support it. Can someone please tell me how to achieve the above in Symfony/Propel 1.4?
According to this SQLFiddle, the SQL you want to perform is:
SELECT e.name as ename, m.name as manager
FROM employee e
LEFT JOIN employee m ON e.mid = m.id
WHERE e.mid IS NOT NULL;
Like YouthPark, I think addAlias is the solution, and I would do something like this:
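$c = new Criteria();
$c->clearSelectColumns();
// Employee name comes from the primary table, manager name from the aliased copy.
$c->addSelectColumn(EmployeePeer::NAME.' as ename');
$c->addSelectColumn(EmployeePeer::alias('c2', EmployeePeer::NAME).' as manager');
$c->addAlias('c2', EmployeePeer::TABLE_NAME);
// Join employee.mid to c2.id, matching the SQL above.
$c->addJoin(EmployeePeer::MID, EmployeePeer::alias('c2', EmployeePeer::ID), Criteria::LEFT_JOIN);
// Criteria::ISNOTNULL is a comparison, so it goes in the third argument.
$c->add(EmployeePeer::MID, null, Criteria::ISNOTNULL);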
I'm not sure about the addSelectColumn part, by the way.
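Independently of Propel, the self-join SQL itself is easy to sanity-check against the sample Employee table. Here is a minimal sketch using Python's built-in sqlite3 module; the column types and the ORDER BY are assumptions added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    mid  INTEGER REFERENCES employee(id)   -- manager id, NULL for the top
);
INSERT INTO employee VALUES
    (1, 'CEO', NULL),
    (2, 'CTO', 1),
    (3, 'CFO', 1),
    (4, 'PM1', 2),
    (5, 'TL1', 4);
""")

# The same table appears twice under two aliases: e is the employee,
# m is that employee's manager.
pairs = conn.execute("""
    SELECT e.name AS ename, m.name AS manager
    FROM employee e
    JOIN employee m ON e.mid = m.id
    ORDER BY e.id
""").fetchall()

for ename, manager in pairs:
    print(ename, manager)
```

With an inner join, rows whose mid is NULL never match, so the separate IS NOT NULL filter is only needed when a LEFT JOIN is used.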
I have never tried this, so I'm not sure whether it will help, but since there are no other answers, you might try (or search further for) the addAlias method if you are stuck.
$notifCrit->addAlias("A", ThreadsPeer::TABLE_NAME);
$notifCrit->add("A.father_id", ThreadsPeer::FATHER_ID."=A.father_id", Criteria::CUSTOM);
Taken from the last comment on the old symfony forums.
I'm not sure, but Propel 1.4 might not support self joins with its built-in methods, since an alias needs to be set. So you need a custom query as in the example above.
$c = new Criteria();
$c->addJoin(ArticlePeer::AUTHOR_ID, AuthorPeer::ID);
$c->add(AuthorPeer::NAME, 'John Doe');
$articles = ArticlePeer::doSelect($c);

Linq query for text localization

I use Entity Framework and need assistance with a LINQ query.
I'm building an application that will store articles.
The same article can be translated into many languages.
So I have 2 tables:
Article table:
ArticleId
ResourceTitleId (FK: LocalizedContent.ResourceId)
ResourceContentId (FK: LocalizedContent.ResourceId)
LocalizedContent table:
ResourceId
LanguageId
Content
So, for the sake of example, if I have an article in English and Russian,
I would store one row in the Article table, which would look like this:
ArticleId | ResourceTitleId | ResourceContentId |
-----------|-----------------|-------------------|
1| 1 | 2 |
And then, LocalizedContent table will look like this:
ResourceId | LanguageId | Content |
------------|------------|---------|
1| 1 | aaa |
------------|------------|---------|
1| 2 | zzz |
------------|------------|---------|
2| 1 | bbb |
------------|------------|---------|
2| 2 | yyy |
And now for the question:
I want to select an article by language id (let's say English), and I want my result to look like this:
ArticleId | ResourceTitle | ResourceContent |
-----------|---------------|-----------------|
1| aaa | bbb |
How do I write a LINQ query that retrieves that result in one query?
Just perform an inner join between the two tables filtering them by LanguageId.
var english = 1;
var query =
    from article in dc.Articles
    join resourceTitle in dc.LocalizedContent
        on article.ResourceTitleId equals resourceTitle.ResourceId
    join resourceContent in dc.LocalizedContent
        on article.ResourceContentId equals resourceContent.ResourceId
    where resourceTitle.LanguageId == english
        && resourceContent.LanguageId == english
    select new
    {
        article.ArticleId,
        ResourceTitle = resourceTitle.Content,
        ResourceContent = resourceContent.Content,
    };
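The same double join can be tried outside Entity Framework to confirm it produces the expected row. The sketch below expresses it as plain SQL through Python's sqlite3 module, using the sample data from the question; the column types, the composite primary key, and the helper name articles_in_language are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LocalizedContent (
    ResourceId INTEGER NOT NULL,
    LanguageId INTEGER NOT NULL,
    Content    TEXT NOT NULL,
    PRIMARY KEY (ResourceId, LanguageId)
);
CREATE TABLE Article (
    ArticleId         INTEGER PRIMARY KEY,
    ResourceTitleId   INTEGER NOT NULL,
    ResourceContentId INTEGER NOT NULL
);
INSERT INTO LocalizedContent VALUES
    (1, 1, 'aaa'), (1, 2, 'zzz'),
    (2, 1, 'bbb'), (2, 2, 'yyy');
INSERT INTO Article VALUES (1, 1, 2);
""")


def articles_in_language(conn, language_id):
    """Join LocalizedContent twice: once for the title, once for the body."""
    return conn.execute(
        """
        SELECT a.ArticleId, t.Content AS ResourceTitle, c.Content AS ResourceContent
        FROM Article a
        JOIN LocalizedContent t
          ON t.ResourceId = a.ResourceTitleId AND t.LanguageId = ?
        JOIN LocalizedContent c
          ON c.ResourceId = a.ResourceContentId AND c.LanguageId = ?
        ORDER BY a.ArticleId
        """,
        (language_id, language_id),
    ).fetchall()


english_rows = articles_in_language(conn, 1)
print(english_rows)
```

Because both joins are inner joins, an article missing either its title or its content in the requested language drops out of the result entirely.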
