I was wondering if there are any complications when adding levels to page tables. Say I'm moving from a two-level page table to a deeper multi-level page table: what problems could arise as I increase the number of levels?
More processing is required per memory translation by the CPU: on a TLB miss, each additional level costs one extra memory access during the page-table walk, since the entries are looked up one level at a time.
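That per-level cost can be made concrete with a toy page-table walk. This is a minimal plain-Ruby sketch, with tables as nested hashes and all values invented for illustration; a real MMU works on hardware-defined structures, but the count of dependent memory reads per translation is the point:

```ruby
# Toy page-table walk: one dependent memory read per level on a TLB miss.
# Tables are plain nested Hashes here - purely illustrative, not a real MMU.
def translate(address, root, levels:, bits_per_level: 9, offset_bits: 12)
  accesses = 0
  entry = root
  (levels - 1).downto(0) do |level|
    shift = offset_bits + level * bits_per_level
    index = (address >> shift) & ((1 << bits_per_level) - 1)
    entry = entry[index]        # one memory access per level
    accesses += 1
  end
  offset = address & ((1 << offset_bits) - 1)
  [entry + offset, accesses]    # entry is now the physical frame base
end

addr = 0x12345678
two_level  = { 145 => { 325 => 0x7000 } }
four_level = { 0 => { 0 => { 145 => { 325 => 0x7000 } } } }

translate(addr, two_level,  levels: 2)  # => [0x7678, 2]
translate(addr, four_level, levels: 4)  # => [0x7678, 4]
```

Same physical address either way, but the four-level walk does twice the memory reads, which is why hardware relies on the TLB to skip the walk entirely for hot pages.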
Hello and good morning.
I am working on a side project where I am adding an analytics board to an existing app. The problem is that the users table now has over 400 columns. My question is: what is a better way of organizing this, such as splitting the table into several separate tables? How do you do that, and how do you connect the new tables to each other?
Another concern is that if I separate the table, will I still be able to save to it through the user model? I have code right now that says:
user.wallet += 100
user.save
If I separate wallet from user and link the two tables, will I have to change this code? I'm asking because there is a ton of code like this in the app.
Thank you so much if you can help me understand how to organize a database. As a bonus, if there is a book about database organization, can you recommend it (preferably one that uses Rails)?
Edit: Is there also a way to do all of this without losing any data? For example, transfer the data to a new column on the new table and then destroy the old column.
Please read about:
Database Normalization
You'll get loads of hits when searching for that string and there are many books about database design covering that subject.
It is most likely that your table lacks normalization, but you have to check for yourself!
Just to give some orientation: I would get a little anxious when dealing with a tenth of that number of columns. That said, I should stress that there can be well-normalized tables with 400 columns as well as sloppily designed ones with just 10.
Generally speaking, the probability of dealing with badly designed tables, and hence facing trouble, simply rises with the number of columns.
So take your time, and if you find that the users table needs normalization, the next step would indeed be to spread the data over several tables. Because that clearly (and most likely heavily) affects the code of your application, this is where you have to weigh the pros and cons thoroughly; it is simply impossible to judge that from far away.
Say you have substantial problems (e.g. serious performance problems, or you wouldn't be posting) that could be eased by normalization; there are then different approaches to splitting the data. Here, please read about:
Cardinalities
Usually the new tables are linked by
Foreign Keys
, that is, identical values (like a user id) that appear in multiple tables and are used to join them.
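As for the asker's worry about `user.wallet += 100`: once the column moves to its own table joined by a foreign key, the old call sites can keep working if the user model delegates the old method names to the new record. Here is a plain-Ruby sketch of that delegation idea (class and attribute names are invented); in Rails you would reach for a `has_one` association plus delegation (ActiveSupport's `delegate` macro, or small wrapper methods when the name changes):

```ruby
require "forwardable"

# Hypothetical stand-ins for the user row and the split-off wallet row.
class WalletRecord
  attr_accessor :balance

  def initialize(balance = 0)
    @balance = balance
  end
end

class User
  extend Forwardable
  # Expose the moved column under its old name, so callers don't change.
  def_delegator :@wallet_record, :balance,  :wallet
  def_delegator :@wallet_record, :balance=, :wallet=

  def initialize
    @wallet_record = WalletRecord.new
  end
end

user = User.new
user.wallet += 100   # expands to user.wallet = user.wallet + 100
user.wallet          # => 100
```

The read goes through the delegated getter and the write through the delegated setter, so the `+=` idiom survives the split untouched.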
And finally, yes, you can do that without losing data as the overall amount of information never changes when normalizing.
In case your last question was meant to be technical: there is no problem in reading data from one column and inserting it into a new one (in a new table). That has to happen in a certain order, as foreign keys have to be filled before you can use them. See
Referential Integrity
However, quite obviously, deleting data and dropping columns interferes with the operability of your application, so careful planning is due.
I'm only just starting to learn about views in ActiveRecord from reading a few blog posts and some tutorials on how to set them up in Rails.
What I would like to know is what are some of the pros and cons of using a View instead of a query on existing ActiveRecord tables? Are there real, measurable performance benefits of using a view?
For example, I have a standard merchant application with orders, line items, and products. For an admin dashboard, I have various queries, many of which hit a core query that I reuse a lot in the code, namely one that returns the user_id, order_id, and total revenue for an order. From a business perspective, many good stats are based on that core query. At what point does it make sense to switch to a view instead?
The ActiveRecord docs on Views are also a bit sparse so any references to some good resources on both the why and how would be greatly appreciated.
Clarification: I am not talking about HTML views but SQL database views. Essentially, are the performance wins maintained in an ActiveRecord implementation? Since the docs are sparse, are there any non-obvious gotchas that could cost you those performance wins if views are implemented incorrectly?
I got this information from another developer offline, so I am answering my own question here in case other people stumble upon it.
Basically, starting in Rails 3.1, ActiveRecord began using prepared statements, which preprocess and cache SQL statement patterns so that later queries with the same shape run faster. You can read more about it in this blog post.
This means switching to views may not bring much benefit in PostgreSQL, since views may not perform much better than prepared statements in PG.
The PostgreSQL documentation on prepared statements seems clear and well-written. More thoughts on PostgreSQL views performance can be found in this stackoverflow post.
Additionally, it's probably much more likely that your Rails app has performance issues due to N+1 queries - this is a great post that explains the problem and one of the easiest ways to prevent it with eager loading.
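The N+1 pattern mentioned above is easy to see with a toy datastore that counts "queries" (all names here are invented for illustration); in ActiveRecord, the batched version is roughly what eager loading with `includes` gives you:

```ruby
# Toy datastore that counts queries, to make the N+1 shape visible.
class FakeDB
  attr_reader :queries

  def initialize(users)
    @users   = users   # { id => name }
    @queries = 0
  end

  def find_user(id)    # one query per call
    @queries += 1
    @users[id]
  end

  def find_users(ids)  # one query for the whole batch
    @queries += 1
    @users.slice(*ids)
  end
end

orders = [{ id: 1, user_id: 10 }, { id: 2, user_id: 11 }, { id: 3, user_id: 10 }]

naive = FakeDB.new({ 10 => "ana", 11 => "bo" })
orders.each { |o| naive.find_user(o[:user_id]) }       # one lookup per order
naive.queries   # => 3

eager = FakeDB.new({ 10 => "ana", 11 => "bo" })
eager.find_users(orders.map { |o| o[:user_id] }.uniq)  # one batched lookup
eager.queries   # => 1
```

With N orders, the naive loop issues N user lookups on top of the order query, while the eager version always issues one, which is why fixing N+1s usually dwarfs any view-versus-query micro-optimization.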
This question is not directly related to ActiveRecord; it seems more like a database question to me. The answer is "it depends". People often use a view because they want to:
represent a subset of the data contained in a table
simplify a query by joining multiple tables into one virtual table represented by the view
do aggregation in the view
hide complexity of your data
address security concerns
etc.
But most of the aforementioned features can also be implemented with raw tables; it's just a little more complicated than using views. Another place you may consider using a view is for performance. That is the materialized view (or indexed view, in SQL Server): basically, it saves a copy of your data in the form you want, which can greatly boost performance.
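The trade-off a materialized view makes can be sketched in plain Ruby: the aggregate is computed once, stored, and served from the stored copy until an explicit refresh, so reads are fast but may be stale. The class and data below are hypothetical; in PostgreSQL the real mechanism is `CREATE MATERIALIZED VIEW` with `REFRESH MATERIALIZED VIEW`:

```ruby
# Toy "materialized view" over revenue per user: fast stored reads,
# possibly stale until refreshed - the materialized-view trade-off.
class RevenueByUser
  def initialize(orders)
    @orders = orders      # array of { user_id:, total: }
    refresh
  end

  def refresh
    @rows = @orders.group_by { |o| o[:user_id] }
                   .transform_values { |os| os.sum { |o| o[:total] } }
  end

  def revenue_for(user_id)
    @rows[user_id]        # read from the stored copy, no recomputation
  end
end

orders = [{ user_id: 1, total: 30 }, { user_id: 1, total: 20 }, { user_id: 2, total: 5 }]
view = RevenueByUser.new(orders)
view.revenue_for(1)       # => 50

orders << { user_id: 1, total: 100 }
view.revenue_for(1)       # => 50 (stale until refreshed)
view.refresh
view.revenue_for(1)       # => 150
```

That staleness window is exactly what you accept in exchange for not re-running the aggregation on every dashboard hit.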
I'm trying to understand factors that slow a website down. Say I define Example.all somewhere on a page. Would adding more attributes to my Example model significantly increase the time needed for the website to load, even if I don't use the said attributes on that page, since the server might have to iterate through more columns?
Depends.
Adding more attributes to your Example model will increase load time, but compared with other factors the effect will likely be minor.
A couple of things to speed up your page loading:
Cache your example model.
Use pagination: only load the amount of data needed for a single page.
Only select columns you need. Example:
Example.select("Name", "Date").where(Score: 0)
Consider adding profiling tools to better measure what makes up your load time. Check out MiniProfiler and this example of how to use it: http://railscasts.com/episodes/368-miniprofiler
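The pagination point (item 2 above) boils down to limit/offset arithmetic, sketched here in plain Ruby; in ActiveRecord this is roughly what `Example.limit(per_page).offset((page - 1) * per_page)` does, and gems like kaminari wrap the same arithmetic:

```ruby
# Plain-Ruby sketch of the limit/offset arithmetic behind pagination.
def paginate(rows, page:, per_page:)
  rows[(page - 1) * per_page, per_page] || []
end

rows = (1..95).to_a
paginate(rows, page: 1, per_page: 20)        # => rows 1..20
paginate(rows, page: 5, per_page: 20).size   # => 15 (last partial page)
paginate(rows, page: 6, per_page: 20)        # => [] (past the end)
```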
I want to build a ranking. I have many users in my database and I want to rank them, but I don't know how.
I need to get data from many tables in my database for some calculations. I already do this in a view, which shows the data and the calculations, but I have a problem: is it possible to sort the table columns?
This way, I'm sorting in the view rather than in the database.
I've been searching, but I've only found references to sorting tables within the database.
Sorry for my English, I'm still learning.
Thanks
My fact table holds a user's score in a course he took. Some of the details of the course, which I have to show on the report, come from more than one table (in the actual OLTP db).
Do I create a denormalized version of that course entry in a dimension table?
Or do I just join the fact table directly to the course table, which is in turn joined to the other tables that describe the course (course_type, the faculty who created the course, etc.)?
Snowflaking and bridge tables make the joins more complicated, and not just from a coding perspective: they also make the model less simple for BI users.
In most cases, I would put these directly in existing or additional dimension tables.
For instance, you have a scores fact table, with the user details in a dimension that may or may not hold demographics on the user (perhaps it is only a bridge). Sometimes it is better to split out demographic information: even though gender and age are associated with the user entity, in the dimensional model they might be individual dimensions or lumped into a single dimension, all depending on the usage scenarios.
Perhaps your scores are attached to a state and states have regions (snowflake). It might be far more efficient for analysis to have the region dimension linked directly instead of going through the state dimension.
I think what you will find is that the dimensional model is a very pragmatic denormalization approach. The main things that are non-negotiable are the facts; after that, the choice of dimensions is very much informed by the behavior of the data and your foresight about common usage scenarios, while avoiding both the too-few-dimensions and the too-many-dimensions problems.
Maybe I do not understand your question, but a fact table in a star schema is supposed to be joined to the dimension tables surrounding it.
If you do not feel like making joins, simply create a view, and use the view for reporting.
If you were to post a model (schema), it would be easier to comment/help.
It is a common practice to consolidate several dimensions together, sacrificing normalization in favor of performance. This is usually done when your typical query will need all dimensions together (as opposed to using different bits for different use cases).
Also remember that while you reduce join overhead, there are some drawbacks:
Loss of flexibility, which might hinder development as the warehouse expands
Full table scans take longer (in traditional row-based RDBMS such as SQL Server)
Disk space consumption
You will have to consider each case separately.
It might be worthwhile to also consider the option of creating a materialized view, if such ability is offered by your RDBMS.
We commonly have a snowflake schema as the physical DWH design, but add a reporting view layer that flattens the snowflake schema into a star schema.
This way your OLAP cube becomes much simpler and easier to manage.
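The flattening that such a reporting view performs can be sketched in plain Ruby. The snowflake here is hypothetical (scores reference a state, states reference a region); the "view" walks both hops and emits one denormalized star-style row per fact, which is exactly what a SQL view joining the same tables would return:

```ruby
# Toy snowflake: fact -> state -> region, flattened into star-style rows.
REGIONS = { 1 => "West", 2 => "South" }
STATES  = { "CA" => { name: "California", region_id: 1 },
            "TX" => { name: "Texas",      region_id: 2 } }
SCORES  = [{ user_id: 7, state: "CA", score: 91 },
           { user_id: 8, state: "TX", score: 84 }]

def flattened_scores
  SCORES.map do |row|
    state = STATES.fetch(row[:state])           # first join hop
    { user_id: row[:user_id],
      score:   row[:score],
      state:   state[:name],
      region:  REGIONS.fetch(state[:region_id]) }  # second join hop
  end
end

flattened_scores.first
# => { user_id: 7, score: 91, state: "California", region: "West" }
```

The physical tables stay normalized; only the reporting layer pays the join cost, and the cube consumes the flat rows.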