Is it wise to use Google Tables as a Database? - ruby-on-rails

I just found out about the Unhosted Movement.
I understand the points being made about its advantages over the classical web app approach, which includes a database, whether SQL or NoSQL.
From my point of view there are concerns regarding security and privacy. I believe the disadvantages outweigh the advantages, especially if sensitive data is involved.
I would love to hear about more pros/cons and experiences from you guys. Personally I would rather use Laravel/RoR or a similar framework with scaffolding etc.

I'm about to try that. As far as security/privacy is concerned, you can grant limited access to tables and use SSL. Google still knows everything, of course.
But Fusion Tables isn't a full-blown database after all. Its SQL is highly limited: there are no joins in SELECT, views allow no GROUP BY and only left outer joins, and there are no subqueries, no EXISTS clause, and no users, transactions, locking, or isolation levels, which might be the reason to use a database in the first place. It is also not meant to be one. There are no standard connectors that I'm aware of, so you'll have to use the API. The last post asking for a JDBC driver is some years old, and there still isn't one.

Related

Is there any concept of stored procedures in Cassandra?

For database management, my team is currently using an RDBMS-based solution (MSSQL, to be exact), but we expect to move to Cassandra soon, as we're expecting a huge bump in traffic.
The application logic is currently decoupled from the insertion logic: the application only calls specific SQL procedures, which run some data validations and make the corresponding insertions.
I want to do something similar in Cassandra. However, I am unable to find anything that could help me do so. UDFs are not useful, as they are mostly used in SELECT queries. I'd appreciate the community's help/advice on this, thanks!
The closest feature to a stored procedure will be a batch as it will allow you to "bundle" different DML statements associated to an insert, update or delete.
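As a rough sketch of that idea, the validation that used to live in the stored procedure can move into application code, with the related writes bundled into one CQL batch. The table names, columns, and the `validate_user!` and `build_signup_batch` helpers are hypothetical; the code only builds the CQL text and bind values and does not talk to a cluster.

```ruby
# Hypothetical sketch: validate in application code, then bundle the
# related inserts into a single CQL batch. Table and column names are
# made up for illustration; no Cassandra driver is used here.

def validate_user!(user)
  raise ArgumentError, "email is required" if user[:email].to_s.strip.empty?
  raise ArgumentError, "name is required" if user[:name].to_s.strip.empty?
end

# Returns [cql_text, bind_values] that a driver could later execute.
def build_signup_batch(user)
  validate_user!(user)

  statements = [
    # Denormalized: the same user is written to two query-specific tables.
    ["INSERT INTO users_by_id (id, email, name) VALUES (?, ?, ?)",
     [user[:id], user[:email], user[:name]]],
    ["INSERT INTO users_by_email (email, id, name) VALUES (?, ?, ?)",
     [user[:email], user[:id], user[:name]]]
  ]

  body = statements.map { |cql, _| "  #{cql};" }.join("\n")
  cql  = "BEGIN BATCH\n#{body}\nAPPLY BATCH;"
  [cql, statements.flat_map { |_, values| values }]
end

cql, values = build_signup_batch({ id: 1, email: "a@example.com", name: "Ann" })
puts cql
```

Note that a batch only groups writes; unlike a stored procedure, it carries no logic of its own, so the validation still has to run in the application before the batch is sent.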
If you are moving from an RDBMS to Cassandra, one of the biggest challenges is adjusting to the data modeling required, and more specifically to the denormalization of data. The data model is the key factor in the success (or failure) of any Cassandra implementation, and because of that you can find several resources on the web (to mention the basics: the eBay blog and DataStax Academy's data modeling course).
Good luck with your implementation!

Using different datastores in the same Rails app?

So this is more or less an implementation question. This is the scenario: we have an app which uses MySQL as its datastore (user accounts, transactions, etc.), but we want to add a robust charting feature whose data will be stored in Redis. My question is:
Is it possible to integrate another datastore into an app which already depends on one, and what are the best practices for doing so? Can I use Rack to generate the reports, etc.?
I want to turn this into a sort of open discussion, because I think the need for a solution like this is going to rise as we see more and more key/value stores, and NoSQL stores as well, that offer benefits far different from those of an RDBMS. They all have their benefits, but no one solution covers them all.
Thoughts?
You can have models that do not inherit ActiveRecord::Base. Add your preferred Redis client gem, do whatever config is necessary, and start writing Redis models.
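A minimal sketch of such a model might look like the following. `DailyChart` and `InMemoryStore` are hypothetical names; the in-memory store is only a stand-in so the example runs without a Redis server, and it mimics just the `hincrby` and `hgetall` calls that a Redis client such as redis-rb provides.

```ruby
# Minimal sketch of a model that does not inherit from ActiveRecord::Base.
# InMemoryStore is a hypothetical stand-in so the example runs without a
# Redis server; it mimics only the hincrby/hgetall calls used below.

class InMemoryStore
  def initialize
    @hashes = Hash.new { |h, k| h[k] = Hash.new(0) }
  end

  def hincrby(key, field, amount)
    @hashes[key][field] += amount
  end

  def hgetall(key)
    # Redis returns hash values as strings, so do the same here.
    @hashes[key].transform_values(&:to_s)
  end
end

class DailyChart
  def initialize(name, store)
    @key = "chart:#{name}"
    @store = store
  end

  # Adds `amount` to the counter for the given date.
  def record(date, amount = 1)
    @store.hincrby(@key, date.to_s, amount)
  end

  # Returns a hash of date string => integer count, sorted by date.
  def points
    @store.hgetall(@key).transform_values(&:to_i).sort.to_h
  end
end

chart = DailyChart.new("signups", InMemoryStore.new)
chart.record("2024-01-02")
chart.record("2024-01-01", 2)
chart.record("2024-01-02", 4)
p chart.points
```

Passing the store into the constructor keeps the model independent of any particular client, so in a real app the stand-in could be swapped for whatever Redis connection the app configures.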
I'd like to reopen this topic, because it should be very practical.
I have the same issue. I want to replicate data from SQL to NoSQL. SQL is used as the main database storage because of data integrity, relations, etc., and NoSQL as a secondary storage used for reading. In SQL you have many associations split across many tables, with many one-to-one associations saved in different tables for better readability. With NoSQL these associations should be saved as one document. That gives unbelievable speed: only one load. It is great for data exchange through an API.
Does anyone have positive experience with replicating SQL data into more consistent NoSQL documents?
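To make the idea concrete, here is a small sketch of folding rows from several SQL tables into one read-only document. The table shapes, field names, and the `build_user_document` helper are assumptions for illustration; how the rows are loaded and where the document is stored are left out.

```ruby
require "json"

# Hypothetical rows as they might come back from three SQL tables.
user    = { id: 42, name: "Ann" }
profile = { user_id: 42, bio: "Rubyist", website: "https://example.com" }
orders  = [
  { id: 1, user_id: 42, total: 10 },
  { id: 2, user_id: 42, total: 25 }
]

# Fold the one-to-one and one-to-many rows into a single document,
# dropping the foreign keys that are no longer needed once nested.
def build_user_document(user, profile, orders)
  {
    id: user[:id],
    name: user[:name],
    profile: profile.reject { |key, _| key == :user_id },
    orders: orders.map { |order| order.reject { |key, _| key == :user_id } }
  }
end

document = build_user_document(user, profile, orders)
puts JSON.pretty_generate(document)
```

The SQL tables stay the source of truth; the document is rebuilt whenever the underlying rows change, so a reader needs only one load.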

Is Entity Framework a bad choice for multiple websites and large applications?

Scenario: We currently have a website and are working on building a couple of websites with an admin website. We are using ASP.NET MVC, SQL Server 2005 and Entity Framework 4. Currently we have a single solution that contains all the websites, and all the websites use the same Entity Framework model. The model currently has over 70 tables and will potentially have a lot more in the future... around 400?
Questions: Is the Entity Framework model going to get slower as it grows bigger? I have read quite a few articles saying it is pretty slow due to the additional layers of mapping compared to, say, ADO.NET. Also, we thought of having multiple models, but it seems that is a bad practice too. And is LINQ useful when we are not using any ORM?
So we are curious how large websites using a similar technology achieve good performance while using an ORM like EF, or do they never opt for an ORM? I have also worked on a LINQ to SQL application that had over 150 tables, and we incurred a huge startup penalty: the site took 15-20 seconds to respond when first loaded. I am pretty sure this was due to the large startup cost of the LINQ to SQL ORM. It would be great if someone could share their experience and thoughts on this. What are the best practices to follow? I know it depends on the application, but if performance is a concern, what are the best steps to take?
I don't have a definite answer for you, but I have found this SO post: ORM performance cost. It will probably be informative for you, especially the second-highest answer, which mentions this site:
http://ormbattle.net/
My personal experience is that for any ORM I have seen so far, Joel's Law of Leaky Abstractions applies heavily. So if you are going to use EF, make sure you have alternatives for optimization at hand.
I think you can certainly get EF4 to work in a performant way with a database with a large number of tables. That said, you will certainly have to overcome a number of hurdles that are specific to EF.
I don't think LinqToSql is a good alternative since Microsoft has stopped enhancing it for the most part.
What other alternatives have you considered? ADO.NET? NHibernate? Stored Procedures?
I know NHibernate may have trouble establishing the SessionFactory for 400 tables quickly, but that only happens once when the website application starts, which should be fairly rare if the application is used heavily. Each web request generally has a new Session and creating sessions from the session factory is very quick and inexpensive.
My biggest concern with EF is the management of the thing. If you have multiple models, you're suddenly going to have several times the work maintaining them, making sure you never update the wrong model for the right database, or vice versa. This is a problem for us at the moment, and it looks set to only get worse.
Personally, I like to write SQL rather than rely on an abstraction on top of an abstraction. The DB knows SQL, so I keep it happy with hand-crafted stored procedures, or hand-crafted SQL in some cases. One huge benefit of this is that I can replay code to see what it was trying to do, and see the resulting data by copying and pasting the SQL from the log into the SQL query editor. That, in my opinion, makes support so much easier that it entirely invalidates any programmer benefit you might get from using an ORM in the first place (especially as EF generates absolutely unreadable SQL).
In fact, come to think of it, the only benefit an ORM gives you is that you can code a bit quicker (once you have everything set up and are not changing the schema, of course). Ultimately, I don't think the benefit is worth the cost, not when you consider that I spend most of my coding time thinking about what I'm going to do; the 'doing it' part is relatively small compared to the design, test, support and maintain parts.

The Ruby community values simplicity...what's your argument for simplifying a db schema in a new project?

I'm working on a project with developers who have not worked with Ruby OR Rails before.
They have created a schema that is too complicated, in my opinion. The schema has 117 tables, and obtaining the simplest piece of information would require traversing/joining 7 tables... and of course, there's no "main" table that serves as a sort of key between them. The schema renders many of the Rails tools, like the 'find' method and many of the has_many/belongs_to relationships, almost useless. And coding for all of these relationships will likely take more time than we have the money for.
THE QUESTION:
Assuming you are VERY convinced (IMHO...hehe) that the schema is not ideal, and there are multiple ways to represent the domain, how would you argue FOR simplifying the schema (aside from what I've already said)?
I'll stand up in 2 roles here:
DBA: Database admin/designer.
Dev: Application developer.
I assume the DBA is a person who really knows all the database tricks. Reaallyy knows.
DBA:
The database is the key part of the application and should have a predefined structure in order to serve its purpose well and with the best performance.
If you cannot use an arbitrary schema (one which is reasonably normalised and good), then the tools are wrong.
Dev:
The database is just a data store, so we need to keep it simple and concentrate on the application.
DBA:
The database is not a store, it is the core of the application. There is no application without the database.
Dev:
No. The application is the core. There is no application without the front-end and the business logic applied to it.
And the war begins...
Both points are valid and it is always a trade-off.
If the database will ONLY be used by RoR, then you can use it more like a simple store.
If the DB can be used by other applications OR it will be used with large amounts of data and high traffic, it must enforce some best practices.
Generally there is no way you can disagree with the DBA.
But they can understand your situation and might allow you to loosen the standards a bit so you can be more productive.
So you need to work closely together.
And you need to talk to each other to explain and prove why the database should be like this or that.
Otherwise, the team is broken and the project is likely to fail.
ActiveRecord is a very handy tool, but it cannot do everything for you. By default it does not produce exactly the database structure you expect, so it needs to be tuned.
On the one hand, if the DBA can accept that all PKs are auto-incremented integers, that would make the developer's life easier (ActiveRecord does it by default).
On the other hand, if developers would accept some of the DBA's constraints, it would make the DBA's life easier.
Now to answer your question:
how would you argue FOR simplifying the schema
Do not argue. Meet the team and explain WHY it should be done.
Maybe it really shouldn't and you don't know everything; maybe they are not aware of something.
You could agree on the general structure of the database AND try to describe it using RoR migrations as a meta language.
This way they would see the general picture, and you would get to use your great ActiveRecord.
And also everybody would be on the same page.
Your DB schema should reflect the domain and its relationships.
De-normalisation should only be done when you have measured that there is a performance problem.
7 joins is not excessive or bad, provided you have good indexes in place.
The general way to make this argument up the chain is based on cost. If you do things simply, there will be less code and fewer bugs. The system will be able to be built more quickly, or with more features, and thus will create more ROI. If you can get the money manager on board with that approach, he or she may let you dictate terms to the team. There is the counterargument that extreme over-normalization prevents bad data, but I have found that this is not the case, as the complexity it engenders tends to lead to more errors and more database code in general.
The architectural and technical argument here is simple. You have decided to use Ruby on Rails. Therefore you have decided to use the ActiveRecord pattern. The ActiveRecord pattern is driven by having the database tables match the object model. That's the pattern in use here, and in many other places, so the best practices they are trying to apply for extreme data normalization simply do not apply. Buy a copy of Patterns of Enterprise Application Architecture and put the little red bookmark at page 160 so they can understand how the pattern works from the architecture perspective.
What the DBA types tend to be unaware of is how much work ActiveRecord does for you, from query generation, cascading deletes, optimistic locking, auto populated columns, versioning (with acts_as_versioned), soft deletes (with acts_as_paranoid), etc. There is a strong argument to use well tested, community supported library functions to perform these operations versus custom code that must be maintained by a DBA.
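To give a feel for what one of these features involves, here is a conceptual sketch of optimistic locking, the idea behind ActiveRecord's `lock_version` column. `VersionedStore` and `StaleRecordError` are hypothetical names used only for this illustration; this is not ActiveRecord's implementation.

```ruby
# Conceptual sketch of optimistic locking. Each row carries a version
# number, and a write is rejected if the row changed since it was read.
# VersionedStore and StaleRecordError are hypothetical names.

class StaleRecordError < StandardError; end

class VersionedStore
  def initialize
    @rows = {}
  end

  def insert(id, attrs)
    @rows[id] = attrs.merge(lock_version: 0)
  end

  # Returns a copy, so callers hold a snapshot of the row.
  def find(id)
    @rows.fetch(id).dup
  end

  # Writes only if nobody else has saved since `record` was read.
  def update(id, record, changes)
    current = @rows.fetch(id)
    if current[:lock_version] != record[:lock_version]
      raise StaleRecordError, "row #{id} was changed by someone else"
    end
    @rows[id] = current.merge(changes)
                       .merge(lock_version: current[:lock_version] + 1)
  end
end

store = VersionedStore.new
store.insert(1, { name: "Widget", price: 10 })

alice = store.find(1)
bob   = store.find(1)

store.update(1, alice, { price: 12 })   # succeeds and bumps the version
begin
  store.update(1, bob, { price: 8 })    # bob's snapshot is now stale
rescue StaleRecordError => e
  puts "rejected: #{e.message}"
end
```

Writing and maintaining this kind of check by hand for every table is the sort of work the library otherwise takes off your plate.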
The real issue with DBAs is then that they need some work to do. Let them focus on monitoring performance, finding slow queries in the code, creating indexes and doing backups.
If you end up losing the political battle for a sane schema, you may want to consider switching to DataMapper. It's the next pattern in PoEAA. The other thing you may be able to get them to do is to create views in the database that correspond to the object model. This way, you could use many of the finding capabilities in the ActiveRecord model based on the views, but have custom insert, update, and delete methods.

If using LINQ to SQL is there any good reason to learn SQL queries/syntax anymore?

I do understand SQL querying and syntax because of previous work using ASP.NET web forms and stored procedures, but I would not call myself an "expert" in it.
Since I have been using ASP.NET MVC and LinqToSql it seems that so much of the heavy lifting is done for me and encapsulated away at the SQL end that I'm questioning whether there is any benefit in continuing to top-up my knowledge of SQL queries or whether I'm better off focusing my "learning time" on other things.
Your thoughts?
You should absolutely know SQL and keep your knowledge up-to-date. ORM is designed to ease the pain of doing something tedious that you know how to do, much like a graphing calculator is designed to do something that you can do by hand (and should know how).
The minute you start letting your ORM do things in the database that you don't fully understand is the minute you've lost control over your model.
In my opinion, knowing SQL is more valuable than any vendor specific technology. There will always be cases when those nice prepackaged frameworks will not be able to solve a particular situation and knowledge of advanced SQL will be required.
It is still important to learn SQL queries/syntax. The reason is that you need to at least understand how LINQ to SQL translates to the database behind the scenes.
This will help you when you run into problems, for example when something is not updating correctly or a query's performance needs to improve.
It is the same as needing to understand what assembly language is and how it eventually becomes machine language. You don't have to be an expert, but you should at least be able to write it and understand it.
It is still important to know SQL and the (set-based) paradigm behind it to be able to create efficient SQL statements, even if you're using LinqToSql or any other OR/M.
There will always be situations where you will want to write the query in native SQL because it is not possible to write it in LinqToSql / HQL / whatever, or LinqToSql is just not able to generate a performant query for it.
There will always be situations where you will want to execute an ad-hoc query on a database using native sql, etc...
I think LinqToSQL (or other LINQ to SQL providers) should not prevent you from knowing SQL.
When your query is not returning what you expect, or when it takes 30 minutes to run on the production database, you'd better be able to understand what LTS has generated, and why it is failing.
I know, it's a rehashed topic, and it might not be applicable to what you do ("small" database that will never hit that kind of problem etc), but it pays not to get too oblivious of abstraction layers sometimes.
The other reason is that LINQ does not cover the whole range of what you can do in SQL, so you might have to resort to writing "raw" SQL, even if the result is materialised as objects.
It depends what you're working on, and from what you said it might make more sense to focus on other areas.
Having said that I find knowing SQL allows the following:
The ability to write queries to extract data from systems easily, for ad hoc queries or for checking things.
The ability to write complex stored procedures, which allows me to group complex data processing in one place, where it should be: in the database.
The ability to fine-tune LinqToSql by adding indexes and understanding the SQL/query plans it produces.
Most of these are more of a help on more complex systems, so if you're not working on those it might not be as much of a help.
It may help in your situation to list the technologies which might be of use, and then prioritise them.
In other words, make a development plan for yourself, which may encompass more than just technical knowledge and allow a broader focus on things like design patterns, communication skills and other areas.
SQL is a tool. Linq to SQL is also a tool. Having more tools in your belt is a good thing. It'll give you more perspectives when attacking a problem.
Consider a scenario where you may want to do multiple queries or multiple updates to the db in one operation. If you can write TSQL you can potentially save yourself a lot of roundtrips to the database.
I would say you definitely need to know your SQL in depth, because you need to know what code your LINQ expression generates and what effects that code will have if you want high-performing queries. Sure, you might get the job done in most cases, but sometimes a very subtle difference in LINQ syntax makes a huge difference in performance.
I ran into this this morning actually, where I had done .Any(d => d.Id == (...).First().Id) instead of doing where (...).Any(i => i.Id == d.Id). This resulted in the query executing five times slower.
Sometimes you need to analyze the actual Sql-query to realise the mistakes you make.
It's always a good thing to learn the underlying language for stuff like LINQ to SQL. SQL is pretty much standardized, and it will help you understand a new paradigm in programming.
You may not always be working in .NET.
Doesn't hurt to know the underlying concepts.
LINQ to SQL is not being maintained anymore, in favor of the Entity Framework.
Sooner or later you will run into problems that need at least a working knowledge of SQL to solve. And sooner or later you will run into requirements that are best realised in the DB (whether in SPs or in triggers or views or whatever).
LINQ to SQL will only work with .NET. If you happen to get another job where you are not working with .NET, then you will have to go back to writing stored procs.
Knowing SQL will also give you a better understanding of how the server operates as well as possibly making you a better database designer.
