I'm trying to store two types of product identifications together, one for each product color for each product.
I want to do this without creating another table, just by storing a set of id pairs. My form looks like this:
How can I submit and store this data in a consistent way? Is hstore a good alternative for this?
Why don’t you want to create another table? Your data model seems to necessitate it. I would recommend avoiding database-specific solutions when a Rails-based one (adding another model) would be more universal and is trivially supported.
I am quite a newbie to Cube.js. I have been trying to integrate Cube.js analytics functionality with my Ruby on Rails app. The database is PostgreSQL. In a database, there is a certain column called answers_json with jsonb data type which contains a nested hash. An example of data of that column is:
**answers_json:**
"question_weights_calc"=>
{"314"=>{"329"=>1.5, "331"=>4.5, "332"=>1.5, "333"=>3.0},
"315"=>{"334"=>1.5, "335"=>4.5, "336"=>1.5, "337"=>3.0},
"316"=>{"338"=>1.5, "339"=>3.0}}
There are many more keys in the same column with the same hash structure as shown above. I posted only this part because it is the part I need to work with. I need assistance with accessing the values in this nested hash. In the example above, the keys "314", "315" and "316" are Category IDs. The keys nested under Category ID "314" ("329", "331", "332" and "333") are Question IDs. Each category will have multiple questions, and the Category IDs and Question IDs differ from record to record. I need to access the value stored under each Question ID. For example, to access the value 1.5 I need to do this in my schema file:
**sql: `(answers_json -> 'question_weights_calc' -> '314' ->> '329')`**
But the issue here is, those ids will be dynamic for different records in the database. Instead of "314" and "329", they can be some other numbers. Adding different record's json here for clarification:
**answers_json:**
"question_weights_calc"=>{"129"=>{"273"=>6.0, "275"=>15.0, "277"=>8.0}, "252"=>{"279"=>3.0, "281"=>8.0, "283"=>3.0}}}
How can I discover and access those dynamic IDs and their values? I also need to perform mathematical operations on the values. Thanks!
As a general rule, it's difficult to run SQL-based reporting on highly dynamic JSON data. Postgres does have some useful functions for dealing with JSON, and since your column is jsonb you might be able to use jsonb_each or jsonb_object_keys (the jsonb counterparts of json_each and json_object_keys) plus a few joins to get there, but it's quite likely that such a query would be hard to keep performant and maintainable, to say the least 😅 Cube.js ultimately executes SQL queries, so if you do go the above route, the query should be easily transferable to a Cube.js schema.
Another approach would be to create a separate data processing pipeline that collects all the JSON data and flattens it into a single table. The pipeline should then store this data back in your database of choice, from where you could then use Cube.js to query it.
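As a rough illustration of the flattening step, here is a minimal sketch in plain Ruby. The helper name `flatten_question_weights`, the `record_id` argument and the row keys are assumptions made for the example, and the sample is the second record from the question rewritten as JSON text. A real pipeline would read the rows from your database and write the result to a new table.

```ruby
require "json"

# Hypothetical helper: turns one record's answers_json into flat rows,
# one row per (category, question) pair.
def flatten_question_weights(record_id, answers_json)
  weights = JSON.parse(answers_json).fetch("question_weights_calc", {})
  weights.flat_map do |category_id, questions|
    questions.map do |question_id, weight|
      { record_id: record_id,
        category_id: Integer(category_id),
        question_id: Integer(question_id),
        weight: Float(weight) }
    end
  end
end

# Second example record from the question, written as JSON text.
sample = '{"question_weights_calc":{"129":{"273":6.0,"275":15.0,"277":8.0},' \
         '"252":{"279":3.0,"281":8.0,"283":3.0}}}'

rows = flatten_question_weights(1, sample)
rows.each { |row| puts row.inspect }

# Once the data is flat, the math is ordinary aggregation,
# for example a total weight per category.
totals = rows.group_by { |row| row[:category_id] }
             .map { |category_id, group| [category_id, group.map { |r| r[:weight] }.inject(0.0, :+)] }
             .to_h
puts totals.inspect
```

With the rows stored in a table shaped like this, the dynamic IDs become ordinary column values, so a Cube.js schema can group and sum them without knowing the IDs in advance.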
I have setup a time-series / events database using the AWS Firehose -> S3/Glue -> Athena stack. It is being used to track various user actions - session started, action performed etc. across a number of our products. My question is about how best to store different types of IDs in this system.
The existing schema is one big 'fact table' with a bunch of different columns. Two of the most important columns are event_type_id and object_id. To use StackOverflow as an example, two events might be:
question_asked - in this case I would be storing the question id in the object_id column.
tag_created - in this case I would be storing the tag id in the object_id column.
My question is - is storing multiple different types of IDs in the same column bad practice? It's working OK for us at the moment, but it does require the person/system performing queries to know what type of object the object_id column refers to, based on the event they are querying.
If bad practice, what other approaches might be better? Multiple columns where they are NULL if not relevant for the event in that row? Or is this where dimension tables would be a better fit?
This isn't necessarily bad practice, depending on how you use it.
It sounds like you're aware of the potential pitfalls of such an approach (i.e. users of the data have to be aware of the context, in this case the event type, to use the values correctly). As you're using Athena, you could mitigate that by creating views over the source table, one per event type. Each view would filter on the event type in its WHERE clause and could rename object_id to something more context-specific, e.g. question_id.
This makes it easier for users to work with the data and understand exactly what the values are they're working with.
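To make the view idea concrete, here is a small sketch that generates one `CREATE OR REPLACE VIEW` statement per event type. Everything named here is a placeholder: the `events` table, the numeric event type ids, and the view and alias names are assumptions, not part of the original schema, and it assumes event_type_id is an integer.

```ruby
# Hypothetical mapping from view name to event type and a clearer id name.
EVENT_VIEWS = {
  "question_asked_events" => { event_type_id: 1, id_alias: "question_id" },
  "tag_created_events"    => { event_type_id: 2, id_alias: "tag_id" }
}.freeze

# Builds the SQL for one per-event-type view over the (assumed) events table.
# A real view would also select whatever other columns the fact table has.
def view_sql(view_name, event_type_id, id_alias, source_table = "events")
  <<~SQL
    CREATE OR REPLACE VIEW #{view_name} AS
    SELECT event_type_id, object_id AS #{id_alias}
    FROM #{source_table}
    WHERE event_type_id = #{Integer(event_type_id)}
  SQL
end

EVENT_VIEWS.each do |view_name, config|
  puts view_sql(view_name, config[:event_type_id], config[:id_alias])
end
```

Users would then query `question_asked_events` and see a `question_id` column, without needing to remember which event type gives object_id that meaning.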
In a big data environment I wouldn't recommend creating dimension tables if it can be avoided as JOINs between tables start to get expensive. Having multiple columns for different ids is possible but then you create new problems for users such as having to account for NULL values in an Id column, and this also potentially makes it harder to add new event types and ids as you have to change the schema to accommodate them.
I'm using Rails 3.2 and MariaDB. I have this group of data:
description, services, facilities
Not indexed and purely for output in the show page. Should I store these as one JSON object in one more_info attribute or store as separate attributes?
I personally would make columns for them. It would generally make the fields easier to work with, especially if there will be a need to update the values. I usually reserve JSON-serialized fields for cases where I do not know how many attributes there will be.
If you are showing the data to your users I would recommend saving them in different columns. I find that as soon as users see something they want to filter by it or work with it in ways you have not foreseen.
If you are not, then the choice is less clear-cut, but the very fact that you have 3 distinct groups suggests that they are different things which could be treated differently as your application matures.
I would always go with the Normalised form unless you have documented reasons not to.
I want to allow users to create drafts of several models (such as article, blog post etc). I am thinking of implementing this by creating a draft model for each of my current models (such as articleDraft, blogpostDraft etc.). Is there a better way to do this? Creating a new model for every existing model that should support drafts seems messy and is a lot of work.
I think the better way is to have a flag in the table (e.g. an int column called draft) to identify whether the record is a draft or not.
Advantages of having such a column without a separate table, as I see them:
It's easy to make your record non-draft (just change the flag)
you will not duplicate data (because practically you will have the same in draft and non-draft records)
coding will be easy, no complex logic
all the data will be in one place and hence less room for error
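A minimal plain-Ruby sketch of the flag approach follows. The `Article` struct is only a stand-in for a model row, and its `draft` member mirrors the flag column. In a Rails app the same filtering would be a query condition on that column.

```ruby
# Plain-Ruby stand-in for a model row; `draft` mirrors the flag column.
Article = Struct.new(:title, :body, :draft) do
  # Making a record non-draft is just a flag change.
  def publish!
    self.draft = false
    self
  end
end

articles = [
  Article.new("Hello", "First post", false),
  Article.new("WIP", "Unfinished", true)
]

# Equivalent of filtering with WHERE draft = FALSE.
published_before = articles.reject(&:draft).map(&:title)

articles.last.publish!
published_after = articles.reject(&:draft).map(&:title)

puts published_before.inspect
puts published_after.inspect
```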
I've been working on Draftsman, a Ruby gem for creating a draft state of your ActiveRecord data.
Draftsman's default approach is to store draft data for all drafted models in a single drafts table via a polymorphic relationship. It stores the object state as JSON in an object column and optionally stores JSON data representing changes in an object_changes column.
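To picture the single-table idea, here is an illustration in plain Ruby. It is not Draftsman's API. The `Draft` struct, the `DRAFTS` array standing in for the table, and the helpers `save_draft` and `drafts_for` are all hypothetical names chosen for the sketch.

```ruby
require "json"

# Illustration only, NOT Draftsman's API: one shared store for drafts of
# any model, keyed by a type/id pair (the polymorphic part), with the
# drafted attributes serialized as JSON in `object`.
Draft = Struct.new(:item_type, :item_id, :object)

DRAFTS = [] # stands in for the single drafts table

def save_draft(item_type, item_id, attributes)
  draft = Draft.new(item_type, item_id, JSON.generate(attributes))
  DRAFTS << draft
  draft
end

def drafts_for(item_type, item_id)
  DRAFTS.select { |d| d.item_type == item_type && d.item_id == item_id }
        .map { |d| JSON.parse(d.object) }
end

save_draft("Article", 1, "title" => "Draft title")
save_draft("BlogPost", 7, "title" => "Another draft")

puts drafts_for("Article", 1).inspect
```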
Draftsman also allows you to create a separate draft model for each model (e.g., article_drafts, blog_post_drafts) if you want. I agree that this approach is fairly cumbersome and error-prone.
The real advantage to splitting the draft data out into separate models (or to just use a boolean draft flag on the main table, per sameera207's answer) is that you don't end up with a gigantic drafts table with tons of records. I'd offer that that only becomes a real problem when your application has a ton of usage though.
All that to say that my ultimate recommendation is to store all of your draft data in the main model (blog) or a single drafts table, then separate out as needed if your application needs to scale up.
Check out the Active Record Versioning category at The Ruby Toolbox. The current leader is Paper Trail.
I'd go down the state machine route. You can validate each attribute when the model's in a certain state only. Far easier than multiple checkboxes and each state change can have an action (or actions) associated with it.
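Here is a hand-rolled sketch of what state-dependent validation could look like, in plain Ruby without any state machine gem. The `Post` class, its two states and its error messages are assumptions made for the example.

```ruby
# Minimal hand-rolled state machine: attributes are only validated
# once the record moves to (or is in) the :published state.
class Post
  attr_accessor :title, :body
  attr_reader :state

  def initialize(title: nil, body: nil)
    @title = title
    @body = body
    @state = :draft
  end

  # Validation errors that would apply in the given state.
  def errors_for(target_state)
    return [] if target_state == :draft # drafts may be incomplete

    errors = []
    errors << "title can't be blank" if title.to_s.strip.empty?
    errors << "body can't be blank" if body.to_s.strip.empty?
    errors
  end

  def valid?
    errors_for(state).empty?
  end

  # Transition draft -> published, refused unless the post is complete.
  def publish
    return false unless state == :draft && errors_for(:published).empty?

    @state = :published
    true
  end
end

post = Post.new(title: "Hi")
first_attempt = post.publish   # body is missing, so this is refused
post.body = "Some text"
second_attempt = post.publish  # now complete, so the transition succeeds
puts [first_attempt, second_attempt, post.state].inspect
```

An action tied to the state change (sending a notification, for example) would go inside `publish` after the state is updated.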
Having a flag in the model has some disadvantages:
You cannot save as a draft unless the data is valid. Sure, you can skip validations in the Rails model, but think about the "NOT NULL" columns defined in the database.
To find the "real" records, you have to use a filter (like "WHERE draft = FALSE"). This can slow down query performance.
As an alternative, check out my gem drafting. It stores drafts for different models in a separate table.
In some of my forms I have to provide a dropdown where users can select some districts. The thing is, there will always be a fixed number of districts (31 in this case). Should I create a model named District having only a string field, populate it with the data and be done with it?
Its content will not change over time. Is there another way?
You should take a look at jnunemaker's scam gem. It mimics ActiveRecord for you and lets you define the models in your Rails app without a backing database table.
I use this whenever I want pseudo belongs-to/has-many relationships but do not want to back a model with a database, because the data changes rarely, if ever.
Making a table-backed model is the simplest way. Otherwise you're going to end up implementing half of an AR model anyway, because you'll want to use collection_select at some point.
I guess it depends on how you want to store the districts and whether you want to do any querying etc.
For example, you could just have the list of districts as a constant, then store them as a string in your models (not very elegant), or as you say you could create a model and use active record associations - this would allow you to easily query on districts etc.
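A minimal sketch of the constant approach in plain Ruby is below. The three district names are placeholders for the real list of 31, and the helper names `district_options` and `valid_district?` are assumptions made for the example.

```ruby
# Placeholder names; the real list would hold all 31 districts.
DISTRICTS = ["North", "South", "East"].freeze

# [label, value] pairs, the shape Rails' options_for_select accepts.
def district_options
  DISTRICTS.map { |name| [name, name] }
end

# Guards against values that are not in the fixed list.
def valid_district?(name)
  DISTRICTS.include?(name)
end

puts district_options.inspect
puts valid_district?("North")
puts valid_district?("Atlantis")
```

In a Rails model the same check could be expressed with an inclusion validation against `DISTRICTS`. The trade-off is the one described above: the district is stored as a plain string, so there are no associations to query through.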