I'm trying to use two DataTables, as below, to meet my requirement.
dtExcelData: This DataTable holds the data uploaded from an Excel file.
dtDbData: This DataTable holds the data from the database.
I need to validate dtExcelData before inserting it into the database. dtExcelData has 39 columns with the headings column1, column2, ... column39, and the number of rows can range up to 400 (or even a little more).
The validation is as follows:
column6 and column22 together are considered the primary key. If a record with the same pair of values is already in the database, I should NOT insert that record; I can simply ignore it. All other records should be inserted into the database.
After some analysis, I understood that the LINQ Except method can meet this requirement.
I've tried a number of approaches, but have been unable to arrive at a proper solution.
I am looking for some approach like below:
dtExcelData.Except(dtDbData)
Can someone suggest a better approach?
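For reference, a minimal sketch of how that composite-key filter could look in C#. `Except` with a custom `IEqualityComparer<DataRow>` would also work; the simpler route below collects the (column6, column22) pairs already in the database into a `HashSet` and filters the Excel rows against it. The table and column names come from the question; everything else, including the assumption that both columns are strings, is illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// AsEnumerable() comes from System.Data.DataSetExtensions.
// Collect the composite keys already present in the database table;
// this assumes column6 and column22 are string columns.
var existingKeys = new HashSet<(string, string)>(
    dtDbData.AsEnumerable()
            .Select(r => (r.Field<string>("column6"), r.Field<string>("column22"))));

// Keep only the Excel rows whose (column6, column22) pair is not already in the database.
var rowsToInsert = dtExcelData.AsEnumerable()
    .Where(r => !existingKeys.Contains(
        (r.Field<string>("column6"), r.Field<string>("column22"))))
    .ToList();

// rowsToInsert now holds only the records that are safe to insert.
```

With at most ~400 rows the O(1) set lookups keep this fast, and it avoids the pitfall that Except over DataRows compares row references unless you supply a comparer (DataRowComparer.Default exists, but it compares every column rather than just the two key columns).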
I am quite a newbie to Cube.js. I have been trying to integrate Cube.js analytics functionality with my Ruby on Rails app. The database is PostgreSQL. In the database, there is a column called answers_json, of jsonb data type, which contains a nested hash. An example of that column's data is:
**answers_json:**
"question_weights_calc"=>
{"314"=>{"329"=>1.5, "331"=>4.5, "332"=>1.5, "333"=>3.0},
"315"=>{"334"=>1.5, "335"=>4.5, "336"=>1.5, "337"=>3.0},
"316"=>{"338"=>1.5, "339"=>3.0}}
There are many more keys in the same column with the same hash structure as shown above. I posted only this specific part because it is the part I will be dealing with. I need assistance with accessing the values in the nested hash. In the example above, the keys "314", "315" and "316" are Category IDs. The keys associated with Category ID "314" are "329", "331", "332" and "333", which are Question IDs. Each category has multiple questions. For different records, the category and question IDs will be dynamic: for another record, the Category ID and the Question IDs associated with it will be different. I need to access the values associated with each Question ID key. For example, to access the value "1.5" I need to do this in my schema file:
**sql: `(answers_json -> 'question_weights_calc' -> '314' ->> '329')`**
But the issue here is that those IDs will be dynamic for different records in the database. Instead of "314" and "329", they can be some other numbers. Adding a different record's JSON here for clarification:
**answers_json:**
"question_weights_calc"=>{"129"=>{"273"=>6.0, "275"=>15.0, "277"=>8.0}, "252"=>{"279"=>3.0, "281"=>8.0, "283"=>3.0}}}
How can I discover and access those dynamic IDs and their values, given that I also need to perform mathematical operations on the values? Thanks!
As a general rule, it's difficult to run SQL-based reporting on highly dynamic JSON data. Postgres does have some useful functions for dealing with JSON, and you might be able to use json_each or json_object_keys plus a few joins to get there, but it's quite likely that the performance and maintainability of such a query would be difficult, to say the least 😅 Cube.js ultimately executes SQL queries, so if you do go the above route, the query should be easily transferable to a Cube.js schema.
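For what it's worth, a rough sketch of that SQL route, run from C# with Npgsql. The table name `answers` and the connection string are hypothetical; `jsonb_each` unnests the dynamic category level and `jsonb_each_text` the question level, so none of the IDs need to be known in advance:

```csharp
using System;
using Npgsql;

// Unnest both dynamic levels of answers_json into (category_id, question_id, weight) rows.
const string sql = @"
    SELECT cat.category_id,
           q.question_id,
           q.weight::numeric AS weight
    FROM answers,
         jsonb_each(answers_json -> 'question_weights_calc') AS cat(category_id, category_questions),
         jsonb_each_text(category_questions) AS q(question_id, weight);";

using var conn = new NpgsqlConnection("Host=localhost;Database=app;Username=app");  // hypothetical
conn.Open();
using var cmd = new NpgsqlCommand(sql, conn);
using var reader = cmd.ExecuteReader();
while (reader.Read())
    Console.WriteLine($"{reader.GetString(0)}/{reader.GetString(1)}: {reader.GetDecimal(2)}");
```

Since the query produces plain columns, aggregates like SUM(weight) or GROUP BY category_id can be layered on top, which is what a Cube.js measure would ultimately generate.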
Another approach would be to create a separate data processing pipeline that collects all the JSON data and flattens it into a single table. The pipeline should then store this data back in your database of choice, from where you could then use Cube.js to query it.
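A minimal sketch of that flattening step in C#, assuming the column's value is read out of Postgres as a plain JSON string (jsonb is ordinary JSON on the wire; the `=>` rendering above is just how the Rails console prints hashes). The record type and method name are made up for illustration:

```csharp
using System.Collections.Generic;
using System.Text.Json;

// One flat row per (category, question) pair, ready to be bulk-inserted into a
// reporting table that Cube.js can then query directly.
public record QuestionWeight(string CategoryId, string QuestionId, double Weight);

public static class AnswersFlattener
{
    public static List<QuestionWeight> Flatten(string answersJson)
    {
        var rows = new List<QuestionWeight>();
        using var doc = JsonDocument.Parse(answersJson);
        // Walk the two dynamic levels: category id -> question id -> weight.
        foreach (var category in doc.RootElement.GetProperty("question_weights_calc").EnumerateObject())
            foreach (var question in category.Value.EnumerateObject())
                rows.Add(new QuestionWeight(category.Name, question.Name, question.Value.GetDouble()));
        return rows;
    }
}
```

Because every category and question ID becomes row data rather than a JSON key, the Cube.js schema only ever has to know three fixed columns.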
I wonder if it's possible to create logic that automatically creates a denormalized table and its data (and maintains it) from a specific SQL-like query.
Given a system where the user can maintain his data model and data: all data are stored in "relational" tables, but those tables are only used for the user to maintain his data. If he wants to display data on a web page, he has to write a query (SQL), which will automatically be turned into a denormalized table that is also kept up to date when the relational data is updated or deleted.
Let's say I got a query like this:
select t1.a, t1.b from t1 where t1.c = 1
The logic will automatically create a denormalized table containing a copy of the data needed by the query. It's mostly like a view (I wonder if views would be more performant than my approach). Whenever this query (give it a name) is needed by some business logic, it will be replaced by a simple query on that new table.
Any update to t1 will find all queries in which t1 is involved and update the denormalized data automatically, but as a performance win it will only update the rows affected (in this example, just one row). That's the point where I'm not sure whether it's achievable in an automatic way. The example query is simple, but what about queries with joins, aggregation or even subqueries?
Does an approach like this exist in the NoSQL world, and can somebody share their experience with it?
I would also like to know whether creating one table per query conflicts with any best practices when using NoSQL databases.
I have an idea of how to solve simple queries: when data is updated, find the involved entity by its primary key and run the query on that specific entity again (so that joins are updated, too). But with aggregation and subqueries I don't really know how to determine which entity of the denormalized table is affected.
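For the simple single-table case, the row-level refresh described above could look roughly like the sketch below (C# with ADO.NET against SQL Server). It assumes t1 has an `id` primary key and that the denormalized copy of the example query lives in a table called `denorm_q1`; both names are hypothetical, and this deliberately sidesteps the hard cases (joins, aggregation, subqueries) raised above:

```csharp
using Microsoft.Data.SqlClient;

// After a single row of t1 changes, refresh only the matching row in the
// denormalized copy of: select t1.a, t1.b from t1 where t1.c = 1
static void RefreshDenormalizedRow(SqlConnection conn, int t1Id)
{
    const string sql = @"
        MERGE denorm_q1 AS d
        USING (SELECT id, a, b FROM t1 WHERE c = 1 AND id = @id) AS s
            ON d.id = s.id
        WHEN MATCHED THEN UPDATE SET d.a = s.a, d.b = s.b
        WHEN NOT MATCHED BY TARGET THEN INSERT (id, a, b) VALUES (s.id, s.a, s.b);

        -- If the row no longer satisfies c = 1, drop it from the copy.
        DELETE FROM denorm_q1
        WHERE id = @id
          AND NOT EXISTS (SELECT 1 FROM t1 WHERE id = @id AND c = 1);";

    using var cmd = new SqlCommand(sql, conn);
    cmd.Parameters.AddWithValue("@id", t1Id);
    cmd.ExecuteNonQuery();
}
```

This is essentially the bookkeeping that an indexed (materialized) view does for you automatically, which is why it's worth benchmarking views first.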
I've been facing this problem for the past few days. I am attempting to create a table view that is populated from a database query (seems simple enough). As I will be managing multiple tables, I have created a database helper class that fetches the data using SQL queries. But it does not work consistently (or, of late, at all).
When I attempt to query a table using one of the defined functions, the DB returns cursors with XX records but null column data. In effect, I get multiple rows (I can see the row separators), but each row is blank.
Any suggestion or help is highly appreciated.
How can we load a fact table in a star schema using Informatica PowerCenter? Can you please provide an example of the mappings/transformations for this?
To load the fact table in a star schema, where the dimension tables are independent, create a lookup on each dimension involved. Override the lookup query to return only active records, match on the natural key (the primary key in the dimension), and from the matching row take the surrogate key (the key we generate artificially when loading the dimension table), along with whichever fields you want to load into the fact table.
In short: take the staging tables as source tables, take the dimensions as lookups, and then load the data into the fact table.
e.g. http://www.folkstalk.com/2012/11/how-to-load-rows-into-fact-table-in.html
I was not able to find an example when I was learning, so I am adding this screenshot as a reference for new learners.
The mapping basically looks up each of the dimension tables and loads the dimension keys into the fact as foreign keys; the rest of the active records come from the Source Qualifier (SQ). I used a SQL override to perform all the joins and conditions required for loading the fact records; the equivalent logic is sketched below.
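For readers without PowerCenter in front of them, here is (roughly) what that lookup-and-join logic amounts to in plain SQL, wrapped in an ADO.NET call; every table and column name here is hypothetical:

```csharp
using Microsoft.Data.SqlClient;

// Join the staging rows to each dimension on its natural key (active rows only)
// to pick up the surrogate keys, then insert the result into the fact table.
const string loadFactSql = @"
    INSERT INTO fact_sales (date_key, product_key, customer_key, amount)
    SELECT d.date_key, p.product_key, c.customer_key, s.amount
    FROM stg_sales s
    JOIN dim_date     d ON d.full_date     = s.sale_date
    JOIN dim_product  p ON p.product_code  = s.product_code  AND p.active_flag = 'Y'
    JOIN dim_customer c ON c.customer_code = s.customer_code AND c.active_flag = 'Y';";

using var conn = new SqlConnection("...");  // connection string elided
conn.Open();
using var cmd = new SqlCommand(loadFactSql, conn);
cmd.ExecuteNonQuery();
```

In the mapping itself, each JOIN corresponds to a Lookup transformation with an overridden query, and the SELECT list corresponds to the ports flowing into the fact target.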
I am developing an ASP.NET MVC4 web application. It uses the Entity Framework for data access. Many of the pages contain grids, which need to support paging, sorting, filtering and grouping. For performance, the grid filtering, sorting, paging, etc. need to happen in the database (i.e. the Entity Framework needs to generate a suitable SQL query). One complication is that the view model representing the grid rows is built by combining data from multiple business entities (tables). This could be as simple as getting the data from an entity a couple of levels down, or the value could be calculated from related business entities. What approach is recommended to handle this scenario? Does anyone know of a good example on the web? Most have a simple mapping between the view model and the business domain model.
Update 28/11 - To further clarify: the initial display of the grid and its paging perform well. (See comment below.) The problem is how to handle sorting/ordering (and filtering) when the column the user clicked on does not map directly to a column on the underlying business table. I am looking for a general solution, as the system will have approximately 100 grids with a number of columns each, and handling this on a per-column basis would not be maintainable.
If you want to order by a calculated field that isn't precalculated in the database, or run any database operation against it, then you are going to have to precalculate the value and store it in the database. I don't know any way around that.
The only other solution is to move the paging, sorting, etc. to the web server; I am sure you don't really want to do that, as you would have to calculate ALL the values just to find what order they go in.
So if you want to achieve this, I think you will have to do the following (I would love to hear alternative solutions, though):
Database Level Changes:
Add a nullable column for each calculated field in your view model.
Write a SQL script that calculates these values.
Set the column to NOT NULL if necessary.
App Level Changes:
In your Add and Edit pages you will have to calculate these values and commit them with the rest of the data.
You can now query against these at the database level and use IQueryable as you wanted; a sketch of a generic ordering helper follows.
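As a sketch of why those precalculated columns make the general problem tractable: once every sortable value is a real column on the entity, one helper can turn the clicked column name into an OrderBy expression that the Entity Framework pushes down to SQL, so none of the ~100 grids needs per-column code. The names below (OrderByColumn, Orders, TotalWithTax) are illustrative:

```csharp
using System.Linq;
using System.Linq.Expressions;

public static class GridQueryExtensions
{
    // Builds source.OrderBy(x => x.<propertyName>) (or OrderByDescending) at
    // runtime; because it stays an expression tree, EF translates it to SQL.
    public static IQueryable<T> OrderByColumn<T>(
        this IQueryable<T> source, string propertyName, bool descending = false)
    {
        var parameter = Expression.Parameter(typeof(T), "x");
        var property = Expression.Property(parameter, propertyName);
        var keySelector = Expression.Lambda(property, parameter);

        var call = Expression.Call(
            typeof(Queryable),
            descending ? "OrderByDescending" : "OrderBy",
            new[] { typeof(T), property.Type },
            source.Expression,
            Expression.Quote(keySelector));

        return source.Provider.CreateQuery<T>(call);
    }
}

// Usage (hypothetical DbSet and precalculated column):
// var page = context.Orders
//     .OrderByColumn("TotalWithTax", descending: true)
//     .Skip(pageIndex * pageSize)
//     .Take(pageSize)
//     .ToList();
```

Filtering can be handled the same way, with a dynamically built Where predicate over the same columns.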