I have a database of economists in Microsoft Access, and I need to transfer it to Neo4j.
Keep in mind that all these economists have also been teachers.
So, I have a table - Economist - where I store all the economists' data, with an ID "codecon" for each economist. Then I have a table - University - where I store information about universities, with an ID "coduni" for each university.
Lastly, I have a table - Subject - with subject information, with an ID "codsubj" for each subject.
Now, in Access I have another table - Teaching - where I use the previous IDs to say that "Economist codecon teaches Subject codsubj at University coduni".
How can I create this kind of link in Neo4j, where a relationship can only connect TWO nodes?
Any help would be great. Thanks.
You will want to think about what your graph data model is going to look like. After that, you might want to use a tool to model it. Finally, you'll want to load the data in. You can use LOAD CSV if you've got the data exported out of Access as CSV files; APOC also has an XLS importer.
I think you'll find the following starter guide on going from relational to graph helpful. I would also suggest you start by exporting your tables to CSV.
For data modelling, I find arrows helpful.
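Since a Neo4j relationship always connects exactly two nodes, the usual way to model a three-way fact like your Teaching table is to reify it: each Teaching row becomes its own node, linked to the Economist, Subject and University it refers to. Here is a minimal Cypher sketch, assuming the tables were exported to CSV files whose headers match the Access column names (the file name and the relationship types TAUGHT, OF_SUBJECT and AT are illustrative placeholders):

// Load the Teaching rows, assuming the Economist, Subject and
// University nodes were already created from their own CSV exports
LOAD CSV WITH HEADERS FROM 'file:///teaching.csv' AS row
MATCH (e:Economist {codecon: row.codecon})
MATCH (s:Subject {codsubj: row.codsubj})
MATCH (u:University {coduni: row.coduni})
// Reify the three-way link as a Teaching node
CREATE (t:Teaching)
CREATE (e)-[:TAUGHT]->(t)
CREATE (t)-[:OF_SUBJECT]->(s)
CREATE (t)-[:AT]->(u);

An alternative is a direct (Economist)-[:TEACHES {coduni: ...}]->(Subject) relationship with the university stored as a relationship property, but reifying keeps University as a node you can traverse to.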
Related
I'm working on a data storage model for a clickstream analytics system. User action data comes from a third-party system as a set of large JSON files. We are going to have an ETL process that reads the JSON files as a source and saves the data into our store for future analysis and reporting.
Depending on some business rules of the source system, each event can have an is_success field set to true or false. Non-successful user actions have a JSON field containing an array of nested objects with diagnostic data about the failures.
The draft data model for the storage system is the following:
I have concerns about the relation between fact_events and dim_failure_details on the diagram above. To me, dim_failure_details does not look like a dimension because it has a many-to-one relationship to the fact table.
I've read a design tip from the Kimball Group. That article recommends using a bridge table in a similar situation. But I don't understand how to apply that solution in my case because each event can have different and unpredictable values for attribute_key and attribute_value even when failure_type is the same for multiple events.
I also saw a few similar questions (e.g., Star schema [fact 1:n dimension]...how?), but I still don't know how the relationship should be organized correctly. Any help will be much appreciated.
I'm a new user to Qlik and scripting, and an overall beginner. I am looking for any help or recommendations for dealing with my tables below. I'm just trying to create a good model to link my tables.
Created a sample here
The original 3 tables are separate QVD files.
Transactions table has multiple columns and the main ones are TxnID, SourcePartyTypeID, DestPartyTypeID, SourcePartyType, DestinationPartyType, ConductorID.
Customers Table - CustName, CustID etc.
Accounts Table - AcctID, AcctNum, PrimaryAcctID etc.
A transaction can relate to multiple CustIDs/AcctIDs, which are linked by the Dest/SourcePartyIDs. The transaction also has source/destination party type fields, where A = Accounts, C = Customers, plus some NULLs.
I have read a lot on data models, and a link table for a star schema (or a join) is recommended, but I am unsure how to code this because the links also depend on the Source/DestinationType fields in the Transactions table, where A = Accounts and C = Customers. I have tried to code it but without success.
I'm unsure how to join based on SourceType/DestinationType = Accounts or Customers. A link table, or ApplyMap() with a WHERE clause? Any suggestions?
Hopefully your introduction to Qlik is still a positive one! There are a lot of resources to help you develop your Qlik scripting capabilities including:
Qlik Continuous Classroom (https://learning.qlik.com)
Qlik Community (https://community.qlik.com)
Qlik Product Documentation (https://help.qlik.com)
In terms of your sample data question: if you are creating a Qlik Sense app, you can use the Qlik Data Manager to link your data.
This is excellent because not only will it try to analyse your data and make useful suggestions for linking fields, it will also build the script, which you can then review and use as a basis for developing your own understanding further.
Looking at your sample data, one option might be a simple key field between a couple of the tables. Here is one example of how this could work.
Rod
[Transactions]:
Load
// User generated fields
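// Composite key for the link to the Accounts table below; rows only
// link where (DestPartyID, SoucePartyID) matches an account's
// (AcctID, PrimaryAcctID)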
AutoNumberHash256 ( [DestPartyID], [SoucePartyID] ) As _keyAccount,
// Fields in source data
[TxnID],
[TxnNum],
[ConductorID],
[SourcePartyType],
[SoucePartyID] As [CustID],
[DestPartyType],
[DestPartyID],
[etc...]
From [lib://AttachedFiles/TablesExamples.xlsx]
(ooxml, embedded labels, table is Transactions);
[Customers]:
Load
// User generated fields
// Fields in source data
[CustID],
[CustFirstName],
[CustLastName]
From [lib://AttachedFiles/TablesExamples.xlsx]
(ooxml, embedded labels, table is Customers);
[Accounts]:
Load
// User generated fields
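// Counterpart of the key built in Transactions: same function and
// field order, so matching value pairs hash to the same key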
AutoNumberHash256 ( [AcctID], [PrimaryAcctID] ) As _keyAccount,
// Fields in source data
[AcctID],
[AcctNum],
[PrimaryAcctID],
[AcctName]
From [lib://AttachedFiles/TablesExamples.xlsx]
(ooxml, embedded labels, table is Accounts);
Here is my scenario. I have a pre-defined data type structure for books. Just take it as an example for the sake of simplicity. The structure looks like the image below. It's a Labeled Property Graph and the information is self-explanatory. This data type structure is fixed; I cannot change it, I just use it.
When there is one book in the system, let's call it Harry Potter, it might look like the image below:
So the book has its own properties (ID, Name, ...) and is also connected to a MandatoryData structure. By looking at this graph, we can see all the information about the book.
The problem happens when I have 2 books in the system, which looks like this:
In this case, there is another book called Graph DB, with its information highlighted.
The problem with this design is that we don't know which information belongs to which book. For example, we cannot distinguish the publishedYear values anymore.
My question is: how do I solve or avoid this problem? Should I create one MandatoryData per book? Could you propose a design?
I'm using Neo4j and Cypher. Thank you for your help!
UPDATE
From the comments (by @AnhTriet):
Thanks for your suggestion. But I want to have some sort of connection between those books. If we create new MandatoryData, those books will be completely separated. (...) I meant, 2 books should point to some same nodes if they have the same author or published year, right?
After some clarification in the comments, I suggest the creation of a MandatoryData node for each property in the database. Then you will connect a given book to various MandatoryData nodes, depending on the number of properties of the book.
This way two books with the same author will be connected to the same MandatoryData node.
Since you cannot change the data model, I strongly recommend creating a new MandatoryData node for each new book added to the system.
This way you will be able to get information about a specific book with queries like:
// Get the author's name of the book with ID = 1
MATCH (:Book {ID : 1})-->(:MandatoryData)-->(:Author)-->(:Name)-->(v:Value)
RETURN v.value
The model proposed in your question is not viable, since it has no way to identify the owner of a specific property, as you pointed out.
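As a minimal Cypher sketch of the per-book recommendation (the HAS_* relationship types and the author value are my own placeholders, since the fixed model only shows the labels):

// Give each newly added book its own MandatoryData chain,
// mirroring the path used by the query above
MATCH (b:Book {ID: 2})
CREATE (b)-[:HAS_DATA]->(md:MandatoryData)
CREATE (md)-[:HAS_AUTHOR]->(a:Author)
CREATE (a)-[:HAS_NAME]->(n:Name)
CREATE (n)-[:HAS_VALUE]->(:Value {value: 'Some Author'})

If two books should share nodes when they have the same author, as discussed in the comments, you can MERGE the Value node on its value property instead of CREATE-ing it, so an existing node is reused.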
I am creating an application which requires the user to register. All data entered by the user will be stored in a table called "customer". Part of the information being collected is the address, but I don't want to congest the table structure and would like to store the address as an object (city, address, post code, etc.).
What's the best practice: create an address table and reference it through a foreign key in the customer table, or store the customer's address as an object inside the customer table?
I am not sure how Parse fully functions, so I'm looking to your experience for the answer.
Thanks
I faced this exact problem a few months ago, and solved it by having a pointer in the customer object structure to the additional data. Note that if you do this, you'll need to make sure to include the pointed-to field in future customer queries, or the data won't be fetched.
Retrospectively, I'm not sure I'd recommend splitting the objects up. It does create a more normalised data structure, but Parse fights against this in several ways:
You have to remember to include the pointed-to field in all future queries. This is a pain.
You can only follow pointers up to a certain depth within a query (I think 3?)
Parse charges you by the database access, so the extra queries that come with normalised data can be an issue.
Parse doesn't really support atomic operations or transactional queries, so it's easy to get your data into an inconsistent state if you're not careful about when you save. For example, you update your customer record, go to change the address record, and have the second query fail. Now you're in a "half updated state", and without transaction rollback, you'll have to fix it yourself (and you might not even know it's broken!).
Overall, were I to use Parse again (unlikely), I'd probably stick with giant denormalised objects.
Here is a solution for handling the two tables with the help of a userId.
Note: you are creating a REGISTRATION table and filling in some of its data from your code, so you can create another table for the address. When you create the ADDRESS table, the question arises of how to manage the two tables. It's simple: you have the same user ID in both the REGISTRATION and ADDRESS tables, so with the help of that unique "userid" you can fetch the details from both tables and merge them as your requirements dictate.
Hope this resolves your problem.
We have Accounts, Deals, Contacts, Tasks and some other objects in the database. When a new organisation signs up, we want to set up some of these objects as "Demo Data", which they can view, edit and delete as they wish.
We also want to give the user the option to delete all demo data so we need to be able to quickly identify it.
Here are two possible ways of doing this:
Have a "IsDemoData" field on all the above objects : This would mean that the field would need to be added if new types of demo data become required. Also, it would increase database size as IsDemoData would be redundant for any record that is not demo data.
Have a DemoDataLookup table with TableName and ID. The ID here would not be a strong foreign key, but a theoretical foreign key to a record in the table named by TableName.
Which of these is better, and is there a better normalised solution?
As a DBA, I think I'd rather see demo data isolated in a schema named "demo".
This is simple with some SQL database management systems, not so simple with others. In PostgreSQL, for example, you can write all your SQL with unqualified names, and put the "demo" schema first in the schema search path. When your clients no longer want the demo data, just drop the demo schema.
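For example, a minimal PostgreSQL sketch of that approach (the schema and table names are illustrative):

-- Keep all demo rows in their own schema
CREATE SCHEMA demo;
CREATE TABLE demo.accounts (LIKE public.accounts INCLUDING ALL);
-- Unqualified names now resolve against demo first
SET search_path TO demo, public;
-- When the client no longer wants the demo data
DROP SCHEMA demo CASCADE;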