I have two files with the same ID variable, so I want to match them with the MATCH FILES command. I want to keep all the variables from the first file and just some from the other one. The thing is, I don't want to type every variable from the first file, but the KEEP ALL subcommand doesn't seem to work. Here are my syntax and the error message:
GET FILE='C:\Users\Mike\Desktop\Households.sav'.
SORT CASES BY ID (A).
GET FILE='C:\Users\Mike\Desktop\Adults.sav'.
SORT CASES BY ID (A).
MATCH FILES
/FILE=*
/KEEP ALL
/FILE='C:\Users\Mike\Desktop\Households.sav'
/BY ID
/KEEP PV1 PV2 PV3 PV4.
EXECUTE.
SAVE OUTFILE
'C:\Users\Mike\Desktop\matchHouseholdsAdults.sav'.
Subcommands are out of order. All the FILE, TABLE, RENAME and IN subcommands must precede all other kinds of subcommands. Syntax checking begins with the next slash.
Thanks, fellows.
From the CSR:
DROP and KEEP must follow all FILE, TABLE, and RENAME subcommands.
You can use /DROP after the second FILE subcommand to get rid of unwanted variables in the second file. If there are duplicate names, the content of the first FILE takes priority.
I am trying to load nodes and their relationships from CSV files using the neo4j bulk importer. My script looks like this:
neo4j-admin import \
--id-type=string \
--nodes:AGENT="nodes_AGENT_C_20190610.csv" \
--nodes:CUSTOMER="nodes_CUSTOMER_C_20190610.csv" \
--relationships:CASHOUT="relcashoutTest-header.csv,relcashoutTest.csv"
and the header of my relationship CSV file looks like this:
:TYPE,:START_ID(CUSTOMER),:END_ID(AGENT),TXNID:string,TIMESTAMP:datetime,AMOUNT:int,CHANNEL
Here TYPE indicates the column named RELATIONSHIP
and my relationship data file looks like this:
CASHOUT,abc,xyz,6C19MX7DXL,2019-03-01T11:02:55,40,charge
CASHOUT,pqr,jkl,6C19MX7E2V,2019-03-01T11:02:57,10,charge
After running my import.sh script I get the error below:
unexpected error: Group 'CUSTOMER' not found. Available groups are: []
I have gone through the documentation but couldn't figure out my mistake. Any help will be appreciated. The neo4j version is 3.5.8.
The :START_ID and :END_ID fields can take an optional ID space, as in :START_ID(CUSTOMER).
But an ID space is not the same thing as a node label. In order for :START_ID(CUSTOMER) to work, one of your node CSV files (presumably the one for the CUSTOMER label) must specify, in its header, :ID(CUSTOMER) instead of just :ID. Doing so would associate the CUSTOMER ID space with the nodes created by that file, and you should no longer see that specific error.
You may also need to do something similar for the AGENT ID space.
NOTE: If all your nodes have unique values in the :ID field (across CSV files), then you do not need to use ID spaces at all. In that case, your relationship file header can simply use :START_ID and :END_ID without any qualification.
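To make the ID-space rule concrete, here is a rough Python sketch that checks whether every ID space used in a relationship header is declared by some node header. The node headers in the usage note (`:ID,name` and so on) are made up for illustration, since the question does not show them, and the helper names are my own. This is not how neo4j-admin itself does the check.

```python
import re

# Matches header fields such as ":ID", ":START_ID(CUSTOMER)" or a named ID
# column like "customerId:ID(CUSTOMER)". Group 2 is the optional ID space.
_ID_FIELD = re.compile(r"^[^:]*:(ID|START_ID|END_ID)(?:\(([^)]*)\))?$")


def id_spaces(header_line, kinds):
    """Return the set of ID spaces named by header fields of the given kinds."""
    spaces = set()
    for field in header_line.strip().split(","):
        match = _ID_FIELD.match(field.strip())
        if match and match.group(1) in kinds and match.group(2):
            spaces.add(match.group(2))
    return spaces


def missing_id_spaces(node_headers, rel_header):
    """Return ID spaces the relationship header uses but no node header declares."""
    declared = set()
    for header in node_headers:
        declared |= id_spaces(header, {"ID"})
    used = id_spaces(rel_header, {"START_ID", "END_ID"})
    return used - declared
```

With the relationship header from the question and node headers that use a plain `:ID`, `missing_id_spaces` reports both CUSTOMER and AGENT as undeclared. Once the node headers use `:ID(CUSTOMER)` and `:ID(AGENT)`, it returns an empty set.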
Could you please advise how to build an "if statement" in SPSS Modeler when we have two data sources?
One data source (1) is a table (an output node generated by SPSS Modeler) that lists all the IDs we need to work with further.
Another data source (2) is an Excel file that also lists IDs. It includes some of the IDs from (1) plus some additional ones. Each of these IDs has a value assigned that needs to be added to data source (1), though not necessarily to the table itself.
So if the ID from (1) is in (2) we would like to assign a value from (2) to the ID in (1) and have it stored in some table or even better in a file.
Thank you very much for your help / advice.
Patricia
Based on your problem it sounds like you want to merge these datasets. This can easily be done in Modeler with the Merge node. Just make sure the key variables have the same name, or Modeler won't recognize them as a key. You can see an example here
You can also create a flag variable using the Derive node, see example here
You will have to use the Merge node to combine the two datasets, but the key IDs don't have to share the same name. With the Condition option in the Merge node, the keys need neither the same name nor even the same variable type.
Syntax example for the Merge node, Condition option: 'ID' = 'id'
I am trying to dump some data to Neo4j. Some of my node names (in the format chosen for dumping) contain numbers, which have to be exported as node names.
I encounter the following error when the node name or label starts with a number.
Neo.ClientError.Statement.InvalidSyntax
MERGE (1:User {name: "u1"})
Is this because neo4j internally has a unique ID? How do we circumvent this problem?
I believe these are just the syntax rules Neo4j uses. Also keep in mind that the thing you are referring to as the node name (1, in your example) is actually a variable name, and only persists for the duration of the query (or until it leaves scope if not carried over in a WITH clause to the next part of the query).
From the developer documentation:
Variable names are case sensitive, and can contain underscores and
alphanumeric characters (a-z, 0-9), but must always start with a
letter...The same rules apply to property names.
While I didn't see anything about label names, they appear to follow the same syntax rules.
Property values, of course, can be anything you want.
You described the limitation as a "problem", so I'm guessing there's a perceived issue with this in your import, likely around the confusion between variables and what you called node names. If that's so, then please add some more details to your description, and I can add on to my answer accordingly.
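If the export script builds the Cypher text itself, one possible workaround is to backtick-quote any variable name that doesn't start with a letter, since Cypher accepts backtick-quoted identifiers. Below is a minimal Python sketch of that idea. The helper names `cypher_name` and `merge_user` are hypothetical, and the property value is passed as a `$name` parameter rather than pasted into the query.

```python
import re

# Plain identifiers: start with a letter, then letters, digits or underscores.
_PLAIN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def cypher_name(name):
    """Return name unchanged if it is a plain identifier, else backtick-quoted."""
    if _PLAIN.match(name):
        return name
    if "`" in name:
        # Kept out of scope for this sketch instead of guessing at escaping.
        raise ValueError("backticks in names are not handled here")
    return "`" + name + "`"


def merge_user(variable, user_name):
    """Build a MERGE statement plus its parameter map for one User node."""
    query = "MERGE (%s:User {name: $name})" % cypher_name(variable)
    return query, {"name": user_name}
```

For the statement in the question, `merge_user("1", "u1")` produces a query with the variable written as `` `1` ``. Because the variable only lives for the duration of the query, a letter-prefixed variable such as `n1` with the number stored as a property value would likely do just as well.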
I have a large COBOL source file that reads one file, checks for IDs in a specific record, processes it, then repeats until there are no more files. A few records contain IDs that I do not want to process. I'd like to read the file and read the record, and if the record variable equals a certain string, move on to the next file and do nothing with this one. Any suggestions?
Merging files in spss
Hi,
I have a problem merging files. Here's what I need to do: I have chosen 200 cases out of 7000 in ArcMap (a GIS program). In the process I lost some of the cases' variable information.
Now I would like to get the variables back into my smaller dataset. I used Data > Merge Files > Add Variables with ID as the match, and selected Match cases on key variables in sorted files > Both files provide cases.
This gave a dataset of all 7000 cases, and the variables that already existed in the first table weren't added to the merged dataset. I also tried all the other choices, but none of them gave me the result I wanted, which is the 200 cases with the variables that were lost in the process added back.
So in a nutshell, how do I merge/replace the info from the variables in dataset A into the variables in dataset B without the extra cases from A (only the info for the selected 200 cases out of 7000)?
Offhand:
Create a new variable in the reduced dataset with the value 1.
Match the files.
Sort by the new variable.
Delete all cases that don't have the value 1 on this variable.
I don't see why you are choosing both files provide cases. You want to use the 7000-case file as a keyed table using ID as the key and match it with the 200-case file, which provides all the cases. Assuming that you select all the variables from the large file that you want, this should give you the desired result.
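To show what the keyed-table match does to the case count, here is a small Python sketch of the same logic outside SPSS. The function name and the variable names in the usage note are invented for illustration. The rule that values already in the case file win on duplicate names is an assumption that mirrors the first-file priority described for MATCH FILES.

```python
def match_with_table(cases, table, key="ID"):
    """Add variables from `table` to each row of `cases`, matched on `key`.

    Every row of `cases` is kept and no rows are added, which mirrors using
    the larger file as a keyed table rather than as a second source of cases.
    """
    lookup = {row[key]: row for row in table}
    merged = []
    for case in cases:
        row = dict(case)
        for name, value in lookup.get(case[key], {}).items():
            # Assumption: a variable already present in the case file wins.
            row.setdefault(name, value)
        merged.append(row)
    return merged
```

Calling it with a 200-row case list and a 7000-row table gives back exactly 200 rows, each extended with whatever extra variables the table holds for that ID. Cases with no match in the table come through unchanged.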