I'm creating a mapping from the X12 5010 837 format to XML using MapForce 2013 and the EDI config they provide for this format. My XML schema is hierarchical, e.g.:
Provider > Patient > Claim > Diagnoses, Procedures, Revenues, Payers
The mapping is configured to:
create a new Provider every time a hierarchical loop with a level code of 20 is found, and
create a new Patient every time a hierarchical loop with a level code of 22 is found.
I end up with the correct number of providers with this configuration. However, I end up with nothing inside of a the providers. When I remove the connection mentioned above (#1), I then get a single provider with all of the patients inside of it.
The file I'm testing with has a bunch of providers, each with a single patient.
So, it seems that by connecting the hierarchical provider loop to my Provider element, it is skipping all child loops/elements.
Related
CALL apoc.import.csv(
[{fileName: 'file:/persons.csv', labels: ['Person']}],
[{fileName: 'file:/knows.csv', type: 'KNOWS'}],
{delimiter: '|', arrayDelimiter: ',', stringIds: false}
)
For this example, internally, does the 'import' use merge or create to add nodes, relationships and properties? I tested, it seems it uses 'create' to add new rows even for a new ID record. Is there a way to control this? When to use apoc.load VS apoc.import? It seems apoc.load is a lot more flexible, where users can choose to use cypher commands specifically for purposes. Right?
From the source of CsvEntityLoader (which seems to be doing the work under the covers), nodes are blindly created rather than being merged.
While there's an ignoreDuplicateNodes configuration property you can set, it just ignores IDs duplicated within the incoming CSV (i.e. it's not de-duplicating the incoming records against your existing graph). You could protect yourself from creating duplicate nodes by creating an appropriate unique constraint on any uniquely-identifying properties, which would at least prevent you accidentally running the same import twice.
Personally I'd only use apoc.import.csv to do a one-off bulk load of data into a fresh graph (or to load a dump from another graph that was exported as a CSV by something like apoc.export.csv.*). And even then, you've got the batch import tool that'll do that job with higher performance for large datasets.
I tend to use either the built-in LOAD CSV command or apoc.load.csv for most things, as you can control exactly what you do with each record coming in from the file (such as performing a MERGE rather than a CREATE).
As indicated by by #Pablissimo's answer, the ignoreDuplicateNodes config option (when explicitly set to true) does not actually check for duplicates in the DB - it just checks within the file. A request to address this hole was brought up before, but nothing has been done yet to address it. So, if this is a concern for your use case, then you should not use apoc.import.csv.
The rest of this answer applies iff your files never specify nodes that already exist in your DB.
If your node CSV file follows the neo4j-admin import command's import file header format and has a header that specifies the :ID field for the column containing the node's unique ID, then the apoc.import.csv procedure should, by default, fail when it encounters duplicate node IDs (within the same file). That is because the procedure's ignoreDuplicateNodes config value defaults to false (you can specify true to skip duplicate IDs instead of failing).
However, since your node imports are not failing but are generating duplicate nodes, that implies your node CSV file does not specify the :ID field as appropriate. To fix this, you need to add the :ID field and call the procedure with the config option ignoreDuplicateNodes:true. Or, you can modify those CSV files somehow to remove duplicate rows.
I have a data model that starts with a single record, this has a custom "recordId" that's a uuid, then it relates out to other nodes and they then in turn relate to each other. That starting node is what defines the data that "belongs" together, as in if we had separate databases inside neo4j. I need to export this data, into a backup data-set that can be re-imported into either the same or a new database with ease
After some help, I'm using APOC to do the export:
call apoc.export.cypher.query("MATCH (start:installations)
WHERE start.recordId = \"XXXXXXXX-XXX-XXX-XXXX-XXXXXXXXXXXXX\"
CALL apoc.path.subgraphAll(start, {}) YIELD nodes, relationships
RETURN nodes, relationships", "/var/lib/neo4j/data/test_export.cypher", {})
There are then 2 problems I'm having:
Problem 1 is the data that's exported has internal neo4j identifiers to generate the relationships. This is bad if we need to import into a new database and the UNIQUE IMPORT ID values already exist. I need to have this data generated with my own custom recordIds as the point of reference.
Problem 2 is that the import doesn't even work.
call apoc.cypher.runFile("/var/lib/neo4j/data/test_export.cypher") yield row, result
returns:
Failed to invoke procedure apoc.cypher.runFile: Caused by: java.lang.RuntimeException: Error accessing file /var/lib/neo4j/data/test_export.cypher
I'm hoping someone can help me figure out what may be going on, but I'm not sure what additional info is helpful. No one in the Neo4j slack channel has been able to help find a solution.
Thanks.
problem1:
The exported file does not contain any internal neo4j ids. It is not safe to use neo4j ids out of the database, since they are not globally unique. So you should not use them to transfer data from one database to another.
If you are about to use globally uniqe ids, you can use an external plugin like GraphAware UUID plugin. (disclaimer: I work for GraphAware)
problem2:
If you cannot access the file, then possible reasons:
apoc.import.file.enabled=true is not set in neo4j.conf
os level
permission is not set
I want to copy the list of users specified as Component Watchers (with the plugin of the same name) into the list of Watchers, at issue creation time. I'm trying to do this with a Script-Runner post-function, after creating a custom field of type Component Watchers.
The part that I'm missing is how to obtain the Component Watchers usernames as a list. Any idea?
Try the Behaviours plugin. There you can specify "behaviour"-s, that can be mapped to projects and issue types, and there you can specify Groovy Scripts to run during transitions.
It won't be easy, but it is versatile and does not eat too much resources.
I have an XML file in my app resources folder. I am trying to update that file with new dictionaries dynamically. In other words I am trying to edit an existing XML file to add new keys and values to it.
First of all can we edit a static XML file and add new dictionary with keys and values to it. What is the best way to do this.
In general, you can read an XML file into a document object (choose your language), use methods to modify it (add your new dictionary), and (re-)write it back out to either the original XML file, or a new one.
That's straightforward ... just roll up the ol' sleeves and code it up.
The real problem comes in with formatting in the XML file before and after said additions.
If you are going to 'unix diff' the XML file before and after, then order is important. Some standard XML processors do better with order than others.
If the order changes behind the scenes, and is gratuitously propagated into your output file, you lose standard diffing advantages, such as some gui differs, and some scm diffs (svn, cvs, etc.).
For example, browse to:
Order of XML attributes after DOM processing
They discuss that DOM loses order where SAX does not.
You can also write a custom XML 'diff'er (there may be such off-the-shelf ... for example check out 'http://diffxml.sourceforge.net/') that compares 2 XML documents tag-by-tag, attribute-by-attribute, etc.
Perhaps some standard XML-related tool such as XSLT will allow you to keep the formatting constant without changing tag or attribute order. You'd have to research that.
BTW, a related problem is the config (.ini) file problem ... many common processors flippantly announce that the write-order may not agree with the read-order.
I have a task to create reports about various work items from a Team Foundation Server 2010 instance. They are looking for more information than the query tools seem to expose which is why I am not using the OOB reporting capabilities. The documentation on creating custom reports against TFS identify the Tfs_Analysis cube and the Tfs_Warehouse database as the intended sources for reporting.
They have created a custom work item, "Deployment Requests", to track requests for code migrations. This work item has custom urgency levels (critical, medium, low).
According to Manually Process the Data Warehouse and Analysis Services Cube for Team Foundation Server, every two minutes my ODS (Tfs_DefaultCollection) should sync with the Tfs_Warehouse and every 2 hours it hits the Tfs_Analysis cube. The basic work items correctly show up in my Tfs_Warehouse except not all of the data makes it over, in particular, the urgency isn't getting migrated.
As a concrete example, work item 19301 was a deployment request. This is what they can see using the native query tool from the web front-end.
I can find it in the Tfs_DefaultCollection and the "Urgency" is mapped to Fld10176.
SELECT
Fld10176 AS Urgency
, *
FROM Tfs_DefaultCollection.dbo.WorkItemsAre
WHERE ID = 19301
trimmed results...
Urgency Not A Field Changed Date
1 - Critical - (Right Away) 58 2011-09-07 15:52:29.613
If I query the warehouse, I see the deployment request and the "standard" data (people, time, area, etc)
SELECT
DWI.System_WorkItemType
, DWI.Microsoft_VSTS_Common_Priority
, DWI.Microsoft_VSTS_Common_Severity
, *
FROM
Tfw_Warehouse.dbo.DimWorkItem DWI
WHERE
DWI.System_Id = 19301
Trimmed results
System_WorkItemType Microsoft_VSTS_Common_Priority Microsoft_VSTS_Common_Severity
Deployment Request NULL NULL
I am not the TFS admin (first exposure to TFS is at this new gig) and thus far, they've been rather ...unhelpful.
Is there be a way to map that custom field over to an existing field in the Tfs_Warehouse? (Backfilling legacy values would be great but fixing current/future is all I need)
Is there a different approach I should be using?
Did you mark the field as reportable? See http://msdn.microsoft.com/en-us/library/ee921481.aspx for more information about this topic.
Based on Ewald Hofman's link, I ran
C:\Program Files\Microsoft Visual Studio 10.0\VC>witadmin listfields /collection:http://SomeServer/tfs > \tmp\witadmin.txt
and discovered a host of things not configured
Reportable As: None
At this point, I punted the ticket to the TFS admins and indicated they needed to fix things. In particular, examine these two fields
Field: Application.Changes
Name: ApplicationChanges
Type: PlainText
Use: Project1, Project2
Indexed: False
Reportable As: None
or
Field: Microsoft.VSTS.Common.ApplicationChanges
Name: Application Changes
Type: Html
Use: Project1, Project2
Indexed: False
Reportable As: None
It will be a while before the TFS Admins do anything but I'm happy to accept Edwald's answer.