Line breaks and extra spaces in HL7 message - hl7

My BizTalk orchestration receives an XML message as input. I am converting that message to an HL7 message using a Transform shape in the orchestration.
Now, if the input message contains an empty field in any of the nodes, the HL7 output breaks at that position and also includes a space in the message.
Can anyone help me resolve this issue? The following is my HL7 message:
Note --- copy this message into TextPad to see the exact spaces in it
MSH|^~\&|EEHR^bbbbbbbbbb|aaaaaaaaaaaaaaaaa^12699^DNS|KYIR|CDP|201103060733||VXU^V04|14962|P|2.3.1||||
PID|1||765874316^^^^SS||ssssss^anan^T|wwwww^^^^^^M|20100217|M||2135-2^YYYYYYYY or jjjjjj^HL70005|5896 hyhyhyhy Ave^Apt# 112^Wanta Fe^NM^85678^XXX^H^^049||5033331120X
^PRN^PH^^^505^5551120^~^NET^X.400^xxxxxx#yutyutopo.com|5056083515X4365^WPN^PH^^^505^6086715^4365|es^English^HL70296||||215486702|||H^erererer or qwqwqw^HL70189|bnbnbn|Y|1||||
Thanks.

I'm not entirely sure what the issue is - is it that there are spaces in the output HL7 message string? I'm not on my windows partition right now, so I'm not able to actually see any glaring issues with spacing in your posted message.
Anyway, if it is just spaces, can you just parse through the string and replace spaces in fields with an empty string?
Something like: message = message.replaceAll("\\| \\|", "||"); <-- This is Java code (note that String.replaceAll returns a new string, so assign the result back)
That code replaces every instance of '| |' with '||', i.e. it collapses fields containing a single space into empty fields.
Hope that helps.
Cheers
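To make that idea a bit more robust, here is a minimal Java sketch (my own, not from the thread) that collapses any whitespace-only field in one pass. The lookahead keeps the closing '|' in the input, so runs of adjacent whitespace-only fields are all caught:

```java
public class Hl7Cleanup {
    public static String clean(String message) {
        // "| |" -> "||": the lookahead (?=\|) does not consume the
        // closing '|', so adjacent whitespace-only fields are all
        // collapsed in a single replaceAll pass. [ \t]+ is used
        // instead of \s+ so segment separators (\r) are untouched.
        return message.replaceAll("\\|[ \\t]+(?=\\|)", "|");
    }
}
```

For example, `Hl7Cleanup.clean("PID|1| |X| | |Y")` yields `"PID|1||X|||Y"`, whereas the simple `"\\| \\|"` replacement would miss every other field in the `| | |` run.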

It seems your problem is that the segment separators are wrong.
While it would be possible to find the segment headers (a blank followed by a known segment ID and a field delimiter) and replace the blank with the correct segment separator, there is no guarantee that the same combination will not occur by chance somewhere other than the start of a segment.
So the best advice is to avoid the wrong segment separators in the first place and have the sender provide them correctly.
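For completeness, here is what such a heuristic repair could look like as a Java sketch. The segment-ID list is an illustrative assumption, not exhaustive, and as noted above the same byte pattern inside field data would be mangled too:

```java
public class SegmentFix {
    public static String fix(String message) {
        // Replace a blank that directly precedes a known segment ID
        // and a field delimiter with the HL7 segment separator '\r'.
        // Heuristic only: field data that happens to contain
        // " PID|" (etc.) would be corrupted by this replacement.
        return message.replaceAll(" (?=(PID|PV1|ORC|OBR|OBX|NK1|RXA)\\|)", "\r");
    }
}
```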

Related

How to see which line of the JSON source caused a parsing exception?

I have JSON files with hundreds of lines, but when there is an error that causes a parsing exception, the library returns a character position, not a line number.
A line number would be hugely helpful, since most text editors will show you, or take you to, a given line, but I don't know of any that navigate to an absolute character offset.
I found the spot in parse_error where deserialization member byte_ holds the character index, but it doesn't seem to have line number info.
Does the json container "know" which line it is, and I could ask for it in the exception handler? I know this isn't a trivial issue, since different OS's give us the "joys" of different EOLs, but perhaps it has been handled anyway?
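If the exception only reports a character offset, you can recover the line number yourself by counting newlines up to that offset. A minimal sketch of the counting logic (in Java, purely to illustrate; the caveat about differing EOLs applies, see the comment):

```java
public class ErrorLocator {
    // Recover a 1-based line number from a 0-based character offset
    // by counting '\n' characters before the offset. "\r\n" endings
    // contain a '\n' and are counted too; lone-'\r' (classic Mac)
    // endings would be missed by this approach.
    public static int lineOf(String text, int offset) {
        int line = 1;
        for (int i = 0; i < offset && i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                line++;
            }
        }
        return line;
    }
}
```

Note that if the parser reports a *byte* offset and the file is not ASCII/Latin-1, you would need to count newlines over the raw bytes rather than the decoded string.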

What should I do with emails using charset ansi_x3.110-1983?

My application is parsing incoming emails. I try to parse them as well as possible, but every now and then I get one with puzzling content. This time it is an email that looks to be ASCII but whose declared charset is ansi_x3.110-1983.
My application handles it correctly by defaulting to ASCII, but it throws a warning which I'd like to stop receiving, so my question is: what is ansi_x3.110-1983 and what should I do with it?
According to this page on the IANA's site, ANSI_X3.110-1983 is also known as:
iso-ir-99
CSA_T500-1983
NAPLPS
csISO99NAPLPS
Of those, only the name NAPLPS seems interesting or informative. If you can, consider getting in touch with the people sending those mails. If they're really using Prodigy in this day and age, I'd be amazed.
The IANA site also has a pointer to RFC 1345, which contains a description of the bytes and the characters that they map to. Compared to ISO-8859-1, the control characters are the same, as are most of the punctuation, all of the numbers and letters, and most of the remaining characters in the first 7 bits.
You could possibly use the guide in the RFC to write a tool to map the characters over, if someone hasn't written a tool for it already. To be honest, it may be easier to simply ignore the whines about the weird character set given that the character mapping is close enough to what is expected anyway...
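A sketch of that "close enough" fallback idea in Java: try the declared charset, and fall back to ISO-8859-1 when the runtime has no decoder for it (as far as I know, stock JVMs do not ship a decoder for ANSI_X3.110-1983):

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

public class CharsetFallback {
    // Try the charset declared in the mail headers; if the runtime
    // cannot resolve it, fall back to ISO-8859-1. For mostly-ASCII
    // bodies the two decode the 7-bit range identically, which is
    // the trade-off discussed above.
    public static Charset resolve(String declared) {
        try {
            return Charset.forName(declared);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return Charset.forName("ISO-8859-1");
        }
    }
}
```

This silences the "unknown charset" path without a per-message warning; only bytes above 0x7F would decode differently from true NAPLPS.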

Character #\u009C cannot be represented in the character set CHARSET:CP1252 - how to fix it

As already pointed out in the topic, I got the following error:
Character #\u009C cannot be represented in the character set CHARSET:CP1252
trying to print out a string returned by drakma:http-request. As far as I understand the error, the problem is that the Windows encoding (CP1252) does not support this character.
Therefore, to be able to process it, I may have to convert the whole string.
My question is what package/library does support converting strings to certain character-sets efficiently?
A similar question is this one, but just ignoring the error would not help in my case.
Drakma already does the job of "converting strings": after all, when it reads from some random webserver, it just gets a stream of bytes. It then has to convert that to a lisp string. You probably want to bind *drakma-default-external-format* to something else, although I can't remember off-hand what the allowable values are. Maybe something like :utf-8?

NHapi incomplete messages encoded partially and without error?

In NHapi, I'm attempting to create a pipe-encoded ORM message. When I call parser.Encode() on my populated message, only some of the segments are printed. Notably missing among them is MSH!
I don't know for sure, but I believe that the encoder is skipping segments that it considers to be incomplete.
I have given values for the required fields MSH-1, 2, 9, 10, 11, and 12, but I cannot get the MSH segment to encode.
If I am right that the MSH segment's incompleteness is causing this omission: Is there any way to have the PipeEncoder or some other validator throw exceptions if messages are not complete?
And: In any case, why is the MSH segment not encoding?
Perhaps this could help someone, so I won't just delete it. I was printing the encoded messages to the console and seeing only two segments, jumbled together at that, though I wasn't familiar enough with HL7 to realize what was wrong.
What was happening was that NHapi terminates segments with a lone '\r' (rather than "\r\n"), so the console was overwriting each line with the next segment. My PID segment was long enough to wrap onto a second line, which is why I saw two segments.
That was dumb.
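The display-side fix generalizes: swap the HL7 segment separator for the platform newline before printing, and keep the lone '\r' when actually transmitting. A small sketch (shown in Java, though NHapi itself is .NET; the logic is identical):

```java
public class Hl7Printer {
    // HL7 v2 separates segments with a lone '\r', which many consoles
    // treat as "return to column 0" and so overwrite the line.
    // Swap it for the platform line separator for display only;
    // keep the original '\r' form on the wire.
    public static String forDisplay(String hl7) {
        return hl7.replace("\r", System.lineSeparator());
    }
}
```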

WorkItem validation of "Plain Text" fields

I've got an application that bridges our help desk system with TFS (one way from Help Desk to TFS). When I create the work item in TFS, in some situations I'm getting an "InvalidCharacters" validation error.
The field I'm using is the standard "Description" field, which is defined as "Plain Text" in the Work Item definition.
This is only happening on one record, so I'm sure it's the data, but I can't figure out what character is being considered to be invalid. Is there any guidance on what will trigger the InvalidCharacters validation on "Plain Text" fields?
It looks like this field is unable to handle extended ASCII characters. There was an a with a grave accent (à) in the string I was trying to save.
-- EDIT --
This actually became even more frustrating. The character shown when I did a ToCharArray() was 'à'; however, when I finally found the spot in the string where it was failing, the actual character was a single-character ellipsis (…). It was probably introduced by someone copying and pasting comments from Word into our help-desk system.
My ultimate resolution was a brute force spin through the char array, replacing any character that had an int value of greater than 127 with something else (in my case, a question mark).
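The brute-force replacement described above can be sketched like this (in Java rather than C#, but the logic is the same; replacing with '?' is lossy by design):

```java
public class AsciiScrub {
    // Replace every character above 7-bit ASCII (code point > 127)
    // with '?'. Lossy on purpose: accented letters and smart
    // punctuation (like the one-character ellipsis) are discarded,
    // but the result is guaranteed to pass an ASCII-only validator.
    public static String toAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            sb.append(c > 127 ? '?' : c);
        }
        return sb.toString();
    }
}
```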
A ‘string’ field is invalid if it contains control characters other than newline, carriage return, and tab, or if it contains mismatched surrogate characters. Long-text fields (like plain text) accept everything except mismatched surrogate pairs. Make sure your copy/paste results in well-formed Unicode being pasted in.
You can use a Regex to compress all whitespace down to a single " " character, like this:
Regex.Replace(text, @"\s+", " ");
Although that strips more than you technically need to, since it also collapses newlines, carriage returns, and tabs.
Hope this helps!
