HL7 version 3 parsing


I was parsing HL7 version 2.x messages through HAPI. Now I want to parse HL7 version 3 messages, which are in XML format. HAPI does not support HL7 version 3, so how can I do this?

HL7 version 3 is essentially XML-formatted HL7 data, so any general-purpose XML parser can read it. That said, you would have to build in the intelligence about message structure (segments, etc.) yourself.
It does, however, appear that there is an HL7 v3 Java Special Interest Group, which has developed an API at least for RIM.
Another option would be to look at an integration engine. An open source option here is Mirth. Mirth is an interface integration engine. It is a separate product - not a library you would integrate into your own code. It can, however, take over the heavy lifting of converting HL7 to something more useful in your application - a Web Service call, a database insert, a differently formatted file (PDF, EDI, etc.).
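As a rough illustration of the "any XML parser" route mentioned above, here is a minimal sketch that uses only the JDK's built-in DOM and XPath APIs. The sample fragment, its OID and message ID, and the class name Hl7V3XPathSketch are assumptions made up for illustration; the root element name is borrowed from a real interaction ID, but the content is trimmed and not schema-valid. Only the urn:hl7-org:v3 namespace comes from the standard itself.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class Hl7V3XPathSketch {
    // Namespace used by HL7 v3 message instances.
    static final String HL7_NS = "urn:hl7-org:v3";

    // Illustrative fragment only: placeholder OID and ID, not a schema-valid message.
    static final String SAMPLE =
        "<PRPA_IN201305UV02 xmlns=\"urn:hl7-org:v3\" ITSVersion=\"XML_1.0\">"
      + "<id root=\"1.2.3.4\" extension=\"MSG-0001\"/>"
      + "<creationTime value=\"20240101120000\"/>"
      + "</PRPA_IN201305UV02>";

    // Parses the XML with namespace awareness, which HL7 v3 requires.
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Evaluates an XPath expression in which the prefix "hl7" maps to the v3 namespace.
    static String evaluate(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return "hl7".equals(prefix) ? HL7_NS : XMLConstants.NULL_NS_URI;
            }
            public String getPrefix(String uri) { return null; }
            public Iterator<String> getPrefixes(String uri) { return null; }
        });
        return xpath.evaluate(expression, doc);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        System.out.println("interaction: " + doc.getDocumentElement().getLocalName());
        System.out.println("message id:  " + evaluate(doc, "/*/hl7:id/@extension"));
        System.out.println("created:     " + evaluate(doc, "/*/hl7:creationTime/@value"));
    }
}
```

This only gets the raw values out. Knowing which elements matter, and what they mean, is the part you still have to supply yourself or take from one of the libraries mentioned in the answers.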

Mohawk College publishes a Free and Open Source (FLOSS) API Framework for HL7 version 3 messaging and CDA Document processing called the "Everest Framework".
This framework is available for Java and .NET and comes with extensive examples and documentation on how to use HL7v3 messaging.
You can download the framework at (https://github.com/MohawkMEDIC/everest).
Support is also available via the GitHub project page.
This framework was developed through grant funding provided by the Natural Sciences and Engineering Research Council of Canada (NSERC) and Canada Health Infoway.

I used the HL7 Java SIG API some time ago (2008), but it is very easy to either 1. generate your own parser from the schemas using JAXB (which generates Java classes from .xsd files), or 2. write your own parser from scratch (I would suggest Groovy's XmlSlurper: http://www.groovy-lang.org/processing-xml.html).

You were asking for a link to the official parser for HL7v3. Go to the section under "v3 Utilities"; I'll admit it's not easy to find, but here it is:
http://www.hl7.org/participate/toolsandresources.cfm?ref=nav
They have some examples and data files to test with as well.

Related

AWS SWF vs Flow Framework

I am familiar with the concept of Amazon SWF, and I can see SDKs in many different languages for using SWF services. Amazon's Flow Framework is a set of libraries for implementing distributed applications, but it is currently available only in Java and Ruby. How, then, can we write distributed applications using SWF in other languages such as Python or PHP? Does this mean Amazon provides the framework in Java and Ruby only, and the rest of the languages rely on other vendors' libraries? Please explain.

You are right that AWS currently only provides high-level frameworks for Ruby and Java ("Flow" frameworks). Low-level access to SWF is available in most (all?) official SDKs though: boto2/3 for Python, go-sdk, etc.
When using SWF, you'll find yourself implementing mainly two types of programs: "activity workers" and "deciders" (http://docs.aws.amazon.com/amazonswf/latest/developerguide/swf-dev-actors.html).
Using the Flow framework is not mandatory, but it helps implementing deciders by providing high-level abstractions for describing synchronisation points, defining which tasks can be run in parallel, retries, etc. There are also non-official libraries (I'm personally maintaining one for my company, "simpleflow").
If you want to use other languages for deciders, I recommend you try to use an existing framework first, then see if you want to implement this yourself (it's not trivial from my experience).
If you want to implement activities in other languages, I recommend you start using the Flow framework end-to-end, and then you can either 1/ fork and use your favorite language as a subprocess of Ruby/Java Flow workers, or 2/ mimic the serialisation logic of the Flow framework and implement workers directly yourself with low-level APIs (which is simple: poll for an activity, do work, then respond to SWF with the result).
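The "poll for an activity, do work, then respond" loop described above can be sketched roughly as follows. This is not the AWS SDK: SwfClient is a hypothetical interface whose method names merely mirror the SWF actions PollForActivityTask, RespondActivityTaskCompleted and RespondActivityTaskFailed, and FakeSwf is an in-memory stand-in so the sketch runs without an AWS account. The domain and task list names are placeholders.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

class ActivityWorkerSketch {
    // Hypothetical stand-in for an SDK client; a real worker would call the SDK here.
    interface SwfClient {
        Task pollForActivityTask(String domain, String taskList);
        void respondActivityTaskCompleted(String taskToken, String result);
        void respondActivityTaskFailed(String taskToken, String reason);
    }

    // Minimal view of an activity task: the token identifies it, the input is opaque text.
    static final class Task {
        final String taskToken;
        final String input;
        Task(String taskToken, String input) {
            this.taskToken = taskToken;
            this.input = input;
        }
    }

    // Polls up to maxPolls times; a production worker would loop until shut down.
    static void run(SwfClient swf, String domain, String taskList,
                    Function<String, String> work, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            Task task = swf.pollForActivityTask(domain, taskList);
            // A long poll that times out yields no task (SWF returns an empty token).
            if (task == null || task.taskToken == null || task.taskToken.isEmpty()) {
                continue;
            }
            String result;
            try {
                result = work.apply(task.input);
            } catch (RuntimeException e) {
                swf.respondActivityTaskFailed(task.taskToken, String.valueOf(e.getMessage()));
                continue;
            }
            swf.respondActivityTaskCompleted(task.taskToken, result);
        }
    }

    // In-memory fake that records what the worker reported.
    static final class FakeSwf implements SwfClient {
        final Queue<Task> pending = new ArrayDeque<>();
        final List<String> log = new ArrayList<>();
        public Task pollForActivityTask(String domain, String taskList) {
            return pending.poll();
        }
        public void respondActivityTaskCompleted(String taskToken, String result) {
            log.add("completed " + taskToken + " " + result);
        }
        public void respondActivityTaskFailed(String taskToken, String reason) {
            log.add("failed " + taskToken + " " + reason);
        }
    }

    public static void main(String[] args) {
        FakeSwf swf = new FakeSwf();
        swf.pending.add(new Task("token-1", "hello"));
        run(swf, "demo-domain", "demo-tasklist", String::toUpperCase, 2);
        System.out.println(swf.log);
    }
}
```

If the deciders are written with the Flow framework, the input and result strings also have to follow its serialisation format, which is the part the answer suggests mimicking; that format is not shown here.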

CCD and HL7 V3 / V2

My company is going to be an HIE and we are figuring out how to exchange our information with other systems. We are located in the USA and I see that the current common standard is HL7 V2. HL7 V3 is not backward compatible with HL7 V2. These are the transactions that we are planning to implement:
o Discharge Summaries
o Progress Notes
o Lab Results
o Procedures/Orders
This whole thing is complicated and I am trying to figure it out using bits and pieces of info scattered around the net. So here are my questions:
Should we look at HL7 V3 or V2 implementation?
Are CCD and HL7 V2 compatible? Most of the documents I saw from HITSP talk about HL7 V3.
Where can I find an exhaustive list of data fields which are required for all the transactions above? Is it defined by HITSP?
Do the fields defined for each transaction change depending on whether we go with HL7 V3 or V2?
Any help will be much appreciated!

If you're going to be an HIE, you're going to need to handle both HL7 v2 and v3. My experience is that almost all "real life" data transmission is v2 (I think this is probably because most systems capable of v3 transmission also allow v2 formatting, whereas the older systems cannot handle v3, but this is just speculation). There's a very superficial overview of the core differences between v2 and v3 here, which talks a bit about what fundamental changes are being made besides format (delimited vs XML).
The second of your questions doesn't really have a definitive answer, and ties in with your fourth question. CCD can be HL7 v2 compatible, but it's a lot more difficult and relies on the sender and receiver both having a common understanding on a number of issues (HL7 v2 is a very loose "standard"). Here's HL7standard.com's thoughts on the subject:
The short answer is "yes" with the long answer being "it depends."
There are different levels of CDA/CCD documents. It probably would be
possible to create a Level-1 CDA document from an HL7 v2 message,
assuming you had a fairly robust HL7 v2 report of some flavor (MDM or
ORU type of message most likely).
It would be harder to take an HL7 v2 message and convert it into a
Level-2 or Level-3 CDA document. More data is required in those
formats, and that data needs to be structured, not just free form
text.
The CDA is based on the HL7 v3 RIM, and compliant CDA documents will
follow that data model. Mapping from one data model to another can be
challenging.
All CDA documents need to have a responsible party or person who
'signs' the document. At a minimum, the HL7 v2 message would have to
have this information.
Converting a CDA document into an HL7 v2 message is possible, but you
run the risk of losing the 'context' of the data that is in the
original document if it is a highly structured Level-2 or Level-3
document. If the CDA contains nothing more than a textual report and
some patient demographic information, then it can be converted into a
comparable v2 message.
For getting a full list of fields, I'd just purchase the full HL7 standard... and maybe have a look over at bluebuttonplus.org... though keep in mind that v2 can result in some confusion, as message sub-segments aren't always consistently used between different actors. Anyway, I don't think that there are clean answers to most of what you're asking, but at the very least you will really need to handle both formats. I'd also throw in that you may want to look at a product like Mirth or InterfaceWare to handle some of your parsing and testing.

You'll definitely want to take a look at HL7.org. The HL7 standards are now available for free; a large part of them is licensed at no cost. As far as HL7 version goes, 2.5.1 seems to be the standard for the v2 space. As stated in the other answer, you'll want to support both V2 and V3 HL7.
If you are looking into sending the CCD, search the same HL7 site for QRDA, the Quality Reporting Document Architecture, an XML HL7 standard for the transmission of the CDA.
Two other notes on this: you'll want to look at the emerging discussions around QRDA I and QRDA III that are taking place now with ONC, and at the revised standards for the C-CDA.
Here's a direct link to what I believe is the most recent CDA standard on HL7.org:
HL7/ASTM Implementation Guide for CDA Release 2 -Continuity of Care Document (CCD®) Release 1
Finally, you will need to have conversations with the major EHR players such as Epic, Cerner and Meditech. They'll need to be on board with your solution and can guide you through the implementation process. In addition to that, they will need to know how best to set up their systems to talk to yours so they can guide their clients. Without this I'm afraid you'll get no traction.

How to do a REST webserver with Delphi as a backend for a big web application?

I read this question but was somehow not satisfied with the answers.
I also quickly read (as suggested in that question) the last chapter of Marco Cantù's Delphi 2010 Handbook, from which I quote the following (I think I can quote such a short text):
I [Marco Cantù] do have a lot of
investment in server side web and REST
applications written in Delphi, and in
the recent years I've started playing
with and introducing at conferences a
Delphi Web Application REST
Framework (that is, DWARF), which
at this time is still not publicly
available... simply because it is too
sketchy and unfinished to be
published. I've seen other ongoing
efforts to clone Rails in Delphi and
offer other REST server architectures.
I think that if you want to build a
very large REST application
architecture you should roll out your
own technology or use one of these
prototypical architectures.
Considering that I own Delphi XE Professional, which does not include DataSnap, and that I would like to write large applications too, the comments above suggest that DataSnap is not an option.
Is there even a commercial solution for this? I don't want to consider "my own implementation of REST"; I would like to create a webserver that uses some of my datamodules, where I use the DAC I choose (Devart in this case).
Final note: my goal is to write the backend for a large web application. On the client I would like to use Ext JS 4.0, and I want to do all the client work in JavaScript to take full advantage of Ext JS, so basically I need a webserver just for the data and for tracking the state, not for serving webpages.

To create your REST services, try our Open Source mORMot project. Now it is a well known and stabilized project, used world wide in production.
You can use any DAC with the current state of the framework by implementing a custom TSQLRestServerStatic class (similar to the TSQLRestServerStaticInMemory class, but calling your DAC): so you'll benefit from the ORM and the JSON RESTful architecture, together with the high-speed http.sys kernel-mode server.
The SQLite3 engine is NOT mandatory with our framework, even if it was designed to work better with it.

If you are starting an application from scratch, I think mORMot is a good option if Delphi is your only option. If you choose DataSnap you'll have to live with the problems of performance and stability.
I wrote an article on my blog talking about performance and stability with DataSnap (and mORMot) in large applications, you can see it on the following link:
DataSnap analysis based on Speed & Stability tests

I think you should have a look at kbmMW; there is a way to implement a basic REST server based on an event-driven HTTP server.
Check the news.components4developers.com newsgroups, where you will find a lot of documentation.

FireHttp is a high-performance Web server based on Delphi/Object Pascal language. It supports HTTP 1.1, HTTPS (SSL/TLS), WebSocket, GZip, Deflate, IOCP, EPOLL. It adopts multi-process+multi-threading model, has good stability and concurrency performance, and provides SDK source code. Developers can use SDK to quickly build high-performance cross-platform Web applications.

What exactly does the Open XML SDK v2 take care of that you would have to do manually when coding by hand with an XML library?

This is closely related to another question I asked: Is there functionality that is NOT exposed in the Open XML SDK v2?
I am currently working with Open XML files manually. I recently had a look at the SDK and was surprised to find that it looked pretty low level, quite similar in fact to the helper classes I have created myself. My question is what exactly does the SDK v2 take care of that you would have to do manually when coding by hand with an XML library?
For example, would it automatically patch the _rels files when deleting a PowerPoint slide?

In addition to Otaku's links, this shows an example (near the bottom) of navigating an OpenXML document using the IO.Packaging namespace versus the SDK.
Just like Microsoft states on the download page for the SDK:
The Open XML SDK 2.0 for Microsoft
Office is built on top of the
System.IO.Packaging API and provides
strongly typed part classes to
manipulate Open XML documents. The SDK
also uses the .NET Framework
Language-Integrated Query (LINQ)
technology to provide strongly typed
object access to the XML content
inside the parts of Open XML
documents.
The Open XML SDK 2.0 simplifies the
task of manipulating Open XML packages
and the underlying Open XML schema
elements within a package. The Open
XML Application Programming Interface
(API) encapsulates many common tasks
that developers perform on Open XML
packages, so you can perform complex
operations with just a few lines of
code.
I've worked pretty much only with the SDK, but for example, it's nice to be able to grab a table out of a Word document by just using:
Table table = wordprocessingDocument.MainDocumentPart.Document.Body.Elements<Table>().First();
(I mean, assuming it's the first table)
I'd say the SDK does exactly what it seeks to do by providing a sort of intuitive object-based way to work with documents.
As far as automatically patching the relationships -- no, it doesn't do that. And looking back at how you actually state the question, I guess I might even say that (and I'm fairly new to Open XML so this isn't gospel by any means) the SDK 2.0 doesn't necessarily offer any extra functionality, so much as it offers a more convenient way to achieve the same functionality. For example, you still need to know about those relationships when you delete an element, but it's a lot easier to deal with them.
Also, there have been some efforts on top of the SDK to add even more abstraction -- see, for example, ExtremeML (Excel library only. I've never used it but I think it does get into things like patching relationships).
So I'm sorry if I've rambled a bit too much here. But I guess my short answer is: there's probably not extra functionality, but there's a nice level of abstraction that makes achieving certain functionality a lot easier to handle -- and if you've been doing it by hand up until now, you'll certainly have the understanding of the OPC to understand what exactly is being abstracted.

As a starting point, read this from the Brian Jones & Zeyad Rajabi blog.
I don't know of a side-by-side comparison, but the following articles/videos do discuss the two:
Using the Open XML SDK 2.0 Classes
Versus Using .NET XML Services is
a good place to start comparing the
two.
Open XML and the Open XML SDK is
a deep dive video which discusses both.
Finally, this is a What's New for 2.0 - it can be assumed that neither 1.0 nor hand-coding has these benefits.

Looking for an information retrieval / text mining application or library

We extract various information from e-mails - flights, car rentals, hotels and more. The method is to extract the body of the mail, usually in HTML form, but sometimes it's text or we use the information in a PDF/Word/RTF attachment. We then apply regular expressions (sometimes in several steps) in order to get information, which is provided in a tabular form (you can think of a flight table, hotel table, etc.). Notice that even though we parse HTML, this is not web scraping.
Currently we are using QL2's WebQL engine, but we are looking to replace it for business reasons. Can you recommend another engine? It must run on Linux and be accessible from Java (a Java API would be best, but Web services are a good solution as well). It also must support regular expressions for text extraction and not just be based on the HTML structure.

I recommend that you have a look at R. It has an extensive number of text mining packages: have a look at the Natural Language Processing view. In particular, look at the tm package. Here are some relevant links:
Paper about the package in the Journal of Statistical Software: http://www.jstatsoft.org/v25/i05/paper. The paper includes a nice example of an analysis of the R-devel mailing list (https://stat.ethz.ch/pipermail/r-devel/) newsgroup postings from 2006.
Package homepage: http://cran.r-project.org/web/packages/tm/index.html
Look at the introductory vignette: http://cran.r-project.org/web/packages/tm/vignettes/tm.pdf
In addition, R provides many tools for parsing HTML or XML. Have a look at this question for an example using the RCurl and XML packages.
Edit: You can integrate R with Java with JRI. It's a very widely used package, with many examples. You can also see these related questions.

Have a look at:
LingPipe - LingPipe is a suite of Java libraries for the linguistic analysis of human language.
Lucene - Apache Lucene is a high-performance, full-featured text search engine library written entirely in Java.

Just wanted to update - our final decision was to implement the parsing in Groovy, and to add some required functionality (HTML to text, PDF to text, cleaning whitespace, etc.) either by implementing it in Java or by relying on 3rd party libraries.
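For readers weighing the same do-it-yourself route, the multi-step approach described in the question (reduce the mail body to text, then apply regular expressions to produce table rows) might look roughly like this in plain Java. The sample mail body, the "Flight ... to ... on ..." line format, the column names and the class name FlightExtractorSketch are all invented for illustration; real mails would need one pattern per sender format and a proper HTML-to-text library rather than the crude tag stripping shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlightExtractorSketch {
    // Invented sample body; real input would come from the mail or an attachment.
    static final String SAMPLE_HTML =
        "<p>Your itinerary</p><b>Flight</b>&nbsp;LH123 FRA to JFK on 2024-05-01<br/>"
      + "Flight BA45 JFK to LHR on 2024-05-09";

    // Hypothetical line format: "Flight <number> <from> to <to> on <yyyy-MM-dd>".
    static final Pattern FLIGHT = Pattern.compile(
        "Flight\\s+(?<number>[A-Z]{2}\\d{1,4})\\s+(?<from>[A-Z]{3})\\s+to\\s+"
      + "(?<to>[A-Z]{3})\\s+on\\s+(?<date>\\d{4}-\\d{2}-\\d{2})");

    // Step 1: crude HTML-to-text conversion (line breaks kept, tags dropped, spaces collapsed).
    static String htmlToText(String html) {
        return html.replaceAll("(?i)<br\\s*/?>", "\n")
                   .replaceAll("<[^>]+>", " ")
                   .replace("&nbsp;", " ")
                   .replaceAll("[ \\t]+", " ")
                   .trim();
    }

    // Step 2: turn every match into one row of the "flight table".
    static List<Map<String, String>> extractFlights(String text) {
        List<Map<String, String>> rows = new ArrayList<>();
        Matcher m = FLIGHT.matcher(text);
        while (m.find()) {
            Map<String, String> row = new LinkedHashMap<>();
            row.put("number", m.group("number"));
            row.put("from", m.group("from"));
            row.put("to", m.group("to"));
            row.put("date", m.group("date"));
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        for (Map<String, String> row : extractFlights(htmlToText(SAMPLE_HTML))) {
            System.out.println(row);
        }
    }
}
```

Because the pattern runs on the flattened text rather than on the markup, it does not depend on the HTML structure, which was one of the requirements in the question.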

I use a custom parser made with Flex and C++ for similar purposes. I'd suggest you take a look at parser generators in Java (JavaCC, which uses .jj grammar files; see the javacc-faq). Nutch does it this way (NutchAnalysis.jj).
