Capturing Field Name Metadata from a CSV File in Altova MapForce

I've been asked to prototype a replacement "file transformation process" (that currently is a mess of SQL) using Altova's MapForce. My input is a CSV file with headers. My problem is that I need to capture both the data AND the column name to use in downstream processing.
I need to have MapForce feed a C# method (imported into the mapping as a function library) that takes two parameters: fieldName and value. I can access the value trivially, but after hours poring over the manual (1000 pages!) I haven't found any examples of how to access the field name as an output.
The reason each output needs both the field name and the value has to do with how all our mappings/transformations are currently managed - in a database. The .NET code jumps in at this point and does any necessary database lookups.
For example, if I had the following file:
"Symbol", "Account", "Price", ...
"FOO", "10101", "1.23", ...
"BAR", "10201, "13.56", ...
And a static method string TransformField( string fieldName, string value ),
I'd like to map the CSV file's Symbol data output to the method's value parameter and the Field Name "Symbol" to the method's fieldName parameter.
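For context, here's a minimal sketch of what such a method could look like on the C# side - the SymbolMap dictionary is purely illustrative, standing in for whatever the database-driven rules actually do:

using System.Collections.Generic;

// Sketch only: the SymbolMap dictionary below is a made-up stand-in for the
// database lookup the real .NET code performs against our mappings tables.
public static class FieldTransformer
{
    private static readonly Dictionary<string, string> SymbolMap =
        new Dictionary<string, string> { { "FOO", "FOO.US" } };   // illustrative only

    public static string TransformField(string fieldName, string value)
    {
        // Dispatch on the column name that MapForce passes in; everything
        // stays a string at this stage, as noted in the limitations below.
        switch (fieldName)
        {
            case "Symbol":
                string mapped;
                return SymbolMap.TryGetValue(value, out mapped) ? mapped : value;
            default:
                return value;   // unknown columns pass through unchanged
        }
    }
}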
Some limitations:
I need to keep the "wiring" visible in the MapForce GUI. I'll have non-programmers maintaining the mappings in the future. So doing all this in code is not an option.
MapForce is the tool of choice by the company. Part of the reason our original process is such a mess is because the original programmer rolled his own mapping/transformation tool (out of TSQL no less - ouch).
We can treat all inputs/outputs to the method call as strings. Conversions will happen later.
I would like to avoid using scalar literals as inputs. I already have the column names from the file - I do not want to re-type each one and feed it to my method.
I'm not sure how many users out there have experience with this tool, but after 3 days of tinkering with it, I see much potential. If only I can get past this current sticking point, I think the company will have a solid alternative to their current mess.
Thanks for any/all suggestions.

I solved my issue and, for future reference, want to post a solution. I handled my problem by using MapForce's FlexText. This allowed me to extract the header from the CSV file and "invert" the column names as data inputs to the transformation process. Once I knew the approach to take, I was able to find more information directly from Altova.
I found a couple helpful tutorials while digging through their website:
Altova Online Videos
Web Tutorial
Hope this can help someone else in the future!

Related

Delphi and attachment files in MS access database

I have searched many times on Google and SO and can't find anything about working with attachments via Delphi, so I decided to write this question.
I have a table in an .accdb database called Files with these fields:
IDFile PK AutoIncField,
FileName WideStringField,
FilesAttached WideMemoField.
How can I save/load files to/from attachment fields using delphi?
Attach files and graphics to the records in your database
The problem here: in Delphi the data type of FilesAttached is TWideMemoField, and when I write ShowMessage(FDTable1FilesAttached.Value); it gives just the name of the attachment.
I don't know how to insert/save files to/from that field using Delphi.
It didn't seem that hard to find VBA/C# examples of working with .accdb Attachment fields which should translate fairly easily into Delphi. However, it turned out to be more difficult than I imagined to find something that a) hadn't misunderstood what Attachment fields actually are and b) actually works. Skip to the update section below.
For example, googling
accdb create attachment in vba
gives numerous hits including this one
http://sourcedaddy.com/ms-access/working-with-attachment-fields.html
which you might try as a starting point. It uses MS DAO objects, and includes straightforward code for storing files to Attachment fields and for accessing them. You would need to create a Delphi wrapper unit for the DAO type library, if you don't already have one, using the IDE's Import Type Library wizard.
If you would prefer something ADO-based, you might take a look at
https://www.codeproject.com/Questions/843001/Handling-fields-of-Attachment-type-in-MS-Access-us
Update: See the function OpenFirstAttachmentAsTempFile in the post by "aspen" (date = 4/11/2012 07:18 am) in this thread
https://access-programmers.co.uk/forums/showthread.php?t=224112&page=2
which shows an apparently successful attempt to extract a file from an attachment field (the thread also contains several other attempts at coding this function).
Note in particular this line
Set rstChild = rstCurrent.Fields(strFieldName).Value ' the .Value for a complex field returns the underlying recordset
which implies that the Value of the attachment field can return a recordset which contains the attached file(s).
Presumably, importing a recent version of the DAO type library into Delphi would allow
a Delphi app to do the same thing, and then one could reverse-engineer the rstChild recordset to see how to populate this field in code. I haven't done that yet, though.
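Untested, but for reference here is roughly how those DAO calls might look from Delphi using late-bound OLE automation (the table and field names come from the question; the 'DAO.DBEngine.120' ProgID and the FileData/FileName child-field names are my assumptions about the ACE/DAO attachment model):

uses System.Win.ComObj, System.Variants;   // ComObj / Variants in older Delphi versions

procedure SaveFirstAttachmentToDisk;
var
  Engine, Db, rstFiles, rstChild: OleVariant;
begin
  // Late-bound DAO (ACE) automation; an imported DAO type library would work too.
  Engine := CreateOleObject('DAO.DBEngine.120');
  Db := Engine.OpenDatabase('C:\data\test.accdb');
  rstFiles := Db.OpenRecordset('SELECT IDFile, FileName, FilesAttached FROM Files');

  // The .Value of an Attachment (complex) field is itself a child recordset,
  // with FileData / FileName / FileType columns and one row per attached file.
  rstChild := rstFiles.Fields['FilesAttached'].Value;
  if not rstChild.EOF then
    rstChild.Fields['FileData'].SaveToFile('C:\temp\' + rstChild.Fields['FileName'].Value);

  rstFiles.Close;
  Db.Close;
end;

If the late-bound route misbehaves, the imported-type-library route described above should expose the same objects with proper types (Recordset2, Field2).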

Rails - best practice to store dictionaries (key value pairs)

I need some architectural advice. I'm more into Java, but I'm trying to get up to speed with Ruby on Rails. In the app I am building I need a convenient place to store some dictionary values that will later be used in various places in the application. These will usually be key-value pairs - e.g. a list of values to be used in a select list.
The main objective is to keep this logic in one place of the application.
I am considering the following options:
Store the values in the database - I'm kind of reluctant to do that, as the values won't change very often.
Put all of the values in one class. In Java I'd have some static properties in one class holding these values (e.g. calling Utils.getStates() would return a list of states). How do I do that the Ruby way?
Have a .yml file with the values and read from it. How would I do that? I guess I have to parse the file in an initializer, but is there a tutorial on how to do it?
A precise example? Let's say I have a model that has a field called "Type". Type can be: ['Type A', 'Type B', 'Type C', ...]. And of course, for each type I want to have a key and a value.
I'd appreciate some suggestions about how you solve this problem in your apps.
Thanks,
Maciek
How often does the list change? Is it acceptable to have developers involved each time a value changes (updating code, re-deploying the app)? If the answer is no, then store the values in a database.
Is the list of values reusable? Then a gem or a YAML file with an initializer might be a good choice.
Is it just a small list that does not change often? Then you might want to consider a constant.
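For the constant and YAML options, a minimal sketch (the file names, the DICTIONARIES constant, and the Order model are just examples):

# config/dictionaries.yml
#   order_types:
#     a: Type A
#     b: Type B
#     c: Type C

# config/initializers/dictionaries.rb -- loaded once at boot
DICTIONARIES = YAML.load_file(Rails.root.join("config", "dictionaries.yml")).freeze

# app/models/order.rb -- or skip the YAML and use a plain constant for a small, stable list
class Order < ActiveRecord::Base
  TYPES = DICTIONARIES["order_types"]   # e.g. { "a" => "Type A", "b" => "Type B", ... }

  validates :order_type, inclusion: { in: TYPES.keys }
end

# In a view, e.g. building the select list:
#   options_for_select(Order::TYPES.map { |key, label| [label, key] })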
I think that in Rails, any data that changes at runtime and needs to be persisted would normally be stored in the database. I think that would be the "Rails way". You could save the data to a YAML or JSON file, but that would not follow the normal flow of the MVC pattern that is so common in Rails.

Looking for advice on how to normalize format of incoming json from different sources

I am working on a project that receives data (JSON) from a number of different sources. Each source returns JSON in a different format; however, all the services fall into the same category, i.e. Issues from Jira and Stories from PivotalTracker each carry the same core information.
I am looking for a way to normalize this as much as possible so that I can add other services and formats in the future. Right now I am handling each response type (Jira, PivotalTracker) separately and taking action on each response independently.
So far I am thinking that I'll need a parser for each service, i.e. JiraIssueParser, PivotalTrackerStoryParser, etc., each of which transforms the response into a common format that can be used by a single method to post onwards, rather than having separate methods for each service to receive/parse/post.
Something like this format:
{
  issue: {
    title: ,
    description: ,
    assignee: ,
    comments: {
      1: {
        id: ,
        title: ,
        body:
      }
    },
    time_entries: {
      1: {
        id: ,
        time: ,
        date:
      }
    }
  }
}
I would like to define the common schema somewhere so that each parser's output is always identical. I'm thinking this could be done with a YAML file but I'm not sure how to go about it, and how to use that in the parser.
I would greatly appreciate some suggestions on how to do this. Maybe this is a really stupid question and I should just be outputting the above format from each parser, but I think it would make sense to have some kind of format that is enforced/validated.
Suggestions are appreciated and I'm open to taking a new direction with this if anyone has any ideas. Thanks in advance.
If you're using Rails, I assume then that you are going to have a relational database at some point.
What I would suggest is to define ActiveRecord models that express your "normalized" format: Issue, Comment, TimeEntry, etc.
The job of your parsers, then, is to coerce the JSON data into the appropriate model objects and attributes and save them. Your models thus enforce the canonical data structure (i.e. the schema), and you can even use validators to do further sanity checks.
Finally, I would also save the raw JSON somewhere alongside your model, preferably also in the database. Even though you have already parsed the JSON, keeping it around will come in handy for troubleshooting. For example, if you find and fix a parsing bug, you can re-run the parser on the saved JSON without having to re-download everything from the original external sources.
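As a rough illustration of that flow - the Issue/Comment models match the format in the question, but the Jira JSON paths and the raw_payload column are assumptions on my part:

# Hypothetical parser for one source; each service gets its own class with the
# same public interface, while the ActiveRecord models enforce the common schema.
class JiraIssueParser
  def self.parse(json)
    payload = JSON.parse(json)
    fields  = payload["fields"] || {}

    issue = Issue.new(
      title:       fields["summary"],
      description: fields["description"],
      assignee:    (fields["assignee"] || {})["displayName"],
      raw_payload: json                  # keep the original JSON for troubleshooting
    )

    ((fields["comment"] || {})["comments"] || []).each do |c|
      issue.comments.build(title: (c["author"] || {})["displayName"], body: c["body"])
    end

    issue.save!   # validations on Issue/Comment do the sanity checks
    issue
  end
end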

Avoiding repeated type definition for each result when projecting in Breeze.js

I'm currently querying an Entity using projections to avoid returning the entire object.
It works flawlessly; however, when looking at the actual response from the server, I'm seeing the same type definition repeated for every single element.
For example:
["$type":"_IB_4NdB_p8LiaC3WlWHHQ_pZzrAC_plF4[[System.Int32, mscorlib],[System.String, mscorlib],[System.String, mscorlib],[System.String, mscorlib],[System.Nullable`1[[System.Int32, mscorlib]], mscorlib],[System.Int32, mscorlib],[System.Single, mscorlib]], _IB_4NdB_p8LiaC3WlWHHQ_pZzrAC_plF4_IdeaBlade"
Now, given that every item in the result is sharing the same projection for that query, is there a way to have Breeze only define the Type Description ONCE instead of for every element?
It may not seem like a big deal, but as the result size increases those bytes do start to add up. At the moment there is little difference between returning the projected values and returning the entire entity itself, due to this overhead.
NOTE: As it turns out, since we use Dynamic Compression of JSON in our real environments, this actually turns out to be a minor issue, since 200KB responses actually turn into less than 20KB traffic after gzip compression. Will probably be closing this question, unless someone has something to add that could be of use to others.
Update 18 September 2014
I decided to "cure" the problem of the long ugly $type names in serialized data for both dynamic types from projection queries and anonymous types created for an endpoint such as "Lookups".
There's a new Breeze Labs nuget package, "Breeze.DynamicTypeRenaming" (search for "Breeze Dynamic Type Renaming"). This adds two files to your Web API project's "Controllers" folder. One is a CustomBreezeConfig which replaces Breeze's default config and resets the Json.Net "Binder" setting with the new DynamicTypeRenamingSerializationBinder; this binder does the type name magic.
Just install the nuget package in your Web API project and it should "just work". In your case, the $type value would become "_IB_4NdB_p8LiaC3WlWHHQ_pZzrAC_plF4, Dynamic".
See an example of it in the "DocCode" sample.
As always, this is a Breeze Lab product, not part of the core Breeze product. It is offered "as is" with no promise of support. I'm pretty sure it's good and has no adverse side-effects. No guarantees. I'm sure you'll let me know if there's a problem.
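For anyone curious about the mechanics, a binder along these lines is the general idea (this is a guess at the approach, not the actual Breeze.DynamicTypeRenaming source, and the "_IB_" prefix check is just a heuristic):

using System;
using Newtonsoft.Json.Serialization;

// Shortens the $type emitted for dynamic projection types while leaving normal
// types alone. Hook it up via JsonSerializerSettings.Binder on the server.
public class DynamicTypeShorteningBinder : DefaultSerializationBinder
{
    public override void BindToName(Type serializedType, out string assemblyName, out string typeName)
    {
        if (serializedType.Assembly.IsDynamic || serializedType.Name.StartsWith("_IB_"))
        {
            assemblyName = "Dynamic";         // stands in for the long generated assembly name
            typeName = serializedType.Name;   // drops the generic-argument noise
            return;
        }
        base.BindToName(serializedType, out assemblyName, out typeName);
    }
}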
That IS atrocious, isn't it! That's the C# generated anonymous type. You can get rid of it by casting into a custom DTO type.
I don't know if it is actually harmful. I hate looking at it in any case.
Lately I've been thinking about adding a JSON.NET IContractResolver that detects such uglies and turns them into shorter uglies. Wouldn't be hard. Just haven't had the time.
Why not write that yourself and contribute to the community? We'd be grateful! :-)
Using Dynamic Compression of JSON output has turned this into a non-issue, at least for now, since all that repeated content is heavily compressed server-side.

Parsing a CSV for Database Insertion when Formatted Incorrectly

I recently wrote a mailing platform for one of our employees to use. The system runs great, scales great, and is fun to use. However, it is currently inoperable due to a bug that I can't figure out how to fix (fairly inexperienced developer).
The process goes something like this...
Upload a CSV file to a specific FTP directory.
Go to the import_mailing_list page.
Choose a CSV file within the FTP directory.
Name and describe what the list contains.
Associate file headings with database columns.
Then, the back-end loops over each line of the file, associating the values with a heading, and importing these values into a database.
This all works wonderfully, except in a specific case, when a raw CSV is not correctly formatted. For example...
fname, lname, email
Bob, Schlumberger, bob#bob.com
Bobbette, Schlumberger
Another, Record, goeshere#email.com
As you can see, there is a missing comma on line two. This would cause an error when attempting to pull "valArray[3]" (or valArray[2], in the case of every language but mine).
I am looking for the most efficient solution to keep this error from happening. Perhaps I should check the array length, and compare it to the index we're going to attempt to pull, before pulling it. But to do this for each and every value seems inefficient. Anybody have another idea?
Our stack is ColdFusion 8/9 and MySQL 5.1. This is why I refer to the array index as [3].
There's ArrayIsDefined(array, elementIndex), or ArrayLen(array)
seems inefficient?
You gotta code what you need to code, forget about inefficiency. Get it right before you get it fast (when needed).
I suppose if you are looking for another way of doing this (instead of checking the array length each time, although that really doesn't sound that bad to me), you could wrap each line insert attempt in a try/catch block. If it fails, then stuff the failed row in a buffer (including the line number and error message) that you could then display to the user after the batch has completed, so they could see each of the failed lines and why they failed. This has the advantages of 1) not having to explicitly check the array length each time and 2) catching other errors that you might not have anticipated beforehand (maybe a value is too long for your field, for example).
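A rough cfscript sketch of that idea (the column count, the variable names, and the insertRecipient() call are placeholders for your own import code):

<cfscript>
// Wrap each row in try/catch and collect failures instead of aborting the whole batch.
failedRows = [];
lines = listToArray(fileRead(csvPath), chr(10));
for (i = 2; i lte arrayLen(lines); i = i + 1) {        // row 1 is the header
    try {
        values = listToArray(lines[i], ",", true);      // true = keep empty fields
        if (arrayLen(values) gte 3) {
            insertRecipient(trim(values[1]), trim(values[2]), trim(values[3]));
        } else {
            failure = { line = i, error = "expected 3 columns, got #arrayLen(values)#" };
            arrayAppend(failedRows, failure);
        }
    }
    catch (any e) {
        failure = { line = i, error = e.message };
        arrayAppend(failedRows, failure);
    }
}
// After the loop, report failedRows (line numbers plus messages) back to the user.
</cfscript>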
