How to parse an unconventional file with Talend? - parsing

I have a file shaped like this:
How can I parse a file like this with Talend Open Studio?
Here's what I tried:
In the tJavaRow, the input is the whole file in a single row. I split it and parse it manually. But I can't figure out how to create an output row for each OBJ in the file.
Is this the "right" way of doing it? Or is there a specific component for this type of file?

But I can't figure out how to create an output row for each OBJ in the file
You can do this by using the tJavaFlex component:
1. Put your raw content in the globalMap by connecting it to tFlowToIterate
2. Put your split-and-parse logic in the "Start Code" part of tJavaFlex, using the contents you made available in step 1
3. Start a loop in the "Start Code" part of tJavaFlex (e.g. for each object)
4. Define your output schema in tJavaFlex
5. In the "Main Code" part of tJavaFlex, map your parsed object to the columns of your output row
6. Don't forget to close your loop in the "End Code" part of tJavaFlex
I laid out a quick example, with no parsing logic. But since you already have that part working, I think it should be sufficient:
Start Code
String[] lines = ((String)globalMap.get("row1.content")).split("\r\n");
for(String line : lines) { // starts the "generating" loop
Main Code
row2.key = line; // uses the "generating" loop
End Code
} // closes the "generating" loop
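For readers who want to see the three parts in one place, here is a minimal plain-Java sketch of the same loop outside Talend. It assumes that the Start, Main and End code end up around a single loop, as in the example above. The class FlexLoopSketch, the OutputRow type and the sample content are hypothetical stand-ins, not Talend API.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch of how the three tJavaFlex code parts fit around one loop.
final class FlexLoopSketch {

    // Hypothetical stand-in for the output row ("row2" in the example above).
    static final class OutputRow {
        String key;
    }

    static List<OutputRow> generateRows(String content) {
        List<OutputRow> rows = new ArrayList<>();

        // --- Start Code: runs once and opens the "generating" loop ---
        String[] lines = content.split("\r\n");
        for (String line : lines) {

            // --- Main Code: runs once per iteration and fills one output row ---
            OutputRow row2 = new OutputRow();
            row2.key = line;
            rows.add(row2); // stands in for handing the row to the next component

        // --- End Code: closes the "generating" loop ---
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample content; the real file layout is not shown in the question.
        for (OutputRow row : generateRows("OBJ a\r\nOBJ b\r\nOBJ c")) {
            System.out.println(row.key);
        }
    }
}
```

The real parsing logic would replace the simple split, but the shape stays the same: one loop iteration per object, one output row per iteration.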

Related

ImageJ/Fiji - Save CSV using macro

I am not a coder but trying to turn ThunderSTORM's batch process into an automated one where I have a single input folder and a single output folder.
input_directory = newArray("C:\\Users\\me\\Desktop\\Images");
output_directory = ("C:\\Users\\me\\Desktop\\Results");
for(i = 0; i < input_directory.length; i++) {
open(input_directory[i]);
originalName = getTitle();
originalNameWithoutExt = replace( originalName , ".tif" , "" );
fileName = originalNameWithoutExt;
run("Run analysis", "filter=[Wavelet filter (B-Spline)] scale=2.0 order=3 detector "+
"detector=[Local maximum] connectivity=8-neighbourhood threshold=std(Wave.F1) "+
"estimator=[PSF: Integrated Gaussian] sigma=1.6 method=[Weighted Least squares] fitradius=3 mfaenabled=false "+
"renderer=[Averaged shifted histograms] magnification=5.0 colorizez=true shifts=2 "+
"repaint=50 threed=false");
saveAs(fileName+"_Results", output_directory);
}
This probably looks like a huge mess, but the original batch file used arrays and I can't figure out what those are. Taking the array out breaks it, so I left it in. The main issues I have revolve around the saveAs part not working.
Using run("Export Results") works but I need to manually pick a location and file name. I tried to set this up to take the file name and rename it to the generic image name so it can save a CSV using that name.
Any help pointing out why I'm a moron? I would also love to only open one file at a time (this opens them all) and close it when the analysis is complete. But I will settle for that happening on a different day if I can just manage to save the damn CSV automatically.
For the most part, I broke the code a whole bunch of times but it's in a working condition like this.
I appreciate any and all help. Thank you!

PHPSpreadsheet Fails to Save Line Chart Loaded From Template (Locked by 'another user')

I am running into a problem where phpspreadsheet will not write out a line chart from an xlsx file I'm using as a template.
I get two error messages when opening the spreadsheet:
We found a problem with some content in 'Hello World.xlsx'. Do you want us to try to recover as much as we can? If you trust the source of the workbook, click Yes
Hello World.xlsx is locked for editing
by 'another user'
Open 'Read-Only' or click 'Notify' to open read-only and receive notification when the document is no longer in use.
<?php
// ... snip
$reader = IOFactory::createReader($inputFileType);
$reader->setIncludeCharts(true);
$spreadsheet = $reader->load($inputFileName);
// ... snip
// populate chart data
$sheet = $spreadsheet->getSheetByName('Ratios');
// reverse order, so populate "backwards"
for($endingRow = 14; ($endingRow > 1) && ($row = $last13Ratios->nextRecord()); $endingRow--) {
$sheet->setCellValue('A'.$endingRow, $row->date);
$sheet->setCellValue('B'.$endingRow, $row->percent60/100);
$sheet->setCellValue('C'.$endingRow, $row->percent90/100);
$helper->log(print_r($row));
}
// ... snip
$writer = new XlsxWriter($spreadsheet);
$writer->setIncludeCharts(true);
$callStartTime = microtime(true);
$writer->save($saveFileName);
$helper->logWrite($writer, $saveFileName, $callStartTime);
$spreadsheet->disconnectWorksheets();
die();
If I put fake data in the template, it appears to work. But I don't want fake data, just in case something goes haywire. Better to have no information than wrong information.
The short answer is to put a space, entered as ' followed by a space, in the upper-left cell of the range being used for data in the template.
Note that the data type also appears to make a difference: a date instead of a string will fail.
Be sure to use the single quote mark to designate the cell as a string, otherwise Excel will ignore the space when you save the template.

How to show String new lines on gsp grails file?

I've stored a string in the database. When I save and retrieve the string, the result I get is the following:
This is my new object
Testing multiple lines
-- Test 1
-- Test 2
-- Test 3
That is what I get from a println command when I call the save and index methods.
But when I show it on screen, it is displayed like this:
This is my object Testing multiple lines -- Test 1 -- Test 2 -- Test 3
I already tried to show it like this:
${adviceInstance.advice?.encodeAsHTML()}
But still the same thing.
Do I need to replace \n with <br> or something like that? Is there an easier way to show it properly?
Common problems have a variety of solutions:
1> You could replace \n with <br>,
either in your controller/service or, if you prefer, in the gsp:
${adviceInstance.advice?.replace('\n','<br>')}
2> display the content in a read-only textarea
<g:textArea name="something" readonly="true">
${adviceInstance.advice}
</g:textArea>
3> Use the <pre> tag
<pre>
${adviceInstance.advice}
</pre>
4> Use css white-space http://www.w3schools.com/cssref/pr_text_white-space.asp:
<div class="space">
</div>
//css code:
.space {
white-space:pre
}
Also note: if you have strict constraints on the storage of such fields, the value submitted via a form may carry additional hidden characters. I didn't dig into exactly what they were; they may have been carriage returns (\r), as explained in the comments below. A good rule is to define a setter that trims the value each time it is received, i.e.:
class Advice {
String advice
static constraints = {
advice(nullable:false, minSize:1, maxSize:255)
}
/*
* In this scenario with a maxSize value, ensure you
* set your own setter to trim any hidden \r
* that may be posted back as part of the form request
* by end user. Trust me I got to know the hard way.
*/
void setAdvice(String adv) {
advice=adv.trim()
}
}
${raw(adviceInstance.advice?.encodeAsHTML().replace("\n", "<br>"))}
This is how I solved the problem.
Firstly make sure the string contains \n to denote line break.
For example :
String test = "This is first line. \n This is second line";
Then in gsp page use:
${raw(test?.replace("\n", "<br>"))}
The output will be as:
This is first line.
This is second line.
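One detail worth spelling out from the answers above: the text should be HTML-escaped before the <br> tags are inserted, otherwise the tags themselves get escaped or user input reaches the page unescaped. The sketch below shows that order of operations in plain Java. The class LineBreakSketch and its hand-rolled escapeHtml are illustrative assumptions, not the Grails encoder; in a GSP you would normally keep encodeAsHTML() for the escaping step.

```java
// Sketch: escape the text first, then turn line breaks into <br> tags.
final class LineBreakSketch {

    // Minimal HTML escaping for illustration only; a real application
    // would normally rely on its framework's encoder instead.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    static String toHtmlWithBreaks(String text) {
        if (text == null) {
            return "";
        }
        // \r\n is listed first so that a Windows line ending yields one <br>, not two.
        return escapeHtml(text).replaceAll("\r\n|\r|\n", "<br>");
    }

    public static void main(String[] args) {
        System.out.println(toHtmlWithBreaks("This is first line.\nThis is second line."));
    }
}
```

Handling \r\n as well as \n also covers the hidden carriage returns mentioned in the first answer.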

Groovy - searching for and extracting XML code from a log file

My log file contains a lot of text, but sometimes a response arrives as XML code, and I have to cut this XML out and move it to another file.
For example:
sThread1....dsadasdsadsadasdasdasdas.......dasdasdasdadasdasdasdadadsada
important xml code to cut and move to other file: <response><important> 1 </import...></response>
important xml code to other file: <response><important> 2 </important...></response>
sThread2....dsadasdsadsadasdasdasdas.......dasdasdasdadasdasdasdadadsada
Obstacle: the XML code does not always start at the same character position in the line.
Please help me find a way to locate the XML code in the text.
So far I have tried the substring() method, but the XML does not always start at the same position :(
EDIT:
I found what I wanted: the function I was looking for is indexOf().
I needed the index at which the XML starts (that is, where the string "Response is : " ends), so I used:
int positionOfXmlInLine = lineTxt.indexOf("<response")
After this I can cut the string from that position to the end of the line:
def cuttedText = lineTxt.substring(positionOfXmlInLine);
So now I have only the XML text from the log file.
The next step is parsing the XML values, as BDKosher describes in the answer below.
Hopefully that will help someone.
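The indexOf()/substring() approach can be sketched in plain Java as follows. The handling of lines without XML and the trimming at the closing tag are additions for illustration, and the class and method names (XmlFromLogSketch, extractXml) are hypothetical.

```java
// Sketch: pull the <response>...</response> fragment out of one log line.
final class XmlFromLogSketch {

    // Returns the XML fragment, or null when the line contains no "<response".
    static String extractXml(String line) {
        int start = line.indexOf("<response");
        if (start < 0) {
            return null; // ordinary log line, nothing to extract
        }
        String closing = "</response>";
        int end = line.lastIndexOf(closing);
        if (end < start) {
            // No closing tag found: fall back to "cut to the end of the line".
            return line.substring(start);
        }
        return line.substring(start, end + closing.length());
    }

    public static void main(String[] args) {
        // Made-up sample lines in the style of the log shown in the question.
        String[] log = {
            "sThread1....dsadasdsadsadasdasdasdas",
            "important xml code: <response><important value=\"1\"/></response>"
        };
        for (String line : log) {
            String xml = extractXml(line);
            if (xml != null) {
                System.out.println(xml);
            }
        }
    }
}
```

Checking for a negative index matters here: substring() throws an exception when indexOf() returns -1 for a line that has no XML in it.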
You might be able to leverage XmlSlurper for this, assuming your XML is valid enough. The code below will take each line of the log, wrap it in a root element, and parse it. Once parsed, it extracts and prints out the value of the <important> element's value attribute, but instead you could do whatever you need to do with the data:
def input = '''
sThread1..sdadassda..sdadasdsada....sdadasdas...
important code to cut and move to other file: **<response><important value="1"></important></response>**
important code to other file: ****<response><important value="3"></important></response>****
sThread2..dsadasd.s.da.das.d.as.das.d.as.da.sd.a.
'''
def parser = new XmlSlurper()
input.eachLine { line, lineNo ->
def output = parser.parseText("<wrapper>$line</wrapper>")
if (!output.response.isEmpty()) {
println "Line $lineNo is of importance ${output.response.important.@value.text()}"
}
}
This prints out:
Line 2 is of importance 1
Line 3 is of importance 3

Best way of storing an "array of records" at design-time

I have a set of data that I need to store at design-time to construct the contents of a group of components at run-time.
Something like this:
type
TVulnerabilityData = record
Vulnerability: TVulnerability;
Name: string;
Description: string;
ErrorMessage: string;
end;
What's the best way of storing this data at design-time for later retrieval at run-time? I'll have about 20 records for which I know all the contents of each "record" but I'm stuck on what's the best way of storing the data.
The only semi-elegant idea I've come up with is "construct" each record on the unit's initialization like this:
var
VulnerabilityData: array[Low(TVulnerability)..High(TVulnerability)] of TVulnerabilityData;
....
initialization
VulnerabilityData[0].Vulnerability := vVulnerability1;
VulnerabilityData[0].Name := 'Name of Vulnerability1';
VulnerabilityData[0].Description := 'Description of Vulnerability1';
VulnerabilityData[0].ErrorMessage := 'Error Message of Vulnerability1';
VulnerabilityData[1]......
.....
VulnerabilityData[20]......
Is there a better and/or more elegant solution than this?
Thanks for reading and for any insights you might provide.
You can also declare your array as a const and initialize it in place...
const
VulnerabilityData: array[Low(TVulnerability)..High(TVulnerability)] of TVulnerabilityData =
(
(Vulnerability : vVulnerability1; Name : Name1; Description : Description1; ErrorMessage : ErrorMessage1),
(Vulnerability : vVulnerability2; Name : Name2; Description : Description2; ErrorMessage : ErrorMessage2),
[...]
(Vulnerability : vVulnerabilityX; Name : NameX; Description : DescriptionX; ErrorMessage : ErrorMessageX)
);
I don't have an IDE on this computer to double check the syntax... might be a comma or two missing. But this is how you should do it I think.
Not an answer, but maybe a clue: design-time controls can have images and other binary data associated with them, so why not write your data to a resource file and read it from there? Iterating over it, of course, to keep things simple, extensible and more elegant.
The typical way would be a file, either properties style (a=b\n on each line), cdf, xml, yaml (preferred if you have a parser for it), or a database.
If you must specify it in code as in your example, you should start by putting it in something you can parse into a simple format then iterate over it. For instance, in Java I'd instantiate an array:
String[] vals=new String[]{
"Name of Vulnerability1", "Description of Vulnerability1", "Error Message of Vulnerability1",
"Name of Vulnerability2", ...
};
This puts all your data into one place and the loop that reads it can easily be changed to read it from a file.
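The reading loop is not shown above, so here is a minimal sketch of what it might look like, assuming the flat array holds three strings per record (name, description, error message). The Entry class and the load method are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: turn a flat string table into records, three values per record.
final class VulnerabilityTableSketch {

    // Hypothetical record type mirroring the string fields from the question.
    static final class Entry {
        final String name;
        final String description;
        final String errorMessage;

        Entry(String name, String description, String errorMessage) {
            this.name = name;
            this.description = description;
            this.errorMessage = errorMessage;
        }
    }

    static final String[] VALS = {
        "Name of Vulnerability1", "Description of Vulnerability1", "Error Message of Vulnerability1",
        "Name of Vulnerability2", "Description of Vulnerability2", "Error Message of Vulnerability2",
    };

    static List<Entry> load(String[] vals) {
        if (vals.length % 3 != 0) {
            throw new IllegalArgumentException("Expected 3 values per record, got " + vals.length);
        }
        List<Entry> entries = new ArrayList<>();
        for (int i = 0; i < vals.length; i += 3) {
            entries.add(new Entry(vals[i], vals[i + 1], vals[i + 2]));
        }
        return entries;
    }

    public static void main(String[] args) {
        for (Entry e : load(VALS)) {
            System.out.println(e.name + " | " + e.description + " | " + e.errorMessage);
        }
    }
}
```

Because load only takes a String[], the same loop works unchanged if the values are later read from a file instead of being hard-coded.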
I use this pattern all the time to create menus and for other string-intensive initialization.
Don't forget that you can throw some logic in there too! For instance, with menus I will sometimes create them using data like this:
"^File", "Open", "Close", "^Edit", "Copy", "Paste"
As I'm reading this in I scan for the ^ which tells the code to make this entry a top level item. I also use "+Item" to create a sub-group and "-Item" to go back up to the previous group.
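A minimal sketch of such a reader, covering only the ^ marker, could look like this. The "+Item" and "-Item" sub-group markers are left out to keep it short, and MenuSpecSketch and parse are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch: group a flat token list into top-level menus using the ^ marker.
final class MenuSpecSketch {

    static Map<String, List<String>> parse(String... tokens) {
        // LinkedHashMap keeps the menus in the order they were declared.
        Map<String, List<String>> menus = new LinkedHashMap<>();
        List<String> current = null;
        for (String token : tokens) {
            if (token.startsWith("^")) {
                // A leading ^ starts a new top-level menu.
                current = new ArrayList<>();
                menus.put(token.substring(1), current);
            } else if (current == null) {
                throw new IllegalArgumentException("Item before any top-level menu: " + token);
            } else {
                // Anything else belongs to the most recent top-level menu.
                current.add(token);
            }
        }
        return menus;
    }

    public static void main(String[] args) {
        System.out.println(parse("^File", "Open", "Close", "^Edit", "Copy", "Paste"));
        // prints {File=[Open, Close], Edit=[Copy, Paste]}
    }
}
```

In a real UI the map entries would be turned into menu components; the point is that the data format and the small loop that interprets it stay separate.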
Since you are completely specifying the format you can add power later. For instance, if you coded menus using the above system, you might decide at first that you could use the first letter of each item as an accelerator key. Later you find out that File/Close conflicts with another "C" item, you can just change the protocol to allow "Close*e" to specify that E should be the accelerator. You could even include ctrl-x with a different character. (If you do shorthand data entry tricks like this, document it with comments!)
Don't be afraid to write little tools like this, in the long run they will help you immensely, and I can turn out a parser like this and copy/paste the values into my code faster than you can mold a text file to fit your example.
