When a message which attachment gets saved in Microsoft Outlook, it it saved as a '.msg' files which contains all the content of the email along with the aattachment files. I'd like to extract the textual content of the body of the email as well as it's attachments. Does Apache Tika support '.msg' files? If not any other idea?
If you look at the list of mail formats supported by Apache Tika 1.9 (currently the latest version), you'll see that Outlook MSG files are listed as being supported.
Taking a simple example MSG file from the Apache POI project's test files, and using the Tika App standalone jar to make testing easy, we can easily get out the contents and the metadata:
$ java -jar tika-app-1.9.jar --metadata simple_test_msg.msg
Author: Travis Ferguson
Content-Length: 16896
Content-Type: application/vnd.ms-outlook
Creation-Date: 2007-07-06T05:27:17Z
Last-Modified: 2007-07-06T05:27:17Z
Last-Save-Date: 2007-07-06T05:27:17Z
Message-Bcc:
Message-Cc:
Message-From: Travis Ferguson
Message-Recipient-Address: travis#overwrittenstack.com
Message-To: travis#overwrittenstack.com
X-Parsed-By: org.apache.tika.parser.DefaultParser
X-Parsed-By: org.apache.tika.parser.microsoft.OfficeParser
creator: Travis Ferguson
date: 2007-07-06T05:27:17Z
dc:creator: Travis Ferguson
dc:description: test message
dc:title: test message
dcterms:created: 2007-07-06T05:27:17Z
dcterms:modified: 2007-07-06T05:27:17Z
meta:author: Travis Ferguson
meta:creation-date: 2007-07-06T05:27:17Z
meta:save-date: 2007-07-06T05:27:17Z
modified: 2007-07-06T05:27:17Z
resourceName: simple_test_msg.msg
subject: test message
title: test message
$ java -jar tika-app-1.9.jar --text simple_test_msg.msg
test message
From
Travis Ferguson
To
travis#overwrittenstack.com
Recipients
travis#overwrittenstack.com
This is a test message.
Metadata, including senders, receipients, dates etc, text, all you could want!
Alternately, if you have special needs/requirements and want full control, you can use the underlying Apache POI HSMF library to parse your MSG files, look at the HSMF unit tests for usage examples
Tika does support msg files
You can use apache POI there are some examples around like this one
sample:
public static void main(String[] args) throws Exception{
MsgParser msgp = new MsgParser();
Message msg = msgp.parseMsg("c:/temp/test2.msg");
String fromEmail = msg.getFromEmail();
String fromName = msg.getFromName();
String subject = msg.getSubject();
String body = msg.getBodyText();
System.out.println("From :" + fromName + " <" + fromEmail + ">");
System.out.println("Subject :" + subject);
System.out.println("");
System.out.println(body);
System.out.println("");
List atts = msg.getAttachments();
for (Attachment att : atts) {
if (att instanceof FileAttachment) {
FileAttachment file = (FileAttachment) att;
System.out.println("Attachment : " + file.getFilename());
// you get the actual attachment with
// byte date[] = file.getData();
}
}
}
Related
How can we get multiple headings inside description as following:
description: |
The Engine API ....
# Errors
The API uses standard HTTP status codes ...
# Versioning
Some other text
In the above example, Errors and Versioning are two headers.
Currently, I am adding description as following:
ApiInfoBuilder().description(" ... ")
You can go with markdown like that:
String description = "# The Engine API ...\n" +
"## Errors\n" +
"The API uses standard HTTP status codes ...\n" +
"## Versioning\n" +
"Some other text\n";
The \n new lines are important here.
Or simply use plain HTML:
String description = "<h2>The Engine API ....</h2>" +
"<h3>Errors</h3>" +
"<p>The API uses standard HTTP status codes ...</p>" +
"<h3>Versioning</h3>" +
"<p>Some other text</p>";
Then add the string to the API info:
ApiInfoBuilder().description(description)
After reading about Apache Flume and the benefits it provides in terms of handling client events I decided it was time to start looking into this in more detail. Another great benefit appears to be that it can handle Apache Avro objects :-) However, I am struggle to understand how the Avro schema is used to validate Flume events received.
To help understand my problem in more detail I have provided code snippets below;
Avro schema
For the purpose of this post I am using a sample schema defining a nested Object1 record with 2 fields.
{
"namespace": "com.example.avro",
"name": "Example",
"type": "record",
"fields": [
{
"name": "object1",
"type": {
"name": "Object1",
"type": "record",
"fields": [
{
"name": "value1",
"type": "string"
},
{
"name": "value2",
"type": "string"
}
]
}
}
]
}
Embedded Flume agent
Within my Java project I am currently using the Apache Flume embedded agent as detailed below;
public static void main(String[] args) {
final Event event = EventBuilder.withBody("Test", Charset.forName("UTF-8"));
final Map<String, String> properties = new HashMap<>();
properties.put("channel.type", "memory");
properties.put("channel.capacity", "100");
properties.put("sinks", "sink1");
properties.put("sink1.type", "avro");
properties.put("sink1.hostname", "192.168.99.101");
properties.put("sink1.port", "11111");
properties.put("sink1.batch-size", "1");
properties.put("processor.type", "failover");
final EmbeddedAgent embeddedAgent = new EmbeddedAgent("TestAgent");
embeddedAgent.configure(properties);
embeddedAgent.start();
try {
embeddedAgent.put(event);
} catch (EventDeliveryException e) {
e.printStackTrace();
}
}
In the above example I am creating a new Flume event with "Test" defined as the event body sending events to a separate Apache Flume agent running inside a VM (192.168.99.101).
Remote Flume agent
As described above I have configured this agent to receive events from the embedded Flume agent. The Flume configuration for this agent looks like;
# Name the components on this agent
hello.sources = avroSource
hello.channels = memoryChannel
hello.sinks = loggerSink
# Describe/configure the source
hello.sources.avroSource.type = avro
hello.sources.avroSource.bind = 0.0.0.0
hello.sources.avroSource.port = 11111
hello.sources.avroSource.channels = memoryChannel
# Describe the sink
hello.sinks.loggerSink.type = logger
# Use a channel which buffers events in memory
hello.channels.memoryChannel.type = memory
hello.channels.memoryChannel.capacity = 1000
hello.channels.memoryChannel.transactionCapacity = 1000
# Bind the source and sink to the channel
hello.sources.avroSource.channels = memoryChannel
hello.sinks.loggerSink.channel = memoryChannel
And I am executing the following command to launch the agent;
./bin/flume-ng agent --conf conf --conf-file ../sample-flume.conf --name hello -Dflume.root.logger=TRACE,console -Dorg.apache.flume.log.printconfig=true -Dorg.apache.flume.log.rawdata=true
When I execute the Java project main method I see the "Test" event is passed through to my logger sink with the following output;
2019-02-18 14:15:09,998 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 54 65 73 74 Test }
However, it is unclear to me exactly where I should configure the Avro schema to ensure that only valid events are received and processed by Flume. Can someone please help me understand where I am going wrong? Or, if I have misunderstood the intention of how Flume is designed to convert Flume events into Avro events?
In addition to the above I have also tried using the Avro RPC client after changing the Avro schema to specify a protocol talking directly to my remote Flume agent, but when I attempt to send events I see the following error;
Exception in thread "main" org.apache.avro.AvroRuntimeException: Not a remote message: test
at org.apache.avro.ipc.Requestor$Response.getResponse(Requestor.java:532)
at org.apache.avro.ipc.Requestor$TransceiverCallback.handleResult(Requestor.java:359)
at org.apache.avro.ipc.Requestor$TransceiverCallback.handleResult(Requestor.java:322)
at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.messageReceived(NettyTransceiver.java:613)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.apache.avro.ipc.NettyTransceiver$NettyClientAvroHandler.handleUpstream(NettyTransceiver.java:595)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:558)
at org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:786)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:296)
at org.jboss.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:458)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:439)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:558)
at org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:553)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:84)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:471)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:332)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:102)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
My goal is that I am able to ensure that events populated by my application conforms to the Avro schema generated to avoid invalid events being published. I would prefer I achieve this using the embedded Flume agent, but if this is not possible then I would consider using the Avro RPC approach talking directly to my remote Flume agent.
Any help / guidance would be a great help. Thanks in advance.
UPDATE
After further reading I wonder if I have misunderstood the purpose of Apache Flume. I originally thought this could be used to automatically create Avro events based on the data / schema, but now wondering if the application should assume responsibility for producing Avro events which will be stored in Flume according to the channel configuration and sent as a batch via the sink (in my case a Spark Streaming cluster).
If the above is correct then I would like to know whether Flume is required to know about the schema or just my Spark Streaming cluster which will eventually process this data? If Flume is required to know about the schema then can you please provide details of how this can be achieved?
Thanks in advance.
Since your goal is to process the data using Spark Streaming cluster you may solve this problem with 2 solutions
1) Using Flume client (tested with flume-ng-sdk 1.9.0) and Spark Streaming (tested with spark-streaming_2.11 2.4.0 and spark-streaming-flume_2.11 2.3.0) without Flume server in between the network topology.
Client class sends Flume json event at port 41416
public class JSONFlumeClient {
public static void main(String[] args) {
RpcClient client = RpcClientFactory.getDefaultInstance("localhost", 41416);
String jsonData = "{\r\n" + " \"namespace\": \"com.example.avro\",\r\n" + " \"name\": \"Example\",\r\n"
+ " \"type\": \"record\",\r\n" + " \"fields\": [\r\n" + " {\r\n"
+ " \"name\": \"object1\",\r\n" + " \"type\": {\r\n" + " \"name\": \"Object1\",\r\n"
+ " \"type\": \"record\",\r\n" + " \"fields\": [\r\n" + " {\r\n"
+ " \"name\": \"value1\",\r\n" + " \"type\": \"string\"\r\n" + " },\r\n"
+ " {\r\n" + " \"name\": \"value2\",\r\n" + " \"type\": \"string\"\r\n"
+ " }\r\n" + " ]\r\n" + " }\r\n" + " }\r\n" + " ]\r\n" + "}";
Event event = EventBuilder.withBody(jsonData, Charset.forName("UTF-8"));
try {
client.append(event);
} catch (Throwable t) {
System.err.println(t.getMessage());
t.printStackTrace();
} finally {
client.close();
}
}
}
Spark Streaming Server class listens at port 41416
public class SparkStreamingToySample {
public static void main(String[] args) throws Exception {
SparkConf sparkConf = new SparkConf().setMaster("local[2]")
.setAppName("SparkStreamingToySample");
JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(30));
JavaReceiverInputDStream<SparkFlumeEvent> lines = FlumeUtils
.createStream(ssc, "localhost", 41416);
lines.map(sfe -> new String(sfe.event().getBody().array(), "UTF-8"))
.foreachRDD((data,time)->
System.out.println("***" + new Date(time.milliseconds()) + "=" + data.collect().toString()));
ssc.start();
ssc.awaitTermination();
}
}
2) Using Flume client + Flume server between + Spark Streaming (as Flume Sink) as network topology.
For this option, the code is the same, but now the SparkStreaming must specify the full dns qualified hostname instead of localhost to start SparkStreaming server at same port 41416 if you're running this locally for testing. The Flume client will connect to flume server port 41415. The tricky part now is how to define your flume topology. You need to specify both a source and a sink for this to work.
See flume conf below
agent1.channels.ch1.type = memory
agent1.sources.avroSource1.channels = ch1
agent1.sources.avroSource1.type = avro
agent1.sources.avroSource1.bind = 0.0.0.0
agent1.sources.avroSource1.port = 41415
agent1.sinks.avroSink.channel = ch1
agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.hostname = <full dns qualified hostname>
agent1.sinks.avroSink.port = 41416
agent1.channels = ch1
agent1.sources = avroSource1
agent1.sinks = avroSink
You should get same results with both solutions, but returning to your question of if Flume is really needed for Spark Streaming contents from Json stream, the answer is it depends, Flume supports interceptors so in this case it could be used to cleanse or filter invalid data for your Spark project, but since you're adding an extra component to the topology it may impact performance and require more resources (CPU/Memory) than without Flume.
We have several JIRA issues which have over 1000 duplicated, bogus, spam-like comments. How can we quickly delete them?
Background:
We disabled a user in active directory (Exchange) but not JIRA, so JIRA kept trying to email them updates. The email server gave a bounce-back message, and JIRA dutifully logged it to the task, which caused it to send another update, and a feedback loop was born.
The messages have this format:
Delivery has failed to these recipients or groups:
mail#example.com<mail#example.com>
The e-mail address you entered couldn't be found. Please check the recipient's e-mail address and try to resend the message. If the problem continues, please contact your helpdesk.
Diagnostic information for administrators:
Generating server: emailserver.example.com
user#example.com
#550 5.1.1 RESOLVER.ADR.RecipNotFound; not found ##
Original message headers:
Received: from jiraserver.example.com (10.0.0.999) by emailserver.example.com (10.0.0.999)
with Microsoft SMTP Server id nn.n.nnn.n; Mon, 13 Jun 2016 15:57:04 -0500
Date: Mon, 13 Jun 2016 15:57:03 -0500
Our research did not discover an easy way without using purchased plug-ins such as Script Runner or "hacking" the database, which we wanted to avoid.
Note:
We came up with a solution and are posting here to share.
I created a python script to remove all comments for a specific Jira issue.
It uses the API from Jira.
'''
This script removes all comments from a specified jira issue
Please provide Jira-Issue-Key/Id, Jira-URL, Username, PAssword in the variables below.
'''
import sys
import json
import requests
import urllib3
# Jira Issue Key or Id where comments are deleted from
JIRA_ISSUE_KEY = 'TST-123'
# URL to Jira
URL_JIRA = 'https://jira.contoso.com'
# Username with enough rights to delete comments
JIRA_USERNAME = 'admin'
# Password to Jira User
JIRA_PASSWORD = 'S9ev6ZpQ4sy2VFH2_bjKKQAYRUlDfW7ujNnrIq9Lbn5w'
''' ----- ----- Do not change anything below ----- ----- '''
# Ignore SSL problem (certificate) - self signed
urllib3.disable_warnings()
# get issue comments:
# https://developer.atlassian.com/cloud/jira/platform/rest/#api-api-2-issue-issueIdOrKey-comment-get
URL_GET_COMMENT = '{0}/rest/api/latest/issue/{1}/comment'.format(URL_JIRA, JIRA_ISSUE_KEY)
# delete issue comment:
# https://developer.atlassian.com/cloud/jira/platform/rest/#api-api-2-issue-issueIdOrKey-comment-id-delete
URL_DELETE_COMMENT = '{0}/rest/api/2/issue/{1}/comment/{2}'
def user_yesno():
''' Asks user for input yes or no, responds with boolean '''
allowed_response_yes = {'yes', 'y'}
allowed_response_no = {'no', 'n'}
user_response = input().lower()
if user_response in allowed_response_yes:
return True
elif user_response in allowed_response_no:
return False
else:
sys.stdout.write("Please respond with 'yes' or 'no'")
return False
# get jira comments
RESPONSE = requests.get(URL_GET_COMMENT, verify=False, auth=(JIRA_USERNAME, JIRA_PASSWORD))
# check if http response is OK (200)
if RESPONSE.status_code != 200:
print('Exit-101: Could not connect to api [HTTP-Error: {0}]'.format(RESPONSE.status_code))
sys.exit(101)
# parse response to json
JSON_RESPONSE = json.loads(RESPONSE.text)
# get user confirmation to delete all comments for issue
print('You want to delete {0} comments for issue {1}? (yes/no)' \
.format(len(JSON_RESPONSE['comments']), JIRA_ISSUE_KEY))
if user_yesno():
for jira_comment in JSON_RESPONSE['comments']:
print('Deleting Jira comment {0}'.format(jira_comment['id']))
# send delete request
RESPONSE = requests.delete(
URL_DELETE_COMMENT.format(URL_JIRA, JIRA_ISSUE_KEY, jira_comment['id']),
verify=False, auth=(JIRA_USERNAME, JIRA_PASSWORD))
# check if http response is No Content (204)
if RESPONSE.status_code != 204:
print('Exit-102: Could not connect to api [HTTP-Error: {0}; {1}]' \
.format(RESPONSE.status_code, RESPONSE.text))
sys.exit(102)
else:
print('User abort script...')
source control: https://gist.github.com/fty4/151ee7070f2a3f9da2cfa9b1ee1c132d
Use the JIRA REST API through the Chrome JavaScript Console.
Background:
We didn't want to write a full application for what we hope is an isolated occurrence. We originally planned to use PowerShell's Invoke-WebRequest. However, authentication proved to be a challenge. The API supports Basic Authentication, though it's only recommended when using SSL, which we weren't using for our internal server. Also, our initial tests resulted in 401 errors (perhaps due to a bug).
However, the API also supports cookie-based authentication, so as long as you are generating the request from a browser which has a valid JIRA session, it just works. We chose that method.
Solution details:
First, find and review the relevant comment and issue IDs:
SELECT * FROM jira..jiraaction WHERE actiontype = 'comment' AND actionbody LIKE '%RESOLVER.ADR.RecipNotFound%';
This might be a slow query depending on the size of your JIRA data. It seems to be indexed on the issueid, so if you know that, specify it. Also, add other criteria to this query so that it only represents the comments you wish to delete.
The solution below is written for comments on a single issue, but with some additional JavaScript could be expanded to support multiple issues.
We need the list of comment IDs for use in the Chrome JavaScript console. A useful format is a comma-delimited list of strings, which you can create as follows:
SELECT '"' + CONVERT(VARCHAR(50),ID) + '", ' FROM jira..jiraaction WHERE actiontype = 'comment' AND actionbody LIKE '%RESOLVER.ADR.RecipNotFound%' AND issueid = #issueid FOR XML PATH('')
(This is not necessarily the best way to concatenate strings in SQL, but it's simple and works for this purpose.)
Now, open a new browser session and authenticate to your JIRA instance. We used Chrome, but any browser with a JavaScript console should do.
Take the string produced by that query and drop it in the JavaScript console inside of a statement like this:
CommentIDs = [StringFromSQL];
You will need to trim the trailing comma manually (or adjust the above query to do so for you). It will look like this:
CommentIDs = ["1234", "2345"];
When you run that command, you will have created a JavaScript array with all of those comment IDs.
Now we arrive at the meat of the technique. We will loop over the contents of that array and make a new AJAX call to the REST API using XMLHttpRequest (often abbreviated XHR). (There is also a jQuery option.)
for (let s of CommentIDs) {let r = new XMLHttpRequest; r.open("DELETE","http://jira.example.com/rest/api/2/issue/11111/comment/"+s,true); r.send();}
You must replace "11111" with the relevant issue ID. You can repeat this for multiple issue IDs, or you can build a multi-dimensional array and a fancier loop.
This is not elegant. It doesn't have any error handling, but you can monitor the progress using the Chrome JavaScript API.
I would use a jira-python script or a ScriptRunner groovy script. Even for a one-off bulk update, because it is easier to test and requires no database access.
Glad it worked for you though!
We solved this problem, which occurs from time to time, with ScriptRunner and a Groovy script:
// this script takes some time, when executing it in console, it takes a long time to repsonse, and then the console retunrs "null"
// - but it kepps running in the backgorund, give it some time - at least 1 second per comment and attachment to delete.
import com.atlassian.jira.component.ComponentAccessor
import com.atlassian.jira.issue.IssueManager
import com.atlassian.jira.issue.MutableIssue
import com.atlassian.jira.issue.comments.Comment
import com.atlassian.jira.issue.comments.CommentManager
import com.atlassian.jira.issue.attachment.Attachment
import com.atlassian.jira.issue.managers.DefaultAttachmentManager
import com.atlassian.jira.issue.AttachmentManager
import org.apache.log4j.Logger
import org.apache.log4j.Level
log.setLevel(Level.DEBUG)
// NRS-1959
def issueKeys = ['XS-8071', 'XS-8060', 'XS-8065', 'XRFS-26', 'NRNM-45']
def deleted_attachments = 0
def deleted_comments = 0
IssueManager issueManager = ComponentAccessor.issueManager
CommentManager commentManager = ComponentAccessor.commentManager
AttachmentManager attachmentManager = ComponentAccessor.attachmentManager
issueKeys.each{ issueKey ->
MutableIssue issue = issueManager.getIssueObject(issueKey)
List<Comment> comments = commentManager.getComments(issue)
comments.each {comment ->
if (comment.body.contains('550 5.1.1 The email account that you tried to reach does not exist')) {
log.info issueKey + " DELETE comment:"
//log.debug comment.body
commentManager.delete(comment)
deleted_comments++
} else {
log.info issueKey + " KEEP comment:"
log.debug comment.body
}
}
List<Attachment> attachments = attachmentManager.getAttachments(issue)
attachments.each {attachment ->
if (attachment.filename.equals('icon.png')) {
log.info issueKey + " DELETE attachment " + attachment.filename
attachmentManager.deleteAttachment(attachment)
deleted_attachments++
} else {
log.info issueKey + " KEEP attachment " + attachment.filename
}
}
}
log.info "${deleted_comments} deleted comments, and ${deleted_attachments} deleted attachments"
return "${deleted_comments} deleted comments, and ${deleted_attachments} deleted attachments"
I am using Jenkins and email-ext to send build notifications. Build produces small custom report stored as simple HTML in out directory. Right now I can easy attach this report to the email, but what I want is to have this report rendered to the email body itself. I know that I can write my own jelly template, but this requires access to the build server(to install jelly script), I want to avoid this too. So my questions are:
Can I include content of the arbitrary file(generated during build) to Content field of the Email-Ext plugin?
If I cannot, what is most easy way to send such report in Jenkins?
It looks like most easy way is to use 'Pre-send Script' of the Email-ext
Script looks like this:
def reportPath = build.getWorkspace().child("HealthTestResults.html")
msg.setContent(reportPath.readToString(), "text/html");
It renders content of the HealthTestResults.html located in the root of the workspace just as body of the message.
Easy way to include the html file to email content is to add below line in the default content = ${FILE, path="yourfilename.html"}
this works for me in jenkins email ext plugin
If you use groovy template, then you can do something like this
stage("Notifications") {
steps {
script {
env.ForEmailPlugin = env.WORKSPACE
emailext mimeType: "text/html",
body: '''${SCRIPT, template="notification.template"}''',
subject: env.JENKINS_PROJECT + " " + currentBuild.currentResult + " : " + env.JOB_NAME + "/" + env.GIT_COMMIT,
recipientProviders: [requestor(), developers()],
from: env.JENKINS_EMAIL,
attachLog: true
}
}
}
and slice of /var/lib/jenkins/email-templates/notification.template
<!-- ARGOCD SYNC OUTPUT -->
<h3>ArgoCD<h3/>
<hr size="1" />
<tt>
${new File("/tmp/sync.txt").text.replaceAll('\n','<br>').replaceAll(' ', ' ')}
</tt>
I have stored SOAP XML file into local machine now to validate that XML i need to load that file and extract the XML from SOAP body to validate it.
SOAP XML
<SOAP:Envelope>
<SOAP:Body>
<request>
<Login>
<Username>abc</Username>
<Password>abc</Password>
</Login>
</request>
</SOAP:Body>
</SOAP:Envelope>
I need to extract the <Login> XML from above SOAP request.
I am using FileUtils to read file into string. when i read the file using that it also read the characters like \n\t\t etc means it consider the newline and tab which is in the XML formatted file.
i extract the child node from XML as string using below code.
InputStream requestXMLInputStream = new ByteArrayInputStream(soapRequestXML.getBytes());
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
String requestXMLBody = "";
Document doc = null;
try {
dBuilder = dbFactory.newDocumentBuilder();
doc = dBuilder.parse(requestXMLInputStream);
NodeList requestNodeList = doc.getElementsByTagName(parentTagName);
Node node = requestNodeList.item(0);
DOMImplementationLS domImplLS = (DOMImplementationLS) doc.getImplementation();
LSSerializer serializer = domImplLS.createLSSerializer();
if(node != null && node.getFirstChild() != null)
requestXMLBody = serializer.writeToString(node.getFirstChild());
} catch (SAXException e) {
logger.error(e.getMessage());
} catch (IOException e) {
logger.error(e.getMessage());
}catch (ParserConfigurationException e) {
logger.error(e.getMessage());
}
return requestXMLBody;
How can i read an XML file in without these characters.
Please help.
Using string manipulation for reading XML documents is a well-tested recipe for headache and frustration. Use only published XML API's for that.
The general approach would be:
1) Assuming that the SOAP XML is stored in a file, use the JAXP API to parse the XML into a Document object.
2) Navigate to the Login element using XPath (use the standard Java XPath API, included in the JDK).
3) Once you have a reference to the Login element, use a JAXP Transformer to serialize the Login element into a string.
(Step (1) would change if the SOAP packet is received through usage of one of the standard web services API's. You mentioned that the SOAP packet is given as stored in a file)