Rsyslog avro log decoding fails

I am using rsyslog, and my devices send custom logs to it encoded in Avro. When I receive a log via the UDP port I am able to decode it, but when it gets stored in a file by rsyslog and I then try to parse the file, decoding fails. rsyslog seems to apply some encoding of its own, and a lot of # characters are introduced when the message is written to the file. Can anybody guide me on how to instruct rsyslog not to apply any encoding and store this binary data as-is? Or are there any decoding-level changes I can try?
A sample stored log line looks like this:
Apr 20 13:57:27 10.64.41.10 #000#000#000#000#010????\H4258f87a-7ffe-11ea-8dcf-9a2557ed7397P332B076268B04DE65C3479662A6A8DEB978D27ED#032vip-1_80_http#000P332B076268B04DE65C3479662A6A8DEB978D27ED#006slb#026200.11.10.2#016sg-testHa1536532-7ffd-11ea-a9c6-f673c1138ce6#016http1.1#006GET#002/#026200.11.10.6#000??#006?#003?#001?#003#000#000#000#000#000#000#026200.11.10.2#000#026curl/7.29.0#000#03010.64.42.101#000?#001#001#000#002#002#002#002#002#002#002#002#002#002#000#000#016Unknown#004--#004--#000#000#000#02610.64.41.10#000??#004#000#000 #000#000#000#000
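Those # sequences are rsyslog's control-character escaping: by default $EscapeControlCharactersOnReceive is on, and $ControlCharacterEscapePrefix defaults to #, so every non-printable byte becomes a #<ddd> sequence. A minimal, untested sketch for /etc/rsyslog.conf to turn that off:

```conf
# Stop rsyslog from escaping control characters as "#<ddd>" sequences
$EscapeControlCharactersOnReceive off
```

Even with escaping off, rsyslog is line-oriented, so raw Avro bytes that happen to contain newline characters can still be mangled on the way to the file; base64-encoding the payload at the sender and decoding after reading the file back is a more robust option.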

Related

How to forward logs from docker container to Graylog server without pre-formatting?

I have a Docker container that sends its logs to Graylog via udp.
Previously I just used it to output raw messages, but now I've come up with a solution that logs in GELF format.
However, Docker just puts the whole thing into the "message" field (screenshot from the Graylog web interface omitted). In plain text:
{
"version":"1.1",
"host":"1eefd38079fa",
"short_message":"Content root path: /app",
"full_message":"Content root path: /app",
"timestamp":1633754884.93817,
"level":6,
"_contentRoot":"/app",
"_LoggerName":"Microsoft.Hosting.Lifetime",
"_threadid":"1",
"_date":"09-10-2021 04:48:04,938",
"_level":"INFO",
"_callsite":"Microsoft.Extensions.Hosting.Internal.ConsoleLifetime.OnApplicationStarted"
}
GELF-driver is configured in docker-compose file:
logging:
  driver: "gelf"
  options:
    gelf-address: "udp://sample-ip:port"
How to make Docker just forward these already formatted logs?
Is there any way to process these logs and append them as custom fields to docker logs?
The perfect solution would be to somehow enable gelf log driver, but disable pre-processing / formatting since logs are already GELF.
PS. For logs I'm using NLog library, C# .NET 5 and its NuGet package https://github.com/farzadpanahi/NLog.GelfLayout
In my case, there was no need to use NLog at all. It was just a logging framework which no one attempted to dive into.
So a better alternative is to use GELF logger provider for Microsoft.Extensions.Logging: Gelf.Extensions.Logging - https://github.com/mattwcole/gelf-extensions-logging
Don't forget to disable GELF for docker container if it is enabled.
It supports additional fields and parameterization of the format string (parameters in curly braces {} become Graylog fields), and is easily configured via appsettings.json.
Some might consider this not to be an answer since I was using NLog, but for me this is a neat way to send customized logs without much trouble. As for NLog, I could not come up with a solution.
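Another workaround for the original problem: disable the gelf driver for the container and send GELF datagrams to Graylog directly from the application, since a GELF UDP input accepts plain uncompressed JSON. A hedged shell sketch of the idea (graylog.example and the port are placeholders):

```shell
# Build a minimal GELF 1.1 payload; besides "version", the "host" and
# "short_message" fields are required.
gelf_msg() {
  printf '{"version":"1.1","host":"%s","short_message":"%s","level":6}' "$1" "$2"
}

# Sending it to a Graylog GELF UDP input (commented out; needs a server):
#   gelf_msg myapp "Content root path: /app" | nc -u -w1 graylog.example 12201
gelf_msg myapp "Content root path: /app"
```

This avoids the driver's re-wrapping entirely, at the cost of the container losing the stdout/stderr logging path.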

log file handling with docker syslog logging driver

Is there a way to pick up the log messages which are logged to a log file, when using the syslog log driver of Docker?
Whatever I write to stdout gets picked up by rsyslog, but anything logged to a file is not. I don't see any option in the syslog driver that could indicate a log file to be picked up.
Thanks
Docker's logging interface is defined as stdout and stderr, so the best way is to modify the log settings of your process to send any log data to stdout and stderr.
Some applications can configure logging to go directly to syslog. Java processes using log4j are a good example of this.
If logging to file is the only option available, scripts, logstash, fluentd, rsyslog, and syslog-ng can all ingest text files and output syslog. This can either be done inside the container with an additional service, or using a shared, standardised logging area on each Docker host and running the ingestion from there.
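As a minimal illustration of the "additional service in the container" approach, a shell sidecar could tail the file and wrap each line in an RFC 3164 syslog frame before sending it over UDP (the tag, file path, server name, and port here are all hypothetical):

```shell
# Wrap a log line in a minimal RFC 3164 syslog message.
# <134> = facility local0 (16*8) + severity info (6).
format_syslog() {
  tag=$1; msg=$2
  printf '<134>%s %s %s: %s\n' "$(date '+%b %e %H:%M:%S')" "$(hostname)" "$tag" "$msg"
}

# Forwarding loop (commented out; needs a reachable syslog server):
#   tail -F /var/log/app/app.log | while read -r line; do
#     format_syslog app "$line" | nc -u -w1 rsyslog.example 514
#   done
format_syslog app "started worker 1"
```

In practice fluentd or syslog-ng will handle rotation, multi-line messages, and backpressure far better than a loop like this; the sketch only shows the shape of the translation.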

Logging to Logstash: separate logs of different applications in one container

I have a Rails application on the Passenger web server running in a Docker container, and I'm trying to redirect application logs to Logstash. I redirect the Rails logs to STDOUT and configure the container to use the gelf log driver, which forwards STDOUT to the given Logstash server. But a problem arises: Passenger writes its own logs to STDOUT too, and I get a mixture of the two logs, which makes them difficult to separate and analyze.
What are best practices in such a situation? How could I label each log stream to separate them in Logstash?
If you really wanted, you could configure Passenger to write to its own stdout log, but I would avoid using STDOUT as an intermediary for logstash.
Try a library like logstash-logger. You could then write to a separate file, socket, or database. I think that's a cleaner approach, and potentially faster depending on the log destination.

Installing Wireshark on server to capture web service SOAP request and response

I am new to using Wireshark. Can I install Wireshark on the server that is hosting a web service, to capture incoming requests and outgoing responses?
Example end point URL of my Web Service: http://MyIP:9086/WebService
For example, my web service is using port 9086. If I start capturing traffic on 9086, will it give me all requests and responses (SOAP messages)?
I have installed Wireshark on my local laptop and can capture packets when SoapUI sends a request to the web service. But I want to install it on the server and capture from that end. Is that feasible?
If the server is a Linux box, you can use tcpdump and tell it to dump the traffic into a pcap file. You can then transfer this pcap file to a local machine and load it into Wireshark.
From https://www.wireshark.org/docs/wsug_html_chunked/AppToolstcpdump.html
D.3. tcpdump: Capturing with tcpdump for viewing with Wireshark
It’s often more useful to capture packets using tcpdump rather than wireshark. For example, you might want to do a remote capture and either don’t have GUI access or don’t have Wireshark installed on the remote machine.
Older versions of tcpdump truncate packets to 68 or 96 bytes. If this is the case, use -s to capture full-sized packets:
$ tcpdump -i <interface> -s 65535 -w <some-file>
You will have to specify the correct interface and the name of a file to save into. In addition, you will have to terminate the capture with ^C when you believe you have captured enough packets.
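Applied to the question above, a capture filter limits the dump to the web service port (the interface name is a placeholder and the port comes from the question's example):

```shell
# Capture full-sized packets on port 9086 only; stop with Ctrl-C.
# eth0 is a placeholder -- list available interfaces with `tcpdump -D`.
tcpdump -i eth0 -s 0 -w soap.pcap port 9086
```

Opening soap.pcap in Wireshark should then show the HTTP request/response pairs carrying the SOAP messages, which Wireshark can reassemble and display.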

How to use Tika in server mode

On Tika's website it says (concerning tika-app-1.2.jar) it can be used in server mode. Does anyone know how to send documents and receive parsed text from this server once it is running?
Tika supports two "server" modes. The simpler and original one is the --server flag of Tika-App; the more functional, but also more recent, is the JAX-RS JSR-311 server component, which is an additional jar.
The Tika-App network server is very simple to use: start Tika-App with the --server flag and a --port ### flag telling it what port to listen on, then connect to that port and send it a single file, and you'll get back the HTML version. Netcat works well for this: java -jar tika-app.jar --server --port 12345 followed by nc 127.0.0.1 12345 < MyFileToExtract will get you back the HTML.
The JAX-RS JSR-311 server component supports a few different urls, for things like metadata, plain text etc. You start the server with java -jar tika-server.jar, then do HTTP put calls to the appropriate url with your input document and you'll get the resource back. There are loads of details and examples (including using curl for testing) on the wiki page
The Tika App Network Server is fairly simple, only supports one mode (extract to HTML), and is generally used for testing / demos / prototyping / etc. The Tika JAXRS Server is a fully RESTful service which talks HTTP, and exposes a wide range of Tika's modes. It's the generally recommended way these days to interface with Tika over the network, and/or from non-Java stacks.
Just adding to @Gagravarr's great answer.
When talking about Tika in server mode, it is important to differentiate between two versions which can otherwise cause confusion:
tika-app.jar has the --server --port 9998 options to start a simple server
tika-server.jar is a separate component using JAX-RS
The first option only provides text extraction and returns the content as HTML. Most likely, what you really want is the second option, which is a RESTful service exposing many more of Tika's features.
You can simply download the tika-server.jar from the Tika project site. Start the server using
java -jar tika-server-x.x.jar -h 0.0.0.0
The -h 0.0.0.0 (host) option makes the server listen for any incoming requests; without it, it would only listen for requests from localhost. You can also add the -p option to change the port, which otherwise defaults to 9998.
Then, once the server has started you can simply access it using your browser. It will list all available endpoints.
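For example, plain-text extraction goes through the /tika endpoint (the file name and host are placeholders, matching the examples below):

```shell
# PUT a document and get the extracted plain text back:
curl -T testWORD.doc http://example.com:9998/tika
```

The /meta examples below work the same way but return document metadata instead of the body text.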
Finally to extract meta data from a file you can use cURL like this:
curl -T testWORD.doc http://example.com:9998/meta
This returns the metadata as key/value pairs, one per line. You can also have Tika return the results as JSON by adding the proper Accept header:
curl -H "Accept: application/json" -T testWORD.doc http://example.com:9998/meta
[Update 2015-01-19] Previously the comment said that tika-server.jar is not available as download. Fixed that since it actually does exist as a binary download.
To enhance Gagravarr's perfect answer:
If your document is fetched from a web server:
curl "http://myserver-domain/*path-to-doc*/doc-name.extension" | nc 127.0.0.1 12345
And it is even better if the document is protected by a password:
curl -u login:*password* "http://myserver-domain/*path-to-doc*/doc-name.extension" | nc 127.0.0.1 12345