Apache Tika - MediaDataBox ISO files

It seems that Apache Tika 1.24.1 is creating lots of /tmp/MediaDataBox ISO files, and my /tmp partition fills up.
What is the MediaDataBox ISO file used for?
Can we somehow tell Tika to save it in another directory?
Tika runs in server mode as follows:
java -Xmx3G -jar tika-server.jar -spawnChild --host=hostname.domain.com

This example shows how to save temporary files in an alternate directory:
java -Djava.io.tmpdir=/somewhere/tmp -jar tika-server.jar -spawnChild -JXmx3G -JDjava.io.tmpdir=/somewhere/tmp --host=hostname.domain.com
I found useful information in Tika Server docs
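As a minimal sketch of why the java.io.tmpdir setting matters: the JVM's standard temp-file APIs create files in the directory named by that system property. The class and method names below are hypothetical, and it is an assumption (based on the answer above) that the MediaDataBox files are created through these standard APIs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo class, not part of Tika.
public class TmpDirDemo {

    // Directory the JVM uses for temporary files (value of java.io.tmpdir).
    static Path defaultTempDir() {
        return Paths.get(System.getProperty("java.io.tmpdir"));
    }

    // Creates a temp file via the standard API; "demo-" and ".tmp" are arbitrary.
    static Path createTempFile() {
        try {
            Path tmp = Files.createTempFile("demo-", ".tmp");
            tmp.toFile().deleteOnExit(); // clean up when the JVM exits
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir = " + defaultTempDir());
        System.out.println("temp file      = " + createTempFile());
    }
}
```

On Java 11 or later this can be run directly as `java -Djava.io.tmpdir=/somewhere/tmp TmpDirDemo.java`. The target directory has to exist already, otherwise creating the temp file fails.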

Related

How to set log4j.property to .jar location

I'm setting up Log4j2 in a Spring Boot application. I now want to create a /log directory exactly where the .jar file is located.
This is needed as we start the java application from a startup script and the configuration should work on both windows and unix developer machines as well as a server.
I already tried with:
<RollingFile name="FileAppender" fileName="./logs/mylog.log"
filePattern="logs/mylog-%d{yyyy-MM-dd}-%i.log">
which just creates a log folder in the directory from which the jar gets started.
Then I read I should use .\log/mylog.log, as .\ points to the directory of the jar file.
But then it just creates a folder called .\log.
I also tried configuring it with JVM arguments and referencing them in log4j2.xml with ${logFile}. Now a directory called '${logFile}' gets created.
The only ${} lookup that works is the directory of the Log4j configuration file. But as this is inside the jar, it just gets me a pretty useless folder structure.
Thanks in Advance
EDIT: In the end, what I did was set up two configuration files, log4j2.xml and log4j2-prod.xml.
The log4j2.xml took the system property as Vikas Sachdeva mentioned, while log4j2-prod.xml got the location hard-coded.
Not really the solution I was looking for, but it made it work.
One solution is to pass the log directory location through a system property.
The configuration file will look like this:
<RollingFile name="FileAppender" fileName="${sys:basePath}/mylog.log"
filePattern="${sys:basePath}/mylog-%d{yyyy-MM-dd}-%i.log">
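Now pass the system property basePath with the absolute path of the directory containing the JAR file. The -D option must come before -jar, otherwise it is handed to the application as a program argument instead of being set as a system property:
java -DbasePath=/home/ubuntu/app -jar myapp.jar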
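A small sketch of how the position of -D on the command line changes what the program sees. The ${sys:basePath} lookup reads a JVM system property, so the option only helps if the JVM itself receives it, which means placing it before -jar. The class name SysPropDemo is hypothetical.

```java
// Hypothetical demo class; it only reports what the JVM passed in.
public class SysPropDemo {

    // Reports the basePath system property and the program arguments.
    static String describe(String[] args) {
        String prop = System.getProperty("basePath"); // null if -D was not seen by the JVM
        return "system property basePath = " + prop
                + ", program args = [" + String.join(" ", args) + "]";
    }

    public static void main(String[] args) {
        System.out.println(describe(args));
    }
}
```

Running `java -DbasePath=/home/ubuntu/app SysPropDemo.java` (Java 11 or later) reports the property value, while `java SysPropDemo.java -DbasePath=/home/ubuntu/app` reports the property as null and shows the -D text among the program arguments.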

java -jar saxon9he.jar persons.xml persons_users.xslt -o:persons_transformed.txt

I have an XSLT to convert my XML into HTML format. I don't have any experience with Saxon yet, but I keep trying.
This is the problem I had:
C:>java -cp saxon9he.jar net.sf.saxon.Transform -t -s:samples\date\books,xml-csl:samples\styles\books,xsl -o:c:\temp.html
Error: Main class net.sf.saxon.Transform could not be found or loaded
I followed the instructions on the Saxon website step by step:
https://www.saxonica.com/html/documentation/about/gettingstarted/gettingstartedjava.html
and I have watched a lot of Mr. Michael Kay's videos, but it still doesn't work.
Can anyone help me, please?
The problems are with Java, not with Saxon, in case that helps you look in the right place for documentation.
The message "Main class net.sf.saxon.Transform could not be found or loaded" is Java telling you that it can't find Saxon.
The bit of the command that tells it where to look is this:
java -cp saxon9he.jar net.sf.saxon.Transform
Here "java" is telling the operating system to load the Java virtual machine (which has succeeded). The "-cp" option is telling Java what Jar files to search for the relevant classes, and the "net.sf.saxon.Transform" part is saying what the relevant class is.
The problem is probably that there is no file called saxon9he.jar in the current working directory. Unfortunately Java doesn't give you an explicit error message for this, it just ignores this part of the command. Probably the current working directory isn't what you think it is. If you do "ls" or "dir" immediately before the "java" command, it will tell you what files are in the current working directory, which should include the saxon9he.jar file. If the JAR file is in some other directory, you can supply an explicit path, e.g. -cp c:/mike/java/saxon9he.jar.
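The check described above can also be done from Java itself. This is only a sketch under the assumption that the classpath entry is a relative path such as saxon9he.jar; the class and method names are hypothetical. It prints the working directory and says whether the file is really there.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Saxon.
public class ClasspathCheck {

    // Resolves a relative entry against the current working directory.
    static Path resolve(String entry) {
        return Paths.get(entry).toAbsolutePath().normalize();
    }

    // True only if the entry points at an existing, readable regular file.
    static boolean isReadableFile(String entry) {
        Path p = resolve(entry);
        return Files.isRegularFile(p) && Files.isReadable(p);
    }

    public static void main(String[] args) {
        // Default file name taken from the question; pass another one as an argument.
        String entry = args.length > 0 ? args[0] : "saxon9he.jar";
        System.out.println("working directory: " + System.getProperty("user.dir"));
        System.out.println(resolve(entry)
                + (isReadableFile(entry) ? " exists" : " is missing"));
    }
}
```

If the file is reported as missing, either change to the directory that holds the jar or pass the full path to -cp.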

Tika server ROOT directory

How can I find the location of the Tika server ROOT directory that hosts the HTML welcome page for the Apache Tika 1.x-SNAPSHOT Server? For example, say that Maven built Tika in file:///opt/tika-trunk/tika-server/target/classes/org/apache/tika/server/ , so this path contains all the classes that Maven created. But where is the server root directory and the folder structure that someone would expect, like where is the /detectors folder? Where is the /mime-types folder, etc.? I am surely missing something important here, so I wish you could give me a hint...
Thanks

Cannot open Selenium Jar file from CMD. Path or ClassPath issue?

I'm trying to launch:
java -jar selenium-server-standalone-2.14.0.jar -role hub
from my Command Prompt but output was as below:
C:\Program Files (x86)>java -jar selenium-server-standalone-2.14.0.jar -role hub
Unable to access jarfile selenium-server-standalone-2.14.0.jar
C:\Program Files (x86) is where the jar file is located.
I've put C:\Program Files (x86) in my PATH and CLASSPATH and it still won't work.
Your filename must be wrong. Check whether you have a file named -selenium-server-standalone-2.14.0.jar. Chances are you don't. :)
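One way to check the exact file name is to list what is really in the directory, since a file manager may hide or double an extension. The following is only a sketch with hypothetical class and method names; it prints every regular file whose name contains ".jar", wrapped in brackets so that stray spaces or extra extensions become visible.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Selenium.
public class JarNameLister {

    // Exact names of regular files in dir whose name contains ".jar".
    static List<String> jarLikeNames(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(name -> name.toLowerCase().contains(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the current directory; pass another directory as an argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (String name : jarLikeNames(dir)) {
            System.out.println("[" + name + "]");
        }
    }
}
```

A name such as selenium-server-standalone-2.14.0.jar.zip in the output would explain why the plain .jar name cannot be accessed.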
I encountered the same issue.
The solution is that the file name matters.
If you have a Selenium server standalone jar file, you can first rename it
to something simple (for example, abc.jar).
1) If the jar file on your system is shown with the .jar extension,
then keep the .jar extension when renaming (e.g. abc.jar).
2) If the jar file on your system is shown without the .jar extension,
then don't add the .jar extension when renaming (e.g. abc).
3) Start the hub now:
java -jar abc.jar -role hub
Regards,
Nikhil Kanojia
"Unable to access jarfile" is a common error.
This error can occur when starting up either a hub or a node. It means Java cannot find the selenium-server jar file. Either run the command from the directory where the selenium-server-XXXX.jar file is stored, or specify an explicit path to the jar.
1. Switch to root.
2. Install mlocate: apt install mlocate
3. Locate your jar file.
4. Check the exact jar file name and try to run it again with the full command:
java -jar ./selenium-server-standalone-2.14.0.jar
I had the same issue on Ubuntu. Try the following steps.
Go to the directory where the jar file is located.
Then execute the .jar file in that directory using:
java -jar ./selenium-server-standalone-2.14.0.jar
Go to the desired location in the command prompt and enter the command below.
java -jar ./selenium-server-standalone-3.141.59.jar
That means if you saved the .jar file in "C:\Eclipse\jar", the command should be:
C:\Eclipse\jar>java -jar ./selenium-server-standalone-3.141.59.jar

Solr : Can post.jar be used to post files recursively within folder and subfolder

If I want to post all the XML files in a folder, I use post.jar:
java -jar post.jar *.xml
Is there any way to post files recursively, i.e. to also post the XML files in subfolders?
If you're on a unix-like (OSX or Linux), you can do something like this:
find . -name \*xml | xargs java -jar post.jar
That'll find all .xml files in or under the current directory ('.') and pass them as parameters to the java -jar post.jar command.
find is incredibly opaque, but very useful for stuff like this.
Please think twice before using stuff under the /example directory of Solr. I use Solr from Tomcat (instead of the Jetty embedded in the start.jar) and I use urllib2 in Python for POSTing data to Solr. (Jetty is production-level software, so don't worry too much about that.)
So, for uploading files, consider writing it in your favorite programming language. You can implement folder recursion yourself. For POSTing files, you can use libcurl, which can send HTTP GET, form POSTs, multipart POSTs etc. A C program using libcurl needs no more than 8 lines to POST a file. curl bindings exist for all major languages, so you can port libcurl code written in C to PHP, for example.
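The folder recursion mentioned above could look roughly like this in Java. It is a sketch with hypothetical class and method names that only collects the .xml files under a root directory; actually POSTing each file to Solr is left out, since the update URL and HTTP client depend on your setup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr.
public class RecursiveXmlFinder {

    // Walks root and all subfolders, returning regular files ending in ".xml".
    static List<Path> findXml(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".xml"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the current directory; pass another root as an argument.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path xml : findXml(root)) {
            System.out.println(xml); // each of these would be POSTed to Solr
        }
    }
}
```

Unlike a shell pipeline, this handles file names containing spaces without extra quoting.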
Yes, you can!
You just need to add the -Drecursive flag.
java -Drecursive -jar post.jar *
For further options for the flags and usage just use:
java -jar post.jar -h
