Apache Tika - Data file MIME type is shown as text/plain

I am new to Apache Tika. I am using Tika to get a file's MIME type. Below is the code I use:
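TikaConfig config = TikaConfig.getDefaultConfig();
Detector detector = config.getDetector();
MediaType mediaType = null;
try (TikaInputStream stream = TikaInputStream.get(file)) {
    Metadata metadata = new Metadata();
    // Passing the file name lets the detector take the extension into account
    metadata.add(Metadata.RESOURCE_NAME_KEY, fileName);
    mediaType = detector.detect(stream, metadata);
} catch (IOException e) {
    // Report the failure instead of silently swallowing it
    e.printStackTrace();
}
// String concatenation prints "null" rather than throwing if detection failed
System.out.println("Mime Type = " + mediaType);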
Unfortunately, the MIME type for a .dat file is reported as "text/plain" instead of "application/octet-stream". How can I fix this?
Thanks

Related

Apache Tika BodyContentHandler() is Empty

I'm using Apache Tika 1.18. With one web service framework (sparkjava) the code below works, yet in Spring Boot the BodyContentHandler comes back empty, so the text I return is empty.
I'm not sure what is going on and would appreciate any suggestions.
I'm passing a Base64-encoded string to this code, and it is also URL-encoded, hence the two decodes in the first two lines.
When I run this code in the debugger in Spring Boot, the variables hold the same values as in sparkjava. But once I reach the BodyContentHandler, the handler variable is "" in the Spring Boot version, whereas in the sparkjava version it holds the input text.
I also tested this with Tika 1.17, with the same result. I also tried removing the -1 argument from the BodyContentHandler constructor, again with the same result.
Thanks in advance.
A string beginning with "data=" is passed into the Spring Boot POST method.
// The two-argument decode avoids the deprecated platform-default overload; it throws UnsupportedEncodingException
String bodyData = URLDecoder.decode(data.substring(data.indexOf("data=") + 5), "UTF-8");
byte[] decodedBodyData = java.util.Base64.getMimeDecoder().decode(bodyData);
try (InputStream inputStream = new ByteArrayInputStream(decodedBodyData)) {
    Parser parser = new AutoDetectParser();
    // In Spring Boot the handler created on the next line is "". Problem!
    BodyContentHandler handler = new BodyContentHandler(-1); // -1 removes the write limit, to handle larger files
    Metadata metadata = new Metadata();
    ParseContext context = new ParseContext();
    // Parse the file
    parser.parse(inputStream, handler, metadata, context);
    textToReturn = handler.toString();
} catch (IOException | SAXException | TikaException e) {
    e.printStackTrace();
}
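The two decoding steps described in the question can be checked in isolation, without Tika or a web framework. The following is only a sketch using the JDK: the class name `BodyDecodingSketch` and both method names are made up for illustration, and it assumes the client Base64-encodes the bytes first and URL-encodes the result afterwards.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, not part of Tika, Spring Boot, or sparkjava.
class BodyDecodingSketch {

    // Encodes raw bytes the way the client is assumed to: Base64 first, then URL-encoding.
    static String encodeBody(byte[] raw) throws UnsupportedEncodingException {
        String base64 = Base64.getEncoder().encodeToString(raw);
        return "data=" + URLEncoder.encode(base64, "UTF-8");
    }

    // Reverses the two steps in the opposite order: URL-decode, then Base64-decode.
    static byte[] decodeBody(String body) throws UnsupportedEncodingException {
        String encoded = body.substring(body.indexOf("data=") + 5);
        String base64 = URLDecoder.decode(encoded, "UTF-8");
        return Base64.getMimeDecoder().decode(base64);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        byte[] original = "Hello, Tika!".getBytes(StandardCharsets.UTF_8);
        byte[] decoded = decodeBody(encodeBody(original));
        // If the round trip is intact, this prints the original text.
        System.out.println(new String(decoded, StandardCharsets.UTF_8));
    }
}
```

Comparing the decoded bytes in both frameworks this way would show whether the input to the parser really is identical before the handler is involved.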

Losing the input stream in Apache tika

I am getting the input stream from the HttpRequest and using the same input stream to extract the metadata, as shown below.
ServletFileUpload upload = new ServletFileUpload();
FileItemIterator iter = upload.getItemIterator(request);
// ... more lines for the iteration and getting the stream ...
InputStream input = item.openStream();
This input is then passed to the parser as below:
public Map<String, String> extractMetadata(InputStream is) {
    Map<String, String> map = new HashMap<>();
    ContentHandler contentHandler = new BodyContentHandler(-1);
    Metadata metadata = new Metadata();
    Parser parser = new AutoDetectParser();
    ParseContext parseContext = new ParseContext();
    parseContext.set(Parser.class, new ParserDecorator(parser));
    try {
        TikaInputStream tikaInputStream = TikaInputStream.get(is);
        parser.parse(tikaInputStream, contentHandler, metadata, parseContext);
        // Copy every metadata entry into the result map
        for (String name : metadata.names()) {
            map.put(name, metadata.get(name));
        }
    } catch (IOException | SAXException | TikaException e) {
        map.put("ERROR", "Error while retrieving Metadata");
    }
    return map;
}
But when I then try to read the input stream again, its content is not the same as when I don't use Tika for the extraction.
Does Tika dirty the stream?
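A plain InputStream can generally be read only once, so any consumer that reads it leaves less (or nothing) for the next one. One common way around this, sketched below with JDK classes only, is to read the upload into a byte array first and hand each consumer its own ByteArrayInputStream. The class and method names are hypothetical, and the sketch assumes the upload is small enough to hold in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of Tika or Commons FileUpload.
class ReusableStreamSketch {

    // Reads the whole stream into memory so that its content can be replayed later.
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream obtained from item.openStream()
        InputStream upload = new ByteArrayInputStream("file content".getBytes(StandardCharsets.UTF_8));
        byte[] bytes = readFully(upload);

        // The original stream is exhausted once it has been read to the end.
        System.out.println(upload.read()); // prints -1

        // Each consumer gets an independent stream over the same bytes.
        InputStream forParser = new ByteArrayInputStream(bytes);
        InputStream forStorage = new ByteArrayInputStream(bytes);
        System.out.println(forParser.available() == forStorage.available());
    }
}
```

With this approach, one stream could be passed to extractMetadata and the other to whatever code needs the untouched content afterwards.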

Sending JSON data to a server in BlackBerry

In my application I have to integrate an API. I can't find code showing how to check whether an internet connection is available, or how to send JSON data to a server. Please help me out. In Android we call the API from an AsyncTask class; in BlackBerry I did not find anything like this.
Please suggest some links or ideas so that I can integrate the code. I have been googling but did not get any results.
Here is what I have tried:
JSONObject postData = new JSONObject();
postData.put("userId", "24");
postData.put("messageTime", "06:00:00");
postData.put("language", language[lang_Ocf.getSelectedIndex()]);
System.out.println("********json********" + postData);
ConnectionFactory conFactory = new ConnectionFactory();
ConnectionDescriptor conDesc = null;
try
{
    conDesc = conFactory.getConnection(url + ";deviceside=true");
}
catch (Exception e)
{
    System.out.println(e.toString() + ":" + e.getMessage());
}
String response = ""; // holds the server response
// Proceed only if ConnectionFactory returned a connection descriptor
if (null != conDesc)
{
    try
    {
        HttpConnection connection = (HttpConnection) conDesc.getConnection();
        // Serialize the JSON once so that Content-Length matches the bytes actually sent
        byte[] body = postData.toString().getBytes("UTF-8");
        // Set the request headers
        connection.setRequestMethod(HttpConnection.POST);
        connection.setRequestProperty("Content-Length", Integer.toString(body.length));
        connection.setRequestProperty("Content-Type", "application/json");
        OutputStream out = connection.openOutputStream();
        out.write(body);
        out.flush();
        out.close();
        int responseCode = connection.getResponseCode();
        if (responseCode == HttpConnection.HTTP_OK) {
            InputStream in = connection.openInputStream();
            StringBuffer buf = new StringBuffer();
            int read = -1;
            while ((read = in.read()) != -1)
                buf.append((char) read);
            in.close();
            response = buf.toString();
        }
        Dialog.inform(response);
        connection.close();
    } catch (Exception e) {
        // Log the failure instead of silently ignoring it
        System.out.println(e.toString());
    }
}
return response;
Thanks
I solved this problem.
The error was:
Error: Cannot run program "jar": CreateProcess error=2, The system cannot find the file specified
Packaging project HelaBibleWhereUR failed (took 10.715 seconds)
I simply copied jar.exe from the Java bin folder into the JRE bin folder.

How to compress the files in Blackberry?

In my application I use an HTML template and images for a browser field, saved on the SD card. Now I want to compress those HTML and image files and send them to a PHP server. How can I compress the files and send them to the server? Some samples would help a lot.
I tried it this way. My code is:
EDIT:
private void zipthefile() {
    String out_path = "file:///SDCard/" + "newtemplate.zip";
    String in_path = "file:///SDCard/" + "newtemplate.html";
    InputStream inputStream = null;
    GZIPOutputStream os = null;
    try {
        // Open the source file
        FileConnection fileConnection = (FileConnection) Connector.open(in_path);
        if (!fileConnection.exists()) {
            Dialog.alert("File not found: " + in_path);
            return;
        }
        inputStream = fileConnection.openInputStream();
        // Create the output file if it does not exist yet
        FileConnection path = (FileConnection) Connector.open(out_path, Connector.READ_WRITE);
        if (!path.exists()) {
            path.create();
        }
        // Wrap the file's output stream so that everything written is GZIP-compressed
        os = new GZIPOutputStream(path.openOutputStream());
        // Copy in 1 KB chunks instead of one byte at a time
        byte[] buffer = new byte[1024];
        int length;
        while ((length = inputStream.read(buffer)) != -1) {
            os.write(buffer, 0, length);
        }
    } catch (Exception e) {
        Dialog.alert("" + e.toString());
    } finally {
        if (inputStream != null) {
            try {
                inputStream.close();
            } catch (IOException e) {
                e.printStackTrace();
                Dialog.alert("" + e.toString());
            }
        }
        if (os != null) {
            try {
                os.close();
            } catch (IOException e) {
                e.printStackTrace();
                Dialog.alert("" + e.toString());
            }
        }
    }
}
This code works fine for a single file, but I want to compress all the files (more than one) in the folder.
In case you are not familiar with them: in Java, the stream classes follow the Decorator Pattern. They are meant to be wrapped around other streams to perform additional tasks. For instance, a FileOutputStream lets you write bytes to a file; if you decorate it with a BufferedOutputStream, you also get buffering (big chunks of data are stored in RAM before finally being written to disk). If you decorate it with a GZIPOutputStream, you also get compression.
Example:
//To read compressed file:
InputStream is = new GZIPInputStream(new FileInputStream("full_compressed_file_path_here"));
//To write to a compressed file:
OutputStream os = new GZIPOutputStream(new FileOutputStream("full_compressed_file_path_here"));
This is a good tutorial covering basic I/O. Despite being written for Java SE, you'll find it useful, since most things work the same way in BlackBerry.
In the API you have these classes available:
GZIPInputStream
GZIPOutputStream
ZLibInputStream
ZLibOutputStream
If you need to convert between streams and byte arrays, use the IOUtilities class or ByteArrayOutputStream and ByteArrayInputStream.
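To make the decorator idea concrete, here is a minimal round trip that compresses a byte array and decompresses it again. It is a sketch written against Java SE (`java.util.zip`), not the BlackBerry classes listed above, so the package names and the try-with-resources syntax are assumptions that would need adapting on a device. The class name `GzipDecoratorSketch` is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration (Java SE, not the BlackBerry API).
class GzipDecoratorSketch {

    // Decorates an in-memory sink with GZIP compression.
    static byte[] compress(byte[] plain) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(plain);
        } // closing the GZIP stream finishes the compressed data
        return sink.toByteArray();
    }

    // Decorates an in-memory source with GZIP decompression.
    static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                sink.write(chunk, 0, n);
            }
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] html = "<html><body>template</body></html>".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(html);
        byte[] unpacked = decompress(packed);
        // If the round trip is intact, this prints the original markup.
        System.out.println(new String(unpacked, StandardCharsets.UTF_8));
    }
}
```

Note that GZIP compresses a single stream. Handling several files this way means either compressing each one separately or combining them into one stream first.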

how to upload a file

I am trying to upload a file from the client to the server.
On the client side, I have a file input.
On the server side I have:
private void uploadFile(final FileTransfer fileTransfer) {
    String destinationFile = "/home/nat/test.xls";
    byte[] buf = new byte[1024];
    int len;
    // try-with-resources closes both streams, even if the copy fails
    try (InputStream fis = fileTransfer.getInputStream();
         FileOutputStream out = new FileOutputStream(new File(destinationFile))) {
        // read() returns -1 at the end of the stream
        while ((len = fis.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
A file is created on the server, but it is empty.
When I debug, I can see that fis is not null.
Any idea?
Here is a code extract of mine:
try {
    File fileData = new File(fileTransfer.getFilename());
    // Write the content (data) to the file
    // Apache Commons IO (FileUtils)
    FileOutputStream fos = FileUtils.openOutputStream(fileData);
    // Spring Utils (FileCopyUtils); copy() closes both streams when done
    FileCopyUtils.copy(fileTransfer.getInputStream(), fos);
    // Alternative with Apache Commons IO
    // FileUtils.copyInputStreamToFile(fileTransfer.getInputStream(), fileData);
    // Send the file to a back-end service
    myService.persistFile(fileData);
} catch (IOException ioex) {
    log.error("Error with io", ioex);
}
return fileTransfer.getFilename(); // this is for my JavaScript callback function
Apache Commons IO is a good library for such manipulations (I use Spring Utils as well). If you do not have a Spring context, use the commented-out alternative with Apache Commons IO (check the syntax; it is not verified).
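If neither library is available, the same copy step can be sketched with the JDK alone (Java 7 or later). The class and method names below are hypothetical, and a ByteArrayInputStream stands in for the stream that fileTransfer.getInputStream() would return.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class for illustration; uses only the JDK.
class UploadCopySketch {

    // Copies the uploaded stream to the destination and returns the number of bytes written.
    static long saveUpload(InputStream upload, Path destination) throws IOException {
        try (InputStream in = upload) { // closes the upload stream when the copy is done
            return Files.copy(in, destination, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the uploaded data
        InputStream upload = new ByteArrayInputStream("spreadsheet bytes".getBytes(StandardCharsets.UTF_8));
        Path destination = Files.createTempFile("upload", ".xls");
        long written = saveUpload(upload, destination);
        System.out.println("Wrote " + written + " bytes to " + destination);
        Files.delete(destination);
    }
}
```

Checking the returned byte count is a quick way to tell an empty upload stream apart from a problem on the writing side.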

Resources