Mime Type detection of Office files is resulting in application/x-tika-ooxml - apache-tika

I am trying to detect the MIME type of a file from its input stream.
I only have tika-core on my classpath, version 2.0.0.
However, for a docx file "application/x-tika-ooxml" is always detected. Detection of any Office file always results in x-tika-ooxml.
I also tried wrapping the input stream in a TikaInputStream, with the same result.
Below is my code:
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;
import org.apache.tika.Tika;
public class TikaTester {
    public static void main(String[] a) {
        try {
            FileInputStream stream = new FileInputStream("/Users/<>/Downloads/Test DMS.docx");
            detectMimeType(stream);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
    // Detects the MIME type from the stream contents only; no file name is passed in.
    public static void detectMimeType(InputStream stream) {
        Tika tika = new Tika();
        try {
            String mimeType = tika.detect(stream);
            System.out.println("Mime type detected " + mimeType);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

This works if I add tika-parsers to the classpath, and a TikaInputStream also needs to be used.
Maybe tika-core does not have the parser for Office files.
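For background on why the generic type shows up: a .docx, .xlsx or .pptx file is a ZIP archive, so the leading bytes alone cannot tell the three apart. As far as I know, the container-aware detection that opens the archive lives in the parser modules rather than in tika-core. The sketch below is not Tika's implementation; it is a plain-JDK illustration that builds a tiny fake archive in memory and guesses the type from the entry names. The class OoxmlContainerSketch, its methods, and the folder-name heuristic are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class OoxmlContainerSketch {
    private static final String PREFIX = "application/vnd.openxmlformats-officedocument.";

    /** Builds a small in-memory ZIP with the given entry names (dummy contents). */
    static byte[] zipWith(String... entryNames) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (String name : entryNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write("<x/>".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Guesses a MIME type by looking inside the archive (hypothetical heuristic). */
    static String guessType(byte[] data) {
        // Step 1: magic bytes. "PK\003\004" only says "this is a ZIP".
        boolean isZip = data.length >= 4
                && data[0] == 'P' && data[1] == 'K' && data[2] == 3 && data[3] == 4;
        if (!isZip) {
            return "application/octet-stream";
        }
        // Step 2: open the container and look at the entry names.
        boolean hasContentTypes = false;
        String specific = null;
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                String name = e.getName();
                if (name.equals("[Content_Types].xml")) {
                    hasContentTypes = true;
                } else if (name.startsWith("word/")) {
                    specific = PREFIX + "wordprocessingml.document";
                } else if (name.startsWith("xl/")) {
                    specific = PREFIX + "spreadsheetml.sheet";
                } else if (name.startsWith("ppt/")) {
                    specific = PREFIX + "presentationml.presentation";
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!hasContentTypes) {
            return "application/zip";
        }
        // Without a recognisable part we only know "some OOXML package".
        return specific != null ? specific : "application/x-tika-ooxml";
    }

    public static void main(String[] args) {
        System.out.println(guessType(zipWith("[Content_Types].xml", "word/document.xml")));
    }
}
```

The point of the sketch is only that the specific Office type requires reading the archive contents, which is consistent with detection improving once the parser modules are on the classpath.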

Related

Modify tokenizer in ANTLR

In ANTLR, how can I output the tokens one per line, as if Enter were pressed after each one? I am trying this on a class in Hello.java like this:
public class Hello{
public static void main(String args[]){
System.out.println("Hello World ...");
}
}
Now, it is time to parse the tokens
final Antlr3JavaLexer lexer = new Antlr3JavaLexer();
try {
lexer.setCharStream(new ANTLRReaderStream(in)); // in is a file
} catch (IOException e) {
e.printStackTrace();
}
final CommonTokenStream tokens = new CommonTokenStream();
tokens.setTokenSource(lexer);
tokens.LT(10); // force load
Antlr3JavaParser parser = new Antlr3JavaParser(tokens);
System.out.println(tokens);
It gives me output like this:
publicclassHello{publicstaticvoidmain(Stringargs[]){System.out.println("Hello World ...");}}
How can I make the output look like this?
public
class
Hello
{
public
static ... until the end ...
I've tried using StringBuilder, but it's not working.
Thanks for the help.
Instead of just printing the token stream, you have to iterate over it to get the desired result.
Modify your code like this:
final Antlr3JavaLexer lexer = new Antlr3JavaLexer();
try {
lexer.setCharStream(new ANTLRReaderStream(in)); // in is a file
} catch (IOException e) {
e.printStackTrace();
}
final CommonTokenStream tokens = new CommonTokenStream();
tokens.setTokenSource(lexer);
//tokens.LT(10); // force load - not needed
Antlr3JavaParser parser = new Antlr3JavaParser(tokens);
// Iterate over tokenstream
for (Object tk: tokens.getTokens())
{
CommonToken commontk = (CommonToken) tk;
if (commontk.getText() != null && !commontk.getText().trim().isEmpty())
{
System.out.println(commontk.getText());
}
}
After this, you will get this result:
public
class
Hello
{
public
static ... etc...
Hope this will solve your issue.
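The iterate-and-filter idea in the answer does not depend on ANTLR. As a rough illustration only, here is a plain-JDK sketch in which a small regular expression stands in for Antlr3JavaLexer. The class TokenPerLineSketch and its pattern are made up for this example and are far simpler than a real Java lexer; the part that carries over is the loop that skips whitespace-only tokens and prints each remaining token on its own line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenPerLineSketch {
    // Toy lexer: a string literal, a word, a whitespace run, or any single other character.
    private static final Pattern TOKEN = Pattern.compile("\"[^\"]*\"|\\w+|\\s+|.");

    /** Returns the token texts in order, dropping tokens that are only whitespace. */
    static List<String> visibleTokens(String source) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(source);
        while (matcher.find()) {
            String text = matcher.group();
            // Same filter as in the answer: ignore whitespace-only tokens.
            if (!text.trim().isEmpty()) {
                result.add(text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // One token per line.
        for (String token : visibleTokens("public class Hello{ }")) {
            System.out.println(token);
        }
    }
}
```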

Parsing Arabic text using SAX produces "?"

I'm developing an LWUIT project in NetBeans to run in the BlackBerry environment. The project reads data from a .NET web service; I used ksoap2 and a SAX parser. The parser looks like this:
public static Vector ParseSAX(String input ,final String[] elements) {
final Vector values = new Vector();
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
DefaultHandler handler = new DefaultHandler() {
boolean flag = false; // true once one of the wanted elements has started
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
for(int u = 0;u < elements.length;u++){
if (qName.equalsIgnoreCase(elements[u].toString())) {
flag = true;
}
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
}
public void characters(char ch[], int start, int length) throws SAXException {
if (flag) {
values.addElement(new String(ch, start, length));
flag = false;
}
}
};
InputStreamReader inputStream = new InputStreamReader(new ByteArrayInputStream(input.getBytes()), "UTF-8");
InputSource is = new InputSource();
is.setEncoding("UTF-8");
is.setCharacterStream(inputStream);
saxParser.parse(is, handler);
} catch (Exception e) {
e.printStackTrace();
}
return values;
}
I took care to handle Arabic characters.
I converted the project encoding to UTF-8, changed javac.encoding=UTF-8 in project.properties, and added runtime.encoding=UTF-8 in private.properties.
If I put this code in an isolated project, it runs fine.
If I add it to the BB project or a web project, it produces "?" characters.
I do not know what to do.
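If the symptom is Arabic text turning into "?" characters, one likely cause (an assumption, not something confirmed in this thread) is the call input.getBytes() with no charset argument: it encodes with the platform default, and when that default cannot represent Arabic, every such character becomes the byte for '?' before the parser ever sees it. That would also explain why the same code behaves differently from one project or runtime to another. The sketch below is desktop Java SE, not the BlackBerry API. It reproduces the lossy round trip and shows a parse that reads from the String directly so no charset is involved. The class ArabicSaxSketch and its method names are made up for this example.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class ArabicSaxSketch {

    /** Collects the text of every element named 'wanted', reading straight from the String. */
    static List<String> textOf(String xml, final String wanted) {
        final List<String> values = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder current; // non-null while inside a wanted element

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (qName.equalsIgnoreCase(wanted)) {
                    current = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // characters() may be called several times per element, so append.
                if (current != null) {
                    current.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (current != null && qName.equalsIgnoreCase(wanted)) {
                    values.add(current.toString());
                    current = null;
                }
            }
        };
        try {
            // A character stream avoids the String -> bytes -> String round trip entirely.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("SAX parse failed", e);
        }
        return values;
    }

    /** Reproduces the suspected bug: encode with a charset that lacks Arabic, decode as UTF-8. */
    static String lossyRoundTrip(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1); // unmappable chars become '?'
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String salam = "\u0633\u0644\u0627\u0645"; // Arabic word written with escapes
        System.out.println(lossyRoundTrip(salam));
        System.out.println(textOf("<r><name>" + salam + "</name></r>", "name").get(0).equals(salam));
    }
}
```

If the byte stream has to stay, the smaller change to the code in the question would be to name the charset explicitly, that is input.getBytes("UTF-8"), so that it matches the "UTF-8" given to the InputStreamReader.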

Connection being made, but content is unable to be retrieved from web service

public class ConsumeFactoryThread extends Thread {
private String url;
private HttpConnection httpConn;
private InputStream is;
private CustomMainScreen m;
private JSONArray array;
public ConsumeFactoryThread(String url, CustomMainScreen m){
System.out.println("Connection begin!");
this.url = url;
this.m = m;
}
public void finished(){
m.onFinish(array);
}
public void run(){
myConnectionFactory connFact = new myConnectionFactory();
ConnectionDescriptor connDesc;
connDesc = connFact.getConnection(url);
System.out.println("Connection factory!");
if(connDesc != null)
{
System.out.println("Connection not null!");
httpConn = (HttpConnection) connDesc.getConnection();
is = null;
try
{
final int iResponseCode = httpConn.getResponseCode();
UiApplication.getUiApplication().invokeLater(new Runnable()
{
public void run()
{
System.out.println("Connection in run!");
// Get InputConnection and read the server's response
InputConnection inputConn = (InputConnection) httpConn;
try {
is = inputConn.openInputStream();
System.out.println("Connection got inputstream!");
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
byte[] data = null;
try {
data = IOUtilities.streamToBytes(is);
System.out.println("Connection got data!");
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
String result = new String(data);
System.out.println("Connection Data: "+result);
try {
array = new JSONArray(result);
//finished();
} catch (JSONException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
});
}
catch(IOException e)
{
System.err.println("Caught IOException: " + e.getMessage());
}
}
}
}
I'm using the blackberry torch 9800 simulator and hardware device for testing.
In the simulator I cannot retrieve the data over wifi, even though the connection to wifi is found. It works when the mobile network is enabled.
Now, when I replace my web service with the Twitter api, I get the data regardless of transport type. I tried adding ;deviceside=false to my url, but nothing. It's not https or anything.
I just want my web service accessed! I know nothing about this mds,bis,bes,bis_b junk.
EDIT:
Jeez. I'm realizing it may be my site. Not using the web service and just retrieving the page, www.example.com, I get nothing. But, google.com or any other site I use retrieves the html. Am I missing headers!?!
Try appending ;interface=wifi to the end of your URL. This forces the simulator to use your simulated Wi-Fi connection, which is your PC's network connection.
You will need to have set up Wi-Fi on the simulator by going to Manage Connections -> Set Up Wi-Fi Network and then connecting to Default WLAN Network.
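A small helper can keep the suffix handling in one place. This is only a sketch: the class WifiUrlSketch and the method forceWifi are hypothetical names, and the helper does nothing more than string handling before the URL is passed on to whatever opens the connection.

```java
final class WifiUrlSketch {
    // Connection-string parameter suggested in the answer above.
    static final String WIFI_SUFFIX = ";interface=wifi";

    /** Appends ;interface=wifi unless the URL already carries it. */
    static String forceWifi(String url) {
        return url.indexOf(WIFI_SUFFIX) >= 0 ? url : url + WIFI_SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(forceWifi("http://www.example.com/service"));
    }
}
```

In the thread class from the question, the result would be passed to connFact.getConnection(...) in place of the bare url.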

BlackBerry Java - Fixed-length streaming of a POST body over an HTTP connection

I'm working on some code which often POSTs large packets over HTTP to a REST server on IIS. I'm using the RIM/Java ME HttpConnection class.
As far as I can tell, HttpConnection uses an internal buffer to "gather" up the output stream before sending the entire contents to the server. I'm not surprised, since this is how HttpURLConnection works by default as well. (I assume it does this so that the Content-Length is set correctly.) But in Java SE I could override this behavior with the method setFixedLengthStreamingMode, so that when I call flush on the output stream it sends that "chunk" of the stream. On a phone this extra buffering is too expensive in terms of memory.
In BlackBerry Java, is there a way to do fixed-length streaming on an HTTP request when you know the content length in advance?
So, I never found a way to do this with the base API for HttpConnection. Instead, I created a socket and wrapped it with my own simple HTTP client, which did support writing the body in chunks.
Below is the prototype I used and tested on BB 7.0.
package mypackage;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import javax.microedition.io.Connector;
import javax.microedition.io.SocketConnection;
public class MySimpleHTTPClient{
SocketConnection sc;
String HttpHeader;
OutputStreamWriter outWriter;
InputStreamReader inReader;
public void init(
String Host,
String port,
String path,
int ContentLength,
String ContentType ) throws IllegalArgumentException, IOException
{
String _host = (new StringBuffer())
.append("socket://")
.append(Host)
.append(":")
.append(port).toString();
sc = (SocketConnection)Connector.open(_host );
sc.setSocketOption(SocketConnection.LINGER, 5);
StringBuffer _header = new StringBuffer();
//Setup the HTTP Header.
_header.append("POST ").append(path).append(" HTTP/1.1\r\n");
_header.append("Host: ").append(Host).append("\r\n");
_header.append("Content-Length: ").append(ContentLength).append("\r\n");
_header.append("Content-Type: ").append(ContentType).append("\r\n");
_header.append("Connection: Close\r\n\r\n");
HttpHeader = _header.toString();
}
public void openOutputStream() throws IOException{
if(outWriter != null)
return;
outWriter = new OutputStreamWriter(sc.openOutputStream());
outWriter.write( HttpHeader, 0 , HttpHeader.length() );
}
public void openInputStream() throws IOException{
if(inReader != null)
return;
inReader = new InputStreamReader(sc.openDataInputStream());
}
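public void writeChunkToServer(String Chunk) throws Exception{
if(outWriter == null){
// Let an IOException propagate instead of failing later with a NullPointerException.
openOutputStream();
}
outWriter.write(Chunk, 0, Chunk.length());
// Push this chunk to the socket now rather than leaving it in the writer's buffer.
outWriter.flush();
}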
public String readFromServer() throws IOException {
if(inReader == null){
try {
openInputStream();
} catch (IOException e) {e.printStackTrace();}
}
StringBuffer sb = new StringBuffer();
int data = inReader.read();
//Note :: This will also read the HTTP headers..
// If you need to parse the headers, tokenize on \r\n for each
// header, the header section is done when you see \r\n\r\n
while(data != -1){
sb.append( (char)data );
data = inReader.read();
}
return sb.toString();
}
public void close(){
if(outWriter != null){
try {
outWriter.close();
} catch (IOException e) {}
}
if(inReader != null){
try {
inReader.close();
} catch (IOException e) {}
}
if(sc != null){
try {
sc.close();
} catch (IOException e) {}
}
}
}
Here is example usage for it:
MySimpleHTTPClient myConn = new MySimpleHTTPClient() ;
String chunk1 = "ID=foo&data1=1234567890&chunk1=0|";
String chunk2 = "ID=foo2&data2=123444344&chunk1=1";
try {
myConn.init(
"pdxsniffe02.webtrends.corp",
"80",
"TableAdd/234234234443?debug=1",
chunk1.length() + chunk2.length(),
"application/x-www-form-urlencoded"
);
myConn.writeChunkToServer(chunk1);
// The first chunk is already on its way.
myConn.writeChunkToServer(chunk2);
System.out.println( myConn.readFromServer() );
} catch (IllegalArgumentException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}finally{
myConn.close();
}
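One detail worth checking when adapting the prototype: Content-Length counts bytes, while String.length() counts characters. The two agree for the ASCII chunks in the usage example, but they diverge as soon as a chunk contains a character that encodes to more than one byte. The sketch below is plain Java SE and assumes the body is written as UTF-8. The class PostHeaderSketch and its methods are hypothetical helpers, not part of the prototype. It also adds a leading "/" to the path if one is missing, since the request line expects an absolute path.

```java
import java.nio.charset.StandardCharsets;

final class PostHeaderSketch {

    /** Total body size in bytes for the given chunks when encoded as UTF-8. */
    static int contentLength(String... chunks) {
        int total = 0;
        for (String chunk : chunks) {
            total += chunk.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Builds the request line and headers, terminated by the empty line that ends the header section. */
    static String postHeader(String host, String path, int contentLength, String contentType) {
        // The request target is expected to start with "/".
        String target = path.startsWith("/") ? path : "/" + path;
        return "POST " + target + " HTTP/1.1\r\n"
                + "Host: " + host + "\r\n"
                + "Content-Length: " + contentLength + "\r\n"
                + "Content-Type: " + contentType + "\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    public static void main(String[] args) {
        String chunk1 = "ID=foo&data1=1234567890&chunk1=0|";
        String chunk2 = "ID=foo2&data2=123444344&chunk1=1";
        System.out.print(postHeader("www.example.com", "TableAdd/1",
                contentLength(chunk1, chunk2), "application/x-www-form-urlencoded"));
    }
}
```

With a helper like this, the value passed as ContentLength to init(...) would come from contentLength(chunk1, chunk2) rather than from adding up String.length() results.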

Append to an existing file in a new line

I want to write some text on a new line in an existing file. I tried the following code but it failed. Can anyone suggest how I can append to the file on a new line?
private void writeIntoFile1(String str) {
try {
fc=(FileConnection) Connector.open("file:///SDCard/SpeedScence/MaillLog.txt");
OutputStream os = fc.openOutputStream(fc.fileSize());
os.write(str.getBytes());
os.close();
fc.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
and calling
writeIntoFile1("aaaaaaaaa");
writeIntoFile1("bbbbbb");
It successfully writes to the file on the simulated SD card, but the text appears on the same line.
How can I write "bbbbbb" on a new line?
Write a newline (\n) after writing the string.
private void writeIntoFile1(String str) {
try {
fc = (FileConnection) Connector.open("file:///SDCard/SpeedScence/MaillLog.txt");
OutputStream os = fc.openOutputStream(fc.fileSize());
os.write(str.getBytes());
os.write("\n".getBytes());
os.close();
fc.close();
} catch (IOException e) {
e.printStackTrace();
}
}
N.B. a PrintStream is generally better-suited for printing text, though I'm not familiar enough with the BlackBerry API to know if it's possible to use a PrintStream at all. With a PrintStream you'd just use println():
private void writeIntoFile1(String str) {
try {
fc = (FileConnection) Connector.open("file:///SDCard/SpeedScence/MaillLog.txt");
PrintStream ps = new PrintStream(fc.openOutputStream(fc.fileSize()));
ps.println(str); // print the text itself; passing str.getBytes() would print the array reference, not the text
ps.close();
fc.close();
} catch (IOException e) {
e.printStackTrace();
}
}
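For comparison, the same append-then-newline idea on desktop Java SE can be sketched with the standard library alone. This is not the BlackBerry FileConnection API; the class AppendLineSketch, its methods, and the temporary file are assumptions made for the example.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class AppendLineSketch {

    /** Appends the text followed by a line separator to the end of the file. */
    static void appendLine(Path file, String text) {
        // 'true' opens the file in append mode; println writes the text plus a line break.
        try (PrintStream out = new PrintStream(
                new FileOutputStream(file.toFile(), true), false, "UTF-8")) {
            out.println(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes two entries to a temporary file and returns the lines read back. */
    static List<String> demo() {
        try {
            Path file = Files.createTempFile("MailLog", ".txt");
            appendLine(file, "aaaaaaaaa");
            appendLine(file, "bbbbbb");
            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            Files.delete(file);
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Each call to appendLine ends its entry with a line break, so the next call starts on a fresh line, which is the behaviour asked for in the question.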
