Parsing Arabic text using SAX produces "?" characters - blackberry

I'm developing an LWUIT project in NetBeans to run in the BlackBerry environment. The project reads data from a .NET web service; I use ksoap2 and a SAX parser. The parser looks like this:
public static Vector ParseSAX(String input ,final String[] elements) {
final Vector values = new Vector();
try {
SAXParserFactory factory = SAXParserFactory.newInstance();
SAXParser saxParser = factory.newSAXParser();
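DefaultHandler handler = new DefaultHandler() {
// set when a requested element starts; cleared once its text has been read
private boolean flag = false;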
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
for(int u = 0;u < elements.length;u++){
if (qName.equalsIgnoreCase(elements[u].toString())) {
flag = true;
}
}
}
public void endElement(String uri, String localName,
String qName) throws SAXException {
}
public void characters(char ch[], int start, int length) throws SAXException {
if (flag) {
values.addElement(new String(ch, start, length));
flag = false;
}
}
};
InputStreamReader inputStream = new InputStreamReader(new ByteArrayInputStream(input.getBytes()), "UTF-8");
InputSource is = new InputSource();
is.setEncoding("UTF-8");
is.setCharacterStream(inputStream);
saxParser.parse(is, handler);
} catch (Exception e) {
e.printStackTrace();
}
return values;
}
I took care to handle Arabic characters.
I also converted the project encoding to UTF-8, set javac.encoding=UTF-8 in project.properties, and added runtime.encoding=UTF-8 to private.properties.
If I put this code in an isolated project, it runs fine.
If I add it to the BB project or to a web project, it produces "?" characters instead of the Arabic text.
What can I do?
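One possible cause, offered as an assumption rather than a confirmed diagnosis: input.getBytes() encodes the string with the platform default charset, which may not be UTF-8 in the BlackBerry or web environment, while the InputStreamReader then decodes those bytes as UTF-8. Since the input is already a String, the sketch below skips the byte round trip and hands the parser a StringReader. The names ArabicSaxSketch and parseValues are made up for illustration, and the code targets Java SE's javax.xml.parsers; the SAX API available on BlackBerry may differ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class ArabicSaxSketch {

    /**
     * Returns the text of every element whose qualified name is listed in elements.
     * Assumes the requested elements contain text only (no nested child elements).
     */
    static List<String> parseValues(String xml, final String[] elements) {
        final List<String> values = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean capturing = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                for (String wanted : elements) {
                    if (qName.equalsIgnoreCase(wanted)) {
                        capturing = true;
                        text.setLength(0);
                    }
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (capturing) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (capturing) {
                    values.add(text.toString());
                    capturing = false;
                }
            }
        };
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // A character stream needs no byte decoding, so no charset can be mismatched.
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
        return values;
    }
}
```

If the byte stream has to stay, calling input.getBytes("UTF-8") so that encoding and decoding agree should have the same effect.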

Related

Sentiment Analysis with OpenNLP

I found this description of implementing a sentiment analysis task with OpenNLP. In my case I am using the newest OpenNLP version, i.e., version 1.8.0. In the following example, they use a Maximum Entropy Model. I am using the same input file (tweets.txt):
http://technobium.com/sentiment-analysis-using-opennlp-document-categorizer/
public class StartSentiment {
public static DoccatModel model = null;
public static String[] analyzedTexts = {"I hate Mondays!"/*, "Electricity outage, this is a nightmare"/*, "I love it"*/};
public static void main(String[] args) throws IOException {
// begin of sentiment analysis
trainModel();
for(int i=0; i<analyzedTexts.length;i++){
classifyNewText(analyzedTexts[i]);
}
}
private static String readFile(String pathname) throws IOException {
File file = new File(pathname);
StringBuilder fileContents = new StringBuilder((int)file.length());
Scanner scanner = new Scanner(file);
String lineSeparator = System.getProperty("line.separator");
try {
while(scanner.hasNextLine()) {
fileContents.append(scanner.nextLine() + lineSeparator);
}
return fileContents.toString();
} finally {
scanner.close();
}
}
public static void trainModel() {
MarkableFileInputStreamFactory dataIn = null;
try {
dataIn = new MarkableFileInputStreamFactory(
new File("bin/text.txt"));
ObjectStream<String> lineStream = null;
lineStream = new PlainTextByLineStream(dataIn, StandardCharsets.UTF_8);
ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
TrainingParameters tp = new TrainingParameters();
tp.put(TrainingParameters.CUTOFF_PARAM, "2");
tp.put(TrainingParameters.ITERATIONS_PARAM, "30");
DoccatFactory df = new DoccatFactory();
model = DocumentCategorizerME.train("en", sampleStream, tp, df);
} catch (Exception e) {
e.printStackTrace();
} finally {
if (dataIn != null) {
try {
} catch (Exception e2) {
e2.printStackTrace();
}
}
}
}
public static void classifyNewText(String text){
DocumentCategorizerME myCategorizer = new DocumentCategorizerME(model);
double[] outcomes = myCategorizer.categorize(new String[]{text});
String category = myCategorizer.getBestCategory(outcomes);
if (category.equalsIgnoreCase("1")){
System.out.print("The text is positive");
} else {
System.out.print("The text is negative");
}
}
}
In my case, no matter what input string I use, I only get a positive classification. Any idea what could be the reason?
Thanks
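One thing that may be worth checking (an assumption, not a confirmed diagnosis): as far as I can tell, categorize in OpenNLP 1.8.0 expects an array of tokens, so new String[]{text} passes the whole sentence as a single token, which is unlikely to match any feature learned from whitespace-separated training text. The stdlib-only sketch below shows the kind of splitting meant here; the class name WhitespaceTokens is hypothetical, and OpenNLP's own tokenizers would be the more natural choice in a real project.

```java
class WhitespaceTokens {

    /** Splits text on runs of whitespace; returns an empty array for blank input. */
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Each word becomes its own token instead of one long "token".
        for (String token : tokenize("I hate Mondays!")) {
            System.out.println(token);
        }
    }
}
```

The resulting array would then be passed to myCategorizer.categorize(...) in place of new String[]{text}.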

SAX getting only the end of a content string

I need to capture data from the <itunes:summary> tag, but my handler is getting only the end of the tag's content (the last three words, for example). I don't know what to do, because other tags are handled as expected, with all of their content.
I've seen that some tags are ignored by the parser, but I don't think that is happening with <itunes:summary>, because, as I said, the handler gets the content, only just the end of it.
The source XML is hosted at http://djpaulonla.podomatic.com/archive/rss2.xml
Could someone please help me?
The code is the following:
public class PodOMaticCustomHandler extends CustomHandler {
public PodOMaticCustomHandler(int quantityToFetch, String startTagValue,
String endTagValue) {
super(quantityToFetch, startTagValue, endTagValue);
}
@Override
public void characters(char[] ch, int start, int length)
throws SAXException {
super.characters(ch, start, length);
this.value = new String(ch, start, length);
}
@Override
public void endDocument() throws SAXException {
super.endDocument();
this.endDoc = true;
}
@Override
public void endElement(String uri, String localName, String qName)
throws SAXException {
super.endElement(uri, localName, qName);
if (this.podcast != null) {
if (qName.equalsIgnoreCase("title")) {
podcast.setTitle(this.value);
} else if (qName.equalsIgnoreCase("pubDate")) {
podcast.setPubDate(this.value);
} else if (qName.equalsIgnoreCase("description")) {
podcast.setContent(this.value);
} else if (qName.equalsIgnoreCase("guid")) {
this.podcast.setLink(value);
}
}
}
@Override
public void startElement(String uri, String localName, String qName,
Attributes attributes) throws SAXException {
super.startElement(uri, localName, qName, attributes);
if (this.startTagValue == null) {
this.startTagValueFound = true;
} else if (qName.equalsIgnoreCase("guid")
&& this.value.equalsIgnoreCase(this.startTagValue)) {
this.startTagValueFound = true;
}
if (this.endTagValue != null) {
if (qName.equalsIgnoreCase("guid")
&& this.value.equalsIgnoreCase(this.endTagValue)) {
this.endDoc = true;
}
}
if (!this.endDoc) {
if (this.quantityToFetch != this.podcasts.size()) {
if (this.startTagValueFound == true) {
if (qName.equalsIgnoreCase("item")) {
this.podcast = new Podcast();
} else if (qName.equalsIgnoreCase("enclosure")) {
this.podcast.setMedia(attributes.getValue("url"));
this.podcasts.add(podcast);
}
}
} else {
this.podcast = null;
}
}else{
this.podcast = null;
}
}
}
You can't rely on the characters method being called once with the entire element text; it may be called multiple times, each time with only part of the text.
Add a debug log statement to the characters method showing what you're setting value to, and you will see that value is set to the first part of the string and then overwritten with the last part.
The answer is to buffer the text passed in from the characters calls in a CharArrayWriter or StringBuilder, and to clear the buffer when the end of the element is found.
Here's what the Java tutorial on SAX has to say about the characters method:
Parsers are not required to return any particular number of characters at one time. A parser can return anything from a single character at a time up to several thousand and still be a standard-conforming implementation. So if your application needs to process the characters it sees, it is wise to have the characters() method accumulate the characters in a java.lang.StringBuffer and operate on them only when you are sure that all of them have been found.
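The buffering approach can be sketched as follows. This is a minimal, standalone handler rather than a drop-in change to PodOMaticCustomHandler: the class name SummaryHandler and its getSummary accessor are made up for illustration, and main simulates a parser that delivers one element's text in two characters calls.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SummaryHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private String summary;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        buffer.setLength(0); // new element: discard text left over from earlier ones
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        buffer.append(ch, start, length); // append, never overwrite
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equalsIgnoreCase("itunes:summary")) {
            summary = buffer.toString(); // the complete text is only known here
        }
        buffer.setLength(0);
    }

    String getSummary() {
        return summary;
    }

    public static void main(String[] args) {
        SummaryHandler handler = new SummaryHandler();
        char[] part1 = "First half, ".toCharArray();
        char[] part2 = "second half".toCharArray();
        handler.startElement("", "", "itunes:summary", null);
        handler.characters(part1, 0, part1.length);
        handler.characters(part2, 0, part2.length);
        handler.endElement("", "", "itunes:summary");
        System.out.println(handler.getSummary()); // prints "First half, second half"
    }
}
```

In the handler from the question, the same idea would mean appending in characters and reading the buffer in endElement before the setTitle, setPubDate and setContent calls.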

Tried to read incoming SMS content but getting Error in Blackberry

Hi friends, I am trying to read incoming SMS messages, but I get a warning like this: Invocation of questionable method: java.lang.String.<init>(String) found in: mypackage.MyApp$ListeningThread.run()
Here is my code:
public class MyApp extends UiApplication {
//private ListeningThread listener;
public static void main(String[] args) {
MyApp theApp = new MyApp();
theApp.enterEventDispatcher();
}
public MyApp() {
invokeAndWait(new Runnable() {
public void run() {
ListeningThread listener = new ListeningThread();
listener.start();
}
});
pushScreen(new MyScreen());
}
private static class ListeningThread extends Thread {
private boolean _stop = false;
private DatagramConnection _dc;
public synchronized void stop() {
_stop = true;
try {
_dc.close(); // Close the connection so the thread returns.
} catch (IOException e) {
System.err.println(e.toString());
}
}
public void run() {
try {
_dc = (DatagramConnection) Connector.open("sms://");
for (;;) {
if (_stop) {
return;
}
Datagram d = _dc.newDatagram(_dc.getMaximumLength());
_dc.receive(d);
String address = new String(d.getAddress());
String msg = new String(d.getData());
if(msg.startsWith("START")){
Dialog.alert("hello");
}
System.out.println("Message received: " + msg);
System.out.println("From: " + address);
System.exit(0);
}
} catch (IOException e) {
System.err.println(e.toString());
}
}
}
}
Please correct me where I am wrong. If possible, give me some code to read incoming SMS content on BlackBerry.
A few points about your code:
That invokeAndWait call to launch a thread makes no sense. It does no harm, but it is wasteful. Use that method only for UI-related operations.
You should try using "sms://:0" as the parameter for Connector.open. According to the docs, a parameter of the form {protocol}://{host}:{port} opens the connection in client mode, whereas omitting the host part, as in "sms://:0", opens it in server mode (which makes sense, since you are on the receiving side).
Finally, if you can't get it working, you could instead use the third method described in this tutorial, which you have probably already read.
The warning you quoted is complaining about the use of the String constructor that takes a String argument. Since strings are immutable in Java ME, this is just a waste. You can use the argument string directly:
Invocation of questionable method: java.lang.String.<init>(String) found in: mypackage.MyApp$ListeningThread.run()
//String address = new String(d.getAddress());
String address = d.getAddress();
// getData() returns a byte[], so this is a different constructor
// However, this leaves the character encoding unspecified, so it
// will default to cp1252, which may not be what you want
String msg = new String(d.getData());
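To make the encoding explicit, the bytes can be decoded with a named charset. The helper below is a minimal sketch: the class name SmsPayloadDecoder is hypothetical, and which charset the payload really uses is an assumption you would need to verify for your messages. It also decodes only the bytes that belong to the message, because the array returned by getData() may be larger than the received text.

```java
import java.io.UnsupportedEncodingException;

class SmsPayloadDecoder {

    /**
     * Decodes length bytes of data, starting at offset, with an explicit charset
     * instead of the platform default.
     */
    static String decode(byte[] data, int offset, int length, String charsetName) {
        try {
            return new String(data, offset, length, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName);
        }
    }
}
```

In the listening thread this would be called as decode(d.getData(), d.getOffset(), d.getLength(), "UTF-8"), with "UTF-8" standing in for whatever encoding the sender actually uses.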

Blackberry Java - Fixed length streaming a POST body over a HTTP connect

I'm working on some code that frequently POSTs large packets over HTTP to a REST server on IIS. I'm using the RIM/Java ME HttpConnection class.
As far as I can tell, HttpConnection uses an internal buffer to gather up the output stream before sending the entire contents to the server. I'm not surprised, since this is how HttpURLConnection works by default as well. (I assume it does this so that the Content-Length is set correctly.) But in Java SE I could override this behavior with the method setFixedLengthStreamingMode, so that when I call flush on the output stream it sends that "chunk" of the stream. On a phone this extra buffering is too expensive in terms of memory.
In BlackBerry Java, is there a way to do fixed-length streaming on an HTTP request when you know the content length in advance?
So, I never found a way to do this with the base API for HttpConnection. Instead, I created a socket and wrapped it with my own simple HTTP client, which did support chunking.
Below is the prototype I used and tested on BB 7.0.
package mypackage;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import javax.microedition.io.Connector;
import javax.microedition.io.SocketConnection;
public class MySimpleHTTPClient{
SocketConnection sc;
String HttpHeader;
OutputStreamWriter outWriter;
InputStreamReader inReader;
public void init(
String Host,
String port,
String path,
int ContentLength,
String ContentType ) throws IllegalArgumentException, IOException
{
String _host = (new StringBuffer())
.append("socket://")
.append(Host)
.append(":")
.append(port).toString();
sc = (SocketConnection)Connector.open(_host );
sc.setSocketOption(SocketConnection.LINGER, 5);
StringBuffer _header = new StringBuffer();
//Setup the HTTP Header.
_header.append("POST ").append(path).append(" HTTP/1.1\r\n");
_header.append("Host: ").append(Host).append("\r\n");
_header.append("Content-Length: ").append(ContentLength).append("\r\n");
_header.append("Content-Type: ").append(ContentType).append("\r\n");
_header.append("Connection: Close\r\n\r\n");
HttpHeader = _header.toString();
}
public void openOutputStream() throws IOException{
if(outWriter != null)
return;
outWriter = new OutputStreamWriter(sc.openOutputStream());
outWriter.write( HttpHeader, 0 , HttpHeader.length() );
}
public void openInputStream() throws IOException{
if(inReader != null)
return;
inReader = new InputStreamReader(sc.openDataInputStream());
}
public void writeChunkToServer(String Chunk) throws Exception{
if(outWriter == null){
try {
openOutputStream();
} catch (IOException e) {e.printStackTrace();}
}
outWriter.write(Chunk, 0, Chunk.length());
}
public String readFromServer() throws IOException {
if(inReader == null){
try {
openInputStream();
} catch (IOException e) {e.printStackTrace();}
}
StringBuffer sb = new StringBuffer();
int data = inReader.read();
//Note :: This will also read the HTTP headers..
// If you need to parse the headers, tokenize on \r\n for each
// header, the header section is done when you see \r\n\r\n
while(data != -1){
sb.append( (char)data );
data = inReader.read();
}
return sb.toString();
}
public void close(){
if(outWriter != null){
try {
outWriter.close();
} catch (IOException e) {}
}
if(inReader != null){
try {
inReader.close();
} catch (IOException e) {}
}
if(sc != null){
try {
sc.close();
} catch (IOException e) {}
}
}
}
Here is example usage for it:
MySimpleHTTPClient myConn = new MySimpleHTTPClient() ;
String chunk1 = "ID=foo&data1=1234567890&chunk1=0|";
String chunk2 = "ID=foo2&data2=123444344&chunk1=1";
try {
myConn.init(
"pdxsniffe02.webtrends.corp",
"80",
"TableAdd/234234234443?debug=1",
chunk1.length() + chunk2.length(),
"application/x-www-form-urlencoded"
);
myConn.writeChunkToServer(chunk1);
// The first chunk is already on its way.
myConn.writeChunkToServer(chunk2);
System.out.println( myConn.readFromServer() );
} catch (IllegalArgumentException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}finally{
myConn.close();
}
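One caveat about the usage example: Content-Length counts the bytes of the body, while String.length() counts characters, and the two agree only when every character encodes to a single byte. The sketch below computes the length from the encoded bytes and builds a header in the same shape as init(). It is written against Java SE (StandardCharsets is not part of the BlackBerry API), the class name PostHeaderSketch is made up for illustration, and UTF-8 is an assumed body encoding.

```java
import java.nio.charset.StandardCharsets;

class PostHeaderSketch {

    /** Total size of the body in bytes once each chunk is encoded as UTF-8. */
    static int contentLength(String... chunks) {
        int total = 0;
        for (String chunk : chunks) {
            total += chunk.getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Builds the request line and headers, ending with the blank line that precedes the body. */
    static String buildHeader(String host, String path, int contentLength, String contentType) {
        return new StringBuilder()
                .append("POST ").append(path).append(" HTTP/1.1\r\n")
                .append("Host: ").append(host).append("\r\n")
                .append("Content-Length: ").append(contentLength).append("\r\n")
                .append("Content-Type: ").append(contentType).append("\r\n")
                .append("Connection: Close\r\n\r\n")
                .toString();
    }
}
```

For the byte count to hold, the writer has to encode the body with the same charset, for example by constructing the OutputStreamWriter with an explicit "UTF-8" argument.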

Xps printing from windows service

I'm trying to print XPS documents from a Windows service on the .NET Framework. Since Microsoft does not support printing from a service with System.Drawing.Printing or with System.Printing (WPF), I'm using the native XpsPrint API.
This was suggested to me by Aspose in http://www.aspose.com/documentation/.net-components/aspose.words-for-.net/howto-print-a-document-on-a-server-via-the-xpsprint-api.html.
When I try to print an XPS document from a Windows service, the result contains strange characters instead of the text I want.
I tried different printers (including virtual printers such as PDFCreator), different users and user privileges for the service, different XPS generators (Aspose, Word 2007, Word 2010), and different platforms (Windows 7, Windows 2008 R2), but all give the same result.
Does anybody know how to solve this? Any help would be appreciated!
For those who want to try it, I shared some files via:
https://docs.google.com/leaf?id=0B4J93Ly5WzQKNWU2ZjM0MDYtMjFiMi00NzM0LTg4MTgtYjVlNDA5NWQyMTc3&hl=nl
document.xps: the XPS document to print
document_printed_to_pdfcreator.pdf: the printed document that demonstrates what is going wrong
XpsPrintTest.zip: a sample VS2010 solution with the sample code
The sample code for the managed windows service is:
using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Diagnostics;
using System.Linq;
using System.ServiceProcess;
using System.Text;
using System.IO;
using System.Threading;
using System.Runtime.InteropServices;
namespace PrintXpsService
{
public partial class XpsPrintService : ServiceBase
{
// Change name of printer here
private String f_printerName = "PDFCreator";
// path to some file where logging is done
private String f_logFile = @"C:\temp\testdoc\xps_printing_service_log.txt";
// path to xps file to print
private String f_xpsFile = @"C:\temp\testdoc\document.xps";
public XpsPrintService()
{
InitializeComponent();
}
private void Log(String fmt, params Object[] args)
{
try
{
DateTime now = DateTime.Now;
using (StreamWriter wrt = new StreamWriter(f_logFile, true))
{
wrt.Write("{0} {1} - ", now.ToShortDateString(), now.ToShortTimeString());
wrt.WriteLine(fmt, args);
}
}
catch (Exception ex)
{
}
}
protected override void OnStart(string[] args)
{
// uncomment to allow to connect debugger
//int i = 0;
//while (i == 0)
//{
// if (i == 0)
// {
// Thread.Sleep(1000);
// }
//}
Log("Starting Service");
try
{
Log("Printing xps file {0}", f_xpsFile);
using (Stream stream = new FileStream(f_xpsFile, FileMode.Open, FileAccess.Read))
{
Log("Starting to print on printer {0}", f_printerName);
String jobName = f_xpsFile;
this.Print(stream, jobName);
}
Log("Document printed");
}
catch (Exception ex)
{
Log("Exception during execution: {0}", ex.Message);
Log(" {0}", ex.StackTrace);
Exception inner = ex.InnerException;
while (inner != null)
{
Log("=== Inner Exception: {0}", inner.Message);
Log(" {0}", inner.StackTrace);
inner = inner.InnerException;
}
}
}
protected override void OnStop()
{
}
public void Print(Stream stream, String jobName)
{
String printerName = f_printerName;
IntPtr completionEvent = CreateEvent(IntPtr.Zero, true, false, null);
try
{
IXpsPrintJob job;
IXpsPrintJobStream jobStream;
StartJob(printerName, jobName, completionEvent, out job, out jobStream);
CopyJob(stream, job, jobStream);
WaitForJob(completionEvent, -1);
CheckJobStatus(job);
}
finally
{
if (completionEvent != IntPtr.Zero)
CloseHandle(completionEvent);
}
}
private void StartJob(String printerName,
String jobName, IntPtr completionEvent,
out IXpsPrintJob job,
out IXpsPrintJobStream jobStream)
{
int result = StartXpsPrintJob(printerName, jobName, null, IntPtr.Zero, completionEvent,
null, 0, out job, out jobStream, IntPtr.Zero);
if (result != 0)
throw new Win32Exception(result);
}
private void CopyJob(Stream stream, IXpsPrintJob job, IXpsPrintJobStream jobStream)
{
try
{
byte[] buff = new byte[4096];
while (true)
{
uint read = (uint)stream.Read(buff, 0, buff.Length);
if (read == 0)
break;
uint written;
jobStream.Write(buff, read, out written);
if (read != written)
throw new Exception("Failed to copy data to the print job stream.");
}
// Indicate that the entire document has been copied.
jobStream.Close();
}
catch (Exception)
{
// Cancel the job if we had any trouble submitting it.
job.Cancel();
throw;
}
}
private void WaitForJob(IntPtr completionEvent, int timeout)
{
if (timeout < 0)
timeout = -1;
switch (WaitForSingleObject(completionEvent, timeout))
{
case WAIT_RESULT.WAIT_OBJECT_0:
// Expected result, do nothing.
break;
case WAIT_RESULT.WAIT_TIMEOUT:
// timeout expired
throw new Exception("Timeout expired");
case WAIT_RESULT.WAIT_FAILED:
throw new Exception("Wait for the job to complete failed");
default:
throw new Exception("Unexpected result when waiting for the print job.");
}
}
private void CheckJobStatus(IXpsPrintJob job)
{
XPS_JOB_STATUS jobStatus;
job.GetJobStatus(out jobStatus);
switch (jobStatus.completion)
{
case XPS_JOB_COMPLETION.XPS_JOB_COMPLETED:
// Expected result, do nothing.
break;
case XPS_JOB_COMPLETION.XPS_JOB_IN_PROGRESS:
// expected, do nothing, can occur when printer is paused
break;
case XPS_JOB_COMPLETION.XPS_JOB_FAILED:
throw new Win32Exception(jobStatus.jobStatus);
default:
throw new Exception("Unexpected print job status.");
}
}
[DllImport("XpsPrint.dll", EntryPoint = "StartXpsPrintJob")]
private static extern int StartXpsPrintJob(
[MarshalAs(UnmanagedType.LPWStr)] String printerName,
[MarshalAs(UnmanagedType.LPWStr)] String jobName,
[MarshalAs(UnmanagedType.LPWStr)] String outputFileName,
IntPtr progressEvent, // HANDLE
IntPtr completionEvent, // HANDLE
[MarshalAs(UnmanagedType.LPArray)] byte[] printablePagesOn,
UInt32 printablePagesOnCount,
out IXpsPrintJob xpsPrintJob,
out IXpsPrintJobStream documentStream,
IntPtr printTicketStream); // This is actually "out IXpsPrintJobStream", but we don't use it and just want to pass null, hence IntPtr.
[DllImport("Kernel32.dll", SetLastError = true)]
private static extern IntPtr CreateEvent(IntPtr lpEventAttributes, bool bManualReset, bool bInitialState, string lpName);
[DllImport("Kernel32.dll", SetLastError = true, ExactSpelling = true)]
private static extern WAIT_RESULT WaitForSingleObject(IntPtr handle, Int32 milliseconds);
[DllImport("Kernel32.dll", SetLastError = true)]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool CloseHandle(IntPtr hObject);
}
/// <summary>
/// This interface definition is HACKED.
///
/// It appears that the IID for IXpsPrintJobStream specified in XpsPrint.h as
/// MIDL_INTERFACE("7a77dc5f-45d6-4dff-9307-d8cb846347ca") is not correct and the RCW cannot return it.
/// But the returned object returns the parent ISequentialStream interface successfully.
///
/// So the hack is that we obtain the ISequentialStream interface but work with it as
/// with the IXpsPrintJobStream interface.
/// </summary>
[Guid("0C733A30-2A1C-11CE-ADE5-00AA0044773D")] // This is the IID of ISequentialStream.
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IXpsPrintJobStream
{
// ISequentialStream methods.
void Read([MarshalAs(UnmanagedType.LPArray)] byte[] pv, uint cb, out uint pcbRead);
void Write([MarshalAs(UnmanagedType.LPArray)] byte[] pv, uint cb, out uint pcbWritten);
// IXpsPrintJobStream methods.
void Close();
}
[Guid("5ab89b06-8194-425f-ab3b-d7a96e350161")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IXpsPrintJob
{
void Cancel();
void GetJobStatus(out XPS_JOB_STATUS jobStatus);
}
[StructLayout(LayoutKind.Sequential)]
struct XPS_JOB_STATUS
{
public UInt32 jobId;
public Int32 currentDocument;
public Int32 currentPage;
public Int32 currentPageTotal;
public XPS_JOB_COMPLETION completion;
public Int32 jobStatus; // UInt32
};
enum XPS_JOB_COMPLETION
{
XPS_JOB_IN_PROGRESS = 0,
XPS_JOB_COMPLETED = 1,
XPS_JOB_CANCELLED = 2,
XPS_JOB_FAILED = 3
}
enum WAIT_RESULT
{
WAIT_OBJECT_0 = 0,
WAIT_ABANDONED = 0x80,
WAIT_TIMEOUT = 0x102,
WAIT_FAILED = -1 // 0xFFFFFFFF
}
}
Note: some links for more information:
MS not supporting printing from managed code: http://support.microsoft.com/kb/324565 , http://msdn.microsoft.com/en-us/library/system.drawing.printing.aspx and http://msdn.microsoft.com/en-us/library/bb613549.aspx
XPSPrint API: http://msdn.microsoft.com/en-us/library/dd374565(VS.85).aspx
I talked with Microsoft about this issue, and we discovered that the problem is related to incorrect font substitution in the print spooler. When the printer is set to not spool documents, they are printed correctly, even from a Windows service. Otherwise, all fonts except Arial (and maybe some others) are substituted by another font. In the sample I provided, Calibri is substituted by Wingdings.
So they acknowledge this to be a bug, but at the moment they will not resolve it. Whether or not they are willing to fix it will depend on how many people are affected by this bug.
