I am using the following implementation in a Java servlet:
String url = "http://mydomain.com/test.php?myparam="+myname;
Document doc = null;
try {
doc = Jsoup.connect(url).get();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
Here, myname is a String in a UTF charset.
For some reason the result I receive is garbled (unreadable characters).
Is there a way to force the URL in Jsoup to be UTF-encoded as well?
Thanks
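Try this (encode only the parameter value; encoding the whole URL would also escape the ://, / and ? characters):
url = "http://mydomain.com/test.php?myparam=" + URLEncoder.encode(myname, "UTF-8");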
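As a minimal sketch of the idea, the helper below percent-encodes only the parameter value as UTF-8 and leaves the rest of the URL untouched. The class name QueryParamEncoder, the method buildUrl, and the sample value are hypothetical and only serve the illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class QueryParamEncoder {
    // Hypothetical helper: encodes only the parameter value, never the whole URL.
    static String buildUrl(String base, String paramName, String value) {
        try {
            return base + "?" + paramName + "=" + URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample value with non-ASCII characters (e-acute, o-umlaut) and a space.
        String myname = "h\u00e9llo w\u00f6rld";
        System.out.println(buildUrl("http://mydomain.com/test.php", "myparam", myname));
    }
}
```

The resulting string can then be passed to Jsoup.connect(url) as in the question. Note that URLEncoder writes a space as +, which is the form encoding that a PHP query string parser decodes.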
I have been trying to implement an application that determines the content type of any file. I use Apache Tika for the detection.
Here is a basic code implementation for that:
InputStream fileStream = ContentTypeController.class.getClassLoader().getResourceAsStream(fileName);
Tika tika = new Tika();
String contentType = null;
try {
contentType = tika.detect(fileStream);
} catch (IOException e) {
e.printStackTrace();
}
Instead of the code above, I have to download files from OpenStack to determine their content type. Some files are larger than 100 GB, and downloading an entire file is expensive.
I cannot figure out how to avoid downloading the whole file. I hope you have an idea or solution that does not require it.
Tika can determine the content type of a file without downloading all of it if you pass a URL to the detect() method.
Tika tika = new Tika();
String contentType = null;
try {
contentType = tika.detect(new URL("a url"));
} catch (IOException e) {
e.printStackTrace();
}
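The reason a full download is not needed is that content sniffing mostly looks at the leading bytes of a stream. The sketch below illustrates that idea with plain JDK classes; it is not Tika's implementation, and the class PrefixSniffer, its methods, and the two signatures it knows are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixSniffer {
    // Reads at most `limit` bytes from the start of the stream, then stops.
    static byte[] readPrefix(InputStream in, int limit) {
        byte[] buffer = new byte[limit];
        int total = 0;
        try {
            while (total < limit) {
                int n = in.read(buffer, total, limit - total);
                if (n < 0) {
                    break; // stream ended before the limit was reached
                }
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Arrays.copyOf(buffer, total);
    }

    private static boolean startsWith(byte[] data, byte[] signature) {
        if (data.length < signature.length) {
            return false;
        }
        for (int i = 0; i < signature.length; i++) {
            if (data[i] != signature[i]) {
                return false;
            }
        }
        return true;
    }

    // Compares the prefix against two well-known file signatures.
    static String sniff(byte[] prefix) {
        byte[] pdf = "%PDF-".getBytes(StandardCharsets.US_ASCII);
        byte[] png = {(byte) 0x89, 'P', 'N', 'G'};
        if (startsWith(prefix, pdf)) {
            return "application/pdf";
        }
        if (startsWith(prefix, png)) {
            return "image/png";
        }
        return "application/octet-stream";
    }

    public static void main(String[] args) {
        byte[] data = "%PDF-1.7 ... rest of a large file".getBytes(StandardCharsets.US_ASCII);
        byte[] prefix = readPrefix(new ByteArrayInputStream(data), 8);
        System.out.println(sniff(prefix));
    }
}
```

With a remote object store, the same readPrefix call could be applied to the download stream and the stream closed afterwards, so only the first few bytes are transferred.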
How do I validate a text response using REST Assured?
Basically, I have downloaded a file in CSV format, and the response comes back as text. Any suggestions on how to validate the column headers in that text?
I found the answer.
try {
CsvSchema bootstrapSchema = CsvSchema.emptySchema().withHeader();
File file = new File(fileName) ;
MappingIterator<T> readValues = mapper.readerFor(type).with(bootstrapSchema).readValues(file);
return readValues.readAll();
} catch (Exception e) {
log.error("Error occurred while loading object list from file :{} with }
This uses the Jackson CSV dataformat dependency (jackson-dataformat-csv).
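If only the column headers need checking, a lighter option is to look at the first line of the response text directly. This is a minimal sketch with plain JDK classes, assuming the header cells contain no quoted commas; the class CsvHeaderCheck, its methods, and the sample columns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CsvHeaderCheck {
    // Returns the trimmed cells of the first line of the CSV text.
    // Assumption: header names contain no quoted commas.
    static List<String> headerColumns(String csvText) {
        String firstLine = csvText.split("\\r?\\n", 2)[0];
        List<String> columns = new ArrayList<>();
        for (String cell : firstLine.split(",", -1)) {
            columns.add(cell.trim());
        }
        return columns;
    }

    // True when the header row matches the expected names in the same order.
    static boolean hasHeaders(String csvText, List<String> expected) {
        return headerColumns(csvText).equals(expected);
    }

    public static void main(String[] args) {
        // With REST Assured, this text would come from response.asString().
        String body = "id,name,email\n1,Ann,ann@example.org\n";
        System.out.println(hasHeaders(body, Arrays.asList("id", "name", "email")));
    }
}
```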
I am trying to write a function that takes an input URL of any Stack Overflow link, gets the source code of the page, parses it, gets the accepted answer, and also gets the answer with the most upvotes.
I am new to this and I don't know how to do this. This is what I've tried out. It just returns the first answer using jsoup.
protected void doHtmlParse(String url) {
// TODO Auto-generated method stub
Document doc;
try {
doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
.referrer("http://www.google.com")
.get();
Element answer = doc.select("td[class=answercell]").get(0);
System.out.println("Answer is \n" + answer.toString());
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
I only need to display the answer part, but it has to be the accepted answer. How do I approach this?
You don't really need to parse HTML. Use their REST API.
Have a look.
Here's an example. Note the is_accepted attribute.
EDIT:
Well, after you've got the chosen answer through the API, you could do this:
String answer = document.getElementById("answer-"+id).outerHtml();
I am now able to get the accepted answer via this code.
protected void doHtmlParse(String url) {
// TODO Auto-generated method stub
Document doc;
try {
doc = Jsoup.connect(url).userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
.referrer("http://www.google.com")
.get();
Element answer = doc.select("div[class=answer accepted-answer]").first();
Elements tds = answer.getElementsByTag("td");
for(Element td : tds) {
String clasname = td.attr("class");
if(clasname.equals("answercell")) {
System.out.println("\n\nAccepted answer is \n" + td.text());
}
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
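The question also asked for the answer with the most upvotes. Once the answers are available as objects (for example, mapped from the is_accepted and score fields that the API returns for each answer), both picks are short. This is only a sketch: the classes AnswerPicker and Answer are hypothetical, and fetching and JSON parsing are left out.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class AnswerPicker {
    // Hypothetical value class holding the fields needed for the selection.
    static final class Answer {
        final long id;
        final int score;
        final boolean accepted;

        Answer(long id, int score, boolean accepted) {
            this.id = id;
            this.score = score;
            this.accepted = accepted;
        }
    }

    // The accepted answer, if the question has one.
    static Optional<Answer> accepted(List<Answer> answers) {
        return answers.stream().filter(a -> a.accepted).findFirst();
    }

    // The answer with the highest score, if there are any answers.
    static Optional<Answer> topVoted(List<Answer> answers) {
        return answers.stream().max(Comparator.comparingInt((Answer a) -> a.score));
    }

    public static void main(String[] args) {
        List<Answer> answers = Arrays.asList(
                new Answer(1L, 4, false),
                new Answer(2L, 7, true),
                new Answer(3L, 12, false));
        accepted(answers).ifPresent(a -> System.out.println("accepted id: " + a.id));
        topVoted(answers).ifPresent(a -> System.out.println("top voted id: " + a.id));
    }
}
```

The accepted answer and the top-voted answer can differ, as in the sample data, so both results are returned as separate Optional values.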
I have been breaking my head trying to get this straight. It looks pretty simple, but I have not been able to figure out why it fails. Any help would be very much appreciated.
Here is my XML file:
<?xml version="1.0" encoding="UTF-8"?>
<User mode="Retrieve" simCardNumber=“9602875089237652" softwareVersion=“9" phoneManufacturer=“Nokia" phoneModel="I747" deviceId=“562372389498734" networkOperator=“Blu">
<Errors>
<Error number="404"/>
</Errors>
</User>
private static Document convertStringToDocument(String xmlStr) {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
try
{
DocumentBuilder builder =factory.newDocumentBuilder();
//The below statement fails and jumps to return null
//Document doc = builder.parse( new InputSource(new StringReader(xmlStr)));
//Adding replace method on the string to handle the strange looking double quote on the xml string. However I still get the same error.
Document doc = builder.parse( new InputSource(new StringReader(xmlStr.replace("“", "\'\""))));
return doc;
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
Check the quotes:
networkOperator=“Blu"
I don't know whether it is a paste error, but you used “ instead of " in your XML. The first one is often used in rich text editors as an opening quote; you need to change it manually to make the XML parseable.
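If the XML cannot be fixed at its source, the typographic quotes can be normalized before parsing. This is a minimal sketch, assuming curly double quotes never appear as legitimate text content (the replacement is global); the class QuoteFixingParser and its methods are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class QuoteFixingParser {
    // Replaces left and right curly double quotes with the ASCII quote.
    static String normalizeQuotes(String xml) {
        return xml.replace('\u201C', '"').replace('\u201D', '"');
    }

    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(normalizeQuotes(xml))));
        } catch (Exception e) {
            // Surface the parser error instead of silently returning null.
            throw new IllegalArgumentException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<User mode=\"Retrieve\" networkOperator=\u201CBlu\">"
                + "<Errors><Error number=\"404\"/></Errors></User>";
        Document doc = parse(xml);
        System.out.println(doc.getDocumentElement().getAttribute("networkOperator"));
    }
}
```

Throwing on failure, rather than returning null as in the question, keeps the real parser message visible to the caller.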
Ok this solution works. Thanks everyone for your time and support.
Document doc = null;
try
{
DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
InputSource is = new InputSource();
is.setCharacterStream(new StringReader(xmlStr));
doc = db.parse(is);
} catch (Exception e) {
e.printStackTrace();
}
return doc;
I have a URL to encode on my Java server and then decode with JavaScript.
I am trying to retrieve a String that I send as a parameter from Java. It is an error message from a form validation function.
I do it like this (server side; Worker.doValidateForm() returns a String):
response.sendRedirect(URLEncoder.encode("form.html?" + Worker.doValidateForm(), "ISO-8859-1"));
Then, in my JavaScript, I do this:
function retrieveParam() {
var error = window.location.search;
decodeURIComponent(error);
if(error)
alert(error);
}
Of course it doesn't work. The encodings don't match, I guess.
So my question is: which method can I use in Java so that I can decode my URL with JavaScript?
It's OK! I have found a solution.
Server side with Java:
URI uri = null;
try {
uri = new URI("http", "localhost:8080", "/PrizeWheel/form.html", Worker.doValidateForm(), null);
} catch (URISyntaxException e) {
this.log.error("class Worker / method doPost:", e); // Just writing the error in my log file
}
String url = uri.toASCIIString();
response.sendRedirect(url);
And in the JavaScript (function called in the onload of the redirected page):
function retrieveParam() {
var error = decodeURI(window.location.search).substring(1);
if(error)
alert(error);
}
URLEncoder is not meant for encoding whole URLs; it encodes form data to the application/x-www-form-urlencoded MIME format. Use java.net.URI instead, see http://contextroot.blogspot.fi/2012/04/encoding-urls-in-java-is-quite-trivial.html
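A minimal sketch of the java.net.URI approach follows. The five-argument constructor quotes characters that are illegal in each component, so a space in the query becomes %20, which decodeURI in JavaScript reverses. The host and path are taken from the solution above; the class RedirectUrlBuilder and the sample message are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class RedirectUrlBuilder {
    // Builds the redirect target with the message as the raw query string.
    static String build(String message) {
        try {
            URI uri = new URI("http", "localhost:8080", "/PrizeWheel/form.html", message, null);
            return uri.toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid redirect URL", e);
        }
    }

    public static void main(String[] args) {
        // Sample validation message; a real one would come from the form check.
        System.out.println(build("Name is required"));
    }
}
```

Characters that are legal in a query, such as & and =, are left as they are, so a message containing them may need extra handling.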