How can I connect to a website and grab the HTML into a string? I would like to do this behind the scenes of my application. I want to parse this information in a later screen.
As a starting point check the RIM documentation on HttpConnection (scroll to "Example using HttpConnection").
The example reads the response as a byte array, but if you are comfortable with Java SE it is easy to change it to produce a String.
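As a rough illustration of that byte-array-to-String step, here is a minimal sketch. The class and method names (ResponseText, readAll) are made up for this example, and it assumes the page is UTF-8 encoded; it is not taken from the RIM documentation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class ResponseText {
    // Drains the stream into memory and decodes the collected bytes as text.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        // Assumption: the page is UTF-8; use the charset from Content-Type if it differs.
        return new String(buffer.toByteArray(), "UTF-8");
    }
}
```

You would pass it the InputStream opened from the connection and close the stream and connection afterwards.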
Another point is to use a proper transport (BIS, BES, TCP, WiFi, etc. - it should be usable on the particular device). For transport detection you can check this.
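import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;

// This version targets Java SE.
public static String getContentsFrom(String urlString) throws IOException {
    URL url = new URL(urlString);
    StringBuilder content = new StringBuilder();
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader in = new BufferedReader(
            new InputStreamReader(url.openStream(), "UTF-8"))) { // assumes a UTF-8 page
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            // readLine() strips the line terminator, so put one back
            content.append(inputLine).append('\n');
        }
    }
    return content.toString();
}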
I want to use the Java SDK of MS Graph to look up raw content (/$value endpoint) of the attachment.
For /messages resources, I know that the raw content can be retrieved in the following way.
@Test
void downloadEmlFile() {
    // Left empty here; assume real IDs are supplied.
    String userId = "";
    String messageId = "";
    InputStream inputStream =
        graphServiceClient.users(userId).messages(messageId)
            .content()
            .buildRequest()
            .get();
}
As you can see above, MS Graph SDK provides an api called .content() for messages.
However, it seems that they do not provide the corresponding api for the attachment resource.
In summary, there are two things I am curious about:
Why don't you provide an API like .content() for the attachment resource?
How do I get raw content (/$value) for an attachment?
I think you need to use the FileAttachmentRequestBuilder which has the content method.
@Test
void downloadEmlFile() {
    // Left empty here; assume real IDs are supplied.
    String userId = "";
    String messageId = "";
    String attachmentId = "";
    String requestUrl = graphServiceClient.users(userId).messages(messageId)
        .attachments(attachmentId)
        .getRequestUrl();
    FileAttachmentRequestBuilder fileAttachReqBuilder = new FileAttachmentRequestBuilder(
        requestUrl,
        graphServiceClient,
        null);
    InputStream inputStream = fileAttachReqBuilder
        .content()
        .buildRequest()
        .get();
}
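Once you have the InputStream, you will probably want to write it somewhere. A minimal sketch of that step, using only the standard library; AttachmentSaver and save are names invented for this example and are not part of the Graph SDK:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AttachmentSaver {
    // Copies the stream to `target`, replacing any existing file, and returns the byte count.
    static long save(InputStream content, Path target) throws IOException {
        // try-with-resources closes the stream once the copy finishes or fails
        try (InputStream in = content) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

The stream returned by .get() would be passed as the first argument, together with the path you want to write to.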
I am trying to send a POST request with the new JxBrowser version. Unfortunately, the data in the body is not handed over.
I guess I am just not using JxBrowser 7 properly.
GET Request does work.
// Post Request
protected void postRequestFromScout(JxBrowserEvent event) {
String url = event.getUrl();
Map<String, String> postData = event.getPostData();
getBrowser().navigation().loadUrl(LoadRequest.newBuilder()
.setUrl(url)
.setPostData(toPostDataString(postData))
.build());
}
// data in POST Request Body as String
protected String toPostDataString(Map<String, String> postData) {
StringBuilder sb = new StringBuilder();
for (Entry<String, String> entry : postData.entrySet()) {
sb
.append(entry.getKey())
.append("=")
.append(IOUtility.urlEncode(entry.getValue()))
.append("&");
}
sb.deleteCharAt(sb.length() - 1);
return sb.toString();
}
I obviously need to hand over the data in this way:
LoadUrlParams.newBuilder(url)
.postData(toPostDataString(postData))
.build();
As we are using a compiler based on Java 7 in our project, this is not a solution for me right now, and I will look for another one if possible. It does work when used with Java 8, though.
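Independently of the JxBrowser call, note that toPostDataString from the question throws a StringIndexOutOfBoundsException for an empty map and does not URL-encode the keys. Here is a Java 7 compatible sketch of building the body with only the standard library; FormBody is a made-up name, and it uses java.net.URLEncoder in place of IOUtility.urlEncode:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Map;

final class FormBody {
    // Builds an application/x-www-form-urlencoded body; returns "" for an empty map.
    static String encode(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : fields.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&'); // separator only between pairs, so nothing to trim afterwards
            }
            sb.append(urlEncode(entry.getKey()))
              .append('=')
              .append(urlEncode(entry.getValue()));
        }
        return sb.toString();
    }

    private static String urlEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting string can be passed wherever toPostDataString(postData) is used above.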
Suppose I have a C# MVC app with a controller method that returns one of three content types: image/png, image/jpeg, or application/pdf. I have read that it is possible for images to contain XSS payloads. What would be the best way to encode/escape the returned content so that it isn't vulnerable to XSS? The controller method looks like this:
string contentType = "image/png";
MemoryStream mem = new MemoryStream();
if (ImageFormat == null || ImageFormat == "")
{
image.Save(mem, System.Drawing.Imaging.ImageFormat.Png);
}
else
{
if (ImageFormat.ToUpper() == "PNG") image.Save(mem, System.Drawing.Imaging.ImageFormat.Png);
if (ImageFormat.ToUpper() == "JPEG")
{
image.Save(mem, System.Drawing.Imaging.ImageFormat.Jpeg);
contentType = "image/jpeg";
}
}
mem.Position = 0;
mem.Seek(0, SeekOrigin.Begin);
return this.Image(mem, contentType);
where Image is defined in the following class:
using …
namespace x.Classes
{
public static class ControllerExtensions
{
public static ImageResult Image(this Controller controller, Stream imageStream, string contentType)
{
return new ImageResult(imageStream, contentType);
}
}
}
And the OutputStream is written to using:
using …
namespace x.Classes
{
public class ImageResult : ActionResult
{
public ImageResult(Stream imageStream, string contentType)
{
if (imageStream == null)
throw new ArgumentNullException("imageStream");
if (contentType == null)
throw new ArgumentNullException("contentType");
this.ImageStream = imageStream;
this.ContentType = contentType;
}
public Stream ImageStream { get; private set; }
public string ContentType { get; private set; }
public override void ExecuteResult(ControllerContext context)
{
if (context == null)
throw new ArgumentNullException("context");
HttpResponseBase response = context.HttpContext.Response;
response.ContentType = this.ContentType;
byte[] buffer = new byte[4096];
while (true)
{
int read = this.ImageStream.Read(buffer, 0, buffer.Length);
if (read == 0)
break;
response.OutputStream.Write(buffer, 0, read);
}
response.End();
}
}
}
Is there a way for me to escape/encode the buffer that is getting written to the OutputStream here:
response.OutputStream.Write(buffer, 0, read);
To protect against XSS attacks? For example if this were HTML that was being returned:
response.OutputStream.Write(HttpUtility.HtmlEncode(buffer), 0, read);
But we know we are returning a jpeg, pdf, or png which means Html encode won't work here. So what do we use to safely escape/encode an image/pdf?
By the time you have buffer ready, it's too late. The same as with HTML, you want to context-sensitively encode any user input in those files, not the whole thing.
Now, with images this doesn't make much sense in the context of XSS, an image is rendered by an image renderer, and not as html, so there won't be any javascript to be run. The general best practice for uploaded images is to process them on the server and save them as a new image, because this removes all unnecessary things, but it has its risks as well if your processor itself is the target of an attack.
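To make the "process and save as a new image" idea concrete, here is a small sketch in Java (the question is C#, but the principle is the same). ImageSanitizer and reencodeAsPng are names I made up; it assumes javax.imageio is available and that PNG output is acceptable. As said above, the decoder itself then becomes the thing an attacker may target.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import javax.imageio.ImageIO;

final class ImageSanitizer {
    // Decodes the upload and writes the decoded pixels out again as a fresh PNG.
    static byte[] reencodeAsPng(byte[] upload) throws IOException {
        BufferedImage decoded = ImageIO.read(new ByteArrayInputStream(upload));
        if (decoded == null) {
            // ImageIO.read returns null when no registered reader recognises the data.
            throw new IOException("Not a supported image");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(decoded, "png", out);
        return out.toByteArray();
    }
}
```

Anything that is not decodable as an image is rejected outright, and what you store is a file produced by your own encoder rather than the uploaded bytes.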
SVG, for example, is a different beast: SVG can have code in it, as can PDF. But again, PDFs will be opened on the client with a PDF viewer, not in the context of the web application, even if the PDF viewer is the browser itself (the browser hopefully separates JavaScript in the PDF from the web page even if the origin is the same).
But JavaScript in a PDF can still be an issue for the client. JavaScript running in a PDF may do harmful things, the simplest of which is consuming client resources (i.e. a DoS of some sort), or it may try to break out of the PDF context by exploiting a viewer vulnerability. So the attack would be that one user uploads a malicious PDF for others to download. I think the best you can do against this is to scan uploaded files for malware (which you should do anyway).
If you are generating all of this from user input (images, PDFs), then the libraries you use should take care of properly encoding values so that a malicious user can't inject code in a PDF. When the PDF is already generated, you can't "fix" it anymore, user input is mixed with code.
Also make sure to set the following header in responses (along with the correct Content-Type of course):
X-Content-Type-Options: nosniff
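Since nosniff makes the browser trust your Content-Type, it can be worth checking that the bytes really are what you declare. A rough sketch in Java, comparing the well-known file signatures of PNG, JPEG and PDF; ContentTypeCheck is a hypothetical helper, and a signature match is only a sanity check, not proof that the file is harmless:

```java
final class ContentTypeCheck {
    private static final byte[] PNG  = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};
    private static final byte[] PDF  = {'%', 'P', 'D', 'F', '-'};

    // Returns true only if the body starts with the signature of the declared type.
    static boolean matches(String contentType, byte[] body) {
        switch (contentType) {
            case "image/png":       return startsWith(body, PNG);
            case "image/jpeg":      return startsWith(body, JPEG);
            case "application/pdf": return startsWith(body, PDF);
            default:                return false; // unknown types are refused
        }
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```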
You do not need to encode the images themselves, you need to encode/escape the links to the images.
For example:
<a href="image.url.png?logout">Link Title</a>
where image.url.png?logout comes from user input.
You would url encode image.url.png?logout as image.url.png%3Flogout so that it is rendered useless to an attacker.
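For illustration, this is roughly what that encoding step looks like in Java with the standard URLEncoder (LinkEncoder is just a name for this sketch). Keep in mind that URLEncoder does form-style encoding, so spaces become "+" rather than "%20".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class LinkEncoder {
    // Percent-encodes user input before it is placed into a link.
    static String encodeForUrl(String userInput) {
        try {
            return URLEncoder.encode(userInput, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling encodeForUrl("image.url.png?logout") returns "image.url.png%3Flogout".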
I need to access the encoded stream in OpenRasta before it gets sent to the client. I have tried using a PipelineContributor and registering it before KnownStages.IEnd, after KnownStages.IOperationExecution, and after KnownStages.IResponseCoding, but in all instances the context.Response.Entity stream is null or empty.
Anyone know how I can do this?
Also, I want to find out the requested codec fairly early on, yet when I register after KnownStages.ICodecRequestSelection it returns null. I get the feeling I am missing something about these pipeline contributors.
Without writing your own Codec (which, by the way, is really easy), I'm unaware of a way to get the actual stream of bytes sent to the browser. The way I'm doing this is serializing the ICommunicationContext.Response.Entity before the IResponseCoding known stage. Pseudo code:
class ResponseLogger : IPipelineContributor
{
public void Initialize(IPipeline pipelineRunner)
{
pipelineRunner
.Notify(LogResponse)
.Before<KnownStages.IResponseCoding>();
}
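PipelineContinuation LogResponse(ICommunicationContext context)
{
string content = Serialize(context.Response.Entity);
// Log or inspect content here, then let the pipeline carry on.
return PipelineContinuation.Continue;
}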
string Serialize(IHttpEntity entity)
{
if ((entity == null) || (entity.Instance == null))
return String.Empty;
try
{
using (var writer = new StringWriter())
{
using (var xmlWriter = XmlWriter.Create(writer))
{
Type entityType = entity.Instance.GetType();
XmlSerializer serializer = new XmlSerializer(entityType);
serializer.Serialize(xmlWriter, entity.Instance);
}
return writer.ToString();
}
}
catch (Exception exception)
{
return exception.ToString();
}
}
}
This ResponseLogger is registered the usual way:
ResourceSpace.Uses.PipelineContributor<ResponseLogger>();
As mentioned, this doesn't necessarily give you the exact stream of bytes sent to the browser, but it is close enough for my needs, since the stream of bytes sent to the browser is basically just the same serialized entity.
By writing your own codec, you can with no more than 100 lines of code tap into the IMediaTypeWriter.WriteTo() method, which I would guess is the last line of defense before your bytes are transferred into the cloud. Within it, you basically just do something simple like this:
public void WriteTo(object entity, IHttpEntity response, string[] parameters)
{
using (var writer = XmlWriter.Create(response.Stream))
{
XmlSerializer serializer = new XmlSerializer(entity.GetType());
serializer.Serialize(writer, entity);
}
}
If, instead of writing directly to the IHttpEntity.Stream, you write to a StringWriter and call ToString() on it, you'll have the serialized entity, which you can log or otherwise process before writing it to the output stream.
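The capture-then-write idea is not specific to OpenRasta or C#. Here is a language-neutral sketch of it in Java; CapturingWriter and its Serializer callback are hypothetical stand-ins for the codec and whatever serializer it uses, and UTF-8 output is an assumption:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class CapturingWriter {
    // Hypothetical callback standing in for whatever serializer the codec uses.
    interface Serializer {
        void serialize(Object entity, Writer out) throws IOException;
    }

    // Serializes into memory first, so the text can be logged before it is sent.
    static String writeTo(Object entity, Serializer serializer, OutputStream response)
            throws IOException {
        StringWriter captured = new StringWriter();
        serializer.serialize(entity, captured);
        String payload = captured.toString();
        // ... log or inspect payload here ...
        response.write(payload.getBytes(StandardCharsets.UTF_8));
        response.flush();
        return payload;
    }
}
```

The trade-off is that the whole serialized entity sits in memory before anything is sent, which is fine for logging small responses but not for large ones.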
While all of the above example code is based on XML serialization and deserialization, the same principle should apply no matter what format your application is using.
I need to create an ActionResult in an ASP.NET MVC application which has a .csv filetype.
I will provide a 'do not call' email list to my marketing partners, and I want the URL to end in a .csv extension so that it opens automatically in Excel.
http://www.example.com/mailinglist/donotemaillist.csv?password=12334
I have successfully done this as follows, but I want to make sure this is the absolute best and recommended way of doing this.
[ActionName("DoNotEmailList.csv")]
public ContentResult DoNotEmailList(string username, string password)
{
return new ContentResult()
{
Content = Emails.Aggregate((a,b)=>a+Environment.NewLine + b),
ContentType = "text/csv"
};
}
This action method responds to the above link just fine.
I'm just wondering if there is any likelihood of an unexpected conflict from having the file extension like this with any different version of IIS, any kind of ISAPI filter, or anything else I can't think of now.
I need to be 100% sure because I will be providing this to external partners and don't want to have to change my mind later. I really can't see any issues, but maybe there's something obscure, or another more "MVC"-like way of doing this.
I used the FileContentResult action to also do something similar.
public FileContentResult DoNotEmailList(string username, string password)
{
string csv = Emails.Aggregate((a,b)=>a+Environment.NewLine + b);
byte[] csvBytes = ASCIIEncoding.ASCII.GetBytes( csv );
return File(csvBytes, "text/csv", "DoNotEmailList.csv");
}
It will add the content-disposition header for you.
I think your Response MUST contain "Content-Disposition" header in this case. Create custom ActionResult like this:
public class MyCsvResult : ActionResult {
public string Content {
get;
set;
}
public Encoding ContentEncoding {
get;
set;
}
public string Name {
get;
set;
}
public override void ExecuteResult(ControllerContext context) {
if (context == null) {
throw new ArgumentNullException("context");
}
HttpResponseBase response = context.HttpContext.Response;
response.ContentType = "text/csv";
if (ContentEncoding != null) {
response.ContentEncoding = ContentEncoding;
}
var fileName = "file.csv";
if(!String.IsNullOrEmpty(Name)) {
fileName = Name.Contains('.') ? Name : Name + ".csv";
}
response.AddHeader("Content-Disposition",
String.Format("attachment; filename={0}", fileName));
if (Content != null) {
response.Write(Content);
}
}
}
And use it in your Action instead of ContentResult:
return new MyCsvResult {
Content = Emails.Aggregate((a,b) => a + Environment.NewLine + b)
/* Optional
* , ContentEncoding = Encoding.UTF8
* , Name = "DoNotEmailList.csv"
*/
};
This is how I'm doing something similar. I'm treating it as a download:
var disposition = String.Format(
"attachment;filename=\"{0}.csv\"", this.Model.Name);
Response.AddHeader("content-disposition", disposition);
This should show up in the browser as a file download with the given filename.
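One thing to watch when the name comes from a model: quotes or line breaks in it would break the header. A small Java sketch of building the same header value defensively; ContentDisposition is a made-up helper, and replacing the problematic characters with "_" is just one possible policy:

```java
final class ContentDisposition {
    // Builds an attachment header value with a quoted .csv filename.
    static String attachment(String baseName) {
        // Assumption: quotes, backslashes and control characters (including CR/LF)
        // can simply be replaced, since they would otherwise break the header.
        String safe = baseName.replaceAll("[\"\\\\\\p{Cntrl}]", "_");
        return "attachment; filename=\"" + safe + ".csv\"";
    }
}
```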
I can't think of a reason why yours wouldn't work, though.
The answer you accepted is good enough, but it keeps the whole output in memory while writing it. What if the generated file is rather large, for example when you dump the contents of a SQL table? Your application could run out of memory. In that case you want to use FileStreamResult instead. One way to feed the data into the stream is to use a pipe.
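The streaming idea itself is simple: write each row as it is produced instead of joining everything into one string first. A minimal sketch in Java, where CsvStreamer is a hypothetical helper and the Iterator stands in for rows coming from a database cursor:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

final class CsvStreamer {
    // Writes one row per line and returns the number of rows written.
    static int stream(Iterator<String> rows, OutputStream out) throws IOException {
        Writer writer = new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
        int count = 0;
        while (rows.hasNext()) {
            writer.write(rows.next());
            writer.write("\r\n"); // CRLF is the line break used by RFC 4180 CSV
            count++;
        }
        writer.flush(); // flush, but leave closing the response stream to the caller
        return count;
    }
}
```

Only the writer's buffer is held in memory at any time, so the size of the output no longer matters.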