Excel doesn't want to open XLSX files downloaded through ASP output - asp.net-mvc

I'm using the ASP.NET MVC 3 framework. I create an Excel file and return it from a FileResult action with the content type "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet".
When I attempt to open it, Excel just says "the file is corrupt and cannot be opened."
When I open the source Excel file that was used to generate the output, it works without any problems. I also ran a byte-by-byte comparison of both copies and the files are identical. I even emailed the corrupt file to myself, and the attachment opens fine.
This leads me to believe it's a problem with headers or some sort of Excel/Windows security config.
If it is the latter, then I need a solution that won't make clients change their security settings.
EDIT - Found the setting:
I've found the setting that causes this: "Enable Protected View for files originating from the Internet" in Excel's Trust Center / Protected View settings.
So I guess the question is - Is there a way for the file to appear trusted?
Here are the response headers:
Cache-Control: private
Content-Disposition: attachment; filename="Report - Monday, March 19, 2012.xlsx"
Content-Length: 20569
Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
The action method that makes the output:
[HttpPost]
public virtual FileResult Export()
{
    try
    {
        ...
        string newFilePath = createNewFile(...);
        string downloadedFileName = "Report - " + DateTime.Now.ToString("D") + ".xlsx";
        return File(newFilePath, "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", downloadedFileName);
    }
    catch (Exception ex)
    {
        ...
    }
}
How I create the Excel file:
I have a template XLSX file with column names and some pivot charts in other sheets. From C# I create a copy of this template, then call SQL Server, which outputs data into the first sheet through the OLEDB connector:
set @SQL='insert into OPENROWSET(''Microsoft.ACE.OLEDB.12.0'', ''Excel 12.0;Database=' + @PreparedXLSXFilePath + ';'', ''SELECT * FROM [Data$]'') ...
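For reference, the copy-then-fill flow described above might look roughly like the sketch below. The template path and the stored procedure name are hypothetical, and the procedure is assumed to wrap the OPENROWSET insert shown above.
// Sketch: copy the template, then let SQL Server fill the first sheet via OPENROWSET.
// Requires: using System; using System.Data; using System.Data.SqlClient; using System.IO;
private string CreateNewFile(string connectionString)
{
    string templatePath = @"C:\Templates\ReportTemplate.xlsx"; // hypothetical location
    string newFilePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid() + ".xlsx");
    File.Copy(templatePath, newFilePath);

    using (var connection = new SqlConnection(connectionString))
    using (var command = new SqlCommand("dbo.FillReportData", connection)) // hypothetical procedure
    {
        command.CommandType = CommandType.StoredProcedure;
        // Matches the @PreparedXLSXFilePath variable in the SQL above.
        // Note: the path must be reachable by the SQL Server process.
        command.Parameters.AddWithValue("@PreparedXLSXFilePath", newFilePath);
        connection.Open();
        command.ExecuteNonQuery();
    }
    return newFilePath;
}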
Thanks in advance for any help.

You would need a digital signature in your Excel file. How to do this from code is another question.
More info here:
http://www.delphifaq.com/faq/windows_user/f2751.shtml

Related

How to avoid Excel "bad format error" when saving spreadsheet

Using PhpSpreadsheet, saving in XLSX format works OK with the default code:
$writer = new \PhpOffice\PhpSpreadsheet\Writer\Xlsx($spreadsheet);
$writer->save("filename.xlsx");
But if I want the user to select the target directory, using:
header('Content-Type: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet');
header('Content-Disposition: attachment;filename="filename.xlsx"');
header('Cache-Control: max-age=0');
$writer = IOFactory::createWriter($spreadsheet, 'Xlsx');
$writer->save('php://output');
the file saves OK but Excel 2016 does not want to open it. Excel returns the following error:
[screenshot of the Excel error dialog]
I looked through all documentation and posts but cannot find the solution.
Thanks!
Edit: Just in case, this solution does not work for me.
Edit 2: The sample provided in Simple Download Xlsx works perfectly, but when copying/pasting that code for my spreadsheet, Chrome gives me a
Resource interpreted as Document but transferred with MIME type application/octet-stream
Edit 3: Used ob_end_flush(); to clean up any leftover output in my code.
The file now saves OK, but needs repair when opening in Excel. Why?
Thanks
Solution:
A bug in PhpSpreadsheet.
When using
header("Content-Type: application/vnd.ms-excel");
i.e. compatibility mode for Excel, the file opens OK.

How to get only file names from SharePoint using any APIs and C#

I am trying to get a list of the existing file names, with their respective dates and times, from SharePoint using any API and C#.
I am able to download and upload files from SharePoint using WebClient, but I am not able to get just the file names with their respective dates and times into a DataGridView. Please let me know if there is a solution for that.
I am developing a Windows Forms application in the Visual Studio environment.
Thanks,
Please try the C# code below to get information about the files in a directory:
class FileSysInfo
{
    static void Main()
    {
        // Point this at the folder you want to inspect (path is a placeholder).
        System.IO.DirectoryInfo dirInfo = new System.IO.DirectoryInfo(@"C:\SomeFolder");

        // Get the files in the directory and print out file name, last access time and length.
        System.IO.FileInfo[] fileNames = dirInfo.GetFiles("*.*");
        foreach (System.IO.FileInfo fi in fileNames)
        {
            System.Console.WriteLine("{0}: {1}: {2}", fi.Name, fi.LastAccessTime, fi.Length);
        }
    }
}
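Note that the code above lists files from a local or mapped folder. Since the question asks about SharePoint specifically, a sketch using the SharePoint client object model (CSOM) might look like the following; the site URL and library title are placeholders, and the Microsoft.SharePoint.Client assemblies (the CSOM NuGet package) are assumed to be referenced:
using System;
using Microsoft.SharePoint.Client;

class SharePointFileLister
{
    static void Main()
    {
        // Hypothetical site URL and document library title.
        using (var ctx = new ClientContext("https://yourserver/sites/yoursite"))
        {
            List library = ctx.Web.Lists.GetByTitle("Documents");
            ListItemCollection items = library.GetItems(CamlQuery.CreateAllItemsQuery());
            ctx.Load(items);
            ctx.ExecuteQuery();

            foreach (ListItem item in items)
            {
                // FileLeafRef holds the file name, Modified the last-modified timestamp.
                Console.WriteLine("{0}: {1}", item["FileLeafRef"], item["Modified"]);
            }
        }
    }
}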

Japanese language translation issue in CSV File - ASP.NET MVC

I am facing an issue while exporting Japanese text in CSV format: junk characters are exported instead of the original Japanese text. I am using the .NET MVC FileStreamResult to export records to a CSV file with UTF-8 encoding (I have also tried other encodings, with no luck). While debugging, I can convert the string to a memory stream and back, and I can see the original Japanese text being exported. But once the export completes and I open the CSV file, I only see junk characters instead of the expected text. If I open the CSV file in Notepad (opening the CSV file in Notepad is NOT my requirement; I am using Notepad only to verify whether the Japanese text is intact), I can see the expected Japanese text. It would be really helpful if someone could help me find the root cause of this issue and provide a resolution.
Ex. 東京都品川区大崎 gets written as æ±äº¬éƒ½å“å·åŒºå¤§å´Ž
Note: I can see the expected Japanese text exported properly if I open the sample .CSV file using LibreOffice Calc or Linux's default gEdit. The issue only occurs when opening this CSV file with MS Office.
Please find the code below.
Controller/action executed when clicking the "export to CSV" button:
[HttpPost]
[ValidateInput(false)]
public FileStreamResult SaveCustomerInfo()
{
    return ExportToCsv();
}
private static FileStreamResult ExportToCsv()
{
    var exportedData = new StringBuilder();
    exportedData
        .AppendLine("実行日,口座番号,支店番号,アカウント名,支店名,の/受益秩序,ステートメント日,入力日,お問い合わせ番号, ,Date Range")
        .Append("CS0001,Demo FName,Demo LName,8/20/2015,\"Demo User Address\",City,Country,08830,0123456789,15813,Absolute from 8/20/2015 to 8/22/2015");
    var stream = PrintingHelper.StringToMemoryStream(Encoding.UTF8, exportedData.ToString());
    var fileStreamResult = new FileStreamResult(stream, "text/csv")
    {
        FileDownloadName = new StringBuilder("TestExportedFileInCsv").Append(".csv").ToString()
    };
    return fileStreamResult;
}
It sounds as though you haven't installed the language pack for MS Office on the machine on which you are trying to open the CSV.
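Another possibility worth ruling out: Excel only auto-detects UTF-8 in a CSV when the stream starts with a byte-order mark (BOM); without it, Excel falls back to the ANSI code page, which produces exactly the kind of mojibake shown above (this also explains why LibreOffice and gEdit display the file correctly). A minimal sketch, reusing the question's exportedData string and bypassing the PrintingHelper call:
// Sketch: prepend the UTF-8 BOM so Excel detects the encoding.
// UTF8Encoding(true) exposes the preamble bytes (0xEF 0xBB 0xBF) via GetPreamble().
var utf8WithBom = new UTF8Encoding(true);
var stream = new MemoryStream();
byte[] preamble = utf8WithBom.GetPreamble();
stream.Write(preamble, 0, preamble.Length);
byte[] body = utf8WithBom.GetBytes(exportedData.ToString());
stream.Write(body, 0, body.Length);
stream.Position = 0;
return new FileStreamResult(stream, "text/csv") { FileDownloadName = "TestExportedFileInCsv.csv" };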

How do I save the original HTML files with Apache Nutch

I'm new to search engines and web crawlers. I want to store all the original pages of a particular web site as HTML files, but with Apache Nutch I can only get the binary database files. How do I get the original HTML files with Nutch?
Does Nutch support this? If not, what other tools can I use to achieve my goal? (Tools that support distributed crawling are preferred.)
Well, Nutch writes the crawled data in binary form, so if you want it saved in HTML format you will have to modify the code (this will be painful if you are new to Nutch).
If you want a quick and easy solution for getting the HTML pages:
If the list of pages/URLs that you intend to fetch is fairly small, get it done with a script that invokes wget for each URL.
OR use the HTTrack tool.
EDIT:
Writing your own Nutch plugin would be great. Your problem will get solved, plus you can contribute to Nutch by submitting your work! If you are new to Nutch (in terms of code and design), you will have to invest a lot of time building a new plugin; otherwise it is easy to do.
A few pointers to help your initiative:
Here is a page which talks about writing your own Nutch plugin.
Start with Fetcher.java. See lines 647-648. That is the place where you can get the fetched content on a per-URL basis (for the pages which were fetched successfully):
pstatus = output(fit.url, fit.datum, content, status, CrawlDatum.STATUS_FETCH_SUCCESS);
updateStatus(content.getContent().length);
You should add code right after this to invoke your plugin and pass the content object to it. By now you will have guessed that content.getContent() is the content of the URL you want. Inside the plugin code, write it to a file. Base the file name on the URL, otherwise it will be difficult to work with; the URL can be obtained from fit.url.
You must make these modifications in a setup where you can run Nutch in Eclipse.
Once you are able to run it, open Fetcher.java and add the lines between the "content saver" comment lines.
case ProtocolStatus.SUCCESS:        // got a page
    pstatus = output(fit.url, fit.datum, content, status, CrawlDatum.STATUS_FETCH_SUCCESS, fit.outlinkDepth);
    updateStatus(content.getContent().length);

    //------------------------------------------- content saver ---------------------------------------------\\
    // Derive a file name from the URL so each page maps to a distinct file.
    String filename = "savedsites//" + content.getUrl().replace('/', '-');
    File file = new File(filename);
    file.getParentFile().mkdirs();
    boolean exist = file.createNewFile();
    if (!exist) {
        System.out.println("File exists.");
    } else {
        FileWriter fstream = new FileWriter(file);
        BufferedWriter out = new BufferedWriter(fstream);
        // Keep only the HTML part of the fetched content.
        out.write(content.toString().substring(content.toString().indexOf("<!DOCTYPE html")));
        out.close();
        System.out.println("File created successfully.");
    }
    //------------------------------------------- content saver ---------------------------------------------\\
To update this answer:
It is possible to post-process the data in your segment folder and read in the HTML (along with the other data Nutch has stored) directly:
Configuration conf = NutchConfiguration.create();
FileSystem fs = FileSystem.get(conf);
Path file = new Path(segment, Content.DIR_NAME + "/part-00000/data");
SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
try
{
    Text key = new Text();
    Content content = new Content();
    while (reader.next(key, content))
    {
        System.out.println(new String(content.getContent()));
    }
}
catch (Exception e)
{
    // Handle or log read errors as appropriate.
}
finally
{
    reader.close();
}
The answers here are obsolete. It is now possible to get the plain HTML files simply with nutch dump. Please see this answer.
In Apache Nutch 2.3.1:
You can save the raw HTML by editing the Nutch code. First, run Nutch in Eclipse by following https://wiki.apache.org/nutch/RunNutchInEclipse
Once Nutch runs in Eclipse, edit the file FetcherReducer.java, add this code to the output method, and run ant eclipse again to rebuild the class.
Finally, the raw HTML will be added to the reprUrl column in your database.
if (content != null) {
    ByteBuffer raw = fit.page.getContent();
    if (raw != null) {
        ByteArrayInputStream arrayInputStream = new ByteArrayInputStream(raw.array(), raw.arrayOffset() + raw.position(), raw.remaining());
        Scanner scanner = new Scanner(arrayInputStream);
        scanner.useDelimiter("\\Z"); // read the entire scanner content into one String
        String data = "";
        if (scanner.hasNext()) {
            data = scanner.next();
        }
        // Stash the raw HTML in the reprUrl field so it ends up in the database.
        fit.page.setReprUrl(StringUtil.cleanField(data));
        scanner.close();
    }
}

What's the difference between the four File Results in ASP.NET MVC

ASP.NET MVC has four different types of file results:
FileContentResult: Sends the contents of a binary file to the response.
FilePathResult: Sends the contents of a file to the response.
FileResult: Returns binary output to write to the response
FileStreamResult: Sends binary content to the response by using a Stream instance
Those descriptions are taken from MSDN, and with the exception of FileStreamResult the first three sound identical. So what is the difference between them?
FileResult is an abstract base class for all the others.
FileContentResult - you use it when you have a byte array you would like to return as a file
FilePathResult - when you have a file on disk and would like to return its content (you give a path)
FileStreamResult - you have a stream open, you want to return its content as a file
However, you'll rarely have to use these classes - you can just use one of Controller.File overloads and let ASP.NET MVC do the magic for you.
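For illustration, a minimal sketch of the three Controller.File overloads and the result types they map to (the MIME type and file names here are just placeholders):
using System.IO;
using System.Web.Mvc;

public class ReportsController : Controller
{
    // byte[] overload -> FileContentResult
    public ActionResult FromBytes(byte[] data)
    {
        return File(data, "application/pdf", "report.pdf");
    }

    // path overload -> FilePathResult
    public ActionResult FromPath()
    {
        return File(Server.MapPath("~/App_Data/report.pdf"), "application/pdf", "report.pdf");
    }

    // Stream overload -> FileStreamResult
    public ActionResult FromStream(Stream stream)
    {
        return File(stream, "application/pdf", "report.pdf");
    }
}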
Great question...and it deserves more details. I find myself here as a result of an interesting situation. We were delivering some PDF attachments via the MVC3/C# environment. Our code got released and we started getting reports from our clients that the downloads were behaving strangely in Chrome: the file type was being converted to 'pdf-, attachment.pdf-, attachment'. Yup...you got it...the whole thing. One could rename it to just 'pdf' and the file would still save intact, but what a mess!
To describe the initial situation: we were setting the 'Content-Disposition' header and then returning a FileContentResult...
var cd = new System.Net.Mime.ContentDisposition
{
    FileName = result.Attachment.FileName,
    Inline = false
};
Response.AppendHeader("Content-Disposition", cd.ToString());
return File(result.Attachment.Data, MimeExtensionHelper.GetMimeType(result.Attachment.FileName), result.Attachment.FileName);
Seemed good. Worked fine in IE. So I did some research and tried implementing FileStreamResult instead (keeping the Content-Disposition setter):
MemoryStream dataStream = new MemoryStream();
dataStream.Write(result.Attachment.Data, 0, result.Attachment.Data.Length);
dataStream.Position = 0;
return new FileStreamResult(dataStream, MimeExtensionHelper.GetMimeType(result.Attachment.FileName));
It fixed the issue in Chrome! Hmmm...but why in the heck should I have to take my perfectly good byte array and stream it and then return it via this to get the file name to work right?
Then came the Fiddler.
With FileContentResult, I got 2 Content-Dispositions in the header.
With FileStreamResult, I got 1.
FileContentResult appends a Content-Disposition header when providing the File Name and Chrome considers multiples of this header as an error.
Odd reaction...but definitely one that's good to know.
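In other words, a likely fix in the original FileContentResult code (a sketch, reusing the names from the snippets above) is to emit the header only once, either way:
// Option 1: drop the manual AppendHeader call and let the fileDownloadName
// parameter emit the single Content-Disposition header.
return File(result.Attachment.Data,
    MimeExtensionHelper.GetMimeType(result.Attachment.FileName),
    result.Attachment.FileName);

// Option 2: keep the hand-built header, but don't pass fileDownloadName.
Response.AppendHeader("Content-Disposition", cd.ToString());
return File(result.Attachment.Data,
    MimeExtensionHelper.GetMimeType(result.Attachment.FileName));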
