Where to save files from Firefox add-on? - firefox-addon

I am working on a Firefox add-on which, among other things, generates thumbnails of websites for use by the add-on. So far I've been storing them by their image data URL using simple-storage. There are two problems with this: the storage space is limited, and sending very long strings around doesn't seem optimal (I assume the browser has optimized ways of loading image files, but maybe not data URLs). I think it shouldn't be a problem to save the files to disk; the question is where. I googled quite a bit and could not find anything. Is there a natural place for this? Are there any restrictions?

As of Firefox 32, the place to store data for your add-on is supposed to be: [profile]/extension-data/[add-on ID]. This was established by the resolution of "Bug 915838 - Provide add-ons a standard directory to store data, settings". There is a follow-on bug, "Bug 952304 - (JSONStore) JSON storage API for addons to use in storing data and settings" which is supposed to provide an API for easy access.
For the Addon-SDK, you can obtain the addon ID (which you define in package.json) with:
let self = require("sdk/self");
let addonID = self.id;
For XUL and restartless extensions, you should be able to get the ID of your addon (which you define in the install.rdf file) with:
Components.utils.import("resource://gre/modules/Services.jsm");
let addonID = Services.appInfo.ID
You can then do the following to generate a URI for a file in that directory:
let userProfileDirectoryPath = Components.classes["@mozilla.org/file/directory_service;1"]
.getService(Components.interfaces.nsIProperties)
.get("ProfD", Components.interfaces.nsIFile).path;
/**
* Generate URI for a filename in the extension's data directory under the preferences
* directory.
*/
function generateURIForFileInPrefExtensionDataDirectory (fileName) {
//Account for the path separator being OS dependent
let toReturn = "file://" + userProfileDirectoryPath.replace(/\\/g,"/");
return toReturn + "/extension-data/" + addonID + "/" + fileName;
}
The object myExtension.addonData is a copy that I store of the Bootstrap data provided to entry points in bootstrap.js.
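As far as I can tell, concatenating "file://" with a Windows path such as C:\Users\... yields file://C:/..., which lacks the third slash, and file names containing spaces are not escaped. The following is a minimal sketch in plain JavaScript of one way to build the URL more defensively. The helper names filePathToUrl and addonDataFileUrl are hypothetical and not part of any Firefox API.

```javascript
// Hypothetical helper: builds a file: URL from an absolute OS path.
function filePathToUrl(absolutePath) {
  // Normalize Windows separators to forward slashes.
  let normalized = absolutePath.replace(/\\/g, "/");
  // Windows drive paths ("C:/...") need a leading slash so the URL host stays empty.
  if (/^[A-Za-z]:\//.test(normalized)) {
    normalized = "/" + normalized;
  }
  // Percent-encode each segment so spaces, "#", "?" and similar are safe in a URL.
  const encoded = normalized
    .split("/")
    .map((segment, index) =>
      // Leave a drive-letter segment such as "C:" untouched.
      index === 1 && /^[A-Za-z]:$/.test(segment)
        ? segment
        : encodeURIComponent(segment)
    )
    .join("/");
  return "file://" + encoded;
}

// Hypothetical helper: URL of a file inside [profile]/extension-data/[add-on ID].
function addonDataFileUrl(profileDir, addonId, fileName) {
  // Reuse whichever separator the profile path already uses.
  const sep = profileDir.includes("\\") ? "\\" : "/";
  return filePathToUrl([profileDir, "extension-data", addonId, fileName].join(sep));
}
```

Note that encodeURIComponent also escapes the "@" that is common in add-on IDs (it becomes %40), which is still a valid file URL.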

Related

Absolute path for the internal storage on iOS device

I am using PCLStorage to interact with local files on both Android and iOS platforms.
I am using the following code snippet.
IFolder rootFolder = await FileSystem.Current.GetFolderFromPathAsync(path);
IFolder folder = await rootFolder.CreateFolderAsync("HandSAppPdf", CreationCollisionOption.OpenIfExists);
IFile file = await folder.CreateFileAsync("Hello.pdf", CreationCollisionOption.GenerateUniqueName);
In the case of Android, I have:
path ="/storage/emulated/0/"
But I am not sure what the path would be in the case of iOS. If anyone can help me out, I would much appreciate it.
Your application’s access to the file system (and other resources such as the network and hardware features) is limited for security reasons. This restriction is known as the Application Sandbox.
Since iOS 11, users have been able to use the Files app on their phone to access documents that an iOS application has created. I recommend following the guide "File system access in Xamarin.iOS" and its demo. You could generate a new text file in your application's Documents folder like this:
public static string WriteFile()
{
var documents = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
var filename = Path.Combine(documents, "Write.txt");
File.WriteAllText(filename, "Write this text into a file!");
return "Text was written to a file." + Environment.NewLine
+ "-----------------" + Environment.NewLine
+ File.ReadAllText(filename);
}
And this file could be accessed through Files App.
Also, to allow the user to directly access files in your app, remember to create a new boolean key, LSSupportsOpeningDocumentsInPlace, in the Info.plist file and set it to true.

How to read config file in electronjs app

It's my first time using Electron JS and nodejs. I've built a small app that reads some records from a database and updates them. Everything is working fine. I have a config file with the database credentials but when I build a portable win app, I cannot figure out how to read the config file that I would like to place next to the exe. I would like to have easy access to the file, so I could run the same app on different databases.
Can anyone tell me if what I want is possible and how? I already tried to get the exe location but I couldn't. I also read a lot of topics here but nothing seems to solve my problem (I might be doing something wrong).
I'm using electron-builder to build my app.
Thanks in advance.
Edit #1
My Config file is
{
"user" :"X",
"password" :"X",
"server":"X",
"database":"X",
"options":
{
"trustedconnection": true,
"enableArithAbort" : true,
"trustServerCertificate": true
}
}
This is what I have, and it works when I run the project with npm start:
const configRootPath = path.resolve(__dirname,'dbConfig.json');
dbConfig = JSON.parse(fs.readFileSync(configRootPath, { encoding: 'utf-8' }));
However, when I build it, the app is looking for the file in another location different from the one where the executable is.
Use of Electron's app.getPath(name) function will get you the path(s) you are after, irrespective of which OS (Operating System) you are using.
Unless your application writes your dbConfig.json file, it may be difficult for your user to understand exactly where they should place their database config file as each OS will run and store your application data in a different directory. You would need to be explicit to the user as to where to place their config file(s). Alternatively, your application could create the config file(s) on the user's behalf (automatically or through a html form) and save it to a location 'known' to the application.
A common place to store application-specific config files is the user's application data directory. With the application name automatically appended to the directory, it can be found as shown below.
const electronApp = require('electron').app;
let appUserDataPath = electronApp.getPath('userData');
console.log(appUserDataPath );
In your use case, the below would apply.
const electronApp = require('electron').app;
const nodeFs = require('fs');
const nodePath = require('path');
const configRootPath = nodePath.join(electronApp.getPath('userData'), 'dbConfig.json');
dbConfig = JSON.parse(nodeFs.readFileSync(configRootPath, 'utf-8'));
console.log(configRootPath);
console.log(dbConfig);
You can try electron-store to store config.
Electron doesn't have a built-in way to persist user preferences and other data. This module handles that for you, so you can focus on building your app. The data is saved in a JSON file named config.json in app.getPath('userData').
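If you want to support both a config file placed next to the executable and one in the userData directory, one option is to search a list of directories in order. This is only a sketch using Node's fs and path modules. The helper names findConfigFile and loadConfig are hypothetical.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: returns the first existing config file among the
// candidate directories, or null if none of them contains it.
function findConfigFile(fileName, candidateDirs) {
  for (const dir of candidateDirs) {
    if (!dir) continue; // skip unset entries, e.g. a missing environment variable
    const candidate = path.join(dir, fileName);
    if (fs.existsSync(candidate)) return candidate;
  }
  return null;
}

// Hypothetical helper: reads and parses the config, or throws an error that
// lists every location that was searched so the user knows where to put it.
function loadConfig(fileName, candidateDirs) {
  const found = findConfigFile(fileName, candidateDirs);
  if (found === null) {
    const searched = candidateDirs.filter(Boolean).join(", ");
    throw new Error(`${fileName} not found in: ${searched}`);
  }
  return JSON.parse(fs.readFileSync(found, "utf-8"));
}
```

In the Electron main process the directory list could be something like [process.env.PORTABLE_EXECUTABLE_DIR, path.dirname(app.getPath('exe')), app.getPath('userData')]. This assumes that electron-builder's portable target sets PORTABLE_EXECUTABLE_DIR to the folder containing the original exe, which is my understanding but worth verifying for your electron-builder version.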

How can I locate a file to use with the system/child_process API within a Firefox add-on?

I would like to write a Firefox add-on that communicates with a locally installed program to exchange data. It looks like this can be done using either js-ctypes or the low-level system/child_process API, with the latter being the recommended solution.
The child_process API appeals because it sends and receives data abstractly over a pipe rather than directly at the C interface level. However, to use it you need (it seems) to supply the full path to the executable within your code:
var child_process = require("sdk/system/child_process");
var ls = child_process.spawn('/bin/ls', ['-lh', '/usr']);
In my case, the executable is installed by another application and we don't know its exact location: it will differ according to the OS, the user's drives, and possibly the user's preference. I imagine this problem is common to most executables that are not built into the OS. So my question is: what means do I have to locate the full path of the executable I want to use? I will need to support multiple OSes but presumably could have different solutions for each if needed.
Thanks!
Here's the code I used on Windows - the key was being able to read an environment variable to find the location of the appropriate application folder. After that I assume that my application is stored under a well-known subpath (we don't allow customization of it).
var system = require("sdk/system");
var iofile = require('sdk/io/file');
var child_process = require('sdk/system/child_process');
var progFilesFolder = system.env["programfiles(x86)"];
var targetFile = iofile.join(progFilesFolder, 'FolderName', 'Program.exe');
var targetFileExists = iofile.exists(targetFile);
if (targetFileExists) {
var p = child_process.spawn(targetFile);
}
I haven't written the code for Mac yet but I expect it to be similar, with the difference being that there are no drive letters to worry about and the system folders in OS X have standard names (even on localized systems).
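To cover more than one install location (for example a 64-bit install, where the program is not under the x86 folder), the same idea can be generalized to a list of environment variables. This is a sketch only. locateExecutable is a hypothetical helper, and the join and exists functions are passed in so that iofile.join and iofile.exists from the snippet above can be supplied.

```javascript
// Hypothetical helper: tries each environment variable in order and returns
// the first candidate path that exists, or null if none does.
function locateExecutable(env, envVarNames, subPath, join, exists) {
  for (const name of envVarNames) {
    const base = env[name];
    if (!base) continue; // variable is not set on this system
    const candidate = join(base, ...subPath);
    if (exists(candidate)) return candidate;
  }
  return null;
}
```

With the SDK modules from the answer above, a call might look like locateExecutable(system.env, ["ProgramFiles(x86)", "ProgramFiles", "ProgramW6432"], ["FolderName", "Program.exe"], iofile.join, iofile.exists), spawning the process only when the result is not null.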

How to download all files in an Azure Container Directory?

I have an ASP.NET app that uploads files to Azure blobs. I know that Azure doesn't create real directory structures in containers, just blobs, but you can emulate directories by putting a "/" in the URI.
For example, I upload a list of files and my URIs look like this:
http://myaccount.windowsazure.blob.net/MyProtocolID-01/MyDocumentID-01/FileName01.jpg
http://myaccount.windowsazure.blob.net/MyProtocolID-01/MyDocumentID-01/FileName02.jpg
http://myaccount.windowsazure.blob.net/MyProtocolID-01/MyDocumentID-01/FileName03.jpg
My download method:
public RemoteFile Download(DownloadRequest request)
{
var fileFinal = string.Format("{0}/{1}/{2}",request.IDProtocol ,request.IDDocument, request.FileName);
var blobBlock = InitializeDownload(fileFinal);
if (!blobBlock.Exists())
{
throw new FileNotFoundException("Error");
}
var stream = new MemoryStream();
blobBlock.DownloadToStream(stream);
return File(request.FileName)
}
private CloudBlob InitializeDownload(string uri)
{
var blobBlock = _blobClient.GetBlobReference(uri);
return blobBlock;
}
This way, I'm getting just one file. But I need to see and download all files inside http://myaccount.windowsazure.blob.net/MyProtocolID-01/MyDocumentID-01/
Thanks
Adding more details. You will need to use one of the listing APIs provided by the client library: CloudBlobContainer.ListBlobs(), CloudBlobContainer.ListBlobsSegmented(), and CloudBlobContainer.ListBlobsSegmentedAsync() (and various overloads.). You can specify the directory prefix, and the service will only enumerate blobs matching the prefix. You can then download each blob. You may also want to look at the ‘useFlatBlobListing’ argument, depending on your scenario.
http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storage.blob.cloudblobcontainer.listblobs.aspx
In addition AzCopy (see http://blogs.msdn.com/b/windowsazurestorage/archive/2012/12/03/azcopy-uploading-downloading-files-for-windows-azure-blobs.aspx) also supports this scenario of downloading all blobs in a given directory path.
Since each blob is a separate web resource, the function above will download only one file. One thing you could do is list all the blobs under the directory prefix, download them to your server first, zip them, and then return that zip file to your end user.
Use AzCopy; it now supports many more scenarios.
https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10
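To illustrate what "listing by prefix" means, here is a small in-memory sketch of the behavior. It is not the Azure client library; listByPrefix is a hypothetical function that only models how a flat listing differs from a hierarchical one when blob names contain "/".

```javascript
// Hypothetical model of prefix listing over a flat set of blob names.
function listByPrefix(blobNames, prefix, useFlatListing) {
  // Every listing starts from the blobs whose names begin with the prefix.
  const matches = blobNames.filter((name) => name.startsWith(prefix));
  if (useFlatListing) return matches; // flat: every blob under the prefix

  // Hierarchical: collapse deeper levels into one "virtual directory" entry.
  const entries = new Set();
  for (const name of matches) {
    const rest = name.slice(prefix.length);
    const slash = rest.indexOf("/");
    entries.add(slash === -1 ? name : prefix + rest.slice(0, slash + 1));
  }
  return [...entries];
}
```

For the question above, a flat listing with the prefix "MyProtocolID-01/MyDocumentID-01/" would return exactly the blobs to download.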

How do I save the origin html file with Apache Nutch

I'm new to search engines and web crawlers. Now I want to store all the original pages in a particular web site as html files, but with Apache Nutch I can only get the binary database files. How do I get the original html files with Nutch?
Does Nutch support it? If not, what other tools can I use to achieve my goal? (Tools that support distributed crawling are preferred.)
Well, Nutch writes the crawled data in binary form, so if you want it saved in HTML format you will have to modify the code (this will be painful if you are new to Nutch).
If you want quick and easy solution for getting html pages:
If the list of pages/URLs that you intend to fetch is small, it is better to do it with a script that invokes wget for each URL.
Or use the HTTrack tool.
EDIT:
Writing your own Nutch plugin would be great. Your problem will get solved, and you can contribute to Nutch by submitting your work. If you are new to Nutch (in terms of code and design), you will have to invest a lot of time in building a new plugin; otherwise it's easy to do.
A few pointers to help you get started:
Here is a page which talks about writing own nutch plugin.
Start with Fetcher.java. See lines 647-648. That is the place where you can get the fetched content on per url basis (for those pages which got fetched successfully).
pstatus = output(fit.url, fit.datum, content, status, CrawlDatum.STATUS_FETCH_SUCCESS);
updateStatus(content.getContent().length);
You should add code right after this to invoke your plugin. Pass content object to it. By now, you would have guessed that content.getContent() is the content for url you want. Inside the plugin code, write it to some file. Filename should be based on the url name else it will be difficult to work with that. Url can be obtained by fit.url.
You must make the modifications while running Nutch in Eclipse.
Once you are able to run it, open Fetcher.java and add the lines between the "content saver" comment lines.
case ProtocolStatus.SUCCESS: // got a page
pstatus = output(fit.url, fit.datum, content, status, CrawlDatum.STATUS_FETCH_SUCCESS, fit.outlinkDepth);
updateStatus(content.getContent().length);
//------------------------------------------- content saver ---------------------------------------------\\
String filename = "savedsites//" + content.getUrl().replace('/', '-');
File file = new File(filename);
file.getParentFile().mkdirs();
// createNewFile() returns false when the file already exists
boolean created = file.createNewFile();
if (!created) {
System.out.println("File exists.");
} else {
String text = content.toString();
// Keep everything from the doctype onward; fall back to the full text if it is absent
int start = Math.max(text.indexOf("<!DOCTYPE html"), 0);
try (BufferedWriter out = new BufferedWriter(new FileWriter(file))) {
out.write(text.substring(start));
}
System.out.println("File created successfully.");
}
//------------------------------------------- content saver ---------------------------------------------\\
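One caveat about the file name: replacing only "/" leaves characters such as ":" and "?" in the name, which Windows does not allow in file names. The logic for deriving a safer name is sketched below in JavaScript for brevity (it would need porting to Java for use inside Fetcher.java). urlToFileName is a hypothetical helper, and distinct URLs can still map to the same name, so treat it as a starting point.

```javascript
// Hypothetical helper: derives a filesystem-safe file name from a URL.
function urlToFileName(url, maxLength = 200) {
  // Drop the scheme ("http://", "https://", ...).
  const withoutScheme = url.replace(/^[a-z][a-z0-9+.-]*:\/\//i, "");
  // Replace every run of characters outside a conservative whitelist with "-",
  // then trim dashes left at either end.
  const safe = withoutScheme
    .replace(/[^A-Za-z0-9._-]+/g, "-")
    .replace(/^-+|-+$/g, "");
  // Fall back to "index" for an empty result and cap the length.
  return (safe || "index").slice(0, maxLength) + ".html";
}
```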
To update this answer -
It is possible to post process the data from your crawldb segment folder, and read in the html (including other data nutch has stored) directly.
Configuration conf = NutchConfiguration.create();
FileSystem fs = FileSystem.get(conf);
Path file = new Path(segment, Content.DIR_NAME + "/part-00000/data");
SequenceFile.Reader reader = new SequenceFile.Reader(fs, file, conf);
try
{
Text key = new Text();
Content content = new Content();
while (reader.next(key, content))
{
System.out.println(new String(content.getContent()));
}
}
catch (Exception e)
{
// Report read errors instead of silently swallowing them
e.printStackTrace();
}
The answers here are obsolete. Now, it is simply possible to get the plain HTML-files with nutch dump. Please see this answer.
In Apache Nutch 2.3.1:
You can save the raw HTML by editing the Nutch code. First, run Nutch in Eclipse by following https://wiki.apache.org/nutch/RunNutchInEclipse
Once Nutch is running in Eclipse, edit the file FetcherReducer.java, add the code below to the output method, and run ant eclipse again to rebuild the class.
The raw HTML will then be added to the reprUrl column in your database:
if (content != null) {
ByteBuffer raw = fit.page.getContent();
if (raw != null) {
ByteArrayInputStream arrayInputStream = new ByteArrayInputStream(raw.array(), raw.arrayOffset() + raw.position(), raw.remaining());
Scanner scanner = new Scanner(arrayInputStream);
scanner.useDelimiter("\\Z");//To read all scanner content in one String
String data = "";
if (scanner.hasNext()) {
data = scanner.next();
}
fit.page.setReprUrl(StringUtil.cleanField(data));
scanner.close();
}
}
