How to read a local file with Grails?

When my Grails application starts, I build up a data structure from a CSV file downloaded from a remote URL. If the file is not accessible, I'd like to fall back to a local copy. I currently process the file in the service layer, initiated by a Quartz job.
What is the best practice, using Groovy, for reading a local resource in Grails?
Where should I stash the file?
How do I safely and properly read the file?
General-case answers are welcome.

I think the best way to deal with this is to store the file's location in an externalized configuration file.
So, you'd determine a standardized location (such as /etc/myappname/CSVFileConfig.groovy), or pass the config file path in using an environment variable or something similar. See Externalized Configuration for examples.
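For example, in Grails 2.x the external file is typically registered via grails.config.locations in grails-app/conf/Config.groovy; the MYAPP_CONFIG variable below is just an illustrative name:
// grails-app/conf/Config.groovy
grails.config.locations = ["file:/etc/myappname/CSVFileConfig.groovy"]
// Optionally let an environment variable point at an alternate location
if (System.getenv("MYAPP_CONFIG")) {
    grails.config.locations << "file:" + System.getenv("MYAPP_CONFIG")
}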
Then you can simply add the actual path to the local file to that external config, like so:
// CSVFileConfig.groovy
my.custom.csv.path = ...
Finally, access it using normal config operations:
// in your Quartz job
def path = grailsApplication.config.my?.custom?.csv?.path
if (!path) {
    // no file to load
} else {
    // load file
}
As far as reading the file, what are your primary concerns? If you are using a CSV library, such as OpenCSV (used in most of the Grails libraries for CSV parsing), it will handle the opening and parsing of the file.
For security issues beyond that, I'm not sure how to handle them in a generic way. It will depend on your specific scenario. I think the one coming from a URL has a higher risk factor.
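As a rough illustration of the fallback itself, something along these lines could live in the service that the Quartz job invokes; this is a minimal sketch, and the remote URL and the loadCsvText name are placeholders, not part of the original answer:
// Try the remote CSV first; fall back to the locally configured copy on failure.
// Assumes 'def grailsApplication' is injected into the service.
String loadCsvText() {
    def remoteUrl = "https://example.com/data.csv" // placeholder
    try {
        return new URL(remoteUrl).text
    } catch (IOException e) {
        def path = grailsApplication.config.my?.custom?.csv?.path
        if (!path) {
            throw new IllegalStateException("Remote CSV unreachable and no local fallback configured", e)
        }
        return new File(path).text
    }
}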

Related

Is there any way to avoid reading old files from a folder with Apache Beam's TextIO watchForNewFiles(Duration, condition)?

Use case: at Dataflow job startup we provide an initial file name to read, and afterwards the job should watch the directory for new files, treating all remaining old files as already read.
Issues:
Approach 1:
PCollection<String> readfile = pipeline.apply(
    TextIO.read()
        .from("gs://folder-Name/*")
        .watchForNewFiles(
            Duration.standardSeconds(10),
            Watch.Growth.afterTimeSinceNewOutput(Duration.standardSeconds(30))));
Used like this, the job treats the existing files as new and reads every file in the folder.
Approach 2:
PCollection<String> readfile = pipeline.apply(
    TextIO.read()
        .from("gs://folder-Name/file-name")
        .watchForNewFiles(
            Duration.standardSeconds(10),
            Watch.Growth.afterTimeSinceNewOutput(Duration.standardSeconds(30))));
This reads only that particular file and never picks up new files.
Can anyone suggest an approach that achieves this use case?
The watchForNewFiles() function will always read all files matching the filepattern, both existing and new. In your second approach, the file pattern is only one file, so you just get that.
However, you can use the lower-level building block transforms in FileIO to accomplish what you need. The following code will just read files written after the pipeline starts:
PCollection<String> lines = p
    .apply(FileIO.match()
        .filepattern("gs://folder-Name/*")
        .continuously(
            Duration.standardSeconds(30),
            Watch.Growth.afterTimeSinceNewOutput(Duration.standardHours(1))))
    .setCoder(MetadataCoderV2.of())
    // PIPELINE_START: epoch millis captured when the pipeline was built
    .apply(Filter.by(metadata -> metadata.lastModifiedMillis() > PIPELINE_START))
    .apply(FileIO.readMatches())
    .apply(TextIO.readFiles());
You can change the details of the Filter transform to whatever precise condition you need. To also include specific older files, you can read those with a standard TextIO.read().from(...) and then use Flatten to combine that PCollection with the continuous set. Like this:
PCollection<String> allLines =
    PCollectionList.of(lines)
        .and(p.apply(TextIO.read().from("gs://folder-Name/file-name")))
        .apply(Flatten.pCollections());
Maybe you need to clarify your use case: do you provide a file name to read, or a file pattern? How many files are expected? Do you really need a Dataflow streaming pipeline, or would a Cloud Function answer your need? What exactly is your issue: that files get read again when you restart your pipeline?
You can, as danielm suggested, use FileIO to fetch and filter on file metadata in order to know which files were added after the pipeline began.
If you provide a file pattern, every matching file will be read once by the pipeline. There is no way to keep state between pipelines unless you code it yourself, so when you restart the pipeline it will read all the files matching the pattern again.
If you want to avoid that, you can manually move old files to another path between stopping the old pipeline and starting the new one.
You could also consider consuming GCS notifications on file creation with PubsubIO, and use those events to know which files to process in your pipeline.
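A rough sketch of that notification-driven variant, assuming a Pub/Sub topic that receives GCS OBJECT_FINALIZE notifications (the topic name is a placeholder; bucketId and objectId are the attribute names GCS notifications carry):
PCollection<String> newFileLines = p
    .apply(PubsubIO.readMessagesWithAttributes()
        .fromTopic("projects/my-project/topics/gcs-notifications")) // placeholder
    .apply(MapElements
        .into(TypeDescriptors.strings())
        // Rebuild the gs:// path of the newly created object from the attributes
        .via((PubsubMessage msg) ->
            "gs://" + msg.getAttribute("bucketId") + "/" + msg.getAttribute("objectId")))
    .apply(FileIO.matchAll())
    .apply(FileIO.readMatches())
    .apply(TextIO.readFiles());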
A good practice, though, is to have multiple folders that reflect the status of the files:
input
processing
failed
succeed
This way you know the state of each file. Put files to process in the input folder, and have your pipeline move each file to the folder corresponding to its state.

Is it possible to configure Serilog to truncate (i.e. make empty) the log file for each new process?

Moving from nlog to serilog, I would like my .NET framework desktop application to reuse a statically-named log file each time I run it, but to clear out the contents of the file with each new process. Is it possible to configure serilog this way?
This is a similar question, but it's not quite the same. In the linked question, the user uses a new log file each time with a unique filename. In my case, I want to use the same log file name each time.
This is not something Serilog can do for you as of this writing.
Serilog.Sinks.File is hard-coded to open the file with FileMode.Append, so if the file already exists, new content is always appended at the end.
FileLifecycleHooks allows you to intercept when the file is being opened, and that would give you an opportunity to remove the contents of the file (by calling SetLength(0) on the stream), but unfortunately the stream implementation that Serilog.Sinks.File uses (WriteCountingStream) does not support SetLength.
Your best bet is to just truncate or delete the log file yourself at the start of the app, and let Serilog create a new one.
e.g.
// Ensure that the log file is empty
using (var fs = File.OpenWrite("mylog.log")) { fs.SetLength(0); }
// ... Configure Serilog pipeline
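Put together, startup code along these lines would do it; this is a minimal sketch, assuming the Serilog.Sinks.File package and a fixed file name of mylog.log:
// Requires System.IO and Serilog. Truncate (or create) the fixed-name log
// file first, then configure the pipeline so Serilog appends to an empty file.
using (var fs = File.OpenWrite("mylog.log")) { fs.SetLength(0); }

Log.Logger = new LoggerConfiguration()
    .WriteTo.File("mylog.log")
    .CreateLogger();

Log.Information("Starting with a fresh log file");
// ... at shutdown:
Log.CloseAndFlush();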

Generate URL of resources that are handled by Grails AssetPipeline

I need to access a local JSON file. Since Grails 2.4 implements the AssetPipeline plugin by default, I saved my local JSON file at:
/grails-app/assets/javascript/vendor/me/json/local.json
Now what I need is to generate a URL to this JSON file, to be used as a function parameter in my JavaScript's $.getJSON(). I've tried using:
var URL.local = "${ raw(asset.assetPath(src: "local.json")) }";
but it generates an invalid link:
console.log(URL.local);
// prints /project/assets/local.json
// instead of /project/assets/vendor/me/json/local.json
I also encountered the same scenario with images handled by Asset Pipeline 1.9.9 that are supposed to be inserted dynamically on the page. How can I generate the URL pointing to this resource? I know I can always provide a static String for the URL, but it seems there should be a more proper solution.
EDIT
I was asked if I could move the local JSON file directly under the assets/javascript root directory, instead of placing it under a subdirectory, for an easier solution. I prefer not to, for organization purposes.
Have you tried asset.assetPath(src: "/me/json/local.json")?
The assets plugin looks in all of the immediate children of assets/. Your local.json file would need to be placed in /project/assets/foo/ for your current code to pick it up.
Check out the relevant documentation here which contains an example.
The first level deep within the assets folder is simply used for organization purposes and can contain folders of any name you wish. File types also don't need to be in any specific folder. These folders are omitted from the URL mappings and relative path calculations.
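For instance, since the first-level javascript folder is omitted from the URL, a GSP snippet might look like the following (a sketch; the exact src value depends on where the file sits below the first-level folder):
// In a GSP view; the path is relative to the first-level folder under assets/
var URL = URL || {};
URL.local = "${raw(asset.assetPath(src: 'vendor/me/json/local.json'))}";
$.getJSON(URL.local, function (data) {
    console.log(data);
});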

Get config value from file, or environment variable if file doesn't exist

I'm trying to get a setting from a configuration file (preferably something simple like .ini or JSON, not XML). If the file or setting does not exist, I want to be able to fall back to retrieving an environment variable.
I'd prefer to use an existing library for working with JSON/INI and not parsing the file myself. However, most libraries I've found won't work if a file doesn't exist.
How would I access a configuration value from a file that may or may not exist in F#?
You can use File.Exists to test whether or not the file exists:
open System.IO

let getConfig file =
    if File.Exists file
    then "config from file"
    else "config from somewhere else"
OpenExeConfiguration (despite its name) can open an arbitrary config file.
There's also the ASP.NET vNext configuration stuff, outlined in this article, which is quite flexible. I have no idea how separable (or relevant to your actual use case) it is, aside from the fact that you could conditionally include the config file into the config manager depending on whether it exists, a la Mark's answer.
In addition to type providers, FSharp.Data provides some basic parsers, including JSON. This allows you to do a runtime check using File.Exists and then parse using your preferred utility.
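Putting the pieces together, a sketch of file-with-environment-fallback using FSharp.Data might look like this (the file name and key are placeholders, and the environment variable is assumed to share the setting's key):
// Read a setting from a JSON file if it exists, otherwise fall back to an
// environment variable of the same name (may be null if neither is set).
open System
open System.IO
open FSharp.Data
open FSharp.Data.JsonExtensions

let getSetting (file: string) (key: string) =
    let fromFile =
        if File.Exists file then
            JsonValue.Parse(File.ReadAllText file).Properties
            |> Array.tryPick (fun (k, v) -> if k = key then Some (v.AsString()) else None)
        else None
    match fromFile with
    | Some value -> value
    | None -> Environment.GetEnvironmentVariable key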
I took the following approach in FAKE:
if File.Exists "local.json" then
    let localVarProps = JsonValue.Parse(File.ReadAllText "local.json").Properties
    for key, jsonValue in localVarProps do
        setEnvironVar key (jsonValue.AsString())

Custom configuration file in MVC4

I'm building an ASP.Net MVC4 application and the customer wants to be able to supply an XML configuration file, to configure a vendor list in the application, something like this:
<Vendors>
    <Vendor name="ABC Computers" deliveryDays="10"/>
    <Vendor name="XYZ Computers" deliveryDays="15"/>
</Vendors>
The file needs to be dropped onto a network location (i.e. not on the web server) and I don't have a database to import and store the data.
The customer also wants the ability to update it daily. So I'm thinking I'll have to do some kind of import (and validate the file) when the application starts up.
Any good ideas on the best way to accomplish this?
- The data needs to be quickly accessible
- Ideally I just want to import/store it once, or be able to access it quickly
- I need to be able to validate the file, so it might be prudent to switch to a backup if validation fails
One thought was to use something like Entity Framework and simply read the file whenever I needed it, but I'd prefer to hold it in memory in the application if possible.
No need to import it into a database or use Entity Framework. You can simply use .NET Xml Serialization to accomplish this.
The command-line tool xsd.exe will generate C# classes from your XML file. From the command line:
xsd.exe myfile.xml
xsd.exe /c myfile.xsd
The first command infers and creates an XML schema file (myfile.xsd) from your XML. The second command converts the schema file to C# classes.
Then use the XmlSerializer class to deserialize your XML file into objects (assuming multiple objects in one file):
MyCollection myObjects = null;
string path = "mydata.xml";
XmlSerializer serializer = new XmlSerializer(typeof(MyCollection));
using (StreamReader reader = new StreamReader(path))
{
    myObjects = (MyCollection)serializer.Deserialize(reader);
}
You can use the .xsd file generated above to validate your xml files. Here's a link showing how: http://msdn.microsoft.com/en-us/library/ms162371.aspx.
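As a sketch of that validation step (file names are placeholders, and the event handler simply reports violations):
// Requires System.Xml and System.Xml.Schema. Validate the vendor XML against
// the generated schema before deserializing it.
var settings = new XmlReaderSettings { ValidationType = ValidationType.Schema };
settings.Schemas.Add(null, "myfile.xsd");
settings.ValidationEventHandler += (sender, e) =>
    Console.WriteLine("Validation {0}: {1}", e.Severity, e.Message);

using (var reader = XmlReader.Create("mydata.xml", settings))
{
    // Reading the document end-to-end triggers the validation callbacks
    while (reader.Read()) { }
}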
