Dart directory scan and list population - dart

I'm new to Dart and having problems creating a list of file names from a directory. The example code makes sense but doesn't work and produces no error messages.
I'm really not enjoying how Dart complicates simple tasks.
var flist = new List();
Process.run('cmd', ['/c', 'dir *.txt /b']).then((ProcessResult results) {
flist.add(results.toString());
});
I know it's way off. How do I go about this without having to call any outside functions? I'd like to keep my code in the 'main' function.

You might find this answer useful. It shows how to use the Directory object to list contents of a directory: How do I list the contents of a directory with Dart?
Looks like you're trying to find all files in a directory that end in *.txt. You could do this:
import 'dart:io';
main() {
var dir = new Directory('some directory');
var contents = dir.listSync();
var textFiles = contents.where((f) => f.path.endsWith('.txt')).toList();
}

Related

Dart build runner generate one dart file with content

I am working on a Dart package which includes over 200 models, and at the moment I have to manually write one "export" line for each model to make the models available to everyone who uses this package.
I want the build runner to generate one dart file which contains every export definition.
Therefore I would create an annotation "ExportModel". The builder should search for each class annotated with this annotation.
I tried creating some Builders, but they will generate a *.g.dart file for each class that is annotated. I just want to have one file.
Is there a way to create a builder that runs only once and creates a single file at the end?
The short answer to your question of a builder that only runs once and creates a single file in the package is to use r'$lib$' as the input extension. The long answer is that to find the classes that are annotated you probably want an intermediate output to track them.
I'd write this with 2 builders, one to search for the ExportModel annotation, and another to write the exports file. Here is a rough sketch with details omitted - I haven't tested any of the code here but it should get you started on the right path.
Builder 1 - find the classes annotated with @ExportModel().
You could write this with some utilities from package:source_gen, but you can't use LibraryBuilder since it's not outputting Dart code.
The goal is to write a .exports file next to each .dart file which has the names of all the classes annotated with @ExportModel().
class ExportLocatingBuilder implements Builder {
@override
final buildExtensions = const {
'.dart': ['.exports']
};
@override
Future<void> build(BuildStep buildStep) async {
final resolver = buildStep.resolver;
if (!await resolver.isLibrary(buildStep.inputId)) return;
final lib = LibraryReader(await buildStep.inputLibrary);
final exportAnnotation = TypeChecker.fromRuntime(ExportModel);
final annotated = [
for (var member in lib.annotatedWith(exportAnnotation)) member.element.name,
];
if (annotated.isNotEmpty) {
await buildStep.writeAsString(
buildStep.inputId.changeExtension('.exports'), annotated.join(','));
}
}
}
This builder should be build_to: cache, and you may want to have a PostProcessBuilder that cleans up all the outputs it produces, which would be specified with applies_builders. You can use the FileDeletingBuilder to cheaply implement the cleanup. See the FAQ on temporary outputs and the angular cleanup for an example. (A rough build.yaml sketch covering both builders appears after Builder 2 below.)
Builder 2 - find the .exports files and generate a Dart file
Use findAssets to track down all those .exports files, and write an export statement for each one. Use a show clause with the content of the file, which should contain the names of the members that were annotated.
class ExportsBuilder implements Builder {
@override
final buildExtensions = const {
r'$lib$': ['exports.dart']
};
@override
Future<void> build(BuildStep buildStep) async {
final exports = buildStep.findAssets(Glob('**/*.exports'));
final content = [
await for (var exportLibrary in exports)
'export \'${exportLibrary.changeExtension('.dart').uri}\' '
'show ${await buildStep.readAsString(exportLibrary)};',
];
if (content.isNotEmpty) {
await buildStep.writeAsString(
AssetId(buildStep.inputId.package, 'lib/exports.dart'),
content.join('\n'));
}
}
}
This builder should likely be build_to: source if you want to publish this file on pub. It should have a required_inputs: [".exports"] to ensure it runs after the previous builder.
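For reference, here is a rough sketch of how the build.yaml wiring for these two builders plus the cleanup builder could look. This is an assumption for illustration, not part of the original answer; the package name (my_package), import path, and factory names are placeholders you would replace with your own:
builders:
  export_locator:
    import: "package:my_package/builder.dart"
    builder_factories: ["exportLocatingBuilder"]
    build_extensions: {".dart": [".exports"]}
    auto_apply: dependents
    build_to: cache
    applies_builders: ["my_package|exports_cleanup"]
  exports:
    import: "package:my_package/builder.dart"
    builder_factories: ["exportsBuilder"]
    build_extensions: {"$lib$": ["exports.dart"]}
    auto_apply: dependents
    build_to: source
    required_inputs: [".exports"]
post_process_builders:
  exports_cleanup:
    import: "package:my_package/builder.dart"
    builder_factory: "exportsCleanup"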
Why does it need to be this complex?
You could implement this as a single builder which uses findAssets to find all the Dart files. The downside is that rebuilds would be much slower because it would be invalidated by any content change in any Dart file and you'd end up parsing all Dart code for a change in any Dart code. With the 2 builder approach then only the individual .exports which come from a changed Dart file need to be resolved and rebuilt on a change, and then only if the exports change will the exports.dart file be invalidated.
Older versions of build_runner also didn't support using the Resolver to resolve code that isn't transitively imported from the input library. Recent versions of build_runner have relaxed this constraint.

Using Futures to load config.json in Flutter

Being new to Dart/Flutter, I am using this snippet to try to load a config.json file that I have stored in my assets folder. In trying to read this file, I am following the examples in the Dart language Futures documentation and in the Flutter docs on reading local text files:
import 'dart:async' show Future;
import 'package:flutter/services.dart' show rootBundle;
import 'dart:convert';
Future<List> loadAsset() async {
String raw = await rootBundle.loadString('assets/config.json');
List configData = json.decode(raw);
return configData;
}
Then, inside my class, I try to load the config into a List, like this:
Future<List> configData = loadAsset();
print(configData.toString());
// prints out: Instance of 'Future<List<dynamic>>'
The result of all this seems to work. Yet I can find no way of using the data I have loaded. Any effort to access elements in the List, e.g. configData[0] results in an error:
The following _CompileTimeError was thrown building
HomePage(dirty, state: HomePageState#b1af8):
'package:myapp/pages/home_page.dart': error:
line 64 pos 19: lib/pages/home_page.dart:64:19:
Error: The method '[]' isn't defined for the class
'dart.async::Future<dart.core::List<dynamic>>'.
Try correcting the name to the name of an existing method,
or defining a method named '[]'.
I would like to convert the configData Future into a normal object that I can read and pass around my app. I am able to do something very similar, and to get it to work inside a widget's build method, using a FutureBuilder and the DefaultAssetBundle thus...
DefaultAssetBundle
.of(context)
.loadString('assets/config.json')
...but I don't want the overhead of reloading the data inside all the widgets that need it. I would like to load it in a separate Dart package and have it available as a global configuration across my whole app. Any pointers would be appreciated.
I have tried the suggestion by Rémi Rousselet:
List configData = await loadAsset();
print(configData[0]);
In this case, I get a compiler error:
compiler message: lib/pages/home_page.dart:55:21: Error: Getter not found: 'await'.
compiler message: List configData = await loadAsset();
compiler message: ^^^^^
You can't do configData[0] as configData is not a List but a Future.
Instead, await the future to have access to the List inside
List configData = await loadAsset();
print(configData[0]);
You can only use await INSIDE async methods.
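For example, the "Getter not found: 'await'" error above happens because the surrounding function is not marked async. A minimal sketch (the function name is just for illustration):
Future<void> printFirstConfigEntry() async {
  // await is only allowed inside a function marked async
  final configData = await loadAsset();
  print(configData[0]);
}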
If you want to use your assets in the entire application, you can load the asset in the main method, similar to this:
void main() async {
StorageUtils.localStorage = await SharedPreferences.getInstance();
}
Now you can use localStorage synchronously in the entire application, and you don't need to deal with more asynchronous calls or load it again.
Different example, same principle.
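Applied to the config.json case, a minimal sketch could look like this (appConfig and MyApp are placeholder names, not from the original answer):
List appConfig = [];

Future<void> main() async {
  // Required before loading assets via rootBundle ahead of runApp.
  WidgetsFlutterBinding.ensureInitialized();
  appConfig = await loadAsset(); // loadAsset() from the question above
  runApp(MyApp());
}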

How to read file from an imported library

I have two packages: webserver and utils which provides assets to webserver.
The webserver needs access to static files inside utils. So I have this setup:
utils/
lib/
static.html
How can I access the static.html file in one of my dart scripts in webserver?
EDIT: What I tried so far, is to use mirrors to get the path of the library, and read it from there. The problem with that approach is, that if utils is included with package:, the url returned by currentMirrorSystem().findLibrary(#utils).uri is a package uri, that can't be transformed to an actual file entity.
Use the Resource class, a new class in Dart SDK 1.12.
Usage example:
var resource = new Resource('package:myapp/myfile.txt');
var contents = await resource.loadAsString();
print(contents);
This works on the VM, as of 1.12.
However, this doesn't directly address your need to get to the actual File entity, from a package: URI. Given the Resource class today, you'd have to route the bytes from loadAsString() into the HTTP server's Response object.
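A rough, untested sketch of that routing with dart:io, assuming the Resource class is available as described above (dart:core in SDK 1.12, package:resource in later SDKs) and the utils package from the question:
import 'dart:io';

Future serveStaticHtml() async {
  final server = await HttpServer.bind('localhost', 8080);
  await for (final request in server) {
    // Load the asset via its package: URI and route the contents into the response.
    final resource = new Resource('package:utils/static.html');
    request.response.write(await resource.loadAsString());
    await request.response.close();
  }
}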
I tend to use Platform.script or mirrors to find the main package top folder (i.e. where pubspec.yaml is present) and find imported packages' exported assets. I agree this is not a perfect solution, but it works.
import 'dart:io';
import 'dart:mirrors';
import 'package:path/path.dart';
String getProjectTopPath(String resolverPath) {
String dirPath = normalize(absolute(resolverPath));
while (true) {
// Find the project root path
if (new File(join(dirPath, "pubspec.yaml")).existsSync()) {
return dirPath;
}
String newDirPath = dirname(dirPath);
if (newDirPath == dirPath) {
throw new Exception("No project found for path '$resolverPath'");
}
dirPath = newDirPath;
}
}
String getPackagesPath(String resolverPath) {
return join(getProjectTopPath(resolverPath), 'packages');
}
class _TestUtils {}
main(List<String> arguments) {
// Use Platform.script - does not work in unit tests
String currentScriptPath = Platform.script.toFilePath();
String packagesPath = getPackagesPath(currentScriptPath);
// Get your file using the package name and its relative path from the lib folder
String filePath = join(packagesPath, "utils", "static.html");
print(filePath);
// use mirror to find this file path
String thisFilePath = (reflectClass(_TestUtils).owner as LibraryMirror).uri.toString();
packagesPath = getPackagesPath(thisFilePath);
filePath = join(packagesPath, "utils", "static.html");
print(filePath);
}
Note that Platform.script has recently become unreliable in unit tests when using the new test package, so you might use the mirror trick that I propose above, explained here: https://github.com/dart-lang/test/issues/110

How to Get Filename when using file pattern match in google-cloud-dataflow

Someone know how to get Filename when using file pattern match in google-cloud-dataflow?
I'm a newbie to Dataflow. How do I get the filename when using file pattern matching, in this way?
p.apply(TextIO.Read.from("gs://dataflow-samples/shakespeare/*.txt"))
I'd like to know how I can detect filenames such as kinglear.txt, Hamlet.txt, etc.
If you would like to simply expand the filepattern and get a list of filenames matching it, you can use GcsIoChannelFactory.match("gs://dataflow-samples/shakespeare/*.txt") (see GcsIoChannelFactory).
If you would like to access the "current filename" from inside one of the DoFn's downstream in your pipeline - that is currently not supported (though there are some workarounds - see below). It is a common feature request and we are still thinking how best to fit it into the framework in a natural, generic and high-performant way.
Some workarounds include:
Writing a pipeline like this (the tf-idf example uses this approach):
DoFn readFile = ...(takes a filename, reads the file and produces records)...
p.apply(Create.of(filenames))
.apply(ParDo.of(readFile))
.apply(the rest of your pipeline)
This has the downside that dynamic work rebalancing features won't work particularly well, because they currently apply at the level of Read PTransform's only, but not at the level of ParDo's with high fan-out (like the one here, which would read a file and produce all records); and parallelization will only work to the level of files but files will not be split into sub-ranges. At the scale of reading Shakespeare this is not an issue, but if you are reading a set of files of wildly different size, some extremely large, then it may become an issue.
Implementing your own FileBasedSource (javadoc, general documentation) which would return records of type something like Pair<String, T> where the String is the filename and the T is the record you're reading. In this case the framework would handle the filepattern matching for you, dynamic work rebalancing would work just fine, however it is up to you to write the reading logic in your FileBasedReader.
Both of these work-arounds are non-ideal, but depending on your requirements, one of them may do the trick for you.
Update based on latest SDK
Java (sdk 2.9.0):
Beam's TextIO readers do not give access to the filename itself; for these use cases we need to make use of FileIO to match the files and gain access to the information stored in the file name. Unlike TextIO, the reading of the file needs to be taken care of by the user in transforms downstream of the FileIO read. The result of a FileIO read is a PCollection of ReadableFile; the ReadableFile class contains the file name as metadata, which can be used along with the contents of the file.
FileIO does have a convenience method readFullyAsUTF8String() which will read the entire file into a String object, but this will read the whole file into memory first. If memory is a concern, you can work with the file directly using utility classes like FileSystems.
From the FileIO documentation:
PCollection<KV<String, String>> filesAndContents = p
.apply(FileIO.match().filepattern("hdfs://path/to/*.gz"))
// withCompression can be omitted - by default compression is detected from the filename.
.apply(FileIO.readMatches().withCompression(GZIP))
.apply(MapElements
// uses imports from TypeDescriptors
.into(kvs(strings(), strings()))
.via((ReadableFile f) -> KV.of(
f.getMetadata().resourceId().toString(), f.readFullyAsUTF8String())));
Python (sdk 2.9.0):
For 2.9.0 of the Python SDK you will need to collect the list of URIs from outside of the Dataflow pipeline and feed it in as a parameter to the pipeline, for example by making use of FileSystems to read in the list of files via a Glob pattern and then passing that to a PCollection for processing.
Once fileio is available (see PR https://github.com/apache/beam/pull/7791/), the following code would also be an option for Python.
import apache_beam as beam
from apache_beam.io import fileio

with beam.Pipeline() as p:
    readable_files = (p
                      | fileio.MatchFiles('hdfs://path/to/*.txt')
                      | fileio.ReadMatches()
                      | beam.Reshuffle())
    files_and_contents = (readable_files
                          | beam.Map(lambda x: (x.metadata.path,
                                                x.read_utf8())))
One approach is to build a List<PCollection> where each entry corresponds to an input file, then use Flatten. For example, if you want to parse each line of a collection of files into a Foo object, you might do something like this:
public static class FooParserFn extends DoFn<String, Foo> {
private String fileName;
public FooParserFn(String fileName) {
this.fileName = fileName;
}
@Override
public void processElement(ProcessContext processContext) throws Exception {
String line = processContext.element();
// here you have access to both the line of text and the name of the file
// from which it came.
}
}
public static void main(String[] args) {
...
List<String> inputFiles = ...;
List<PCollection<Foo>> foosByFile =
Lists.transform(inputFiles,
new Function<String, PCollection<Foo>>() {
@Override
public PCollection<Foo> apply(String fileName) {
return p.apply(TextIO.Read.from(fileName))
.apply(ParDo.of(new FooParserFn(fileName)));
}
});
PCollection<Foo> foos = PCollectionList.<Foo>empty(p).and(foosByFile).apply(Flatten.<Foo>pCollections());
...
}
One downside of this approach is that, if you have 100 input files, you'll also have 100 nodes in the Cloud Dataflow monitoring console. This makes it hard to tell what's going on. I'd be interested in hearing from the Google Cloud Dataflow people whether this approach is efficient.
I also had the 100 input files = 100 nodes on the dataflow diagram when using code similar to @danvk's. I switched to an approach like this, which resulted in all the reads being combined into a single block that you can expand to drill down into each file/directory that was read. The job also ran faster using this approach rather than the Lists.transform approach in our use case.
GcsOptions gcsOptions = options.as(GcsOptions.class);
List<GcsPath> paths = gcsOptions.getGcsUtil().expand(GcsPath.fromUri(options.getInputFile()));
List<String> filesToProcess = paths.stream().map(item -> item.toString()).collect(Collectors.toList());
PCollectionList<SomeClass> pcl = PCollectionList.empty(p);
for(String fileName : filesToProcess) {
pcl = pcl.and(
p.apply("ReadAvroFile" + fileName, AvroIO.Read.named("ReadFromAvro")
.from(fileName)
.withSchema(SomeClass.class)
)
.apply(ParDo.of(new MyDoFn(fileName)))
);
}
// flatten the PCollectionList, combining all the PCollections together
PCollection<SomeClass> flattenedPCollection = pcl.apply(Flatten.pCollections());
This might be a very late post for the above question, but I wanted to add an answer with Beam's bundled classes.
This could also be seen as code extracted from the solution provided by @Reza Rokni.
PCollection<String> listOfFilenames =
pipe.apply(FileIO.match().filepattern("gs://apache-beam-samples/shakespeare/*"))
.apply(FileIO.readMatches())
.apply(
MapElements.into(TypeDescriptors.strings())
.via(
(FileIO.ReadableFile file) -> {
String f = file.getMetadata().resourceId().getFilename();
System.out.println(f);
return f;
}));
pipe.run().waitUntilFinish();
The PCollection<String> above will contain the list of file names available in the provided directory.
I was struggling with the same use case while using a wildcard to read files from GCS, but I also needed to modify the collection based on the file name. The key is to use ReadFromTextWithFilename instead of ReadFromText. In Java you already have a way out and you can use:
String filename = context.element().getMetadata().resourceId().getCurrentDirectory().toString();
inside your processElement method.
But for Python the technique below will work:
-> Use beam.io.ReadFromTextWithFilename for reading the wildcard path from GCS
-> As per the document, ReadFromTextWithFilename returns the file's name and the file's content.
Below is the code snippet:
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

class GetFileNameFromWildcard(beam.DoFn):
def process(self, element, *args, **kwargs):
file_path, content = element
schema = ["id","name","mob","email","dept","store"]
store_name = file_path.split("/")[-2]
content_list = content.split(",")
content_list.append(store_name)
out_dict = dict(zip(schema,content_list))
print(out_dict)
yield out_dict
def run():
pipeline_options = PipelineOptions()
with beam.Pipeline(options=pipeline_options) as p:
# saving main session so that it can load global namespace on the Cloud Dataflow Worker
init = p | 'Begin Pipeline With Initiator' >> beam.Create(
["pcollection initializer"]) | 'Read From GCS' >> beam.io.ReadFromTextWithFilename(
"gs://<bkt-name>/20220826/*/dlp*", skip_header_lines=1) | beam.ParDo(
GetFileNameFromWildcard()) | beam.io.WriteToText(
'df_out.csv')

How do I list the contents of a directory with Dart?

I would like to list all the contents of a directory (on the file system) using Dart. How can I do this?
How to list the contents of a directory in Dart
final dir = Directory('path/to/directory');
final List<FileSystemEntity> entities = await dir.list().toList();
This creates a Directory from a path. Then it converts it to a list of FileSystemEntity, which can be a File, a Directory, or a Link. By default subdirectories are not listed recursively.
If you want to print that list, then add this line:
entities.forEach(print);
If you want to only get the files then you could do it like so:
final Iterable<File> files = entities.whereType<File>();
The API has changed and I have updated the async code for the M4 release (0.5.16_r23799):
Future<List<FileSystemEntity>> dirContents(Directory dir) {
var files = <FileSystemEntity>[];
var completer = Completer<List<FileSystemEntity>>();
var lister = dir.list(recursive: false);
lister.listen (
(file) => files.add(file),
// should also register onError
onDone: () => completer.complete(files)
);
return completer.future;
}
The list method returns a Stream where each emitted event is a directory entry:
Directory dir = Directory('.');
// execute an action on each entry
dir.list(recursive: false).forEach((f) {
print(f);
});
As the name suggests, the listSync method is the blocking version:
// create a list of entries
List<FileSystemEntity> entries = dir.listSync(recursive: false).toList();
Which method to use depends on the application context. A note directly from the docs:
Unless you have a specific reason for using the synchronous version of a method, prefer the asynchronous version to avoid blocking your program.
This answer is out of date. Please see the accepted answer.
There are two ways to list the contents of a directory using the Dart VM and the dart:io library.
(note: the following only works in the Dart VM when running on the command-line or as a server-side app. This does not work in a browser or when compiled to JavaScript.)
Setup
First, you need to import the dart:io library. This library contains the classes required to access files, directories, and more.
import 'dart:io';
Second, create a new instance of the Directory class.
var dir = new Directory('path/to/my/dir');
Listing contents in a script
The easiest way is to use the new listSync method. This returns a List of contents. By default this does not recurse.
List contents = dir.listSync();
for (var fileOrDir in contents) {
if (fileOrDir is File) {
print(fileOrDir.name);
} else if (fileOrDir is Directory) {
print(fileOrDir.path);
}
}
If you want to recurse through directories, you can use the optional parameter recursive.
List allContents = dir.listSync(recursive: true);
WARNING: if your directory structure has circular symlinks, the above code will crash because it follows symlinks recursively.
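If that is a concern, a small tweak (not part of the original answer) is to tell listSync not to follow symbolic links:
// Recurse into subdirectories without following symlinks,
// which avoids looping forever on circular links.
List allContents = dir.listSync(recursive: true, followLinks: false);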
This method, using listSync, is especially useful when you are writing a shell script, command-line utility, or similar app or script with Dart.
Listing contents in a server
A second way to list the contents of a directory is to use the async version of list. You would use this second method when you need to list a directory in response to, say, an HTTP request. Remember that each of Dart's isolates runs in a single thread. Any long running process can block the event loop. When interactivity is important, or serving lots of clients from a single Dart script, use the async version.
With the async version, dir.list() returns a DirectoryLister. You can register three different callbacks on DirectoryLister:
onFile: called when a file or directory is encountered
onDone: called when the directory lister is done listing contents
onError: called when the lister encounters some error
Here is a simple function that returns a Future of a list of strings, containing file names in a directory:
Future<List<String>> dirContents(Directory dir) {
var filenames = <String>[];
var completer = new Completer();
var lister = dir.list();
lister.onFile = (filename) => filenames.add(filename);
// should also register onError
lister.onDone = (_) => completer.complete(filenames);
return completer.future;
}
Of course, while this method is perfect for servers, it's more cumbersome for simple scripts.
Luckily, Dart supports both methods for you to use!
Inside an asynchronous function, write this:
List<FileSystemEntity> allContents = await Directory("folder/path").list().toList();
Now you have a list with all of the contents
Here is my version using async/await, returning a List of Files only:
Future<List<File>> filesInDirectory(Directory dir) async {
  List<File> files = <File>[];
  await for (FileSystemEntity entity in dir.list(recursive: false, followLinks: false)) {
    FileSystemEntityType type = await FileSystemEntity.type(entity.path);
    if (type == FileSystemEntityType.file) {
      files.add(entity as File);
      print(entity.path);
    }
  }
  return files;
}
With this function you can print all the directories and files of a directory.
You just need to pass a specific path.
Future listDir(String folderPath) async {
var directory = new Directory(folderPath);
print(directory);
var exists = await directory.exists();
if (exists) {
print("exits");
directory
.list(recursive: true, followLinks: false)
.listen((FileSystemEntity entity) {
print(entity.path);
});
}
}
To get a list of file names containing a certain string, you can use this code:
String directory = (await getApplicationDocumentsDirectory()).path;
List<FileSystemEntity> files = Directory(directory).listSync(recursive: false);
List<String> filePaths = [];
for (var fileSystemEntity in files) {
if (basename(fileSystemEntity.path).contains('mystring')) {
filePaths.add(fileSystemEntity.path);
}
}
You can use the basename function if you need just the file name, and not the whole path.
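For example (a trivial illustration with a made-up path):
print(basename('/storage/emulated/0/notes_mystring.txt')); // prints: notes_mystring.txt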
To get all files with a given extension:
import 'package:path/path.dart' as p;
Future<List<File>> getAllFilesWithExtension(String path, String extension) async {
final List<FileSystemEntity> entities = await Directory(path).list().toList();
return entities.whereType<File>().where((element) => p.extension(element.path) == extension).toList();
}
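A hypothetical usage example, with placeholder arguments:
final txtFiles = await getAllFilesWithExtension('path/to/directory', '.txt');
print(txtFiles.map((f) => f.path).toList());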
