Say I have a StreamController<int> called myController. I understand that I can map the output of the stream by doing something like Stream<int> intStream = myController.stream.map((i) => i * 2) and then listen to that.
But what if I wanted to intercept/map the input of the sink, how would I achieve this? Is there an API for this? For clarity, have a look at this fake non-working example: Sink<bool> boolSink = myController.sink.resink<int>((bool b) => b ? 1 : 0). So a mapping that takes a sink of bools and converts it to a sink of ints. Then I would do boolSink.add(true) and expect intStream to emit 1.
My goal is to provide simplified sinks (Sink<void>) to a few components, but without them having to place the particular value into the sink (because they will each only ever be adding a predefined value). And ideally without having to manage multiple StreamControllers myself.
My solution was to have a class which implements the Sink interface. It intercepts calls to the add method, maps the value being added to a new value, and then forwards it on to the "real" sink. I need a new instance of the class for each type of mapping.
import 'package:meta/meta.dart';

typedef Mapper<T, R> = R Function(T);

/// A [Sink] which takes a value of type [T] and maps it to a destination sink
/// which expects a value of type [R].
class SinkMapper<T, R> implements Sink<T> {
  final Mapper<T, R> _mapper;
  final Sink<R> _sink;

  SinkMapper({
    @required Mapper<T, R> mapper,
    @required Sink<R> sink,
  })  : _mapper = mapper,
        _sink = sink;

  @override
  void add(T data) => _sink.add(_mapper(data));

  @override
  void close() => _sink.close();
}
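For example, here is a minimal usage sketch (the controller and the bool-to-int mapping are illustrative): each component receives a Sink<bool>, while the underlying stream keeps emitting ints.

import 'dart:async';

final myController = StreamController<int>();

// Present the int sink as a bool sink.
final Sink<bool> boolSink = SinkMapper<bool, int>(
  mapper: (b) => b ? 1 : 0,
  sink: myController.sink,
);

boolSink.add(true); // myController.stream emits 1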
I'm trying to construct a Map object with Set as keys:
final myObject = <Set<MyType1>, MyType2>{};
I can successfully insert new values into my object:
myObject[{myType1}] = myType2;
However, I cannot get anything back from it, as two sets with the same values won't be considered equal:
{0} == {0}; // <- false
myObject[{myType1}]; // <- null
Is there something built-in that could allow me to use a Map<Set<T>, U> object?
You can create your own class that wraps a Set and overrides == and hashCode. That solves the problem.
A very rough example to demonstrate my idea:
import 'package:flutter/foundation.dart'; // setEquals: https://api.flutter.dev/flutter/foundation/setEquals.html

class MySet<T> {
  final Set<T> inner;

  MySet(this.inner);

  @override
  bool operator ==(Object other) =>
      other is MySet<T> && setEquals(inner, other.inner);

  @override
  int get hashCode => Object.hashAllUnordered(inner);
}
P.S. Set's == and hashCode are deliberately not content-based, because comparing contents on every lookup would be very costly.
So, basically, I need to restrict which types can be used in a Type variable, something like this:
class ElementFilter<T extends Element> {
  // What I want is something like Type<T>, but Type does not take a generic parameter.
  final Type<T> elementType;

  ElementFilter(this.elementType);
}

List<T> filterElements<T extends Element>(ElementFilter<T> filter) {
  return elements.where((el) => _isOfType(el, filter.elementType)).toList();
}

filterElements(ElementFilter(ClassThatExtendsElement)); // Would work fine
filterElements(ElementFilter(String)); // Error: String does not extend Element
So it would only be possible to create ElementFilters with types that extend Element. Is this possible in some way?
I think you probably want:
/// Example usage: ElementFilter<ClassThatExtendsElement>();
class ElementFilter<T extends Element> {
  final Type elementType;

  ElementFilter() : elementType = T;
}
Unfortunately, there's no way to make the generic type argument non-optional. You will have to choose between having a required argument and having a compile-time constraint on the Type argument.
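For completeness, a rough sketch of the required-argument branch of that trade-off (the check necessarily moves to run time):

class ElementFilter<T extends Element> {
  final Type elementType;

  ElementFilter(this.elementType) {
    // Run-time check only: ElementFilter<Element>(String) still compiles.
    assert(elementType == T, '$elementType does not match $T');
  }
}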
Dart doesn't support algebraic types, so if you additionally want to support a finite set of types that don't derive from Element, you could make specialized derived classes and require that clients use those instead of ElementFilter. For example:
class StringElementFilter extends ElementFilter<Element> {
  @override
  final Type elementType = String;
}
(You also could create a StringElement class that extends Element if you want, but at least for this example, it would serve no purpose.)
I highly recommend not using Type objects at all. Ever. They're pretty useless, and if you have the type available as a type parameter, you're always better off. (The type variable can always be converted to a Type object, but it can also be actually useful in many other ways).
Example:
class ElementFilter<T extends Element> {
  bool test(Object? element) => element is T;

  Iterable<T> filterElements(Iterable<Object?> elements) =>
      elements.whereType<T>();
}

List<T> filterElements<T extends Element>(ElementFilter<T> filter) =>
    filter.filterElements(elements).toList();

filterElements(ElementFilter<ClassThatExtendsElement>()); // Would work fine
filterElements(ElementFilter<String>()); // Error: String does not extend Element
I am attempting to construct an abstract class that requires a named constructor in Dart. Given some Map (m), this generic type must be able to instantiate itself.
The Dart compiler is throwing T.fromJson -> Invalid constructor name.
My attempt at coding this:
abstract class JsonMap<T> {
  Map toJson();
  T.fromJson(Map m);
}
I struggled with the same concept (in the same place: API parsing :) ) and didn't find a proper solution.
But maybe you can use this approach, which I found while looking into the BLoC pattern (though I am not using it for my model part):
import 'package:flutter/widgets.dart';

abstract class SomeBase {
  void load();
}

class Provider<T extends SomeBase> extends InheritedWidget {
  final T something;

  Provider({
    Key key,
    @required this.something,
  }) : super(key: key);

  @override
  bool updateShouldNotify(_) {
    return true;
  }

  static Type _typeOf<T>() => T;

  // Note: inheritFromWidgetOfExactType is the pre-Flutter-1.12 API; newer
  // versions use dependOnInheritedWidgetOfExactType.
  static T of<T extends SomeBase>(BuildContext context) {
    final type = _typeOf<Provider<T>>();
    Provider<T> provider = context.inheritFromWidgetOfExactType(type);
    return provider.something;
  }
}
Or just use this without encapsulating it in an InheritedWidget, and provide already-initialised objects (like a user, or whatever you are parsing) that load their values from the provided JSON.
You're creating a class named JsonMap that is parameterized on type T. T is not the name of your class, so T.fromJson is not a valid named constructor for JsonMap.
If you want JsonMap to have a named constructor, it should be JsonMap.fromJson(Map m).
Untested, but off the top of my head, you should write your code like so:
abstract class JsonMap<T> {
  Map<String, dynamic> toJson();
  T fromJson(Map<String, dynamic> m);
}
The dot makes fromJson(Map m) a constructor of type T, or a static function belonging to type T. Without the dot, it is an instance method of the abstract class JsonMap, returning type T. Specifying the map type is good practice if you know what it will be (as with JSON).
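To make that distinction concrete, here is a hypothetical subclass (User and its fields are invented for illustration): the real named constructor lives on the concrete type, while the abstract class can only require the instance method.

class User implements JsonMap<User> {
  final String name;

  User(this.name);

  // The named constructor belongs to the concrete type, not the interface.
  User.fromJson(Map<String, dynamic> m) : name = m['name'] as String;

  @override
  Map<String, dynamic> toJson() => {'name': name};

  @override
  User fromJson(Map<String, dynamic> m) => User.fromJson(m);
}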
I've implemented a stream transformer. Please note that it is only an exercise (in order to learn Dart). This transformer converts integers into strings. The code is given below, and you can also find it on GitHub.
// Conceptually, a transformer is simply a function from Stream to Stream that
// is encapsulated into a class.
//
// A transformer is made of:
// - A stream controller. The controller provides the "output" stream that will
//   receive the transformed values.
// - A "bind()" method. This method is called by the "input" stream's "transform"
//   method (inputStream.transform(<the stream transformer>)).
import 'dart:async';

/// This class defines the interface of a function-like object that converts a
/// value of a given type (S) into a value of another type (T).
abstract class TypeCaster<S, T> {
  T call(S value);
}

/// This class emulates a converter from integers to strings.
class Caster extends TypeCaster<int, String> {
  @override
  String call(int value) {
    return "<${value.toString()}>";
  }
}
// StreamTransformer<S, T> is an abstract class. The methods listed below must
// be implemented:
// - Stream<T> bind(Stream<S> stream)
// - StreamTransformer<RS, RT> cast<RS, RT>()
class CasterTransformer<S, T> implements StreamTransformer<S, T> {
  StreamController<T> _controller;
  bool _cancelOnError;
  TypeCaster<S, T> _caster;

  // Original (or input) stream.
  Stream<S> _stream;

  // The stream subscription returned by the call to "listen" on the original
  // (input) stream (_stream.listen(...)).
  StreamSubscription<S> _subscription;

  /// Constructor that creates a unicast stream.
  /// [caster] An instance of "type caster".
  CasterTransformer(TypeCaster<S, T> caster, {
    bool sync = false,
    bool cancelOnError = true,
  }) {
    _controller = StreamController<T>(
      onListen: _onListen,
      onCancel: _onCancel,
      onPause: () => _subscription.pause(),
      onResume: () => _subscription.resume(),
      sync: sync,
    );
    _cancelOnError = cancelOnError;
    _caster = caster;
  }

  /// Constructor that creates a broadcast stream.
  /// [caster] An instance of "type caster".
  CasterTransformer.broadcast(TypeCaster<S, T> caster, {
    bool sync = false,
    bool cancelOnError = true,
  }) {
    _cancelOnError = cancelOnError;
    _controller = StreamController<T>.broadcast(
      onListen: _onListen,
      onCancel: _onCancel,
      sync: sync,
    );
    _caster = caster;
  }

  /// Handler executed whenever a listener subscribes to the controller's stream.
  ///
  /// Note: when the transformer is applied to the original stream, through a
  /// call to the method "transform", the method "bind()" is called behind the
  /// scenes, and "bind()" returns the controller's stream.
  /// When a listener is attached to the controller's stream, this function
  /// ("_onListen()") is executed. It sets the handler ("_onData") that will be
  /// executed each time a value appears in the original stream. That handler
  /// takes the incoming value, casts it, and injects it into the (controller's)
  /// output stream.
  ///
  /// Note: this method is called only once, whereas "_onData" is called as
  /// many times as there are values to transform.
  void _onListen() {
    _subscription = _stream.listen(
      _onData,
      onError: _controller.addError,
      onDone: _controller.close,
      cancelOnError: _cancelOnError,
    );
  }

  /// Handler executed whenever the subscription to the controller's stream is
  /// cancelled.
  void _onCancel() {
    _subscription.cancel();
    _subscription = null;
  }

  /// Handler executed whenever data comes from the original (input) stream.
  /// Please note that the transformation takes place here.
  /// Note: this method is called as many times as there are values to transform.
  void _onData(S data) {
    _controller.add(_caster(data));
  }

  /// This method is called once, when the stream transformer is assigned to
  /// the original (input) stream. It returns the stream provided by the
  /// controller.
  ///
  /// Note: here, you can see that the process transforms a value of type S
  /// into a value of type T. Thus, it is necessary to provide a function that
  /// performs the conversion from type S to type T.
  ///
  /// Note: the returned stream may accept only one, or more than one, listener.
  /// This depends on the constructor used to instantiate the transformer:
  /// * CasterTransformer() => only one listener.
  /// * CasterTransformer.broadcast() => one or more listeners.
  Stream<T> bind(Stream<S> stream) {
    _stream = stream;
    return _controller.stream;
  }

  // TODO: what should this method do? Find the answer.
  StreamTransformer<RS, RT> cast<RS, RT>() {
    return StreamTransformer<RS, RT>((Stream<RS> stream, bool b) {
      // What should we do here?
    });
  }
}
main() {
  // ---------------------------------------------------------------------------
  // TEST: unicast controller.
  // ---------------------------------------------------------------------------

  // Create a controller that will be used to inject integers into the "input"
  // stream.
  StreamController<int> controllerUnicast = StreamController<int>();

  // Get the stream "to control".
  Stream<int> integerStreamUnicast = controllerUnicast.stream;

  // Apply a transformer on the "input" stream.
  // The method "transform" calls the method "bind", which returns the stream
  // that receives the transformed values.
  Stream<String> stringStreamUnicast =
      integerStreamUnicast.transform(CasterTransformer<int, String>(Caster()));

  stringStreamUnicast.listen((data) {
    print('String => $data');
  });

  // Inject integers into the "input" stream.
  controllerUnicast.add(1);
  controllerUnicast.add(2);
  controllerUnicast.add(3);

  // ---------------------------------------------------------------------------
  // TEST: broadcast controller.
  // ---------------------------------------------------------------------------

  StreamController<int> controllerBroadcast = StreamController<int>.broadcast();
  Stream<int> integerStreamBroadcast = controllerBroadcast.stream;

  Stream<String> stringStreamBroadcast = integerStreamBroadcast
      .transform(CasterTransformer<int, String>.broadcast(Caster()));

  stringStreamBroadcast.listen((data) {
    print('Listener 1: String => $data');
  });

  stringStreamBroadcast.listen((data) {
    print('Listener 2: String => $data');
  });

  controllerBroadcast.add(1);
  controllerBroadcast.add(2);
  controllerBroadcast.add(3);
}
The class CasterTransformer<S, T> implements the abstract class StreamTransformer<S, T>.
Thus, it implements the method StreamTransformer<RS, RT> cast<RS, RT>().
The documentation says:
The resulting transformer will check at run-time that all data events of the stream it transforms are actually instances of S, and it will check that all data events produced by this transformer are actually instances of RT.
See: https://api.dartlang.org/stable/2.1.0/dart-async/StreamTransformer/cast.html
First, I think there is a typo in this documentation: it should say "...it transforms are actually instances of RS" (instead of S).
However, this seems obscure to me.
Why do we need a stream transformer to check value types? The purpose of a transformer is to transform, isn't it? If the purpose of a component is to check, then why don't we call it a checker?
And also, why would we need to check that the transformer (we implement) produces the required data? If it doesn't, then we face a bug that should be fixed.
Can someone explain the purpose of the method cast()?
The cast method is there to help typing the operation.
If you have a StreamTransformer<num, int>, it transforms numbers to integers (say, by calling .toInt() on them and then adding 42, because that is obviously useful!).
If you want to use that transformer in some place that expects a StreamTransformer<int, num>, then you can't. Since num is not a sub-type of int, the transformer is not assignable to that type.
But you know, because you understand how a stream transformer actually works, that the first type argument is only used for inputs. Something that accepts any num should safely be useable where it's only given ints.
So, to convince the type system that you know what you are doing, you write:
StreamTransformer<int, num> transform = myTranformer.cast<int, num>();
Now, the transformer takes any integer (RS), checks that it's a num (S), and passes it to myTransformer, which calls toInt() and adds 42; the resulting int (T) is passed back, and the transformer checks that it is a num (RT) and emits it.
Everything works and the type system is happy.
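As a concrete sketch of that example (the toInt()-plus-42 transformer is hypothetical, built here with StreamTransformer.fromHandlers):

import 'dart:async';

final StreamTransformer<num, int> myTransformer =
    StreamTransformer.fromHandlers(
  handleData: (num value, EventSink<int> sink) => sink.add(value.toInt() + 42),
);

// Not directly assignable: StreamTransformer<num, int> is not a subtype of
// StreamTransformer<int, num>. cast() adds the run-time checks that convince
// the static type system.
final StreamTransformer<int, num> transformer = myTransformer.cast<int, num>();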
You can use cast to do things that will never work at run-time, because all it does is to add extra run-time checks that convinces the static type system that things will either succeed or throw at those checks.
The easiest way to get an implementation of StreamTransformer.cast is to use the StreamTransformer.castFrom static method:
StreamTransformer<RS, RT> cast<RS, RT>() => StreamTransformer.castFrom(this);
That will use the system's default cast wrapper on your own transformer.
Let me simplify my case. I'm using Apache Beam 0.6.0. My final processed result is PCollection<KV<String, String>>. And I want to write values to different files corresponding to their keys.
For example, let's say the result consists of
(key1, value1)
(key2, value2)
(key1, value3)
(key1, value4)
Then I want to write value1, value3, and value4 to key1.txt, and write value2 to key2.txt.
And in my case:
Key set is determined when the pipeline is running, not when constructing the pipeline.
Key set may be quite small, but the number of values corresponding to each key may be very, very large.
Any ideas?
Handily, I wrote a sample of this case just the other day.
This example is Dataflow 1.x style.
Basically, you group by each key, and then you can do this with a custom transform that connects to Cloud Storage. The caveat is that your list of lines per file shouldn't be massive (it has to fit into memory on a single instance, but given that you can run high-memory instances, that limit is pretty high).
...
PCollection<KV<String, List<String>>> readyToWrite = groupedByFirstLetter
    .apply(Combine.perKey(AccumulatorOfWords.getCombineFn()));

readyToWrite.apply(
    new PTransformWriteToGCS("dataflow-experiment", TonyWordGrouper::derivePath));
...
And then the transform doing most of the work is:
public class PTransformWriteToGCS
    extends PTransform<PCollection<KV<String, List<String>>>, PCollection<Void>> {

  private static final Logger LOG = Logging.getLogger(PTransformWriteToGCS.class);
  private static final Storage STORAGE = StorageOptions.getDefaultInstance().getService();

  private final String bucketName;
  private final SerializableFunction<String, String> pathCreator;

  public PTransformWriteToGCS(final String bucketName,
      final SerializableFunction<String, String> pathCreator) {
    this.bucketName = bucketName;
    this.pathCreator = pathCreator;
  }

  @Override
  public PCollection<Void> apply(final PCollection<KV<String, List<String>>> input) {
    return input.apply(ParDo.of(new DoFn<KV<String, List<String>>, Void>() {

      @Override
      public void processElement(
          final DoFn<KV<String, List<String>>, Void>.ProcessContext arg0)
          throws Exception {
        final String key = arg0.element().getKey();
        final List<String> values = arg0.element().getValue();
        final String toWrite = values.stream().collect(Collectors.joining("\n"));
        final String path = pathCreator.apply(key);
        final BlobInfo blobInfo = BlobInfo.newBuilder(bucketName, path)
            .setContentType(MimeTypes.TEXT)
            .build();
        LOG.info("blob writing to: {}", blobInfo);
        final Blob result = STORAGE.create(blobInfo,
            toWrite.getBytes(StandardCharsets.UTF_8));
      }
    }));
  }
}
Just write a loop in a ParDo function!
More details:
I had the same scenario today; the only difference is that in my case key=image_label and value=image_tf_record. So, like what you asked, I am trying to create separate TFRecord files, one per class, each containing a number of images. However, I am not sure whether there might be memory issues when the number of values per key is very high, as in your scenario:
(Also my code is in Python)
class WriteToSeparateTFRecordFiles(beam.DoFn):

    def __init__(self, outdir):
        self.outdir = outdir

    def process(self, element):
        label, image_list = element
        writer = tf.python_io.TFRecordWriter(
            self.outdir + "/tfr" + str(label) + '.tfrecord')
        for example in image_list:
            writer.write(example.SerializeToString())
        writer.close()
And then, in your pipeline, just after the stage where you get the key-value pairs, add these two lines:

(p
 | 'GroupByLabelId' >> beam.GroupByKey()
 | 'SaveToMultipleFiles' >> beam.ParDo(WriteToSeparateTFRecordFiles(outdir))
)
You can use FileIO.writeDynamic() for that:

PCollection<KV<String, String>> readfile = (something you read..);

readfile.apply(FileIO.<String, KV<String, String>>writeDynamic()
    .by(KV::getKey)
    .withDestinationCoder(StringUtf8Coder.of())
    .via(Contextful.fn(KV::getValue), TextIO.sink())
    .to("somefolder")
    .withNaming(key -> FileIO.Write.defaultNaming(key, ".txt")));

p.run();
In the Apache Beam 2.2 Java SDK, this is natively supported in TextIO and AvroIO, using TextIO.write().to(DynamicDestinations) and AvroIO.write().to(DynamicDestinations) respectively. See e.g. this method.
Update (2018): Prefer to use FileIO.writeDynamic() together with TextIO.sink() and AvroIO.sink() instead.
Just write the lines below in your ParDo class:

from apache_beam.io import filesystems

# gcsFileName and records are assumed to come from the element being processed.
eventCSVFileWriter = filesystems.FileSystems.create(gcsFileName)
for record in records:
    eventCSVFileWriter.write(record)
eventCSVFileWriter.close()
If you want the full code I can help you with that too.