flink how to combine stream and multiple maps - join

I have one stream (not a keyed stream) and 3 maps (each map is the result of a call to a different REST API).
These 3 maps are static and won't change afterwards.
I want to map the elements in the stream to a new type using these 3 maps. How can I broadcast the 3 maps to the stream?
As far as I know, join and connect are not sufficient for this. Please help.

If the maps (as in java.util.Map) are static, you can just load them inside a RichMapFunction in open() and apply them in map(). To increase performance, you should initialize them only once in a static variable (synchronizing on the class or a static mutex).
If the maps are small, you can also initialize them in your main() and just pass them to a MapFunction as a parameter. As long as everything in the map is Serializable, it will just work.
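
A minimal sketch of the first approach, i.e. a RichMapFunction with lazily initialized static maps. Event, EnrichedEvent, and fetchFromApi1 are hypothetical placeholders for your own types and REST calls, and only one of the three maps is shown; the other two follow the same pattern:

import java.util.Map;

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

// Event and EnrichedEvent stand in for your own stream element types.
public class EnrichFunction extends RichMapFunction<Event, EnrichedEvent> {

    // Static, so the REST call runs once per JVM instead of once per parallel subtask.
    private static Map<String, String> map1;

    @Override
    public void open(Configuration parameters) throws Exception {
        synchronized (EnrichFunction.class) {
            if (map1 == null) {
                map1 = fetchFromApi1(); // placeholder for your REST call
            }
        }
    }

    @Override
    public EnrichedEvent map(Event value) throws Exception {
        // Apply the static map to turn each element into the new type.
        return new EnrichedEvent(value, map1.get(value.getKey()));
    }

    private static Map<String, String> fetchFromApi1() {
        // Call the REST API and build the map here.
        return null;
    }
}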

Related

Find if a point (latitude, longitude) is within OpenStreetMaps "way" (area)

I have a list of longitude and latitude points for various houses, offices, etc. I am trying to split them up to determine whether or not they are inside a certain Way. I don't want to use the old "centre point" of an area plus a radius value, as that is not accurate enough.
So for example, if I had 4 locations in a Way like "Richmond Upon Thames", with points B and C inside the boundary and the other two outside, it should return just points B and C. Is this possible using the OpenStreetMap API?
If you like Java, you could load the way as a Polygon and use the JTS (Java Topology Suite) library, or the AWT library to compute whether your points are inside or not.
Here is an example of how the Atlas library uses a combination of both in that specific case. For you it would look like this:
Convert each latitude/longitude pair of the Way to a Location object
Add each Location to a List and create a new Polygon with it
Call the Polygon.fullyGeometricallyEncloses(Location) method on that Polygon with each of the points of interest you have
The Atlas library is available in Maven Central for you to download.
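A rough sketch of those three steps with the Atlas API (the class and method names reflect my reading of the library, and the coordinates are made-up placeholders; double-check both against the current Atlas release):

import java.util.ArrayList;
import java.util.List;

import org.openstreetmap.atlas.geography.Latitude;
import org.openstreetmap.atlas.geography.Location;
import org.openstreetmap.atlas.geography.Longitude;
import org.openstreetmap.atlas.geography.Polygon;

public class WayContainsPoint {
    public static void main(String[] args) {
        // 1. Convert each latitude/longitude pair of the Way's outline to a Location.
        List<Location> outline = new ArrayList<>();
        outline.add(new Location(Latitude.degrees(51.46), Longitude.degrees(-0.33)));
        outline.add(new Location(Latitude.degrees(51.47), Longitude.degrees(-0.25)));
        outline.add(new Location(Latitude.degrees(51.42), Longitude.degrees(-0.24)));
        outline.add(new Location(Latitude.degrees(51.41), Longitude.degrees(-0.34)));

        // 2. Create a Polygon from the list of Locations.
        Polygon way = new Polygon(outline);

        // 3. Test each point of interest against the Polygon.
        Location pointB = new Location(Latitude.degrees(51.44), Longitude.degrees(-0.29));
        System.out.println(way.fullyGeometricallyEncloses(pointB)); // true if inside
    }
}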

How to access "key" in combine.perKey in beam

In How to create custom Combine.PerKey in beam sdk 2.0, I asked and got a correct answer on how to create a custom Combine.PerKey in the new Beam SDK 2.0. However, I now need a custom Combine.PerKey whose logic can access the contents of the key. This was easily possible in Dataflow 1.x, but in the new Beam SDK 2.0 I'm unsure how to do so. Any little code snippet/example would be extremely useful.
EDIT #1 (per Ben Chambers's request)
The real use case is hard to explain, but I'm going to try:
We have a 3d space composed of millions of little hills. We try to determine the apex of these millions of hills as follows: we create billions of "rectangular probes" for the whole 3d space, and then we ask each of these billions of probes to "move" in a greedy way to the apex. Once it hits the apex, it stops. The probe then returns the apex and itself. The apex is the KEY for which we'll do a custom combine by key.
Now, the custom combine function is going to finally return a final object (called a feature) which is derived from all the probes that reach the same apex (i.e. the same key). When generating this "feature" object, we need to know information about the final apex/key (i.e. the top of the hill). Hence, I need this key info.
One way to solve this is using a group by key, but that was slow (at least in df 1.x); we got it to be fast (in df 1.x) using a custom combine fn. So, we'd like the key. That said, group by key works in Beam SDK 2.0.
Alternatively, we could stick the "apex" information into the "probe" objects themselves, but this means that each of our billions of probe objects would need to triple in size just to hold this apex information (and this apex information repeats itself, since there are only, say, 1 million apexes but 1 billion probes, so this intuitively feels highly inefficient).
Rather than relying on the CombineFn to compute the entire result, could you instead have the CombineFn compute some partial result based only on information about the probes? Then your Combine.perKey(...) returns a PCollection<KV<Apex, InfoAboutProbes>> and you can use a ParDo to combine the information about the apex with the summary information about the probes. This allows you to use the CombineFn to efficiently combine information about many probes, while using a ParDo to access the key.
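Roughly in code (Apex, ProbeSummary, Feature, and SummarizeProbesFn are hypothetical stand-ins for your own types and combine fn; the shape of the pipeline is the point):

import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

// probes is a PCollection<KV<Apex, Probe>>, keyed by the apex each probe reached.
PCollection<KV<Apex, ProbeSummary>> summaries =
        probes.apply(Combine.perKey(new SummarizeProbesFn()));

// The key is available inside the ParDo, so the feature can be built here.
PCollection<Feature> features = summaries.apply(ParDo.of(
        new DoFn<KV<Apex, ProbeSummary>, Feature>() {
            @ProcessElement
            public void processElement(ProcessContext context) {
                Apex apex = context.element().getKey();
                ProbeSummary summary = context.element().getValue();
                context.output(new Feature(apex, summary));
            }
        }));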

Splitting a futures::Stream into multiple streams based on a property of the stream item

I have a Stream of items (u32, Bytes), where the integer is an index in the range 0..n. I would like to split this stream into n streams, basically filtering by the integer.
I considered several possibilities, including
creating n streams each of which peeks at the underlying stream to determine if the next item is for it
pushing the items to one of n sinks when they arrive, and then using the other side of the sink as a stream again (this seems to be related to Forwarding from a futures::Stream to a futures::Sink).
I feel that neither of these possibilities is convincing. The first one seems to create unnecessary overhead and the second one is just not elegant (if it even works, I am not sure).
What's a good way of splitting the stream?
At one point I had a similar requirement and wrote a group_by operator for Stream.
I haven't yet published this to crates.io as I didn't really feel it was ready for consumption but feel free to take a look at the code at https://github.com/Lukazoid/lz_stream_tools or attempt to use it for yourself.
Add the following to your Cargo.toml:
[dependencies]
lz_stream_tools = { git = "https://github.com/Lukazoid/lz_stream_tools" }
And add extern crate lz_stream_tools; to your bin.rs/lib.rs.
Then from your code you may use it like so:
use lz_stream_tools::StreamTools;
let groups = some_stream.group_by(|x| x.0);
groups will now be a Stream of (u32, Stream<Item = Bytes>).
You could use channels to represent the index-specific streams. You'd have to spawn one Task that pulls from the original stream and has a map of Senders.

Is MQTT supported in Cumulocity?

Is it possible to receive MQTT messages via the Cumulocity API?
How can I get the values from the following measurements with the Java clients:
Analog Measurement
Motion Measurement
Thanks
Querying measurements is described here: http://cumulocity.com/guides/java/developing/, Section "Accessing events and measurements". There are currently no pre-defined Java classes for analog measurements and motion measurements; however, you can still retrieve them as generic properties. Check the example on the web page, and instead of
measurementFilter.byFragmentType(SignalStrength.class);
try
measurementFilter.byFragmentType("c8y_MotionMeasurement");
and instead of
measurement.get(SignalStrength.class);
try
measurement.getProperty("c8y_MotionMeasurement");
You can also create the Java classes representing the measurements on your own by "stealing" and modifying one of the existing classes:
https://bitbucket.org/m2m/cumulocity-clients-java/src/53216dc587e24476e0578b788672416e8566f92b/device-capability-model/src/main/java/c8y/?at=default
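Putting it together, a minimal sketch of such a generic query with the Java client (the host and credentials are placeholders, and the exact collection-iteration calls may differ slightly between SDK versions, so verify against the one you use):

import com.cumulocity.model.authentication.CumulocityCredentials;
import com.cumulocity.rest.representation.measurement.MeasurementRepresentation;
import com.cumulocity.sdk.client.Platform;
import com.cumulocity.sdk.client.PlatformImpl;
import com.cumulocity.sdk.client.measurement.MeasurementApi;
import com.cumulocity.sdk.client.measurement.MeasurementFilter;

public class MotionMeasurements {
    public static void main(String[] args) {
        Platform platform = new PlatformImpl("https://<tenant>.cumulocity.com",
                new CumulocityCredentials("<user>", "<password>"));
        MeasurementApi measurementApi = platform.getMeasurementApi();

        // Filter by the fragment name instead of a pre-defined class.
        MeasurementFilter filter = new MeasurementFilter()
                .byFragmentType("c8y_MotionMeasurement");

        for (MeasurementRepresentation measurement
                : measurementApi.getMeasurementsByFilter(filter).get().allPages()) {
            // The fragment comes back as a generic property.
            Object motion = measurement.getProperty("c8y_MotionMeasurement");
            System.out.println(measurement.getTime() + ": " + motion);
        }
    }
}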

GAE Go template.Execute, passing a struct with vector

I'm storing some data in a Go app in a struct's vector.Vector for convenience. I want to display all the data from the vector on a Google App Engine webpage through template.Execute. Is that possible, and how would I access the data in the parsed HTML file? Would it be easier if I used an array or slice instead?
Use slices.
From the Go Weekly Snapshot History, 2011-10-18:
The container/vector package has been deleted. Slices are better: SliceTricks.
