Kafka Streams: serialize back to Avro

I'm trying to build a stream that reads from an Avro topic, does a simple transformation, and then sends the result in Avro format to another topic. I'm stuck on the final serialization part.
I have an Avro schema, and I'm importing it and using it to create the specific Avro Serde. But I don't know how to serialize the movie object back to Avro using this Serde.
This is the stream class:
class StreamsProcessor(val brokers: String, val schemaRegistryUrl: String) {
    private val logger = LogManager.getLogger(javaClass)
    fun process() {
        val streamsBuilder = StreamsBuilder()
        val avroSerde = GenericAvroSerde().apply {
            configure(mapOf(Pair("schema.registry.url", schemaRegistryUrl)), false)
        }
        val movieAvro = SpecificAvroSerde<Movie>().apply {
            configure(mapOf(Pair("schema.registry.url", schemaRegistryUrl)), false)
        }
        val movieAvroStream: KStream<String, GenericRecord> = streamsBuilder
            .stream(movieAvroTopic, Consumed.with(Serdes.String(), avroSerde))
        val movieStream: KStream<String, StreamMovie> = movieAvroStream.map { _, movieAvro ->
            val movie = StreamMovie(
                movieId = movieAvro["name"].toString() + movieAvro["year"].toString(),
                director = movieAvro["director"].toString(),
            )
            KeyValue("${movie.movieId}", movie)
        }
        // This is where I'm stuck: the call is wrong because movieStream is a KStream<String, StreamMovie>, not a KStream<String, Movie>
        movieStream.to(movieTopic, Produced.with(Serdes.String(), movieAvro))
        val topology = streamsBuilder.build()
        val props = Properties()
        props["bootstrap.servers"] = brokers
        props["application.id"] = "movies-stream"
        val streams = KafkaStreams(topology, props)
        streams.start()
    }
}
Thanks

The type of your result stream is KStream<String, StreamMovie>, so the value Serde passed to Produced.with() must be of type SpecificAvroSerde<StreamMovie>.
Why are you trying to use SpecificAvroSerde<Movie>? If Movie is the desired output type, you should create a Movie object in your map step instead of a StreamMovie object, and change the value type of the result KStream accordingly.
Compare https://github.com/confluentinc/kafka-streams-examples/blob/5.4.1-post/src/test/java/io/confluent/examples/streams/SpecificAvroIntegrationTest.java

How to collect a Pair in a parallelStream in Java?

Here is roughly the code that I want to change:
final List<Object> list = evaluators.parallelStream()
    .map(evaluator -> evaluator.evaluate())
    .flatMap(List::stream)
    .collect(Collectors.toList());
I want to change the evaluator.evaluate() method to return a Pair<List, List> instead. Something like:
final Pair<List<Object>, List<String>> pair = evaluators.parallelStream()
    .map(evaluator -> evaluator.evaluate())
    ...?
Such that if evaluatorA returned Pair<[1,2], [a,b]> and evaluatorB returned Pair<[3], [c,d]> then the end result is a Pair<[1,2,3], [a,b,c,d]>.
Thanks for your help.
I ended up implementing a custom collector for the Pair of Lists:
...
.collect(
    // supplier
    () -> Pair.of(new ArrayList<>(), new ArrayList<>()),
    // accumulator
    (accumulatedResult, evaluatorResult) -> {
        accumulatedResult.getLeft().addAll(evaluatorResult.getLeft());
        accumulatedResult.getRight().addAll(evaluatorResult.getRight());
    },
    // combiner
    (a, b) -> {
        a.getLeft().addAll(b.getLeft());
        a.getRight().addAll(b.getRight());
    }
);
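For anyone who wants to try the same idea without a Pair library on the classpath, here is a minimal, self-contained sketch of the three-argument collect(supplier, accumulator, combiner) approach. ListPair, PairCollectDemo and merge are hypothetical names made up for this illustration; ListPair is only a stand-in for whatever Pair type you actually use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PairCollectDemo {
    // Hypothetical stand-in for the Pair type used in the question.
    static final class ListPair {
        final List<Object> left = new ArrayList<>();
        final List<String> right = new ArrayList<>();

        static ListPair of(List<?> left, List<String> right) {
            ListPair p = new ListPair();
            p.left.addAll(left);
            p.right.addAll(right);
            return p;
        }

        // Appends the other pair's lists to this pair's lists.
        void addAll(ListPair other) {
            left.addAll(other.left);
            right.addAll(other.right);
        }
    }

    // Merges all per-evaluator results into one pair of lists.
    static ListPair merge(List<ListPair> results) {
        return results.parallelStream()
                // supplier, accumulator, combiner
                .collect(ListPair::new, ListPair::addAll, ListPair::addAll);
    }

    public static void main(String[] args) {
        ListPair merged = merge(Arrays.asList(
                ListPair.of(Arrays.asList(1, 2), Arrays.asList("a", "b")),
                ListPair.of(Arrays.asList(3), Arrays.asList("c", "d"))));
        System.out.println(merged.left + " " + merged.right); // [1, 2, 3] [a, b, c, d]
    }
}
```

Because the source is an ordered List and this form of collect is a non-concurrent mutable reduction, the merged lists keep encounter order even on a parallel stream. Each worker gets its own container from the supplier, so plain ArrayLists are fine here.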

Project Reactor: cache last item for each subscribed publisher

I have a processor that subscribes to publishers which arrive at arbitrary times. For each new subscriber to the processor, I want to emit the last item from each publisher.
class PublishersState {
    val outputProcessor = DirectProcessor.create<String>()
    fun addNewPublisher(publisher: Flux<String>) {
        publisher.subscribe(outputProcessor)
    }
    fun getAllPublishersState(): Flux<String> = outputProcessor
}
val publisher1 = Mono
    .just("Item 1 publisher1")
    .mergeWith(Flux.never())
val publisher2 = Flux
    .just("Item 1 publisher2", "Item 2 publisher2")
    .mergeWith(Flux.never())
val publishersState = PublishersState()
publishersState.getAllPublishersState().subscribe {
    println("Subscriber1: $it")
}
publishersState.addNewPublisher(publisher1)
publishersState.addNewPublisher(publisher2)
publishersState.getAllPublishersState().subscribe {
    println("Subscriber2: $it")
}
I need to change the code above so it will output the following:
Subscriber1: Item 1 publisher1
Subscriber1: Item 1 publisher2
Subscriber1: Item 2 publisher2
// Subscriber2 subscribes here and receives the last item from each publisher
Subscriber2: Item 1 publisher1
Subscriber2: Item 2 publisher2
Is there a simple way to cache the last item for each publisher?
Use ReplayProcessor instead of DirectProcessor:
val outputProcessor = ReplayProcessor.cacheLast<String>()
I solved my case the following way:
class PublishersState {
    val publishersList = Collections.synchronizedList<Flux<String>>(mutableListOf()) // adding sync list for storing publishers
    val outputProcessor = DirectProcessor.create<String>()
    fun addNewPublisher(publisher: Flux<String>) {
        val cached = publisher.cache(1) // caching the last item for a new publisher
        publishersList.add(cached)
        cached.subscribe(outputProcessor)
    }
    fun getAllPublishersState(): Flux<String> = publishersList
        .toFlux()
        .reduce(outputProcessor as Flux<String>) { acc, flux -> acc.mergeWith(flux.take(1)) } // merging the last item of each publisher with outputProcessor
        .flatMapMany { it }
}

Building a DspComplex ROM in Chisel

I'm attempting to build a ROM-based window function using the DspComplex and FixedPoint types, but I keep running into the following error:
chisel3.core.Binding$ExpectedHardwareException: vec element 'dsptools.numbers.DspComplex#32' must be hardware, not a bare Chisel type
The source code for my attempt at this looks like the following:
class TaylorWindow(len: Int, window: Seq[FixedPoint]) extends Module {
  val io = IO(new Bundle {
    val d_valid_in = Input(Bool())
    val sample = Input(DspComplex(FixedPoint(16.W, 8.BP), FixedPoint(16.W, 8.BP)))
    val windowed_sample = Output(DspComplex(FixedPoint(24.W, 8.BP), FixedPoint(24.W, 8.BP)))
    val d_valid_out = Output(Bool())
  })
  val win_coeff = Vec(window.map(x=>DspComplex(x, FixedPoint(0, 16.W, 8.BP))).toSeq) // ROM storing our coefficients.
  io.d_valid_out := io.d_valid_in
  val counter = Reg(UInt(10.W))
  // Implicit reset
  io.windowed_sample:= io.sample * win_coeff(counter)
  when(io.d_valid_in) {
    counter := counter + 1.U
  }
}
println(getVerilog(new TaylorWindow(1024, fp_seq)))
I'm actually reading the coefficients in from a file (this particular window has a complex generation function that I'm running in Python elsewhere) with the following sequence of steps:
val filename = "../generated/taylor_coeffs"
val coeff_file = Source.fromFile(filename).getLines
val double_coeffs = coeff_file.map(x => x.toDouble)
val fp_coeffs = double_coeffs.map(x => FixedPoint.fromDouble(x, 16.W, 8.BP))
val fp_seq = fp_coeffs.toSeq
Does this mean the DspComplex type can't be translated to Verilog?
Commenting out the win_coeff line makes the whole thing generate (but clearly doesn't do what I want it to do).
I think you should try using
val win_coeff = VecInit(window.map(x=>DspComplex.wire(x, FixedPoint.fromDouble(0.0, 16.W, 8.BP))).toSeq) // ROM storing our coefficients.
which will create hardware values as you want. Vec on its own just creates a Vec of the specified type.

Copy Object Properties to a Map by Value not by Reference

I'm not sure where I'm going wrong, but it seems that I can't copy properties from an object instance into a map without the copied values changing after I save the instance.
This is a sample class:
class Product {
    String productName
    String productDescription
    int quantityOnHand
}
Once the form is submitted and it's sent to my controller, I can access the values and manipulate them from the productInstance.properties map that is available from the instance. I want to copy the properties to another map to preserve the values before committing them during an edit. So let's say we are editing a record and these are the values stored in the db: productName = "My Product", productDescription = "My Product Description" and quantityOnHand = 100.
I want to copy them to:
def propertiesBefore = productInstance.properties
This did not work, because when I save the productInstance, the values in propertiesBefore change to whatever the instance had.
So I tried this:
productInstance.properties.each { k,v -> propertiesBefore[k] = v }
The same thing happened again. I am not sure how to copy by value; no matter what I try, it seems to copy by reference instead.
EDIT
As per the request of Pawel P., this is the code that I tested:
class Product {
    String productName
    String productDescription
    int quantityOnHand
}
def productInstance = new Product(productName: "Some name", productDescription: "Desciption", quantityOnHand: 10)
def propertiesBefore = [:]
productInstance.properties.each { k,v -> propertiesBefore[k] = (v instanceof Cloneable) ? v.clone() : v }
productInstance.productName = "x"
productInstance.productDescription = "y"
productInstance.quantityOnHand = 9
println propertiesBefore.quantityOnHand // this will print the same as the one after the save()
productInstance.save(flush:true)
println propertiesBefore.quantityOnHand // this will print the same as the one above the save()
Without cloning, you can also copy the entries of one map [:] into a new map [:] by "pushing" the first one over with the << operator. The new map is a separate object, so reassigning its keys does not affect the original, which gives the result you wanted (copy by value):
def APE = [:]
APE= [tail: 1, body: "hairy", hungry: "VERY!!!"]
def CAVEMAN = [:]
CAVEMAN << APE //push APE to CAVEMAN's space
//modify APE's values for CAVEMAN
CAVEMAN.tail = 0
CAVEMAN.body = "need clothes"
println "'APE': ${APE}"
println "'CAVEMAN': ${CAVEMAN}"
Output ==>
'APE': [tail:1, body:hairy, hungry:VERY!!!]
'CAVEMAN': [tail:0, body:need clothes, hungry:VERY!!!]
The problem is that you are actually copying references to the values. To obtain a copy of a value, you should use clone(). Take a look:
class Product {
    String productName
    String productDescription
    int quantityOnHand
}
def productInstance = new Product(productName: "Some name", productDescription: "Desciption", quantityOnHand: 10)
def propertiesBefore = [:]
productInstance.properties.each { k,v -> propertiesBefore[k] = (v instanceof Cloneable) ? v.clone() : v }
productInstance.productName = "x"
productInstance.productDescription = "y"
productInstance.quantityOnHand = 9
println productInstance.properties
println propertiesBefore
It prints:
[quantityOnHand:9, class:class Product, productName:x, productDescription:y]
[quantityOnHand:10, class:class Product, productName:Some name, productDescription:Desciption]
A simpler Groovy example using a map literal [:] looks like this:
def APE = [:]
APE= [tail: 1, body: "hairy", hungry: "VERY!!!"]
def CloneMe = APE // CloneMe is a second reference to the same map as APE, not a copy
def CAVEMAN = [:] // copy APE's values over by iterating through CloneMe
CloneMe.each { key,value -> CAVEMAN[key] = (value instanceof Cloneable) ? value.clone() : value }
println "'CloneMe': ${CloneMe}"
//change some of the clone's values for CAVEMAN
CAVEMAN.tail = 0
CAVEMAN.body = "need clothes"
println "'APE': ${APE}"
println "'CAVEMAN': ${CAVEMAN}"
Output ==>
'CloneMe': [tail:1, body:hairy, hungry:VERY!!!]
'APE': [tail:1, body:hairy, hungry:VERY!!!]
'CAVEMAN': [tail:0, body:need clothes, hungry:VERY!!!]
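The difference between a second reference to a map and a real copy of it is the same in any JVM language. The sketch below shows it in plain Java, with made-up names (MapCopyDemo, snapshot) that are not part of the question. It is only an illustration of alias versus shallow copy, not a statement about how Grails builds the properties map.

```java
import java.util.HashMap;
import java.util.Map;

public class MapCopyDemo {
    // Returns a shallow copy: a new map holding the same keys and value references.
    static Map<String, Object> snapshot(Map<String, Object> source) {
        return new HashMap<>(source);
    }

    public static void main(String[] args) {
        Map<String, Object> ape = new HashMap<>();
        ape.put("tail", 1);
        ape.put("body", "hairy");

        Map<String, Object> alias = ape;             // same map under a second name
        Map<String, Object> caveman = snapshot(ape); // separate map with copied entries

        caveman.put("tail", 0);           // changes only the copy
        alias.put("body", "very hairy");  // changes the original as well

        System.out.println(ape.get("tail"));     // 1
        System.out.println(ape.get("body"));     // very hairy
        System.out.println(caveman.get("body")); // hairy
    }
}
```

Note that the copy is shallow. That is enough for immutable values such as String or boxed numbers, but a mutable value (a list or a Date, say) would still be shared between both maps unless it is cloned as well.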

How to convert a DataSet to FeatureDataSet

I am trying to get the geometry data from a dataset to a featuredataset:
private void QueryCustomer(DataSet ds)
{
    SharpMap.Data.FeatureDataSet ds_feature = new SharpMap.Data.FeatureDataSet();
    ds_feature = (SharpMap.Data.FeatureDataSet)ds; // ERROR HERE
    ..
I am getting:
Unable to cast object of type 'System.Data.DataSet' to type 'SharpMap.Data.FeatureDataSet'
Any help would be appreciated. Thanks.
There is no need to create a DataSet. Just get your table directly from SQLite using the FeatureDataSet:
double x, y;
FeatureDataSet fds = new FeatureDataSet();
Envelope env = new Envelope(double.MinValue, double.MaxValue, double.MinValue, double.MaxValue);
SharpMap.Data.Providers.ManagedSpatiaLite p = new ManagedSpatiaLite(ConnectionString, Table, GeometryColumn.ToUpper(), KeyColumn.ToUpper());
p.Open();
p.ExecuteIntersectionQuery(env, fds);
foreach (FeatureDataRow fdr in ((FeatureDataTable)fds.Tables[0]).Rows)
{
    x = fdr.Geometry.Centroid.X;
    y = fdr.Geometry.Centroid.Y;
    //...process x and y here...
}
p.Close();
p.Dispose();
