How to dynamically assign the correct message type to decode a protocol buffer message? - ruby-on-rails

Hi, I have a data stream pipeline that works over "events". Those events are simple protocol buffer messages, say:
message OrderCoffee {
  int32 id = 1;
}
message CancelOrder {
  int32 id = 1;
}
A client then serializes/encodes those messages and pushes them into a message broker (say Google Pub/Sub). A subscriber consumes one message and tries to decode/deserialize it (pseudocode):
decoded_message = OrderCoffee.decode(encoded_message)
decoded_message = CancelOrder.decode(encoded_message)
Which of those lines works? Both, at least in my Ruby code. I don't know whether I have a conceptual misunderstanding about how to use protocol buffers, or whether this is a Ruby bug.
If that is the expected behaviour, how can I know at runtime which message type I should use to decode the received message?
EDIT:
OK, the solution seems to be https://developers.google.com/protocol-buffers/docs/techniques?csw=1#self-description .
I couldn't understand it, though. Could someone provide an example of how to implement that in Ruby?

Basically, you can't tell from the bytes alone. Protobuf messages are not self-describing. If it were me, I'd add a wrapper:
message SomeType {
  oneof the_thing {
    OrderCoffee order = 1;
    CancelOrder cancel = 2;
  }
}
When you deserialize a SomeType, you can test which inner object is assigned.
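In Ruby, a minimal sketch of that test might look like the following, assuming the wrapper message above has been compiled with protoc --ruby_out and the generated classes are loaded (handle_order and handle_cancel are hypothetical handlers):
decoded = SomeType.decode(encoded_message)

# The generated accessor named after the oneof returns the set field as a symbol.
case decoded.the_thing
when :order
  handle_order(decoded.order)
when :cancel
  handle_cancel(decoded.cancel)
end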


Should I use Exceptions while parsing complex user input

When looking for information on when and why to use exceptions, many people (also on this platform) make the point that you should not use exceptions when validating user input, because invalid input is not an exceptional thing to happen.
I now have a case where I have to parse a complex string of user input and map it to an object tree, basically like a parser does.
Example in pseudocode:
input:
----
hello[5]
+
foo["ok"]
----
results in something like this:
class Hello {
  int id = 5
}
class Add {}
class Foo {
  string name = 'ok'
}
Now, in order to "validate" that input I have to parse it; having one piece of code that parses the input for validation and another that creates the objects feels redundant.
Currently I'm using exceptions while parsing single tokens to collect all errors.
// one token is basically a single line
foreach (token in tokens) {
  try {
    factory = getFactory(token)       // throws ParseException
    addObject(factory.create(token))  // throws ParseException
  } catch (ParseException e) {
    // e.g. "Foo Token expects value to be string"
    addError(e)  // collect the error and continue with the next token
  }
}
Is this a bad use of exceptions?
An alternative would be to inject a validation class into every factory, or to mess around with return types (which feels a bit dirty).
If exceptions work for your use case, go for it.
The usual problem with exceptions is that they don't let you fix things up and continue, which makes it hard to implement parser error recovery. You can't really fix up a bad input, and you probably shouldn't even in cases where you could, but error recovery lets you report more than one error from the same input, which is often considered convenient.
All of that depends on your needs and parsing strategy, so there's not a lot of information to go on here.
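For illustration, here is a hedged sketch of the exception-free alternative hinted at in the question: the parse step returns a result object instead of throwing, so one bad token doesn't stop the pass and every error from the same input gets reported. ParseResult, parse, and parseAll are hypothetical names, not from the question:
import java.util.ArrayList;
import java.util.List;

// ParseResult is a hypothetical stand-in for whatever the real parser returns.
record ParseResult(Object value, String error) {
  boolean ok() { return error == null; }
}

List<Object> parseAll(List<String> tokens, List<String> errors) {
  List<Object> objects = new ArrayList<>();
  for (String token : tokens) {
    ParseResult r = parse(token); // hypothetical: reports errors via the result, never throws
    if (r.ok()) {
      objects.add(r.value());
    } else {
      errors.add(r.error()); // recover and keep parsing the remaining tokens
    }
  }
  return objects;
}
Whether this buys you anything over a per-token try/catch is mostly a matter of taste and of how much recovery logic you need.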

"Guid should contain 32 digits" serilog error with sql server sink

I am getting this error occasionally with the MSSqlServer sink, and I can't see what's wrong with this guid. Any ideas? I've verified, in every place I can find, that the data type of the source guid is Guid, not a string. I'm just a bit mystified.
Guid should contain 32 digits with 4 dashes (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).Couldn't store <"7526f485-ec2d-4ec8-bd73-12a7d1c49a5d"> in UserId Column. Expected type is Guid.
The guid in this example is:
7526f485-ec2d-4ec8-bd73-12a7d1c49a5d
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
That seems to match the template to me?
Further details:
This is an occasional issue, but when it arises, it arises a lot. It seems to be tied to specific guids: most are fine, but a small subset trigger this error. Our app logs thousands of messages a day, but the affected messages are never written (because of the issue), so it is difficult to track down exactly where the offending logs come from. However, we use a centralized logging method that runs something like the test below. The test passes for me, but it mirrors the setup and code we use for logging generally, which normally succeeds. As I said, this is an intermittent issue:
[Fact]
public void Foobar()
{
    // arrange
    var columnOptions = new ColumnOptions
    {
        AdditionalColumns = new Collection<SqlColumn>
        {
            new SqlColumn { DataType = SqlDbType.UniqueIdentifier, ColumnName = "UserId" },
        },
    };
    columnOptions.Store.Remove(StandardColumn.MessageTemplate);
    columnOptions.Store.Remove(StandardColumn.Properties);
    columnOptions.Store.Remove(StandardColumn.LogEvent);
    columnOptions.Properties.ExcludeAdditionalProperties = true;

    var badGuid = new Guid("7526f485-ec2d-4ec8-bd73-12a7d1c49a5d");
    var connectionString = "Server=(localdb)\\MSSQLLocalDB;Database=SomeDb;Trusted_Connection=True;MultipleActiveResultSets=true";
    var logConfiguration = new LoggerConfiguration()
        .MinimumLevel.Information()
        .Enrich.FromLogContext()
        .WriteTo.MSSqlServer(connectionString, "Logs",
            restrictedToMinimumLevel: LogEventLevel.Information, autoCreateSqlTable: false,
            columnOptions: columnOptions)
        .WriteTo.Console(restrictedToMinimumLevel: LogEventLevel.Information);
    Log.Logger = logConfiguration.CreateLogger();

    // Suspect the issue is with this line
    LogContext.PushProperty("UserId", badGuid);
    // Best practice would be to do something like this:
    // using (LogContext.PushProperty("UserId", badGuid))
    // {
    Log.Logger.Information(new FormatException("Foobar"), "This is a test");
    // }

    Log.CloseAndFlush();
}
One thing I have noticed since constructing this test code is that the PushProperty for the UserId property is not captured and disposed. Since behaviour is "undefined" in that case, I am inclined to fix it anyway and see if the problem goes away.
Full stack trace:
2020-04-20T08:38:17.5145399Z Exception while emitting periodic batch from Serilog.Sinks.MSSqlServer.MSSqlServerSink: System.ArgumentException: Guid should contain 32 digits with 4 dashes (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).Couldn't store <"7526f485-ec2d-4ec8-bd73-12a7d1c49a5d"> in UserId Column. Expected type is Guid.
---> System.FormatException: Guid should contain 32 digits with 4 dashes (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx).
at System.Guid.GuidResult.SetFailure(Boolean overflow, String failureMessageID)
at System.Guid.TryParseExactD(ReadOnlySpan`1 guidString, GuidResult& result)
at System.Guid.TryParseGuid(ReadOnlySpan`1 guidString, GuidResult& result)
at System.Guid..ctor(String g)
at System.Data.Common.ObjectStorage.Set(Int32 recordNo, Object value)
at System.Data.DataColumn.set_Item(Int32 record, Object value)
--- End of inner exception stack trace ---
at System.Data.DataColumn.set_Item(Int32 record, Object value)
at System.Data.DataRow.set_Item(DataColumn column, Object value)
at Serilog.Sinks.MSSqlServer.MSSqlServerSink.FillDataTable(IEnumerable`1 events)
at Serilog.Sinks.MSSqlServer.MSSqlServerSink.EmitBatchAsync(IEnumerable`1 events)
at Serilog.Sinks.PeriodicBatching.PeriodicBatchingSink.OnTick()
RESOLUTION
This issue was caused by someone creating a log message with a placeholder that had the same name as our custom data column, while passing in a string version of a guid instead of a value typed as a Guid.
Very simple example:
var badGuid = "7526f485-ec2d-4ec8-bd73-12a7d1c49a5d";
var badGuidConverted = Guid.Parse(badGuid); // just proving the guid is actually valid.
var goodGuid = Guid.NewGuid();
using (LogContext.PushProperty("UserId",goodGuid))
{
Log.Logger.Information("This is a problem with my other user {userid} that will crash serilog. This message will never end up in the database.", badGuid);
}
The quick fix is to edit the message template to change the placeholder from {userid} to something else.
Since our code is centralized around the place where the PushProperty occurs, I put some checks in there to monitor for this and to throw a more useful error message the next time someone does it.
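A minimal sketch of that kind of guard, assuming a centralized helper; PushUserId is a hypothetical name, not our actual method:
using System;
using Serilog.Context;

public static class LoggingGuards
{
    // Hypothetical centralized wrapper around LogContext.PushProperty.
    public static IDisposable PushUserId(object userId)
    {
        // Fail fast with a clearer message if someone passes a string where
        // the sink's UniqueIdentifier column expects a System.Guid.
        if (userId is not Guid)
            throw new ArgumentException(
                $"UserId must be a System.Guid, got {userId?.GetType().Name ?? "null"}.");
        return LogContext.PushProperty("UserId", userId);
    }
}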
I don't see anything obvious in the specific code above that would cause the issue. The fact that you call PushProperty before setting up Serilog is something I would change (i.e. set up Serilog first, then call PushProperty), but that doesn't seem to be the root cause of the issue you're having.
My guess is that you have some code paths that log the UserId as a string instead of a Guid. Serilog is expecting a Guid value type, so if you give it a string representation of a Guid it won't work and will give you that type of exception.
Maybe somewhere in the codebase you're calling .ToString() on the UserId before logging? Or perhaps using string interpolation, e.g. Log.Information("User is {UserId}", $"{UserId}");?
For example:
var badGuid = "7526f485-ec2d- 4ec8-bd73-12a7d1c49a5d";
LogContext.PushProperty("UserId", badGuid);
Log.Information(new FormatException("Foobar"), "This is a test");
Or even just logging a message with the UserId property directly:
var badGuid = "7526f485-ec2d-4ec8-bd73-12a7d1c49a5d";
Log.Information("The {UserId} is doing work", badGuid);
Both snippets above would throw the same exception you're having, because they use string values rather than real Guid values.

JNA: invalid memory access with callback function parameter (struct)

To lone travelers stumbling upon this: see comments for the answer.
...
I'm writing a Java wrapper for a native library. A device generates data samples and stores them as structs. There are two native ways of accessing them: either you request one with getSample(&sampleStruct), or you set a callback function. Now, here is what does work:
The polling method does fill the JNA Structure
The callback function is called after being set
In fact, I am currently getting the sample right from the callback function
The problem: trying to do anything with the callback argument, which should be a struct, causes an "invalid memory access". Declaring the argument as the Structure does this, so I declared it as a Pointer. Trying Pointer.getInt(0) also causes an invalid memory access. So then I declared the argument as an int, and an int is delivered; in fact, it looks very much like the first field of the struct I am trying to get. Does that mean the struct was at that address but disappeared before Java had time to access it?
This is what I am doing now:
public class SampleCallback implements Callback {
    SampleStruct sample;

    public int callback(Pointer refToSample) throws IOException {
        lib.INSTANCE.GetSample(sample); // works, no problem
        adapter.handleSample(sample);
        return 1;
    }
    ...
But neither of these does:
public int callback(SampleStruct sample) throws IOException {
    adapter.handleSample(sample);
    return 1;
}
...
public int callback(Pointer refToSample) throws IOException {
    SampleStruct sample = new SampleStruct();
    sample.timestamp = refToSample.getInt(0);
    ...
    adapter.handleSample(sample);
    return 1;
}
Also, this does in fact deliver the timestamp,
public int callback(int timestamp) throws IOException {
    System.out.println("It is " + timestamp + " o'clock");
    return 1;
}
but I would really prefer the whole struct.
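For later readers: a common JNA pattern for receiving a pointer-to-struct in a callback is to type the parameter as the Structure subclass itself, tagged as ByReference, so JNA maps the native memory into Java fields before invoking the callback. A minimal sketch, with the struct layout guessed from the question (the timestamp field is an assumption):
import com.sun.jna.Callback;
import com.sun.jna.Structure;
import java.util.Arrays;
import java.util.List;

public class SampleStruct extends Structure {
    public static class ByReference extends SampleStruct
            implements Structure.ByReference {}

    public int timestamp; // assumed first field, per the int experiment above

    @Override
    protected List<String> getFieldOrder() {
        return Arrays.asList("timestamp");
    }

    // The parameter is typed as ByReference, so JNA reads the struct
    // fields from the native pointer before invoking the callback.
    public interface SampleCallback extends Callback {
        int callback(SampleStruct.ByReference sample);
    }
}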
This is clearly not going to be a popular topic and I do have a working solution, so the description is not exhaustive. I will copy in anything else that might be helpful if requested. Gratitude prematurely extended.

Receive messages only from a specific DDS topic instance?

I'm using OpenDDS v3.6, and trying to send a message to a specific DDS peer, one of many. In the IDL, the message structure looks like the following:
module Test
{
  #pragma DCPS_DATA_TYPE "Test::MyMessage"
  #pragma DCPS_DATA_KEY "Test::MyMessage dest_id"

  struct MyMessage {
    short dest_id;
    string txt;
  };
};
My understanding is that because the data key is unique, this writes to a new instance of the topic, and any further messages written with the same data key go to this specific instance of the topic. My send code is as follows:
DDS::ReturnCode_t ret;
Test::MyMessage msg;
// populate msg
msg.dest_id = n;
DDS::InstanceHandle_t handle;
handle = msg_writer->register_instance(msg);
ret = msg_writer->write(msg, handle);
So now I need to figure out how to get the receiving peer to read only from this topic instance, and not receive all the other messages being sent to other peers. I started with the following, but I'm not sure how to properly select a specific topic instance.
DDS::InstanceHandle_t instance;
status = msg_dr->take_next_instance(spec, si, 1, DDS::ANY_SAMPLE_STATE,
                                    DDS::ANY_VIEW_STATE, DDS::ANY_INSTANCE_STATE);
Any help much appreciated.
The easiest way to achieve what you are looking for is by using a ContentFilteredTopic. This class is a specialization of the TopicDescription class and allows you to specify an expression (like a SQL WHERE clause) selecting the samples that you are interested in.
Suppose you want your DataReader to only receive samples with dest_id equal to 42, then the corresponding code for creating the ContentFilteredTopic would look something like
DDS::ContentFilteredTopic_var cft =
    participant->create_contentfilteredtopic("MyTopic-Filtered",
                                             topic,
                                             "dest_id = 42",
                                             StringSeq());
From there on, you create your DataReader using cft as the parameter for the TopicDescription, as sketched below. The resulting reader will look like a regular DataReader, except that it only receives the desired samples and nothing else. Since the field dest_id happens to be the field that identifies the instance, the end result is that you will only have one instance in your DataReader.
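A minimal sketch of that step, assuming an already-created subscriber and OpenDDS's default QoS and status-mask constants:
// create_datareader accepts any TopicDescription, including the
// ContentFilteredTopic created above.
DDS::DataReader_var reader =
    subscriber->create_datareader(cft,
                                  DATAREADER_QOS_DEFAULT,
                                  DDS::DataReaderListener::_nil(),
                                  OpenDDS::DCPS::DEFAULT_STATUS_MASK);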
You can check out the DDS specification (section 7.1.2.3.3) or OpenDDS Developer's Guide (section 5.2) for more details.

Applying two transforms to a message on the send port

I have an urgent need to send a canonical message (M1) out of an orchestration, and I need to map the canonical message to another message (M2). The resulting message (M2) has to be wrapped in another request message (M3) before sending it to a web service.
I can't perform the initial transform in the orchestration, as I can only deal with the canonical schema internally.
What's the best way to achieve this two-stage transform outside of the orchestration?
Thanks in advance!
You could make a pipeline component that applies each map sequentially. Then configure the port to use a pipeline with this component.
private Stream ApplyMap(Stream originalStream, Type mapType)
{
    var transform = TransformMetaData.For(mapType).Transform;
    var argList = TransformMetaData.For(mapType).ArgumentList;

    XmlReader input = XmlReader.Create(originalStream);
    Stream outputStream = new VirtualStream();
    using (var outputWriter = XmlWriter.Create(outputStream))
    {
        transform.Transform(new XPathDocument(input), argList, outputWriter, null);
    }
    outputStream.Flush();
    outputStream.Position = 0;

    // Return the rewound stream itself; the declared return type is Stream,
    // so wrapping it in an XmlReader here would not compile.
    return outputStream;
}
Then in the pipeline component's Execute method:
Type mapType1 = Type.GetType("YourMapNamespace.Map1, YourAssemblyName,...");
Type mapType2 = Type.GetType("YourMapNamespace.Map2, YourAssemblyName,...");

Stream originalStream = inmsg.BodyPart.GetOriginalDataStream();
Stream mappedStream =
    ApplyMap(
        ApplyMap(originalStream, mapType1),
        mapType2);

inmsg.BodyPart.Data = mappedStream;
context.ResourceTracker.AddResource(mappedStream);
Note that this example does everything in memory, so it could be a problem for large messages. I'll try to find a better example that uses streaming (or, worst case, you can use VirtualStream to avoid keeping everything in memory).
If you can use the ESB Toolkit, the ideal approach would be to use an itinerary (Richard Seroter has a good article on that approach here). If that's not an option, here's an approach I've used in the past:
http://blogs.msdn.com/b/chrisromp/archive/2008/08/06/stacking-maps-in-biztalk-server.aspx
