Can I include blank lines into the file when using YamlDotNet - parsing

Is there a way for me to include empty lines into the parsed YAML file using YamlDotNet? What I have currently when I parse a file like this:
node1: "1.0"
node2: "some text"
node3: "string"
I end up with this as the result:
node1: "1.0"
node2: "some text"
node3: "string"
Is there a way for me to configure the parser to not ignore blank lines?
Briefly, I'm using the YamlDotNet Parser class like so:
var input = File.OpenText(file);
var parser = new Parser(_input);
public bool Read()
{
Value = null;
Path = null;
var hasMore = _parser.MoveNext();
if (!hasMore)
{
return false;
}
parser.Current.Accept(this);
LineNumber = _parser.Current.Start.Line;
return true;
}
And in a separate class:
while (reader.Read())
{
}
EDIT:
This doesn't only happen with blank lines, it also happens when I have a line break after a dash:
Before:
-
name: Mark McGwire
hr: 65
avg: 0.278
After:
- name: Mark McGwire
hr: 65
avg: 0.278

Related

how to convert protobuf readable request to proto message

I am playing with google api for educational purpose to learn protobuf connection. But i am getting 400 Bad Request when i sending post request. I hope so my protobuf message have problem and don't understand how to i write 0: 0 message in proto file. I am using PHP Protobuf library to serialization data.
This is readable/decoded grpc post request data:
0: 0
0: 0
}
1: 0x80c2:0x3a2f
2: this is test
3: 5
7 {
9: nG86YfStD4Of9QPB_L_QDg:28441478
15: 20089
}
8 {
1 {
1: 5
}
}
9 {
1: 0
2: 1
}
11: 1
12: 1
14: 0
16 {
I created this proto messages by following above grpc request data:
syntax = "proto3";
enum firstZero {
COUNT = 0;
}
message submit {
string pid = 1;
string rtext = 2;
int32 str = 3;
.unknown_auto_data unknown = 7;
.unknown_auto_data_eight unknown_one = 8;
.unknown_auto_data_nine unknown_two = 9;
int32 unknown_int = 11;
int32 unknown_int_one = 12;
int32 unknown_int_two = 14;
string unknown_string = 16;
}
message unknown_auto_data {
string auto_data = 9;
int32 auto_int = 15;
}
message unknown_auto_data_eight {
.unknown_repeated_one unknown_re_one = 1;
}
message unknown_repeated_one {
int32 unknown_rep = 1;
}
message unknown_auto_data_nine {
int32 unknown_it = 1;
int32 unknown_itt = 2;
}
Is my proto messages are correct? or How will i write it?
Also my request model data in php:
class RequestModel {
public $jsonData = [
"pid" => "0x80c2:0x3a2f",
"rtext" => "Awesome, This is test.",
"str" => "5"
];
public function getData() {
return json_encode($this->jsonData);
}
}
Is there (online/software/method) any tools to decode grpc/protobuf request/response to readable data like above json like data?
\x01\x00\x00\x05\xc0\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xcdVM\x88\x1be\x18\xced\xdb\xed:E\xba.......
To convert binary protobuf data to a readable form, use protoc --decode_raw. The output will not have any field names, as those are not part of the wire format. Here is an example from https://github.com/pawitp/protobuf-decoder, which also provides an online-service to do the same thing.
echo -en "\x0a\x2f\x0a\x08\x4a\x6f\x68\x6e\x20\x44\x6f\
\x65\x10\x01\x1a\x10\x6a\x6f\x68\x6e\x40\x65\x78\x61\x6d\
\x70\x6c\x65\x2e\x63\x6f\x6d\x22\x0f\x0a\x0b\x31\x31\x31\
\x2d\x32\x32\x32\x2d\x33\x33\x33\x10\x01\x0a\x1e\x0a\x08\
\x4a\x61\x6e\x65\x20\x44\x6f\x65\x10\x02\x1a\x10\x6a\x61\
\x6e\x65\x40\x65\x78\x61\x6d\x70\x6c\x65\x2e\x63\x6f\x6d"\
| protoc --decode_raw
1 {
1: "John Doe"
2: 1
3: "john#example.com"
4 {
1: "111-222-333"
2: 1
}
}
1 {
1: "Jane Doe"
2: 2
3: "jane#example.com"
}
If you have the .proto file for your data, use protoc --decode instead.
echo -en "\x0a\x2f\x0a\x08\x4a\x6f\x68\x6e\x20\x44\x6f\
\x65\x10\x01\x1a\x10\x6a\x6f\x68\x6e\x40\x65\x78\x61\x6d\
\x70\x6c\x65\x2e\x63\x6f\x6d\x22\x0f\x0a\x0b\x31\x31\x31\
\x2d\x32\x32\x32\x2d\x33\x33\x33\x10\x01\x0a\x1e\x0a\x08\
\x4a\x61\x6e\x65\x20\x44\x6f\x65\x10\x02\x1a\x10\x6a\x61\
\x6e\x65\x40\x65\x78\x61\x6d\x70\x6c\x65\x2e\x63\x6f\x6d"\
| protoc --decode=AddressBook addresses.proto
addresses {
name: "John Doe"
id: 1
email: "john#example.com"
numbers {
number: "111-222-333"
id: 1
}
}
addresses {
name: "Jane Doe"
id: 2
email: "jane#example.com"
}
Where addresses.proto looks like this:
syntax = "proto3";
message AddressBook {
repeated Address addresses = 1;
}
message Address {
string name = 1;
int32 id = 2;
string email = 3;
repeated Number numbers = 4;
}
message Number {
string number = 1;
int32 id = 2;
}
You can also use protoc to encode from the textual representation to the binary variant:
echo 'addresses { name: "Mike", email: "mike#mike.us" }' | protoc --encode=AddressBook addresses.proto > addressbook.bin
cat addressbook.bin | protoc --decode=AddressBook addresses.proto
addresses {
name: "Mike"
email: "mike#mike.us"
}

Groovy multiline string interpolation whitespace

I am trying to generate some generic Groovy code for Jenkins but I seem to have trouble with multi line strings and extra white space. I've tried everything I could find by Googling but I can't seem to get it working.
My issue isn't related to simple multi line strings. I managed to trim white space by using the stripIndent() and stripMargin() methods for simple cases. My issue is caused by having interpolated methods inside my strings.
Groovy info: Groovy Version: 3.0.2 JVM: 13.0.2 Vendor: Oracle Corporation OS: Mac OS X
String method2(String tier, String jobName) {
return """
Map downstreamJobs = [:]
stage ("${jobName}-${tier}-\${region}_${jobName}") {
test
}
""".stripIndent().stripMargin()
}
static String simpleLog() {
return """
script {
def user = env.BUILD_USER_ID
}
""".stripIndent().stripMargin()
}
static String method1() {
return """\
import jenkins.model.Jenkins
currentBuild.displayName = "name"
${simpleLog()}
""".stripIndent().stripMargin()
}
String generateFullDeploymentPipelineCode() {
return """Text here
${method1()}
${method2("test1", "test2")}
""".stripIndent().stripMargin()
}
println(generateFullDeploymentPipelineCode())
This is what it prints(or writes to disk):
Text here
import jenkins.model.Jenkins
currentBuild.displayName = "name"
script {
def user = env.BUILD_USER_ID
}
Map downstreamJobs = [:]
stage ("test2-test1-${region}_test2") {
test
}
Why the extra space around the import lines? I know the indentation method is supposed to trim all white space according to the least number of leading spaces, so that's why we use backslash (example here https://stackoverflow.com/a/19882917/7569335).
That works for simple strings, but it breaks down once use start using interpolation. Not with regular variables, just when you interpolate an entire method.
as variant - use just stripMargin() and only once on a final string
String method2(String tier, String jobName) {
return """\
|Map downstreamJobs = [:]
|stage ("${jobName}-${tier}-\${region}_${jobName}") {
| test
|}
"""
}
static String simpleLog() {
return """\
|script {
| def user = env.BUILD_USER_ID
|}
"""
}
static String method1() {
return """\
|import jenkins.model.Jenkins
|currentBuild.displayName = "name"
${simpleLog()}
"""
}
String generateFullDeploymentPipelineCode() {
return """\
|Text here
${method1()}
${method2("test1", "test2")}
""".stripIndent().stripMargin()
}
println(generateFullDeploymentPipelineCode())
result:
Text here
import jenkins.model.Jenkins
currentBuild.displayName = "name"
script {
def user = env.BUILD_USER_ID
}
Map downstreamJobs = [:]
stage ("test2-test1-${region}_test2") {
test
}
another variant with trim() and stripIndent()
def method2(String tier, String jobName) {
return """
Map downstreamJobs = [:]
stage ("${jobName}-${tier}-\${region}_${jobName}") {
test
}
""".trim()
}
def simpleLog() {
return """
script {
def user = env.BUILD_USER_ID
}
""".trim()
}
def method1() {
return """
import jenkins.model.Jenkins
currentBuild.displayName = "name"
${simpleLog()}
""".trim()
}
def generateFullDeploymentPipelineCode() {
return """\
Text here
${method1()}
${method2("test1", "test2")}
""".stripIndent()
}
println(generateFullDeploymentPipelineCode())
When you insert a string through interpolation you only indent the first line of it. The following lines of the inserted string will be indented differently, which messes everything up.
Using some lesser-known members of GString (namely .strings[] and .values[]), we can align the indentation of all lines of each interpolated value.
String method2(String tier, String jobName) {
indented """
Map downstreamJobs = [:]
stage ("${jobName}-${tier}-\${region}_${jobName}") {
test
}
"""
}
String simpleLog() {
indented """\
script {
def user = env.BUILD_USER_ID
}
"""
}
String method1() {
indented """\
import jenkins.model.Jenkins
currentBuild.displayName = "name"
${simpleLog()}
"""
}
String generateFullDeploymentPipelineCode() {
indented """\
Text here
${method1()}
${method2("test1", "test2")}
"""
}
println generateFullDeploymentPipelineCode()
//---------- Move the following code into its own script ----------
// Function to adjust the indentation of interpolated values so that all lines
// of a value match the indentation of the first line.
// Finally stripIndent() will be called before returning the string.
String indented( GString templ ) {
// Iterate over the interpolated values of the GString template.
templ.values.eachWithIndex{ value, i ->
// Get the string preceding the current value. Always defined, even
// when the value is at the beginning of the template.
def beforeValue = templ.strings[ i ]
// RegEx to match any indent substring before the value.
// Special case for the first string, which doesn't necessarily contain '\n'.
def regexIndent = i == 0
? /(?:^|\n)([ \t]+)$/
: /\n([ \t]+)$/
def matchIndent = ( beforeValue =~ regexIndent )
if( matchIndent ) {
def indent = matchIndent[ 0 ][ 1 ]
def lines = value.readLines()
def linesNew = [ lines.head() ] // The 1st line is already indented.
// Insert the indentation from the 1st line into all subsequent lines.
linesNew += lines.tail().collect{ indent + it }
// Finally replace the value with the reformatted lines.
templ.values[ i ] = linesNew.join('\n')
}
}
return templ.stripIndent()
}
// Fallback in case the input string is not a GString (when it doesn't contain expressions)
String indented( String templ ) {
return templ.stripIndent()
}
Live Demo at codingground
Output:
Text here
import jenkins.model.Jenkins
currentBuild.displayName = "name"
script {
def user = env.BUILD_USER_ID
}
Map downstreamJobs = [:]
stage ("test2-test1-${region}_test2") {
test
}
Conclusion:
Using the indented function, a clean Groovy syntax for generating code from GString templates has been achieved.
This was quite a learning experience. I first tried to do it completely different using the evaluate function, which turned out to be too complicated and not so flexible. Then I randomly browsed through some posts from mrhaki blog (always a good read!) until I discovered "Groovy Goodness: Get to Know More About a GString". This was the key to implementing this solution.

How to write a map to a YAML file in Dart

I have a map of key value pairs in Dart. I want to convert it to YAML and write into a file.
I tried using YAML package from dart library but it only provides methods to load YAML data from a file. Nothing is mentioned on how to write it back to the YAML file.
Here is an example:
void main() {
var map = {
"name": "abc",
"type": "unknown",
"internal":{
"name": "xyz"
}
};
print(map);
}
Expected output:
example.yaml
name: abc
type: unknown
internal:
name: xyz
How to convert the dart map to YAML and write it to a file?
It's a bit late of a response but for anyone else looking at this question I have written this class. It may not be perfect but it works for what I'm doing and I haven't found anything wrong with it yet. Might make it a package eventually after writing tests.
class YamlWriter {
/// The amount of spaces for each level.
final int spaces;
/// Initialize the writer with the amount of [spaces] per level.
YamlWriter({
this.spaces = 2,
});
/// Write a dart structure to a YAML string. [yaml] should be a [Map] or [List].
String write(dynamic yaml) {
return _writeInternal(yaml).trim();
}
/// Write a dart structure to a YAML string. [yaml] should be a [Map] or [List].
String _writeInternal(dynamic yaml, { int indent = 0 }) {
String str = '';
if (yaml is List) {
str += _writeList(yaml, indent: indent);
} else if (yaml is Map) {
str += _writeMap(yaml, indent: indent);
} else if (yaml is String) {
str += "\"${yaml.replaceAll("\"", "\\\"")}\"";
} else {
str += yaml.toString();
}
return str;
}
/// Write a list to a YAML string.
/// Pass the list in as [yaml] and indent it to the [indent] level.
String _writeList(List yaml, { int indent = 0 }) {
String str = '\n';
for (var item in yaml) {
str += "${_indent(indent)}- ${_writeInternal(item, indent: indent + 1)}\n";
}
return str;
}
/// Write a map to a YAML string.
/// Pass the map in as [yaml] and indent it to the [indent] level.
String _writeMap(Map yaml, { int indent = 0 }) {
String str = '\n';
for (var key in yaml.keys) {
var value = yaml[key];
str += "${_indent(indent)}${key.toString()}: ${_writeInternal(value, indent: indent + 1)}\n";
}
return str;
}
/// Create an indented string for the level with the spaces config.
/// [indent] is the level of indent whereas [spaces] is the
/// amount of spaces that the string should be indented by.
String _indent(int indent) {
return ''.padLeft(indent * spaces, ' ');
}
}
Usage:
final writer = YamlWriter();
String yaml = writer.write({
'string': 'Foo',
'int': 1,
'double': 3.14,
'boolean': true,
'list': [
'Item One',
'Item Two',
true,
'Item Four',
],
'map': {
'foo': 'bar',
'list': ['Foo', 'Bar'],
},
});
File file = File('/path/to/file.yaml');
file.createSync();
file.writeAsStringSync(yaml);
Output:
string: "Foo"
int: 1
double: 3.14
boolean: true
list:
- "Item One"
- "Item Two"
- true
- "Item Four"
map:
foo: "bar"
list:
- "Foo"
- "Bar"
package:yaml does not have YAML writing features. You may have to look for another package that does that – or write your own.
As as stopgap, remember JSON is valid YAML, so you can always write out JSON to a .yaml file and it should work with any YAML parser.
I ran into the same issue and ended up hacking together a simple writer:
// Save the updated configuration settings to the config file
void saveConfig() {
var file = _configFile;
// truncate existing configuration
file.writeAsStringSync('');
// Write out new YAML document from JSON map
final config = configToJson();
config.forEach((key, value) {
if (value is Map) {
file.writeAsStringSync('\n$key:\n', mode: FileMode.writeOnlyAppend);
value.forEach((subkey, subvalue) {
file.writeAsStringSync(' $subkey: $subvalue\n',
mode: FileMode.writeOnlyAppend);
});
} else {
file.writeAsStringSync('$key: $value\n',
mode: FileMode.writeOnlyAppend);
}
});
}

protobuf text format parsing maps

This answer clearly shows some examples of proto text parsing, but does not have an example for maps.
If a proto has:
map<int32, string> aToB
I would guess something like:
aToB {
123: "foo"
}
but it does not work. Does anyone know the exact syntax?
I initially tried extrapolating from an earlier answer, which led me astray, because I incorrectly thought multiple k/v pairs would look like this:
aToB { # (this example has a bug)
key: 123
value: "foo"
key: 876 # WRONG!
value: "bar" # NOPE!
}
That led to the following error:
libprotobuf ERROR: Non-repeated field "key" is specified multiple times.
Proper syntax for multiple key-value pairs:
(Note: I am using the "proto3" version of the protocol buffers language)
aToB {
key: 123
value: "foo"
}
aToB {
key: 876
value: "bar"
}
The pattern of repeating the name of the map variable makes more sense after re-reading this relevant portion of the proto3 Map documentation, which explains that maps are equivalent to defining your own "pair" message type and then marking it as "repeated".
A more complete example:
proto definition:
syntax = "proto3";
package myproject.testing;
message UserRecord {
string handle = 10;
bool paid_membership = 20;
}
message UserCollection {
string description = 20;
// HERE IS THE PROTOBUF MAP-TYPE FIELD:
map<string, UserRecord> users = 10;
}
message TestData {
UserCollection user_collection = 10;
}
text format ("pbtxt") in a config file:
user_collection {
description = "my default users"
users {
key: "user_1234"
value {
handle: "winniepoo"
paid_membership: true
}
}
users {
key: "user_9b27"
value {
handle: "smokeybear"
}
}
}
C++ that would generate the message content programmatically
myproject::testing::UserRecord user_1;
user_1.set_handle("winniepoo");
user_1.set_paid_membership(true);
myproject::testing::UserRecord user_2;
user_2.set_handle("smokeybear");
user_2.set_paid_membership(false);
using pair_type =
google::protobuf::MapPair<std::string, myproject::testing::UserRecord>;
myproject::testing::TestData data;
data.mutable_user_collection()->mutable_users()->insert(
pair_type(std::string("user_1234"), user_1));
data.mutable_user_collection()->mutable_users()->insert(
pair_type(std::string("user_9b27"), user_2));
The text format is:
aToB {
key: 123
value: "foo"
}

What grammar is this?

I have to parse a document containing groups of variable-value-pairs which is serialized to a string e.g. like this:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
Here are the different elements:
Group IDs:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
Length of string representation of each group:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
One of the groups:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14 ^VAR1^6^VALUE1^^
Variables:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
Length of string representation of the values:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
The values themselves:
4^26^VAR1^6^VALUE1^VAR2^4^VAL2^^1^14^VAR1^6^VALUE1^^
Variables consist only of alphanumeric characters.
No assumption is made about the values, i.e. they may contain any character, including ^.
Is there a name for this kind of grammar? Is there a parsing library that can handle this mess?
So far I am using my own parser, but due to the fact that I need to detect and handle corrupt serializations the code looks rather messy, thus my question for a parser library that could lift the burden.
The simplest way to approach it is to note that there are two nested levels that work the same way. The pattern is extremely simple:
id^length^content^
At the outer level, this produces a set of groups. Within each group, the content follows exactly the same pattern, only here the id is the variable name, and the content is the variable value.
So you only need to write that logic once and you can use it to parse both levels. Just write a function that breaks a string up into a list of id/content pairs. Call it once to get the groups, and then loop through them calling it again for each content to get the variables in that group.
Breaking it down into these steps, first we need a way to get "tokens" from the string. This function returns an object with three methods, to find out if we're at "end of file", and to grab the next delimited or counted substring:
var tokens = function(str) {
var pos = 0;
return {
eof: function() {
return pos == str.length;
},
delimited: function(d) {
var end = str.indexOf(d, pos);
if (end == -1) {
throw new Error('Expected delimiter');
}
var result = str.substr(pos, end - pos);
pos = end + d.length;
return result;
},
counted: function(c) {
var result = str.substr(pos, c);
pos += c;
return result;
}
};
};
Now we can conveniently write the reusable parse function:
var parse = function(str) {
var parts = {};
var t = tokens(str);
while (!t.eof()) {
var id = t.delimited('^');
var len = t.delimited('^');
var content = t.counted(parseInt(len, 10));
var end = t.counted(1);
if (end !== '^') {
throw new Error('Expected ^ after counted string, instead found: ' + end);
}
parts[id] = content;
}
return parts;
};
It builds an object where the keys are the IDs (or variable names). I'm asuming as they have names that the order isn't significant.
Then we can use that at both levels to create the function to do the whole job:
var parseGroups = function(str) {
var groups = parse(str);
Object.keys(groups).forEach(function(id) {
groups[id] = parse(groups[id]);
});
return groups;
}
For your example, it produces this object:
{
'1': {
VAR1: 'VALUE1'
},
'4': {
VAR1: 'VALUE1',
VAR2: 'VAL2'
}
}
I don't think it's a trivial task to create a grammar for this. But on the other hand, a simple straight forward approach is not that hard. You know the corresponding string length for every critical string. So you just chop your string according to those lengths apart..
where do you see problems?

Resources