How to flatten message with mutable structure into protobuf? - parsing

My protobuf format is this:
package main;
message Test {
optional string id = 1;
optional string name = 2;
optional string age = 3;
}
Then I populate the protobuf fields from the input in Go using the following code. str is already parsed.
test = &Test{
Id: proto.String(str[0]),
Name: proto.String(str[1]),
Age: proto.String(str[2]),
}
One condition I have to handle is that one or more optional fields in the Test structure could be absent randomly and I do not know that in advance. How do I handle that in golang?
To give more context, the real data can look like these in the file:
id=1, name=peter, age=24
id=2, age=30
name=mary, age=31
id=100
name=bob
age=11

You could use a regular expression to turn your input strings into valid JSON, then use the encoding/json package to parse it. This has the advantage of letting the JSON parser take care of everything for you. Here is the regex for your particular case.
Basically, the regex looks for field=value and replaces with "field" : "value" and wraps it in {} to create valid JSON. The commas are left as is.
https://play.golang.org/p/_EEdTB6sve
package main
import (
"encoding/json"
"errors"
"fmt"
"log"
"regexp"
)
var ins = []string{
`id=1, name=peter, age=24`,
`id=2, age=30`,
`name=mary, age=31`,
`id=100`,
`name=bob`,
`age=11`,
}
var ParseError = errors.New("bad parser input")
var Regex *regexp.Regexp
type Test struct {
ID string
Name string
Age string
}
func (t *Test) String() string {
return fmt.Sprintf("ID: %s, Name: %s, Age: %s", t.ID, t.Name, t.Age)
}
func main() {
var err error
Regex, err = regexp.Compile(`([^,\s]*)=([^,\s]*)`)
if err != nil {
log.Panic(err)
}
for _, v := range ins {
test, err := ParseLine(v)
if err != nil {
log.Panic(err)
}
fmt.Println(test)
}
}
func ParseLine(inp string) (*Test, error) {
JSON := fmt.Sprintf("{%s}", Regex.ReplaceAllString(inp, `"$1" : "$2"`))
test := &Test{}
err := json.Unmarshal([]byte(JSON), test)
if err != nil {
return nil, err
}
return test, nil
}
Here is what I believe to be a minimum working case for what you are after, though I am not familiar enough with protocol buffers to get the strings to print right... or even verify if they are correct. Note that this doesn't run in the playground.
package main
import (
"errors"
"fmt"
"log"
"regexp"
"github.com/golang/protobuf/jsonpb"
_ "github.com/golang/protobuf/proto"
)
var ins = []string{
`id=1, name=peter, age=24`,
`id=2, age=30`,
`name=mary, age=31`,
`id=100`,
`name=bob`,
`age=11`,
}
var ParseError = errors.New("bad parser input")
var Regex *regexp.Regexp
type Test struct {
ID *string `protobuf:"bytes,1,opt,name=id,json=id" json:"id,omitempty"`
Name *string `protobuf:"bytes,2,opt,name=name,json=name" json:"name,omitempty"`
Age *string `protobuf:"bytes,3,opt,name=age,json=age" json:"age,omitempty"`
}
func (t *Test) Reset() {
*t = Test{}
}
func (*Test) ProtoMessage() {}
func (*Test) Descriptor() ([]byte, []int) {return []byte{}, []int{0}}
func (t *Test) String() string {
return fmt.Sprintf("ID: %v, Name: %v, Age: %v", t.ID, t.Name, t.Age)
}
func main() {
var err error
Regex, err = regexp.Compile(`([^,\s]*)=([^,\s]*)`)
if err != nil {
log.Panic(err)
}
for _, v := range ins {
test, err := ParseLine(v)
if err != nil {
fmt.Println(err)
log.Panic(err)
}
fmt.Println(test)
}
}
func ParseLine(inp string) (*Test, error) {
JSON := fmt.Sprintf("{%s}", Regex.ReplaceAllString(inp, `"$1" : "$2"`))
test := &Test{}
err := jsonpb.UnmarshalString(JSON, test)
if err != nil {
return nil, err
}
return test, nil
}

You can write a parser for each line of your input, something like the following.
NOTE: I didn't build the struct with proto values because proto is an external package and can't be imported in the playground.
https://play.golang.org/p/hLZvbiMMlZ
package main
import (
"errors"
"fmt"
"strings"
)
var ins = []string{
`id=1, name=peter, age=24`,
`id=2, age=30`,
`name=mary, age=31`,
`id=100`,
`name=bob`,
`age=11`,
}
var ParseError = errors.New("bad parser input")
type Test struct {
ID string
Name string
Age string
}
func (t *Test) String() string {
return fmt.Sprintf("ID: %s, Name: %s, Age: %s", t.ID, t.Name, t.Age)
}
func main() {
for _, v := range ins {
t, err := ParseLine(v)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(t)
}
}
}
func ParseLine(inp string) (*Test, error) {
splt := strings.Split(inp, ",")
test := &Test{}
for _, f := range splt {
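fieldVal := strings.SplitN(strings.TrimSpace(f), "=", 2)
// Reject fragments without "=" instead of indexing past the end of the slice.
if len(fieldVal) != 2 {
return nil, ParseError
}
switch strings.TrimSpace(fieldVal[0]) {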
case "id":
test.ID = strings.TrimSpace(fieldVal[1])
case "name":
test.Name = strings.TrimSpace(fieldVal[1])
case "age":
test.Age = strings.TrimSpace(fieldVal[1])
default:
return nil, ParseError
}
}
return test, nil
}

Related

Dynamically parse yaml field to one of a finite set of structs in Go

I have a yaml file, where one field could be represented by one of possible kinds of structs. To simplify the code and yaml files, let's say I have these yaml files:
kind: "foo"
spec:
fooVal: 4
kind: "bar"
spec:
barVal: 5
And these structs for parsing:
type Spec struct {
Kind string `yaml:"kind"`
Spec interface{} `yaml:"spec"`
}
type Foo struct {
FooVal int `yaml:"fooVal"`
}
type Bar struct {
BarVal int `yaml:"barVal"`
}
I know that I can use map[string]interface{} as the type of the Spec field. But the real example is more complex and involves more possible struct types than just Foo and Bar, which is why I don't want to parse spec into a generic map.
I've found a workaround for this: unmarshal the YAML into an intermediate struct, check the kind field, marshal the map[string]interface{} field back into YAML, and unmarshal it into the concrete type:
var spec Spec
if err := yaml.Unmarshal([]byte(src), &spec); err != nil {
panic(err)
}
tmp, _ := yaml.Marshal(spec.Spec)
if spec.Kind == "foo" {
var foo Foo
yaml.Unmarshal(tmp, &foo)
fmt.Printf("foo value is %d\n", foo.FooVal)
}
if spec.Kind == "bar" {
tmp, _ := yaml.Marshal(spec.Spec)
var bar Bar
yaml.Unmarshal(tmp, &bar)
fmt.Printf("bar value is %d\n", bar.BarVal)
}
But it requires an additional step and consumes more memory (the real YAML file could be bigger than in the examples). Is there a more elegant way to unmarshal YAML dynamically into a finite set of structs?
Update: I'm using github.com/go-yaml/yaml v2.1.0 Yaml parser.
For use with yaml.v2 you can do the following:
type yamlNode struct {
unmarshal func(interface{}) error
}
func (n *yamlNode) UnmarshalYAML(unmarshal func(interface{}) error) error {
n.unmarshal = unmarshal
return nil
}
type Spec struct {
Kind string `yaml:"kind"`
Spec interface{} `yaml:"-"`
}
func (s *Spec) UnmarshalYAML(unmarshal func(interface{}) error) error {
type S Spec
type T struct {
S `yaml:",inline"`
Spec yamlNode `yaml:"spec"`
}
obj := &T{}
if err := unmarshal(obj); err != nil {
return err
}
*s = Spec(obj.S)
switch s.Kind {
case "foo":
s.Spec = new(Foo)
case "bar":
s.Spec = new(Bar)
default:
panic("kind unknown")
}
return obj.Spec.unmarshal(s.Spec)
}
https://play.golang.org/p/Ov0cOaedb-x
For use with yaml.v3 you can do the following:
type Spec struct {
Kind string `yaml:"kind"`
Spec interface{} `yaml:"-"`
}
func (s *Spec) UnmarshalYAML(n *yaml.Node) error {
type S Spec
type T struct {
*S `yaml:",inline"`
Spec yaml.Node `yaml:"spec"`
}
obj := &T{S: (*S)(s)}
if err := n.Decode(obj); err != nil {
return err
}
switch s.Kind {
case "foo":
s.Spec = new(Foo)
case "bar":
s.Spec = new(Bar)
default:
panic("kind unknown")
}
return obj.Spec.Decode(s.Spec)
}
https://play.golang.org/p/ryEuHyU-M2Z
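Both variants above select the concrete type with a switch and panic on an unknown kind. When the set of kinds grows, one option is a small registry that maps each kind to a constructor and returns an error instead of panicking. This is a sketch only: the names kinds and newSpec are assumptions made for illustration, and the YAML decoding itself is left out so the example needs no external package.

```go
package main

import "fmt"

type Foo struct {
	FooVal int `yaml:"fooVal"`
}

type Bar struct {
	BarVal int `yaml:"barVal"`
}

// kinds maps each known kind to a constructor for its concrete spec type.
var kinds = map[string]func() interface{}{
	"foo": func() interface{} { return new(Foo) },
	"bar": func() interface{} { return new(Bar) },
}

// newSpec is a hypothetical helper: it returns an empty value to decode the
// spec node into, or an error for a kind that is not registered.
func newSpec(kind string) (interface{}, error) {
	mk, ok := kinds[kind]
	if !ok {
		return nil, fmt.Errorf("unknown kind %q", kind)
	}
	return mk(), nil
}

func main() {
	v, err := newSpec("foo")
	fmt.Printf("%T %v\n", v, err) // *main.Foo <nil>

	_, err = newSpec("baz")
	fmt.Println(err) // unknown kind "baz"
}
```

Inside UnmarshalYAML, the switch could then be replaced by a call to newSpec(s.Kind), followed by decoding into the returned value.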
You can do this by implementing a custom UnmarshalYAML func. However, with the v2 version of the API, you would basically do the same thing as you do now and just encapsulate it a bit better.
If you switch to using the v3 API however, you get a better UnmarshalYAML that actually lets you work on the parsed YAML node before it is processed into a native Go type. Here's how that looks:
package main
import (
"errors"
"fmt"
"gopkg.in/yaml.v3"
)
type Spec struct {
Kind string `yaml:"kind"`
Spec interface{} `yaml:"spec"`
}
type Foo struct {
FooVal int `yaml:"fooVal"`
}
type Bar struct {
BarVal int `yaml:"barVal"`
}
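func (s *Spec) UnmarshalYAML(value *yaml.Node) error {
if value.Kind != yaml.MappingNode {
return errors.New("expected a mapping")
}
s.Kind = ""
var specNode *yaml.Node
// A mapping node stores its entries as key, value, key, value, ...
for i := 0; i+1 < len(value.Content); i += 2 {
key, val := value.Content[i], value.Content[i+1]
if key.Kind != yaml.ScalarNode {
continue
}
switch key.Value {
case "kind":
if val.Kind != yaml.ScalarNode {
return errors.New("kind is not a scalar")
}
s.Kind = val.Value
case "spec":
specNode = val
}
}
if s.Kind == "" {
return errors.New("missing field `kind`")
}
if specNode == nil {
return errors.New("missing field `spec`")
}
// Decode the spec node (not the whole document) into the concrete type.
switch s.Kind {
case "foo":
var foo Foo
if err := specNode.Decode(&foo); err != nil {
return err
}
s.Spec = foo
case "bar":
var bar Bar
if err := specNode.Decode(&bar); err != nil {
return err
}
s.Spec = bar
default:
return errors.New("unknown kind: " + s.Kind)
}
return nil
}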
var input1 = []byte(`
kind: "foo"
spec:
fooVal: 4
`)
var input2 = []byte(`
kind: "bar"
spec:
barVal: 5
`)
func main() {
var s1, s2 Spec
if err := yaml.Unmarshal(input1, &s1); err != nil {
panic(err)
}
fmt.Printf("Type of spec from input1: %T\n", s1.Spec)
if err := yaml.Unmarshal(input2, &s2); err != nil {
panic(err)
}
fmt.Printf("Type of spec from input2: %T\n", s2.Spec)
}
I suggest looking into the possibility of using YAML tags instead of your current structure to model this in your YAML; tags have been designed exactly for this purpose. Instead of the current YAML
kind: "foo"
spec:
fooVal: 4
you could write
--- !foo
fooVal: 4
Now you don't need the describing structure with kind and spec anymore. Loading this would look a bit different as you'd need a wrapping root type you can define UnmarshalYAML on, but it may be feasible if this is just a part of a larger structure. You can access the tag !foo in the yaml.Node's Tag field.

How the docker container id is generated

I want to know how the container ID is generated. Which source code produces the container ID when docker run is executed?
Here is a code snippet from docker daemon's function for creating Containers:
func (daemon *Daemon) newContainer(name string, config *runconfig.Config, imgID string) (*Container, error) {
var (
id string
err error
)
id, name, err = daemon.generateIDAndName(name)
if err != nil {
return nil, err
}
…
base := daemon.newBaseContainer(id)
…
base.ExecDriver = daemon.execDriver.Name()
return &base, err
}
So the logic for creating the ID and name is in the generateIDAndName function:
func (daemon *Daemon) generateIDAndName(name string) (string, string, error) {
var (
err error
id = stringid.GenerateNonCryptoID()
)
if name == "" {
if name, err = daemon.generateNewName(id); err != nil {
return "", "", err
}
return id, name, nil
}
if name, err = daemon.reserveName(id, name); err != nil {
return "", "", err
}
return id, name, nil
}
Here is the stringid source. The concrete function is generateID, called with false as its input parameter:
func generateID(crypto bool) string {
b := make([]byte, 32)
var r io.Reader = random.Reader
if crypto {
r = rand.Reader
}
for {
if _, err := io.ReadFull(r, b); err != nil {
panic(err) // This shouldn't happen
}
id := hex.EncodeToString(b)
// if we try to parse the truncated for as an int and we don't have
// an error then the value is all numberic and causes issues when
// used as a hostname. ref #3869
if _, err := strconv.ParseInt(TruncateID(id), 10, 64); err == nil {
continue
}
return id
}
}
As you can see, the value is randomly generated using this random reader:
// Reader is a global, shared instance of a pseudorandom bytes generator.
// It doesn't consume entropy.
var Reader io.Reader = &reader{rnd: Rand}

How can I compare two source code files/ ast trees?

I'm generating some source code using the templates package (is there a better method?), and as part of the testing I need to check whether the output matches the expected source code.
I tried a string comparison, but it fails because of the extra spaces and newlines generated by the templates package. I've also tried format.Source with no success. (FAIL)
I tried to parse the AST of both sources (see below), but the ASTs don't match either, even though the code is basically the same except for the newlines and spaces. (FAIL)
package main
import (
"fmt"
"go/parser"
"go/token"
"reflect"
)
func main() {
stub1 := `package main
func myfunc(s string) error {
return nil
}`
stub2 := `package main
func myfunc(s string) error {
return nil
}`
fset := token.NewFileSet()
r1, err := parser.ParseFile(fset, "", stub1, parser.AllErrors)
if err != nil {
panic(err)
}
fset = token.NewFileSet()
r2, err := parser.ParseFile(fset, "", stub2, parser.AllErrors)
if err != nil {
panic(err)
}
if !reflect.DeepEqual(r1, r2) {
fmt.Printf("e %v, r %s, ", r1, r2)
}
}
One simple way to achieve this is to use the go/printer package, which gives you better control over output formatting and is basically like running gofmt on the source, normalizing both trees:
package main
import (
"fmt"
"go/parser"
"go/token"
"go/printer"
//"reflect"
"bytes"
)
func main() {
stub1 := `package main
func myfunc(s string) error {
return nil
}`
stub2 := `package main
func myfunc(s string) error {
return nil
}`
fset1 := token.NewFileSet()
r1, err := parser.ParseFile(fset1, "", stub1, parser.AllErrors)
if err != nil {
panic(err)
}
fset2 := token.NewFileSet()
r2, err := parser.ParseFile(fset2, "", stub2, parser.AllErrors)
if err != nil {
panic(err)
}
// we create two output buffers for each source tree
out1 := bytes.NewBuffer(nil)
out2 := bytes.NewBuffer(nil)
// we use the same printer config for both
conf := &printer.Config{Mode: printer.TabIndent | printer.UseSpaces, Tabwidth: 8}
// print to both outputs
if err := conf.Fprint(out1, fset1, r1); err != nil {
panic(err)
}
if err := conf.Fprint(out2, fset2, r2); err != nil {
panic(err)
}
// they should be identical!
if out1.String() != out2.String() {
panic(out1.String() + "\n" + out2.String())
} else {
fmt.Println("A-OKAY!")
}
}
Of course, this code should be refactored to remove the duplication. Another approach, instead of using DeepEqual, is to write a tree comparison function yourself that skips irrelevant nodes.
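A lighter-weight variant of a custom comparison, named plainly as a swapped-in technique, is to compare token streams instead of trees. The sketch below uses go/scanner, which skips whitespace, blank lines, and (in the default mode) comments. The helper names tokenStream and sameTokens are assumptions for this example. One limitation to keep in mind: a newline can insert an automatic semicolon, so two sources that differ in where statements break across lines may still compare as different.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenStream scans src and returns one string per token, ignoring
// positions, whitespace, blank lines and comments.
func tokenStream(src string) ([]string, error) {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))
	var firstErr error
	var s scanner.Scanner
	s.Init(file, []byte(src), func(pos token.Position, msg string) {
		if firstErr == nil {
			firstErr = fmt.Errorf("%s: %s", pos, msg)
		}
	}, 0) // mode 0: comments are skipped
	var out []string
	for {
		_, tok, lit := s.Scan()
		if tok == token.EOF {
			return out, firstErr
		}
		if tok == token.SEMICOLON {
			lit = "" // treat explicit and automatic semicolons alike
		}
		out = append(out, tok.String()+" "+lit)
	}
}

// sameTokens is a hypothetical helper: it reports whether a and b
// scan to the same sequence of tokens.
func sameTokens(a, b string) (bool, error) {
	ta, err := tokenStream(a)
	if err != nil {
		return false, err
	}
	tb, err := tokenStream(b)
	if err != nil {
		return false, err
	}
	if len(ta) != len(tb) {
		return false, nil
	}
	for i := range ta {
		if ta[i] != tb[i] {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	stub1 := "package main\nfunc myfunc(s string) error {\n\treturn nil\n}\n"
	stub2 := "package main\n\n\nfunc myfunc(s  string)  error {\n\n    return nil\n}\n"
	same, err := sameTokens(stub1, stub2)
	fmt.Println(same, err) // true <nil>
}
```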
This was easier than I thought. All I had to do was remove the empty lines (after formatting). Below is the code.
package main
import (
"fmt"
"go/format"
"strings"
)
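func main() {
// Example inputs (assumed): the same function, differing only in blank lines.
stub1 := "package main\nfunc myfunc(s string) error {\nreturn nil\n}\n"
stub2 := "package main\n\nfunc myfunc(s string) error {\n\nreturn nil\n}\n"
a, err := fmtSource(stub1)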
if err != nil {
panic(err)
}
b, err := fmtSource(stub2)
if err != nil {
panic(err)
}
if a != b {
fmt.Printf("a %v, \n b %v", a, b)
}
}
func fmtSource(source string) (string, error) {
if !strings.Contains(source, "package") {
source = "package main\n" + source
}
b, err := format.Source([]byte(source))
if err != nil {
return "", err
}
// cleanLine replaces double space with one space
cleanLine := func(s string)string{
sa := strings.Fields(s)
return strings.Join(sa, " ")
}
lines := strings.Split(string(b), "\n")
n := 0
var startLn *int
for _, line := range lines {
if line != "" {
line = cleanLine(line)
lines[n] = line
if startLn == nil {
x := n
startLn = &x
}
n++
}
}
lines = lines[*startLn:n]
// Add final "" entry to get trailing newline from Join.
if n > 0 && lines[n-1] != "" {
lines = append(lines, "")
}
// Make it pretty
b, err = format.Source([]byte(strings.Join(lines, "\n")))
if err != nil {
return "", err
}
return string(b), nil
}

How to parse a method declaration?

I'm trying to parse a method declaration. Basically I need to get the syntax node of the receiver base type (type hello) and the return types (notype and error). The ast package seems straightforward but for some reason I don't get the data I need (i.e. the fields are reported nil).
The only useful data seems to be provided in the Object -> Decl field, which is of type interface{}, so I don't think I can serialize it.
Any help would be appreciated. Code below:
package main
import (
"fmt"
"go/ast"
"go/parser"
"go/token"
)
func main() {
// src is the input for which we want to inspect the AST.
src := `
package mypack
// type hello is a cool type
type hello string
// type notype is not that cool
type notype int
// func printme is like nothing else.
func (x *hello)printme(s string)(notype, error){
return 0, nil
}
`
// Create the AST by parsing src.
fset := token.NewFileSet() // positions are relative to fset
f, err := parser.ParseFile(fset, "src.go", src, 0)
if err != nil {
panic(err)
}
// Inspect the AST and find our function
var mf ast.FuncDecl
ast.Inspect(f, func(n ast.Node) bool {
switch x := n.(type) {
case *ast.FuncDecl:
mf = *x
}
return true
})
if mf.Recv != nil {
fmt.Printf("\n receivers:")
for _, v := range mf.Recv.List {
fmt.Printf(",tag %v", v.Tag)
for _, xv := range v.Names {
fmt.Printf("name %v, decl %v, data %v, type %v",
xv.Name, xv.Obj.Decl, xv.Obj.Data, xv.Obj.Type)
}
}
}
}
To get the type, you need to look at the Type field of each receiver, which could be an *ast.StarExpr or an *ast.Ident.
Take a look at this:
package main
import (
"fmt"
"go/ast"
"go/parser"
"go/token"
)
func main() {
// src is the input for which we want to inspect the AST.
src := `
package mypack
// type hello is a cool type
type hello string
// type notype is not that cool
type notype int
// printme is like nothing else.
func (x *hello)printme(s string)(notype, error){
return 0, nil
}
`
// Create the AST by parsing src.
fset := token.NewFileSet() // positions are relative to fset
f, err := parser.ParseFile(fset, "src.go", src, 0)
if err != nil {
panic(err)
}
// Inspect the AST and find our function
var mf ast.FuncDecl
ast.Inspect(f, func(n ast.Node) bool {
switch x := n.(type) {
case *ast.FuncDecl:
mf = *x
}
return true
})
if mf.Recv != nil {
for _, v := range mf.Recv.List {
fmt.Print("recv type : ")
switch xv := v.Type.(type) {
case *ast.StarExpr:
if si, ok := xv.X.(*ast.Ident); ok {
fmt.Println(si.Name)
}
case *ast.Ident:
fmt.Println(xv.Name)
}
}
}
}
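The question also asks for the return types (notype and error). As a sketch, those can be read from the Results field list of the FuncDecl's Type, in the same way the receiver is read from Recv. The helper name describeMethod is an assumption for this example; types.ExprString from go/types is used only to render a type expression as text.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

const src = `package mypack

type hello string
type notype int

func (x *hello) printme(s string) (notype, error) {
	return 0, nil
}
`

// describeMethod is a hypothetical helper: it returns the receiver base type
// and the result types of the first method declared in src.
func describeMethod(src string) (recv string, results []string, err error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return "", nil, err
	}
	for _, decl := range f.Decls {
		fd, ok := decl.(*ast.FuncDecl)
		if !ok || fd.Recv == nil || len(fd.Recv.List) == 0 {
			continue // not a method
		}
		t := fd.Recv.List[0].Type
		if star, ok := t.(*ast.StarExpr); ok {
			t = star.X // strip the pointer to get the base type
		}
		recv = types.ExprString(t)
		if fd.Type.Results != nil {
			for _, field := range fd.Type.Results.List {
				// One field can declare several named results of the same type.
				n := len(field.Names)
				if n == 0 {
					n = 1
				}
				for i := 0; i < n; i++ {
					results = append(results, types.ExprString(field.Type))
				}
			}
		}
		return recv, results, nil
	}
	return "", nil, fmt.Errorf("no method declaration found")
}

func main() {
	recv, results, err := describeMethod(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(recv, results) // hello [notype error]
}
```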

Unmarshalling XML with (xpath)conditions

I'm trying to unmarshal some XML which is structured like the following example:
<player>
<stat type="first_name">Somebody</stat>
<stat type="last_name">Something</stat>
<stat type="birthday">06-12-1987</stat>
</player>
It's dead easy to unmarshal this into a struct like
type Player struct {
Stats []Stat `xml:"stat"`
}
but I'm looking to find a way to unmarshal it into a struct that's more like
type Player struct {
FirstName string `xml:"stat[#Type='first_name']"`
LastName string `xml:"stat[#Type='last_name']"`
Birthday Time `xml:"stat[#Type='birthday']"`
}
is there any way to do this with the standard encoding/xml package?
If not, can you give me a hint on how one would break down such a "problem" in Go? (Best practices in Go software architecture for such a task, basically.)
thank you!
The encoding/xml package doesn't implement xpath, but does have a simple set of selection methods it can use.
Here's an example of how you could unmarshal the XML you have using encoding/xml. Because the stats are all of the same type, with the same attributes, the easiest way to decode them will be into a slice of the same type. http://play.golang.org/p/My10GFiWDa
var doc = []byte(`<player>
<stat type="first_name">Somebody</stat>
<stat type="last_name">Something</stat>
<stat type="birthday">06-12-1987</stat>
</player>`)
type Player struct {
XMLName xml.Name `xml:"player"`
Stats []PlayerStat `xml:"stat"`
}
type PlayerStat struct {
Type string `xml:"type,attr"`
Value string `xml:",chardata"`
}
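Once the stats are in a slice, mapping them onto named fields is a short loop. The sketch below shows one way to do that, including parsing the birthday into a time.Time. Two assumptions are worth flagging: the date layout is taken to be day-month-year ("02-01-2006"), which the sample value does not settle, and the names rawPlayer and parsePlayer are made up for this example.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"time"
)

// rawPlayer mirrors the XML layout: a flat list of typed stats.
type rawPlayer struct {
	Stats []struct {
		Type  string `xml:"type,attr"`
		Value string `xml:",chardata"`
	} `xml:"stat"`
}

// Player is the shape the application wants to work with.
type Player struct {
	FirstName string
	LastName  string
	Birthday  time.Time
}

// parsePlayer is a hypothetical helper: it decodes the XML and copies each
// stat into the matching field, ignoring unknown stat types.
func parsePlayer(doc []byte) (Player, error) {
	var raw rawPlayer
	var p Player
	if err := xml.Unmarshal(doc, &raw); err != nil {
		return p, err
	}
	for _, s := range raw.Stats {
		switch s.Type {
		case "first_name":
			p.FirstName = s.Value
		case "last_name":
			p.LastName = s.Value
		case "birthday":
			// Assumed layout: day-month-year.
			t, err := time.Parse("02-01-2006", s.Value)
			if err != nil {
				return p, err
			}
			p.Birthday = t
		}
	}
	return p, nil
}

func main() {
	doc := []byte(`<player>
<stat type="first_name">Somebody</stat>
<stat type="last_name">Something</stat>
<stat type="birthday">06-12-1987</stat>
</player>`)
	p, err := parsePlayer(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(p.FirstName, p.LastName, p.Birthday.Format("2006-01-02"))
	// Somebody Something 1987-12-06
}
```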
And if it's something you need to transform often, you could do the transformation in an UnmarshalXML method: http://play.golang.org/p/htoOSa81Cn
type Player struct {
XMLName xml.Name `xml:"player"`
FirstName string
LastName string
Birthday string
}
func (p *Player) UnmarshalXML(d *xml.Decoder, start xml.StartElement) error {
for {
t, err := d.Token()
if err != nil {
return err
}
switch el := t.(type) {
case xml.StartElement:
// Decode the whole <stat> element, including its type attribute.
var stat struct {
Type string `xml:"type,attr"`
Value string `xml:",chardata"`
}
if err := d.DecodeElement(&stat, &el); err != nil {
return err
}
switch stat.Type {
case "first_name":
p.FirstName = stat.Value
case "last_name":
p.LastName = stat.Value
case "birthday":
p.Birthday = stat.Value
}
case xml.EndElement:
// DecodeElement consumes each </stat>, so this is the closing </player>.
if el.Name == start.Name {
return nil
}
}
}
}

Resources