Is there any way to serialize and de-serialize objects (such as
pydrake.trajectories.PiecewisePolynomial, Expression ...) using pickle
or some other way?
It does not complain when I serialize it, but when trying to load from file
it complains:
TypeError: pybind11_object.__new__(pydrake.trajectories.PiecewisePolynomial) is not safe, use object.__new__()
Is there a list of classes you would like to serialize / pickle?
I can create an issue for you, or you can create one if you have a list already in mind.
More background:
Pickling for pybind11 (which is what pydrake uses) has to be defined manually:
https://pybind11.readthedocs.io/en/stable/advanced/classes.html#pickling-support
At present, we don't have a roadmap in Drake to serialize everything, so pickling support is added on a per-class basis.
For example, for pickling RigidTransform: issue link and PR link
A simpler pickling example for CameraInfo: PR link
(FTR, if an object is easily recoverable from its construction arguments, it should be trivial to define pickling.)
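In the meantime, one user-side workaround is to register a reducer via copyreg, so that pickle rebuilds the object from construction data instead of calling pybind11's unsafe __new__. This is only a sketch: it assumes your trajectory was built with FirstOrderHold, so that its breaks plus sampled values reconstruct it exactly; trajectories built any other way would need a different reducer.

import copyreg
import pickle

import numpy as np
from pydrake.trajectories import PiecewisePolynomial

def _rebuild_pp(breaks, samples):
    # Must be a module-level function so pickle can find it by name on load.
    return PiecewisePolynomial.FirstOrderHold(breaks, samples)

def _reduce_pp(pp):
    # Capture enough data to rebuild the trajectory; only exact for
    # first-order-hold trajectories.
    breaks = pp.get_segment_times()
    samples = np.hstack([pp.value(t) for t in breaks])
    return (_rebuild_pp, (breaks, samples))

copyreg.pickle(PiecewisePolynomial, _reduce_pp)

pp = PiecewisePolynomial.FirstOrderHold(
    [0.0, 1.0, 2.0], np.array([[0.0, 1.0, 4.0]]))
restored = pickle.loads(pickle.dumps(pp))
assert np.allclose(restored.value(1.5), pp.value(1.5))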
What is the recommended naming scheme for Avro types, so that schema evolution works with backward and forward compatibility and with schema imports? How do you name your types? And how many Schema.Parser instances do you use: one per schema, one global, or some other scheme?
The namespace / type names don't need any special naming scheme to address compatibility.
If you need to rename something, that's what aliases are for.
From what I've seen, reusing a single parser across schemas causes some issues with state maintained by the parser.
So technically you have two options, each with its own benefits and drawbacks:
A) include a version identifier in the namespace or type name
B) do NOT include a version identifier in the namespace or type name
Explanation: if you want to use schema evolution, you do not need to include a version number. Both the Confluent schema registry and Avro's single-object encoding identify schemas by some sort of hash / modified CRC as the schema fingerprint, not by name. When deserializing bytes, you have to know the writer schema, and you can then evolve the data into your reader schema. Those two schemas do not even need to have the same name, because schema resolution does not match on namespace or type name (https://avro.apache.org/docs/current/spec.html#Schema+Resolution). On the other hand, a single Schema.Parser cannot parse more than one schema with the same full name (the fully qualified type of the schema, i.e. namespace.name). So it depends on your use case which one you want; both can be used.
Regarding A): if you do include a version identifier, you will be able to parse both (or all) versions with the same Schema.Parser, which means, for example, that those schemas can be processed together by the avro-maven-plugin (sorry, I do not remember whether I tested that in a single configuration only or also across multiple configurations; you will have to check that yourself). Another benefit is that you can reference the same type in different versions if needed. The drawback is that after each version upgrade the namespace and/or type name changes, and you have to update the imports in your project. Schema resolution between the writer and reader schemas should work, and hopefully it will.
Regarding B): if you do not include a version identifier, only one version can be compiled by the avro-maven-plugin into Java files, and you cannot have one global Schema.Parser instance in the project.

Why would you want just one global instance? It would be helpful if you don't follow the bad but frequent advice to use a top-level union to define multiple types in one avsc file. Well, maybe a top-level union is needed for the Confluent registry, but if you don't use that, you definitely don't have to use one. Instead, you can use schema imports, where a Schema.Parser has to process all the imports first and only then the actual type. If you use these imports, you have to use one Schema.Parser instance for each group of a type plus its imports (see the sketch below). That is a bit of declarational hassle, but it relieves you from the top-level union, which has issues of its own and is incorrect in principle.

If your project doesn't need multiple versions of the same schema accessible at the same time, B) is probably better than A), since you don't have to change imports. Imports also open the possibility of composing schemas: because all versions share the same namespace, you can pass an arbitrary version to the Schema.Parser. So if there is some a --> b association between types, you can take v2 of b and use it with v3 of a. Not sure whether that is a typical use case, but it is possible.
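For illustration, a minimal sketch of the one-parser-per-group approach (the file names and types are made up; org.apache.avro.Schema.Parser is the real API):

import java.io.File;
import java.io.IOException;
import org.apache.avro.Schema;

public class SchemaLoader {
    // One fresh Schema.Parser per group: parse the imported types first, then
    // the type that references them. Reusing this parser for a second schema
    // with the same full name (e.g. another version of Person) would fail.
    public static Schema loadPerson() throws IOException {
        Schema.Parser parser = new Schema.Parser();
        parser.parse(new File("schemas/Address.avsc")); // import first
        return parser.parse(new File("schemas/Person.avsc"));
    }
}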
I'm working with IIB v9 mxsd message definitions. I'd like to define one of the XML elements to be of type xsd:anyType. However, in the list of types I can choose from, only anySimpleType and anyURI are available (besides all the other types like string, integer, etc.).
How can I get around this limitation?
The XMLNSC parser supports the entire XML Schema specification, including xs:any and xs:anyType. In IIBv9 you should create a Library and import your xsds into it. Link your Application to the Library and the XMLNSC parser will find and use the model. You do not need to specify the name of the Library in the node properties; the XSD model will be automatically available to the entire application.
You do not need to use a message set at all in IIBv9 and later versions.
The mxsd file format is used only by the MRM (not DFDL) parser.
You shouldn't use an MXSD to model your XML data, use a normal XSD.
MXSD is for modelling data for the MRM parser; for XML messages you should use the XMLNSC parser and define them in XSDs, in which you can use anyType.
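For illustration, a plain XSD along these lines (element names invented) can be imported into the Library and used by XMLNSC:

<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Envelope">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Payload accepts any well-formed XML content -->
        <xsd:element name="Payload" type="xsd:anyType"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>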
As far as I know DFDL doesn't support anyType.
I'm trying to use different coders for the same class for two different scenarios:
Reading from JSON input files - using data = TextIO.Read.from(options.getInput()).withCoder(new Coder1())
Elsewhere in the job I want the class to be persisted using SerializableCoder - using data.setCoder(SerializableCoder.of(MyClass.class))
It works locally, but fails when run in the cloud with
Caused by: java.io.StreamCorruptedException: invalid stream header: 7B227365.
Is this a supported scenario? The reason to do it in the first place is to avoid reading/writing the JSON format internally, and on the other hand to make reading from the input files more efficient (UTF-8 parsing is part of the JSON reader, so it can read from an InputStream directly).
Clarifications:
Coder1 is my coder.
The other coder is a SerializableCoder.of(MyClass.class)
How does the system choose which coder to use? The two formats are binary-incompatible, and it looks like, due to some optimization, the second coder is being used for data that can only be read by the first coder. (Note that the invalid stream header 7B 22 73 65 is ASCII for {"se, i.e. JSON text being handed to Java deserialization.)
Yes, using two different coders like that should work. (With the caveat that the coder in #2 will only be used if the system chooses to persist 'data' instead of optimizing it into the surrounding computations.)
Are you using your own Coders or ones provided by the Dataflow SDK? Quick caveat on TextIO -- because it uses newlines to encode element boundaries, you'll get into trouble if you use a coder that produces encoded values containing something that can be mistaken for a newline. You really should only use textual encodings within TextIO. We're hoping to make that clearer in the future.
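As a sketch of such a textual, newline-safe coder, written against the Dataflow SDK 1.x API the question uses (MyClass.toJson() and MyClass.fromJson(String) are hypothetical helpers standing in for whatever JSON (de)serialization your class already has):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import com.google.cloud.dataflow.sdk.coders.CustomCoder;

public class MyClassJsonCoder extends CustomCoder<MyClass> {
  @Override
  public void encode(MyClass value, OutputStream out, Context context) throws IOException {
    // JSON string escaping keeps literal newlines out of the payload, so
    // TextIO's newline-delimited framing stays intact.
    out.write(value.toJson().getBytes(StandardCharsets.UTF_8));
  }

  @Override
  public MyClass decode(InputStream in, Context context) throws IOException {
    // In the outer context (whole-value encoding, as within TextIO) we may
    // simply read until end of stream.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    byte[] chunk = new byte[4096];
    int n;
    while ((n = in.read(chunk)) != -1) {
      buffer.write(chunk, 0, n);
    }
    return MyClass.fromJson(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
  }
}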
I'm currently writing some functions related to lists that could possibly be reused.
My question is:
Are there any conventions or best practices for organizing such functions?
To frame this question, I would ideally like to "extend" the existing lists module such that I'm calling my new function the following way: lists:my_function(). At the moment I have lists_extensions:my_function(). Is there any way to do this?
I read about Erlang packages and that they are essentially namespaces in Erlang. Is it possible to define a new namespace for lists with new list functions?
Note that I'm not looking to fork and change the standard lists module, but to find a way to define new functions in a new module also called lists, avoiding the consequent naming collisions by using some kind of namespacing scheme.
Any advice or references would be appreciated.
Cheers.
To frame this question, I would ideally like to "extend" the existing lists module such that I'm calling my new function the following way: lists:my_function(). At the moment I have lists_extensions:my_function(). Is there any way to do this?
No, so far as I know.
I read about Erlang packages and that they are essentially namespaces in Erlang. Is it possible to define a new namespace for lists with new list functions?
They are experimental and not generally used. You could have a module called lists in a different namespace, but you would have trouble calling functions from the standard module in this namespace.
Let me give you reasons not to use lists:your_function() and to use lists_extension:your_function() instead:
Generally, the Erlang/OTP Design Guidelines state that each "application" -- libraries are also applications -- contains modules. You can then ask the system which application introduced a specific module; that mechanism would break if modules were fragmented across applications.
However, I do understand why you would want a lists:your_function/N:
It's easier to use for the author of your_function, because they need your_function(...) a lot when working with lists. But when another Erlang programmer -- who knows the stdlib -- reads the code, they will not know what it does, which is confusing.
It looks more concise than lists_extension:your_function/N. That's a matter of taste.
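For completeness, a tiny sketch of the conventional layout (the function itself is just an illustration):

-module(lists_extensions).
-export([intersperse/2]).

%% Insert Sep between consecutive elements, e.g.
%% lists_extensions:intersperse(0, [1,2,3]) -> [1,0,2,0,3].
intersperse(_Sep, []) -> [];
intersperse(_Sep, [X]) -> [X];
intersperse(Sep, [X | Rest]) -> [X, Sep | intersperse(Sep, Rest)].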
I think this method would work on any distro:
You can make an application that automatically rewrites the core Erlang modules of whichever distribution is running: append your custom functions to the core modules and recompile them before compiling and running your own application that calls them. This doesn't require a custom distribution, just some careful planning and use of the file tools and the BIFs for compiling and loading.
* You want to make sure you don't append your functions every time. Once you rewrite the file, the change is permanent unless the user replaces the file later. You could use a check with module_info to confirm whether your custom functions exist, to decide whether you need to run the extension writer.
Pseudo Example:
lists_funs() -> ["myFun() -> <<\"things to do\">>."].

extend_lists() ->
    {ok, Io} = file:open(?LISTS_MODULE_PATH, [append]),
    lists:foreach(fun(Fun) -> io:format(Io, "~s~n", [Fun]) end, lists_funs()),
    file:close(Io),
    %% c/1 is a shell command; from module code use c:c/1 instead. Note that
    %% myFun/0 would also have to be added to an -export attribute in
    %% lists.erl before other modules could call it.
    c:c(?LISTS_MODULE_PATH).
* You may want to keep copies of the original modules to restore if the compiler fails; that way you don't have to do anything heavy if you make a mistake in your list of functions, and you can also use them as the source whenever you want to rewrite a module to extend it with more functions.
* You could use a lists_extension module to keep all of the logic for your functions, and have the functions appended to lists simply delegate to it, as in funName(Args) -> lists_extension:funName(Args).
* You could also make an override system that searches for existing functions and rewrites them in a similar way but it is more complicated.
I'm sure there are plenty of ways to improve and optimize this method. I use something similar to update some of my own modules at runtime, so I don't see any reason it wouldn't work on core modules also.
I guess what you want is to have some of your functions accessible from the lists module. It is good that you want to convert commonly used code into a library.
One way to do this is to test your functions well, and once they are fine, copy them and paste them into the lists.erl module (WARNING: make sure you do not overwrite existing functions; just paste at the end of the file). This file can be found at $ERLANG_INSTALLATION_FOLDER/lib/stdlib-{$VERSION}/src/lists.erl. Make sure you add your functions to those exported by the lists module (in the -export([your_function/1,.....])) to make them accessible from other modules. Save the file.
Once you have done this, we need to recompile the lists module. You could use an Emakefile, with the following contents:
{"src/*", [verbose,report,strict_record_tests,warn_obsolete_guard,{outdir, "ebin"}]}.
Copy that text into a file called Emakefile and put it in the path $ERLANG_INSTALLATION_FOLDER/lib/stdlib-{$VERSION}/.
Once this is done, open an Erlang shell whose current working directory (check it with pwd()) is the directory the Emakefile is in, i.e. $ERLANG_INSTALLATION_FOLDER/lib/stdlib-{$VERSION}/.
Call the function: make:all() in the shell and you will see that the module lists is recompiled. Close the shell.
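A minimal sketch of that session (output abbreviated):

$ cd $ERLANG_INSTALLATION_FOLDER/lib/stdlib-{$VERSION}
$ erl
1> pwd().
2> make:all().
up_to_date
3> q().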
Once you open a new Erlang shell, and assuming you exported your functions in the lists module, they will work the way you want, right in the lists module.
Erlang being open source allows us to add functionality, recompile, and reload its libraries. This should do what you want; good luck.
For my project, I need to store translations in the database, for which I've implemented a Doctrine data source. However, I would like to leave the standard translations (sf_admin and messages) in XML and keep them under source control. Is it possible to have two i18n instances that use different data sources? Or maybe one instance that can load data from different sources according to the dictionary name?
I don't think there is a solution that doesn't require overriding sfI18n. An sfMessageSource_Aggregate exists but it seems nearly impossible to configure factories.yml to initialize it correctly.
You probably need to implement your own sfI18n::createMessageSource that constructs the Aggregate, passing the different sources to its constructor.
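A rough, untested sketch of that override (class names as in symfony 1.x; the 'Doctrine' source stands for your own custom message source class, and the exact constructor arguments of sfMessageSource_Aggregate are an assumption to verify against your symfony version):

class myI18n extends sfI18n
{
  public function createMessageSource($dir)
  {
    // Aggregate the file-based catalogues (sf_admin, messages) with the
    // Doctrine-backed source; assumed to accept an array of sources.
    return new sfMessageSource_Aggregate(array(
      sfMessageSource::factory('XLIFF', $dir),
      sfMessageSource::factory('Doctrine'),
    ));
  }
}

You would then point the i18n entry of factories.yml at myI18n so the factory instantiates your subclass.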