I am working on an Xtext grammar that uses XExpressions and also operates on EClasses. Now I also want to be able to access EClasses from within an XExpression; for example, I want to write an expression like this:
Eclass1.attribute1 = Eclass2.attribute1
How can I use the EClass from within the XExpression?
Grammar
grammar org.xtext.example.mydsl.Mydsl with
org.eclipse.xtext.xbase.Xbase
import "http://www.eclipse.org/emf/2002/Ecore" as ecore
generate mydsl "http://www.xtext.org/example/mydsl/Mydsl"
Model:
(operations += Operation)*;
terminal ATTR : ID ('.' ID)+;
Operation:
'operation' left=[ecore::EClass|ATTR] 'and' right=
[ecore::EClass|ATTR] 'defined' 'as' condition=XExpression
;
Inferrer / infer method
def dispatch void infer(Model element, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(element.toClass("example.mydsl")) [
for (operation : element.operations) {
var left = operation.left
var right = operation.right
if (left.eIsProxy()) {
left = EcoreUtil.resolve(left, operation) as EClass
}
if (right.eIsProxy()) {
right = EcoreUtil.resolve(right, operation) as EClass
}
//field for right class left out, but works the same
members += left.toField(left.name,typeRef(left.EPackage.name+"."+left.name))
members += operation.toMethod("conditionExpr",
typeRef(Void.TYPE)) [
body = operation.condition
]
}
]
}
RuntimeModule
class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
def Class<? extends ImplicitlyImportedFeatures> bindImplicitlyImportedTypes() {
return MyImportFeature
}
}
MyImportFeature
class MyImportFeature extends ImplicitlyImportedFeatures{
override protected getStaticImportClasses() {
(super.getStaticImportClasses() + #[PackageFromWorkSpace]).toList
}
}
I am not sure I understand your question.
Usually EMF generates constants for EAttributes, so if you want to access the attributes themselves,
you could either do
MyDslPackage.Literals.GREETING__NAME
or
MyDslPackage.eINSTANCE.getGreeting_Name()
Can you give some more hints on what you actually want to do?
Update: here is a snippet showing how to get a Java class from a reference to an EClass:
Thingy:{
val EClass eclazz = f.clazz
val uri = EcorePlugin.getEPackageNsURIToGenModelLocationMap(true).get(eclazz.EPackage.nsURI)
val rs = new ResourceSetImpl
val r = rs.getResource(uri, true)
r.load(null)
val p = r.contents.head
if (p instanceof GenModel) {
val genClass = p.findGenClassifier(eclazz)
if (genClass instanceof GenClass) {
println(genClass.qualifiedInterfaceName)
members+=f.toField(eclazz.name, genClass.qualifiedInterfaceName.typeRef)
}
}
}
The question: how do I configure Xtext and Xbase so that my DSL file (the one with the DSL extension, ".myx") can use classes that have not yet been generated by the JvmModelInferrer?
Here is the language grammar:
grammar org.xtext.example.mydsl.MyX with org.eclipse.xtext.xbase.Xbase
generate myX "http://www.xtext.org/example/mydsl/MyX"
import "http://www.eclipse.org/xtext/xbase/Xbase" as xbase
Model:
expressions+=CommonExpression*;
CommonExpression:
Anime | AnimeResource
;
AnimeResource:
'AnimeRes' name=ID '{'
(args+=FullJvmFormalParameter)*
'}'
;
Anime:
'watch' name=ID body=XBlockExpression
;
Here is what I want to achieve (test.myx):
AnimeRes Resource {
}
watch Watcher {
val someStub = Resource.create()
}
So the DSL file looks as if a static method were defined for the Resource class. In reality, additional parameters must be passed to Resource. They are pure boilerplate in my case, which is why I don't want to pass them into "create" each time.
This is how I want the generated file to look in order to achieve that:
package test;
public class Model {
private int id= 0;
public static class Resource {
private int id;
public Resource(final int id) {
this.id = id;
}
}
public class ResourceCreator {
public Resource create() {
return new Resource(id /* the creator is a non-static inner class */);
}
}
public ResourceCreator Resource = new ResourceCreator();
}
That way I'm kind of cheating. I have a variable that has the name of the class, so in the client code it looks as if a static method is called, when it is really just a builder that is named like the class. Here is the JvmModelInferrer that generates a similar-looking file:
import org.eclipse.xtext.xbase.jvmmodel.AbstractModelInferrer
import org.xtext.example.mydsl.myX.Model
import org.eclipse.xtext.xbase.jvmmodel.IJvmDeclaredTypeAcceptor
import org.eclipse.xtext.naming.QualifiedName
import com.google.inject.Inject
import org.eclipse.xtext.xbase.jvmmodel.JvmTypesBuilder
import org.xtext.example.mydsl.myX.AnimeResource
import org.eclipse.xtext.common.types.JvmVisibility
import org.xtext.example.mydsl.myX.Anime
class MyXJvmModelInferrer extends AbstractModelInferrer {
@Inject extension JvmTypesBuilder
def dispatch void infer(Model element, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(element.toClass(QualifiedName.create("test", "Model"))) [
for (expression : element.expressions) {
switch (expression ) {
AnimeResource: {
members += expression.toClass(expression.name) [
static = true
visibility = JvmVisibility.PUBLIC
val _members = members
expression.args.forEach [
_members += expression.toField(name, parameterType) [
static = false
visibility = JvmVisibility.PUBLIC
]
]
members += expression.toField("id", typeRef(int))
members += expression.toConstructor [
val _parameters = parameters
expression.args.forEach [
_parameters += it.toParameter(name, parameterType)
]
_parameters += expression.toParameter("id", typeRef(int))
body = '''
«FOR param : parameters»this.«param.name» = «param.name»;
«ENDFOR»
'''
]
]
members += expression.toField("id", typeRef(int))
members += element.toClass(expression.name + "Creator") [
static = false
visibility = JvmVisibility.PUBLIC
members += element.toMethod("create", typeRef(expression.name)) [
val parameters = parameters
expression.args.forEach [
parameters += it.toParameter(name, parameterType)
]
body = '''
return new «expression.name»(«FOR param : parameters»«param.name», «ENDFOR»id);
'''
]
]
members += expression.toField(expression.name, typeRef(expression.name + "Creator")) [
visibility = JvmVisibility.PUBLIC
initializer = '''
new «expression.name + "Creator"»()
'''
]
}
Anime: {
members += expression.toMethod(expression.name, typeRef(void)) [
body = expression.body
]
}
}
}
]
}
}
The problem that I've faced with this approach:
So it seems that some linking fails, but I cannot work out how to fix it, or which bindings I should override and how.
Any help would be appreciated.
UPD. Updated the description with compilable ModelInferrer (sorry). The problem happens when I try to use XBlockExpression of watch block to generate Java code for a method inside of Model class. So if I have such DSL file:
AnimeRes Resource {
}
watch Watcher {
val some = Resource.create()
}
AND also use the Anime branch in the Inferrer, the described problem happens.
If I have the same file and do not use the Anime branch (commented out like this):
// Anime: {
// members += expression.toMethod(expression.name, typeRef(void)) [
// body = expression.body
// ]
// }
then there is no problem but I need to generate that method.
You need to use proper names for the inner types, i.e. "test.Model$" + expression.name instead of just expression.name:
members += element.toClass(expression.name + "Creator") [
static = false
visibility = JvmVisibility.PUBLIC
members += element.toMethod("create", typeRef("test.Model$"+expression.name)) [
val parameters = parameters
expression.args.forEach [
parameters += it.toParameter(name, parameterType)
]
body = '''
return new «expression.name»(«FOR param : parameters»«param.name», «ENDFOR»id);
'''
]
]
members += expression.toField(expression.name, typeRef("test.Model$"+expression.name + "Creator")) [
visibility = JvmVisibility.PUBLIC
initializer = '''
new «expression.name + "Creator"»()
'''
]
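The "$" in the names above is the separator Java itself uses in the binary name of a nested class, as opposed to its simple name. A minimal plain-Java sketch of the difference (ModelSketch, Resource and ResourceCreator are hypothetical stand-ins for the generated classes, and the sketch lives in the default package, so there is no "test." prefix):

```java
// Hypothetical stand-ins for the generated classes; not produced by Xtext.
class ModelSketch {

    static class Resource {
    }

    class ResourceCreator {
    }

    /** Returns the binary name, which includes the enclosing class and a '$'. */
    static String binaryName(Class<?> type) {
        return type.getName();
    }

    public static void main(String[] args) {
        // Binary names of the nested types
        System.out.println(binaryName(Resource.class));
        System.out.println(binaryName(ResourceCreator.class));
        // Simple name, which is what expression.name alone corresponds to
        System.out.println(Resource.class.getSimpleName());
    }
}
```

The simple name alone does not say which outer class the type belongs to, which is presumably why the plain expression.name could not be linked.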
I am struggling to validate (non-duplication) globally, across multiple files that do not explicitly reference each other.
Consider the standard, initially generated grammar:
grammar org.xtext.example.mydsl.MyDsl with org.eclipse.xtext.common.Terminals
generate myDsl "http://www.xtext.org/example/mydsl/MyDsl"
Model:
greetings+=Greeting*;
Greeting:
'Hello' name=ID '!';
It is simple to validate that no single file contains two greetings with the same name.
package org.xtext.example.mydsl.validation
import org.eclipse.xtext.validation.Check
import org.xtext.example.mydsl.myDsl.Greeting
import org.xtext.example.mydsl.myDsl.Model
import org.xtext.example.mydsl.myDsl.MyDslPackage
class MyDslValidator extends AbstractMyDslValidator {
public static val LOCALLY_DUPLICATE_NAME = 'LOCALLY_DUPLICATE_NAME'
@Check
def checkGreetingLocallyUnique(Greeting greeting) {
for(greeting_ : (greeting.eContainer as Model).greetings) {
if(greeting!==greeting_ && greeting.name==greeting_.name) {
warning('Greeting duplication',
MyDslPackage.Literals.GREETING__NAME,
LOCALLY_DUPLICATE_NAME)
}
}
}
}
I do not understand how to validate non-duplication against all the files known to the global index.
The stub of the method is
@Check
def checkGreetingGloballyUnique(Greeting greeting) {
for(greeting_ : /*???*/ ) {
if(greeting!==greeting_ && greeting.name==greeting_.name) {
warning('Global Greeting duplication',
MyDslPackage.Literals.GREETING__NAME,
GLOBALLY_DUPLICATE_NAME)
}
}
}
How do I get access to the global index from within the validator?
The easiest way to get local duplicate validation is to enable it in the workflow and regenerate the language (this does not check globally, though):
validator = {
composedCheck = "org.eclipse.xtext.validation.NamesAreUniqueValidator"
}
To search the index:
@Inject
IContainer.Manager containermanager;
@Inject
ResourceDescriptionsProvider resourceDescriptionsProvider;
public .... getAllEntitiesFor( EObject eObject ) {
....
IResourceDescriptions resourceDescriptions = resourceDescriptionsProvider.getResourceDescriptions( eObject.eResource() );
IResourceDescription resourceDescription = resourceDescriptions.getResourceDescription( eObject.eResource().getURI() );
List<IContainer> visiblecontainers = containermanager.getVisibleContainers( resourceDescription, resourceDescriptions );
for (IContainer container : visiblecontainers) {
for (IEObjectDescription eobjectDescription : container.getExportedObjects()) {
EObject eObjectOrProxy = eobjectDescription.getEObjectOrProxy();
.....
}
}
....
}
After much hacking, I obtained the following.
public static val GLOBALLY_DUPLICATE_NAME = 'GLOBALLY_DUPLICATE_NAME'
@com.google.inject.Inject
IResourceDescriptions iResourceDescriptions
@Inject
Provider<XtextResourceSet> resourceSetProvider;
@Check
def checkGreetingGloballyUnique(Greeting greeting) {
for (resourceDescriptions : iResourceDescriptions.allResourceDescriptions) {
for (_greetingDescription : resourceDescriptions.getExportedObjectsByType(MyDslPackage.Literals.GREETING)) {
val _greeting = resourceSetProvider.get.getEObject(_greetingDescription.EObjectURI, true) as Greeting
// don't use equality, ALWAYS not equal!!
if (greeting.eResource.URI != _greeting.eResource.URI) {
// this means distinct files, all greetings in same file have same uri
if (greeting.name == _greeting.name) {
warning('Global greeting duplication', MyDslPackage.Literals.GREETING__NAME,
GLOBALLY_DUPLICATE_NAME)
}
}
}
}
}
After a rewrite based on @Christian Dietrich's comments, I have the following solution.
@Inject
IContainer.Manager containerManager;
@com.google.inject.Inject
IResourceDescriptions resourceDescriptions
@Inject
Provider<XtextResourceSet> resourceSetProvider;
@Check
def checkGreetingGloballyUnique(Greeting greeting) {
var greeting_description = resourceDescriptions.getResourceDescription(greeting.eResource.URI)
var visibleContainers = containerManager.getVisibleContainers(greeting_description, resourceDescriptions)
for (visibleContainer : visibleContainers) {
for (_greetingDescription : visibleContainer.getExportedObjectsByType(MyDslPackage.Literals.GREETING)) {
val _greeting = resourceSetProvider.get.getEObject(_greetingDescription.EObjectURI, true) as Greeting
// don't use equality, ALWAYS greeting != _greeting !!
if (greeting.eResource.URI != _greeting.eResource.URI) {
// this means distinct files, all greetings in same file have same uri
if (greeting.name == _greeting.name) {
warning('Global greeting duplication', MyDslPackage.Literals.GREETING__NAME,
GLOBALLY_DUPLICATE_NAME)
}
}
}
}
}
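The core of this check only needs two pieces of information per exported object: its name and the URI of the file that exports it. Both should be available on the IEObjectDescription itself (getName() and getEObjectURI()), so it may be possible to compare them without loading every Greeting through a resource set. A minimal plain-Java sketch of the comparison, where Exported is a hypothetical stand-in for an index entry and the file names are made up:

```java
import java.util.Arrays;
import java.util.List;

class GlobalDuplicateSketch {

    /** Hypothetical stand-in for one index entry: an exported name and the URI of the file exporting it. */
    static final class Exported {
        final String name;
        final String resourceUri;

        Exported(String name, String resourceUri) {
            this.name = name;
            this.resourceUri = resourceUri;
        }
    }

    /** True if some other file in the index exports the same name as the candidate. */
    static boolean isGloballyDuplicated(Exported candidate, List<Exported> index) {
        for (Exported other : index) {
            boolean otherFile = !other.resourceUri.equals(candidate.resourceUri);
            if (otherFile && other.name.equals(candidate.name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Exported> index = Arrays.asList(
            new Exported("World", "a.mydsl"),
            new Exported("World", "b.mydsl"),
            new Exported("Xtext", "b.mydsl"));
        // "World" is exported by two different files, "Xtext" by only one
        System.out.println(isGloballyDuplicated(index.get(0), index));
        System.out.println(isGloballyDuplicated(index.get(2), index));
    }
}
```

Comparing the file URI rather than object identity matches the comment in the validator: objects loaded into a fresh resource set are never identical to the one being validated.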
I have a DSL that includes blocks that need to be wrapped as methods returned inside an anonymous class created by the generated code. For example:
model {
task {
val x = 2*5;
Math.pow(2, x)
}
}
should compile to (note task becoming an instance of Runnable, with the body of the task becoming the body of the Runnable.run() method):
import java.util.ArrayList;
import java.util.Collection;
@SuppressWarnings("all")
public class MyFile {
public Collection<Runnable> tasks() {
ArrayList<Runnable> tasks = new ArrayList<>();
tasks.add(getTask0());
return tasks;
}
public static Runnable getTask0() {
Runnable _runnable = new Runnable() {
public void run() {
final int x = (2 * 5);
Math.pow(2, x);
}
};
return _runnable;
}
}
Following the discussion in this question, I was able to get this particular example to work. (Github repo includes unit tests.) But I had to do it by representing the Task element in the grammar as a sequence of XExpressions (source), which my XbaseCompiler subclass had to iterate over (source).
Instead, it would have been nice to be able to just have Task contain an XBlockExpression in a property action, and then in the compiler just do doInternalToJavaStatement(expr.action, it, isReferenced). My sense is that this is really the "right" solution in my case, but when I tried it, this would result in an empty body of the generated run method, as if the block was not processed at all. What's going on, and am I missing some required bits of setup/wiring things together/bindings that are necessary for this to work?
You usually try to avoid that by using a better inference strategy, e.g.:
Grammar
Model:
{Model}"model" "{"
vars+=Variable*
tasks+=Task*
"}"
;
Variable:
"var" name=ID ":" type=JvmParameterizedTypeReference
;
Task:
{Task} "task" content=XBlockExpression
;
Inferrer
class MyDslJvmModelInferrer extends AbstractModelInferrer {
@Inject extension JvmTypesBuilder
def dispatch void infer(Model element, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(element.toClass("test.Model2")) [
for (v : element.vars) {
members+=v.toField(v.name, v.type.cloneWithProxies) [
]
}
var i = 0;
for (t : element.tasks) {
val doRunName = "doRun"+i
members += t.toMethod("task"+i, Runnable.typeRef()) [
body = '''
return new «Runnable» () {
public void run() {
«doRunName»();
}
};
'''
]
members += t.toMethod(doRunName, Void.TYPE.typeRef()) [
body = t.content
]
i = i + 1
}
]
}
}
And that is basically it.
You may also want to follow https://bugs.eclipse.org/bugs/show_bug.cgi?id=481992
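For orientation, the Java that the inferrer above aims for should have roughly the following shape: the anonymous Runnable is plain template text, and the Xbase block ends up in an ordinary method that the Runnable delegates to. This is a hand-written sketch, not actual generator output; the class name Model2Sketch and the result field are assumptions added so the effect of the task is observable.

```java
// Hand-written sketch of the delegation pattern; not actual generator output.
class Model2Sketch {

    // Hypothetical field, only here to make the task's effect visible
    int result;

    /** Corresponds to the inferred "task0" method with the template body. */
    public Runnable task0() {
        return new Runnable() {
            public void run() {
                doRun0();
            }
        };
    }

    /** Corresponds to the inferred "doRun0" method whose body is the task's block. */
    public void doRun0() {
        final int x = 2 * 5;
        result = (int) Math.pow(2, x);
    }

    public static void main(String[] args) {
        Model2Sketch model = new Model2Sketch();
        model.task0().run();
        System.out.println(model.result);
    }
}
```

Because the block is compiled as a normal method body, no custom type computer or compiler is needed for this variant.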
If you really want to adapt the Xbase type system, that may be a lot more work, e.g. (just covering a minimal case):
Grammar
Model:
{Model}"model" "{"
vars+=Variable*
tasks+=Task*
"}"
;
Variable:
"var" name=ID ":" type=JvmParameterizedTypeReference
;
Task:
{Task} "task" content=XTaskContent
;
XTaskContent returns xbase::XExpression:
{XTaskContent} block=XBlockExpression
;
Inferrer
class MyDslJvmModelInferrer extends AbstractModelInferrer {
@Inject extension JvmTypesBuilder
def dispatch void infer(Model element, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(element.toClass("test.Model")) [
for (v : element.vars) {
members+=v.toField(v.name, v.type.cloneWithProxies) [
]
}
var i = 0;
for (t : element.tasks) {
members += t.toMethod("task"+i, Runnable.typeRef()) [
body = t.content
]
i = i + 1
}
]
}
}
Type Computer
class MyDslTypeComputer extends XbaseTypeComputer {
override computeTypes(XExpression expression, ITypeComputationState state) {
if (expression instanceof XTaskContent) {
_computeTypes(expression as XTaskContent, state);
} else {
super.computeTypes(expression, state)
}
}
protected def void _computeTypes(XTaskContent object, ITypeComputationState state) {
state.withExpectation(getPrimitiveVoid(state)).computeTypes(object.block)
state.acceptActualType(getTypeForName(Runnable, state), ConformanceFlags.CHECKED_SUCCESS )
}
}
Compiler
class MyDslCompiler extends XbaseCompiler {
override protected internalToConvertedExpression(XExpression obj, ITreeAppendable appendable) {
if (obj instanceof XTaskContent) {
appendable.append("new ").append(Runnable).append("() {").newLine
appendable.increaseIndentation
appendable.append("public void run()").newLine
reassignThisInClosure(appendable, null)
internalToJavaStatement(obj.block, appendable, false)
appendable.newLine
appendable.decreaseIndentation
appendable.newLine.append("}")
} else {
super.internalToConvertedExpression(obj, appendable)
}
}
}
Bindings
class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
def Class<? extends ITypeComputer> bindITypeComputer() {
return MyDslTypeComputer
}
def Class<? extends XbaseCompiler> bindXbaseCompiler() {
return MyDslCompiler
}
}
One part of my Xtext grammar defines a block for classes that should print all of its expressions. That part of the grammar looks as follows:
Print:
{Print}
'print' '{'
print += PrintLine*
'}';
PrintLine:
obj += XExpression;
Now I use the following inferrer code to create a print() method:
Print: {
members += feature.toMethod('print', typeRef(void)) [
body = '''
«FOR printline : feature.print»
System.out.println(«printline.obj»);
«ENDFOR»
'''
]
}
Ok, I go ahead and test it with the following code in a class:
print {
"hallo"
4
6 + 7
}
And the result is the following:
public void print() {
System.out.println([org.eclipse.xtext.xbase.impl.XStringLiteralImpl@20196ba8 (value: hallo)]);
System.out.println([org.eclipse.xtext.xbase.impl.XNumberLiteralImpl@7d0b0f7d (value: 4)]);
System.out.println([<XNumberLiteralImpl> + <XNumberLiteralImpl>]);}
Of course, I was hoping for:
public void print() {
System.out.println("hallo");
System.out.println(4);
System.out.println(6+7);
}
I understand that I might have to call the compiler somehow in the inferrer for «printline.obj», but I am really not sure how.
I think you are approaching this on the wrong basis. This sounds to me like an extension of Xbase, not just a simple use of it.
import "http://www.eclipse.org/xtext/xbase/Xbase" as xbase
Print:
{Print}
'print'
print=XPrintBlock
;
XPrintBlock returns xbase::XBlockExpression:
{xbase::XBlockExpression}'{'
expressions+=XPrintLine*
'}'
;
XPrintLine returns xbase::XExpression:
{XPrintLine} obj=XExpression
;
Type Computer
class MyDslTypeComputer extends XbaseTypeComputer {
def dispatch computeTypes(XPrintLine literal, ITypeComputationState state) {
state.withNonVoidExpectation.computeTypes(literal.obj)
state.acceptActualType(getPrimitiveVoid(state))
}
}
Compiler
class MyDslXbaseCompiler extends XbaseCompiler {
override protected doInternalToJavaStatement(XExpression obj, ITreeAppendable appendable, boolean isReferenced) {
if (obj instanceof XPrintLine) {
appendable.trace(obj)
appendable.append("System.out.println(")
internalToJavaExpression(obj.obj,appendable);
appendable.append(");")
appendable.newLine
return
}
super.doInternalToJavaStatement(obj, appendable, isReferenced)
}
}
XExpressionHelper
class MyDslXExpressionHelper extends XExpressionHelper {
override hasSideEffects(XExpression expr) {
if (expr instanceof XPrintLine || expr.eContainer instanceof XPrintLine) {
return true
}
super.hasSideEffects(expr)
}
}
JvmModelInferrer
def dispatch void infer(Print print, IJvmDeclaredTypeAcceptor acceptor, boolean isPreIndexingPhase) {
acceptor.accept(
print.toClass("a.b.C") [
members+=print.toMethod("demo", Void.TYPE.typeRef) [
body = print.print
]
]
)
}
Bindings
class MyDslRuntimeModule extends AbstractMyDslRuntimeModule {
def Class<? extends ITypeComputer> bindITypeComputer() {
MyDslTypeComputer
}
def Class<? extends XbaseCompiler> bindXbaseCompiler() {
MyDslXbaseCompiler
}
def Class<? extends XExpressionHelper> bindXExpressionHelper() {
MyDslXExpressionHelper
}
}
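As for the output seen in the question: interpolating «printline.obj» into a string template just calls toString() on the AST node, so the generated code contains the node's debug representation instead of compiled Java. A plain-Java sketch of that difference, where StringLiteralNode and its toJava() method are hypothetical stand-ins for an Xbase literal and for what the compiler would emit:

```java
class TemplateToStringSketch {

    /** Hypothetical stand-in for an AST node such as a string literal. */
    static final class StringLiteralNode {
        final String value;

        StringLiteralNode(String value) {
            this.value = value;
        }

        /** Debug representation, similar in spirit to what EMF objects print. */
        @Override
        public String toString() {
            return "StringLiteralNode (value: " + value + ")";
        }

        /** What a compiler would emit for this node (hypothetical helper). */
        String toJava() {
            return "\"" + value + "\"";
        }
    }

    /** Mimics the template: string concatenation implicitly calls toString(). */
    static String viaToString(StringLiteralNode node) {
        return "System.out.println(" + node + ");";
    }

    /** Mimics handing the node to a compiler instead. */
    static String viaCompiler(StringLiteralNode node) {
        return "System.out.println(" + node.toJava() + ");";
    }

    public static void main(String[] args) {
        StringLiteralNode node = new StringLiteralNode("hallo");
        System.out.println(viaToString(node));
        System.out.println(viaCompiler(node));
    }
}
```

Assigning the expression to body in the inferrer, as done above, is what lets Xbase compile the node rather than print it.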
I'm trying to use a custom Coder so that I can do some transforms, but I'm having trouble getting the PCollection to use my custom coder, and I suspect (???) it's because it's wrapped in a KV. Specifically:
Pipeline p = Pipeline.create ...
p.getCoderRegistry().registerCoder(MyClass.class, MyClassCoder.class);
...
PCollection<String> input = ...
PCollection<KV<String, MyClass>> t = input.apply(new ToKVTransform());
When I try to run something like this, I get a java.lang.ClassCastException and a stacktrace that includes a SerializableCoder instead of MyClassCoder like I would expect.
[error] at com.google.cloud.dataflow.sdk.coders.SerializableCoder.decode(SerializableCoder.java:133)
[error] at com.google.cloud.dataflow.sdk.coders.SerializableCoder.decode(SerializableCoder.java:50)
[error] at com.google.cloud.dataflow.sdk.coders.KvCoder.decode(KvCoder.java:95)
[error] at com.google.cloud.dataflow.sdk.coders.KvCoder.decode(KvCoder.java:42)
I see that the answer to another, somewhat related question (Using TextIO.Write with a complicated PCollection type in Google Cloud Dataflow) says to map everything to strings, and use that to pass stuff around PCollections. Is that really the recommended way??
(Note: the actual code is in Scala, but I'm pretty sure it's not a Scala <=> Java issue so I've translated it into Java here.)
Update to include Scala code and more background:
So this is the actual exception itself (should have included this at the beginning):
java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.HashMap$SerializationProxy to field com.example.schema.Schema.keyTypes of type scala.collection.immutable.Map in instance of com.example.schema.Schema
Where com.example.schema.Schema is:
case class Schema(id: String, keyTypes: Map[String, Type])
And lastly, the SchemaCoder is:
class SchemaCoder extends com.google.cloud.dataflow.sdk.coders.CustomCoder[Schema] {
def decode(inputStream: InputStream, context: Context): Schema = {
val ois = new ObjectInputStream(inputStream)
val id: String = ois.readObject().asInstanceOf[String]
val javaMap: java.util.Map[String, Type] = ois.readObject().asInstanceOf[java.util.Map[String, Type]]
ois.close()
Schema(id, javaMap.asScala.toMap)
}
def encode(schema: Schema, outputStream: OutputStream, context: Context): Unit = {
val baos = new ByteArrayOutputStream()
val oos = new ObjectOutputStream(baos)
oos.writeObject(schema.id)
val javaMap: java.util.Map[String, Type] = schema.keyTypes.asJava
oos.writeObject(javaMap)
oos.close()
val encoded = new String(Base64.encodeBase64(baos.toByteArray()))
outputStream.write(encoded.getBytes())
}
}
====
Edit2: And here's what ToKVTransform actually looks like:
class SchemaExtractorTransform extends PTransform[PCollection[String], PCollection[Schema]] {
class InferSchemaFromStringWithKeyFn extends DoFn[String, KV[String, Schema]] {
override def processElement(c: DoFn[String, KV[String, Schema]]#ProcessContext): Unit = {
val line = c.element()
inferSchemaFromString(line)
}
}
class GetFirstFn extends DoFn[KV[String, java.lang.Iterable[Schema]], Schema] {
override def processElement(c: DoFn[KV[String, java.lang.Iterable[Schema]], Schema]#ProcessContext): Unit = {
val idAndSchemas: KV[String, java.lang.Iterable[Schema]] = c.element()
val it: java.util.Iterator[Schema] = idAndSchemas.getValue().iterator()
c.output(it.next())
}
}
override def apply(inputLines: PCollection[String]): PCollection[Schema] = {
val schemasWithKey: PCollection[KV[String, Schema]] = inputLines.apply(
ParDo.named("InferSchemas").of(new InferSchemaFromStringWithKeyFn())
)
val keyed: PCollection[KV[String, java.lang.Iterable[Schema]]] = schemasWithKey.apply(
GroupByKey.create()
)
val schemasOnly: PCollection[Schema] = keyed.apply(
ParDo.named("GetFirst").of(new GetFirstFn())
)
schemasOnly
}
}
This problem doesn't reproduce in Java; Scala is doing something differently with types that breaks Dataflow coder inference. To work around this, you can call setCoder on a PCollection to set its Coder explicitly, such as
schemasWithKey.setCoder(KvCoder.of(StringUtf8Coder.of(), SchemaCoder.of()));
Here's the Java version of your code, just to make sure that it's doing approximately the same thing:
public static class SchemaExtractorTransform
extends PTransform<PCollection<String>, PCollection<Schema>> {
class InferSchemaFromStringWithKeyFn extends DoFn<String, KV<String, Schema>> {
public void processElement(ProcessContext c) {
c.output(KV.of(c.element(), new Schema()));
}
}
class GetFirstFn extends DoFn<KV<String, java.lang.Iterable<Schema>>, Schema> {
private static final long serialVersionUID = 0;
public void processElement(ProcessContext c) {
c.output(c.element().getValue().iterator().next());
}
}
public PCollection<Schema> apply(PCollection<String> inputLines) {
PCollection<KV<String, Schema>> schemasWithKey = inputLines.apply(
ParDo.named("InferSchemas").of(new InferSchemaFromStringWithKeyFn()));
PCollection<KV<String, java.lang.Iterable<Schema>>> keyed =
schemasWithKey.apply(GroupByKey.<String, Schema>create());
PCollection<Schema> schemasOnly =
keyed.apply(ParDo.named("GetFirst").of(new GetFirstFn()));
return schemasOnly;
}
}
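Separately from coder inference, note that the SchemaCoder in the question Base64-encodes the serialized bytes in encode but reads a raw object stream in decode, so it would probably fail as soon as it is actually used. A minimal plain-Java sketch of a symmetric pair, using only the JDK; the class name, the String-valued map (standing in for Map[String, Type]) and the roundTrip helper are assumptions for illustration, and this is not a Dataflow Coder:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.AbstractMap;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

class SymmetricCodecSketch {

    /** Serializes id and map, then Base64-encodes the result. */
    static byte[] encode(String id, Map<String, String> keyTypes) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(id);
            oos.writeObject(new HashMap<>(keyTypes));
        }
        return Base64.getEncoder().encode(baos.toByteArray());
    }

    /** Reverses encode: Base64-decodes first, then reads the two objects in the same order. */
    @SuppressWarnings("unchecked")
    static Map.Entry<String, Map<String, String>> decode(byte[] encoded)
            throws IOException, ClassNotFoundException {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            String id = (String) ois.readObject();
            Map<String, String> keyTypes = (Map<String, String>) ois.readObject();
            return new AbstractMap.SimpleEntry<>(id, keyTypes);
        }
    }

    /** Convenience wrapper that turns checked exceptions into unchecked ones. */
    static Map.Entry<String, Map<String, String>> roundTrip(String id, Map<String, String> keyTypes) {
        try {
            return decode(encode(id, keyTypes));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> keyTypes = new HashMap<>();
        keyTypes.put("name", "string");
        Map.Entry<String, Map<String, String>> result = roundTrip("schema-1", keyTypes);
        System.out.println(result.getKey() + " " + result.getValue());
    }
}
```

In a real coder the same rule applies: whatever transformation encode applies last, decode has to undo first.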