Is xml header necessary in ssml request in the SSML Specification? - xml-parsing

I wonder whether we strictly need to specify the XML header in our requests to SSML synthesizers, like the following:
<?xml version="1.0"?><speak>hello world</speak>
or does the SSML standard also allow this?
<speak>hello world</speak>

In the SSML specification, the section https://www.w3.org/TR/speech-synthesis/#S2.1 says:
A legal stand-alone Speech Synthesis Markup Language document must
have a legal XML Prolog [XML §2.8].
In XML 1.0 the XML declaration is an optional part of the prolog, so both <?xml version="1.0"?><speak>hello world</speak> and <speak>hello world</speak> are well-formed, and it is not strictly necessary to specify an XML header.

Azure TTS: Yes, xml header required
Google TTS: No
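
To illustrate the well-formedness point, here is a minimal Python sketch that parses both variants with the standard library XML parser. The helper name ensure_xml_declaration is hypothetical; it is only a convenience for services that insist on the header and is not part of any TTS SDK.

```python
import xml.etree.ElementTree as ET

WITH_PROLOG = '<?xml version="1.0"?><speak>hello world</speak>'
WITHOUT_PROLOG = '<speak>hello world</speak>'


def root_and_text(document):
    """Parse the document and return (root tag, text content).

    ET.fromstring raises ParseError if the input is not well-formed XML.
    """
    root = ET.fromstring(document)
    return root.tag, root.text


def ensure_xml_declaration(ssml):
    """Hypothetical helper: prepend an XML declaration if one is missing."""
    if ssml.startswith("<?xml"):
        return ssml
    return '<?xml version="1.0"?>' + ssml


# Both variants parse without error and yield the same root element.
print(root_and_text(WITH_PROLOG))
print(root_and_text(WITHOUT_PROLOG))
```

This only checks XML well-formedness. Whether a given synthesizer accepts the short form is a separate question, as the list above shows.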

Related

OpenAPI 3, file download, content-type when content is not known in advance [duplicate]

Do I have to specify a MIME type if the uploaded file has no extension?
In other words is there a default general MIME type?
You can use application/octet-stream for unknown types.
RFC 2046 states in section 4.5.1:
The "octet-stream" subtype is used to
indicate that a body contains
arbitrary binary data.
RFC resources:
We should use RFC-7231 (HTTP/1.1 Semantics and Content) as the reference instead of RFC-2046 (Media Types), because the question was clearly about the HTTP Content-Type header.
Also, RFC-2046 does not clearly define unknown types, but RFC-7231 does.
Short answer:
Do not send a MIME type for unknown data.
To be clearer: do not use the Content-Type header at all.
References:
RFC-7231, Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, 3.1.1.5. Content-Type
A sender that generates a message containing a payload body SHOULD
generate a Content-Type header field in that message unless the
intended media type of the enclosed representation is unknown to the
sender.
That section clearly tells you to leave it out if you don't know the type for sure.
It also says that the receiver may assume the type is application/octet-stream, but the data might just as well be something else.
What's different then?
RFC-2046, 4.5.1. Octet-Stream Subtype
The recommended action for an implementation that receives an
"application/octet-stream" entity is to simply offer to put the data
in a file, with any Content-Transfer-Encoding undone, or perhaps to
use it as input to a user-specified process.
And, as already stated above:
RFC-7231, 3.1.1.5. Content-Type
If a Content-Type header field is not present, the recipient
MAY either assume a media type of "application/octet-stream"
([RFC2046], Section 4.5.1) or examine the data to determine its type.
Conclusion:
If you define it as "application/octet-stream", you are saying that you know it is "application/octet-stream".
If you don't define it, you are saying that you don't know what it is and are leaving the decision to the receiver, who can then check whether it walks like a duck and...
I prefer application/unknown, but the result will surely be the same as with application/octet-stream.
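
The RFC-7231 reading above can be sketched in a few lines of Python. Both function names are hypothetical, and headers are modelled as a plain dict with an exact-case key, which is a simplification (real HTTP header names are case-insensitive). The receiver here takes the first of the two options the RFC permits, the octet-stream fallback, rather than sniffing the data.

```python
from typing import Dict, Optional


def build_headers(media_type: Optional[str]) -> Dict[str, str]:
    """Sender side: emit Content-Type only when the media type is known."""
    headers = {}
    if media_type is not None:
        headers["Content-Type"] = media_type
    return headers


def effective_media_type(headers: Dict[str, str]) -> str:
    """Receiver side: fall back to application/octet-stream if the header is absent."""
    return headers.get("Content-Type", "application/octet-stream")


# Unknown type: the sender omits the header and the receiver applies its fallback.
print(effective_media_type(build_headers(None)))
# Known type: the header is passed through unchanged.
print(effective_media_type(build_headers("application/pdf")))
```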

Apache Tika Server - Request Header Parameters?

The Apache Tika Server provides a REST API to extract text from a document. It is also possible to set specific request header parameters like X-Tika-PDFOcrStrategy, e.g.:
$ curl -T test/Dokument01.pdf http://localhost:9998/tika --header "X-Tika-PDFOcrStrategy: ocr_only"
Across a lot of different documents about Tika, I found these additional header parameters documented:
X-Tika-OCRLanguage: eng
X-Tika-PDFextractInlineImages: true | false
X-Tika-PDFOcrStrategy: ocr_only | ocr_and_text_extraction
X-Tika-OCRoutputType: hocr
But there seems to be no documentation about how to use the X-Tika-.....? header parameters, or about which parameters are supported and which are not.
For example, I wonder if it is possible to override the ImageType mode or the DPI with something like:
X-Tika-PDFocrImageType: rgb
X-Tika-PDFocrDPI: 100
My question is: which header parameters are supported, and which naming convention do these parameters follow?
The code that handles the X-Tika-OCR and X-Tika-PDF headers is TikaResource.processHeaderConfig.
Those header suffixes and values are then mapped onto the TesseractOCRConfig and PDFParserConfig configuration objects via reflection.
So, to see what X-Tika headers you can set, look up the options on the config class you want to tweak things on (Tesseract or PDF), then build the name, then set the header. If you are not sure what the option does, or what values it takes, look at the JavaDocs for the underlying setter method that will get called.
For example, setExtractInlineImages on the PDF config maps to X-Tika-PDFextractInlineImages.
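
As a rough illustration of that naming convention, the following Python sketch maps a header name back to the config class and the setter it would correspond to. The function setter_for_header and the PREFIXES table are hypothetical and are not Tika's code. It assumes the part after the prefix is the setter name without "set", with the casing of its first letter ignored, which is consistent with the headers quoted above.

```python
# Prefix -> config class named in the answer above.
PREFIXES = {
    "X-Tika-OCR": "TesseractOCRConfig",
    "X-Tika-PDF": "PDFParserConfig",
}


def setter_for_header(header_name):
    """Return (config class name, setter name) for an X-Tika-OCR/PDF header."""
    for prefix, config_class in PREFIXES.items():
        suffix = header_name[len(prefix):]
        if header_name.startswith(prefix) and suffix:
            # Upper-case the first letter so both "extractInlineImages"
            # and "OcrStrategy" resolve to a conventional setter name.
            return config_class, "set" + suffix[0].upper() + suffix[1:]
    raise ValueError("not an X-Tika-OCR or X-Tika-PDF header: " + header_name)


print(setter_for_header("X-Tika-PDFextractInlineImages"))
print(setter_for_header("X-Tika-OCRLanguage"))
```

To check whether a header such as X-Tika-PDFocrDPI would be honoured, look for the resulting setter name in the JavaDocs of the corresponding config class.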

Add CDATA Section in Existing XML String in Swift 3.0/ObjectiveC

I have an XML string that I get as the response from one API. Now I need to send this XML to another API inside a CDATA section. So my question is: how can I add a CDATA section to an existing XML document or XML string?
This scenario is a SOAP API in Swift 3.0.
Below is the xml
Code
<tem:InputSTR><![CDATA[ This is where you put your xml ]]></tem:InputSTR>
Please provide some code the next time you post a question; it will make it easier for others to help you.
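
Since the wrapping is plain string concatenation, a minimal sketch may help. It is written in Python for illustration rather than Swift, and the function names are hypothetical. The one subtlety is that the payload must not contain the sequence "]]>", which would end the CDATA section early; the usual workaround, used here, is to split that sequence across two adjacent CDATA sections.

```python
def wrap_in_cdata(xml_text):
    """Wrap a string in a CDATA section, splitting any embedded "]]>"."""
    # "]]>" inside the payload would terminate the section prematurely,
    # so close the section after "]]" and reopen a new one before ">".
    safe = xml_text.replace("]]>", "]]]]><![CDATA[>")
    return "<![CDATA[" + safe + "]]>"


def build_input_element(xml_text):
    """Place the wrapped payload inside the tem:InputSTR element from the answer."""
    return "<tem:InputSTR>" + wrap_in_cdata(xml_text) + "</tem:InputSTR>"


print(build_input_element("<order><id>1</id></order>"))
```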

Spring-WS WSA header contains hardcoded mustUnderstand attribute

When sending WSA headers with Spring-WS, the wsa:To field always contains the attribute mustUnderstand="true". By looking at the source code, I found that this attribute is hardcoded in AbstractAddressingVersion.java. As far as I can tell from the W3C standard, the mustUnderstand attribute is not mandatory.
Is there a reason why Spring-WS hardcodes it? We have difficulties when integrating Spring-WS with some other SOAP stacks because of this attribute.
If you file a JIRA here, we can make it customizable.

Link to text file (resource) in Javadoc

I did my search but couldn't find the right answer... How can I link to a resource text file in Javadoc? {#link easywords.txt} doesn't work. Easy words doesn't work either.
Try Easy words instead.
A Link should be a URL.
The browser may treat D as the protocol that should handle the request.
For literature: http://en.wikipedia.org/wiki/File_URI_scheme
