I know that SocialEngine stores language files as CSV files in application/languages. The common format in the CSV files is as follows:
"Source word"; "Translated word"
But, this sometimes gets very complicated, especially when special characters are used in some parts, e.g.:
"Total Credits : %s";"Total Credits : %s"
"_EMAIL_SITEGROUP_BADGEREQUEST_APPROVED_EMAIL_TITLE";"Group Badge Request Approved"
"Video conversion failed. Please try uploading %1$sagain%2$s.";"Video conversion failed. Please try uploading %1$sagain%2$s."
"{item:$subject} replied to a comment on {item:$owner}\'\'s page offer {item:$object:$title}: {body:$body}";"{item:$subject} replied to a comment on {item:$owner}\'\'s page offer {item:$object:$title}: {body:$body}"
"3%s Level Category:";"3%s Level Category:"
"I have read and agree to the <a href='javascript:void(0);' onclick=window.open('%s','mywindow','width=500,height=500')>terms of service</a>.";"I have read and agree to the <a href='javascript:void(0);' onclick=window.open('%s','mywindow','width=500,height=500')>terms of service</a>."
%s
- this is a variable (a placeholder). When a string has several variables, they are numbered:
%1$s, %2$s
and so on, with any number in place of X:
%X$s
This is the key (in your case):
"Total Credits : %s"
This is the delimiter:
;
and this is your translation:
"Total Credits : %s"
Cheers;)
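The key / delimiter / translation split described above can be checked with a short script. This is only a minimal Python sketch, not part of SocialEngine: the load_translations name and the translated sample strings are invented for illustration, and it assumes every row uses double quotes and a semicolon, as in the examples from the question.

```python
import csv
import io

def load_translations(text):
    """Map each source key to its translation from semicolon-separated rows."""
    reader = csv.reader(
        io.StringIO(text),
        delimiter=";",
        quotechar='"',
        skipinitialspace=True,  # tolerate a space after the delimiter
    )
    return {row[0]: row[1] for row in reader if len(row) >= 2}

# Hypothetical sample: the keys come from the question, the translations are invented.
SAMPLE = (
    '"Total Credits : %s";"Gesamtguthaben : %s"\n'
    '"3%s Level Category:"; "Kategorie der 3%s Ebene:"\n'
)

translations = load_translations(SAMPLE)
print(translations["Total Credits : %s"])
```

Note that the %s placeholder has to survive unchanged in the translated column; only the surrounding words are translated.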
Without worrying about any of that, you can use this plugin: Language Translator / Multilingual Plugin
I tried to parse SEC company filings from sec.gov. Starting from the FB 10-Q index.htm, let's look at a complete submission text filing. It has a structure like:
<SEC-DOCUMENT>
<SEC-HEADER>
<ACCEPTANCE-DATETIME>"some content" This tag is not closed.
"some lines resembling yaml markup"
These are indented lines with a
"key": "value" structure.
</SEC-HEADER>
<DOCUMENT>
.
.
some content
.
.
</DOCUMENT>
"several DOCUMENT tags" ...
</SEC-DOCUMENT>
I tried to figure out the structure of the <SEC-HEADER> tag, found some information in the Public Dissemination Service (PDS) Technical Specification (pdf), and concluded that the content of the header should be SGML.
Nevertheless, I am clueless about the formatting, since there are no angle brackets and the key-value pairs are separated by colons, like key: value instead of <key>value</key>. In the PDF I could not find anything about colons.
Question: Is the <SEC-HEADER> tag valid SGML? If it is, how do I parse it?
I'd be glad of any help.
The short answer is no. The <SEC-HEADER> tag in the raw filing is not valid SGML.
However, it is my understanding that this section in the raw filing is parsed automatically from the header file <accession_num>.hdr.sgml, which does follow SGML. This header file can be found in the same directory as the raw filing (i.e., the <accession_num>.txt file).
I use a REGEX of the form: ^<(.+?)>(.+?)$ (with re.MULTILINE option) to capture each (tag, value) tuple and get the results directly in a dict().
I believe the only tag in that file that has a closing tag is the </FILER> tag, where there could be multiple filers in each filing. You can first extract those using a REGEX of the form: <FILER>(.+?)</FILER> and then employ the same REGEX as above to get the inner tags for each filer.
Note that other than 'FILER', there could be other tags, representing different relations of the entities to the filing. Those are 'ISSUER', 'SUBJECT COMPANY', 'FILED BY', 'FILED FOR', 'SERIAL COMPANY', 'REPORTING OWNER'.
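The two-step regex approach described above could look roughly like this in Python. It is a sketch under assumptions: the parse_header name and the sample header text are made up for illustration (they are not copied from a real .hdr.sgml file), and only the FILER relation is handled.

```python
import re

# One "<TAG>value" pair per line; tags without a value on the same line are skipped.
TAG_VALUE = re.compile(r"^<(.+?)>(.+?)$", re.MULTILINE)
# Everything between <FILER> and </FILER>, across line breaks.
FILER_BLOCK = re.compile(r"<FILER>(.+?)</FILER>", re.DOTALL)

def parse_header(text):
    """Return (top-level tags, list of per-filer tags) as dictionaries."""
    filers = [dict(TAG_VALUE.findall(block)) for block in FILER_BLOCK.findall(text)]
    # Remove the filer blocks so their inner tags do not leak into the top level.
    top_level = dict(TAG_VALUE.findall(FILER_BLOCK.sub("", text)))
    return top_level, filers

# Hypothetical header text, invented for this example.
SAMPLE = """<ACCESSION-NUMBER>0000000000-00-000000
<TYPE>10-Q
<FILER>
<CONFORMED-NAME>EXAMPLE CORP
<CIK>0000000001
</FILER>
<FILER>
<CONFORMED-NAME>EXAMPLE SUBSIDIARY
<CIK>0000000002
</FILER>
"""

top_level, filers = parse_header(SAMPLE)
print(top_level)
print(filers)
```

The other relation tags listed above (ISSUER, REPORTING OWNER, and so on) would need the same treatment as FILER; a flat dict also keeps only the last value if a tag repeats.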
We are looking at certain .yml files that store the translations of a Rails application. For example, the structure of en.yml is as follows:
en:
blog:
left_navigation:
list_topic: "Blog topics blah blah"
articles:
show:
by_author: "By %{author}"
number:
currency:
format:
separator: "."
delimiter: ","
format: "%u%n"
admin:
blog:
topics:
form:
topic_name: "Topic name"
topic_parent: "Parent topic"
save: "Save"
cancel: "Cancel"
As part of our team's translation procedure, we have translators translate changes in Excel, then reexport the new .yml file via macros. The relevant code is:
...
FilePathAndName = ExportFolder & ExportLang & ".yml"
...
' Third argument True = write the file as Unicode (UTF-16), not UTF-8
Set xmlFile = fsObject.createtextfile(FilePathAndName, True, True)
maxline = TargetRange.Rows.Count
i = 1
For Each mCell In TargetRange
    line = mCell.Value
    ' Write a line break after every line except the last one
    xmlFile.write line & IIf(i = TargetRange.Rows.Count, vbNullString, vbCrLf)
    i = i + 1
Next mCell
xmlFile.Close
...
However, Sublime Text cannot show a FileDiff when both files are opened together under the "Open Folder" interface. Similarly, when they are loaded into SourceTree, it says that the two files are not identical but fails to show the difference.
We need this file difference to verify that the translations were done with correct syntax. Could somebody help us?
Thanks for your ideas #Odatnurd and #barrowc
The issue comes from the encoding of the second (destination) file. Apparently it needed to be UTF-8. So I just resaved this file in Sublime Text with Encoding = UTF-8 and FileDiff worked again.
So a final question is, how do I make xmlFile and fsObject in Excel VBA save the file in UTF-8 format?
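Until the macro itself writes UTF-8, the manual "resave as UTF-8" step can be scripted. The following is a minimal Python sketch, not a VBA fix: it assumes the exported file is UTF-16 with a byte-order mark (which is what createtextfile produces when its third argument is True), and the reencode_to_utf8 name and the file names are hypothetical.

```python
import tempfile
from pathlib import Path

def reencode_to_utf8(src, dst, src_encoding="utf-16"):
    """Rewrite src as UTF-8 in dst, leaving the line breaks untouched."""
    raw = Path(src).read_bytes()
    text = raw.decode(src_encoding)  # the "utf-16" codec consumes the byte-order mark
    Path(dst).write_bytes(text.encode("utf-8"))

# Demonstration on a throwaway file standing in for the macro's output.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "export_utf16.yml"
    dst = Path(tmp) / "export_utf8.yml"
    src.write_bytes('en:\r\n  save: "Save"'.encode("utf-16"))
    reencode_to_utf8(src, dst)
    result = dst.read_bytes()

print(result)
```

Working on bytes rather than text mode keeps the vbCrLf line endings exactly as the macro wrote them, so the diff only shows real content changes.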
I am scripting with DM and would like to read hdf5 file format.
I borrowed Tore Niermann's gms_HDF5_Plug-In (hdf5_GMS2X_amd64.dll) and his CMD_import_hdf5.s script. It uses h5_read_dataset(filename, datapath) to read an image dataset.
I am trying to figure out how to read string information stored in the same file. I am particularly interested in reading the angle stored as a string, as shown in this figure (the demonstrated string to read). The h5_read_dataset(filename, datapath) function doesn't work for reading strings.
There is a help file (hdf5_plugin.chm) with a list of functions, but unfortunately I can't open the entries to see more info.
hdf5_plugin.chm showing the function list.
I suppose the right function to read strings should be something like h5_read_attr() or h5_info(), but I could not test them: DM always says the two functions don't exist.
After reading out the angle as a string, I will also need a bit of help converting the string to a double datatype.
Thank you.
Converting a string to a number is done with the Val() command.
There is no integer/double/float concept for variables in DM-script; all are just numbers. (This is different for images, where you can define the numeric type. Also, for file import/export a type differentiation can be made using the taggroup streaming commands in the other answer.)
Example script:
string numStr = "1.234e-2"
number num = val( numStr )
ClearResults()
Result( "\n As string:" + numStr )
Result( "\n As value:" + num )
Result( "\n As value, formatted:" + Format(num,"%3.2f") )
Potential answer regarding the .chm files: When you download (or email) .chm files in Windows, the OS classifies them as "potentially dangerous" (because they could contain executable HTML code, I think). As a result, these files cannot be shown by default. However, you can right-click these files and "unblock" them in the file properties.
I think this will be most likely a question specific to that plugin and not general DM scripting. So it might be better to contact the plugin-author directly.
The alternative (not good) solution would be to write your own HDF5 file reader, if you know the file format. For this you would need the "Streaming" commands of the DM script language to browse through the (binary?) source file to the appropriate file location. A starting point for reading on this is the F1 help documentation.
I'm practicing extracting data from an XML site and I'm using Nokogiri to read and parse. I need to analyze the data but for now, I'm just trying to get an output with no success.
I have the following code:
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::HTML(open("http://www.ibiblio.org/xml/examples/shakespeare/macbeth.xml"))
doc.xpath('//PERSONA').each do |char_element|
puts char_element.text
end
I'm simply trying to read the characters off the XML website, but I'm not getting any results when I run it in the terminal. I also tried just writing a simple xpath call such as the one below:
doc.xpath("//PERSONA")
or
doc.xpath("PLAY TITLE")
And I get either an error or it simply acts as if nothing was entered.
I have put a simple function to test it so I know it's reading it. Can anyone tell me what I'm doing wrong?
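You're trying to read an XML file as an HTML one.
Please try this example (on Ruby 3.0+, open-uri requires URI.open rather than a plain open):
require 'nokogiri'
require 'open-uri'
doc = Nokogiri::XML(URI.open("http://www.ibiblio.org/xml/examples/shakespeare/macbeth.xml"))
doc.xpath('//PERSONA').each { |ce| p ce.text }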
"DUNCAN, king of Scotland."
"MALCOLM"
"DONALBAIN"
"MACBETH"
"BANQUO"
"MACDUFF"
"LENNOX"
"ROSS"
"MENTEITH"
"ANGUS"
"CAITHNESS"
"FLEANCE, son to Banquo."
"SIWARD, Earl of Northumberland, general of the English forces."
"YOUNG SIWARD, his son."
"SEYTON, an officer attending on Macbeth."
"Boy, son to Macduff. "
"An English Doctor. "
"A Scotch Doctor. "
"A Soldier."
"A Porter."
"An Old Man."
"LADY MACBETH"
"LADY MACDUFF"
"Gentlewoman attending on Lady Macbeth. "
"HECATE"
"Three Witches."
"Apparitions."
"Lords, Gentlemen, Officers, Soldiers, Murderers, Attendants, and Messengers. "
Please be sure you're using Nokogiri::XML instead of Nokogiri::HTML
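For comparison, here is a minimal Python sketch of the same query using the standard library's XML parser. The inline snippet is a hand-written stand-in for the real macbeth.xml (its exact nesting is an assumption), so the example runs without network access.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the play file; the real document is much larger.
SNIPPET = """<PLAY>
  <TITLE>The Tragedy of Macbeth</TITLE>
  <PERSONAE>
    <PERSONA>DUNCAN, king of Scotland.</PERSONA>
    <PGROUP>
      <PERSONA>MALCOLM</PERSONA>
      <PERSONA>DONALBAIN</PERSONA>
    </PGROUP>
  </PERSONAE>
</PLAY>"""

root = ET.fromstring(SNIPPET)
# iter() walks the whole tree, like the //PERSONA XPath expression.
names = [persona.text for persona in root.iter("PERSONA")]
print(names)
```

Tag names are matched case-sensitively here, so the query has to use PERSONA exactly as it is spelled in the document.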
I have to work with some SWI-Prolog code that opens a new stream (which creates a file on the file system) and pours some data in. The generated file is read somewhere else later on in the code.
I would like to replace the file stream with a string stream in Prolog so that no files are created and then read everything that was put in the stream as one big string.
Does SWI-Prolog have string streams? If so, how could I use them to accomplish this task? I would really appreciate it if you could provide a small snippet. Thank you!
SWI-Prolog implements memory files (library(memfile)). Here is a snippet from some old code of mine that does both the write and the read:
%% html2text(+Html, -Text) is det.
%
% convert from html to text
%
html2text(Html, Text) :-
    html_clean(Html, HtmlDescription),
    new_memory_file(Handle),
    open_memory_file(Handle, write, S),
    format(S, '<html><head><title>html2text</title></head><body>~s</body></html>', [HtmlDescription]),
    close(S),
    open_memory_file(Handle, read, R, [free_on_close(true)]),
    load_html_file(stream(R), [Xml]),
    close(R),
    xpath(Xml, body(normalize_space), Text).
Another option is using with_output_to/2 combined with current_output/1:
write_your_output_to_stream(Stream) :-
    format(Stream, 'example output\n', []),
    format(Stream, 'another line', []).
str_out(Codes) :-
    with_output_to(codes(Codes), (
        current_output(Stream),
        write_your_output_to_stream(Stream)
    )).
Usage example:
?- portray_text(true), str_out(C).
C = "example output
another line"
Of course, you can choose between redirecting output to atom, string, list of codes (as per example above), etc., just use the corresponding parameter to with_output_to/2:
with_output_to(atom(Atom), ... )
with_output_to(string(String), ... )
with_output_to(codes(Codes), ... )
with_output_to(chars(Chars), ... )
See with_output_to/2 documentation:
http://www.swi-prolog.org/pldoc/man?predicate=with_output_to/2
Later on, you could use open_string/2, open_codes_stream/2 and similar predicates to open string/list of codes as an input stream to read data.