How to convert ASN.1 Erlang notation into ASN.1 value notation - erlang

I want to receive an aligned PER-encoded ASN.1 message and decode it to ASN.1 value notation. Are there any tools available? Erlang has support for encoding and decoding, and for reading value notation from a file, but decoding only yields Erlang terms, not value notation.
'S1AP':decode('S1AP-PDU', [32,17,0,23,0,0,2,0,105,0,11,0,0,98,242,33,0,0,195,92,0,51,0,87,64,1,25]).
{ok,{successfulOutcome,{'SuccessfulOutcome',17,reject,{'S1SetupResponse',[{'ProtocolIE-Field',105,reject,[{'ServedGUMMEIsItem',["bò!"],["Ã\\"],["3"],asn1_NOVALUE}]},{'ProtocolIE-Field',87,ignore,25}]}}}}
How do I continue from the above code? I would like to get the PDU like here http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One#Example or as below (taken from the wiki page):
myQuestion FooQuestion ::= {
    trackingNumber 5,
    question "Anybody there?"
}

You need to include the autogenerated .hrl files that contain the record definitions for your ASN.1 protocol data. They must be either in the same folder as the ASN.1 source or in ../include. After that you'll be able to use record syntax such as #'FooQuestion'{trackingNumber = TrackingNumber, question = Question} to pattern-match the data.

Related

Why is the following piece of Lua code completely valid?

From my Lua knowledge (and according to what I have read in the Lua manuals), I've always been under the impression that an identifier in Lua is limited to A-Z, a-z, _, and digits (and cannot start with a digit or be a reserved keyword, i.e. local local = 123 is invalid).
And now I have run into some (obfuscated) Lua program which uses all kinds of weird characters in an identifier:
https://i.imgur.com/HPLKMxp.png
-- Most likely, copy+paste won't work. Download the file from https://tknk.io/7HHZ
print(_VERSION .. " " .. (jit and "JIT" or "non-JIT"))
local T = {}
T.math = T.math or {}
T.math.​â®â€‹âŞâ®â€‹­ď»żâ€Śâ€­âŽ­ = math.sin
T.math.â¬â€‹â­â¬â­â«â®â€­â€¬ = math.cos
for k, v in pairs(T.math) do print(k, v) end
Output:
Lua 5.1 JIT
â¬â€‹â­â¬â­â«â®â€­â€¬ function: builtin#45
​â®â€‹âŞâ®â€‹­ď»żâ€Śâ€­âŽ­ function: builtin#44
It is unclear to me, why is this set of characters allowed for an identifier?
In other words, why is it a completely valid Lua program?
Unlike some languages, Lua is not really defined by a formal specification, one which covers every contingency and entirely explains all of Lua's behavior. Something as simple as "what character set is a Lua file encoded in" isn't really explained in Lua's documentation.
All the docs say about identifiers is:
Names (also called identifiers) in Lua can be any string of letters, digits, and underscores, not beginning with a digit and not being a reserved word.
But nothing ever really says what a "letter" is. There isn't even a definition for what character set Lua uses. As such, it's essentially implementation-dependent. A "letter" is... whatever the implementation wants it to be.
So, let's say you're writing a Lua implementation. And you want users to be able to provide Unicode-encoded strings (that is, strings within the Lua text). Lua 5.3 requires this. But you also don't want them to have to use UTF-16 encoding for their files (also because lua_load gets sequences of bytes, not shorts). So your Lua implementation assumes the byte sequence it gets in lua_load is encoded in UTF-8, so that users can write strings that use Unicode characters.
When it comes to writing the lexer/parser part of this implementation, how do you handle this? The simplest, easiest way to handle UTF-8 is to... not handle UTF-8. Indeed, that's the whole point of that encoding. Since everything that Lua defines with specific symbols is encoded in ASCII, and ASCII text is also UTF-8 text with the same meaning, you can basically treat a UTF-8 string like an ASCII string. For in-Lua strings, you just copy the sequence of bytes between the start and end characters of the string.
So how do you go about lexing identifiers? Well, you could ask the question above. Or you could ask a much simpler question: is the character a space, control character, digit, or symbol? A "letter" is merely something that isn't one of those.
Lua defines what things it considers to be "symbols". ASCII can tell you what is a control character, space, and a digit. In such an implementation, any UTF-8 code unit with a value outside of ASCII is a letter. Even if technically those code units decode into something Unicode thinks of as a "symbol", your lexer just treats it as a letter.
This simple form of UTF-8 lexing gives you fast performance and low memory overhead. You don't have to decode UTF-8 into Unicode codepoints, and you don't need a giant Unicode table to tell you whether a codepoint is a "symbol" or "space" or whatever. And of course, it's also something that would naturally fall out of many ASCII-based Lua implementations.
So most Lua implementations will do it this way, if only by accident. Doing something more would require deliberate effort.
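To make that concrete, here is a minimal sketch in Python (not taken from any actual Lua implementation) of this style of byte-oriented lexing, where any byte outside the ASCII range is simply classified as a letter:

# A minimal sketch (not from any real Lua implementation) of byte-oriented
# identifier lexing: any byte outside the ASCII range simply counts as a letter.

LUA_KEYWORDS = {
    b"and", b"break", b"do", b"else", b"elseif", b"end", b"false", b"for",
    b"function", b"if", b"in", b"local", b"nil", b"not", b"or", b"repeat",
    b"return", b"then", b"true", b"until", b"while",
}

def is_ident_start(byte: int) -> bool:
    return (byte == ord("_")
            or ord("A") <= byte <= ord("Z")
            or ord("a") <= byte <= ord("z")
            or byte >= 0x80)  # any non-ASCII UTF-8 code unit is "a letter"

def is_ident_continue(byte: int) -> bool:
    return is_ident_start(byte) or ord("0") <= byte <= ord("9")

def lex_identifier(src: bytes, pos: int = 0):
    """Return (identifier_bytes, next_pos), or None if no identifier starts at pos."""
    if pos >= len(src) or not is_ident_start(src[pos]):
        return None
    end = pos + 1
    while end < len(src) and is_ident_continue(src[end]):
        end += 1
    name = src[pos:end]
    return None if name in LUA_KEYWORDS else (name, end)

# An ordinary name and a "weird" one full of zero-width characters both lex fine:
print(lex_identifier(b"math"))
print(lex_identifier("\u200b\u200d_x".encode("utf-8")))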
It also allows a user to use Unicode character sequences as identifiers. That means that someone can easily write code in their native language (outside of keywords).
But it also means that obfuscators have lots of ways to create "identifiers" that are just strings of nonsensical bytes. Indeed, because there are multiple ways in Unicode to "spell" the same apparent Unicode string (unless you examine the bytes directly), obfuscators can rig up identifiers that appear when rendered in a text editor to all be the same text, while actually being different strings.
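You can see the "same apparent string, different bytes" trick with a few lines of Python (used here only as a neutral way to poke at the bytes): two identifiers that render identically can still be distinct byte strings to such a lexer.

import unicodedata

a = "caf\u00e9"     # 'café' with a precomposed é (U+00E9)
b = "cafe\u0301"    # 'café' as 'e' + combining acute accent (U+0301)

print(a, b)                                  # render identically in most editors
print(a == b)                                # False: different code point sequences
print(a.encode("utf-8"), b.encode("utf-8"))  # different byte strings, so different identifiers
print(unicodedata.normalize("NFC", b) == a)  # True: only equal after normalization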
To clarify, there is only one identifier here: T.
T.math is syntactic sugar for T["math"], and this also extends to the obfuscated strings. It is perfectly valid for a key to contain any characters or even to start with a number.
However, using . rather than [ ] does not work with a string that doesn't conform to the identifier limitations. See Nicol Bolas's answer for a great breakdown of those limitations.

Cross-Platform URL Encoding for Query Strings

There are multiple classes and functions in different programming languages for encoding and decoding strings to be URL-friendly. For example,
in java
URLEncoder.encode(String, String)
in PHP
urlencode ( string $str )
And ...
My question is: if I URL-encode a string in Java, can I expect the different URL decoders in other languages to decode it back to the same original string?
I'm creating a service that needs to encode some Base64 value in a query string, and I have no idea who I am serving.
Please consider that the only option I have here seems to be the query string. I can't use XML, JSON, or HTTP headers, since I need this to be in a URL to be redirected.
I looked around and there were some questions exactly like this, but none of them had a proper answer.
I would appreciate any insights or solutions.
EDIT:
For example, in the PHP manual there is this description:
Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent (%) sign followed by two hex digits and spaces encoded as plus (+) signs. It is encoded the same way that the posted data from a WWW form is encoded, that is the same way as in application/x-www-form-urlencoded media type. This differs from the » RFC 3986 encoding (see rawurlencode()) in that for historical reasons, spaces are encoded as plus (+) signs.
That sounds like it does not follow the RFC.
It seems URL encoders can use various algorithms in different programming languages.
But one should check the encoding scheme for each function. For example, one of them could be
application/x-www-form-urlencoded
Looking into Java's URLEncoder:
Translates a string into application/x-www-form-urlencoded format using a specific encoding scheme. This method uses the supplied encoding scheme to obtain the bytes for unsafe characters.
Also looking into PHP's urlencode:
that is the same way as in application/x-www-form-urlencoded media type
So if you are looking for cross-platform URL encoding, you should tell your users what format your encoder uses.
This way, they can find the appropriate decoder, or otherwise implement their own.
After some investigation, it seems application/x-www-form-urlencoded is the most popular option.
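To illustrate the difference concretely (sketched here in Python, simply because its standard library exposes both flavours), quote follows RFC 3986-style percent-encoding while quote_plus produces application/x-www-form-urlencoded output, and the decoder has to match the encoder:

from urllib.parse import quote, quote_plus, unquote, unquote_plus

value = "a b+c/d="  # e.g. a Base64-like value containing a space, '+', '/' and '='

rfc3986 = quote(value, safe="")  # RFC 3986 style: space -> %20, '+' -> %2B
form = quote_plus(value)         # x-www-form-urlencoded: space -> '+'

print(rfc3986)  # a%20b%2Bc%2Fd%3D
print(form)     # a+b%2Bc%2Fd%3D

# Decoding with the wrong flavour silently changes the data:
print(unquote(form))       # 'a+b+c/d=' -- the space is now indistinguishable from '+'
print(unquote_plus(form))  # 'a b+c/d=' -- round-trips correctly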

Cannot get expected result for Spring4D cryptography examples

The Spring4D library has cryptography classes; however, I cannot get them to work as expected. I'm probably using them incorrectly, but the lack of any examples makes it difficult.
For example on the website https://quickhash.com/hash-sha256-online, I can hash the word "test" to generate the following hash:
9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08
Using the Spring4D library, the following code produces a different hash:
CreateSHA256.ComputeHash('test').ToString;
results in:
9EFEA1AEAC9EDA04A892885A65FDAE0E6D9BE8C9FC96DA76D31B929262E12B1D
Upper/lower case aside, it is a different hash altogether. I know I must be doing something wrong, but again there are no examples of use, so I'm stuck on how to do this.
Hashing algorithms operate on binary data, typically represented using byte arrays.
Unfortunately, both of the resources you have used offer the ability to hash text. In order to hash text, you first need to convert from text to binary. To do so requires a choice of encoding. And neither method makes it clear what that choice is.
When I use this Delphi code:
LowerCase(CreateSHA256.ComputeHash(TEncoding.UTF8.GetBytes('test')).ToString)
I get the same hash as appears in your question.
I urge you never to attempt to encrypt/hash text and instead regard these operations as operating on binary. Always use an explicit encoding and then encrypt/hash the array of bytes that the encoding produced.
I've picked the UTF-8 encoding here, because it is a full Unicode encoding, and tends to be efficient in terms of space. However, I don't think your online encoder uses UTF-8. In fact I've no idea what encoding it uses, it is unclear on the matter. This is of course the same old issue of text being different from binary.
In my opinion it is a design flaw of the Delphi library that you use that it allows you to hash text without an explicit choice of encoding. If this library must offer a function that hashes text, then it should require the caller to supply an extra TEncoding parameter.
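The same point is easy to verify outside Delphi; here is a small Python sketch showing that the bytes you feed the hash, i.e. the encoding you choose, determine the digest:

import hashlib

text = "test"

# UTF-8 bytes give the widely quoted digest 9f86d081...
print(hashlib.sha256(text.encode("utf-8")).hexdigest())

# The same text encoded as UTF-16 (little-endian, no BOM) -- roughly what a
# two-bytes-per-character Unicode string looks like in memory -- hashes differently.
print(hashlib.sha256(text.encode("utf-16-le")).hexdigest())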
There is no conversion going on internally, so it hashes the UnicodeString, which is at least 2 bytes per character.
If you want the same result as on the page, you have to use UTF8Encode or directly pass an AnsiString.
However, I tried some strings that contained different Unicode characters and the page returned a different result, so I am not quite sure how they treat the strings there. I guess it's a codepage thing.
Edit: If you use this page http://www.xorbin.com/tools/sha256-hash-calculator it generates the same hash as TSHA256 with UTF8Encode.
Which type of string are you using? Do you use AnsiString or WideString (Unicode string)? Delphi 2009 and newer use a Unicode string type by default.
Why is the string type important? All hashing algorithms operate on raw byte data, so it matters whether each character of your string is stored in one byte of memory (AnsiString) or in multiple bytes of memory (WideString).

Convert Unicode characters to their respective language letters in iOS

I have Unicode character text for an Indian language (Telugu) like this:
పురాణాలు
I'm getting the above text from a database into an XML file. I'm reading the XML file, and when I print the text it shows up as the escaped numeric character references (&#...;) rather than the Telugu letters.
Is there any way to print the text as it is, without any encoded character references (&#...;)?
How are you parsing the XML? A proper XML parser should decode the numeric references.
I'm guessing that you are attempting to hand-parse the XML document instead of relying on NSXMLParser. If so, you really should use an XML parser. Bad guess on my part; it's more likely that the entities are being double-encoded.
To answer your question directly, Objective-C HTML escape/unescape shows how to decode entities with a quick-and-dirty method.
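As a quick sanity check (in Python rather than Objective-C, just to show the parser behaviour; the element name is made up), any conforming XML parser decodes numeric character references for you:

import xml.etree.ElementTree as ET

# A hypothetical fragment with Telugu text stored as numeric character references.
fragment = "<title>&#3114;&#3137;&#3120;&#3134;&#3107;&#3134;&#3122;&#3137;</title>"

element = ET.fromstring(fragment)
print(element.text)  # the parser hands back the decoded Telugu characters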

Using preprocessing function with identifier parser in FParsec?

I am using the identifier parser from FParsec to parse the names of variables and functions, which are normally a mixture of Unicode and ASCII characters. But sometimes I have escaped Unicode characters at the beginning (like \u03C0) or within the identifier (like swipe_board\u003A_b). I can still make them parseable using the isAsciiIdStart and isAsciiIdContinue options, but I can't define my own custom function for pre-processing before normalization. What could be a solution here?
The identifier parser internally first parses a string and then passes it to an IdentifierValidator instance for validation. Since the C# IdentifierValidator class is publicly accessible (though not documented), you could easily adapt the identifier parser to your needs (by making the initial string parsing step also recognize the escapes).
The identifier parsing is a bit complicated due to support for UTF-16 surrogate pairs, normalization and the Unicode XID character category, which is not natively supported on .NET.
Maybe you only need to support ASCII or UCS-2 identifiers specified in terms of the character categories supported by CharUnicodeInfo.GetUnicodeCategory, in which case you could probably implement the parsing and validation in just one step using many1Satisfy2 or many1Chars2.
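As a rough, language-neutral sketch of that two-step idea (written in Python here, so none of this is FParsec's actual API), you can first scan an identifier that may contain \uXXXX escapes, then decode the escapes, normalize, and validate the resulting characters; the category check below is a simplification of a real XID-based validator:

import re
import unicodedata

# Step 1: scan a raw identifier that may contain \uXXXX escapes.
RAW_IDENT = re.compile(r"(?:\\u[0-9A-Fa-f]{4}|[A-Za-z0-9_])+")

# Simplified category sets; a faithful validator would follow XID_Start/XID_Continue.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
CONTINUE_CATEGORIES = START_CATEGORIES | {"Nd", "Mn", "Mc", "Pc"}

def parse_identifier(text: str, pos: int = 0):
    """Return (decoded_identifier, next_pos), or None if parsing or validation fails."""
    match = RAW_IDENT.match(text, pos)
    if not match:
        return None
    # Step 2: decode the escapes and normalize, then validate each character.
    decoded = re.sub(r"\\u([0-9A-Fa-f]{4})",
                     lambda m: chr(int(m.group(1), 16)),
                     match.group(0))
    decoded = unicodedata.normalize("NFC", decoded)
    first, rest = decoded[0], decoded[1:]
    if first != "_" and unicodedata.category(first) not in START_CATEGORIES:
        return None
    for ch in rest:
        if ch != "_" and unicodedata.category(ch) not in CONTINUE_CATEGORIES:
            return None
    return decoded, match.end()

print(parse_identifier(r"\u03C0_radius"))        # ('π_radius', 13)
print(parse_identifier(r"swipe_board\u003A_b"))  # None: the decoded ':' fails this category check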
