Int32.ParseInt throws FormatException after web post - asp.net-mvc

Update
I've found the problem, the exception came from a 2nd field on the same form which indeed should have prompted it (because it was empty)... I was looking at an error which I thought came from trying to parse one string, when in fact it was from trying to parse another string... Sorry for wasting your time.
Original Question
I'm completely dumbfounded by this problem. I am basically running int.Parse("32") and it throws a FormatException. Here's the code in question:
private double BindGeo(string value)
{
Regex r = new Regex(#"\D*(?<deg>\d+)\D*(?<min>\d+)\D*(?<sec>\d+(\.\d*))");
Regex d = new Regex(#"(?<dir>[NSEW])");
var numbers = r.Match(value);
string degStr = numbers.Groups["deg"].ToString();
string minStr = numbers.Groups["min"].ToString();
string secStr = numbers.Groups["sec"].ToString();
Debug.Assert(degStr == "32");
var deg = int.Parse(degStr);
var min = int.Parse(minStr);
var sec = double.Parse(secStr);
var direction = d.Match(value).Groups["dir"].ToString();
var result = deg + (min / 60.0) + (sec / 3600.0);
if (direction == "S" || direction == "W") result = -result;
return result;
}
My input string is "32 19 17.25 N"
The above code runs on a .NET 4 web hosting service (aspspider) on an ASP.NET MVC 3 web application (with Razor as its view engine).
Note the assersion of degStr == "32" is valid! Also when I take the above code and run it in a console application it works just fine. I've scoured the web for an answer, nothing...
Any ideas?
UPDATE (stack trace)
[FormatException: Input string was not in a correct format.]
System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal) +9586043
System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info) +119
System.Int32.Parse(String s) +23
ParkIt.GeoModelBinder.BindGeo(String value) in C:\MyProjects\ParkIt\ParkIt\GeoBinder.cs:42
Line 42 is var deg = int.Parse(degStr); and note that the exception is in System.Int32.Parse (not in System.Double as was suggested).

You are wrongly thinking that it is the following line that is throwing the exception:
int.Parse("32")
This line is unlikely to ever throw an exception.
In fact it is the following line:
var sec = double.Parse(secStr);
In this case secStr = "17.25";.
The reason for that is that your hosting provider uses a different culture in which the . is not a decimal separator.
You have the possibility to specify the culture in your web.config file:
<globalization culture="en-US" uiCulture="en-US" />
If you don't do that, then auto is used. This means that the culture could be set based on the client browser preferences (which are sent with each request using the Accept-Language HTTP header).
Another possibility is to specify the culture when parsing:
var sec = double.Parse(secStr, CultureInfo.InvariantCulture);
This way you know for sure that . is the decimal separator for the invariant culture.

Testing this (via PowerShell):
PS [64] E:\dev #43> '32 19 17.25 N' -match "\D*(?\d+)\D*(?\d+)\D*(?\d+(\.\d*))"
True
PS [64] E:\dev #44> $Matches
Name Value
---- -----
sec 17.25
deg 32
min 19
1 .25
0 32 19 17.25
So the regex is working with all three named captures getting a value, all of which will parse OK (ie. it isn't something like \d matching something like U+0660: ARABIC-INDIC DIGIT ZERO that Int32.Parse doesn't handle).
But you do not check that the regex actually makes a match.
Therefore I suspect that the value passed to the function is not the input you expect. Put a breakpoint (or logging) at the start of the function and get the actual value of value.
I think what is happening is:
Value isn't what you think it is.
The regex fails to match.
The captures are empty
Int32.Parse("") is throwing (just confirmed: it throws a FormatException "Input string was not in a correct format.")
Adendum: Just noted you comment on the assertion.
If things seem contradictory go back to basics: at least one of your assumptions is wrong eg. there could be an off by one in the exception's line number (an edit to the file before going to that line number: very easy to do).
Stepping through with a debugger in this case is by far the easiest approach. On every expression check everything.
If you cannot use a debugger then try and remove that restriction, if not how about IntelliTrace? Othewrwise use some kind of logging (if you app doesn't have it, add it as you'll need it in the future for things like this).

try remove non unicode ( if any - non-visible) chars from string :
string s = "søme string";
s = Regex.Replace(s, #"[^\u0000-\u007F]", string.Empty);
edit
also - try to see its hex values to see where it is doing exceptio n :
BitConverter.ToString(buffer);
this will show you the hex values so you can verify...
also paste its value so we can see it.

It turns out that this is a non-question. The problem was that the exception came from a 2nd field on the same form which indeed should have prompted it (because it was empty)... I was looking at an error which I thought came from trying to parse one string, when in fact it was from trying to parse another string...
Sorry for wasting your time.

Related

Wierd URL Encoding/Decoding for non English Characters

How and why a non-English word is converted to weird characters like پاکستان to پاکستان, is there any way back to get پاکستان from پاکستان. It happens in browser shown code and received requests at server
Background:
I get lot of requests at my Non-English content (urdu) website with urls like
پاکستان
I tried to know what that means but search engines don't help. I tried things like
Decode this 'mystring'
What ecoding is this 'mystring'
I thought it might be corrupted/spam url, from this link
Weird characters in URL
Problem explanation/example
But when I viewed one my js file in browser (while having look on working js file). It is showing me same wired characters in browser, even at localhost
'pakistan': {'eng': 'Pakistan', 'ur': 'پاکستان'},
//But actually source code for above line is following
'pakistan': {'eng': 'Pakistan', 'ur': 'پاکستان'},
But in browser its showing me following for same line,
My knowledge
I only know about Encoding/Decoding, which seems unrelated here with best of my knowledge as?
encodeURI and decodeURI in JS or quote and unquote in python and same for other languages. But what they do for me is only
`پاکستان` to `%D9%BE%D8%A7%DA%A9%D8%B3%D8%AA%D8%A7%D9%86` and vise versa
Why needed?
I don't want to miss the requests received with those malformed urls, there must be some things to undo as all browsers chrome/firefox/edge showing those characters same, If their translation/conversion method and result is same then there should be some technique available to reverse it as well
Thanks to Giacomo Catenazzi and then I be greatful to the following answer
How to decode cp1252 string?
A very custom and still imperfect solution to my problem.
This algo needs to be improved Only by experiment I came to know, this algo works as its not working for me when string is long or including - (hyphens)
So I made changes according to my requirement and its working fair enough, so that I could guess what the actual string was.
import re, itertools
from lxml.builder import unicode
def specific_my_required_processing(received_string):
starting_characters_in_encoded_string_in_my_case = ['Ø', 'Ã', 'Ù', 'Ù', 'Ú']
arr = received_string.split('-')
res = []
missed = []
for string_item in arr:
decoded_string = guess_decode_string_without_hyphens(string_item)
if decoded_string and decoded_string[:1] not in starting_characters_in_encoded_string_in_my_case:
res.append(decoded_string)
else:
missed.append({string_item: decoded_string})
resulting_urdu_string = '-'.join(res)
print('\n\nResult', resulting_urdu_string)
print('\nCould not be decoded', missed)
def guess_decode_string_without_hyphens(s):
encodings = ['cp1251', 'cp1252', 'utf8']
for steps in range(2, 10, 2):
for encs in itertools.product(encodings, repeat=steps):
r = s
try:
for enc in encs:
r = r.encode(enc) if isinstance(r, unicode) else r.decode(enc)
except (UnicodeEncodeError, UnicodeDecodeError) as e:
continue
if re.match(u'^[\w\sа-яА-Я]+$', r):
res = str(r)
print('Encoding => ', encs, ' Conversion = ' + s + ' => ' + res)
return res
sample_encoded_string = 'اسلام-آباد-Ûائیکورٹ-ای-ÙˆÛŒ-ایم-قانون-سازی-کالعدم-قرار-دینے-Ú©ÛŒ-درخواست-نامکمل-قرار'
specific_my_required_processing(sample_encoded_string)

How to remove non-ascii char from MQ messages with ESQL

CONCLUSION:
For some reason the flow wouldn't let me convert the incoming message to a BLOB by changing the Message Domain property of the Input Node so I added a Reset Content Descriptor node before the Compute Node with the code from the accepted answer. On the line that parses the XML and creates the XMLNSC Child for the message I was getting a 'CHARACTER:Invalid wire format received' error so I took that line out and added another Reset Content Descriptor node after the Compute Node instead. Now it parses and replaces the Unicode characters with spaces. So now it doesn't crash.
Here is the code for the added Compute Node:
CREATE FUNCTION Main() RETURNS BOOLEAN
BEGIN
DECLARE NonPrintable BLOB X'0001020304050607080B0C0E0F101112131415161718191A1B1C1D1E1F7F808182838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDFE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF1F2F3F4F5F6F7F8F9FAFBFCFDFEFF';
DECLARE Printable BLOB X'20202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020202020';
DECLARE Fixed BLOB TRANSLATE(InputRoot.BLOB.BLOB, NonPrintable, Printable);
SET OutputRoot = InputRoot;
SET OutputRoot.BLOB.BLOB = Fixed;
RETURN TRUE;
END;
UPDATE:
The message is being parsed as XML using XMLNSC. Thought that would cause a problem, but it does not appear to be.
Now I'm using PHP. I've created a node to plug into the legacy flow. Here's the relevant code:
class fixIncompetence {
function evaluate ($output_assembly,$input_assembly) {
$output_assembly->MRM = $input_assembly->MRM;
$output_assembly->MQMD = $input_assembly->MQMD;
$tmp = htmlentities($input_assembly->MRM->VALUE_TO_FIX, ENT_HTML5|ENT_SUBSTITUTE,'UTF-8');
if (!empty($tmp)) {
$output_assembly->MRM->VALUE_TO_FIX = $tmp;
}
// Ensure there are no null MRM fields. MessageBroker is strict.
foreach ($output_assembly->MRM as $key => $val) {
if (empty($val)) {
$output_assembly->MRM->$key = '';
}
}
}
}
Right now I'm getting a vague error about read only messages, but before that it wasn't working either.
Original Question:
For some reason I am unable to impress upon the senders of our MQ
messages that smart quotes, endashes, emdashes, and such crash our XML
parser.
I managed to make a working solution with SQL queries, but it wasted
too many resources. Here's the last thing I tried, but it didn't work
either:
CREATE FUNCTION CLEAN(IN STR CHAR) RETURNS CHAR BEGIN
SET STR = REPLACE('–',STR,'–');
SET STR = REPLACE('—',STR,'—');
SET STR = REPLACE('·',STR,'·');
SET STR = REPLACE('“',STR,'“');
SET STR = REPLACE('”',STR,'”');
SET STR = REPLACE('‘',STR,'&lsqo;');
SET STR = REPLACE('’',STR,'’');
SET STR = REPLACE('•',STR,'•');
SET STR = REPLACE('°',STR,'°');
RETURN STR;
END;
As you can see I'm not very good at this. I have tried reading about
various ESQL string functions without much success.
So in ESQL you can use the TRANSLATE function.
The following is a snippet I use to clean up a BLOB containing non-ASCII low hex values so that it then be cast into a usable character string.
You should be able to modify it to change your undesired characters into something more benign. Basically each hex value in NonPrintable gets translated into its positional equivalent in Printable, in this case always a full-stop i.e. x'2E' in ASCII. You'll need to make your BLOB's long enough to cover the desired range of hex values.
DECLARE NonPrintable BLOB X'000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F';
DECLARE Printable BLOB X'2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E2E';
SET WorkBlob = TRANSLATE(WorkBlob, NonPrintable, Printable);
BTW if messages with invalid characters only come in every now and then I'd probably specify BLOB on the input node and then use something similar to the following to invoke the XMLNSC parser.
CREATE LASTCHILD OF OutputRoot DOMAIN 'XMLNSC'
PARSE(InputRoot.BLOB.BLOB CCSID InputRoot.Properties.CodedCharSetId ENCODING InputRoot.Properties.Encoding);
With the exception terminal wired up you can then correct the BLOB's of any messages containing parser breaking invalid characters before attempting to reparse.
Finally my best wishes as I've had a number of battles over the years with being forced to correct invalid message content in the "Integration Layer" after all that's what it's meant to do.

F# non-literal printf format strings - how to make them passable as parameters?

I would like to use non-literal strings for the "format" parameter of a logging type function, as shown here:
// You need to make c:\testDir or something similar to run this.....
//
let csvFile = #"c:\testDir\foo.csv"
open System.IO
let writer file (s:string) =
use streamWriter = new StreamWriter(file, true)
streamWriter.WriteLine(s)
// s
let log format = Printf.ksprintf (writer csvFile) format
let oneString format = (Printf.StringFormat<string->string> format)
let format = oneString "(this does not %s)"
//log format "important string"
log "this works %s" "important string"
My first attempt used a literal string, and the above fragment should work fine for you if you create the directory it needs or similar.
After discovering that you can't just "let bind" a format string, I then learned about Printf.StringFormatand more details about Printf.ksprintf, but I am obviously missing something, because I can't get them to work together with my small example.
If you comment out the last line and reinstate its predecessor, you will see a compiler error.
Making the function writer return a string almost helped (uncomment its last line), but that then makes log return a string (which means every call now needs an ignore).
I would like to know how to have my format strings dynamically settable within the type checked F# printf world!
Update
I added the parameter format to log to avoid a value restriction error that happens if log is not later called as it is in my example. I also change fmt to format in oneString.
Update
This is a different question from this one. That question does not show a function argument being passed to Printf.StringFormat (a minor difference), and it does not have the part about Printf.ksprintf not taking a continuation function that returns unit.
I thought I had found a solution with:
let oneString format = (Printf.StringFormat<string->string,unit> format)
this compiles, but there is a runtime error. (The change is the ,unit)

action script 2.0 simple string comparing with conditional statement

So I have this crazy problem with comparing 2 strings in ActionScript 2.0
I have a global variable which holds some text (usually it is "statistic") from xml feed
_root.var_name = fields.firstChild.attributes.value;
when I trace() it it gives me the expected message
trace(_root.var_name); // echoes "statistik"
and when I try to use it in conditional statement the rest of code is not being executed because comparing :
if(_root.overskrift == "statistik"){
//do stuff
}
returns false!
I tried also with:
if(_root.overskrift.equals("statistik"))
but with the same result.
Any input will be appreciated.
Years later but on a legacy project I had the same problem and the solution was in a comment to this unanswered question. So let's make it an anwer:
Where var a:String="some harmless but in my case long string" and var b:String="some harmless but in my case long string" but (a==b) -> false the following works just fine: (a.indexOf(b)>=0) -> true. If the String were not found indexOf would give -1. Why the actual string comparison fails I coudn't figure out either. AS2 is ancient and almost obsolete...

readByteSync - is this behavior correct?

stdin.readByteSync has recently been added to Dart.
Using stdin.readByteSync for data entry, I am attempting to allow a default value and if an entry is made by the operator, to clear the default value. If no entry is made and just enter is pressed, then the default is used.
What appears to be happening however is that no terminal output is sent to the terminal until a newline character is entered. Therefore when I do a print() or a stdout.write(), it is delayed until newline is entered.
Therefore, when operator enters first character to override default, the default is not cleared. IE. The default is "abc", data entered is "xx", however "xxc" is showing on screen after entry of "xx". The "problem" appears to be that no "writes" to the terminal are sent until newline is entered.
While I can find an alternative way of doing this, I would like to know if this is the way readByteSync should or must work. If so, I’ll find an alternative way of doing what I want.
// Example program //
import 'dart:io';
void main () {
int iInput;
List<int> lCharCodes = [];
print(""); print("");
String sDefault = "abc";
stdout.write ("Enter data : $sDefault\b\b\b");
while (iInput != 10) { // wait for newline
iInput = stdin.readByteSync();
if (iInput == 8 && lCharCodes.length > 0) { // bs
lCharCodes.removeLast();
} else if (iInput > 31) { // ascii printable char
lCharCodes.add(iInput);
if (lCharCodes.length == 1)
stdout.write (" \b\b\b\b chars cleared"); // clear line
print ("\nlCharCodes length = ${lCharCodes.length}");
}
}
print ("\nData entered = ${new String.fromCharCodes(lCharCodes).trim()}");
}
Results on Command screen are :
c:\Users\Brian\dart-dev1\test\bin>dart testsync001.dart
Enter data : xxc
chars cleared
lCharCodes length = 1
lCharCodes length = 2
Data entered = xx
c:\Users\Brian\dart-dev1\test\bin>
I recently added stdin.readByteSync and readLineSync, to easier create small scrips reading the stdin. However, two things are still missing, for this to be feature-complete.
1) Line mode vs Raw mode. This is basically what you are asking for, a way to get a char as soon as it's printed.
2) Echo on/off. This mode is useful for e.g. typing in passwords, so you can disable the default echo of the characters.
I hope to be able to implement and land these features rather soon.
You can star this bug to track the development of it!
This is common behavior for consoles. Try to flush the output with stdout.flush().
Edit: my mistake. I looked at a very old revision (dartlang-test). The current API does not provide any means to flush stdout. Feel free to file a bug.

Resources