How to Decrypt Lua Script? - lua
Can anyone help decrypt this Lua script? Normally, on LuaR, I can decrypt it by changing ENV to print, but this is LuaT, so the text stays encrypted. Thanks a lot.
key=[[XeeRazZ Uno]];qgfxnxzdeswjwctpuhbkihpgcmxdchpqzmdvxvmuiqgnypwidfhwhmfobvgigjwasftsbedtlzhogvfcocpyjcimqigfutbmxceozfkeltgvkvlocdaznqsjwtrarybe='om jangan decrypt aku :((';ujndwzpfsnalrwdckkoevcoyrpmbrfufdhayxzxzivrdkgdckirgigvjtwcyctmdjuphcqfwnxdvibzbbjltgjhxwjmxnpjjhtigodtjnbmeuhmsruaviunejrqopojr='Obfuscator Ini Milik Seseorang';qcjvresyxgxosdzkzpaakrpgspbecxcoyjemlwcikqhcgbjidblmcuxiojqosdlznufhprqyrjftvfcryxfjrrijvvwzztkcfmiskrjitjffveecrbjchlkzpioxupkw='Kamu Nyari Load?';rsfckenzbozyclztjwokgmgwkmospmxdjvzpjbrgogugfjqyraunuvspwontqjlkkgwkoeyuwthhdsmojdljakdmtqdkuewuhzmfgtgclbssmdktgadkjhkebhvhgbyt='Saya Tak Ragu Ingin Nembak Gay People';esegzizfcapckncplydgaurzketcnrdyvoeheooihrbjuausihoiujopwmytwybamsbjfxacfvjxapwkmeobbkruohskoojcvmjkzunajbkgzjustcuxytblqenuzjad="Soeharto is first indonesian president. Jokowi is seventh indonesian's president, Itadori Yuuji is one of main character in Jujutsu Kaisen Anime, Kento Nanami is Side Character On Jujutsu Kaisen Anime. Lava is 1 of the most dangerous liquid in the world (cap)";zsqzpfmjxeuztvjmavlicncaupafsoiswtntsjhcswzgmvkjrfhhmkjkahlfomdhajurivmsdmxzvoecvatmnamsriglzgeucrlsqlkimsshbzvzndgjqykufmowenjw={ 
1,176,3,278,293,286,275,292,281,287,286,208,276,290,287,288,216,281,276,217,186,208,208,208,208,259,277,286,276,256,273,275,283,277,292,216,226,220,208,210,273,275,292,281,287,286,300,276,290,287,288,268,286,300,281,292,277,285,249,244,300,210,222,222,281,276,217,186,208,208,208,208,259,284,277,277,288,216,229,224,224,217,186,277,286,276,186,186,278,293,286,275,292,281,287,286,208,288,216,296,220,208,297,220,208,281,276,217,186,208,208,288,283,292,208,237,208,299,301,186,208,208,288,283,292,222,292,297,288,277,208,237,208,227,186,208,208,288,283,292,222,288,296,208,237,208,296,186,208,208,288,283,292,222,288,297,208,237,208,297,186,208,208,288,283,292,222,296,208,237,208,247,277,292,252,287,275,273,284,216,217,222,288,287,291,264,186,208,208,288,283,292,222,297,208,237,208,247,277,292,252,287,275,273,284,216,217,222,288,287,291,265,186,208,208,288,283,292,222,294,273,284,293,277,208,237,208,281,276,186,208,208,259,277,286,276,256,273,275,283,277,292,258,273,295,216,278,273,284,291,277,220,208,288,283,292,217,186,277,286,276,186,186,278,293,286,275,292,281,287,286,208,288,287,291,216,217,186,208,208,290,277,292,293,290,286,208,299,296,208,237,208,285,273,292,280,222,278,284,287,287,290,216,247,277,292,252,287,275,273,284,216,217,222,288,287,291,264,223,223,227,226,217,220,208,297,208,237,208,285,273,292,280,222,278,284,287,287,290,216,247,277,292,252,287,275,273,284,216,217,222,288,287,291,265,223,223,227,226,217,301,186,277,286,276,186,186,278,293,286,275,292,281,287,286,208,275,280,277,275,283,216,281,276,217,186,208,208,278,287,290,208,271,220,208,281,286,294,208,281,286,208,288,273,281,290,291,216,247,277,292,249,286,294,277,286,292,287,290,297,216,217,217,208,276,287,186,208,208,208,208,281,278,208,281,286,294,222,281,276,208,237,237,208,281,276,208,292,280,277,286,186,208,208,208,208,290,277,292,293,290,286,208,281,286,294,222,273,285,287,293,286,292,186,208,208,277,286,276,186,277,286,276,186,290,277,292,293,290,286,208,224,186,277,286,276,186,186,278,293,286,
275,292,281,287,286,208,291,273,297,216,292,277,283,291,217,186,208,208,259,277,286,276,256,273,275,283,277,292,216,226,220,208,210,273,275,292,281,287,286,300,281,286,288,293,292,268,286,300,292,277,296,292,300,210,222,222,292,277,283,291,217,186,208,208,259,284,277,277,288,216,226,224,224,217,186,277,286,276,186,186,278,293,286,275,292,281,287,286,208,280,292,216,217,186,208,208,278,287,290,208,283,220,294,208,281,286,208,288,273,281,290,291,216,247,277,292,260,281,284,277,291,216,217,217,208,276,287,186,208,208,208,208,281,278,208,294,222,278,279,208,237,237,208,288,290,287,294,208,273,286,276,208,294,222,290,277,273,276,297,280,273,290,294,277,291,292,208,237,237,208,292,290,293,277,208,292,280,277,286,186,208,208,208,208,208,208,246,281,286,276,256,273,292,280,216,294,222,296,220,208,294,222,297,217,186,208,208,208,208,208,208,281,278,208,216,294,222,296,221,288,287,291,216,217,222,296,217,208,238,208,227,208,287,290,208,294,222,297,208,302,237,208,288,287,291,216,217,222,297,208,292,280,277,286,186,208,208,208,208,208,208,208,208,276,284,208,237,208,231,224,224,186,208,208,208,208,208,208,277,284,291,277,186,208,208,208,208,208,208,208,208,276,284,208,237,208,276,277,284,273,297,186,208,208,208,208,208,208,277,286,276,186,208,208,208,208,208,208,259,284,277,277,288,216,276,284,217,186,208,208,208,208,208,208,288,216,294,222,296,220,208,294,222,297,220,208,225,232,217,186,208,208,208,208,208,208,259,284,277,277,288,216,225,229,224,217,186,208,208,208,208,277,286,276,186,208,208,277,286,276,186,277,286,276,186,186,278,293,286,275,292,281,287,286,208,278,216,217,186,208,208,281,278,208,288,275,273,284,284,216,278,293,286,275,292,281,287,286,216,217,186,208,208,280,292,216,217,186,208,208,277,286,276,217,208,237,237,208,278,273,284,291,277,208,292,280,277,286,186,208,208,208,208,291,273,297,216,210,259,275,290,281,288,292,208,286,297,273,208,274,277,290,280,277,286,292,281,208,291,277,286,276,281,290,281,210,217,186,208,208,208,208,278,216,217,186,208,208,277,286,
276,186,277,286,276,186,186,186,245,276,281,292,260,287,279,279,284,277,216,210,246,273,291,292,208,244,290,287,288,208,268,210,223,291,277,292,275,287,293,286,292,268,210,210,220,208,292,290,293,277,217,186,245,276,281,292,260,287,279,279,284,277,216,210,253,287,276,246,284,297,210,220,208,285,287,276,278,284,297,217,186,291,273,297,216,210,223,291,277,292,275,287,293,286,292,208,224,210,217,186,278,216,217};local nau = 'load'; function gbsaxemtchrfzsriztwlolxczsqrqehtoofnmpoyavocgfgwgrhflolxgtnjqbrbuvxoexderedvoeixqvaiganciekqocmhzpvamveavedkwpxipfixmznrlchvopum(...) local urlzcnfadlobljlpyqchqcpnavhopcnzymqrpzcwliwrizxhvkhjhzqqfjlmxsijihvxlswvhsdifvpkbvhnhdpezgxxcnlbccnaixvtxfzkrtltraexdyebnxbbmnrx='';for puxbrquqyxdncfjgbonqwdfhyghdbnkcqbqguqjofttngnbklfgsbzswisyhcglsnwjbrajzzbzrgabrufapznqrwxiwwprgcishikiiogypdihvdilsdsqmjwxdmzwk=1, #zsqzpfmjxeuztvjmavlicncaupafsoiswtntsjhcswzgmvkjrfhhmkjkahlfomdhajurivmsdmxzvoecvatmnamsriglzgeucrlsqlkimsshbzvzndgjqykufmowenjw do if puxbrquqyxdncfjgbonqwdfhyghdbnkcqbqguqjofttngnbklfgsbzswisyhcglsnwjbrajzzbzrgabrufapznqrwxiwwprgcishikiiogypdihvdilsdsqmjwxdmzwk>3 then urlzcnfadlobljlpyqchqcpnavhopcnzymqrpzcwliwrizxhvkhjhzqqfjlmxsijihvxlswvhsdifvpkbvhnhdpezgxxcnlbccnaixvtxfzkrtltraexdyebnxbbmnrx=urlzcnfadlobljlpyqchqcpnavhopcnzymqrpzcwliwrizxhvkhjhzqqfjlmxsijihvxlswvhsdifvpkbvhnhdpezgxxcnlbccnaixvtxfzkrtltraexdyebnxbbmnrx.._ENV['\115\116\114\105\110\103']['\99\104\97\114']((zsqzpfmjxeuztvjmavlicncaupafsoiswtntsjhcswzgmvkjrfhhmkjkahlfomdhajurivmsdmxzvoecvatmnamsriglzgeucrlsqlkimsshbzvzndgjqykufmowenjw[puxbrquqyxdncfjgbonqwdfhyghdbnkcqbqguqjofttngnbklfgsbzswisyhcglsnwjbrajzzbzrgabrufapznqrwxiwwprgcishikiiogypdihvdilsdsqmjwxdmzwk]-zsqzpfmjxeuztvjmavlicncaupafsoiswtntsjhcswzgmvkjrfhhmkjkahlfomdhajurivmsdmxzvoecvatmnamsriglzgeucrlsqlkimsshbzvzndgjqykufmowenjw[2]));end end;local tolan = 
'loadstring';_ENV[_ENV['\115\116\114\105\110\103']['\99\104\97\114'](ujndwzpfsnalrwdckkoevcoyrpmbrfufdhayxzxzivrdkgdckirgigvjtwcyctmdjuphcqfwnxdvibzbbjltgjhxwjmxnpjjhtigodtjnbmeuhmsruaviunejrqopojr:lower():sub(18,18):byte(),qgfxnxzdeswjwctpuhbkihpgcmxdchpqzmdvxvmuiqgnypwidfhwhmfobvgigjwasftsbedtlzhogvfcocpyjcimqigfutbmxceozfkeltgvkvlocdaznqsjwtrarybe:lower():sub(1,1):byte(),rsfckenzbozyclztjwokgmgwkmospmxdjvzpjbrgogugfjqyraunuvspwontqjlkkgwkoeyuwthhdsmojdljakdmtqdkuewuhzmfgtgclbssmdktgadkjhkebhvhgbyt:lower():sub(-9,-9):byte(),esegzizfcapckncplydgaurzketcnrdyvoeheooihrbjuausihoiujopwmytwybamsbjfxacfvjxapwkmeobbkruohskoojcvmjkzunajbkgzjustcuxytblqenuzjad:lower():sub(21,21):byte())](urlzcnfadlobljlpyqchqcpnavhopcnzymqrpzcwliwrizxhvkhjhzqqfjlmxsijihvxlswvhsdifvpkbvhnhdpezgxxcnlbccnaixvtxfzkrtltraexdyebnxbbmnrx)(); end;gbsaxemtchrfzsriztwlolxczsqrqehtoofnmpoyavocgfgwgrhflolxgtnjqbrbuvxoexderedvoeixqvaiganciekqocmhzpvamveavedkwpxipfixmznrlchvopum(zsqzpfmjxeuztvjmavlicncaupafsoiswtntsjhcswzgmvkjrfhhmkjkahlfomdhajurivmsdmxzvoecvatmnamsriglzgeucrlsqlkimsshbzvzndgjqykufmowenjw);
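_ENV['\115\116\114\105\110\103']['\99\104\97\114'] resolves to string.char: the decimal escapes \115\116\114\105\110\103 spell "string" and \99\104\97\114 spell "char". It receives four character codes, each picked from one of the dummy strings above with lower():sub(n,n):byte(). Those characters spell load, so the call is _ENV["load"]. Replace it with print and the script prints its decoded source instead of running it. The payload itself is just the number table: every entry after the third has the second entry (176) subtracted from it and is turned into a character. The code is not further obfuscated or compiled. This is one of the worst obfuscators I've seen so far.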
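The decoding described above can be reproduced outside Lua. The following is a minimal Python sketch, not the obfuscator's own code: the function names are made up for illustration, only the first twenty numbers of the table are included, and the fourth dummy string is cut off after its first sentence because only position 21 is needed.

```python
# First entries of the number table from the script (the full table is much longer).
TABLE_PREFIX = [1, 176, 3, 278, 293, 286, 275, 292, 281, 287, 286,
                208, 276, 290, 287, 288, 216, 281, 276, 217]

def decode_payload(table):
    """Mirror the Lua loop: skip the first three entries, subtract entry 2."""
    offset = table[1]          # Lua's table[2]; Python lists are 0-based
    return "".join(chr(value - offset) for value in table[3:])

def loader_name():
    """Rebuild the global name the script looks up in _ENV."""
    picks = [
        ("Obfuscator Ini Milik Seseorang", 18),
        ("om jangan decrypt aku :((", 1),
        ("Saya Tak Ragu Ingin Nembak Gay People", -9),
        ("Soeharto is first indonesian president.", 21),  # truncated dummy string
    ]
    chars = []
    for text, pos in picks:
        lowered = text.lower()
        # Lua's sub(n, n) is 1-based; a negative n counts from the end.
        index = pos - 1 if pos > 0 else len(lowered) + pos
        chars.append(lowered[index])
    return "".join(chars)

print(decode_payload(TABLE_PREFIX))   # prints: function drop(id)
print(loader_name())                  # prints: load
```

Feeding the complete table to decode_payload should give the full script source, which is the same text that swapping load for print shows in Lua.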
Related
Character Encoding not resolved
I have a text file with unknown character encoding; below is a snippet: \216\175\217\133\217\136\216\185 \216\167\217\132\217\133\216\177\216\163\216\169 \216\163\217\130\217\136\217\137 \217\134\217\129\217\136\216\176\216\167\217\139 \217\133\217\134 \216\167\217\132\217\130\217\136\216\167\217\134\217\138\217\134 Does anyone have an idea how I can convert it to normal text?
This is apparently how Lua stores strings. Each \nnn represents a single byte, where nnn is the byte's value in decimal. (A similar notation is commonly used for octal, which threw me off for longer than I would like to admit. I should have noticed that there were digits 8 and 9 in the data!) This particular string is just plain old UTF-8.

$ perl -ple 's/\\(\d{3})/chr($1)/ge' <<<'\216\175\217\133\217\136\216\185 \216\167\217\132\217\133\216\177\216\163\216\169 \216\163\217\130\217\136\217\137 \217\134\217\129\217\136\216\176\216\167\217\139 \217\133\217\134 \216\167\217\132\217\130\217\136\216\167\217\134\217\138\217\134'
دموع المرأة أقوى نفوذاً من القوانين

You would obviously get a similar result simply by printing the string from Lua, though I'm not familiar enough with the language to tell you how exactly to do that.

Post scriptum: I had to look this up for other reasons, so here's how to execute Lua from the command line.

lua -e 'print("\216\175\217\133\217\136\216\185 \216\167\217\132\217\133\216\177\216\163\216\169 \216\163\217\130\217\136\217\137 \217\134\217\129\217\136\216\176\216\167\217\139 \217\133\217\134 \216\167\217\132\217\130\217\136\216\167\217\134\217\138\217\134")'
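The same conversion can be sketched in Python, for readers who have neither Perl nor Lua at hand. The function name is hypothetical, and the sketch assumes, as the answer does, that the escaped bytes are UTF-8.

```python
import re

def decode_decimal_escapes(text: str) -> str:
    """Turn Lua-style \\ddd decimal byte escapes into text, assuming UTF-8."""
    raw = re.sub(
        rb"\\(\d{1,3})",
        lambda m: bytes([int(m.group(1))]),   # raises ValueError above 255
        text.encode("utf-8"),
    )
    return raw.decode("utf-8")

sample = r"\216\175\217\133\217\136\216\185"   # first word of the snapshot
assert decode_decimal_escapes(sample) == "دموع"
```

The substitution works on bytes rather than on characters because each escape stands for one byte, and a single Arabic letter here is spread over two of them.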
ASCII values not appearing in mailer but appearing in local machine
I want to use an en dash with the ASCII value &#8211. I am using Haml and wrote the line as do this = "&#8211".html_safe task, so that it appears as "do this -- task", with an en dash in place of the double dash. This works fine on my local machine, but when I send a mail with this text, the recipient sees it as do this &#8211 task. Can anyone help me make it appear as an en dash in the mail?
Numeric character codes like this one need a concluding semicolon (;) to mark where the code ends. Add a concluding semicolon to your en-dash code: = "&#8211;".html_safe
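As a quick check of the numbers involved, here is a small Python sketch (Python is used only for illustration; the code in question is Haml/Ruby). It builds the semicolon-terminated reference from the en dash's code point and decodes it again with the standard library's html.unescape.

```python
import html

EN_DASH = "\u2013"                             # the en dash, code point 8211

reference = "&#{};".format(ord(EN_DASH))       # builds "&#8211;"
assert reference == "&#8211;"
assert html.unescape(reference) == EN_DASH      # decodes back to the en dash
```

Some decoders tolerate a missing semicolon, which may be why the original line worked locally but not in the recipient's mail client; that explanation is a guess, not something the thread confirms.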
list of garbage characters like â€™
I am using librets to retrieve data from my RETS server. Somehow the librets encoding method is not working and I am receiving some weird characters in my output. I noticed that characters like '’' are replaced with â€™. I am unable to find a fix for librets, so I decided to replace such garbage characters with their actual values after downloading the data. What I need is a list of such garbage strings and their equivalent characters. I googled for this but did not find any resource. Can anyone point me to a list of such garbage letters and their actual values, or a piece of code which can generate such a list? Thanks.
Search for the term "UTF-8", because that's what you're seeing. UTF-8 is a way of representing Unicode characters as a sequence of bytes. ("Unicode characters" are the full range of letters and symbols used in all human languages.) One Unicode character becomes 1 to 4 bytes in UTF-8. When those bytes (numbers from 0 to 255) are displayed using the character set normally used by Windows, they appear as "garbage" -- in this case, 3 "garbage letters" which are really the 3 bytes of a UTF-8 encoding. In your example, you started with the smart quote character ’. Its representation in Unicode is the number 8217, or U+2019 (2019 is the hexadecimal for 8217). (Search for "Unicode" for a complete list of Unicode characters and their numbers.) The UTF-8 representation of the number 8217 is the three-byte sequence 226, 128, 153. And when you display those three bytes as characters, using the Windows "CP-1252" character encoding (the ordinary way of displaying text on Windows in the USA), they appear as â€™. (Search for "CP-1252" to see a table of bytes and characters.) I don't have any list for you. But you could make one if you wrote a program in a language that has built-in support for Unicode and UTF-8. All I can do is explain what you are seeing. If there is a way to tell librets to use UTF-8 when downloading, that might automatically solve your problem. I don't know anything about librets, but now that you know the term "UTF-8" you might be able to make progress.
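The byte values in this explanation can be checked with a few lines of Python. This is only an illustrative sketch; the variable names are arbitrary, and it assumes the text really was UTF-8 that got displayed as CP-1252.

```python
SMART_QUOTE = "\u2019"                      # the right single quotation mark

utf8_bytes = SMART_QUOTE.encode("utf-8")
assert list(utf8_bytes) == [226, 128, 153]   # the three bytes from the answer

# Displaying those bytes as CP-1252 produces the three "garbage letters".
garbage = utf8_bytes.decode("cp1252")
assert garbage == "\u00e2\u20ac\u2122"       # â, €, ™

# Reversing the mistake: back to bytes via CP-1252, then decode as UTF-8.
assert garbage.encode("cp1252").decode("utf-8") == SMART_QUOTE
```

When the whole downloaded text is affected in the same way, the last line shows a repair that needs no lookup list at all.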
Question reminder: "...I noticed characters like '’' is replaced with â€™... i decided to replace such garbage characeters with actual values after downloading data. What I need is a list of such garbage string and their equivalent characters."

Strictly dealing with this part: "What I need is a list of such garbage string and their equivalent characters." Using PHP, you can generate these characters and their equivalents. Working with all 1,111,998 Unicode points or 109,449 UTF-8 symbols is impractical. You may use the range between &#128; and &#258; in the following loop, or another range that is more relevant to your context.

<?php
$tmp1 = ""; // initialise the accumulator before appending to it
for ($i=128; $i<258; $i++)
    $tmp1 .= "<tr><td>".htmlentities("&#$i;")."</td><td>".html_entity_decode("&#".$i.";",ENT_NOQUOTES,"utf-8")."</td><td>&#".$i.";</td></tr>";
echo "<table border=1> <tr><td>&#</td><td>\"Garbage\"</td><td>symbol</td></tr>";
echo $tmp1;
echo "</table>";
?>

From experience, most "garbage" symbols originate in the range &#128; to &#257; + (seldom) &#8129; to &#8246;. In order for the "garbage" symbols to display, the HTML page charset must be set to iso-8859-1 or whichever other charset caused the problem in the first place. They will not show if the charset is set to utf-8.

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

"i decided to replace such garbage characeters with actual values after downloading data" You CANNOT undo the "garbage" with PHP's utf8_decode(), which would actually create more "garbage" on top of existing "garbage". But you may use the simple and fast search-and-replace function str_replace(). First, generate 2 arrays for each set of "garbage" symbols you wish to replace.

The first array is the search term:

<?php
//ISO 8859-1 (Latin-1) special chars are found in the range 128 to 257
$tmp1 = "\$SearchArr = array(";
for ($i=128; $i<258; $i++)
    $tmp1 .= "\"".html_entity_decode("&#".$i.";",ENT_NOQUOTES,"utf-8")."\", ";
$tmp1 = substr($tmp1,0,strlen($tmp1)-2);//erases last comma
$tmp1 .= ");";
$tmp1 = htmlentities($tmp1,ENT_NOQUOTES,"utf-8");
?>

The second array is the replace term:

<?php
//Adapt for your relevant range.
$tmp2 = "\$ReplaceArr = array(\n";
for ($i=128; $i<258; $i++)
    $tmp2 .= "\"&#".$i.";\", ";
$tmp2 = substr($tmp2,0,strlen($tmp2)-2);//erases last comma
$tmp2 .= ");";
echo $tmp1."\n<br><br>\n";
echo $tmp2."\n";
?>

Now you've got 2 arrays that you can copy, paste and reuse to clean any of your infected strings like this:

$InfectedString = str_replace($SearchArr,$ReplaceArr,$InfectedString);

Note: utf8_decode() is of no help for cleaning up "garbage" symbols, but it can be used to prevent further contamination. Alternatively, an mb_ function can be useful.
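For comparison, the same search-and-replace idea can be sketched in Python. This is not a translation of the PHP above: the function names are hypothetical, the table maps each garbage sequence straight to the real character instead of to an HTML entity, and it assumes the garbage came from UTF-8 bytes being read as CP-1252.

```python
def build_mojibake_table(code_points):
    """Map each character's UTF-8-read-as-CP-1252 form back to the character."""
    table = {}
    for cp in code_points:
        char = chr(cp)
        try:
            garbage = char.encode("utf-8").decode("cp1252")
        except UnicodeDecodeError:
            continue  # Python's cp1252 codec leaves a few bytes (e.g. 0x81) undefined
        table[garbage] = char
    return table

def repair(text, table):
    """Replace every known garbage sequence, longest sequences first."""
    for garbage in sorted(table, key=len, reverse=True):
        text = text.replace(garbage, table[garbage])
    return text

# Same range as the PHP loop (128..257), plus the smart quote from the question.
TABLE = build_mojibake_table(list(range(128, 258)) + [0x2019])
assert repair("caf\u00c3\u00a9", TABLE) == "caf\u00e9"           # "cafÃ©" becomes "café"
assert repair("don\u00e2\u20ac\u2122t", TABLE) == "don\u2019t"   # "donâ€™t" becomes "don’t"
```

Replacing the longest sequences first keeps a three-character sequence from being partly consumed by a two-character entry.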
How can we eliminate junk value in field?
I have some CSV records which are variable in length, for example:

0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85
0006786565,34567899,ZAMTR,!To ACC 26575443,DR,1000

I need to separate each of these fields, and I need the last field, which should be a money amount. However, as I read the file and unstring the record into fields, I found that the last field contains junk values at its end. The amount (money) field should be 8 characters: 5 digits at the front, 1 dot, 2 digits at the end. The values from the input could be any value, such as 13.5, 1000 and 354.23.

"FILE SECTION"
FD INPUT_FILE.
01 INPUT_REC PIC X(66).
"WORKING STORAGE SECTion"
01 WS_INPUT_REC PIC X(66).
01 WS_AMOUNT_NUM PIC 9(5).9(2).
01 WS_AMOUNT_TXT PIC X(8).
"MAIN SECTION"
UNSTRING INPUT_REC DELIMITED BY "," INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT
MOVE WS_AMOUNT_TXT(1:8) TO WS_AMOUNT_NUM(1:8)
DISPLAY WS_AMOUNT_NUM

From the display, the values look normal: 345.23, 1000, just as they are. However, after I wrote the field into a file, here is what they become:

79.85^M^#^#
137.35^M^#

I have inspected the field WS_AMOUNT_NUM, which came from the field WS_AMOUNT_TXT, and found that ^# is a kind of LOW-VALUE. However, I cannot find out what ^M is; it is not a space and not a HIGH-VALUE.
I am guessing, but it looks like you may be reading variable length records from a file into a fixed length COBOL record. The junk at the end of the COBOL record is giving you some grief. Hard to say how consistent that junk is going to be from one read to the next (data beyond the bounds of the actual input record length are technically undefined). That junk ends up being included in WS_AMOUNT_TXT after the UNSTRING.

There are a number of ways to solve this problem. The suggestion I am giving you here may not be optimal, but it is simple and should get the job done. The last INTO field, WS_AMOUNT_TXT, in your UNSTRING statement is the one that receives all of the trailing junk. That junk needs to be stripped off. Knowing that the only valid characters in the last field are digits and the decimal character, you could clean it up as follows:

PERFORM VARYING WS_I FROM LENGTH OF WS_AMOUNT_TXT BY -1
        UNTIL WS_I = ZERO
    IF WS_AMOUNT_TXT(WS_I:1) IS NUMERIC OR
       WS_AMOUNT_TXT(WS_I:1) = '.'
        MOVE 1 TO WS_I
    ELSE
        MOVE SPACE TO WS_AMOUNT_TXT(WS_I:1)
    END-IF
END-PERFORM

The basic idea in the above code is to scan from the end of the last UNSTRING output field to the beginning, replacing anything that is not a valid digit or decimal point with a space. Once a valid digit/decimal is found, WS_I is set to 1 so that the BY -1 step brings it to zero and ends the loop, on the assumption that the rest will be valid. After cleanup, use the intrinsic function NUMVAL as outlined in my answer to your previous question to convert WS_AMOUNT_TXT into a numeric data type.

One final piece of advice: MOVE SPACES TO INPUT_REC before each READ to blow away data left over from a previous read that might be left in the buffer. This will protect you when reading a very "short" record after a "long" one; otherwise you may trip over data left over from the previous read. Hope this helps.

EDIT: Just noticed this answer to your question about reading variable length files. Using a variable length input record is a better approach.
Given the actual input record length you can do something like:

UNSTRING INPUT_REC(1:REC_LEN) INTO...

Where REC_LEN is the variable specified after OCCURS DEPENDING ON for the INPUT_REC file FD. All the junk you are encountering occurs after the end of the record as defined by REC_LEN. Using reference modification as illustrated above trims it off before UNSTRING does its work to separate out the individual data fields.

EDIT 2: Cannot use reference modification with UNSTRING. Darn... It is possible with some other COBOL dialects but not with OpenVMS COBOL. Try the following:

MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...

Where WS_BUFFER is a working storage PIC X variable long enough to hold the longest input record. When you MOVE a short alphanumeric field to a longer one, the destination field (i.e. WS_BUFFER) is left-justified, with spaces used to pad the remaining space. Since leading and trailing spaces are acceptable to the NUMVAL function, you have exactly what you need.

I have a reason for pushing you in this direction. Any junk that ends up at the trailing end of a record buffer when reading a short record is undefined. There is a possibility that some of that junk just might end up being a digit or a decimal point. Should this occur, the cleanup routine I originally suggested would fail.

EDIT 3: "There are no ^# in the resulting WS_AMOUNT_TXT, but there is still a ^M"

Looks like the file system is treating <CR> (that ^M thing) at the end of each record as data. If the file you are reading came from a Windows platform and you are now reading it on a UNIX platform, that would explain the problem. Under Windows, records are terminated with <CR><LF>, while on UNIX they are terminated with <LF> only. The UNIX file system treats <CR> as if it were part of the record. If this is the case, you can be pretty sure that there will be a single <CR> at the end of every record read.
There are a number of ways to deal with this:

Method 1: As you already noted, pre-edit the file using Notepad++ or some other tool to remove the <CR> characters before processing it through your COBOL program. Personally I don't think this is the best way of going about it. I prefer a COBOL-only solution since it involves fewer processing steps.

Method 2: Trim the last character from each input record before processing it. The last character should always be <CR>. Try the following if you are reading records as variable length and have the actual input record length available.

SUBTRACT 1 FROM REC_LEN
MOVE INPUT_REC(1:REC_LEN) TO WS_BUFFER
UNSTRING WS_BUFFER INTO...

Method 3: Treat <CR> as a delimiter when UNSTRINGing, as follows:

UNSTRING INPUT_REC DELIMITED BY "," OR x"0D" INTO WS_ID_1, WS_ID_2, WS_CODE, WS_DESCRIPTION, WS_FLAG, WS_AMOUNT_TXT

Method 4: Condition the last receiving field from UNSTRING by replacing trailing non-digit/non-decimal-point characters with spaces. I outlined this solution a little earlier in this answer. You could also explore the INSPECT statement using the REPLACING option (Format 2). This should be able to do pretty much the same thing: just replace all x"00" by SPACE and x"0D" by SPACE.

Where there is a will, there is a way. Any of the above solutions should work for you. Choose the one you are most comfortable with.
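For readers who want to see the clean-up idea outside COBOL, here is a minimal Python sketch of the same approach: take the last comma-separated field and strip the trailing carriage return and low-value (NUL) bytes before converting it. The function name and the sample record's trailing bytes are assumptions made for illustration.

```python
from decimal import Decimal

def parse_amount(record: str) -> Decimal:
    """Return the last comma-separated field as a number, ignoring trailing junk."""
    last_field = record.split(",")[-1]
    # Strip carriage returns, line feeds, NUL bytes and spaces from the end only.
    cleaned = last_field.rstrip("\r\n\x00 ")
    return Decimal(cleaned)

record = "0005464560,45667759,ZAMTR,!To ACC 12345678,DR,79.85\r\x00\x00"
print(parse_amount(record))   # prints 79.85
```

Like Method 4, this only cleans the end of the last field, so junk that happens to look like a digit would still slip through; trimming the record to its real length first avoids that.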
^M is a carriage return. Would Google Refine be useful for rectifying this data?
.gsub erroring with non-regular character 194
I've seen this posted a couple of times but none of the solutions seem to work for me so far... I'm trying to remove a spurious  character from a string... e.g. "myÂstring here Â$100" ..but it should be my string here $100 I've tried: string.gsub(/\194/,'') string.gsub(194.chr,'') string.delete 194.chr All of these still leave the  intact.. Any thoughts?
By default, Rails supports UTF-8. You can use your favorite editor to write a gsub call using the proper character you want to replace, as in: "myÂstring here Â$100".gsub(/Â/,"") If this does not work as well, you might be having an encoding error somewhere on your stack, probably on your HTML document. Try running rails console, extract somehow that string (if it comes from the Model, try to perform a find on the containing class) and run the gsub. It won't solve your problem, but you'll get a clue to where exactly the problem may lie.
Looks like a character encoding problem to me. For every Unicode code point in the range U+0080..U+00BF inclusive, the UTF-8 encoding is a two-byte sequence: 0xC2 (194 decimal) followed by a byte equal to the numeric value of the code point. For example, a non-breaking space, U+00A0, becomes 0xC2 0xA0. Was there another extra character in there that you already removed? At any rate, gsub(/\194/,'') is wrong. \nnn is supposed to be an octal escape, but the number is in its decimal form. 194 in octal is \302.
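The byte arithmetic in this answer can be verified with a short Python sketch (Python is used only to check the numbers; the question itself is about Ruby). It assumes, as the answer suspects, that the stray character comes from UTF-8 bytes being read in a single-byte encoding such as Latin-1.

```python
NBSP = "\u00a0"                                  # non-breaking space, U+00A0

encoded = NBSP.encode("utf-8")
assert encoded == b"\xc2\xa0"                    # 0xC2 followed by 0xA0

# Every code point in U+0080..U+00BF encodes as 0xC2 plus its own value.
for cp in range(0x80, 0xC0):
    assert chr(cp).encode("utf-8") == bytes([0xC2, cp])

# Read as Latin-1, the lead byte 0xC2 shows up as the stray "Â".
assert encoded.decode("latin-1") == "\u00c2\u00a0"

# 194 decimal is 302 in octal, so the octal escape would be \302.
assert oct(194) == "0o302"
```

If this is what happened, the visible  is only half of the damage: the second byte (here an invisible non-breaking space) is still sitting next to it.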
"myÂstring here Â$100".gsub("Â","") # "mystring here $100" Is that what you meant?