Change encoding on a per file or per extension basis

I'm using Microsoft Visual Studio Express 2012 for Web. It seems that every file which I open with it gets encoded into UTF-8. For most files which are going to be web-facing, that's fine. However, I have files in my projects that are specifically for build purposes (e.g., .bat files), which must be encoded in ANSI.
Are there any configuration settings in VS to designate the encoding on a per-file or per-extension basis? Or, failing that, at least to disable the automatic conversion to UTF-8?

Open the problematic file in Visual Studio and...
On the File menu, click Advanced Save Options.
In the Encoding dropdown, select Unicode (UTF-8 …), or whatever encoding you require.
Click OK.
Also see:
how to change source file encoding in csharp project (visual studio / msbuild machine)?

An option to handle the encoding of all files of a given extension, on a per-open basis, can be configured in the Options dialog. See the MSDN page on Options, Text Editor, File Extension.
Navigate to Tools > Options > Text Editor > File Extension.
For the bat extension, I selected Source Code (Text) Editor with Encoding. The with Encoding part means that the user will be given options as to what encoding to use when opening the file. The default in this mode is Auto-detect, which preserves the ANSI encoding, if that is what the file already uses. Otherwise, one can explicitly designate it for the individual file.
Unfortunately, it doesn't seem to remember the setting last used when opening a file, and will thus prompt for an encoding setting every time a file is opened.

I had code conversion problems with VS 2012 as well. Namely, I had non-ANSI-compliant characters in strings in my .js files, and unreadable text was output to the browser's HTML page.
I figured out that, except for script files (like .js), VS 2012 creates all files in UTF-8.
*The problem is that the suggestion to change the defaults in the Options dialog resulted in syntax highlighting and IntelliSense no longer working in any .js files.*
So my workaround for now is to convert my .js files to UTF-8 without BOM using Notepad++.
This way my "unusual" characters appear correctly in browsers, and IntelliSense works fine as well.

Related

Visual Studio 2017 - Force file encoding on every file opened

I have a project in Visual Studio 2017 Enterprise with PHP Tools installed. Every project file has ISO-8859-2 encoding (codepage 28592 in VS terminology), and everything works well until I try to open any file. VS wrongly guesses the encoding as Windows-1250, with no visible way to override this project-wide. This leads to national diacritics being loaded incorrectly and requires reopening every file with the encoding manually specified.
Any ideas (other than manually specifying the encoding on every file open) how to force the encoding to ISO-8859-2 for every opened file in this project, without manual intervention?
(To reduce any concerns why this particular encoding is used: the program has to comply with the national norm PN-T-42118, specifying this encoding as mandatory. This norm must be complied with at all times in the whole project - a design requirement.)
In VS 2017 it is typically controlled by an EditorConfig file with the charset property.
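For illustration, a minimal .editorconfig at the solution root might look like the following (a sketch, not from the original answer; note that the EditorConfig spec only defines latin1, utf-8, utf-8-bom, utf-16be and utf-16le as charset values, so whether VS 2017 accepts an ISO-8859-2 value here would need to be verified):
# hypothetical .editorconfig at the solution root
root = true

# PHP source files in this project
[*.php]
# latin1 is the closest spec-defined value; other codepage names may or may not be honored
charset = latin1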

WiX localization garbled words under installer properties tab

I'm trying to do localization in WiX installer. How can I fix the garbled words shown below in the installer properties? The language that I defined is Japanese.
Windows Installer doesn't officially support codepage 65001 for UTF-8 -- mostly because of UI problems like this. Try using codepage 932 for ja-JP strings. Also, make sure you're setting the Package/#SummaryCodepage attribute (the .wxl file's code page sets Product/#Codepage).
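A rough sketch of where those two codepage settings live (WiX v3 syntax; the string name and its Japanese value are made up for illustration):
<!-- ja-JP strings (.wxl): its Codepage feeds the database codepage (Product/@Codepage) -->
<WixLocalization Culture="ja-jp" Codepage="932"
                 xmlns="http://schemas.microsoft.com/wix/2006/localization">
  <String Id="InstallerTitle">日本語インストーラー</String>
</WixLocalization>

<!-- in the .wxs: the summary information stream needs its own codepage -->
<Package InstallerVersion="200" Compressed="yes" SummaryCodepage="932" />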

Setting the source file character encoding for Mono's xbuild

I'm generating C# source code which is being built by both VS2010 and Mono's xbuild (2.10.2.0). This generally works very well, I've only had a single compatibility issue so far, and in that case I was using a 'feature' that is clearly specified as undefined behaviour (so mea culpa).
Now I'm running into an issue where I have special characters in a string literal in the C# source code. I'm generating the source files in UTF-8, the character I'm testing with is a German sharp s: 0xC39F. This is written to a file in latin1 by the code, where it ends up as 0xDF when the executable is built with VS (that's the one I want) and as 0xC33F when built with xbuild.
It does not seem to matter whether I run the executable with the .NET or with the Mono CLR, as far as I can see.
My current suspicion is that xbuild is not reading the source code as UTF-8, so the compiled code already has the wrong character in the string literal. Is there a way to explicitly tell it to? I couldn't find anything on xbuild /? and the xbuild documentation isn't particularly comprehensive. If I just missed the right page where this is documented, just a link is sufficient, of course.
All experiments have been performed on Win7 x64.
EDIT 1: To clarify, I've used a hex editor to confirm that the character in the source code file is really 0xC39F, the character written when compiled with VS2010 is 0xDF and the character written when compiled with xbuild is 0xC33F.
You'll need to modify the .csproj file(s) and add a <CodePage> element to the <PropertyGroup> section.
You should be able to use Visual Studio or MonoDevelop to do this for you, as well.
In MonoDevelop, if you right-click on a project and select the "Options" menu item, you can then go to the Build/General section and there will be a "Compiler Code Page" field which you can use to select "UTF-8".
FWIW, this is what MD outputs when I select UTF-8:
<CodePage>65001</CodePage>
So you can just copy/paste that into the <PropertyGroup>.
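In context, the property ends up in the .csproj roughly like this (a sketch; only the CodePage line is new, the rest of the PropertyGroup is whatever your generated project already contains):
<PropertyGroup>
  <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
  <OutputType>Exe</OutputType>
  <!-- tell the compiler to read source files as UTF-8 -->
  <CodePage>65001</CodePage>
</PropertyGroup>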

Indy FTP TransferType

I'm using the IdFTP (Indy 10) component to download some files (zip and txt) from a remote location. Before getting each file I set the TransferType to binary.
IdFTP.TransferType := ftBinary;
IdFTP.Get(ASource, ADest, AOverwrite);
I expect that both text and binary files can be downloaded using the binary mode. However, it looks like the contents of text files get messed up, while zip files are downloaded correctly. If I set the TransferType to ASCII for text files, it works as expected. How can I detect which TransferType to set for a given file? Is there a common denominator or an auto setting?
I don't see how the Binary flag can mess up transferred files. Binary type means the server transfers the files without any processing, as is.
The only thing that an FTP server should use the ASCII flag for is to handle the end of line in text files correctly, usually either (1) only Line Feed on Unix or (2) Carriage Return + Line Feed on Windows. But nowadays most text editors handle both on either system.
So the safest is use only ASCII flag for very well known text files, probably only files with a .txt extension, and use Binary flag for all the others.
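A minimal Delphi sketch of that rule (my own illustration, not from the original answer; it assumes Indy 10's IdFTPCommon unit for the ftASCII/ftBinary constants):
uses
  SysUtils, IdFTP, IdFTPCommon;

procedure DownloadWithSensibleType(FTP: TIdFTP; const ASource, ADest: string);
begin
  // ASCII only for plain .txt files so line endings get translated;
  // everything else (zip, images, ...) goes over the wire untouched
  if SameText(ExtractFileExt(ASource), '.txt') then
    FTP.TransferType := ftASCII
  else
    FTP.TransferType := ftBinary;
  FTP.Get(ASource, ADest, True);
end;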
When in doubt, rule it out (!) - try transferring the files from the server using the Windows commandline FTP program, and see if text files still come out wrong. The program will transfer binary (command BIN) or text (command ASCII). If you transfer files with this and they still arrive differently to your expectation, then something is being done at the server end*. If they arrive fine, then either you (or Indy) are doing something. :-)
*In what way are the text files messed up? If you're transferring unicode text files, you might be better off transferring them as BINary anyway. I must admit that, as #unknown (yahoo) said, in most cases you should probably stick to BIN mode.
I guess whether the text looks messed up would also depend on how you are viewing the text file, as ANSI or WideChar.

How can I set the default file format in the Delphi IDE to UTF8?

Delphi 2009 sets the default file format for new source code files to ANSI, which makes the source code platform-dependent.
Even for a new XSD file created in the IDE, which by default starts with this line
<?xml version="1.0" encoding="UTF-8" ?>
Delphi sets the file format to ANSI (this looks like a bug; for new XML and XSLT documents, UTF-8 is selected by default).
Is there a hidden option to set the default file format for source code files?
In fact this blog post from 2004 mentions a hidden IDE option.
It states that you can set a default file filter in the registry to make UTF-8 the default encoding in Delphi 8. This still works under Tokyo! Clearly, you have to adapt the path of the registry key to recent versions like this:
Windows Registry Editor Version 5.00
[HKEY_CURRENT_USER\Software\Embarcadero\BDS\19.0\Editor]
"DefaultFileFilter"="Borland.FileFilter.UTF8ToUTF8"
After setting this value Delphi will encode new units in UTF-8 with BOM.
Right-click on your source code in the Delphi 2009 IDE, and select File Format. Then choose UTF-8. Hope that helps.
Although the answer from MBulli should still be relevant, since version 10.4 of Delphi (as far as I remember) it has been possible to change the default encoding within the IDE.
Go to Tools > Options and choose User interface > Editor from the navigation area.
You will find the Default file encoding setting down below.
If you install the UTF8ize plugin (English translation of the author's page & latest version) in your IDE, then whenever you edit any file within the IDE, the plugin sets the file's codepage to UTF-8 automatically.
(FYI: the author creates many useful plugins. I posted some of his plugins with images here, but my post was deleted by the moderator. I just wanted to make his useful plugins known, but yes, it's off topic here. Sorry.)
AFAIK, there is no IDE-wide setting for specifying the default file format.
