Compile Delphi unit with SSE (fma) - delphi

In Free-Pascal you can determine if the code is compiled using SSE2/3/64 instructions via the conditional defines from
https://www.freepascal.org/docs-html/current/prog/prog.html#QQ2-333-379,
Table G.3: Possible FPU defines when compiling using FPC
FPUSSE2   SSE 2 instructions on Intel I386 and higher.
FPUSSE3   SSE 3 instructions on Intel I386 and higher, AMD64.
FPUSSE64  SSE64 FPU on Intel I386 and higher, AMD64.
I know the Delphi 64-bit compilers use SSE in the Win RTL, but my question is:
Is there a known method in Delphi to check at compile time whether a unit is compiled with SSE instructions, and in particular whether a*b + c is computed with hardware FMA instructions?

Is there a known method in Delphi to check at compile time, if a unit is compiled with SSE instructions?
On Intel platforms, if the CPUX64 conditional is defined then the compiler generates floating point code using SSE instructions. Otherwise, x87 instructions are generated.
No Delphi compiler generates code using FMA instructions. The floating point codegen used by dcc64 has not changed materially since its initial release in XE2.

Related

What is the use of NEXTGEN compiler conditional?

While working on a project we were getting 4000+ warnings.
Looking for a way to remove some of them, I found the compiler directive NEXTGEN.
After using this directive, the number of warnings dropped to 257.
I want to know whether using this compiler directive causes any issues. Are there any drawbacks to this directive for my Windows application?
I am using Delphi 10.
I found very little information on the Embarcadero site.
Can anyone tell me more about it?
Delphi's NEXTGEN conditional symbol marks the next-generation ARC compilers. The Windows and OSX compilers are not NEXTGEN compilers. The iOS and Android compilers are NEXTGEN. The initial release of the Linux compiler in 10.2 Tokyo had NEXTGEN defined, but since 10.3 Rio it does not.
Any code guarded by NEXTGEN (for example inside {$IFDEF NEXTGEN}) is skipped by current compilers when compiling for Windows.
See Conditional symbols:
Defined for compilers (such as the Delphi mobile compilers) that use
"next-generation" language features, such as 0-based strings.
New in XE4/iOS
Update: 10.4 Sydney
The NEXTGEN symbol has been removed from all compilers, along with the AUTOREFCOUNT and WEAKINSTREF symbols.
The NEXTGEN conditional symbol is defined by the compiler. It is defined, for instance, for the mobile compilers that use ARC. It is not defined for the traditional Windows and Mac OS compilers.
You must not define it in your code. You are compiling your code with a traditional compiler, not a NEXTGEN compiler. Whatever is responsible for these compiler warnings, defining NEXTGEN is not the solution.

AsyncPro and 64bit

I am running Delphi XE8 and have the GetIt AsyncPro for VCL 1.0 installed. It works fine when I compile my application for 32 bit but fails for 64 bit.
The failure is:
[dcc64 Error] OoMisc.pas(2771): E2065 Unsatisfied forward or external declaration: 'Trim'
When I open OoMisc.pas I see:
{$IFNDEF Win32}
function Trim(const S : string) : string;
{$ENDIF}
The Trim function does not seem to be defined. The unit does have SysUtils in its uses clause.
AsyncPro supports only the Win32 platform. It cannot be used as-is for Win64.
It contains plenty of 32-bit inline ASM code that would have to be either replaced by Pascal code or ported to 64-bit ASM code. Besides that, there might be other incompatibilities with the Win64 platform.
Converting 32-bit Delphi Applications to 64-bit Windows - Inline Assembly Code
If your application contains inline assembly (ASM) code, you need to
examine the ASM code and make the following changes: Mixing of
assembly statements with Pascal code is not supported in 64-bit
applications. Replace assembly statements with either Pascal code or
functions written completely in assembly.
Porting assembly code from IA-32 to Intel 64 cannot be done by simply
copying the code. Consider the architecture specifics, such as the
size of pointers and aligning. You may also want to consult the
processor manual for new instructions. If you want to compile the same
code for different architectures, use conditional defines. See Using
Conditional Defines for Cross-Platform Code in "Using Inline Assembly
Code."
RAD Studio supports Intel x86 through SSE4.2 and AMD 3dNow, and for
x64, Intel/AMD through SSE4.2.
Using Inline Assembly Code
Update:
There is a Win64 port of AsyncPro provided by Johan Bontes:
I have a version for Win64 on my Github:
https://github.com/JBontes/AsyncPro
It compiles, but I have not been
able to test it comprehensively. Feel free to file an issue if you get
stuck anywhere.
I bet that is a relic from Delphi 1 when Win32 was used to distinguish from Win16. You may safely remove those lines.
I converted AsyncPro to XE8 but it only supports Win32.
OoMisc.pas had a Trim function, that was removed from the implementation part. However somebody forgot to remove it from the interface part. That didn't hurt for x32 because it was inside of the $IFNDEF.
Win32 is not defined for x64, so the compiler will complain. The solution for this particular issue is to delete the following 3 lines that were intended for Delphi 1.0.
{$IFNDEF Win32}
function Trim(const S : string) : string;
{$ENDIF}
Of course that does not make AsyncPro compatible with x64 as there will be other issues.

How to use SSE instructions in the x64 architecture in C++?

Currently I am using Visual C++ inline assembly to embed some core functions using SSE; however, I just realised that inline assembly is not supported in x64 mode.
How can I use SSE when I build my software in x64 architecture?
The modern method to use assembly instructions in C/C++ is to use intrinsics. Intrinsics have several advantages over inline assembly such as:
You don't have to worry about 32-bit and 64-bit mode.
You don't need to worry about registers and register spilling.
No need to worry about AT&T versus Intel syntax.
No need to worry about calling conventions.
The compiler can optimize intrinsics further, which it won't do with inline assembly.
Most intrinsics are compatible across GCC, MSVC, ICC, and Clang.
I also like intrinsics because it's easy to emulate hardware with them for example to prepare for AVX512.
You can find the list of intrinsics MSVC supports here. Intel also has better documentation on intrinsics, which mostly agrees with MSVC's.
But sometimes you still need or want inline assembly. In my opinion it's really stupid that Microsoft does not allow inline assembly in 64-bit mode. This means they have to define intrinsics for several things that other compilers can still do with inline assembly. One example is CPUID. Visual Studio has an intrinsic for CPUID but GCC still uses inline assembly. Another example is adc. For a long time MSVC had no intrinsic for adc but now it appears they do.
Additionally, because they have to create intrinsics for everything, it causes confusion. They have to create an intrinsic for mulx, but Intel's documentation for this is wrong. They also have to create intrinsics for adcx and adox, but their documentation disagrees with Intel's, and the generated assembly shows that no intrinsic produces adox. So once again the programmer is left waiting for an intrinsic for adox. If they had just allowed inline assembly then there would be no problem.
But back to SSE. With few exceptions, e.g. _mm_set_epi64x in 32-bit mode on MSVC (I don't know if that's been fixed), the SSE/AVX/AVX2 intrinsics work as expected with MSVC, GCC, ICC, and Clang.
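As a rough illustration of the intrinsics approach, here is a minimal sketch that adds two float arrays four lanes at a time using SSE. The function name add_arrays is a hypothetical helper, not something from the question; the intrinsics themselves (_mm_loadu_ps, _mm_add_ps, _mm_storeu_ps) come from <xmmintrin.h>. The same source should build for both 32-bit and 64-bit targets.

```cpp
#include <xmmintrin.h>  // SSE intrinsics
#include <cstddef>

// Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
// Processes four floats per iteration with SSE, then finishes
// any leftover elements with plain scalar code.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i) {                   // scalar tail
        out[i] = a[i] + b[i];
    }
}
```

The unaligned load/store variants are used here so the sketch does not assume anything about how the caller allocated the arrays.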

If we can run 32 bit executables on 64 bit Windows, why can't we convert it?

WoW64 makes it possible to run 32 bit applications on 64 bit Windows. If the conversion from 32 bit instructions to 64 bit instructions can be made at runtime, why can't we convert the executable itself to 64 bit?
That is because WoW64 doesn't convert 32-bit instructions to 64-bit ones.
Your 32-bit executable is run in 32-bit mode by switching your CPU to compatibility mode. There are some conversions inlined for API/driver calls, but most of the code is not converted.
*This is only true for the x86-64 architecture. The IA-64 architecture doesn't support this, so WoW actually converts your code to 64-bit, but with a significant performance penalty.
*.NET code compiled to MSIL is JITted to the correct architecture. Probably the same happens with other platforms that use an intermediate language, like Java, but I'm no expert there.
Yes, WoW64 lets you run 32-bit programs, but they still run in 32-bit mode; no code alterations are performed. Such automatic translation would be impossible for a native application.
The number one problem is that native applications have no annotations explaining what the code does. Just one example: the compiler compiles pointer manipulation code and uses a 32-bit register to hold this pointer value on 32-bit platform and emits bare machine code for that - the runtime will have no idea that this was a pointer and it needs to be placed in 64-bit register on 64-bit platform.
Managed environments such as Java and .NET can deal with it - the compiler emits "intermediate language" code with necessary annotations that is then compiled for the target platform before the code is first run.
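To make the pointer example concrete, here is a small sketch of what goes wrong when a pointer value is treated as a fixed-width integer. The helper name survives_storage_in is hypothetical and only for illustration. It stores an address in an integer of a chosen type and checks whether the original address can be recovered. Machine code that hard-wires a 32-bit register for this has effectively chosen a 32-bit integer type, and a translator cannot tell that from the bare instructions.

```cpp
#include <cstdint>

// Hypothetical helper: store the address p in an integer of type Int
// and report whether the full address is still recoverable afterwards.
template <typename Int>
bool survives_storage_in(const void* p) {
    const auto full = reinterpret_cast<std::uintptr_t>(p);
    const auto stored = static_cast<Int>(full);  // may drop high bits
    return static_cast<std::uintptr_t>(stored) == full;
}
```

On a 64-bit build, survives_storage_in<std::uint32_t>(ptr) can return false whenever the address lies above 4 GB, while survives_storage_in<std::uintptr_t>(ptr) always returns true.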

Will compiling a DLL in Delphi 7 on a 64bit OS result in a 64bit DLL?

As the title suggests!
I'm trying to get a 64bit dll
No.
Nope. Delphi 7 was released in 2002; the first AMD64 processor was released in 2003. No way Delphi 7 knows how to generate 64-bit code.
All released versions of Delphi following the 16 bit Delphi 1 emit 32 bit targets. At the moment your options are:
Wait until the upcoming 64 bit Delphi release. We anticipate this some time this year, but your port will be non-trivial.
Port to FreePascal. Again, a non-trivial port.
Port to a completely different language: even more work than porting to Free Pascal.
Carry on running 32 bit code.
Compiling a program means translating your source files into CPU opcodes (and something more: it has to generate an executable image that can work on the OS it was designed for, respecting the OS ABI, the Application Binary Interface). Each type of CPU has its own set of opcodes, and even though the Intel x86 architecture has many similarities among 16, 32 and 64 bit opcodes, there are enough differences, and the ABI is different anyway.
Creating a 64 bit exe/dll means generating 64 bit opcodes that also follow the new 64 bit ABI, and to do that a compiler must be written to "know" them. What a compiler can do is defined by how the compiler itself is written, not by the system it is run on. The Delphi 7 compiler "doesn't know" about 64 bit CPUs and the exe/dll ABI, and therefore can't generate such code. The same is true of every version up to Delphi XE. The next version should be the first one to come with a 64 bit compiler; you can wait for it, or if you're in a hurry there are some partially compatible compilers like FPC.

Resources