clang(libclang) performance of parse/reparse Translation Unit - parsing

I have a question about parsing performance when using clang (libclang), particularly the functions clang_parseTranslationUnit and clang_reparseTranslationUnit.
I'm trying to optimize the process, but I'm running out of ideas.
The situation is as follows: I have a .cpp source file that includes a lot of header files. These headers change very seldom. However, the .cpp source changes a lot, and I need to reparse it often.
One option is to "preparse/precompile" all the headers into a .pch file and then use it when parsing the .cpp.
However, the problem is that I can use only one .pch,
so I need to create a single .pch from all the included headers.
Later, when I include some other header file, I need to reparse all the headers, even though none of them have changed.
Another problem is that I need to know explicitly which headers are included in the .cpp (this is not very convenient, as it means I have to scan for the includes myself, then create the .pch, and only then use it when parsing the .cpp source).
Is there any other way to optimize the process? I had hoped that when I use clang_parseTranslationUnit and later clang_reparseTranslationUnit, the parsing would already be optimized in this way (at least the headers that haven't changed would not need to be reparsed), but it doesn't work like that.
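For reference, this is the parse/reparse loop in question — a minimal sketch, assuming libclang is available and that source.cpp and the compiler flags are placeholders. Passing clang_defaultEditingTranslationUnitOptions() (which includes CXTranslationUnit_PrecompiledPreamble) is what lets libclang cache the preamble — the run of #includes at the top of the file — between reparses:

```cpp
#include <clang-c/Index.h>  // requires libclang; link with -lclang

int main() {
    const char *args[] = {"-std=c++17", "-Iinclude"};  // hypothetical flags

    CXIndex index = clang_createIndex(/*excludeDeclarationsFromPCH=*/0,
                                      /*displayDiagnostics=*/1);

    // The editing options enable CXTranslationUnit_PrecompiledPreamble, so the
    // initial block of #includes is precompiled and cached for later reparses.
    CXTranslationUnit tu = clang_parseTranslationUnit(
        index, "source.cpp", args, 2,
        /*unsaved_files=*/nullptr, 0,
        clang_defaultEditingTranslationUnitOptions());

    // Subsequent reparses reuse the cached preamble as long as the headers
    // (and the include block itself) are unchanged.
    int err = clang_reparseTranslationUnit(tu, 0, nullptr,
                                           clang_defaultReparseOptions(tu));

    clang_disposeTranslationUnit(tu);
    clang_disposeIndex(index);
    return err;
}
```

Note that the preamble cache only covers the contiguous run of includes at the top of the file; whether it helps in a given editing workflow depends on keeping that block stable.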

Related

Does the amount of files imported in a bridging header affect compile times?

I have a theory, but I don't know how to test it. We have a fairly large iOS project of about 200 Swift files and 240 obj-C files (and an equal amount of header files). We're still on Swift 1.2, which means that quite regularly, the entire project gets rebuilt.
I've noticed that each .swift file takes about 4-6 seconds to compile; in other projects this is at most 2.
Now, I've noticed that in the build output, warnings generated in header files get repeated for every .swift file, which makes me believe the swift compiler will re-parse all headers included in the bridging header. Since we have ~160 import statements in the bridging header, this kinda adds up.
So, basic questions:
Does the size of our bridging header impact build times?
Is there any way to optimize this, so it parses the headers only once?
Does Swift 2 have this same issue?
Any other tricks to optimize this? Besides rewriting everything in Swift, that's kinda too labor-intensive a project for us to undertake at this time.
Does the size of our bridging header impact build times?
Absolutely. The more files included in your bridging header, the more time it takes the compiler to parse them. This is what precompiled headers attempted to fix; PCH files have since been phased out in favor of Modules.
Is there any way to optimize this, so it parses the headers only once?
To be honest I don't know, it depends on your source files and dependencies.
Does Swift 2 have this same issue?
Yes, but compiler optimization is much better in the newer versions of Xcode and Swift. Again, stressing Modules instead of Precompiled Header files here. I should note that it is possible to pass a pch file directly into clang, but that's rarely a good idea.
If you can, I'd experiment with using a pch header in the hybrid project. I'd also consider creating precompiled libraries or static frameworks to prevent the constant rebuilding of classes. There's a great WWDC video from 2013 which introduces modules, I highly recommend watching it.
References:
Modules and Precompiled Headers
WWDC 2014 Session 404 - Advances in Objective-C
Why isn't ProjectName-Prefix.pch created automatically in Xcode 6?
I can only speak from my experience at a previous workplace, so some things might have changed. Also, I'm not sure this helps your specific case, since you mix Objective-C and Swift, which I have never done, but the theory is still sound.
In short, yes, the size of the bridging header affects compile times and you're correct that it parses it once for every file/inclusion.
The correct way to optimise this seemed to be to split the project up into modules (also called "frameworks" at some point) because each module is compiled individually and thus not recompiled if nothing has changed.

Import all headers through a single header?

I see that many frameworks have something like a Globals.h which simply imports all their header files, making it very easy elsewhere to import that single Globals.h header and get access to everything.
I'm not using any specific framework; I'm just working on a standard, relatively simple project. I'm doing a lot of importing, and I started to wonder: can't I just apply this technique?
Can I? Or will it lead to potential problems, like recursive importing? Are there particular methods or situations where you would use this, or things to watch out for?
Thanks.
Doing this inside a framework makes sense. You don't want the end user to have to import a bunch of classes, so you instruct them to import just one header, which takes care of the rest.
Doing something similar inside a project can help you simplify things: if you always import a.h, b.h and c.h together, creating a header file called abAndC.h would also make sense.
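As a sketch (a.h, b.h and c.h are the hypothetical names from above), such a grouping header is nothing more than:

```cpp
// abAndC.h - groups headers that are always used together
#pragma once  // or a classic include guard

#include "a.h"
#include "b.h"
#include "c.h"
```

Consumers then include abAndC.h instead of the three individual headers.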
As a side note I always import my Constants.h in my pch file to avoid importing it through my project.
Importing everything everywhere doesn't make sense to me. If that file isn't included in the pch file, it can hurt compile time, and moreover you expose so many class details everywhere - that should be considered bad practice. A file with global imports is a good idea only if it contains very popular headers (I would put imports there for classes that occur in, say, 30% of classes or so). I would then include that global import header in the pch file, as that not only makes it visible everywhere in the app but also helps reduce compile time. Also remember to use modules for standard libraries wherever possible (that reduces compile time too, not only when including from the pch file). Apart from the popular classes, I wouldn't expose implementation details in a header file unless necessary - instead, use forward declarations.

What is the best practice for using precompiled headers in a modern C++Builder application?

I am currently migrating a large RAD Studio 2010 project to XE4. As part of this, I am recreating many of the project files. I would like to take the opportunity to ensure we are using the best possible mechanism for precompiled headers, since there seem to be a few ways to do it.
Right now we are compiling for 32-bit only but will use the 64-bit compiler in future.
Here's what we're currently doing in 2010, and why I'm unsure about what to do in XE4:
In RAD Studio 2010
We have a file PchApp.h which includes <vcl.h> and a number of other commonly-used header files, mostly headers for various commonly-used core classes in the project. This header is included at the top of every CPP file followed by #pragma hdrstop, like so:
// Top of .cpp file
#include "PchApp.h"
#pragma hdrstop
// Normal includes here
#include "other.h"
#include "other2.h"
// etc
We then have the following settings in the Precompiled Headers section of the project options:
It is not particularly fast to compile (12 minutes for circa 350,000 lines of code.) I am unsure about:
"Inject precompiled header file": should this inject PchApp.h?
"Cache precompiled headers (Must be used with -H or -H"xxx")": the -H option is the "PCH filename", so we are using it, but surely the point of a precompiled header is that it is "cached" or prebuilt once per compile. What extra difference does this make?
Should we have the two lines to include PchApp.h and the pragma hdrstop in the .cpp files? Is there a way to do this in the project options only, and not duplicate these two lines in every single file? Are they necessary?
In other words, I am not sure these are correct or optimal settings, but from reading the documentation I'm equally not sure what would be better. I am aware I don't understand all the options well enough - one reason for this question :)
In RAD Studio XE4
The XE4 32-bit compiler's options dialog is the same, but two things confuse me and/or make me uncertain the current 2010 approach is the best.
1. Default behaviour
When creating a new VCL Forms project, the IDE creates a header named by default Project1PCH1.h, which is intended to be the project's precompiled header. This header includes <vcl.h> and <tchar.h>, and is shown as a node in the Project Manager. It is not included in the default Form1.cpp, but #include <vcl.h> followed by #pragma hdrstop is at the very top of Form1.cpp, followed by other headers.
The default XE4 settings dialog for a new project using this header is:
I am (naively?) working on the assumption the defaults are actually the best / most optimal settings. Some things puzzle me:
The project's supposed precompiled header Project1PCH1.h is not mentioned in the precompiled header settings anywhere.
The headers aren't cached
The PCH filename isn't specified (should this be Project1PCH1.h?)
The .cpp files don't include Project1PCH1.h either.
In fact I have no idea how the compiler or IDE actually know that it is supposed to use Project1PCH1.h or for which .cpp files it is supposed to use it, since it isn't referred to in any way I can find.
This is the most puzzling thing to me, and the spur to ask this question and clear up all my confusion about PCHes. I had planned to copy/use the IDE's default settings, but I don't want to until I understand what they are doing.
2. PCH Wizard
Since 2010, the IDE has included a precompiled header wizard. I had never been able to get it to work - I am running it again right now to get its results and confirm my memory of "doesn't work", but it seems to take several hours, so I will update this question later.
Edit: it ran, though it took several hours, and produced a list of (to me, knowing the source base) odd headers. My recollection from trying it several years ago is that it didn't run at all - so, a definite improvement.
Since it exists, it may be the best way to set up using precompiled headers in a newly created project file formed to upgrade the 2010 project. How do I best do so? Will all the .cpp files including PchApp.h confuse it?
Questions
With that as background, I have the following questions:
Existing settings. I am creating a new project file and adding thousands of pre-existing .cpp files, all with "#include PchApp.h; #pragma hdrstop" at the top. Should I copy the existing RS2010 PCH settings? Should I remove the above two lines and replace them with something else?
Using the PCH wizard: Does this, in your experience, create optimal settings? Does it include files that, if modified, will cause large areas of the project to be rebuilt (probably non-optimal for coding)? Is it possible to use on an existing project, or do items like our "#include PchApp.h" need to be stripped out before using it?
CPP files / units and the correct includes. Should .cpp files that use precompiled headers not include the precompiled header itself, but only the headers that the .cpp actually needs, even if the PCH includes those? What if you have our current situation, where the PchApp.h file includes several common headers and so the .cpp files don't actually include those themselves? If you remove the inclusion of PchApp.h and replace it with the subset of headers in PchApp.h that the specific .cpp files needs, should they be above or below the #pragma hdrstop? (Above, I think.) What if you then include something else above with them which is not included in the precompiled header - will it change PCH usage for that specific unit, cause the PCH to be rebuilt (performance issues?), etc?
Default setup: Assuming the default setup for a new project is optimal, how is best to migrate the current system to using it?
Non-default setup: If the default setup is not optimal, what is? This, I guess, is the key question.
32 and 64-bit: Knowing that we'll move to 64-bit soon, what should we do to have precompiled headers work on both 32 and 64 bit? Should all PCH knowledge be in the project options rather than .cpp files, so different settings for 32 and 64-bit compilation?
I am seeking a clear, detailed, explanatory, guiding answer: one that clearly explains best practice, which options to set, and what to include in the .cpp files, headers, and/or project file - in other words, something to clear up my by now (after all the above!) rather confused understanding. A high-quality answer that could serve as the go-to PCH reference for other C++Builder users in future would be excellent. I intend to add a bounty in a couple of days when I am able to.
Existing settings. In my experience, I have usually changed these settings, because with hundreds of files the defaults just don't seem optimal. In Xcode, for example, this is the default configuration. There should be no compilation performance difference.
Using the PCH wizard. Honestly, I have never used it in a real project, and it didn't impress me, so I just forgot about it and used manual settings.
CPP files / units and the correct includes. Different IDEs have different default settings for that. What I have usually used is:
Inject precompiled headers automatically (no manual #include in .cpp)
First include appropriate header matching .cpp if one exists (myfile.cpp - then include myfile.h)
After that include all the specific headers that do specific job (specific lib headers, etc.)
In "myfile.h", include ONLY what is strictly necessary. Avoid anything you can avoid.
Everything you include specifically for a particular .cpp file should be below #pragma hdrstop. Everything you want to be precompiled should be above.
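A sketch of that layout, with manual inclusion rather than automatic injection so the placement is visible (all header names are hypothetical):

```cpp
// myfile.cpp
#include "PchApp.h"       // stable, widely shared headers: candidates for the PCH
#pragma hdrstop           // everything above this line can come from the PCH

#include "myfile.h"       // first: the header matching this .cpp
#include "SpecificLib.h"  // then: headers only this unit needs
```

With automatic injection enabled, the first two lines disappear from the .cpp and the injected header plus the implicit hdrstop take their place.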
Default setup. I don't think it's optimal. For me, it's much easier to migrate by just changing a couple of options in the settings.
Non-default setup. As I mentioned above, for me the optimal setup is with automatic injection of the precompiled header. More details in item 3.
32 and 64-bit. I haven't experienced any problems with that. It should generate its own precompiled headers for each particular configuration.
Here's what I do (although I am not sure if it is a good idea or not but it seems to work)
Make sure Project1PCH1.h exists (where Project1 is the name of the project)
Make it contain #pragma hdrstop and 2 trailing newlines (I got weird errors when I didn't have trailing newlines, maybe compiler bug)
In "All Configurations", put the name "Project1PCH1.h" into "Inject precompiled header file"
Do not do anything such as #include "PchApp.h" nor #pragma hdrstop in the other files.
Check everything builds correctly (i.e. files have the right includes on their own merit, not relying on the injected PCH)
Put some includes into the Project1PCH1.h. I use the wizard to come up with some suggestions, but you have to apply some human logic as well to get a good build.
When it's working properly in 32-bit mode, everything compiles lightning quick; you can tell you have not quite got something right if, when compiling your project, one particular .cpp file takes a lot longer than the rest. The wizard makes suggestions based on how many files include a given header, but that's somewhat bogus; you need to include any system header (or boost header, etc.) that would add significantly to the compilation time if it were not part of the PCH.
I don't bother to include my own project headers in it, just system and standard headers. That may differ for you depending on your project, IDK.
The PCH doesn't work for .c files, so if you have any of those in your project you'll need to give Project1PCH1.h #ifdef __cplusplus guards.
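A sketch of such a guarded PCH header (the trailing newlines from step 2 are kept; the particular includes are just examples):

```cpp
// Project1PCH1.h
#ifdef __cplusplus
  // C++-only headers, hidden from any .c files in the project
  #include <vcl.h>
  #include <vector>
  #include <memory>
#endif
#pragma hdrstop


```

A .c translation unit then sees only the empty guard and the hdrstop, and compiles without the PCH.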
Also: even though bcc64 doesn't support PCHs (though it does inject the file), if you have your PCH set up right it still seems to make compilation go a fair bit faster; I'm not exactly sure why.
Things I don't understand about it yet:
Why does the New Project wizard autogenerate Project1PCH1.h but not actually set that in the "Inject Precompiled Header" field of Project Properties?
Sometimes the build fails saying it cannot open Project1PCH1.h but if I make some changes and re-save it it usually seems to fix this.

How do headers work in Objective-C?

Beyond allowing one file to use another file's attributes, what actually happens behind the scenes? Does it just provide the location to access to that file when its contents are later needed, or does it load the implementation's data into memory?
In short:
The header file defines the API for a module. It's a contract listing which methods a third party can call. The module can be considered a black box to third parties.
The implementation implements the module. It is the inside of the black box. As a developer of a module you have to write this, but as a user of a third party module you shouldn't need to know anything about the implementation. The header should contain all the information you need.
Some parts of a header file could be auto generated - the method declarations. This would require you to annotate the implementation as there are likely to be private methods in the implementation which don't form part of the API and don't belong in the header.
Header files sometimes have other information in them; type definitions, constant definitions etc. These belong in the header file, and not in the implementation.
The main reason for a header is to be able to #include it in some other file, so you can use the functions from one file in that other file. The header includes (only) enough to be able to use the functions, not the functions themselves, so (we hope) compiling it is considerably faster.
Maintaining the two separately mostly results from nobody ever having written an editor that automates the process very well. There's not really a lot of reason they couldn't do so, and a few have even tried to - but the editors that did have never done very well in the market, and the more mainstream editors haven't adopted it.
Well, I will try:
Header files are only needed in the preprocessing phase. Once the preprocessor is done with them, the compiler never even sees them. Obviously, the target system doesn't need them for execution either (just as the .c files aren't needed).
Libraries, by contrast, come into play during the linking phase. If a program is dynamically linked and the target environment doesn't have the necessary libraries, in the right places, with the right versions, it won't run.
In C nothing like that is needed, since once you compile you get native code. Header files are copy-pasted into the source when you #include them. This is very different from the byte-code you get from Java; there's no need for an interpreter (like the JVM): you just feed your binary to the CPU and it does its thing.

J2ME Properties

J2ME lacks the java.util.Properties class. Although it is possible to put application settings in the JAD file, this is not recommended for many properties (since some platforms limit the size of the JAD file). I want to put a configuration file inside my jar file and parse it. And I do not want to go with XML, because that would be overkill for my case.
The question is: is there an existing library for J2ME that can parse properties files or something similar, such as INI files? Or would you recommend another method to solve the initial problem?
The best solution probably depends on what is going to be generating the properties files.
If you've got other non-JavaME projects using the same properties files, then stick with them, and write or find a parser. (There is a simple one from GoBible available on Google Code)
However, you might find it just as easy to keep your configuration as static final String myproperty="myvalue"; entries in a Configuration.java file which you compile and include in the jar instead, since you then do not need any special code to locate, open, read, and parse them.
You do then pick up a limitation on what you can call them, though, since you can no longer use the common dot-separated namespacing idiom.
