Find size contributed by each external library on iOS

I'm trying to reduce my App Store binary size, and we have lots of external libs that might be contributing to the size of the final .ipa. Is there any way to find out how much each external static lib takes up in the final binary, other than removing them one at a time and rebuilding?

All of this information is contained in the link map, if you have the patience to sift through it (for large apps, it can be quite long). The link map lists all the libraries, their object files, and all the symbols that were packaged into your app, in human-readable text. Projects aren't configured to generate one by default, so you'll have to make a quick project file change.
From within Xcode:
Under 'Build Settings' for your target, search for "map"
In the results below, under the 'Linking' section, set 'Write Link Map File' to "Yes"
Make a note of the full path and file name listed under 'Path to Link Map File'
The next time you build your app you'll get a link map dumped to that file path. Note that the path is relative to your app's location in the DerivedData folder (usually ~/Library/Developer/Xcode/DerivedData/<your-app-name>-<random-string-of-letters-and-numbers>/Build/Intermediates/..., but YMMV). Since it's just a text file, you can read it with any text editor.
The contents of the link map are divided into 3 sections, of which 2 will be relevant to what you're looking for:
Object Files: this section contains a listing of all of the object files included in your final app, including your own code and that of any third-party libraries you've included. Importantly, each object file also lists the library where it came from;
Sections: this section, not relevant to your question, contains a list of the processor segments and their sections;
Symbols: this section contains the raw data that you're interested in: a list of all symbols/methods with their absolute location (i.e. address in the processor's memory map), size, and most important of all, a cross-reference to their containing object module (under the 'File' column).
From this raw data, you have everything you need to do the required size calculation. From the Object Files section, you see which object modules each library contributes; from the Symbols section, you see the symbols in each object module and the size each one occupies. A given library's contribution, then, is the sum of the sizes of all symbols across all of its object modules. To perform the calculation itself, I'm sorry to say that I'm not aware of any existing tool that will do the requisite processing for you, but given that the link map is just a text file, with a little script magic and ingenuity you can construct a script to do the heavy lifting (I sketch one such script after the worked example below).
For example, I have a little sample project that links to the following library: https://github.com/ColinEberhardt/LinqToObjectiveC (the sample project itself is from a nice tutorial on ReactiveCocoa: http://www.raywenderlich.com/62699/reactivecocoa-tutorial-pt1), and I want to know how much space it occupies. I've generated a link map, TwitterInstant-LinkMap-normal-x86_64.txt (the build runs in the simulator). To find all object modules included by the library, I do this:
$ grep -i "libLinqToObjectiveC.a" TwitterInstant-LinkMap-normal-x86_64.txt
which gives me this:
[ 8] /Users/XXX/Library/Developer/Xcode/DerivedData/TwitterInstant-ecppmzhbawtxkwctokwryodvgkur/Build/Products/Debug-iphonesimulator/libLinqToObjectiveC.a(LinqToObjectiveC-dummy.o)
[ 9] /Users/XXX/Library/Developer/Xcode/DerivedData/TwitterInstant-ecppmzhbawtxkwctokwryodvgkur/Build/Products/Debug-iphonesimulator/libLinqToObjectiveC.a(NSArray+LinqExtensions.o)
[ 10] /Users/XXX/Library/Developer/Xcode/DerivedData/TwitterInstant-ecppmzhbawtxkwctokwryodvgkur/Build/Products/Debug-iphonesimulator/libLinqToObjectiveC.a(NSDictionary+LinqExtensions.o)
The first column contains the cross-references to the symbol table that I need, so I can search for those:
$ cat TwitterInstant-LinkMap-normal-x86_64.txt | grep -e "\[ 8\]"
which gives me:
0x100087161 0x0000001B [ 8] literal string: PodsDummy_LinqToObjectiveC
0x1000920B8 0x00000008 [ 8] anon
0x100093658 0x00000048 [ 8] l_OBJC_METACLASS_RO_$_PodsDummy_LinqToObjectiveC
0x1000936A0 0x00000048 [ 8] l_OBJC_CLASS_RO_$_PodsDummy_LinqToObjectiveC
0x10009F0A8 0x00000028 [ 8] _OBJC_METACLASS_$_PodsDummy_LinqToObjectiveC
0x10009F0D0 0x00000028 [ 8] _OBJC_CLASS_$_PodsDummy_LinqToObjectiveC
The second column contains the size of the symbol in question (in hexadecimal), so if I add them all up, I get 0x103, or 259 bytes.
Even better, I can do a bit of stream hacking to whittle it down to the essential elements and do the addition for me:
$ cat TwitterInstant-LinkMap-normal-x86_64.txt | grep -e "\[ 8\]" | grep -e "0x" | awk '{print $2}' | xargs printf "%d\n" | paste -sd+ - | bc
which gives me the number straight up:
259
Doing the same for "\[ 9\]" (13016 bytes) and "\[ 10\]" (5503 bytes), and adding them to the previous 259 bytes, gives me 18778 bytes.
You can certainly improve upon the stream hacking I've done here to make it a bit more robust (as written, you have to get the exact number of spaces right and quote the brackets), but you get the idea.
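To tie the example together, here's a minimal Python sketch of the kind of script I mean (my own rough cut, not an established tool): it reads the Object Files section to map each cross-reference index to its library, then totals the symbol sizes per library. The file name is the one from my example; substitute your own link map path.

import re
from collections import defaultdict

LINK_MAP = "TwitterInstant-LinkMap-normal-x86_64.txt"  # substitute your link map path

index_to_lib = {}            # maps the "[  8]"-style index to its library/archive
sizes = defaultdict(int)     # accumulated symbol sizes, per library

section = None
with open(LINK_MAP, errors="replace") as f:
    for line in f:
        if line.startswith("#"):                 # section headers in the link map
            if "Object files:" in line:
                section = "objects"
            elif "Sections:" in line:
                section = None
            elif "Symbols:" in line:
                section = "symbols"
            continue
        if section == "objects":
            # e.g. "[  8] /path/to/libLinqToObjectiveC.a(LinqToObjectiveC-dummy.o)"
            m = re.match(r"\[\s*(\d+)\]\s+(.*)", line)
            if m:
                index_to_lib[int(m.group(1))] = m.group(2).split("(")[0]
        elif section == "symbols":
            # e.g. "0x100087161  0x0000001B  [  8] literal string: ..."
            m = re.match(r"0x[0-9A-Fa-f]+\s+(0x[0-9A-Fa-f]+)\s+\[\s*(\d+)\]", line)
            if m:
                sizes[index_to_lib.get(int(m.group(2)), "?")] += int(m.group(1), 16)

for lib, total in sorted(sizes.items(), key=lambda kv: -kv[1]):
    print(f"{total:>10}  {lib}")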

Make an .ipa file of your app and save it somewhere on your system.
Then open the terminal and execute the following command:
unzip -lv /path/to/your/app.ipa
It will print a table of data about your .ipa file. The Size column shows the compressed size of each file within the .ipa (the Length column shows the uncompressed size).
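If you'd rather aggregate those numbers than eyeball the table, here's a small Python sketch (my own convenience script; the .ipa path is a placeholder) that groups the compressed sizes by top-level folder inside the archive, since an .ipa is just a zip:

import zipfile
from collections import defaultdict

IPA = "/path/to/your/app.ipa"  # hypothetical path; use your own

totals = defaultdict(int)
with zipfile.ZipFile(IPA) as z:
    for info in z.infolist():
        # group by the first two path components, e.g. "Payload/Foo.app"
        key = "/".join(info.filename.split("/")[:2])
        totals[key] += info.compress_size

for key, size in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{size:>12}  {key}")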

I think you should be able to extract the information you need from this:
symbols -w -noSources YourFileHere
Ref: https://devforums.apple.com/message/926442#926442
IIRC, it isn't going to give you a clear per-library summary, but the functions from each library should be clustered together, so with a bit of effort you can calculate the approximate contribution of each one.

Also make sure that you set Generate Debug Symbols to NO in your build settings. This can reduce the size of your static library by about 30%.
In case it's part of your concern: a static library is just the relevant .o files archived together plus some bookkeeping, so a 1.7 MB static library (even if the code within it is the entire 1.7 MB) won't usually add 1.7 MB to your product. The usual rules about dead-code stripping apply.
Beyond that you can reduce the built size of your code. The following probably isn't a comprehensive list.
In your target's build settings, look for 'Optimization Level'. Switching it to 'Fastest, Smallest [-Os]' permits the compiler to sacrifice some speed for size.
Make sure you're building for Thumb, the more compact ARM instruction encoding. Assuming you're using LLVM, that means making sure you don't have -mno-thumb anywhere in your project settings.
Also consider which architectures you want to build for. Apple doesn't allow submission of an app that supports both ARMv6 and the iPhone 5 screen, and has dropped ARMv6 support entirely from the latest Xcode, so there's probably no point including it at this point.

Related

Transfer learning practice for distinguishing cats and dogs

I'm trying to practice transfer learning myself.
As a first step, I'm trying to count the cat and dog files (12,500 pictures each, for a total of 25,000).
My code, and the path to my picture folder, were posted as screenshots in the original question.
I thought this was simple code, but I still can't figure out why I keep getting (0, 0) when it should be (12500 cat files, 12500 dog files).
Use os.path.join() inside glob.glob(). Also, if all your images have a particular extension (say, .jpg), you could replace '*.*' with '*.jpg', for example.
Solution
import os, glob
files = glob.glob(os.path.join(path,'train/*.*'))
As a matter of fact, you might as well just use the os library alone, since you are not selecting any particular file extension:
import os
files = os.listdir(os.path.join(path,'train'))
Some Explanation
The method os.path.join() joins multiple path components to create a path, and works whether you are on a Windows, Mac, or Linux system. On Windows the path separator is \ while on Mac/Linux it is /, so building paths by hand with the wrong separator can produce a path the OS can't resolve. I use glob.glob when I'm interested in getting files of some specific type (extension), but glob.glob(path) requires a valid path to work with. In my solution, os.path.join() creates that path from the path components and feeds it into glob.glob().
For more clarity, I suggest you see documentation for os.path.join and glob.glob.
Also, see pathlib module for path manipulation as an alternative to os.path.join().
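And since the question is ultimately about counting the cat and dog files, here's a short pathlib variant of the same idea (the path is hypothetical, and I'm assuming the usual Kaggle-style naming of cat.0.jpg, dog.0.jpg, and so on):

from pathlib import Path

path = "/path/to/data"                # hypothetical; point this at your dataset
train_dir = Path(path) / "train"

cats = list(train_dir.glob("cat*"))   # e.g. cat.0.jpg ... cat.12499.jpg
dogs = list(train_dir.glob("dog*"))
print(len(cats), len(dogs))           # expect: 12500 12500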

How do you extract local variable information (address and type) from a Delphi program or the compiler-generated debug info?

My goal is:
Given a suspended thread in a Delphi-compiled 32 or 64-bit Windows program, to walk the stack (doable)
Given stack entries, to enumerate the local variables in each method and their values. That is, at the very least, find their address and type (integer32/64/signed/unsigned, string, float, record, class...) the combination of which can be used to find their value.
The first is fine and it's the second that this question is about. At a high level, how do you enumerate local variables given a stack entry in Delphi?
At a low level, this is what I've been investigating:
RTTI: does not list this kind of information about methods. This was not something I actually ever thought was a realistic option, but listing here anyway.
Debug information: Loading the debug info produced for a debug build.
Map files: even a detailed map file (a text-format file! Open one and have a look) does not contain local variable info. It's basically a list of addresses and source file line numbers. Great for address to file&line correlation, e.g. the blue dots in the gutter; not great for more detailed information
Remote debugging information (RSM file) - no known information on its contents or format.
TD32/TDS files: my current line of research. They contain global and local symbols among a lot of other information.
The problems I'm encountering here are:
There's no documentation of the TD32 file format (that I can find).
Most of my knowledge of it comes from the JEDI JCL code that uses it (JclTD32.pas), and I'm not sure how to use that code, or whether the structures there are extensive enough to expose local variables. I'm pretty certain it will handle global symbols, but I'm very uncertain about local ones. A wide variety of constants are defined there, and without documentation for the format I'm left guessing at what they mean. However, those constants and their names must come from somewhere.
The source code I can find that uses TDS info does not load or handle local symbols.
If this is the right approach, then this question becomes 'Is there documentation for the TDS/TD32 file format, and are there any code samples that load local variables?'
A code sample isn't essential but could be very useful, even if it's very minimal.
Check whether any debugging symbols are present in the binary.
Another option is GDB (on Windows, a port of it). It would be great if you found a .dbg or .dSYM file; with those, GDB can list the source, e.g.:
gdb> list foo
56 void foo()
57 {
58 bar();
59 sighandler_t fnc = signal(SIGHUP, SIG_IGN);
60 raise(SIGHUP);
61 signal(SIGHUP, fnc);
62 baz(fnc);
63 }
If you don't have any debugging files, you may try MinGW or Cygwin and use nm(1) (man page). It will read symbol names from the binary. They may contain some type information, like C++ ones:
int abc::def::Ghi::jkl(const std::string, int, const void*)
Don't forget to add the --demangle option, or you'll get something like:
__ZN11MRasterFont21getRasterForCharacterEh
instead of:
MRasterFont::getRasterForCharacter(unsigned char)
Take a look at the Open Architecture Handbook: http://download.xskernel.org/docs/file%20formats/omf/borland.txt. It is old, but maybe you'll find some relevant information about the file format there.

Can the ELF entry point be different from the usual 0x80******? Why would that be done?

I'm playing around with a binary, and when I load it into my debugger or run readelf, I notice the entry point is 0x530 instead of the usual 0x80****** that I'd learned ELFs were loaded at.
Why is this? Is there anything else going on? The binary is linked and not stripped.
instead of the usual 0x80****** that I'd learned ELFs were loaded at.
You learned wrong.
While 0x8048000 is the usual address that 32-bit x86 Linux binaries are linked at, that address is by no means universal or special.
64-bit x86_64 and aarch64 binaries are linked at a default address of 0x400000, and powerpc64le binaries at a default address of 0x10000000.
There is no reason a binary could not be linked at any other (page-aligned) address (so long as it is not 0, and it leaves room for sufficient stack at the high end of the address space).
Why is this?
The binary was likely linked with a custom linker script. Nothing wrong with that.
As mentioned by Employed, the entry address is not fixed.
Just to verify, I've tried on x86_64:
gcc -Wl,-Ttext-segment=0x800000 hello_world.c
which links the text segment at 0x800000 instead of the default 0x400000; the ELF header gets loaded at 0x800000 in memory, and the entry point lands just past it.
Then both:
readelf -h a.out
and gdb -ex 'b _start' tell me the entry is at 0x800440 as expected (everything before _start, starting with the ELF header, occupies the first 0x440 bytes).
This is because that value is the input that tells the Linux kernel where to set the PC when it starts executing a new program (on execve).
The default 0x400000 comes from the default linker script used. You can also modify the linker script as mentioned in https://stackoverflow.com/a/31380105/895245 , change 0x400000 there, and use the new script with -T script
If I set it to anything below 0x200000 (2 MiB) or certain other low addresses, the program gets killed. I think this is because ld aligns segments to multiples of 2 MiB, the largest supported (huge) page size, so anything lower ends up mapping the first segment at address 0, which is bad: Why is the ELF execution entry point virtual address of the form 0x80xxxxx and not zero 0x0?
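To see where that value actually lives, here's a minimal Python sketch (my own illustration) that reads e_entry straight out of the ELF header, assuming a 64-bit little-endian binary named a.out:

import struct

with open("a.out", "rb") as f:
    header = f.read(64)              # the ELF64 header is 64 bytes

assert header[:4] == b"\x7fELF", "not an ELF file"
assert header[4] == 2, "not a 64-bit ELF (EI_CLASS != ELFCLASS64)"

# e_entry is an 8-byte little-endian field at offset 24 of the ELF64 header
(entry,) = struct.unpack_from("<Q", header, 24)
print(hex(entry))                    # matches readelf -h "Entry point address"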

Apple's Automator: compression settings for jpg?

When I use Apple's Automator to simply cut a bunch of images down to size, Automator also reduces the quality of the files (JPEG) and they get blurry.
How can I prevent this? Are there settings that I can take control of?
Edit:
Or are there any other tools that do the same job but without affecting the image quality?
If you want finer control over the amount of JPEG compression, then as kopischke said, you'll have to use the sips utility, which can be called from a shell script. Here's how you would do that in Automator:
First get the files and the compression setting:
The Ask for Text action should not accept any input (right-click on it, select "Ignore Input").
Make sure that the first Get Value of Variable action does not accept any input (right-click on it, select "Ignore Input"), and that the second Get Value of Variable takes its input from the first. This creates an array that is then passed to the shell script: the first item in the array is the compression level that was given to the Automator script, and the second is the list of files that the script will run the sips command on.
In the options on the top of the Run Shell Script action, select "/bin/bash" as the Shell and select "as arguments" for Pass Input. Then paste this code:
itemNumber=0
compressionLevel=0
for file in "$@"
do
    if [ "$itemNumber" = "0" ]; then
        compressionLevel=$file
    else
        echo "Processing $file"
        filename="$file"
        sips -s format jpeg -s formatOptions $compressionLevel "$file" --out "${filename%.*}.jpg"
    fi
    ((itemNumber=itemNumber+1))
done
((itemNumber=itemNumber-1))
osascript -e "tell app \"Automator\" to display dialog \"${itemNumber} Files Converted\" buttons {\"OK\"}"
If you click on Results at the bottom, it'll tell you what file it's currently working on. Have fun compressing!
Automator’s “Crop Images” and “Scale Images” actions have no quality settings – as is often the case with Automator, simplicity trumps configurability. However, there are other ways to access CoreImage’s image manipulation facilities without resorting to Cocoa programming. The first is the Scriptable Image Processing System, which makes image processing functions available to the shell via the sips utility. You can fiddle with the most minute settings using it, but as it is a bit arcane to handle, you might be better served by the second way,
AppleScript via Image Events, a scriptable faceless background application provided by OS X. There are crop and scale commands, and the option of specifying a compression level when saving as a JPEG with
save <image> as JPEG with compression level (low|medium|high)
Use a “Run AppleScript” action instead of your “Crop” / “Scale” one and wrap the Image Events commands in a tell application "Image Events" block, and you should be set. For instance, to scale the image to half its size and save as a JPEG in best quality, overwriting the original:
on run {input, parameters}
    set output to {}
    repeat with aPath in input
        tell application "Image Events"
            set aPicture to open aPath
            try
                scale aPicture by factor 0.5
                set end of output to save aPicture as JPEG with compression level low
            on error errorMessage
                log errorMessage
            end try
            close aPicture
        end tell
    end repeat
    return output -- next action processes edited files.
end run
– for other scales, adjust the factor accordingly (1 = 100 %, .5 = 50 %, .25 = 25 %, etc.); for a crop, replace the scale aPicture by factor X line with crop aPicture to {width, height}. Mac OS X Automation has good tutorials on the usage of both scale and crop.
Eric's code is brilliant and gets most of the job done, but if an image's filename contains a space, this workflow will not work (the space breaks the shell script when it runs sips).
There is a simple solution: add a "Rename Finder Item" action to the workflow and replace spaces with "_" or anything you like.
Then it's good to go.
Comment from '20
I changed the script into a Quick Action with no prompts (neither for compression nor for confirmation). It duplicates the file and renames the original version to _original. I also included nyam's solution for the 'space' problem.
You can download the workflow file here: http://mobilejournalism.blog/files/Compress%2080%20percent.workflow.zip (file is zipped, because otherwise it will be recognized as a folder instead of workflow file)
Hopefully this is useful for anyone searching for a solution like this (like I did an hour ago).
Comment from '17
To avoid "space" problem, it's smarter to change IFS than renaming.
Back up current IFS and change it to \n only. And restore original IFS after the processing loop.
ORG_IFS=$IFS
IFS=$'\n'
for file in $@
do
    ...
done
IFS=$ORG_IFS

Finding what hard drive sectors occupy a file

I'm looking for a nice easy way to find what sectors occupy a given file. My language preference is C#.
From my A-Level Computing class I was taught that a hard drive has a lookup table in the first few KB of the disk. In this table there is a linked list for each file detailing which sectors that file occupies. So I'm hoping there's a convenient way to look in this table for a certain file and see what sectors it occupies.
I have tried Googling, but I'm finding nothing useful. Maybe I'm not searching for the right thing, but I can't find anything at all.
Any help is appreciated, thanks.
About Drives
The physical geometry of modern hard drives is no longer directly accessible by the operating system. Early hard drives were simple enough that it was possible to address them according to their physical structure: cylinder-head-sector. Modern drives are much more complex and use systems like zone bit recording, in which not all tracks have the same number of sectors. It's no longer practical to address them according to their physical geometry.
from the fdisk man page:
If possible, fdisk will obtain the disk geometry automatically. This is not necessarily the physical disk geometry (indeed, modern disks do not really have anything like a physical geometry, certainly not something that can be described in simplistic Cylinders/Heads/Sectors form)
To get around this problem modern drives are addressed using Logical Block Addressing, which is what the operating system knows about. LBA is an addressing scheme where the entire disk is represented as a linear set of blocks, each block being a uniform amount of bytes (usually 512 or larger).
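For historical flavor, the old CHS scheme maps onto a linear address with a simple formula. A tiny Python illustration (the geometry values here are just the classic BIOS-era defaults, not anything a modern drive reports truthfully):

def chs_to_lba(c, h, s, heads=16, sectors_per_track=63):
    # sectors are 1-based; cylinders and heads are 0-based
    return (c * heads + h) * sectors_per_track + (s - 1)

print(chs_to_lba(0, 0, 1))   # 0: the very first sector of the disk
print(chs_to_lba(1, 0, 1))   # 1008 with a 16-head, 63-sectors-per-track geometry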
About Files
In order to understand where a "file" is located on a disk (at the LBA level) you will need to understand what a file is. This is going to be dependent on what file system you are using. In Unix style file systems there is a structure called an inode which describes a file. The inode stores all the attributes a file has and points to the LBA location of the actual data.
Ubuntu Example
Here's an example of finding the LBA location of file data.
First get your file's inode number
$ ls -i
659908 test.txt
Run the file system debugger. "yourPartition" will be something like sda1; it is the partition your file system is located on.
$ sudo debugfs /dev/yourPartition
debugfs: stat <659908>
Inode: 659908 Type: regular Mode: 0644 Flags: 0x80000
Generation: 3039230668 Version: 0x00000000:00000001
...
...
Size of extra inode fields: 28
EXTENTS:
(0): 266301
The number under "EXTENTS", 266301, is the logical block in the file system where your file's data is located. If your file is large, there will be multiple blocks listed. There's probably an easier way to get that number; I couldn't find one.
To validate that we have the right block use dd to read that block off the disk. To find out your file system block size, use dumpe2fs.
dumpe2fs -h /dev/yourPartition | grep "Block size"
Then put your block size in the ibs= parameter, and the extent logical block in the skip= parameter, and run dd like this:
sudo dd if=/dev/yourPartition of=success.txt ibs=4096 count=1 skip=266301
success.txt should now contain the original file's contents.
sudo hdparm --fibmap file
Works for ext, vfat and NTFS... maybe more.
FIBMAP is also available programmatically on Linux, as an ioctl.
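If you want that mapping programmatically rather than via hdparm, here's a rough Python sketch of the FIBMAP ioctl (Linux only, needs root, and the filesystem must support it; FIEMAP is the modern, extent-based replacement):

import array, fcntl, os

FIBMAP = 1                    # from <linux/fs.h>: _IO(0x00, 1)
PATH = "test.txt"             # hypothetical file

fd = os.open(PATH, os.O_RDONLY)
block_size = os.statvfs(PATH).f_bsize
n_blocks = (os.fstat(fd).st_size + block_size - 1) // block_size

for i in range(n_blocks):
    buf = array.array("i", [i])          # in: logical block; out: filesystem block
    fcntl.ioctl(fd, FIBMAP, buf, True)   # mutates buf in place
    print("logical block", i, "-> filesystem block", buf[0])

os.close(fd)

Note the result is in filesystem blocks, not 512-byte device sectors: to get an absolute LBA you still have to multiply by the block size and add the partition's start offset.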
