iOS bitcode - security concerns

We are distributing a software module (a static library) for iOS online. Since Apple is advocating bitcode, and even making it mandatory for apps on some platforms (watchOS/tvOS), we are forced to deliver this module with bitcode.
The concern is how well bitcode resists reverse engineering and decompilation (like Java bytecode), and how we can protect against it. It is easy for anyone to download the library from our website, extract the bitcode (LLVM IR) from it, and decompile it. There is some valuable information on this here:
https://lowlevelbits.org/bitcode-demystified/
Bitcode may not be a concern for apps, since Apple will strip it, but it definitely appears to be a concern for static libraries.
Any insights?

As the link notes "malefactor can obtain your app or library, retrieve the [code] from it and steal your ‘secret algorithm.’" Yep. Totally true.
Also, if you ship non-bitcode libraries, a "malefactor can obtain your app or library, retrieve the [code] from it and steal your ‘secret algorithm.’"
Also, if you ship non-bitcode apps, a "malefactor can obtain your app or library, retrieve the [code] from it and steal your ‘secret algorithm.’"
There is no situation where this is not true. Tools as cheap as Hopper (my tool of choice, though there are also some cheaper solutions) and as elaborate as IDA can decompile your functions into passable C code.
If you're working with Cocoa (ObjC or Swift), you have made it even easier to reverse engineer because it's so easy to dynamically introspect Cocoa.
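To see just how easy that introspection is, here is a minimal Swift sketch against the Objective-C runtime (the class name is purely hypothetical); anyone can run code like this against the classes your binary exposes and enumerate their methods:
import Foundation
import ObjectiveC

// Hypothetical class name; any class linked into the binary can be inspected this way.
if let cls: AnyClass = NSClassFromString("PaymentManager") {
    var count: UInt32 = 0
    if let methods = class_copyMethodList(cls, &count) {
        for i in 0..<Int(count) {
            print("found method:", NSStringFromSelector(method_getName(methods[i])))
        }
        free(methods)
    }
}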
This is not a solvable problem. Both apps and libraries can try to employ obfuscation techniques, but they are complex, fragile, and typically require significant expense or expertise (and often both). In any case, you will need to continually improve your obfuscation as people break it. This is fairly pointless for a library, since there's very little you could re-protect once it leaks, but you can try.
It will leak. That's not solvable. Bitcode doesn't change a whole lot about that. It might be somewhat simpler to read IR than ARM assembly, but not that much, and certainly not if the thing you're protecting is small (like a small algorithm or a key).
There are some obfuscation vendors out there. Product recommendations are off-topic for Stack Overflow (because they attract spam), but search for "ios obfuscation" and you'll find them. In this space, since it's just "tricky hiding" (not security or encryption) you generally get what you pay for. Open source solutions make little sense, since the whole point is to be tricky and hide how you're doing it. I've worked with some open source obfuscation libraries that make it easier to extract secrets from your code (because they're trivial to reverse, and their use marks the parts of the code where you're hiding things).
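To make the "tricky hiding" point concrete, here is a toy Swift sketch (not taken from any real obfuscation product) of the kind of string hiding such tools perform; the decode step is plainly visible in a disassembler, so it only slows an attacker down slightly:
// Toy example: the bytes are "secret" XORed with the constant 0x5d.
let obfuscated: [UInt8] = [0x2e, 0x38, 0x3e, 0x2f, 0x38, 0x29]
let recovered = String(bytes: obfuscated.map { $0 ^ 0x5d }, encoding: .utf8)
// Anyone who spots the `^ 0x5d` in the compiled code recovers the secret immediately;
// this defeats a naive strings search and nothing more.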
If this is important to your business plan, then budget for that, and expect it to introduce some challenging bugs, and expect it to be broken anyway (but maybe take longer).

@Rob Napier, you are wrong; this is comparing apples to oranges. Reading assembly or a disassembly in IDA is worlds apart from reading code produced by decompiling intermediate code. Bitcode is a total nuisance.

Related

Upgrading Encryption Algos for Swift Dependencies

I am using the open source MobSF security framework to scan my Swift project's source code and its dependencies for vulnerabilities. Most things look pretty good; however, I'm concerned that it is showing me that encryption algorithms (MD5, SHA-1) in my dependencies are not sufficiently secure.
What would be standard practice for solving this? I made sure to pull the latest branches for most of these, but they seem to insist on using outdated algos. I am reluctant to go in and change their source code only to have my changes wiped out each time I reinstall from the Podfile.
First, it depends on why they're using these algorithms. For certain uses, there are no security problems with MD5 or SHA-1, and they may be necessary for compatibility with existing standards or backward compatibility.
As an example, PBKDF2 is perfectly secure using SHA-1 as its hash. It doesn't require a very strong hash function to maintain its own security. It's even secure using MD5. Switching to SHA-2 with PBKDF2 doesn't improve security, it's just "security hygiene," which is "avoid algorithms that have known problems even being in your code, even if they cause no problems in your particular use case." Security hygiene is a good practice, but it's not the same thing as security.
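As a concrete illustration, here is a minimal Swift sketch (parameter values are illustrative, not recommendations) that derives a key with CommonCrypto's CCKeyDerivationPBKDF using an HMAC-SHA-1 PRF; a scanner will flag the SHA-1, but PBKDF2's security does not depend on the hash's collision resistance:
import CommonCrypto
import Foundation

// Sketch only: rounds, salt handling, and key length depend on your actual use case.
func deriveKey(password: String, salt: Data, rounds: UInt32 = 100_000, length: Int = 32) -> Data? {
    var key = Data(count: length)
    let status = key.withUnsafeMutableBytes { keyBuf in
        salt.withUnsafeBytes { saltBuf in
            CCKeyDerivationPBKDF(
                CCPBKDFAlgorithm(kCCPBKDF2),
                password, password.utf8.count,
                saltBuf.bindMemory(to: UInt8.self).baseAddress, salt.count,
                CCPseudoRandomAlgorithm(kCCPRFHmacAlgSHA1), // SHA-1 as the PRF is still fine here
                rounds,
                keyBuf.bindMemory(to: UInt8.self).baseAddress, length
            )
        }
    }
    return status == Int32(kCCSuccess) ? key : nil
}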
For other use cases, the security of the hash function is critical. If a framework is authenticating arbitrary messages using MD5, that's completely broken. Don't take this answer to suggest that algorithms don't matter. They do! But not in every use case. And if you want to decode credit card swipe transactions, you're probably going to need DES to be in your code, which is horribly broken, but you're still going to need it because that's how magnetic stripes are encrypted. It doesn't make your framework "insecure."
When you say "but they seem to insist on using outdated algos," I assume you mean you opened a PR and they rejected it, in which case I assume they have a good reason (such as backward compatibility when there is no actual security problem). If you haven't, then obviously the first step would be to open a PR.
That said, if you want to change this because you feel there is an actual security problem that they will not resolve, or purely for hygiene, then with CocoaPods you would fork the project, modify it, and point to your own version using the source attribute to the pod keyword.
As someone who maintains a cryptography framework myself, I often get bug reports from developers running these scanners that are simply wrong. Make sure you understand what the scanner is telling you and how to evaluate its findings. False positives are extremely common. These tools are useful, but you need some expertise to read their reports.

Is Electron a Reliable Framework for Enterprise Apps?

We can see good applications (such as Slack and Insomnia) moving to Electron, but is it safe and stable enough to build a big solution (such as an ERP) with it? Thanks.
As far as stability goes, Electron is very stable. In my experience I've had no stability issues or unanticipated behavior while developing some complex software on Electron.
However, a bigger concern for some is security. Allow me to explain.
How Electron Packages Applications
Electron packages applications by bundling all of their JavaScript components into an asar archive.
Asar is a simple, tar-like archive format: it concatenates all files together without compression, while still supporting random access.
Why This is a Security Concern
What this means is that all of your application's code is simply put into an archive. This archive can be explored and extracted quite trivially using the asar command:
npm install -g asar
asar extract my-app.asar
While this may not be an issue for open-source projects, or for applications like Slack that rely on a paid backend service, license-based or paid products could easily be stolen, since there is none of the code security/obscurity that a traditionally compiled application might offer. For some this may be acceptable; for others it may not be, especially if business logic lives in the application.
Can This Issue be Mitigated?
One potential solution to this issue would be the ability to encrypt the ASAR. This issue has been brought up to the Electron devs, but they have stated that while they are open to a pull request they will likely not be implementing it themselves.
Another possible technique to mitigate this issue is code obfuscation using something such as UglifyJS. However, this is obviously not true protection, just a hiding technique.
A third solution, one used by NW.js, is to compile your JS to a V8 snapshot. However, the Electron devs have indicated that this has significant (50%) performance costs, and they will likely not support such a capability.
All of this being said, it is possible to decompile / reverse engineer almost any application in any language. Electron just makes it a little easier to do so by "putting your code out there." However, they have strong reasoning for doing so (performance gains), and unless you have a paid, licensed product it probably doesn't make much difference to you anyway.
Further reading:
https://github.com/electron/electron/issues/3041
https://github.com/electron/electron/issues/2570

Rolling your own code instead of using libraries, avoiding the common approach

I have seen a plethora of projects roll their own things instead of using well tested libraries.
In some other instances I have seen people re-implement Elliptic Curves and Random Number Generators, refusing to use tested libraries, because their code is "better".
Why do people do this, choosing to spend their time implementing something instead of using something that has already been done, tested, and deployed in a plethora of systems?
For example, the Signal Android messenger app embeds a whole, full copy of OpenSSL for encryption. Ref
Why not use BouncyCastle or java.security.*?
Is it an ego thing? Is it a trust thing, i.e. they don't trust libraries?
It can be for a host of different reasons.
Build vs. buy (or use by reference) should come down to a thorough analysis. That said, many folks get into programming because they like building things. Sometimes it's rewarding to build your own code (even when a third party library exists).
With that in mind, I'll try to list some reasons why you might not want to use third party libraries:
Licensing: Does the third party library licensing conflict or restrict your intended usage of your code? For example, GPL-licensed code may not be the best pick for something used commercially.
Security: Has the third party code been thoroughly analyzed for any security vulnerabilities? If it's public-facing, then have there been exploits in the past that have targeted this code? If so, then how quickly have the contributors fixed things (or have they even bothered to issue a patch)?
Ease of use: For example, I may not want to try to use a C++ library in C# code. It's possible, but it's less straightforward than using a C# library.
Bug fixes: Is development ongoing on the third party library? If there's a bug, then how easily can you get it fixed?
Domain knowledge: We can't specialize in everything. Using your example of encryption, I'd strongly discourage attempting to build an encryption library from scratch unless you have an encryption background.
Simplicity: Your use case may be much smaller than what a third party library is built to provide. For example, if you needed to build a Point class to represent an X,Y,Z point, then you could reference a third party graphics library. But if you don't need the ability to do graphics calculations on 3D space, then referencing an entire graphics library might be overkill.
All this said, there are many times using a third party library works (and is the appropriate choice). Using your example, I'd never try to implement an encryption stack on my own -- there's no reason to do so with the plethora of open-source options available.

Recommended way to distribute Halide generated functions?

I am currently experimenting with Halide, the initial tests show quite promising performance improvements.
I am now wondering what the best strategy is for distributing Halide code. Requiring users to install Halide seems like a heavy barrier at this point in time (since there are no automated install options).
One option would be to use compile_to_c, add the generated C code in the repository, and distribute compilation scripts for such C code. scikit-learn uses a similar strategy for Cython generated code. For Halide this seems like a no-go since the generated C code loses all the optimizations, defeating the purpose of Halide.
My current idea would be to use compile_to_bitcode, distribute the generated bitcode together with compilation scripts that call llc to generate the desired machine code. The only requirement for the user would be to have llc (i.e. LLVM) installed.
Does anyone have experience on this issue?
What are the pros and cons of my idea of distributing bitcode?
What would you recommend?
Some details on the kind of software distribution would help. The question implies a source code distribution, but there is a big difference between a library where programmers may need to interact with Halide produced code at a fine-grained level, and an application where use of Halide is largely invisible to the end user and the goal is just to get it to build.
Distributing bitcode is doable but problematic. To be portable, you have to use something like the PNaCl backend. (PNaCl is fairly close to a generic LLVM bitcode representation.) If you target a specific architecture, there is no guarantee the bitcode will compile or run on any other one. (Halide can lower to architecture specific intrinsics for example.) The LLVM community discourages using bitcode as a distribution format, though if it is in source form (.ll, not .bc) it is likely fairly stable and seems not much worse than shipping assembly files in terms of long term stability.
Halide includes an OS specific runtime into the generated output so even with bitcode, the result includes a number of target specific dependencies.
Often one ends up with a design that chooses, at runtime, between one of a number of Halide outputs based on the actual type of processor being used. E.g. using Halide to compile the same algorithm with two different schedules for SSE2 and AVX2 processors. In this model, there are going to be a lot of object files anyway and one can simply choose at build time which ones to include for a given architecture and OS. Distributing the objects as .ll files rather than .o files will likely work, but I'm not sure it buys much.
I would strive to make the full source code available, requiring Halide if one is doing a compilation from the ground up, and look for ways to provide various levels of binary distribution. Certainly for end user software the emphasis should be on how to get the fully built package into the hands of users. For libraries, Halide may be used to surface a higher level programming model to users of the library, in which case the Halide compiler will need to be present anyway.
We strive to make Halide fairly easy to get onto a system and very stable, but have not absolutely nailed either yet. I'd likely try to provide some level of fallback and using the C backend to generate generic C code might be a decent way to do that without rewriting everything in C directly. (If building from source, one gets a choice between installing Halide or using the prebuilt C code.) This is one of the better use cases for the C backend. (Generating C code from Halide is generally a pretty marginal idea despite it seeming to be a good one at first.)
compile_to_c() is definitely not recommended, as the code it generates isn't very optimized; it's useful mostly as a debugging / development tool.
compile_to_bitcode() sounds like it could work, but I'm not aware of anyone using this as a distribution method.
(It would probably be useful to have an automated install available for Halide.)

How to recognize malicious source code? [closed]

BE AWARE! Creating spyware, computer viruses and similar nasties can be illegal where you live and is considered extremely unethical by almost everyone. Still, I need to ask this to raise awareness about how easy it is to create one. I am asking this after the W32/Induc-A was introduced to this world by someone who came up with a nasty way to spread one. So I want to know how a virus can be created so I will be able to recognise them in the future!
Recently a new virus was discovered which spreads itself by replacing the developers' copies of library code. Actually, through the source code of Delphi 4 through 7. What happened is that there's a virus in the wild which searches the computer for a file called SYSCONST.PAS, to which it will add itself as source code. This file happens to be a source file for the runtime libraries of Delphi. (This runtime source code is available for Delphi developers.) As a result, after being infected a programmer would create lots of new versions of this virus without even knowing it. Since virus scanners sometimes generate false positives many developers might thus decide to ignore the warnings of the scanner and maybe they'll even disable their scanner while building their project. To make it worse, their project might even trigger the scanners of their customers so it's likely that those programmers won't check their source code but will just try to fool the scanner somehow. That is, if a virus scanner is even able to recognise the virus, which isn't very likely. Thus, we software developers might be creating viruses without realizing what we're doing!
So, how to create a virus? Simple: get your source code infected by a virus and you're done!
Okay, so the source code of Delphi 4 through 7 might be infected. All Delphi developers, please check your source files! The case is just a proof-of-concept and apparently it can be very successful. Besides, most virus scanners won't check source code but just focus on executables. This virus could stay undetected for quite a while.
This virus was also successful because it misused source code. Delphi is a commercial project and its source code is available. But who is sure that these hackers won't attack open-source projects in similar ways? There are lots of open-source projects out there, and who is going to check them all to make sure they're all behaving decently? And if someone is checking the code, will they be able to recognise malicious code when they see it?
So, to make sure we can recognize malicious source code, I have to ask: How do I create a virus? How do I recognise the code that will create a virus? What is it that most malware will want to do?
There is a bit of discussion about the Delphi runtime source code and whether it is open source or not. Borland has used a dual license for this source code since they started supporting Linux with Kylix. As a result, the source code has a "GPL" symbol declared which indicates whether the libraries are compiled as GPL code or not. As GPL, the source code would be open source. This also happens to be the source version that was attacked by the virus. Anyway, to avoid such discussions, I've asked this question here so we can focus more on the virus problem and less on Delphi. Basically, we're talking about a virus that attacks source code. Technically, all source code could be at risk, but open source code is a likely candidate since hackers know its structure and can target files that are rarely modified, and thus rarely checked. (And if they can hack their way into a CVS system, they could even erase the traces of their modifications, so no one might notice the changes!)
While this does not really answer your question, I think a really interesting paper to read is Reflections on Trusting Trust by Ken Thompson. It raises a fascinating point that even if your source code is free of defects (viruses, trojans, etc.), you might still be producing defective executables if your compiler is defective. And even if you rebuild the compiler from clean source code, you can still have the same problem.
Unless you're building your computer from the ground up with your own microchips, hand-assembling your own BIOS, writing your own operating system, compiler, and software, you have to draw the line somewhere and trust that the hardware and software upon which you're building your systems are correct.
You could check for the Evil Bit on incoming packets... http://en.wikipedia.org/wiki/Evil_bit
If you want to recognize malware, you must know how it works. This means researching malware and acquiring the skill to produce malware.
search for 29A - they wrote papers on viruses
read about rootkits (there are even books about them)
read about reverse engineering
read the source code of malware - there's plenty of it on the web
learn assembler
learn about your OS
reverse-engineer the OS kernel
get ClamAV and check its source
I won't provide links here. They are easily found though.
If you really want to learn, and are willing to put in the time, your time is probably better spent on Google finding, and then participating in, a greyhat community. This topic is highly complex.
If your question is as simple as "what's an easy way to recognize a virus from its source code," well, it probably won't be easy, because there are infinite ways to go about it.
You ask "What is it that most malware will want to do?".
An excellent source for this sort of information is The Hacker Quarterly, which is so mainstream, you may find it at your local bookstore, or you can subscribe online to get it mailed to you.
It was started to help hackers and phreakers share information. It is still very popular with hackers today and is considered by many to be controversial in nature.
Contents of the Current Issue include:
Not The Enemy
Regaining Privacy in a Digital World
The Security-Conscious Uncle
Why the "No-Fly List" is a Fraud
TELECOM INFORMER
Finding Information in the Library of Congress
Hacking the DI-524 Interface
Simple How-to on Wireless and Windows Cracking
If You Can't Stand the Heat, Hack the Computers!
Security: Truth Versus Fiction
Hacking the Beamz
HACKER PERSPECTIVE: Jason Scott
iTunes Stored Credit Card Vulnerability
Zipcar's Information Infrastructure
The How and Why of Hacking the U.N.
Listen to Radio Hackers!
HACKER SPACES - EUROPE
Abusing Metadata
Verizon FIOS Wireless Insecurities
TRANSMISSIONS
Using Network Recon to Solve a Problem
Suing Telemarketers for Fun and Profit
HACKER HAPPENINGS
Plus LETTERS and MARKETPLACE
There is also an excellent series of articles on Hacking at Wikipedia and on Computer Viruses.
... And yes, it is important for programmers to understand how hacking and code breaking works, so they can do the best they can to circumvent it in their programs.
There is no difference between malicious code and an unintentional security bug.
You might as well be asking "How can I write a useful program that has no bugs and is impossible to exploit".
As we all learn in CS, it's impossible even to write a debugger that catches infinite loops, let alone intelligent malevolence.
My advice for security-conscious applications is an ex(p|t)ensive code review and the use of commercially available static analysis software.
