Rolling own code instead of using libraries, avoiding the common approach - libraries

I have seen a plethora of projects roll their own things instead of using well tested libraries.
In some other instances I have seen people re-implement Elliptic Curves and Random Number Generators, refusing to use tested libraries, because their code is "better".
Why do people do this, choose to spend their time implementing something instead of using something that has been already done, tested and deployed in a plethora of systems?
For example, the Signal Android messenger app has the whole, full copy of OpenSSL embedded into itself for encryption. Ref
Why not use BouncyCastle or java.security.*?
Is it a ego thing? Is it a trust thing, ie. they don't trust libraries?

It can be for a host of different reasons.
Build vs. buy (or use by reference) should come down to a thorough analysis. That said, many folks get into programming because they like building things. Sometimes it's rewarding to build your own code (even when a third party library exists).
That said, I'll try to list some reasons why you might not want to use third party libraries:
Licensing: Does the third party library licensing conflict or restrict your intended usage of your code? For example, GPL-licensed code may not be the best pick for something used commercially.
Security: Has the third party code been thoroughly analyzed for any security vulnerabilities? If it's public-facing, then have there been exploits in the past that have targeted this code? If so, then how quickly have the contributors fixed things (or have they even bothered to issue a patch).
Ease of use: For example, I may not want to try to use a C++ library in C# code. It's possible, but it's less straightforward than using a C# library.
Bug fixes: Is development ongoing on the third party library? If there's a bug, then how easily can you get it fixed?
Domain knowledge: We can't specialize in everything. Using your example of encryption, I'd strongly discourage attempting to build an encryption library from scratch unless you have an encryption background.
Simplicity: Your use case may be much smaller than what a third party library is built to provide. For example, if you needed to build a Point class to represent an X,Y,Z point, then you could reference a third party graphics library. But if you don't need the ability to do graphics calculations on 3D space, then referencing an entire graphics library might be overkill.
All this said, there are many times using a third party library works (and is the appropriate choice). Using your example, I'd never try to implement an encryption stack on my own -- there's no reason to do so with the plethora of open-source options available.

Related

Upgrading Encryption Algos for Swift Dependencies

I am using the open source MobSF security framework to scan my Swift project's source code and its dependences for vulnerabilities. Most things look pretty good however I'm concerned that it is showing me that encryption algorithms (MD5, SHA1) in my dependencies are not sufficiently secure.
What would be standard practice for solving this? I made sure to pull the latest branches for most of these but they seem to insist on using outdated algos. I am reluctant to go in and have to change their source code only to have it wiped out each time I rebuild the Podfile.
First, it depends on why they're using these algorithms. For certain uses, there are no security problems with MD5 or SHA-1, and they may be necessary for compatibility with existing standards or backward compatibility.
As an example, PBKDF2 is perfectly secure using SHA-1 as its hash. It doesn't require a very strong hash function to maintain its own security. It's even secure using MD5. Switching to SHA-2 with PBKDF2 doesn't improve security, it's just "security hygiene," which is "avoid algorithms that have known problems even being in your code, even if they cause no problems in your particular use case." Security hygiene is a good practice, but it's not the same thing as security.
For other use cases, the security of the hash function is critical. If a framework is authenticating arbitrary messages using MD5, that's completely broken. Don't take this answer to suggest that algorithms don't matter. They do! But not in every use case. And if you want to decode credit card swipe transactions, you're probably going to need DES to be in your code, which is horribly broken, but you're still going to need it because that's how magnetic stripes are encrypted. It doesn't make your framework "insecure."
When you say "but they seem to insist on using outdated algos," I assume you mean you opened a PR and they rejected it, in which case I assume they have a good reason (such as backward compatibility when there is no actual security problem). If you haven't, then obviously the first step would be to open a PR.
That said, if you want to change this because you feel there is an actual security problem that they will not resolve, or purely for hygiene, then with CocoaPods you would fork the project, modify it, and point to your own version using the source attribute to the pod keyword.
Maintaining a cryptography framework myself, I often get bug reports that are simply wrong from developers using these scanners. Make sure that you know what the scanner is telling you and how to evaluate the findings. False positives are extremely common with these. These tools are useful, but you need to have some expertise to read their reports.

Electron with C++ backend - secure?

I have written a UI in Electron and I would like to connect it with my C++ code. However, I will be selling this product and so I would like to know if this makes it easier for people to crack my C++ code? Obviously I know compiled C++ can be cracked anyway, but does this affect it in any way?
Additionally, what is the best way to go about this while preserving maximum possible security?
Thanks.
EDIT: How about this? Is it possible to use c++ as back-end for Electron.js?
EDIT2: To clarify, my Electron app will be showing the status of operations being performed in the C++ program. As such, I will need to send lists, dictionaries, strings etc. from C++ to JS which will then render it. Additionally, buttons on my Electron app need to trigger actions in the C++ code, such as stopping or starting certain parts of the program.
I have written a UI in Electron and I would like to connect it with my C++ code ...
I would like to know if this makes it easier for people to crack my C++ code?
Using electron does not make any meaningful difference for protecting the C++ source code. (Your intellectual property)
The Javascript code running in electron will be very easy to reverse engineer though, which gives users a head start on experimenting with your C++ binary. Using minification and obfuscation tools can at least make that harder.
For the C++ side, connecting C++ to Electron can be done in at least these two ways:
By dynamically linking to a shared library (Node.js C++ Addons)
In this case your C++ API would be functions that get exported by the shared library. There are many tools to inspect shared libraries (DLLs) and view these functions.
By communicating with another process using some sort of Inter-process communication.
In this case your API would depend on the IPC method used. If it was TCP/UDP messages you could use Wireshark to inspect the packets between the processes. There are ways to inspect messages going over any type of IPC.
Either way, your application must be delivered to the end-user with a compiled binary. Preventing reverse engineering of the binary itself is impossible if you actually give the binary to your users.
You should also expect that a savvy end-user will have access to other tools that can inspect the API and implement third-party code that talks to that API.
Additionally, what is the best way to go about this while preserving maximum possible security?
By "maximum possible security", I will assume you are referring to preventing unauthorized use of the C++ code with other applications.
You would need a licensing system that can authenticate the application that is using your C++ binary's API. Explaining what that would be exactly is probably too large of an answer for a Stack Overflow, and you will have to do some research on how licensing systems are implemented.
It may be theoretically impossible to develop a perfect licensing system though. Look at the gaming industry, it takes just a matter of days to for the licensing software become circumvented for every new game that is released. The only software architecture that cracks haven't completely conquered are cloud-based applications, which don't actually deliver compiled code with their business logic to the end-user's computer.

iOS bitcode - security concerns

We are distributing a software module for iOS online. Since, Apple is advocating bitcode and even making it mandatory for apps on some devices (watchOS/tvOS) - forcing us to deliver this software module (static library) with bitcode.
The concern is how secure bitcode is from anyone to reverse engineer and decompile (like java bytecode) and how to protect against it? It is easy for anyone to download libraries from the website and extract bitcode (IR) from it and decompile. Some valuable information on it here
https://lowlevelbits.org/bitcode-demystified/
Bitcode maynot be concern for apps as apple will strip it but definitely appears to be a concern for static libraries.
Any insights?
As the link notes "malefactor can obtain your app or library, retrieve the [code] from it and steal your ‘secret algorithm.’" Yep. Totally true.
Also, if you ship non-bitcode libraries, a "malefactor can obtain your app or library, retrieve the [code] from it and steal your ‘secret algorithm.’"
Also, if you ship non-bitcode apps, a "malefactor can obtain your app or library, retrieve the [code] from it and steal your ‘secret algorithm.’"
There is no situation where this is not true. Tools as cheap as Hopper (my tool of choice, but there are also some cheaper solutions) and elaborate as IDA can decompile your functions into passable C code.
If you're working with Cocoa (ObjC or Swift), you have made it even easier to reverse engineer because it's so easy to dynamically introspect Cocoa.
This is not a solvable problem. Both apps and libraries can try to employ obfuscation techniques, but they are complex, fragile, and typically require significant expense or expertise (and often both). In any case, you will need to continually improve your obfuscation as people break it. This is fairly pointless for a library, since there's very little you could re-protect once it leaks, but you can try.
It will leak. That's not solvable. Bitcode doesn't change a whole lot about that. It might be somewhat simpler to read IR than ARM assembly, but not that much, and certainly not if the thing you're protecting is small (like a small algorithm or a key).
There are some obfuscation vendors out there. Product recommendations are off-topic for Stack Overflow (because they attract spam), but search for "ios obfuscation" and you'll find them. In this space, since it's just "tricky hiding" (not security or encryption) you generally get what you pay for. Open source solutions make little sense, since the whole point is to be tricky and hide how you're doing it. I've worked with some open source obfuscation libraries that make it easier to extract secrets from your code (because they're trivial to reverse, and their use marks the parts of the code where you're hiding things).
If this is important to your business plan, then budget for that, and expect it to introduce some challenging bugs, and expect it to be broken anyway (but maybe take longer).
#Rob Napier, you are wrong in comparing orange to apple. Reading assembly code or disassembling with IDA is world apart compared to reading code generated by decompiling intermediate code. Bitcode is totally a nuisance

iOS lock Cocoapod/Framework

I have to develop library for 3rd party. This library have to be secure so that only parties that have credentials can use it.
I am going in a direction that I will provide API key that 3rd party app will have to enter it in order for library to work.
Is there any possibility that I do some sort of locking of a Cocoapod? Or is Framework better solution for this kind of problem?
Does anyone have any other solution/suggestion?
It would be better if you elaborated a bit on what exactly you are trying to achieve and what type of task this library performs, maybe give some examples. If I understand you correctly, you need to prevent usage of library by those who did not pay for the service etc.
The key approach will be OK if the library is a client for some webservice. In this case, you should have API keys anyway to protect the API itself, so the client library will just forward this key to the webservice. This approach is widely used in lots of client libraries.
If the library does work only locally (for example, it performs some science-heavy computation / computer vision / etc), then you can just give out the compiled library and license to those who have already paid. You can protect it with a key of course, but it is not too useful, as the key will likely be validated locally, therefore it can easily be compromised or reverse engineered. So the only good way will be to distribute the library to those who purchased / requested trial, and force upon them a license which will restrict the library's usage.
EDIT
If by "Cocoapod" you mean "distributing as source code" and by "Framework" you mean "distributing as a binary", then it depends on what exactly you do in the library. If it is just connecting to the endpoint and marshalling data (e.g. parsing), you can just distribute the source version, as there is no "know-how" to that. On the other hand, if there is something business-related and specific done besides contacting the API, use closed source distribution (binary).
Source distribution has a benefit that you don't have to recompile it if new target architectures appear. It is also easier to distribute via CocoaPods, and your library users will like it more (for a variety of reasons).

Recommended way to distribute Halide generated functions?

I am currently experimenting with Halide, the initial tests show quite promising performance improvements.
I am now wondering about what is the best strategy to distribute Halide code. Requiring users to install Halide seems like a heavy barrier at this point in time (since there are no automated install options).
One option would be to use compile_to_c, add the generated C code in the repository, and distribute compilation scripts for such C code. scikit-learn uses a similar strategy for Cython generated code. For Halide this seems like a no-go since the generated C code loses all the optimizations, defeating the purpose of Halide.
My current idea would be to use
compile_to_bitcode, distribute the generated bitcode together with compilation scripts that call llc to generate the desired machine code. The only requirement for the user would be to have llc (i.e. llvm) installed.
Does anyone have experience on this issue?
What are the pro and cons of my idea of distributing bitcode?
What would you recommend?
Some details on the kind of software distribution would help. The question implies a source code distribution, but there is a big difference between a library where programmers may need to interact with Halide produced code at a fine-grained level, and an application where use of Halide is largely invisible to the end user and the goal is just to get it to build.
Distributing bitcode is doable but problematic. To be portable, you have to use something like the PNaCl backend. (PNaCl is fairly close to a generic LLVM bitcode representation.) If you target a specific architecture, there is no guarantee the bitcode will compile or run on any other one. (Halide can lower to architecture specific intrinsics for example.) The LLVM community discourages using bitcode as a distribution format, though if it is in source form (.ll, not .bc) it is likely fairly stable and seems not much worse than shipping assembly files in terms of long term stability.
Halide includes an OS specific runtime into the generated output so even with bitcode, the result includes a number of target specific dependencies.
Often one ends up with a design that chooses, at runtime, between one of a number of Halide outputs based on the actual type of processor being used. E.g. using Halide to compile the same algorithm with two different schedules for SSE2 and AVX2 processors. In this model, there are going to be a lot of object files anyway and one can simply choose at build time which ones to include for a given architecture and OS. Distributing the objects as .ll files rather than .o files will likely work, but I'm not sure it buys much.
I would strive to make the full source code available, requiring Halide if one is doing a compilation from the ground up, and look for ways to provide various levels of binary distribution. Certainly for end user software the emphasis should be on how to get the fully built package into the hands of users. For libraries, Halide may be used to surface a higher level programming model to users of the library, in which case the Halide compiler will need to be present anyway.
We strive to make Halide fairly easy to get onto a system and very stable, but have not absolutely nailed either yet. I'd likely try to provide some level of fallback and using the C backend to generate generic C code might be a decent way to do that without rewriting everything in C directly. (If building from source, one gets a choice between installing Halide or using the prebuilt C code.) This is one of the better use cases for the C backend. (Generating C code from Halide is generally a pretty marginal idea despite it seeming to be a good one at first.)
compile_to_c() is definitely not recommended, as the code it generates isn't very optimized; it's useful mostly as a debugging / development tool.
compile_to_bitcode() sounds like it could work, but I'm not aware of anyone using this as a distribution method.
(It would probably be useful to have an automated install available for Halide.)

Resources