I am writing a commercial application which will have license keys which will be checked and validated server side. I would like to restrict the number of computers that the application can be installed on (i.e. 1 copy only). IP addresses can be unreliable for this scenario. Is there any unique identifier between computers on all operating systems?
You can read the MAC address or the machine's UUID and, to make the check more robust, identify the computer by a combination of both.
If you can read the UUID, that alone may be enough to identify a unique computer, even across different operating systems.
Since you didn't tag this question with a language, there are several possible ways to read those values.
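As a minimal sketch of the "mix of both" idea, assuming Python, the snippet below hashes the MAC address together with an OS-level machine ID. The helper names `read_machine_id` and `machine_fingerprint` are made up for illustration, and reading `/etc/machine-id` is an assumption that only holds on systemd-style Linux systems; other platforms would need their own lookup.

```python
import hashlib
import uuid
from pathlib import Path


def read_machine_id() -> str:
    """Return an OS-level machine ID if one is readable, else an empty string.

    Assumption: /etc/machine-id exists (typical on systemd-based Linux).
    Other operating systems need a different source, which is not shown here.
    """
    try:
        return Path("/etc/machine-id").read_text().strip()
    except OSError:
        return ""


def machine_fingerprint() -> str:
    """Hash the MAC address and the machine ID into one opaque identifier."""
    # uuid.getnode() returns the hardware (MAC) address as a 48-bit integer.
    # If no MAC can be determined it falls back to a random number, so the
    # value is a hint, not a guarantee.
    mac = f"{uuid.getnode():012x}"
    material = f"{mac}|{read_machine_id()}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()


if __name__ == "__main__":
    print(machine_fingerprint())
```

The server would store this fingerprint next to the license key on first activation and compare it on later checks. Both inputs can be changed by the machine's owner, so this raises the effort needed to reuse a key rather than preventing it.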
Related
We all know situations when you cannot go open source and freely distribute software - and I am in one of these situations.
I have an app that consists of a number of binaries (compiled from C sources) and Python code that wraps it all into a system. This app used to work as a cloud solution so users had access to app functions via network but no chance to touch the actual server where binaries and code are stored.
Now we want to deliver the "local" version of our system. The app will be running on PCs that our users will physically own. We know that everything could be broken, but at least want to protect the app from possible copying and reverse-engineering as much as possible.
I know that Docker is a wonderful deployment tool so I wonder: is it possible to create encrypted Docker containers where no one can see any data stored in the container's filesystem? Is there a known solution to this problem?
Also, maybe there are well known solutions not based on Docker?
The root user on the host machine (where the docker daemon runs) has full access to all the processes running on the host. That means the person who controls the host machine can always get access to the RAM of the application as well as the file system. That makes it impossible to hide a key for decrypting the file system or protecting RAM from debugging.
Using obfuscation on a standard Linux box, you can make it harder to read the file system and RAM, but you can't make it impossible, because then the container could not run either.
If you can control the hardware running the operating system, then you might want to look at the Trusted Platform Module which starts system verification as soon as the system boots. You could then theoretically do things before the root user has access to the system to hide keys and strongly encrypt file systems. Even then, given physical access to the machine, a determined attacker can always get the decrypted data.
What you are asking about is called obfuscation. It has nothing to do with Docker and is a very language-specific problem; for data you can always do whatever mangling you want, but while you can hope to discourage the attacker it will never be secure. Even state-of-the-art encryption schemes can't help since the program (which you provide) has to contain the key.
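To illustrate why encryption does not help when the program itself has to contain the key, here is a toy sketch in Python. The XOR "cipher", the key value, and the function names are all hypothetical; the only point is that the shipped program must carry both the key and the decryption routine.

```python
from itertools import cycle

# Hypothetical key: it has to ship inside the program so the program can decrypt.
EMBEDDED_KEY = b"not-really-secret"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (toy cipher, for illustration only)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def ship_asset(plaintext: bytes) -> bytes:
    """What the vendor does before distribution."""
    return xor_bytes(plaintext, EMBEDDED_KEY)


def load_asset(blob: bytes) -> bytes:
    """What the shipped program does at run time."""
    return xor_bytes(blob, EMBEDDED_KEY)


if __name__ == "__main__":
    blob = ship_asset(b"proprietary data")
    # Whoever owns the machine can read EMBEDDED_KEY from the file or from
    # memory, or simply call load_asset() the same way the program does.
    print(load_asset(blob))
```

Swapping the toy cipher for a strong one changes nothing about the argument: the key still has to be present wherever the program runs.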
C is usually hard enough to reverse engineer; for Python you can try pyobfuscate and similar tools.
For data, I found this question (keywords: encrypting files game).
If you want a completely secure solution, you're searching for the 'holy grail' of confidentiality: homomorphic encryption. In short, you want to encrypt your application and data, send them to a PC, and have this PC run them without its owner, OS, or anyone else being able to snoop on the data.
Doing so without a massive performance penalty is an active research area. At least one project has managed this, but it still has limitations:
1. It's Windows-only.
2. The CPU has access to the key (i.e., you have to trust Intel).
3. It's optimised for cloud scenarios. If you want to install this on multiple PCs, you need to provide the key in a secure way (i.e., just go there and type it yourself) to one of the PCs on which you're going to install your application, and this PC should be able to securely propagate the key to the other PCs.
Andy's suggestion on using the TPM has similar implications to points 2 and 3.
Sounds like Docker is not the right tool, because it was never intended to be used as a full-blown sandbox (at least based on what I've been reading). Why aren't you using a more full-blown VirtualBox approach? At least then you're able to lock up the virtual machine behind logins (as much as a physical installation on someone else's computer can be locked up) and run it isolated, with encrypted filesystems and the whole nine yards.
You can either go lightweight and open, or fat and closed. I don't know that there's a "lightweight and closed" option.
I have exactly the same problem. What I have been able to find so far is below.
A. Asylo(https://asylo.dev)
Asylo requires programs/algorithms to be written in C++.
The Asylo library is integrated with Docker, and it seems feasible to create a custom Docker image based on Asylo.
Asylo depends on several less common technologies such as Protocol Buffers and Bazel. To me it seems that the learning curve will be steep, i.e. the person creating the Docker images (and programs) will need a lot of time to understand how to do it.
Asylo is free of charge
Asylo is brand new, with all the advantages and disadvantages that brings.
Asylo is produced by Google but it is NOT an officially supported Google product according to the disclaimer on its page.
Asylo promises that data in the trusted environment can be protected even from a user with root privileges. However, documentation is lacking, and currently it is not clear how this could be implemented.
B. Scone(https://sconedocs.github.io)
It is tied to Intel SGX technology, but there is also a simulation mode (for development).
It is not free. Only a small set of its functionality is available without paying.
It seems to support a lot of security functionality.
It is easy to use.
They seem to have more documentation and instructions on how to build your own Docker image with their technology.
For the Python part, you might consider using PyInstaller. With the appropriate options, it can pack your whole Python app into a single executable file, which will not require a Python installation to be run by end users. It effectively runs a Python interpreter on the packaged code, but it has a cipher option which allows you to encrypt the bytecode.
Yes, the key will be somewhere around the executable, and a very savvy customer might have the means to extract it and so recover the code, which even then is not very readable. It's up to you to know if your code contains some big secret you need to hide at all costs. I would probably not do it if I wanted to charge big money for fixing any bug in the deployed product. I could use it if the client has good compliance standards and is not a potential competitor, nor is expected to pay for more licenses.
While I've done this once, I honestly would avoid doing it again.
Regarding the C code, if you can compile it into executables and/or shared libraries, these can be included in the executable generated by PyInstaller.
My product needs to be deployed and installed on the client's server. How can I make sure that my Ruby code is encrypted and invisible?
You can't. In order to execute the code, the CPU needs to understand the code. CPUs are much stupider than humans, so if the CPU can understand the code, then so can a human.
There are only two possibilities:
Don't give your client the code. (The "Google" model.) Instead, give them a service that runs your code under your control.
Give your client a sealed box. (The "XBox" model.) Give your client the code, pre-installed on a hardened, tamper-proof, secure computer under your control, running hardened, tamper-proof, secure firmware under your control, and a hardened, tamper-proof, secure OS under your control. Note that this is non-trivial: Microsoft employed some of the most brilliant hardware security, information security, and cryptography experts on the planet, and they still made a mistake that made the XBox easy to crack.
Unfortunately, you have excluded both those possibilities, so the answer is: you can't.
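The claim that anything the machine can execute can also be read by a person can be illustrated with a short sketch. It uses Python rather than Ruby, and the license check and its hard-coded secret are hypothetical. The standard `dis` module lists the compiled bytecode, and the code object still carries the constant in plain form.

```python
import dis
import io


def check_license(key: str) -> bool:
    """Hypothetical license check with a hard-coded secret."""
    return key == "HYPOTHETICAL-SECRET-1234"


def recover_constants(func) -> tuple:
    """Return the constants stored in a function's compiled code object."""
    return func.__code__.co_consts


def disassemble(func) -> str:
    """Return a human-readable listing of the function's bytecode."""
    buffer = io.StringIO()
    dis.dis(func, file=buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(recover_constants(check_license))
    print(disassemble(check_license))
```

Native machine code takes more effort to read than bytecode, but the same reasoning applies: the instructions and constants must be present in a form the processor can use.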
Note, however, that copying your code is illegal. So, if you don't do business with criminals, then it may not even be necessary to protect your code.
Here are some examples how other companies solve this problem:
Have a good relationship with your clients. People are less likely to steal from friends they like than from strangers they don't know, or people they actively dislike.
Make the product so good that clients want to pay.
Make the product so cheap that clients have no incentive to copy the code.
Offer additional values that you cannot get by copying the code, e.g. support, services, maintenance, training, customization, and consulting.
Especially in the corporate world, clients often prefer to pay, simply for having someone to sue in case something goes wrong. (You can see this as a special case of the last point.)
Note that copy protection schemes are not free. You at least have to integrate it into your product, which takes developer time and resources. And this assumes that the protection scheme itself is gratis, which is typically not the case. These are either pretty expensive, or you have to develop your own (which is also pretty expensive because experienced cryptographers and infosec specialists are not cheap, and cheap cryptographers and infosec specialists will not be able to create a secure system.)
This in turn increases the price of your product, which makes it more likely that someone can't afford it and will copy it.
Also, I have never seen a copy protection scheme that works. There's always something wrong with them. The hardware dongle is only available with an interface the client doesn't have. (For example, when computers stopped having serial and parallel ports in favor of USB, a lot of copy protection schemes still required serial or parallel ports and didn't work with USB-to-serial or USB-to-parallel adapters.) Or, the client uses a VM, so there is no hardware to plug the dongle into. Or, the copy protection scheme requires Internet access, but that is not available. Or, the driver of the dongle crashes the client's machine. Or, the license key contains characters that can't easily be typed on the client's keyboard. Or, the copy protection scheme has a bug that doesn't allow non-ASCII characters, but you are using the client's name as part of the key. Or, the manufacturer of the copy protection scheme changes the format of the dongle to an incompatible one without telling you, and without changing the type number, or the color and physical form of the dongle, so you don't notice.
Note that none of this is hypothetical: all of these have happened to me as a user. Several of these happened at vendors I know.
This means that a significant amount of resources will be needed in your support department to deal with those problems, which increases the cost of your product even further. It also decreases client satisfaction when they have problems with your product. (Again, I know some companies that use copy protection and get a significant number of support tickets because of that.)
There are industries where it is quite common that people buy the product, but then use a cracked version anyway because the copyright protection schemes are so bad that the risk of losing your data due to a cracked version from an untrusted source is lower than losing your data due to the badly implemented copyright protection scheme.
There is a company that is very successful, and very loved by its users that does not use any copy protection in a market where everybody uses copy protection. This is how they do it:
Because they don't have to invest development resources into copy protection, their products are at least as good as their competition's for less development effort.
Because they don't have to invest development resources into copy protection, their products are cheaper than their competition's.
Because their products are not burdened with the overhead of copy protection, they are more stable and more efficient than their competition's.
They have fair pricing, based on income levels in their target countries, meaning they charge lower prices in poorer countries. This makes it less likely that someone copies their product because they can't afford it.
A single license can be used on as many machines as you like, both Windows and macOS.
There is a no-questions-asked, full-refund return policy.
The lead-developer and the lead-designer personally respond to every single support issue, feature request, and enhancement suggestion.
Of course, they know full well that people abuse their return policy. They buy the product, use it for a project, then give it back. But, they have received messages from people saying "Hey, I copied your software and used it in a project. During this project, I realized how awesome your software is, here's your money, and here's something extra as an apology. Also, I showed it to my friends and colleagues, and they all bought a copy!"
Another example are switch manufacturers. Most of them have very strict license enforcement. However, one of them goes a different route: there is one version of the firmware, and it always has all features enabled. Nothing bad will happen if you use features that you haven't paid for. However, when you need support, they will compare your config to your account, and then say "Hey, we noticed that you are using some features you haven't paid for. We are sure that this is an honest mistake on your part, so we will help you this once, but please don't forget to send us a purchase order as soon as possible, thanks!"
Guess which manufacturer I prefer to work with, and prefer to recommend?
I wish to create a platform as a service in the financial markets using Erlang/Elixir. I will provide AWS lambda-style functions in financial markets, but rather than being accessible via web/rest/http, I plan to distribute my own ARM-based hardware terminals to clients (Nvidia Jetson TX2-based or similar, so decent hardware). They will access the functions from these terminals. I want said terminals to be full nodes in the system. So they will use the actor model to message pass to my central servers, and indeed, the terminals might message pass amongst each other if terminal users decide to put their own functions online.
Is this a viable model? Could I run 1000 terminals like this? 100 000? What kinds of limitations might I start bumping into? Is Erlang message routing scalable enough to imagine such a network still being performant if we had soft real-time financial markets streaming data flowing around? (Mostly from central servers to terminals, but a good proportion possibly moving directly from terminal to terminal.) We could have a system where up to 100k or more different "subscription" data channel processes were available, many of them taking input and producing output every second.
Basically I'd like a canonical guide to the scalability capabilities of an Erlang system something like the above. Ideally I'd also like some guide to the security implications of such a system, i.e. would global routing tables or any other part of the system be compromisable by a rogue terminal user, or can edge nodes be partly "sealed off" from sensitive parts of the rest of the Erlang network?
Note that I'd want to make heavy use of ports/NIFs for high-compute processes.
I would not pursue this avenue for various reasons, all of which hark back to the sort of systems that Erlang's distribution mechanism was developed for - a set of boards on a passive backplane: "free" local bandwidth and the whole machine sits in the same security domain. The Erlang distribution protocol is probably too chatty to work well on widely spread and large networks, and it is certainly too insecure. Unless you want nodes to be able to execute :os.cmd("rm -rf /") on each other, of course.
Use the Erlang distribution protocol in your central system to your heart's content, and have these terminals talk something that's data-only-over-SSL to that system and each other. On top of that, you can quite simply build a sort of overlay network to do whatever you want.
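A minimal sketch of what "data-only" could mean at the boundary between a terminal and the central system, written in Python for brevity rather than Erlang/Elixir. The message types, field names, and size limit are assumptions made up for this example, and the TLS transport itself is not shown. The idea is that whatever arrives is parsed and checked against an allow-list, and is never evaluated as code.

```python
import json

# Hypothetical message types a terminal may send; anything else is rejected.
ALLOWED_TYPES = {"subscribe", "unsubscribe", "quote_request"}
MAX_MESSAGE_BYTES = 4096  # assumed limit, chosen only for illustration


class RejectedMessage(ValueError):
    """Raised when a message is not plain, well-formed data."""


def parse_terminal_message(raw: bytes) -> dict:
    """Validate one JSON message received from a terminal.

    The payload is treated purely as data: it is parsed, checked against an
    allow-list, and returned as a small dict. Nothing in it is executed.
    """
    if len(raw) > MAX_MESSAGE_BYTES:
        raise RejectedMessage("message too large")
    try:
        message = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError, RecursionError) as exc:
        raise RejectedMessage("not valid JSON") from exc
    if not isinstance(message, dict):
        raise RejectedMessage("top-level value must be an object")
    msg_type = message.get("type")
    if not isinstance(msg_type, str) or msg_type not in ALLOWED_TYPES:
        raise RejectedMessage("unknown message type")
    channel = message.get("channel")
    if not isinstance(channel, str):
        raise RejectedMessage("channel must be a string")
    # Return only the fields we understand, dropping everything else.
    return {"type": msg_type, "channel": channel}
```

With a boundary like this, a rogue terminal can at worst send well-formed requests of the permitted kinds, which is a much smaller surface than a full node that can run arbitrary functions on its peers.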
I recommend reading this carefully, and I also recommend dividing your service into small microservices.
Another benchmark is Investigating the Scalability Limits of Distributed Erlang.
In Joe Armstrong's book Programming Erlang, he wrote:
"A few years ago, when I had my research hat on, I was working with PlanetLab. I had access to the PlanetLab network, so I installed empty Erlang servers on all the PlanetLab machines (about 450 of them).
I didn’t really know what I would do with the machines, so I just set up the server infrastructure to do something later."
Do not use external ports; use internal drivers written in C or C++ instead.
You will find a lot of information regarding Erlang architectures in this answer: How scalable is distributed Erlang?
The short answer is that there is a practical limit on the number of nodes in a cluster, but this limit can be overcome fairly easily with federations.
EDIT 1/ Furthermore, I would recommend reading this book: Designing for Scalability with Erlang/OTP
As I understand it, ACPI defines a generic hardware programming model where the operating system relies on AML (ACPI Machine Language) code provided by the OEM firmware to manipulate the hardware.
In order to execute the AML code, the operating system has to incorporate an AML interpreter.
So, it looks to me that firmware developers use AML to provide a control interface between platform hardware and operating system.
But do we really need AML?
I think ultimately the hardware can only be configured through the native instructions of the platform. So the AML interpreter must translate the AML into native instructions, otherwise it cannot be executed on the platform.
But what's the point of using an intermediate language like AML? I understand that AML is said to be platform-independent, which means I can use AML to describe my platform in a non-native way.
But the AML is part of the platform firmware in practice. And the entire firmware has already been built into the target platform's native instructions. So what good does it do to make such a small part of the firmware platform-independent? Why not just use native instructions? There must be some way to let the OS use them as well. That way the operating system wouldn't need the AML interpreter at all, and a lot of complexity could be avoided.
One of the big goals of ACPI over its predecessor APM was to give the OS more visibility and control over power state transitions.
APM was a black box. The OS knew nothing about the power management implementation. It would just call a BIOS function and the BIOS handled all of the magic. Did it work? Did the system sleep properly? Did the system freeze? Was a user application able to handle the BIOS implementation? The sad truth was that many systems had power management that was downright broken, and Microsoft wanted to provide a better power management experience for the growing laptop industry.
Now, the BIOS hands the ASL/AML code over to the OS, and the OS executes it, not the BIOS. If the BIOS code does something dumb (like messing with registers it shouldn't), Windows can detect that by parsing the code and block it. AML is 100% decompilable, unlike C.
Remember that ACPI is not x86 specific. At the time it was developed, Itanium and Xscale were around. Intel and Microsoft needed a language that would work on all platforms, both 32 and 64 bit.
Lastly, ASL is more than just a list of executable functions. It is also a number of static configuration tables. The ASL code has tables to define the non-PnP hardware built onto your motherboard. It has tables of supported power states. A traditional programming language like C isn't really set up for that.
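To make the "static tables" point concrete, here is a small Python sketch that decodes the fixed 36-byte header found at the start of ACPI system description tables such as the DSDT, and verifies the table checksum. The field layout follows the ACPI specification; the class and function names are made up for this example.

```python
import struct
from typing import NamedTuple

# 36-byte header: signature, length, revision, checksum, OEM ID,
# OEM table ID, OEM revision, creator ID, creator revision (little-endian).
HEADER_FORMAT = "<4sIBB6s8sI4sI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class AcpiHeader(NamedTuple):
    signature: str
    length: int
    revision: int
    checksum: int
    oem_id: str
    oem_table_id: str
    oem_revision: int
    creator_id: str
    creator_revision: int


def _text(raw: bytes) -> str:
    """Decode a fixed-width ASCII field, dropping padding."""
    return raw.decode("ascii", errors="replace").rstrip("\x00 ")


def parse_header(table: bytes) -> AcpiHeader:
    """Decode the fixed header at the start of an ACPI table."""
    sig, length, rev, csum, oem, oem_tbl, oem_rev, creator, creator_rev = (
        struct.unpack_from(HEADER_FORMAT, table)
    )
    return AcpiHeader(_text(sig), length, rev, csum, _text(oem),
                      _text(oem_tbl), oem_rev, _text(creator), creator_rev)


def checksum_ok(table: bytes) -> bool:
    """A table is valid when all of its bytes sum to zero modulo 256."""
    return sum(table) % 256 == 0
```

On Linux the raw tables are usually exposed under `/sys/firmware/acpi/tables/` (reading them typically requires root), so `parse_header(open("/sys/firmware/acpi/tables/DSDT", "rb").read())` would be one way to try this, assuming that path exists on the machine. Everything after the header in the DSDT is AML bytecode, which is what the OS interpreter consumes.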
If ACPI were invented today, they would probably use something like XML to provide the info to the OS.
Originally, hardware for "80x86 PC" was cloned from IBM's PC, and this created an effective de-facto standard for hardware to follow. However it didn't take long before manufacturers wanted to add features that didn't previously exist, where there was no (official or de-facto) standard to follow.
This led to a major problem for operating system software (how do you support "non-standard chaos"). Some standards were created for some things (APM, etc) but they didn't really cover everything needed and became out-of-date. ACPI was created to fix this.
Ideally, what was (and still is) needed is standards that allow operating system to detect and use supported features of the motherboard. For example, a "standardised case temperature and fan control" device (with support for detecting how many fans, temperature sensors, etc), or a "standardised CPU speed/power consumption", a "PCI slot IRQ routing for IO APICs" standard, a "hot-plug PCI controller device" standard, etc.
However, ACPI didn't provide useful standards that hardware manufacturers and operating systems can use. Instead, ACPI provided an over-engineered mess (AML) to allow an OS to cope with ACPI's failure to standardise the hardware.
Essentially; we "need" AML now because it's the only viable way for an OS to work-around the "non-standard chaos" problem that ACPI failed to fix.
The problem with providing native code instead of AML is that different operating systems use CPUs in different ways (e.g. native 64-bit 80x86 code in firmware would be useless for an older "32-bit only" OS). AML provides portability between different types of CPUs and between the same CPU/s in different modes.
Also; native code is considered a major security problem (rootkits, etc); and people tend to think an interpreted language mitigates that problem. Of course in practice AML needs far too much access to the underlying hardware and does it in a way that an OS can't check, and there isn't even a way for an OS to determine if the AML has been maliciously modified before the OS booted. For these reasons AML is still a major security problem despite using an interpreted language.
I am a complete newbie in device drivers, so I hope my question is appropriate here, but I need to develop a driver to control some equipment. I was thinking of using Linux as the host OS, but I'm not sure if it is such a good idea. I've heard some horror stories about the mess of developing device drivers under Linux. Is there a better alternative in the *nix world? Or should I maybe check other OSes?
Linux documentation is basically non-existent (similar to other platforms). However, there are a few books which do cover enough information to get started, and the trickier kernel bits can be borrowed from other drivers (yay for Open Source).
However, it is one of the easiest current platforms to develop drivers for. There are cleaner models, such as QNX, but that product is sadly near the end (and doesn't support 1/10th as much hardware as Linux).
What type of device is the driver targeting? Many times you can avoid writing in-kernel drivers (for instance, by using libusb in userspace, or the userspace I/O framework).