Fortify Scan Engine version effect on results

Scanning 3.5 million lines of code on SCA 4.30 with Scan Engine 6.30.0086 yields vastly different results than SCA 16.11 with Scan Engine 16.11.003.
Is this correct? Similar scans on another, much smaller code base showed no difference across versions, but with a smaller sample size that could be expected.
Does HP release information on what changes between Scan Engine versions? Do scan engine versions affect results by design, or could other factors be influencing the results?

Multiple factors go into the rendering of results, not just the scan engine. Over time the scan engine is improved; this can mean better performance, bug fixes, or new features that allow our Security Research Team (https://community.hpe.com/t5/Security-Research/bg-p/off-by-on-software-security-blog) to create new rules and/or refine existing rules to use those features when identifying potential issues.
Even if the same rulepack is run on two different scan engine versions, any rule that requires features not available in one of them is not run, and thus the results will differ.
You should also look at the warnings of the two scans. Are there any reference, parsing, or memory issues? These should all be handled to make sure you have a good, clean scan (at a bare minimum, check whether you have exactly the same warnings between the two scans). These warnings can also cause a difference in results between scans.
There could also be different settings between the two installs causing the difference (filters, templates, etc.).
But in short: yes, Scan Engine versions can produce different results even on the same code base with the same Rulepack versions. The internal workings of the Scan Engine are proprietary, and the detailed changes are not published.

Related

Reduce DeepLearning4j dependency size of exported jar

In my application, I would like to use Deeplearning4j. Deeplearning4j has over 120 MB of dependencies, which is a lot considering my own code is only 0.5 MB.
Is it possible to reduce the dependencies required? Would loading an already-trained network allow me to ship my application with a smaller file size?
There are many ways to reduce the size of your jar depending on your use case. We cover this in our docs now, but I'll summarize some things to try here:
DL4j is heavily based on javacpp. You can add -Djavacpp.platform=$YOUR_PLATFORM (linux-x86_64, windows-x86_64, ...) to your build to reduce the number of native dependencies pulled in.
If you are using deeplearning4j-core, that includes a lot of extra dependencies you may not need. In that case, you may only need deeplearning4j-nn for the configuration. The same goes if you are using only SameDiff: you do not need the DL4J APIs. I don't know enough about your use case to confirm what you do and don't need, though. (A Maven sketch covering these first two suggestions follows this list.)
If you are deploying on an embedded platform, we also now have the ability to reduce the number of supported operations and data types. This feature is mainly for advanced users right now (it involves building from source), but if you think it could be applicable in addition to the first two suggestions, please do confirm and I can try to clarify that a bit.
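For example, here is a minimal, hypothetical pom.xml sketch combining the first two suggestions. The artifact IDs are real DL4J/ND4J modules, but the exact module set, the ${dl4j.version} placeholder, and the platform classifier all need to be adapted to your project:

```xml
<!-- Hypothetical fragment: swaps deeplearning4j-core for the lighter
     deeplearning4j-nn and pins natives to a single platform. -->
<dependencies>
  <dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-nn</artifactId>
    <version>${dl4j.version}</version>
  </dependency>
  <!-- ND4J CPU backend plus one platform classifier, so native
       binaries for other platforms are not bundled into the jar. -->
  <dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native</artifactId>
    <version>${dl4j.version}</version>
  </dependency>
  <dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native</artifactId>
    <version>${dl4j.version}</version>
    <classifier>linux-x86_64</classifier>
  </dependency>
</dependencies>
```

If you depend on the -platform variants instead (e.g. nd4j-native-platform), passing the -Djavacpp.platform flag mentioned above serves the same trimming purpose.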

Difference between the ROS UniversalRobots/Universal_Robots_ROS_Driver and the ros-industrial/universal_robot driver

I've tried to understand the difference between the two ROS Universal Robot drivers and decide which one to use. So far, I'm mainly confused. As I am new to ROS and robot control I'd appreciate any explanation and hints where to start looking for more details.
From what I've seen, there are two Universal Robot ROS drivers available
(1) https://github.com/ros-industrial/universal_robot
(2) https://github.com/UniversalRobots/Universal_Robots_ROS_Driver
with (1) being "supplemented" by (3) https://github.com/fmauch/universal_robot/tree/calibration_devel.
(1) hasn't been updated in a while, whereas (2) seems to get regular updates. Yet (3) seems to suggest that (1) is still being worked on (see https://github.com/ros-industrial/universal_robot/issues/573 as well). Which one is the "active" one? Which one should be used for which use case?
It also seems to me that I can't use (2) with Gazebo. However, as I said, I'm new to ROS and might be misunderstanding something entirely. What I'd like to do is simulate a UR10e to develop my application in Gazebo, and then swap to the real robot as transparently as possible.
Thanks to all the maintainers of the UR ROS drivers (from both/all repos :))!
The ROS-Industrial driver (1) is the original community-driven driver, started back in 2012. While it is updated sparsely compared to back then, since it has the force and name of ROS-Industrial behind it, the community still actively submits issues and PRs, and they are (slowly) getting merged in as needed.
The Universal Robots driver (2) was started back in 2016 as the company's official ROS support, with help from ROS-Industrial on the "modernization" of the driver. It sees a lot of active development and upkeep, as there are people dedicated to maintaining just this repo and driver.
Both are active, and functionally, both will likely work for what you're trying to do (since what you want seems either simple or already solved). Once a driver is written, if the device doesn't change, the driver doesn't need to either. For example, the older ROS-Industrial repo has a good amount of documentation (possibly dated but still useful) on using the driver with Gazebo, and support for the UR10e (a very common device), so it may well be sufficient for your needs. If you get done what you need to get done, that's fine.
As for the UR official driver, it is (with the ROS-Industrial group's support) the "new" and "modern" driver, extending the functionality of the older one. I suppose I should recommend you use it, but it is still a work in progress. If there are features or limitations in the old driver that block you, they should be fixed in the modern driver. To not date this reply too much: its exact coverage is constantly increasing, and support should be quicker than for the old driver. Eventually, full functionality and more will come through this "modern" branch; for example, ROS 2 support will not be added to the older ROS-Industrial driver.

How mature is the SD Erlang project?

Do you have any experience with the SD Erlang project?
It seems to implement many interesting concepts regarding communication-mesh optimizations, and I'm curious whether any of you have already used them in production, or at least in a real project.
SD Erlang repo
Thanks!
The project finished a week ago. The main ideas behind SD Erlang are reducing the number of connections Erlang nodes maintain while keeping transitivity, and providing a common namespace for groups of nodes. The benchmarks we used (Orbit, Ant Colony Optimization (ACO), and an Instant Messenger) showed very promising results. Unfortunately, we didn't have enough human resources to refactor the Sim-Diasca simulation engine. So, no, SD Erlang hasn't been used in a real application yet.
At the moment we are writing up the final deliverable, which will provide an overview of what has been achieved. It will appear here in a few weeks (D6.2). In general we are happy with the results we got using SD Erlang, so there are plans for a follow-up project to continue this work, but that is currently in progress.
This is not a direct answer, but I will use SD Erlang in an embedded application that needs to scale to hundreds of nodes (small embedded CPUs). From what I have seen, it's ready to be tried out in a real application. To evaluate further, let's consider the alternatives:
You have only a few distributed nodes: then you probably don't need it; just connect all the nodes, and for the name registry use either the global module (slow but sturdy) or gproc with the new locks_leader branch, which avoids the quite broken gen_leader that has so far prevented using gproc in distributed mode in production.
You need many nodes (how many depends on your hardware and requirements, but you start to get into interesting territory with more than 70 nodes). Then you can:
Use SD Erlang and fix whatever problems you encounter in production, or at least report them. It certainly solves a lot of the problems you get with normal Erlang distribution.
Roll your own solution, either by playing with different cookie values or with hidden nodes (hint: you can set different cookie values for different peer nodes). But then you need to roll your own global name registry and management code; this looks like a variant of Greenspun's tenth rule or, closer to Erlang, Virding's first rule: you will probably end up implementing half of SD Erlang yourself.
Don't use Erlang distribution at all. That seems to be the industry standard: for anything involving more nodes or crossing data centers, you shouldn't use Erlang distribution but run your own protocols. My personal opinion is to fix Erlang distribution's problems rather than ditch it; it's much too useful and time-saving, when it works for a use case, to just give up on it. And I see SD Erlang as the fix for the "too many nodes" problem; it's at least the right starting point.

Datamining models in FORTRAN or C (or managed code)?

We are planning to develop a datamining package for windows. The program core / calculation engine will be developed in F# with GUI stuff / DB bindings etc done in C# and F#.
However, we have not yet decided on the model implementations. Since we need high performance, we probably can't use managed code here (any objections?). The question is: is it reasonable to develop the models in FORTRAN, or should we stick to C (or maybe C++)? We are looking into using OpenCL at some point for suitable models - it feels funny having to go from managed code -> FORTRAN -> C -> OpenCL invocation in these situations.
Any recommendations?
F# compiles to the CLR, which has a just-in-time compiler. It's a dialect of ML and strongly typed, allowing all of the nice optimisations that go with static typing; this means you will probably get reasonable performance from F#. For comparison, you could also try porting your code to OCaml (IIRC, that compiles to native code) and see if it makes a material difference.
If it really is too slow, then see how far scaling the hardware will get you. With the performance available from a modern PC or server, it seems unlikely that you would need to go to anything exotic unless you are working with truly Brobdingnagian data sets. Users with smaller data sets may well be fine on an ordinary PC.
Workstations give you perhaps an order of magnitude more capacity than a standard desktop PC. A high-end workstation like an HP Z800 or xw9400 (similar kit is available from several other manufacturers) can take two 4- or 6-core CPUs, tens of gigabytes of RAM (up to 192 GB in some cases), and has various options for high-speed I/O like SAS disks, external disk arrays, or SSDs. This type of hardware is expensive but may be cheaper than a large body of programmer time. Your existing desktop support infrastructure should be able to handle this sort of kit. The most likely problem is compatibility issues running 32-bit software on a 64-bit O/S; in that case you have various options like VMs or KVM switches to work around them.
The next step up is a 4- or 8-socket server. Fairly ordinary Wintel servers go up to 8 sockets (32-48 cores) and perhaps 512 GB of RAM, without having to move off the Wintel platform. This gives you a fairly wide range of options within your platform of choice before you have to go to anything exotic.
Finally, if you can't make it run quickly in F#, validate the F# prototype and build a C implementation using the F# prototype as a control. If that's still not fast enough you've got problems.
If your application can be structured in a way that suits the platform, then you could look at a more exotic platform. Depending on what will work with your application, you might be able to host it on a cluster or cloud provider, or build the core engine on a GPU, Cell processor, or FPGA. However, in doing this you're getting into (quite substantial) additional costs and exotic dependencies that might cause support issues. You will probably also have to bring in a third-party consultant who knows how to program the platform.
After all that, the best advice is: suck it and see. If you're comfortable with F# you should be able to prototype your application fairly quickly. See how fast it runs and don't worry too much about performance until you have some clear indication that it really will be an issue. Remember, Knuth said that premature optimisation is the root of all evil about 97% of the time. Keep a weather eye out for issues and re-evaluate your strategy if you think performance really will cause trouble.
Edit: If you want to make a packaged application then you will probably be more performance-sensitive than otherwise. In this case performance will probably become an issue sooner than it would with a bespoke system. However, this doesn't affect the basic 'suck it and see' principle.
For example, at the risk of starting a game of buzzword bingo, if your application can be parallelized and made to work on a shared-nothing architecture you might see if one of the cloud server providers [ducks] could be induced to host it. An appropriate front-end could be built to run locally or through a browser. However, on this type of architecture the internet connection to the data source becomes a bottleneck. If you have large data sets then uploading these to the service provider becomes a problem. It may be quicker to process a large dataset locally than to upload it through an internet connection.
I would advise not to bother with optimizations yet. First try to get a working prototype, then find out where computation time is spent. You can probably move the biggest bottlenecks out into C or Fortran when and if needed -- then see how much difference it makes.
As they say, often 90% of the computation is spent in 10% of the code.

What is bootstrapping?

I keep seeing "bootstrapping" mentioned in discussions of application development. It seems both widespread and important, but I've yet to come across even a poor explanation of what bootstrapping actually is; rather, it seems as though everyone is just supposed to know what it means. I don't, though. Near as I can figure, it has something to do with initialization tasks required of an application upon launch, but I could be completely wrong about that. Can anyone help me to understand this idea?
"Bootstrapping" comes from the term "pulling yourself up by your own bootstraps." That much you can get from Wikipedia.
In computing, a bootstrap loader is the first piece of code that runs when a machine starts, and is responsible for loading the rest of the operating system. In modern computers it's stored in ROM, but I recall the bootstrap process on the PDP-11, where you would poke bits via the front-panel switches to load a particular disk segment into memory, and then run it. Needless to say, the bootstrap loader is normally pretty small.
"Bootstrapping" is also used as a term for building a system using itself -- or more correctly, a predecessor version. For example, ANTLR version 3 is written using a parser developed in ANTLR version 2.
An example of bootstrapping is in some web frameworks. You call index.php (the bootstrapper), which then loads the framework's helpers, models, and configuration, and then loads the controller and passes control to it.
As you can see, it's a simple file that starts a large process.
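The original example is PHP, but the same idea can be sketched in a few lines of Python; everything here (module layout, names) is hypothetical, just to show the shape of a front-controller bootstrapper:

```python
# Minimal front-controller "bootstrapper" sketch: load shared pieces,
# then resolve and invoke the controller for the requested route.
import importlib

CONTROLLERS_PACKAGE = "app.controllers"  # hypothetical package layout

def bootstrap(route: str):
    # 1. Load configuration, helpers, models here (stubbed in this sketch).
    # 2. Resolve the controller module, e.g. "home" -> app.controllers.home.
    controller = importlib.import_module(f"{CONTROLLERS_PACKAGE}.{route}")
    # 3. Pass control to it and return its response.
    return controller.handle()

# bootstrap("home")  # would dispatch to app.controllers.home.handle()
```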
The term "bootstrapping" usually applies to a situation where a system depends on itself to start, sort of a chicken and egg problem.
For instance:
How do you compile a C compiler written in C?
How do you start an OS initialization process if you don't have the OS running yet?
How do you start a distributed (peer-to-peer) system where the clients depend on their currently known peers to find out about new peers in the system?
In that case, bootstrapping refers to a way of breaking the circular dependency, usually with the help of an external entity, e.g.
You can use another C compiler to compile (bootstrap) your own compiler, and then you can use your own compiler to recompile itself (see the sketch after this list)
You use a separate piece of code that sets up the initial process without depending on any functions provided by the OS
You use a hard-coded list of initial peers or a hard-coded tracker URL that supplies the peer list
etc.
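For the compiler case, the classic sequence looks something like this sketch (the compiler and file names are hypothetical; the point is that each stage builds the next):

```python
# Sketch of compiler bootstrapping: an external compiler breaks the
# circular dependency, then the new compiler rebuilds itself.
import subprocess

def build(compiler: str, source: str, output: str) -> str:
    """Compile `source` with `compiler`, producing binary `output`."""
    subprocess.run([compiler, source, "-o", output], check=True)
    return "./" + output

# Stage 0: some pre-existing C compiler ("cc") compiles our compiler.
stage1 = build("cc", "mycc.c", "mycc-stage1")
# Stage 1: the freshly built compiler recompiles its own source.
stage2 = build(stage1, "mycc.c", "mycc-stage2")
# Stage 2: building once more and comparing the two binaries is a common
# sanity check that the compiler reproduces itself (a fixed point).
stage3 = build(stage2, "mycc.c", "mycc-stage3")
```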
See the Wikipedia article on bootstrapping.
There is a section, with links, explaining what it means in computing; the term has four different uses in the field.
Here are some quotes, but for a more in-depth explanation and alternative meanings, consult the links above.
"...is a technique by which a simple computer program activates a more complicated system of programs."
"A different use of the term bootstrapping is to use a compiler to compile itself, by first writing a small part of a compiler of a new programming language in an existing language to compile more programs of the new compiler written in the new language."
In the context of application development, "bootstrapping" usually comes up when talking about modular and/or auto-updatable software.
Rather than the user downloading the entire app, including features they do not need, and re-downloading and manually updating it whenever there is an update, the user only downloads and starts a small "bootstrap" executable, which in turn downloads and installs the parts of the application the user actually needs. Additionally, the bootstrap component can look for updates and install them each time it is started.
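A toy sketch of such a bootstrap executable's logic (the manifest URL and file layout are made up, and the download/launch steps are stubbed):

```python
# Toy auto-updating bootstrap stub: fetch a manifest, install whatever
# is missing or outdated, then hand off to the real application.
import json
import pathlib
import urllib.request

MANIFEST_URL = "https://example.com/app/manifest.json"  # hypothetical endpoint

def bootstrap():
    manifest = json.load(urllib.request.urlopen(MANIFEST_URL))
    for component, version in manifest["components"].items():
        marker = pathlib.Path(f"{component}.version")
        if not marker.exists() or marker.read_text() != version:
            # download_and_install(component, version)  # stubbed fetch step
            marker.write_text(version)
    # launch_application()  # stubbed: start the now-up-to-date app
```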
Alex, it's pretty much what your computer does when it boots up. ("Booting" a computer actually comes from the word bootstrapping.)
Initially, a small program in your BIOS runs; it contains just enough machine code to load and run a larger, more complex program.
That second program is probably something like NTLDR (in Windows) or LILO (in Linux), which then executes and is able to load, and then run, the rest of the operating system.
For completeness, it is also a rather important (and relatively new) method in statistics that uses resampling / simulation to infer population properties from a sample. It has its own lengthy Wikipedia article on bootstrapping (statistics).
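As a quick illustration of the statistical meaning (a minimal NumPy sketch; the data here are simulated):

```python
# Bootstrap estimate of the sampling distribution of the mean:
# resample the observed data with replacement many times.
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.0, scale=2.0, size=100)  # stand-in "observed" data

boot_means = [
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(10_000)
]
lo, hi = np.percentile(boot_means, [2.5, 97.5])  # 95% bootstrap CI
print(f"mean ~ {sample.mean():.2f}, 95% CI ~ ({lo:.2f}, {hi:.2f})")
```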
Bootstrapping, by the dictionary meaning, is to start up with minimal resources. In the context of an OS, the OS should be able to load swiftly once the Power-On Self-Test (POST) determines it is safe to wake up the CPU. The bootstrap code runs from the BIOS, a small ROM; generally it is a jump instruction to the set of instructions that load the operating system into RAM. The destination of the jump is the boot sector on the hard disk. The BIOS program then checks that it is a valid boot sector containing the starting address of the stored OS, i.e., a valid MBR (Master Boot Record). If it is a valid MBR, the OS is copied into memory (RAM), and from there on the OS takes care of memory and process management.
Since the question has already been answered, here is something for web development.
I came this far and found a good explanation of bootstrapping in the Laravel docs. Here is the link:
"In general, we mean registering things, including registering service container bindings, event listeners, middleware, and even routes."
I hope it helps someone learning web application development.
Bootstrapping has yet another meaning in the context of reinforcement learning that may be useful to know for developers, in addition to its use in software development (most answers here, e.g. by kdgregory) and its use in statistics as discussed by Dirk Eddelbuettel.
From Sutton and Barto:
Widrow, Gupta, and Maitra (1973) modified the Least-Mean-Square (LMS) algorithm of Widrow and Hoff (1960) to produce a reinforcement learning rule that could learn from success and failure signals instead of from training examples. They called this form of learning "selective bootstrap adaptation" and described it as "learning with a critic" instead of "learning with a teacher." They analyzed this rule and showed how it could learn to play blackjack. This was an isolated foray into reinforcement learning by Widrow, whose contributions to supervised learning were much more influential.
The book describes the various reinforcement learning algorithms whose target values are based on previous approximations as bootstrap methods:
"Finally, we note one last special property of DP [Dynamic Programming] methods. All of them update estimates of the values of states based on estimates of the values of successor states. That is, they update estimates on the basis of other estimates. We call this general idea bootstrapping. Many reinforcement learning methods perform bootstrapping, even those that do not require, as DP requires, a complete and accurate model of the environment."
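To make that concrete, here is a minimal tabular TD(0) update (an illustrative sketch, not code from the book): the target r + γV(s') is built from the current estimate V(s'), which is exactly the "estimates from other estimates" idea.

```python
# Tabular TD(0): the update target bootstraps off the current value
# estimate of the successor state instead of a full observed return.
from collections import defaultdict

def td0_update(V, s, r, s_next, alpha=0.1, gamma=0.99):
    target = r + gamma * V[s_next]   # uses another estimate: bootstrapping
    V[s] += alpha * (target - V[s])  # move V[s] toward that target

V = defaultdict(float)
td0_update(V, "A", 1.0, "B")  # one hypothetical transition: A -(r=1)-> B
print(V["A"])                 # 0.1 * (1.0 + 0.99 * 0.0 - 0.0) = 0.1
```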
Note that this differs from both bootstrap aggregating and the intelligence explosion mentioned on the Wikipedia page on bootstrapping.
I belong to the generation that flipped switches to enter a boot program. In the early 1980s, I worked on a microcomputer called the Micro-78, developed by Electronics Corporation of India Ltd (ECIL). It was a sort of clone of the Altair 8800. I distinctly remember what happened when a small boot program was entered using the toggle switches and executed by pressing a button. The program read a second boot program contained in the first track of the floppy disk and overwrote itself with it, in such a way that the second boot program started executing to load a disk operating system. I think the term "bootstrap" refers to this process of the first boot program reading the second and overwriting itself with it, in a way "pulling itself up" with the additional functionality of the second boot program. That may be the origin of the original meaning of "the bootstrap program".
IMHO there is no better explanation than the story of how the first compiler was written.
These days, loading the operating system is the process most commonly referred to as bootstrapping.
With regard to the popular Twitter Bootstrap, I feel this type of bootstrapping is the act of integrating a modular component into a web application without the web application having to even acknowledge the modular component exists until it needs or references it.
The developer can seamlessly integrate a default copy of the Twitter Bootstrap CSS theme by simply loading (referencing) it in the web application. Voilà! You may then need to override some of the defaults, but you can do so in such a way that the resource/component is untouched and completely reusable.
This same concept is how web devs implement jQuery APIs and so on, but it's not really described by devs as bootstrapping per se. It improves flexibility and reusability while allowing the different components/resources of an app to reside in isolation, either on the same server(s) or possibly on a CDN.
NOTE: In computing, bootstrapping deals with the MBR, and in UNIX it requires a special boot loader or manager, which is a small program in ROM that loads the OS into RAM. If you think about it, the same concept takes place in the bootstrap loader checking the MBR and loading the OS based on that table, which occurs without the OS having any idea that it takes place.
A bootstrap file is responsible for loading the contents of the main file. It is a wrapper around the main file, so that we can catch errors if loading the file fails for some reason.
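A minimal sketch of that pattern (the module name "main" and its run() entry point are hypothetical):

```python
# Bootstrap wrapper: load the real main module, catching load/run errors
# so failures are reported cleanly instead of crashing opaquely.
import importlib
import sys
import traceback

try:
    main = importlib.import_module("main")  # hypothetical main module
    sys.exit(main.run())
except Exception:
    traceback.print_exc()
    sys.exit("bootstrap: failed to load or run the main module")
```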
As a humble beginner in the world of programming, flicking through all the answers here after seeing this word used in apparently slightly different ways in different places, I found reading the Wikipedia page on bootstrapping (duh! I didn't think of it at first either) very informative for understanding the differences in how this word is used. Could it be that, on extremely rare occasions, Wikipedia might even have better explanations of certain terms than... (redacted)? Will they bring in rep points on Wikipedia, though?
To me, all the meanings seem to have something to do with this: start with something as simple as possible, Thing1; make something slightly more complex with it, Thing2; and now you can use Thing2 to do some kinds of tasks more efficiently and quickly than you could with Thing1. Then repeat from Thing2 to Thing3, ad infinitum...
I see it as closely connected to both biological evolution and "layers of abstraction" (newbies like me: see, ahem, Wikipedia, cough): the evolution from 1940s computers with switches, to machine code, Assembly, C, Python, and on to AIs you can give all kinds of complex instructions to, like "make the %4^% dinner to my default &^$% requirements and clean the floor" in drunken slang English, without them "raising an exception" (for newbies again... you guessed it).
Then, in certain specific software meanings:
Meaning 1: Thing1 is used to load the latest version of Thing2 (because, of course, Thing2 will be bigger than Thing1, just as Thing3 will be bigger than Thing2).
Meaning 2: Thing1 is a lower-level language (closer to 1001011100...011001 than to print("Hello, ", user.name)) used to write a little bit of the higher-level language Thing2. That little bit of Thing2 is then used to expand Thing2 itself from baby vocabulary level towards adult vocabulary level: Thing2 starts to be processed, or to use the correct technical term, compiled, by the baby version of itself (it's a clever baby!), whereas the baby version of Thing2 could of course only be compiled by Thing1, since it can't exist before it exists, right? Then the child version of Thing2 compiles the surly-teenager version of Thing2, at which point the programming community decides whether the surly teenager's "issues" (software term and metaphor!) are worth spending enough time resolving for it to be accepted long term, or whether to abandon it (not sure where to take the analogy here).
If yes, then Thing2 has "bootstrapped" itself (possibly a few times) from babyhood to adulthood: "the child is the father of the man" (Wordsworth; suggest you don't try looking up the quote or the author on Stack Overflow).

Resources