How can I convert security logs to a Knowledge Graph?

Given a set of logs from AWS, GCP and Azure, plus system/service logs, I would like to convert them into a knowledge graph.
My research turned up a few partial (and fairly old) solutions for similar tasks: Slogert, LEKG and VloGraph.
Are there any best practices for doing this? Any known repo or solution?
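For illustration, here is a minimal sketch of the usual parse-then-map approach, assuming CloudTrail-style JSON events and using rdflib. The SEC namespace, property names and file paths are placeholders I made up for the sketch, not something defined by any of the tools above.

```python
import json
from rdflib import Graph, Literal, Namespace, RDF, URIRef

# Hypothetical namespace for this sketch; any URI scheme of your own would do.
SEC = Namespace("https://example.org/seclog#")

def cloudtrail_event_to_triples(graph: Graph, event: dict) -> None:
    """Map one CloudTrail-style event onto nodes and edges of a knowledge graph."""
    event_node = URIRef(SEC[f"event/{event['eventID']}"])
    actor_node = URIRef(SEC[f"identity/{event['userIdentity']['arn']}"])
    ip_node = URIRef(SEC[f"ip/{event['sourceIPAddress']}"])

    graph.add((event_node, RDF.type, SEC.Event))
    graph.add((event_node, SEC.eventName, Literal(event["eventName"])))
    graph.add((event_node, SEC.eventTime, Literal(event["eventTime"])))
    graph.add((event_node, SEC.performedBy, actor_node))
    graph.add((event_node, SEC.originatedFrom, ip_node))

g = Graph()
with open("cloudtrail.json") as fh:              # assumed input file name
    for record in json.load(fh)["Records"]:
        cloudtrail_event_to_triples(g, record)

g.serialize("security-logs.ttl", format="turtle")  # load into an RDF store later
```

Once events from different sources are mapped onto shared node types (identities, IP addresses, resources), those entities become the join points between log sources, which is essentially what the tools you listed automate.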

Related

How do I create sample security issues on Docker?

I'm trying to create an assignment for students that contains the following:
A Docker image with issues that have to be scanned and remediated (using an open-source scanner in Kubernetes).
(Maybe) A sample attack scenario that can exploit those vulnerabilities.
The problem arises when I try to find or create a suitable vulnerable image. I cannot find a catalogue of security issues at all. I have racked my brain for a suitable Google search phrase, but everything leads merely to blog posts about how to scan an image.
I expected a database that lists common security issues and what causes them, ideally with some way to discern which are the most prevalent.
Do you know of such a source?
Alternatively, could you offer 3-4 common security issues that are good to know and educational for a first brush with Docker, and explain how to create those issues?
The whole situation would probably have been easier if I were an expert in the field, but designing this is itself my assignment as a student. (As students we design assignments for each other.)
It sounds like you are looking for container security hardening and Kubernetes security options.
You can use tools like:
kubesec - Security risk analysis for Kubernetes resources
checkov - Prevent cloud misconfigurations and find vulnerabilities during build-time in infrastructure as code, container images and open-source packages
Trivy - vulnerabilities, misconfigurations, secrets, SBOM in containers, Kubernetes, code repositories, clouds and more
If you are looking for questions you can set, here is a CKS (Certified Kubernetes Security Specialist) exam-style question:
There are a number of pods/containers running in the "spectacle" namespace.
Identify and delete the pods which have CRITICAL vulnerabilities.
This is where the open-source tool Trivy comes into the picture: it scans the image you will be using in your Kubernetes deployment or Docker container (a small automation sketch follows after the link below).
trivy image --severity CRITICAL nginx:1.16 (the image running in the container)
Here is a list of questions you can build a lab out of: https://github.com/moabukar/CKS-Exercises-Certified-Kubernetes-Security-Specialist/tree/main/7-mock-exam-questions
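As a rough illustration of automating that check, here is a Python sketch that shells out to Trivy and counts CRITICAL findings. It assumes Trivy's JSON output format; the exact field names can vary slightly between Trivy versions, so treat this as a starting point rather than a finished grader.

```python
import json
import subprocess

IMAGE = "nginx:1.16"  # the image from the example above

# Ask Trivy for machine-readable output; --severity CRITICAL keeps only what we care about.
result = subprocess.run(
    ["trivy", "image", "--severity", "CRITICAL", "--format", "json", IMAGE],
    capture_output=True, text=True, check=True,
)

report = json.loads(result.stdout)
critical = [
    vuln["VulnerabilityID"]
    for target in report.get("Results", [])
    for vuln in target.get("Vulnerabilities", []) or []
]

print(f"{IMAGE}: {len(critical)} CRITICAL findings")
# A non-zero exit code signals that the pod using this image should be deleted/rebuilt.
raise SystemExit(1 if critical else 0)
```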

Azure CI container per customer

I have a monolithic application based on .NET; the application itself is a web-based app.
I have been reading multiple articles and trying to figure out whether Azure Container Instances (ACI) or a similar service would be the correct one to use.
The application will run 24/7, and I guess this is where the confusion comes in: would it be normal to have an always-on application running on ACI?
What I am trying to achieve is a container per customer, where each customer gets one or more instances that they own. The other question is cost and scalability: I would expect to have thousands of containers, so perhaps I should be looking at Kubernetes?
Thanks.
Here is my understanding. I'm pretty new to both ACI and Kubernetes, so treat this as a suggestion and not a definitive answer 🙂.
Azure Container Instances is a quick, easy and cheap way to run a single instance of a container in Azure. However, it doesn't scale very well on its own (it can scale up, but not out, and not automatically), and it lacks many of the container-orchestration features that Kubernetes offers.
Kubernetes offers a lot more, such as zero-downtime deployments, scaling out with multiple replicas, and many other features. It is also a lot more complex, costs more, and takes much longer to set up.
I think ACI is a bit too simple to meet your use case.
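To make the "container per customer" idea concrete, here is a minimal sketch using the official Kubernetes Python client to create one Deployment per customer. The namespace, labels, image name and environment variable are illustrative assumptions, not something from the original question.

```python
from kubernetes import client, config

def deploy_customer_instance(customer_id: str, image: str = "myapp:latest") -> None:
    """Create one Deployment (one or more replicas) dedicated to a single customer."""
    config.load_kube_config()  # or load_incluster_config() when running inside the cluster
    apps = client.AppsV1Api()

    deployment = client.V1Deployment(
        metadata=client.V1ObjectMeta(
            name=f"app-{customer_id}",
            labels={"customer": customer_id},
        ),
        spec=client.V1DeploymentSpec(
            replicas=1,  # scale per customer as needed
            selector=client.V1LabelSelector(match_labels={"customer": customer_id}),
            template=client.V1PodTemplateSpec(
                metadata=client.V1ObjectMeta(labels={"customer": customer_id}),
                spec=client.V1PodSpec(
                    containers=[
                        client.V1Container(
                            name="app",
                            image=image,
                            env=[client.V1EnvVar(name="CUSTOMER_ID", value=customer_id)],
                        )
                    ]
                ),
            ),
        ),
    )
    apps.create_namespaced_deployment(namespace="customers", body=deployment)

deploy_customer_instance("acme")
```

With thousands of customers you would likely template this (Helm or similar) rather than call the API directly, but the per-customer Deployment is the core idea.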

Automating Speed Tests on Merging Pull Requests

I am trying to track the page speed of certain URLs of my project on each merge of a pull request in GitHub, and to output the results as an HTML report or a JSON file. On the CI side, I am going to use Jenkins. I have no prior knowledge of performance testing. I want to know the best approach to automate the speed test, integrate it with Jenkins, and output the result.
Researching over the internet, I noted a few possibilities that could achieve this goal:
Installing the "Page Speed Insights (psi)" Node package, creating a script that uses psi to fetch the speed of certain pages, and generating test reports for use with Jenkins. (Referred to this link by Oxagile)
Performance testing using JMeter and integrating it with Jenkins.
Performance analysis using Lighthouse. (Referred to this link by Timo Stollenwerk)
Choosing the right approach is very important, so I would be very grateful if anyone could suggest different approaches, and the right one to use in my case (with examples if possible).
Thank you in advance.
After quite a bit of research, I found that sitespeed.io is the best solution for achieving this goal. It is a complete web performance tool that helps you measure the performance of a website. It is well suited to running in continuous integration to find web performance regressions on commits, and to monitoring production and alerting on regressions.
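If you want something smaller to start with, here is a hedged sketch that queries Google's PageSpeed Insights v5 REST API (the same service behind the psi Node package) from a Jenkins build step and writes a JSON report. The URLs, threshold and file name are illustrative, and heavy use of the API may require an API key.

```python
import json
import sys
import requests

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
URLS = ["https://example.com/", "https://example.com/checkout"]  # pages to track
MIN_SCORE = 0.80  # fail the build below this Lighthouse performance score

def page_speed(url: str) -> float:
    resp = requests.get(PSI_ENDPOINT, params={"url": url, "strategy": "mobile"}, timeout=120)
    resp.raise_for_status()
    # The Lighthouse performance score is reported between 0 and 1.
    return resp.json()["lighthouseResult"]["categories"]["performance"]["score"]

results = {url: page_speed(url) for url in URLS}

# Write a JSON report that Jenkins can archive as a build artifact.
with open("speed-report.json", "w") as fh:
    json.dump(results, fh, indent=2)

# A non-zero exit marks the Jenkins stage as failed on a regression.
sys.exit(0 if all(score >= MIN_SCORE for score in results.values()) else 1)
```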

Is there any available panel to store machine learning models and their config files in a structured manner?

Saving different models with their corresponding config files, tracking the results and parameters, searching among them using customized filters, and perhaps always having a pointer to the current SOTA model could be quite time-saving.
I couldn't even find something similar to TensorFlow Hub for a local server. Right now, the closest I could get is Git LFS.
Is there anything better out there?
I found the answer. A few open-source projects are trying to do this job. The first one is named Data Science Version Control, or DVC (a minimal retrieval sketch follows after the list below), which according to the docs:
simple command line Git-like experience. Does not require installing and maintaining any databases. Does not depend on any proprietary online services;
It manages and versions datasets and machine learning models. Data is saved in S3, Google cloud, Azure, Alibaba cloud, SSH server, HDFS or even local HDD RAID;
It makes projects reproducible and shareable, it helps to answer the question: "how the model was built";
It helps manage experiments with Git tags or branches and metrics tracking;
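As a minimal sketch of what retrieval looks like, DVC also exposes a Python API; the repository URL, file path and revision below are placeholders.

```python
import pickle
import dvc.api

# Repo URL, file path and revision are placeholders for this sketch.
with dvc.api.open(
    "models/model.pkl",
    repo="https://github.com/your-org/your-dvc-repo",
    rev="v1.2",          # a Git tag or branch marking the experiment
    mode="rb",
) as fh:
    model = pickle.load(fh)
```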
The other possible solution to consider is MinIO, which is an object storage server
suited for storing unstructured data such as photos, videos, log files, backups and container / VM images.
Microsoft Azure has a service called Azure Machine Learning service that does exactly this, but goes much further with governance, model explainability, DevOps, etc. We also include free tiers for a lot of services, and recently announced unlimited private repos on GitHub.

Some questions about Docker Image

I am very new to Docker and am trying to wrap my head around the concept, and struggling a little bit. I have not created any image yet, but my team is moving to Docker and I have some very fundamental questions. Let me start with what I understand:
I can create an image of my application which can consist of an OS version, web server configuration and application binaries.
However, what I do not understand is that there are far more things involved in an n-tier application, and I have a lot of questions I am struggling to find answers to. I just wanted to post some of them here and see if they can be clarified.
As I mentioned above, an n-tier application involves far more than my binaries and web server settings. If I have multiple tiers (sets of binaries) for my application, say one for services and one for the MVC client, do I need an image per tier?
What happens to .config files? One thing that confused me a lot is the claim that you can use the same image for testing and prod. Then something has to be different across these environments, right? Would that something be the config files? If yes, why is it not mentioned anywhere?
What happens to the DB? Do we spin up another image for the DB?
I hope I am not very far off in my assumptions.
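One common pattern, sketched below with the Docker SDK for Python, is one image per tier, configuration injected through environment variables (so the same image runs in test and prod), and the database as its own container. The image names, network and variables are illustrative only, not taken from the question.

```python
import docker

client = docker.from_env()
env_name = "test"  # the same images run in "test" and "prod"; only the settings differ

# A dedicated network so the tiers can reach each other by container name.
client.networks.create(f"app-{env_name}", driver="bridge")

# Each tier is its own image and its own container.
db = client.containers.run(
    "postgres:15",                       # the database is just another container
    environment={"POSTGRES_PASSWORD": "example"},
    name=f"db-{env_name}", network=f"app-{env_name}", detach=True,
)
services = client.containers.run(
    "myapp-services:latest",             # service tier image (illustrative name)
    environment={"DB_HOST": f"db-{env_name}", "ASPNETCORE_ENVIRONMENT": env_name},
    name=f"services-{env_name}", network=f"app-{env_name}", detach=True,
)
web = client.containers.run(
    "myapp-mvc:latest",                  # MVC client tier image (illustrative name)
    environment={"SERVICES_URL": f"http://services-{env_name}", "ASPNETCORE_ENVIRONMENT": env_name},
    name=f"web-{env_name}", network=f"app-{env_name}", detach=True,
)
```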
