How to fully manage the ML lifecycle in Git for reproducible machine learning

All components of the end-to-end machine learning lifecycle, including data preparation steps, data cleaning, model training code, models, and more, need to be stored and version controlled in Git.
Can you please share the steps to do this?

Azure Machine Learning Pipelines cover all of these stages. Further along the deployment and scale maturity curve, use Kubeflow + MLflow to manage job scheduling; both are open source, and platform and framework agnostic. Azure MLOps provides comprehensive ML lifecycle management. Useful starting points (a minimal pipeline sketch follows the links):
MLOps Happy Path: an end-to-end MLOps solution
Azure Pipelines + ML CLI: an example Azure DevOps pipeline that uses the ML CLI
Kubeflow Labs: https://github.com/Azure/kubeflow-labs
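Here is a minimal sketch of such a pipeline, assuming the Azure ML Python SDK (v1), a workspace config.json checked into the repo, and hypothetical script names and a compute target name ("cpu-cluster") used purely for illustration:

# Two-step Azure ML pipeline: data preparation followed by training.
from azureml.core import Workspace, Experiment
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()  # reads the config.json committed to the repo

prep_step = PythonScriptStep(
    name="prepare_data",
    script_name="prepare.py",      # assumed script under ./src
    source_directory="src",
    compute_target="cpu-cluster",  # assumed compute target name
)

train_step = PythonScriptStep(
    name="train_model",
    script_name="train.py",        # assumed script under ./src
    source_directory="src",
    compute_target="cpu-cluster",
)
train_step.run_after(prep_step)    # train only after preparation completes

pipeline = Pipeline(workspace=ws, steps=[prep_step, train_step])
Experiment(ws, "git-tracked-training").submit(pipeline)

Because the scripts, pipeline definition, and environment specs all live in the repo, a Git commit captures the full lineage of each run.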

Our team does this by:
1. Using the SDK and Azure ML Pipelines to encapsulate all code and artifact creation within the pipeline control plane.
2. Using pygit2 to coordinate artifact registration with feature branch names (see the sketch below).
The Azure ML MLOps team also has this repo, which provides more info on step 1.
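Here is a minimal sketch of step 2, assuming the Azure ML Python SDK (v1) and that the code runs inside a local clone; the model path and model name are placeholders:

# Tag a registered model with the current Git branch and commit so the
# artifact can be traced back to the exact source revision that produced it.
import pygit2
from azureml.core import Workspace, Model

repo = pygit2.Repository(".")
branch = repo.head.shorthand    # e.g. "feature/new-preprocessing"
commit = str(repo.head.target)  # full commit SHA

ws = Workspace.from_config()
Model.register(
    workspace=ws,
    model_path="outputs/model.pkl",  # assumed artifact path
    model_name="my-model",           # assumed model name
    tags={"git_branch": branch, "git_commit": commit},
)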

Related

AWS SageMaker ML DevOps tooling / architecture - Kubeflow?

I'm tasked with defining AWS tools for ML development at a medium-sized company. Assume about a dozen ML engineers plus other DevOps staff familiar with serverless (Lambdas and the framework). The main questions are: a) what is an architecture that allows for the main tasks related to ML development (creating, training, and fitting models, data pre-processing, hyperparameter optimization, job management, wrapping serverless services, gathering model metrics, etc.), b) what are the main tools that can be used for packaging and deploying things, and c) what are the development tools (IDEs, SDKs, 'frameworks') used for it?
I just want to set Jupyter notebooks aside for a second. Jupyter notebooks are great for proofs of concept and the closest thing to PowerPoint for management... but I have a problem with notebooks when thinking about deployable units of code.
My intuition points to a preliminary target architecture with 5 parts:
1. A 'core' with ML models supporting basic model operations (create blank, create pre-trained, train, test/fit, etc.). I foresee core Python scripts here - no problem.
2. (optional) A 'containerized-set-of-things' that performs hyperparameter optimization and/or model versioning.
3. A 'contained-unit-of-Python-scripts-around-models' that exposes an API, does job management, and incorporates data pre-processing. This also reads from and writes to S3 buckets.
4. A 'serverless layer' with a high-level API (in Python). It talks to #3 and/or #1 above.
5. Some container or bundling thing that will unpack files from Git and deploy them onto various AWS services, creating things from the previous points.
As you can see, my terms are rather fuzzy :) If someone can be specific with terms, that will be helpful.
My intuition and my preliminary readings say that the answer will likely include a local IDE like PyCharm or Anaconda, or a cloud-based IDE (what can these be? - don't mention notebooks please).
The point that I'm not really clear about is #5. Candidates include Amazon SageMaker Components for Kubeflow Pipelines and/or the AWS Step Functions Data Science SDK for SageMaker. It's unclear to me how they can perform #5, however. Kubeflow looks very interesting, but does it have enough adoption, or will it die in 2 years? Are Amazon SageMaker Components for Kubeflow Pipelines and the AWS Step Functions Data Science SDK for SageMaker mutually exclusive? How can each of them help with 'containerizing things' and with basic provisioning and deployment tasks?
It's a long question, but these things all make sense when you're designing ML infrastructure for production. There are three levels that define the maturity of your machine learning process:
1. CI/CD: the Docker image goes through stages such as build and test, and the versioned training image is pushed to the registry. You can also perform training in these stages and store versioned models using Git references.
2. Continuous training: here we deal with the ML pipeline itself - automating retraining of models on new data. This becomes very useful when you have to run the whole ML pipeline with new data or a new implementation. Tools for implementation: Kubeflow Pipelines, SageMaker, Nuclio.
3. Continuous delivery: where to? In the cloud or on the edge? In the cloud, you can use KFServing, or combine SageMaker with Kubeflow Pipelines and deploy the model through SageMaker from Kubeflow.
SageMaker and Kubeflow provide broadly similar functionality, but each has its own strengths. Kubeflow brings the power of Kubernetes, pipelines, portability, caching, and artifacts, while SageMaker brings managed infrastructure, scale-from-zero capability, and AWS ML services such as Athena or Ground Truth.
Solution:
Kubeflow Pipelines (standalone) + AWS SageMaker (training + model serving) + Lambda to trigger pipelines from S3 or Kinesis (a sketch of the Lambda trigger is at the end of this answer).
Infra required:
- Kubernetes cluster (at least one m5 instance)
- MinIO or S3
- Container registry
- SageMaker credentials
- MySQL or RDS
- Load balancer
- Ingress for using the Kubeflow SDK
Again, you've asked about my year-long journey in one question. If you are interested, let's connect :)
Permissions:
Kube --> registry (Read)
Kube --> S3 (Read, Write)
Kube --> RDS (Read, Write)
Lambda --> S3 (Read)
Lambda --> Kube (API Access)
SageMaker --> S3, registry
Good starting guides:
https://github.com/kubeflow/pipelines/tree/master/manifests/kustomize/env/aws
https://aws.amazon.com/blogs/machine-learning/introducing-amazon-sagemaker-components-for-kubeflow-pipelines/
https://github.com/shashankprasanna/kubeflow-pipelines-sagemaker-examples/blob/master/kfp-sagemaker-custom-container.ipynb
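Here is a minimal sketch of the Lambda trigger, assuming the KFP SDK is packaged with the function; the KFP host URL, the compiled pipeline file name, and the pipeline parameter name are placeholders for illustration:

# Start a run of a compiled Kubeflow pipeline whenever a new object lands in S3.
import kfp

KFP_HOST = "http://<istio-ingress-or-loadbalancer>/pipeline"  # assumed endpoint

def handler(event, context):
    record = event["Records"][0]["s3"]
    data_uri = f"s3://{record['bucket']['name']}/{record['object']['key']}"

    client = kfp.Client(host=KFP_HOST)
    client.create_run_from_pipeline_package(
        "training_pipeline.yaml",            # compiled pipeline bundled with the Lambda
        arguments={"input_data": data_uri},  # assumed pipeline parameter
    )
    return {"status": "pipeline started", "input": data_uri}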

How can I make an API after compiling and running my pipeline on Kubeflow?

I built a pipeline which takes an image and returns the number of persons in it. I want to build an API that takes an image and returns a JSON file with the count, using Kubeflow.
There are a few ways that you can deploy a model for inference from a pipeline (a sketch of the first option follows this list):
You can use Kubeflow components like KFServing and the KFServing Deployer component for Kubeflow Pipelines.
If you are using a cloud provider, they may have services you can use for inference. For example, there is a component that deploys trained models to Google Cloud AI Platform.
Or, you could build a custom solution.
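Here is a sketch of the first option, adding a KFServing deployment step to a Kubeflow pipeline. The component URL, its input names, and the model URI are assumptions for illustration; check the component's YAML in the kubeflow/pipelines repo for the exact inputs:

# Pipeline that deploys a trained model as a KFServing InferenceService.
import kfp
from kfp import dsl, components

kfserving_op = components.load_component_from_url(
    "https://raw.githubusercontent.com/kubeflow/pipelines/master/"
    "components/kubeflow/kfserving/component.yaml"  # assumed component location
)

@dsl.pipeline(name="person-counter-serving")
def serving_pipeline(model_uri: str = "s3://my-bucket/models/person-counter"):
    kfserving_op(
        action="apply",                 # assumed parameter names, see component YAML
        model_name="person-counter",
        model_uri=model_uri,
        framework="tensorflow",
    )

if __name__ == "__main__":
    kfp.compiler.Compiler().compile(serving_pipeline, "serving_pipeline.yaml")

Once the InferenceService is up, it exposes an HTTP prediction endpoint: your API can accept the image, forward it to that endpoint, and return the count as JSON.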

What is the difference between jenkins and cloudbees jenkins?

I could not find the difference between these two. Are they the same or different?
The first difference is support (as others have mentioned). CloudBees offers enterprise grade support as well as a fully vetted and tested version of Jenkins that will be more stable under various plugins and deployments. You can actually purchase "Support Only" from CloudBees if you are satisfied with your OSS Jenkins deployment and simply want support during upgrades, patching, break/fix, etc.
From a feature perspective, CloudBees brings a lot from an enterprise manageability, scalability, and security standpoint.
Manageability: CloudBees comes with CJOC (CloudBees Jenkins Operations Center) built into the software. This is a single pane of glass management console that allows organizations or large teams to centrally manage the Jenkins environment. Things like folders, RBAC, pipeline and master templates, and the ability to rapidly spin up or tear down a containerized Jenkins master are all managed from this single console.
Scalability: CloudBees leverages Kubernetes to provide organizations with the ability to elastically scale Jenkins environments as needed. With CloudBees, your organization can move away from a single "monolithic"/"Frankenstein" master and into a multi-master and distributed pipeline architecture. This greatly reduces upgrade and administration complexity. It also eliminates the risk from a single point of failure that a monolithic architecture exposes.
Security: CloudBees allows organizations to install Role-Based Access Control within Jenkins. This keeps users from accidentally or intentionally accessing repos that they shouldn't be allowed to interact with. CloudBees also provides "folders" to segregate specific job executions onto specific agents. Lastly, CloudBees allows organizations to create pipeline templates and associated plugins for each team. These templates can be as rigid or loose as desired per the organization's security policies.
CloudBees is regularly adding enhancements to further differentiate themselves from Jenkins Open Source and make themselves more appealing to large enterprise requirements.
On top of the above, CloudBees has developed a presentation layer that rides on top of Jenkins for SDLC pipeline, CD monitoring, and metric tracking called DevOptics.
Jenkins is open source, while CloudBees Jenkins Enterprise is a commercial extension of open-source Jenkins. Go here for an up-to-date comparison table.

Roles and responsibilities in Devops

In a DevOps context, who is responsible for the automation tasks?
More exactly, in the case of "pipeline as code" in Jenkins, who is supposed to do this task - the developer or the operator?
Who is the actor?
"The key to DevOps is greater collaboration between engineering and operations."
Role: DevOps
Responsibilities:
1. Management: The DevOps engineer ensures compliance with standards by monitoring the enterprise software and online websites. The engineer also regulates tools and processes in the engineering department and catalyses their simultaneous enhancement and evolution.
2. Design and development: Design and development of the enterprise infrastructure and its architecture is one of the major responsibilities that DevOps engineers are tasked with. Such engineers are highly skilled coders, which enables them to script tools aimed at enhancing developer productivity.
3. Collaboration and support: The DevOps modus operandi is to collaborate extensively and yield results in all aspects of the work. Everything ranging from technical analyses to deployment and monitoring is handled, with a focus on enhancing overall system reliability and scalability.
4. Knowledge: DevOps staff and engineers aid in promoting knowledge sharing and the overall DevOps culture throughout the engineering department.
5. Versatile duties: DevOps staff and engineers also take on work delegated by the IT director, CTO, DevOps head, and others. They may also perform duties similar to those of the designations mentioned above.
Standard definition:
DevOps is an IT mindset that encourages communication, collaboration, integration and automation among software developers and IT operations in order to improve the speed and quality of delivering software.
Layman's definition:
Any kind of automation that enables the opportunity for smoother development, operations, support and delivery of the product is DevOps.
Industry's view:
There are usually two prominent areas where the DevOps mindset is applied across the industry:
a) Primary functions of DevOps, like:
• Continuous Integration,
• Continuous Delivery,
• Continuous Deployment,
• Infrastructure as Code or infrastructure automation,
• CI/CD Pipeline Orchestration,
• Configuration Management and
• Cloud Management (AWS, Azure or GCP)
b) Secondary functions of DevOps, like:
• SCM tool support,
• Code quality tech support for tools like Sonar, Veracode, Nexus, etc.,
• Middleware tech support for tools like NPM, Kafka, Redis, NGINX, API Gateway, etc.,
• Infrastructure tech support for components like F5, DNS, web servers, build server management, etc.,
• OS-level support for miscellaneous activities like server patching, scripting for automation of server-level tasks, etc.
There is no exact answer to this. It depends on many factors.
The development team will most likely want more ownership over the pipeline, and therefore would want to own the templates / code required to achieve the end goal of automation.
The opposite side of this is also completely valid. An operations team could be the custodians of a pipeline and mandate a development team must meet certain standards and use their automation pipelines to be able to get into an environment or onto a platform.
If an environment is an island and development teams are trying to get to that island, each development team can build its own bridge to get there, or the operations team can build a bridge and ask the development teams to use it. Both are valid, and the end result is the same either way.
If the end result is the same, then the only thing that matters is how you apply it in the context of the organization, team(s) and the people you are working with to achieve that common goal.
The assigned developer (and scrum team) should be responsible for the complete delivery of all aspects of development through final deployment into production. This fosters the notions of ownership and empowerment, and focuses responsibility for the full life cycle delivery of the service (application).
DevOps engineering should be responsible for providing an optimal tool chain and environments for rapid and quality delivery. I see the DevOps role as the development-focused counterpart to SRE: if SREs maintain high-performance, stable production environments, then the DevOps team maintains optimal development and testing environments. In theory, DevOps should extend into the realm of SRE, converging into a single team supporting the environments for rapid innovation with quality to meet the business needs.
Everything from committing the code through to production. This includes:
Automation
Production Support
Writing automation scripts
Debugging Production Infrastructure
In short, DevOps = Infrastructure + Automation + Support.

Can BuildForge do what Hudson CI is currently doing?

I am looking for a comparison between IBM Build Forge (Rational) and Hudson CI.
At work we have full licenses for BuildForge but recently we started using Hudson for doing continuous integration and automating other tasks.
I used BuildForge very little and I would like to see if there are any special advantages of BuildForge over Hudson.
Also it would be very helpful to see a list of specific advantages of Hudson over BuildForge.
I'm not sure if it's important or not, but I found it interesting that Build Forge is not listed under continuous integration tools on Wikipedia.
Thanks for bringing attention to the fact it was not on the Wikipedia list of continuous integration applications. I have now added it. Build Forge has been a leader in providing continuous integration capabilities through its SCM adapters for many, many years. Build Forge has a strength in supporting many platforms through its use of agents. These agents can run on Windows, Linux, AIX, Solaris, System Z, and many more - they even give you the source code for the agents for free so you can compile it on just about any platform. The interface allows you to easily automate tasks that run sequentially or in parallel on one or multiple boxes. Selectors allow you to select a specific build server by host name or by criteria such as "any Windows machine with 2 GB of RAM" from a pool of available agents. The entire process is fully auditable, utilizes role-based permissions, and is stored in a central enterprise database such as DB2, Oracle, SQL Server, and others.
One of the most compelling reasons to use Build Forge is its Rational Automation Framework for WebSphere. It allows a full integration into WebSphere environments to automate deployments and configurations of WebSphere through out-of-the-box libraries. The full installation, patching, deployment of apps, and configuration of WAS and Portal can be performed using these libraries. To find out more, it is best to contact your IBM Rational representative.
You can use RAFW (IBM Rational Automation Framework for WebSphere) with BuildForge. It does not make sense to use RAFW with other ci servers, since RAFW requires BuildForge.
You have support for BuildForge and it integrates with other IBM software like ClearCase. Theoretically you have only to deal with one vendor if something in the chain does not work, but IBM has different support teams for their products and you might become their ping pong ball. :(
Hudson is open source (if you like that), that means you can get the source and modify it to serve you better. But the release cycle is very short (about 1 week, agile development). There is a more stable version with support available now (for cash of course) from the company of the main author of Hudson.
Hudson is currently main stream and is actively developed. I don't know how the usability of BuildForge is, but Hudson is good (not always perfect). The plugin concept of Hudson is a great plus, not sure if BuildForge has it as well.
Currently, we are using Hudson, but BuildForge was not looked at in detail.
You need to define what you would need continuous integration for (e.g. building, testing). Having used Hudson, I can vouch for its usefulness and effectiveness. There are many plugins that extend Hudson that can suit various needs. And you can't beat the price point (free).
You need to inquire as to why a BuildForge license was obtained at your place of employment. Perhaps someone on your team knows why this was done. If it isn't necessary for your needs, don't renew your BuildForge license and simply continue using Hudson.
Being a BuildForge/RAFW user, I have to object to one point stated above. It is perfectly possible to use RAFW without BuildForge. It is driven by a command line script, and you could use for example Hudson and RAFW together just fine.
A sample command would look like:
rafw.sh -e env -c cell -t was_common_configure_start_dmgr
The primary differentiators IMO:
Hudson/Jenkins is more readily extensible with the many existing plugins. It has a large, active community and plenty of information and documentation.
BuildForge can be configured with agents running on multiple machines and tasks can be assigned to run on a target agent. Reliable vendor support.

Resources