24 September 2018

Kubernetes for the Enterprise!

Announcing SUSE CaaS Platform 3

Containers for Big Data: How MapR Expands Containers Use to Access Data Directly

Every enterprise needs Kubernetes today, including yours.  But with the platform evolving so rapidly, it can be difficult to keep up.  Not to worry, SUSE can take care of that for you: SUSE CaaS Platform delivers Kubernetes advancements to you in an enterprise-grade solution.

SUSE and Big Data

SUSE today announced SUSE CaaS Platform 3, introducing support for a raft of new features and a special focus on the Kubernetes platform operator.  You can read all about it in the press release, but let’s hit on a few of the highlights here.  With SUSE CaaS Platform 3 you can:

Optimize your cluster configuration with expanded datacenter integration and cluster re-configuration options
Setting up your Kubernetes environment is easier than ever with improved integration of private (OpenStack) and public (Amazon Web Services, Microsoft Azure, and Google Cloud Platform) cloud storage, and automatic deployment of the Kubernetes software load balancer.

Persistent Storage for Docker Containers | Whiteboard Walkthrough

A new SUSE toolchain module also allows you to tune the MicroOS container operating system to support your custom hardware configuration needs. Now you can, for example, install additional packages required to run your own monitoring agent or other custom software.
Transform your start-up cluster into a highly available environment. With new cluster reconfiguration capabilities, you can switch from a single-master to a multi-master environment, or vice-versa, to accommodate your changing needs.

Manage container images more efficiently and securely with a local container registry
Download a container image from an external registry once, then save a copy in your own local registry for sharing among all nodes in your cluster. By connecting to an internal proxy rather than an external registry, and by downloading from a local cache rather than a remote server, you’ll improve security and increase performance every time a cluster node pulls an image from the local registry.
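
As a rough illustration of the idea, the sketch below rewrites an image reference so that a node pulls it from a local registry instead of an external one. The hostname `registry.local:5000` and the helper `to_local_image` are hypothetical names chosen for this example, not part of SUSE CaaS Platform, and the parsing is deliberately simplified (for instance, it does not expand Docker Hub short names).

```python
# Hypothetical local registry address; substitute your own.
LOCAL_REGISTRY = "registry.local:5000"


def to_local_image(image: str, local_registry: str = LOCAL_REGISTRY) -> str:
    """Rewrite an image reference so it points at the local registry."""
    first, sep, rest = image.partition("/")
    # A leading component with "." or ":" (or "localhost") names a registry host.
    has_registry = bool(sep) and ("." in first or ":" in first or first == "localhost")
    path = rest if has_registry else image
    return f"{local_registry}/{path}"


if __name__ == "__main__":
    for ref in ("docker.io/library/nginx:1.15", "nginx:1.15", "quay.io/coreos/etcd:v3.3"):
        print(ref, "->", to_local_image(ref))
```

In a real cluster this mapping is normally handled by registry mirror configuration rather than by rewriting references by hand; the function only shows what the redirection amounts to.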
For still greater security, disconnect from external registries altogether and use only trusted images you’ve loaded into your local registry.

Try out the new, lightweight CRI-O container runtime, designed specifically for Kubernetes, and introduced in CaaSP 3 as a tech preview feature. Stable and secure, CRI-O is also smaller and architecturally simpler than traditional container runtimes.

Simplify deployment and management of long-running workloads through the Apps Workloads API. Promoted to ‘stable’ in upstream Kubernetes 1.9, the Apps Workloads API is now supported by SUSE. This API facilitates orchestration (self-healing, scaling, updates, termination) of common types of workloads.
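
To make this concrete, here is a minimal sketch that builds a Deployment object for the stable `apps/v1` API group and prints it as JSON. The application name `web`, the image `nginx:1.15`, and the helper `deployment_manifest` are illustrative assumptions, not taken from the SUSE announcement.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 3) -> dict:
    """Build a minimal apps/v1 Deployment as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",  # the stable Apps Workloads API group
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,  # desired number of pods
            "selector": {"matchLabels": labels},
            "strategy": {"type": "RollingUpdate"},  # replace pods gradually on update
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


if __name__ == "__main__":
    # Example values only; "web" and "nginx:1.15" are placeholders.
    print(json.dumps(deployment_manifest("web", "nginx:1.15"), indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed output could be piped to `kubectl apply -f -` on a cluster you control.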

Modern Big Data Pipelines over Kubernetes [I] - Eliran Bivas, Iguazio

With Kubernetes now a must-have for every enterprise, you’ll want to give SUSE CaaS Platform a serious look.  Focused on providing an exceptional platform operator experience, it delivers Kubernetes innovations in a complete, enterprise grade solution that enables IT to deliver the power of Kubernetes to users more quickly, consistently, and efficiently.

SUSE CaaS Platform also serves as the Kubernetes foundation for SUSE Cloud Application Platform, which addresses modern application developers’ needs by bringing the industry’s most respected cloud-native developer experience (Cloud Foundry) into a Kubernetes environment.

SUSE CaaS Platform

SUSE CaaS Platform is an enterprise class container management solution that enables IT and DevOps professionals to more easily deploy, manage, and scale container-based applications and services. It includes Kubernetes to automate lifecycle management of modern applications, and surrounding technologies that enrich Kubernetes and make the platform itself easy to operate. As a result, enterprises that use SUSE CaaS Platform can reduce application delivery cycle times and improve business agility.

SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK

SUSE is focused on delivering an exceptional operator experience with SUSE CaaS Platform.

HDFS on Kubernetes—Lessons Learned - Kimoon Kim

With deep competencies in infrastructure, systems, process integration, platform security, lifecycle management and enterprise-grade support, SUSE aims to ensure IT operations teams can deliver the power of Kubernetes to their users quickly, securely and efficiently. With SUSE CaaS Platform you can:

Achieve faster time to value with an enterprise-ready container management platform, built from industry leading technologies, and delivered as a complete package, with everything you need to quickly offer container services.

Simplify management and control of your container platform with efficient installation, easy scaling, and update automation.

Maximize return on your investment, with a flexible container services solution for today and tomorrow.

Episode 3: Kubernetes and Big Data Services

Key Features

A Cloud Native Computing Foundation (CNCF) certified Kubernetes distribution, SUSE CaaS Platform automates the orchestration and management of your containerized applications and services with powerful Kubernetes capabilities, including:

  • Workload scheduling places containers according to their needs while improving resource utilization
  • Service discovery and load balancing provides an IP address for your service, and distributes load behind the scenes
  • Application scaling, up and down, accommodates changing load
  • Non-disruptive Rollout/Rollback of new applications and updates enables frequent change without downtime
  • Health monitoring and management supports application self-healing and ensures application availability
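
The scaling and self-healing capabilities in the list above both rest on the same idea: compare the desired state with the observed state and act on the difference. The following is a toy sketch of one such reconciliation pass. The function `reconcile` and its action strings are invented for illustration and do not reflect the actual Kubernetes controller code.

```python
from typing import Dict, List


def reconcile(desired: int, replicas: Dict[str, bool]) -> List[str]:
    """Return the actions for one pass, given replica name -> healthy flag."""
    # Self-healing: unhealthy replicas are removed so they can be replaced.
    actions = [f"delete {name}" for name, ok in replicas.items() if not ok]
    healthy = [name for name, ok in replicas.items() if ok]
    diff = desired - len(healthy)
    if diff > 0:
        # Scale up (or replace failed replicas) until the desired count is met.
        actions += ["create replica"] * diff
    elif diff < 0:
        # Scale down by removing surplus healthy replicas.
        actions += [f"delete {name}" for name in healthy[:-diff]]
    return actions


if __name__ == "__main__":
    print(reconcile(3, {"web-a": True, "web-b": False}))
    print(reconcile(1, {"web-a": True, "web-b": True}))
```

Running the pass repeatedly until it returns no actions is what keeps the observed state converging on the desired one.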

In addition, SUSE CaaS Platform simplifies the platform operator’s experience, with everything you need to get up and running quickly, and to manage the environment effectively in production. It provides:

  • Application ecosystem support with SUSE Linux container base images, and access to tools and services offered by SUSE Ready for CaaS Platform partners and the Kubernetes community
  • Enhanced datacenter integration features that enable you to plug Kubernetes into new or existing infrastructure, systems, and processes
  • A complete container execution environment, including a purpose-built container host operating system, container runtime, and container image registries
  • End-to-End security, implemented holistically across the full stack
  • Advanced platform management that simplifies platform installation, configuration, re-configuration, monitoring, maintenance, updates, and recovery
  • Enterprise hardening including comprehensive interoperability testing, support for thousands of platforms, and world-class platform maintenance and technical support

Cisco and SUSE have collaborated for years on solutions that improve efficiencies and lower costs in the data center by leveraging the flexibility and value of the UCS platform and the performance and reliability of the SUSE Linux Enterprise Server.

With focus and advancement in the areas of compute, storage and networking, Cisco and SUSE are now looking to help organizations tackle the challenges associated with the ‘5 Vs’ of Big Data:

1. Volume
2. Variety
3. Velocity
4. Veracity (of data)
5. Value

Ian Chard of Cisco recently published a great read untangling these challenges and pointing to areas that help harness the power of data analytics.
The article follows:

The harnessing of data through analytics is key to staying competitive and relevant in the age of connected computing and the data economy.

Analytics now combines statistics, artificial intelligence, machine learning, deep learning and data processing in order to extract valuable information and insights from the data flowing through your business.

Unlock the Power of Kubernetes for Big Data by Joey Zwicker, Pachyderm

Your ability to harness analytics defines how well you know your business, your customers, and your partners – and how quickly you understand them.

But it’s still hard to gain valuable insights from data. Collectively the challenges are known as the ‘5 Vs of big data’:

The Volume of data has grown so much that traditional relational database management software running on monolithic servers is incapable of processing it.

The Variety of data has also increased. There are many more sources of data and many more different types.

Velocity describes how fast the data is coming in. It has to be processed, often in real time, and stored in huge volume.

Veracity of data refers to how much you can trust it. Traditional structured data (i.e. in fixed fields or formats) goes through a validation process. This approach does not work with unstructured (i.e. raw) data.

Deriving Value from the data is hard because of the four challenges above.

If you’re wrestling with the 5 Vs, chances are you’ll be heading to ExCeL London for the annual Strata Data Conference on 22-24 May 2018.

We’ll be there on Booth 316, together with our partners including SUSE, where we’ll be showcasing how the progress made in compute, storage, and networking, as well as in distributed data processing frameworks, can help to address these challenges.

1) The Infrastructure evolution
Compute demands are growing in direct response to data growth. More powerful servers, or more servers working in parallel (aka scale-out), are needed.

Deep learning techniques for example can absorb an insatiable amount of data, making a robust HDFS cluster a great way to achieve scale out storage for the collection and preparation of the data. Machine learning algorithms can run on traditional x86 CPUs, but GPUs can accelerate these algorithms by up to a factor of 100.

New approaches to data analytics applications and storage are also needed because the majority of the data available is unstructured. Email, text documents, images, audio, and video are data types that are a poor fit for relational databases and traditional storage methods.

Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop

Storing data in the public cloud can ease the load. But as your data grows and you need to access it more frequently, cloud services can become expensive, while the sovereignty of that data can be a concern.

Software-defined storage is a server virtualisation technology that allows you to shift large amounts of unstructured data to cost-effective, flexible solutions located on-premises. This assures performance and data sovereignty while reducing storage costs over time.

You can use platforms such as Hadoop to create shared repositories of unstructured data known as data lakes. Running on a cluster of servers, data lakes can be accessed by all users. However, they must be managed in a way that’s compliant, using enterprise-class data management platforms that allow you to store, protect and access data quickly and easily.

2) Need for speed
The pace of data analytics innovation continues to increase. Previously, you would define your data structures and build an application to operate on the data. The lifetime of such applications was measured in years.

Today, raw data is collected and explored for meaningful patterns using applications that are rebuilt when new patterns emerge. The lifetime of these applications is measured in months – and even days.

The value of data can also be short-lived. There’s a need to analyse it at source, as it arrives, in real time. Data solutions that employ in-memory processing for example, give your users immediate, drill-down access to all the data across your enterprise applications, data warehouses and data lakes.

3) Come see us at Strata Data Conference
Ultimately, your ability to innovate at speed with security and governance assured comes down to your IT infrastructure.

Cisco UCS is a trusted computing platform proven to deliver lower TCO and optimum performance and capacity for data-intensive workloads.

85% of Fortune 500 companies and more than 60,000 organisations globally rely on our validated solutions. These combine our servers with software from a broad ecosystem of partners to simplify the task of pooling IT resources and storing data across systems.

Modern big data and machine learning in the era of cloud, docker and kubernetes

Crucially, they come with role- and policy-based management, which means you can configure hundreds of storage servers as easily as you can configure one, making scale-out a breeze as your data analytics projects mature.

If you’re looking to transform your business and turn your data into insights faster, there are plenty of reasons to come visit us on booth 316:

4) Accelerated Analytics
If your data lake is deep and your data scientists are struggling to make sense of what lies beneath, then our MapD demo powered by data from mobile masts will show you how to cut through the depths and find the enlightenment you seek fast.

5) Deep learning with Cloudera Data Science Workbench
For those with a Hadoop cluster to manage their data lakes and deep learning framework, we’ll be demonstrating how to accelerate the training of deep learning models with Cisco UCS C240 and C480 servers equipped with 2 and 6 GPUs respectively. We’ll also show you how to support growing cluster sizes using cloud-managed service profiles rather than more manpower.

6) Get with the Cisco Gateway
If you’re already a customer and fancy winning some shiny new tech, why not step through the Gateway to grow your reputation as a thought leader and showcase the success you’ve had?

7) Find your digital twin
To effectively create a digital twin of the enterprise, data scientists have to incorporate data sources inside and outside of the data centre for a holistic 360-degree view. Come join our resident expert Han Yang for his session on how we’re benefiting from big data and analytics, as well as helping our customers to incorporate data sources from the Internet of Things and deploy machine learning at the edge and in the enterprise.

8) Get the scoop with SUSE
We’re set to unveil a new integration of SUSE Linux Enterprise Server and Cisco UCS. There’ll be SUSE specialists on our booth, so you can be the first to find out more about what’s in the pipeline.

What is Kubernetes?

Kubernetes is an open source system for automatically orchestrating and managing containerized applications.

AI meets Big Data

Designing applications using open source Linux containers is an ideal approach for building cloud-native applications for hosting in private, public or hybrid clouds. Kubernetes automates the deployment, management and scaling of these containerized applications, making the whole process easier, faster and more efficient.

Businesses of all types are looking for a new paradigm to drive faster innovation and agility. This is changing forever how applications are architected, deployed, scaled and managed to deliver new levels of innovation and agility. Kubernetes has become widely embraced by almost everyone interested in dramatically accelerating application delivery with containerized and cloud-native workloads.

Kubernetes is now seen as the outright market leader by software developers, operations teams, DevOps professionals and IT business decision makers.

Manage Microservices & Fast Data Systems on One Platform w/ DC/OS

Kubernetes Heritage

Kubernetes was originally the brainchild of Google. Google has been building and managing container-based applications and cloud-native workloads in production and at scale for well over a decade. Kubernetes emerged from the knowledge and experience gained with earlier Google container management systems called Borg and Omega.

Extending DevOps to Big Data Applications with Kubernetes

Now an open source project, Kubernetes is under the stewardship of the Cloud Native Computing Foundation (CNCF) and The Linux Foundation. This ensures that the project benefits from the best ideas and practices from a huge open source community and makes sure the danger of vendor lock-in is avoided.

Key Features:

  • Deploy applications rapidly and predictably to private, public or hybrid clouds
  • Scale applications non-disruptively
  • Roll out new features seamlessly
  • Use computing resources leanly and efficiently
  • Keep production applications up and running with self-healing capabilities

SUSE and Kubernetes

SUSE believes Kubernetes will be a key element of the application delivery solutions needed to drive the enterprise business of the future.

Big data and Kubernetes

Here is a selection of SUSE products built using Kubernetes:

SUSE Cloud Application Platform brings advanced Cloud Foundry productivity to modern Kubernetes infrastructure, helping software development and operations teams to streamline lifecycle management of traditional and new cloud-native applications. Building on SUSE CaaS Platform, SUSE Cloud Application Platform adds a unique Kubernetes-based implementation of Cloud Foundry, introducing a powerful DevOps workflow into a Kubernetes environment. Built on enterprise-grade Linux and with full Cloud Foundry and Kubernetes certification, it is an outstanding platform to support the entire development lifecycle for traditional and new cloud-native applications.

SUSE OpenStack Cloud makes it easy to spin up Kubernetes clusters in a full multi-tenant environment, allowing different users to have their own Kubernetes cluster. Customers can use either the built-in support for OpenStack Magnum or leverage SUSE CaaS Platform, which gives the added benefits of ready-to-run images, templates and Heat automation. With these Kubernetes-as-a-Service capabilities, it’s no wonder that OpenStack users are reported to be adopting containers 3 times faster than the rest of the enterprise market.

SUSE CaaS Platform is a certified Kubernetes software distribution. It provides an enterprise-class container management solution that enables IT and DevOps professionals to more easily deploy, manage, and scale container-based applications and services. Using SUSE CaaS Platform, enterprises can reduce application delivery cycle times and improve business agility.

What's the Hadoop-la about Kubernetes?


Big data, long an industry buzzword, is now commonplace among most businesses. A 2014 survey from Gartner found 73 percent of organizations had already invested or planned to invest in big data by 2016. For many companies, the question now is not how to manage and harness data, but how to do it even more effectively. The next frontier for big data is to master speed. If you can’t analyze big data in real time, you lose much of the value of the information passing through databases.

What is fast data?

Fast Data with Apache Ignite and Apache Spark - Christos Erotocritou

While big data refers to the massive fire hose of information generated each hour, fast data refers to data that provides real-time insights. In many industries, especially the payment industry, making quick analyses of information is crucial to the bottom line. For example, fast data could prevent a massive breach that would release sensitive customer information. In this case, analyzing data in real time is far more important than storing it in massive quantities. When it comes to ecommerce fraud, the insights happening in the moment matter the most.

Kubernetes vs Docker Swarm | Container Orchestration War | Kubernetes Training | Edureka

As a Wired article put it, where gaining insights from big data was once like finding a needle in a haystack, fast data is like finding the needle as soon as it’s dropped.

Fast data for payments

“For payment systems, decisions must be made in the sub-second range,” Richard Harris, head of international operations at Feedzai, recently told Payment Cards and Mobile. “Our clients typically require 20-50 millisecond response times. So we’ve overcome this by using technology founded in the Big Data era, such as Hadoop and Cassandra.”

Apache Spark on Kubernetes - Anirudh Ramanathan & Tim Chen

Payment processor First Data and Feedzai have teamed up to use machine learning to fight fraud. Feedzai monitors the company’s STAR Network, which enables debit payments for First Data’s clients.

Todd Clark, Senior Vice President and Head of STAR Network and Debit Processing at First Data, explained: “The combination of Feedzai’s machine learning software and First Data’s experience has made the STAR Network capable of scoring over 3,000 transactions per second.”

“This big speed and accuracy advantage means the STAR network is less of an attractive target for fraud,” Harris said.
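
To put the quoted figures side by side, Little's law (average items in flight = arrival rate × time in system) gives a rough sense of the concurrency involved. The sketch below assumes, purely for illustration, that the 3,000 transactions per second and the 20-50 millisecond response times apply to the same system; the two numbers come from separate quotes, so treat the result as a back-of-the-envelope estimate.

```python
def in_flight(transactions_per_second: int, latency_ms: int) -> float:
    """Little's law: average number of transactions being scored at once."""
    return transactions_per_second * latency_ms / 1000


if __name__ == "__main__":
    for latency_ms in (20, 50):
        concurrent = in_flight(3000, latency_ms)
        print(f"{latency_ms} ms per decision -> about {concurrent:.0f} transactions in flight")
```

Under these assumptions the system would be scoring roughly 60 to 150 transactions concurrently at any moment, which is why sequential batch-style processing is not an option.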

Infrastructure challenges

Not all systems are set up to handle fast data. Without the right tools to manage the data flow quickly, valuable insights are lost or gained too late to be of use. While many existing platforms can handle and store large quantities of data, most fall behind when it comes to analyzing the information in real time. To begin with, organizations need to move beyond systems that only allow batch processing, according to Wired. With batch processing, companies tell computers to analyze large batches of information, which are processed one at a time, similar to the way credit card bills are processed at the end of each month.

With most companies now set up to gain insights from big data, the next step is to enable real-time insights. In the payment world, this means catching potential fraud as it’s happening, not waiting until it has already happened.

Beyond Hadoop: The Rise of Fast Data

Over the past two to three years, companies have started transitioning from big data, where analytics are processed after the fact in batch mode, to fast data, where data analysis is done in real time to provide immediate insights. For example, in the past, retail stores such as Macy’s analyzed historical purchases by store to determine which products to add to stores in the next year. In comparison, Amazon drives personalized recommendations based on hundreds of individual characteristics about you, including what products you viewed in the last five minutes.

Containerized Hadoop beyond Kubernetes

Big data is collected from many sources in real-time, but is processed after collection in batches to provide information about the past. The benefits of data are lost if real-time streaming data is dumped into a database because of the inability to act on data as it is collected.

Super Fast Real-time Data Processing on Cloud-Native Architecture [I] - Yaron Haviv, iguazio

Modern applications need to respond to events happening now, to provide insights in real time. To do this they use fast data, which is processed as it is collected to provide real-time insights. Whereas big data provided insights into user segmentation and seasonal trending using descriptive (what happened) and predictive analytics (what will likely happen), fast data allows for real-time recommendations and alerting using prescriptive analytics (what should you do about it).
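
The difference between processing data as it is collected and processing it later in batches can be sketched with a simple per-event check. The example below flags a card as soon as it exceeds a number of transactions within a sliding time window. The class `VelocityCheck`, its thresholds, and the sample events are all hypothetical; this is not Feedzai's or First Data's method, only an illustration of acting on each event as it arrives.

```python
from collections import defaultdict, deque


class VelocityCheck:
    """Flag a card that makes too many transactions within a time window."""

    def __init__(self, max_events: int = 3, window_s: float = 60.0):
        self.max_events = max_events
        self.window_s = window_s
        self._seen = defaultdict(deque)  # card -> recent timestamps

    def observe(self, card: str, timestamp: float) -> bool:
        """Record one transaction; return True if it should be flagged now."""
        recent = self._seen[card]
        recent.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_s:
            recent.popleft()
        return len(recent) > self.max_events


if __name__ == "__main__":
    check = VelocityCheck(max_events=3, window_s=60.0)
    # Made-up (card, seconds) events, in arrival order.
    events = [("card-1", 0.0), ("card-1", 5.0), ("card-2", 6.0),
              ("card-1", 9.0), ("card-1", 12.0), ("card-1", 100.0)]
    for card, ts in events:
        if check.observe(card, ts):
            print(f"flag {card} at t={ts}")
```

A batch job would find the same burst only after the whole day's data had been loaded; the streaming version raises the flag on the fourth transaction itself, while there is still time to act.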

Big help for your first big data project

It’s clear. Today, big data is changing the way companies work. What hasn’t been clear is how companies should go about implementing big data projects.

Until now.

Our highly practical workbook is full of advice about big data that’ll help you keep your project on track. From setting clear goals to strategic resourcing and ideal big data architectures, we’ve covered everything you need to know about big data.

Streaming Big Data with Heron on Kubernetes Cluster

Read “The Big, Big Data Workbook” to gain insights into:

  • How to choose the right project and set up the right goals
  • How to build the right team and maximize productivity
  • What your data governance framework should look like
  • The architecture and processes you should aim to build

“The Big, Big Data Workbook” is a comprehensive guide about the practical aspects of big data and an absolute must-read if you’re attempting to bring greater insights to your enterprise.
