20 July 2018

Capsule Neural Networks (CapsNets): A Better Alternative to Convolutional Neural Networks (CNNs)

Geoffrey Hinton and his team published two papers that introduced a completely new type of neural network based on so-called capsules. In addition, the team published an algorithm, called dynamic routing between capsules, that makes it possible to train such a network.

Introduction to Capsule Networks (CapsNets)

For everyone in the deep learning community, this is huge news, for several reasons. First of all, Hinton is one of the founders of deep learning and an inventor of numerous models and algorithms that are widely used today. Secondly, these papers introduce something completely new, which is very exciting because it will most likely stimulate an additional wave of research and very cool applications.

Capsule Neural Networks

What is a CapsNet or Capsule Network?

Introduction to How Faster R-CNN, Fast R-CNN and R-CNN Works

Faster R-CNN Architecture

How RPN (Region Proposal Networks) Works

What is a Capsule Network? What is a Capsule? Is CapsNet better than a Convolutional Neural Network (CNN)? In this article I will address all of these questions about the CapsNet, or Capsule Network, released by Hinton.
Note: This article is not about pharmaceutical capsules. It is about capsules in the world of neural networks and machine learning.
I expect one thing from you as a reader: you need to be familiar with CNNs. If you are not, I would like you to go through this article on Hackernoon. Next I will run through a short recap of the relevant points about CNNs, so that you can easily follow the comparison below. So without further ado, let's dive in.

CNNs are essentially systems in which we stack a lot of neurons together. These networks have proven to be exceptionally good at image classification problems. It would be hard to have a neural network map out all the pixels of an image, since that is computationally very expensive. Convolution is a method that simplifies the computation to a great extent without losing the essence of the data. A convolution is basically a lot of element-wise multiplications followed by a summation of the results.

Capsule Networks Are Shaking up AI — An Introduction

After an image is fed to the network, a set of kernels or filters scans it and performs the convolution operation. This creates feature maps inside the network. These features then pass through an activation layer and a pooling layer in succession, and this pattern repeats according to the number of layers in the network. Activation layers are required to introduce non-linearity into the network (e.g. ReLU). Pooling (e.g. max pooling) helps reduce training time. The idea of pooling is that it creates “summaries” of each sub-region. It also gives you a little positional and translational invariance in object detection. At the end of the network the features pass through a classifier, such as a softmax classifier, which gives us a class. Training is done by backpropagating the error measured against labelled data. Non-saturating non-linearities such as ReLU also help mitigate the vanishing gradient problem in this step.

What is the problem with CNNs?

CNNs perform exceptionally well when classifying images that are very close to the training data set. If the images are rotated, tilted or otherwise oriented differently, CNNs perform poorly. This problem is usually addressed by adding different variations of the same image during training. In a CNN, each layer understands an image at a different level of granularity. Let's understand this with an example. Suppose you are trying to classify ships and horses. The first layer understands small curves and edges. The second layer might understand straight lines or smaller shapes, like the mast of a ship or the curvature of a tail. Higher layers start understanding more complex shapes, like the entire tail or the hull of the ship. The final layers try to see a more holistic picture, like the entire ship or the entire horse. We use pooling after each layer so that the computation finishes in a reasonable time frame, but in essence pooling also loses positional data.

Pooling helps create positional invariance. Otherwise CNNs would only fit images or data that are very close to the training set. This invariance also triggers false positives for images that contain the components of a ship, but not in the correct order. So the system can match the image on the right with the one on the left in the above image. You as an observer clearly see the difference. The pooling layer adds this sort of invariance as well.

Depthwise Separable Convolution - A FASTER CONVOLUTION!

This was never the intention of the pooling layer. What pooling was supposed to do is introduce positional, orientational and proportional invariance. But the method we use to achieve this is very crude. In reality it adds all sorts of positional invariance, which leads to the dilemma of detecting the ship on the right in image 2.0 as a correct ship. What we needed was not invariance but equivariance. Invariance makes a CNN tolerant to small changes in the viewpoint. Equivariance makes a CNN understand the rotation or proportion change and adapt itself accordingly, so that the spatial positioning inside an image is not lost. A smaller ship is still a ship, and the network should register the reduced size rather than discard it. This leads us to the recent advancement of Capsule Networks.

Hinton himself stated that the fact that max pooling is working so well is a big mistake and a disaster:

Hinton: “The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.”

Of course, you can do away with max pooling and still get good results with traditional CNNs, but they still do not solve the key problem:

Internal data representation of a convolutional neural network does not take into account important spatial hierarchies between simple and complex objects.

In the example of a dog, the mere presence of two eyes, a mouth and a nose in a picture does not mean there is a face; we also need to know how these objects are oriented relative to each other.

What is a Capsule Network?

Every few days there is an advancement in the field of Neural Networks. Some brilliant minds are working in this field. You can pretty much assume every paper on this topic is almost groundbreaking or path changing. Sara Sabour, Nicholas Frosst and Geoffrey Hinton released a paper titled “Dynamic Routing Between Capsules” 4 days back. When one of the godfathers of deep learning, Geoffrey Hinton, releases a paper, it is bound to be groundbreaking. The entire deep learning community is going crazy over this paper as you read this article. The paper talks about capsules, CapsNet and a run on MNIST. MNIST is a database of labelled handwritten digit images. The results show a significant increase in performance on overlapping digits, compared with current state-of-the-art CNNs. In this paper the authors suggest that the human brain has modules called “capsules”. These capsules are particularly good at handling different types of visual stimuli and encoding things like pose (position, size, orientation), deformation, velocity, albedo, hue, texture etc. The brain must have a mechanism for “routing” low-level visual information to what it believes is the best capsule for handling it.

Capsule Networks

A capsule is a nested set of neural layers. In a regular neural network you keep adding more layers. In a CapsNet you add more layers inside a single layer, or in other words nest a neural layer inside another. The state of the neurons inside a capsule captures the above properties of one entity inside an image. A capsule outputs a vector to represent the existence of the entity. The orientation of the vector represents the properties of the entity. The vector is sent to all possible parents in the neural network. For each possible parent, a capsule computes a prediction vector by multiplying its own output by a weight matrix. Whichever parent's output has the largest scalar product with the prediction vector strengthens its bond with the capsule. The rest of the parents weaken their bond. This routing-by-agreement method is superior to current mechanisms like max pooling, which routes based on the strongest feature detected in the lower layer. Apart from dynamic routing, CapsNet adds squashing to a capsule. Squashing is a non-linearity. So instead of adding a non-linearity to each layer as you do in a CNN, you add the squashing to a nested set of layers: the squashing function is applied to the vector output of each capsule.
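
The routing-by-agreement idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the prediction vectors `u_hat` are assumed to have been computed already (each lower capsule's output multiplied by a weight matrix), and the function names `route`, `squash` and `softmax`, the toy values and the choice of three iterations are all made up for this example.

```python
import numpy as np

def squash(s, eps=1e-9):
    """Shrink vector length into [0, 1) while keeping the direction."""
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, iterations=3):
    """u_hat: prediction vectors, shape (num_lower, num_parents, dim)."""
    num_lower, num_parents, _ = u_hat.shape
    b = np.zeros((num_lower, num_parents))           # routing logits ("bonds")
    for _ in range(iterations):
        c = softmax(b, axis=1)                       # each lower capsule splits its vote
        s = (c[:, :, None] * u_hat).sum(axis=0)      # weighted sum per parent
        v = squash(s)                                # parent output vectors
        b = b + (u_hat * v[None, :, :]).sum(axis=-1) # agreement = scalar product
    return c, v

# Toy data (assumption): three lower capsules, two parents, 2-D vectors.
# All three agree on parent 0 and disagree on parent 1.
u_hat = np.array([
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, -1.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
])
c, v = route(u_hat)
print("coupling coefficients:\n", c)
print("parent output lengths:", np.linalg.norm(v, axis=-1))
```

In this toy run the coupling coefficients shift towards parent 0, where the predictions agree, and that parent ends up with the longer output vector.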

Why Deep Learning Works: Self Regularization in Deep Neural Networks

The paper introduces a new squashing function. You can see it in image 3.1. ReLU and similar non-linearities work well with single neurons, but the paper found that this squashing function works best with capsules. It squashes the length of the output vector of a capsule: the length shrinks to almost 0 if the vector is short and is limited to just under 1 if the vector is long. Dynamic routing adds some extra computational cost, but it definitely gives an added advantage.
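
As a rough sketch of the behaviour described above, the squashing function can be written in NumPy as follows. The formula in the comment follows the one given in the “Dynamic Routing Between Capsules” paper; the function name `squash`, the small `eps` term (added here only to avoid division by zero) and the sample vectors are assumptions for illustration.

```python
import numpy as np

def squash(s, eps=1e-9):
    # v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)        # close to 0 for short, close to 1 for long
    return scale * s / np.sqrt(sq_norm + eps)

short_out = squash(np.array([0.1, 0.0]))     # short input vector
long_out = squash(np.array([100.0, 0.0]))    # long input vector

print("short vector length after squash:", np.linalg.norm(short_out))
print("long vector length after squash:", np.linalg.norm(long_out))
```

The direction of the vector is preserved; only its length is rescaled, so the length can be read as a probability-like value.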

Now we need to realise that this paper is almost brand new and the concept of capsules has not been thoroughly tested. It works on MNIST data, but it still needs to be proven against much larger datasets across a variety of classes. There are already (within 4 days) follow-ups to this paper that raise the following concerns:
1. It uses the length of the pose vector to represent the probability that the entity represented by a capsule is present. To keep the length less than 1 requires an unprincipled non-linearity that prevents there from being any sensible objective function that is minimized by the iterative routing procedure.
2. It uses the cosine of the angle between two pose vectors to measure their agreement for routing. Unlike the log variance of a Gaussian cluster, the cosine is not good at distinguishing between quite good agreement and very good agreement.
3. It uses a vector of length n rather than a matrix with n elements to represent a pose, so its transformation matrices have n² parameters rather than just n.

The current implementation of capsules has scope for improvement. But we should also keep in mind that the Hinton paper in the first place only says:

The aim of this paper is not to explore this whole space but to simply show that one fairly straightforward implementation works well and that dynamic routing helps.

Capsule Neural Networks: The Next Neural Networks?  CNNs and their problems.
Convolutional (‘regular’) Neural Networks are the latest hype in machine learning, but they have their flaws. Capsule Neural Networks are a recent development from Hinton that helps us solve some of these issues.

Neural Networks may be the hottest field in Machine Learning. In recent years, there were many new developments improving neural networks and making them more accessible. However, they were mostly incremental, such as adding more layers or slightly improving the activation function, and did not introduce a new type of architecture or topic.

Geoffrey Hinton is one of the founding fathers of many widely used deep learning algorithms, including many developments in Neural Networks. That is no wonder, given his background in both neuroscience and artificial intelligence.

Capsule Networks: An Improvement to Convolutional Networks

In late October 2017, Geoffrey Hinton, Sara Sabour, and Nicholas Frosst published a research paper at Google Brain named “Dynamic Routing Between Capsules”, introducing a true innovation to Neural Networks. This is exciting, since such a development has been long awaited, will likely spur much more research and progress around it, and is supposed to make neural networks even better than they are now.

Capsule networks: overview

The Baseline: Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are extremely flexible machine learning models which were originally inspired by principles of how our brains are theorized to work.
Neural Networks use layers of “neurons” to process raw data into patterns and objects.
The primary building block of a Convolutional Neural Network is the “convolutional” layer (hence the name). What does it do? It takes raw information from the previous layer, makes sense of the patterns in it, and sends it onward to the next layer to make sense of a larger picture.

 If you are new to neural networks and want to understand them, I recommend:

  • Watching the animated videos by 3Blue1Brown.
  • For a more detailed textual/visual guide, you can check out this beginner’s blogpost.
  • If you can deal with some more math and greater detail, you can instead read this guide from CS231n at Stanford.

 In case you didn’t do any of the above, and plan to continue, here is a hand-wavy brief overview.

The Intuition Behind Convolutional Neural Networks

Let’s start from the beginning.
 The Neural Net receives raw input data. Let’s say it’s a doodle of a dog. When you see a dog, your brain automatically detects that it’s a dog. But to the computer, the image is really just an array of numbers representing the color intensity in each color channel. Let’s say it’s just a black-and-white doodle, so we can represent it with one array where each cell represents the brightness of a pixel, from black to white.

Understanding Convolutional Neural Networks.

Convolutional Layers. The first convolutional layer maps the image space to a lower-dimensional space, summarizing what’s happening in each group of, say, 5x5 pixels: is it a vertical line? A horizontal line? A curve of what shape? This happens by multiplying each value in the filter element-wise with the corresponding pixel value in the patch and summing the results into a single number.
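
To make the multiply-and-sum step concrete, here is a small NumPy sketch. It is only an illustration: the helper name `conv2d_valid`, the 5x5 toy image and the hand-written 3x3 vertical-line filter are assumptions for the example (a real network learns its filter values during training).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # element-wise multiply, then sum
    return out

# Toy black-and-white doodle: a bright vertical stroke in the middle column.
image = np.zeros((5, 5))
image[:, 2] = 1.0

# Hypothetical filter that reacts to vertical lines.
vertical = np.array([[-1.0, 2.0, -1.0],
                     [-1.0, 2.0, -1.0],
                     [-1.0, 2.0, -1.0]])

feature_map = conv2d_valid(image, vertical)
print(feature_map)
```

The feature map is largest where the filter is centred on the stroke and negative where the stroke sits at the edge of the patch, which is the sense in which a filter "lights up" for the shape it looks for.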

This leads to the neurons, or convolutional filters. Each filter / neuron is designed to react to one specific form (a vertical line? a horizontal line? etc…). The groups of pixels from layer 1 reach these neurons and light up the neurons that match their structure, according to how similar each slice is to what the neuron looks for.

Activation (usually “ReLU”) Layers. After each convolutional layer, we apply a nonlinear layer (or activation layer), which introduces non-linearity to the system, enabling it to also discover nonlinear relations in the data. ReLU is a very simple one: it sets any negative input to 0 and keeps any positive input the same. ReLU(x) = max(0,x).

Pooling Layers. These reduce “unnecessary” information, summarize what we know about a region, and continue to refine information. For example, this might be “MaxPooling”, where the computer just takes the highest value in the patch passed to it, so that the computer knows “around these 5x5 pixels, the most dominant value is 255. I don’t know exactly in which pixel, but the exact location isn’t as important as that it’s around there.” → Notice: This is not good. We lose information here. Capsule Networks don’t have this operation, which is an improvement.
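
The activation and pooling steps above can be illustrated with a short NumPy sketch. The helper names `relu` and `max_pool2d`, the 2x2 window and the two toy arrays are assumptions chosen for the example; the point is only to show how two different inputs collapse to the same pooled summary.

```python
import numpy as np

def relu(x):
    """ReLU(x) = max(0, x), applied element-wise."""
    return np.maximum(0, x)

def max_pool2d(x, size=2):
    """Non-overlapping max pooling with a size x size window."""
    h2, w2 = x.shape[0] // size, x.shape[1] // size
    x = x[:h2 * size, :w2 * size]
    return x.reshape(h2, size, w2, size).max(axis=(1, 3))

activated = relu(np.array([-2.0, 0.0, 3.0]))

# Two inputs whose bright pixels sit at different positions in each window.
a = np.array([[9.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 5.0]])
b = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.0, 9.0, 0.0, 0.0],
              [0.0, 0.0, 5.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])

pooled_a = max_pool2d(a)
pooled_b = max_pool2d(b)
print(pooled_a)
print(pooled_b)
```

Both inputs pool to the same 2x2 summary, so the network can no longer tell where inside each window the strong value was.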

Dropout Layers. This layer “drops out” a random set of activations in that layer by setting them to zero. This makes the network more robust (kind of like how eating dirt builds up your immune system, the network becomes more immune to small changes) and reduces overfitting. This is only used when training the network.
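
A minimal sketch of the dropout step, assuming the common "inverted dropout" variant in which the surviving activations are rescaled during training. The rescaling, the function name `dropout` and the 0.5 rate are assumptions beyond what the text above describes.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a random subset of activations; do nothing at inference time."""
    if not training or rate == 0.0:
        return activations
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob  # True = keep this unit
    # Rescale survivors so the expected activation stays the same (assumption).
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
acts = np.ones((4, 4))

train_out = dropout(acts, rate=0.5, rng=rng, training=True)
eval_out = dropout(acts, rate=0.5, rng=rng, training=False)
print(train_out)
```

With an input of ones and a rate of 0.5, every training-time output is either 0 (dropped) or 2 (kept and rescaled), while the inference-time output is left untouched.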

Last Fully Connected Layer. For a classification problem, we want each final neuron to represent one class. This layer looks at the output of the previous layer (which, as we remember, should represent the activation maps of high-level features) and determines which features correlate most with a particular class.

SoftMax. This layer is sometimes added as another way to represent the outputs per class, which we can later pass on to a loss function. Softmax represents the distribution of probabilities over the various categories.
Usually, there are more layers which provide nonlinearities and preservation of dimensions (like padding with 0’s around the edges) that help to improve the robustness of the network and control overfitting. But these are the basics you need to understand what comes after.
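
The softmax step mentioned above turns the raw scores of the last layer into a probability distribution. A small sketch, with made-up scores for three classes; the function name `softmax` and the score values are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])      # hypothetical outputs of the last layer
probs = softmax(scores)
print(probs, probs.sum())
```

The class with the highest score receives the highest probability, and the values can be passed directly to a loss function such as cross-entropy.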

Capsule Network

 Now, importantly, these layers are connected only SEQUENTIALLY. This is in contrast to the structure of capsule networks.

What is The Problem With Convolutional Neural Networks?

If this interests you, watch Hinton's lecture explaining exactly what is wrong with them. Below you'll get a couple of key points that are improved by Capsule Networks.

Hinton says that they have too few levels of substructures (nets are composed from layers composed from neurons, that's it); and that we need to group the neurons in each layer into “capsules”, like mini-columns, that do a lot of internal computations, and then output a summary result.

Problems with CNNs and Introduction to capsule neural networks

Problem #1: Pooling loses information

CNNs use “pooling” or equivalent methods to “summarize” what's going on in the smaller regions and make sense of larger and larger chunks of the image. This was a solution that made CNNs work well, but it loses valuable information.

 This loss of information includes the loss of spatial information.
 Capsule networks instead compute a pose (translational and rotational) relationship between smaller features to make up a larger feature.

Problem #2: CNNs don't account for the spatial relations between the parts of the image. Therefore, they also are too sensitive to orientation.

Subsampling (and pooling) loses the precise spatial relationships between higher-level parts like a nose and a mouth. The precise spatial relationships are needed for identity recognition.

(Hinton, 2012, in his lecture).

Geoffrey Hinton Capsule theory

 CNNs don't account for spatial relationships between the underlying objects. By having these flat layers of neurons that light up according to which objects they've seen, they recognize the presence of such objects. But the results are then passed on to other activation and pooling layers and on to the next layer of neurons (filters), without recognizing what the relations are between the objects identified in that single layer.
 They just account for their presence.

Hinton: Dynamic Routing Between Capsules

So a (simplistic) Neural network will not hesitate about categorizing both these dogs, Pablo and Picasso, as similarly good representations of “corgi-pit-bull-terrier mix”.

Capsule Networks (CapsNets) – Tutorial

Problem #3: CNNs can't transfer their understanding of geometric relationships to new viewpoints.

This makes them depend more heavily on the original image itself when classifying images into the same category.

 CNNs are great for solving problems with data similar to what they have been trained on. They can classify images, or objects within them, that are very close to things they have seen before.

 But if the object is slightly rotated, photographed from a slightly different angle (especially in 3D), tilted or in another orientation than what the CNN has seen, the network won't recognize it well.

 One solution is to artificially create tilted versions of the images and add them to the “training” set. However, this still lacks a fundamentally more robust structure.

What is a Capsule?

Capsule Networks Explained in detail ! (Deep learning)

In order to answer this question, I think it is a good idea to refer to the first paper where capsules were introduced: “Transforming Autoencoders” by Hinton et al. The part that is important for understanding capsules is provided below:

“Instead of aiming for viewpoint invariance in the activities of “neurons” that use a single scalar output to summarize the activities of a local pool of replicated feature detectors, artificial neural networks should use local “capsules” that perform some quite complicated internal computations on their inputs and then encapsulate the results of these computations into a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual entity over a limited domain of viewing conditions and deformations and it outputs both the probability that the entity is present within its limited domain and a set of “instantiation parameters” that may include the precise pose, lighting and deformation of the visual entity relative to an implicitly defined canonical version of that entity. When the capsule is working properly, the probability of the visual entity being present is locally invariant — it does not change as the entity moves over the manifold of possible appearances within the limited domain covered by the capsule. The instantiation parameters, however, are “equivariant” — as the viewing conditions change and the entity moves over the appearance manifold, the instantiation parameters change by a corresponding amount because they are representing the intrinsic coordinates of the entity on the appearance manifold.”

The paragraph above is very dense, and it took me a while to figure out what it means, sentence by sentence. Below is my version of the above paragraph, as I understand it:

Artificial neurons output a single scalar. In addition, CNNs use convolutional layers that, for each kernel, replicate that same kernel’s weights across the entire input volume and then output a 2D matrix, where each number is the output of that kernel’s convolution with a portion of the input volume. So we can look at that 2D matrix as the output of a replicated feature detector. Then the 2D matrices of all kernels are stacked on top of each other to produce the output of a convolutional layer.

Then, we try to achieve viewpoint invariance in the activities of neurons. We do this by means of max pooling, which consecutively looks at regions in the 2D matrix described above and selects the largest number in each region. As a result, we get what we wanted: invariance of activities. Invariance means that when the input changes a little, the output still stays the same. And activity is just the output signal of a neuron. In other words, when we shift the object that we want to detect by a little bit in the input image, the network's activities (outputs of neurons) will not change because of max pooling, and the network will still detect the object.

Dynamic routing between capsules

The above described mechanism is not very good, because max pooling loses valuable information and also does not encode relative spatial relationships between features. We should use capsules instead, because they will encapsulate all important information about the state of the features they are detecting in a form of a vector (as opposed to a scalar that a neuron outputs).

[PR12] Capsule Networks - Jaejun Yoo

Capsules encapsulate all important information about the state of the feature they are detecting in vector form.

Capsules encode the probability of detection of a feature as the length of their output vector. The state of the detected feature is encoded as the direction in which that vector points (“instantiation parameters”). So when the detected feature moves around the image or its state somehow changes, the probability stays the same (the length of the vector does not change), but its orientation changes.

t.1 – Capsules and routing techniques (part 1/2)

Imagine that a capsule detects a face in the image and outputs a 3D vector of length 0.99. Then we start moving the face across the image. The vector will rotate in its space, representing the changing state of the detected face, but its length will remain fixed, because the capsule is still sure it has detected a face. This is what Hinton refers to as activities equivariance: neuronal activities will change when an object “moves over the manifold of possible appearances” in the picture. At the same time, the probabilities of detection remain constant, which is the form of invariance that we should aim at, and not the type offered by CNNs with max pooling.
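
The face example above can be mimicked with a toy NumPy sketch. This is only an illustration under a strong assumption: the change in the face's position is modelled as a plain rotation of the 3D output vector about one axis, whereas a real capsule learns how its output responds. The helper name `rotation_z` and the chosen angles are made up for the example.

```python
import numpy as np

def rotation_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

# Capsule output: a 3D vector of length 0.99 ("a face is almost certainly present").
v = 0.99 * np.array([1.0, 0.0, 0.0])

# Pretend the face moves across the image in four steps.
moved = [rotation_z(theta) @ v for theta in np.linspace(0.0, np.pi / 2, 4)]

for m in moved:
    print("direction:", np.round(m, 3), "length:", round(float(np.linalg.norm(m)), 3))
```

The direction changes at every step (equivariance of the instantiation parameters) while the length, read as the detection probability, stays at 0.99 (invariance).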

PR-012: Faster R-CNN : Towards Real-Time Object Detection with Region Proposal Networks

More Information:

“Understanding Dynamic Routing between Capsules (Capsule Networks)”

Understanding Hinton’s Capsule Networks. Part I: Intuition.   https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b

Understanding Capsule Networks — AI’s Alluring New Architecture.  https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

What is a CapsNet or Capsule Network?   https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc



A “weird” introduction to Deep Learning.  https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0

Faster R-CNN Explained   https://medium.com/@smallfishbigsea/faster-r-cnn-explained-864d4fb7e3f8

A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN   https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4

Convolutional Neural Network (CNN)   https://skymind.ai/wiki/convolutional-network

Capsule Neural Networks: The Next Neural Networks? Part 1: CNNs and their problems  https://towardsdatascience.com/capsule-neural-networks-are-here-to-finally-recognize-spatial-relationships-693b7c99b12

Understanding Hinton’s Capsule Networks. Part II: How Capsules Work    https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66

Understanding Hinton’s Capsule Networks. Part III: Dynamic Routing Between Capsules   https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418

Understanding Hinton’s Capsule Networks. Part IV: CapsNet Architecture   https://medium.com/@pechyonkin/part-iv-capsnet-architecture-6a64422f7dce

22 June 2018

IBM Summit High Performance Computing: Accelerating Cognitive Workloads with Machine Learning

HPC and HPDA for the Cognitive Journey with OpenPOWER

The high-performance computing landscape is evolving at a furious pace that some are describing as an important inflection point, as Moore’s Law delivers diminishing returns while performance demands increase. Leaders of organizations are grappling with how to embrace recent system-level innovations like acceleration, while simultaneously being challenged to incorporate analytics into their HPC workloads.

Intro summit webinar: Innovative and Novel Computational Impact on Theory and Experiment (INCITE) Program for 2019

On the horizon, even more demanding applications built with machine learning and deep learning are emerging to push system demands to all-new highs. With all of this change in the pipeline, the usual tick-tock of minor code tweaks to accompany nominal hardware performance improvements can’t continue as usual. For many HPC organizations, significant decisions need to be made.

Introduction to ECP’s newest Focus Area, Hardware and Integration (HI)

Realizing that these demands could only be addressed by an open ecosystem, IBM partnered with other industry leaders Google, Mellanox, NVIDIA and others to form the OpenPOWER Foundation, dedicated to stewarding the Power CPU architecture into the next generation.

IBM Power9 Features and Specifications

A data-centric approach to HPC with OpenPOWER

In 2014, this disruptive approach to HPC innovation led to IBM being awarded two contracts to build the next generation of supercomputers as part of the US Department of Energy’s Collaboration of Oak Ridge, Argonne, and Lawrence Livermore, or CORAL program. In partnership with NVIDIA and Mellanox, we demonstrated to CORAL that a “data-centric” approach to systems – an architecture designed to embed compute power everywhere data resides in the system, positioning users for a convergence of analytics, modeling, visualization and simulation, which could lead to driving new insights at incredible speeds – could help them achieve their goals. Now, on the three-year anniversary of that agreement, we’re pleased to announce that we are delivering on our project, with our next-generation IBM Power Systems with NVIDIA Volta GPUs being deployed at Oak Ridge and Lawrence Livermore National Labs.

Moving mountains

Both systems, Summit at ORNL and Sierra at LLNL, are being installed as you read this, with completion expected early next year. Both systems are impressive. Summit is expected to increase individual application performance 5 to 10 times over Titan, Oak Ridge’s older supercomputer, and Sierra is expected to provide 4 to 6 times the sustained performance of Sequoia, Lawrence Livermore’s older supercomputer.

Summit Supercomputer

With Summit in place, Oak Ridge National Labs will advance their stated mission: “Be able to address, with greater complexity and higher fidelity, questions concerning who we are, our place on earth, and in our universe.” But most importantly, the clusters will position them to push the boundaries of one of the most important technological developments of our generation, artificial intelligence (AI).

IBM's world-class Summit supercomputer gooses speed with AI abilities

Built for AI, built for the future

However, emerging AI workloads are vastly different than traditional HPC workloads. The measurements of performance listed above, while interesting, do not really capture the performance requirements for deep learning algorithms. With AI workloads, bottlenecks shift away from compute and networking back to data movement at the CPU level. IBM POWER9 systems are specifically designed for these emerging challenges.

IBM Readies POWER9-based Systems for US Department of Energy CORAL Supercomputers at SC17

“We’re excited to see accelerating progress as the Oak Ridge National Laboratory Summit supercomputer continues to take shape. The infrastructure is now complete and we’re beginning to deploy the IBM POWER9 compute nodes.  We’re still targeting early 2018 for the final build-out of the Summit machine, which we expect will be among the world’s fastest supercomputers. The advanced capabilities of the IBM POWER9 CPUs coupled with the NVIDIA Volta GPUs will significantly advance the computational performance of DOE’s mission critical applications,” says Buddy Bland, Oak Ridge Leadership Computing Facility Director.

AI, The Next HPC Workload

POWER9 leverages PCIe Gen-4, next-generation NVIDIA NVLink interconnect technology, memory coherency and more features designed to maximize throughput for AI workloads. This should translate to more overall performance and larger scales while reducing space creep due to excessive node counts and potentially out-of-control power consumption. Projections from competitors show anticipated node counts exceeding 50,000 to break into exascale territory; but this is not until 2021. Already this year, IBM was able to leverage distributed deep learning to reduce model training time from 16 days to 7 hours by successfully scaling TensorFlow and Caffe across 256 NVIDIA Tesla GPUs. These new systems feature 100 times more GPUs spread across thousands of nodes, meaning the only theoretical limit to the deep learning benchmarks we can set with these new supercomputers is our own imaginations.
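As a rough sanity check on the figures quoted above, the arithmetic can be sketched in a few lines. This only restates the numbers from the text (16 days, 7 hours, 256 GPUs, "100 times more GPUs"); the variable names are illustrative, and the projected GPU count is a plain multiplication, not an official system specification.

```python
# Back-of-the-envelope check of the training-time figures quoted in the text.
HOURS_PER_DAY = 24

baseline_hours = 16 * HOURS_PER_DAY  # 16 days of training before scaling
scaled_hours = 7                     # training time after scaling across 256 GPUs
gpus_used = 256

# Wall-clock speedup reported for the distributed run.
speedup = baseline_hours / scaled_hours

# "100 times more GPUs" than the 256 used in that experiment.
projected_gpus = gpus_used * 100

print(f"speedup: {speedup:.1f}x")
print(f"projected GPU count: {projected_gpus}")
```

The text does not say how many GPUs the 16-day baseline used, so scaling efficiency cannot be derived from these numbers alone.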

Start with the data

Data preparation for deep learning

All machine learning and deep learning models train on large amounts of data. Fortunately (and unfortunately), organizations are swimming in data sitting in structured and unstructured forms, and beyond the data they have under their control, organizations also have access to data for free or for a fee from a variety of sources.
Often, little of this data is in the proper place or form for training a new AI model. To date, we have found that this problem has largely been solved by manual methods: miles and miles of Python scripting, often run inside Spark clusters for speed of execution, along with a lot of orphan code.

Share Your Science: Accelerating Cognitive Workloads with Machine Learning

To help shorten transformation time, PowerAI Enterprise integrates a structured, template-based approach to building and transforming data sets. It starts with common output formats (LMDB, TensorFlowRecords, Images for Vector Output), and allows users to define the input format/structure of raw data and some of the key characteristics of what is needed in the transform step.

The data import tools in PowerAI Enterprise are aware of the size and the complexity of the data and the resources available to transform the data. For this reason, the integrated resource manager is able to intelligently manage the execution of the job: helping to optimize for either low cost (run across the fewest number of nodes/cores) or optimize for the fastest execution of the transform job (run across more nodes/cores).

Integrated into the data preparation step is a quality check function which is designed to allow a data engineer and a data scientist to check the clarity of the signal in the data, running a simplified model and sample training from within the data import tool. Although not as sophisticated as a fully-developed model, this “gut check” allows a data scientist to discover early on in the process whether there are obvious issues or deficiencies in the training data set before investing significant time in the model development phase.
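The idea of such a "gut check" can be illustrated with a minimal sketch: train a deliberately simple model on a small random sample and compare it with a trivial baseline. This is not the PowerAI Enterprise tool or its API; the function name `signal_check`, the nearest-centroid model, and the synthetic data are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def signal_check(samples, labels, sample_size=200, seed=0):
    """Compare a nearest-centroid classifier with a majority-class baseline
    on a random sample. Returns (model_accuracy, baseline_accuracy)."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    idx = idx[:sample_size]
    split = len(idx) // 2
    train, test = idx[:split], idx[split:]

    # Accumulate per-class feature sums on the training half.
    sums, counts = {}, Counter()
    for i in train:
        y = labels[i]
        counts[y] += 1
        acc = sums.setdefault(y, [0.0] * len(samples[i]))
        for d, v in enumerate(samples[i]):
            acc[d] += v
    centroids = {y: [s / counts[y] for s in vec] for y, vec in sums.items()}

    def predict(x):
        # Pick the class whose centroid is closest in squared distance.
        return min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])),
        )

    majority = counts.most_common(1)[0][0]
    model_hits = sum(predict(samples[i]) == labels[i] for i in test)
    base_hits = sum(majority == labels[i] for i in test)
    return model_hits / len(test), base_hits / len(test)


# Synthetic two-class data with a clear signal (hypothetical example data).
rng = random.Random(1)
data, targets = [], []
for _ in range(500):
    y = rng.randint(0, 1)
    center = 5.0 * y
    data.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
    targets.append(y)

model_acc, base_acc = signal_check(data, targets)
print(f"simple model: {model_acc:.2f}, majority baseline: {base_acc:.2f}")
```

If the simple model barely beats the baseline, that is a hint the labels or features may need attention before a full model is worth building.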

Cognitive Computing: From Data to Analytics to Learning

The majority of data doesn't offer much value unless iteratively and progressively analyzed by the user and the system to produce powerful insights with recommended actions for the best outcome(s). In fact, IBM Watson (IBM’s leadership Cognitive system) constantly sifts through data, discovers insights, learns and determines the best course of action(s).

“The cognitive computing landscape continues to evolve rapidly; giving clients unique capabilities to progressively solve complex problems for higher value.”

Learning (cognitive and deep machine learning) systems are interactive analytics systems that continuously build knowledge over time by processing natural language and data. These systems learn a domain by experience just as humans do and can discover and suggest the “best course of action”, providing highly time-critical, valuable guidance to humans or simply executing this “next best action”. IBM Watson is the premier cognitive system in the market.

Converging big data, AI, and BI with a GPU-accelerated database by Karthik Lalithraj

The underlying technologies for Deep Learning include Artificial Neural Networks (ANN)–neural networks inspired by and designed to mimic the function of the cortex, the thinking matter of the brain. Driverless autonomous cars, robotics and personalized medical therapies are some key disruptive innovations enabled by Deep Learning.
A performance-optimized infrastructure is critical for the Cognitive Computing journey.

Speed up the model development process

PowerAI Enterprise includes powerful model setup tools designed to address the earliest “dead end” training runs. Integrated hyperparameter optimization automates the process of characterizing new models by sampling from the training data set and instantiating multiple small training jobs across cluster resources (this means sometimes tens, sometimes thousands of jobs depending on the complexity of the model). The tool is designed to select the most promising combinations of hyperparameters to return to the data science teams. The outcome: fewer non-productive early runs and more time to focus on refining models for greater organizational value.
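One common way to approach this kind of search is random sampling of hyperparameter combinations, each evaluated with a short trial run. The sketch below shows that general technique only; it is not how PowerAI Enterprise implements its optimizer. The search space, the function names, and the toy `short_trial` scoring function (a stand-in for a small training job) are all assumptions.

```python
import math
import random


def sample_config(rng):
    """Draw one hyperparameter combination from a hypothetical search space."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform in [1e-4, 1e-1]
        "batch_size": rng.choice([32, 64, 128, 256]),
    }


def short_trial(config):
    """Stand-in for a small training job on sampled data.
    Returns a toy 'validation loss'; a real job would train a model."""
    lr_penalty = (math.log10(config["learning_rate"]) + 2.5) ** 2
    batch_penalty = abs(config["batch_size"] - 128) / 256
    return lr_penalty + batch_penalty


def random_search(n_trials, keep, seed=0):
    """Run n_trials short trials and return the `keep` most promising ones."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        cfg = sample_config(rng)
        trials.append((short_trial(cfg), cfg))
    trials.sort(key=lambda t: t[0])  # lowest loss first
    return trials[:keep]


best = random_search(n_trials=50, keep=3)
for loss, cfg in best:
    print(f"loss={loss:.3f} lr={cfg['learning_rate']:.5f} batch={cfg['batch_size']}")
```

In a cluster setting the trials are independent, which is what makes it practical to fan out tens or thousands of them across available resources.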

Once you have selected hyperparameters, you can begin bringing together all of the different elements of a deep learning model training run.

Lecture 15 | Efficient Methods and Hardware for Deep Learning

This next phase in the development process is extremely iterative. Even with assistance in selecting the correct hyperparameters, ensuring that your data is clean and has a clear signal within it, and that you were able to operate at the appropriate level of scale, chances are you will still be repeating training runs. By instrumenting the training process, PowerAI Enterprise can allow a data scientist to see feedback in real time on the training cycle.

PowerAI Enterprise provides the ability to visualize current progress and status of your training job, including iteration, loss, accuracy and histograms of weights, activations, and gradients of the neural network.
With this feedback, data scientists and model developers are alerted when the training process begins to go awry. These early warnings can allow data scientists and model developers to stop training runs that will eventually go nowhere and adjust parameters.
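A minimal sketch of such an early warning, assuming the only signal available is the stream of loss values, might look like the following. The function name `should_stop` and its thresholds are hypothetical and are not part of PowerAI Enterprise.

```python
import math


def should_stop(losses, patience=5, min_delta=1e-3):
    """Return a reason string if the run looks unproductive, else None.

    losses:    loss values recorded so far, oldest first
    patience:  how many recent iterations may pass without improvement
    min_delta: smallest decrease that counts as an improvement
    """
    if not losses:
        return None
    latest = losses[-1]
    if math.isnan(latest) or math.isinf(latest):
        return "loss is not finite"
    if len(losses) > patience:
        best_before = min(losses[:-patience])
        best_recent = min(losses[-patience:])
        if best_recent > best_before - min_delta:
            return f"no improvement in the last {patience} iterations"
    return None


healthy = should_stop([1.0, 0.8, 0.6, 0.5, 0.4, 0.3, 0.2])
stalled = should_stop([1.0, 0.5, 0.6, 0.6, 0.7, 0.7, 0.8])
diverged = should_stop([1.0, float("nan")])
print(healthy, stalled, diverged, sep=" | ")
```

A real monitor would typically also look at accuracy and gradient statistics, as the text describes, but the same stop-or-continue decision applies.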

These workflow tools run on top of IBM’s scalable, distributed deep learning platforms. They take the best of open source frameworks and augment them for both large model support and better cluster performance, both of which open up the potential to take applied artificial intelligence into areas and use cases which were not previously feasible.

Bringing all these capabilities together accelerates development for data scientists, and the combination of automating workflow and extending the capabilities of open source frameworks unlocks the hidden value in organizational data.

As Gurinder Grewal, Senior Director, Architecture at PayPal said at IBM’s Think conference: “How do you take all these technologies and marry them together to build end to end platform that we can hand over to a data scientist and the business policy owners so they can extract most value out of the data?  I think that’s what excites us most about things your company is working on in terms of the PowerAI platform…  I think that’s one of the things we actually really appreciate the openness of the platform, extracting the most value out of the compute power we have and the power from the data.”

A foundation for data science as a service

At the core of the platform is an enterprise-class management software system for running compute- and data-intensive distributed applications on a scalable, shared infrastructure.

IBM PowerAI Enterprise supports multiple users and lines of business with multi-tenancy end-to-end security, including role-based access controls. Organizational leaders are looking to deploy AI infrastructure at scale. The combination of integrated security (including role-based access, encryption of workload and data), the ability to support service level agreements, and an extremely scalable resource orchestration designed for very large compute infrastructure, mean that it is now possible to share data science environments across the organization.

High Performance Computing and the Opportunity with Cognitive Technology

One customer which has successfully navigated the new world of AI is Wells Fargo. They use deep learning models to comply with a critical financial validation process.  Their data scientists build, enhance, and validate hundreds of models each day. Speed is critical, as well as scalability, as they deal with greater amounts of data and more complicated models. As Richard Liu, Quantitative Analytics manager at Wells Fargo said at IBM Think, “Academically, people talk about fancy algorithms. But in real life, how efficiently the models run in distributed environments is critical.” Wells Fargo uses the IBM AI Enterprise software platform for the speed and resource scheduling and management functionality it provides. “IBM is a very good partner and we are very pleased with their solution,” adds Liu.

Each part of the platform is designed to remove both time and pain from the process of developing a new applied artificial intelligence service.  By automating highly repetitive and manual steps, time is saved for improving and refining models, which can lead to a higher-quality result.

We’ve introduced a lot of new functionality in IBM PowerAI Enterprise 1.1, and I’ll be sharing more detail on these new capabilities in future posts.  I also welcome your input as we continue to add new capabilities moving forward.

The Sierra Supercomputer: Science and Technology on a Mission

22 May 2018

Introducing Windows Server 2019 – now available in preview

What’s new in Windows Server 2019

Windows Server 2019 is built on the strong foundation of Windows Server 2016 – which continues to see great momentum in customer adoption. Windows Server 2016 is the fastest adopted version of Windows Server, ever! We’ve been busy since its launch at Ignite 2016 drawing insights from your feedback and product telemetry to make this release even better.

We also spent a lot of time with customers to understand the future challenges and where the industry is going. Four themes were consistent – Hybrid, Security, Application Platform, and Hyper-converged infrastructure. We bring numerous innovations on these four themes in Windows Server 2019.

Windows Server 1709 – Everything you need to know in 10 minutes

Hybrid cloud scenarios:

We know that the move to the cloud is a journey and often, a hybrid approach, one that combines on-premises and cloud environments working together, is what makes sense to our customers. Extending Active Directory, synchronizing file servers, and backup in the cloud are just a few examples of what customers are already doing today to extend their datacenters to the public cloud. In addition, a hybrid approach also allows for apps running on-premises to take advantage of innovation in the cloud such as Artificial Intelligence and IoT. Hybrid cloud enables a future-proof, long-term approach – which is exactly why we see it playing a central role in cloud strategies for the foreseeable future.

At Ignite in September 2017, we announced the Technical Preview of Project Honolulu – our reimagined experience for management of Windows and Windows Server. Project Honolulu is a flexible, lightweight browser-based locally-deployed platform and a solution for management scenarios. One of our goals with Project Honolulu is to make it simpler and easier to connect existing deployments of Windows Server to Azure services. With Windows Server 2019 and Project Honolulu, customers will be able to easily integrate Azure services such as Azure Backup, Azure File Sync, disaster recovery, and much more so they will be able to leverage these Azure services without disrupting their applications and infrastructure.


Security:

Security continues to be a top priority for our customers. The number of cyber-security incidents continues to grow, and the impact of these incidents is escalating quickly. A Microsoft study shows that attackers take, on average, just 24-48 hours to penetrate an environment after infecting the first machine. In addition, attackers can stay in the penetrated environment – without being noticed – for up to 99 days on average, according to a report by FireEye/Mandiant. We continue on our journey to help our customers improve their security posture by working on features that bring together learnings from running global-scale datacenters for Microsoft Azure, Office 365, and several other online services.

Our approach to security is three-fold – Protect, Detect and Respond. We bring security features in all three areas in Windows Server 2019.
On the Protect front, we introduced Shielded VMs in Windows Server 2016, which were enthusiastically received by our customers. Shielded VMs protect virtual machines (VMs) from compromised or malicious administrators in the fabric, so only VM admins can access them on a known, healthy, and attested guarded fabric. In Windows Server 2019, Shielded VMs will now support Linux VMs. We are also extending VMConnect to improve troubleshooting of Shielded VMs for Windows Server and Linux. We are adding Encrypted Networks, which will let admins encrypt network segments at the flip of a switch to protect the network layer between servers.

On the Detect and Respond front, in Windows Server 2019, we are embedding Windows Defender Advanced Threat Protection (ATP) that provides preventative protection, detects attacks and zero-day exploits among other capabilities, into the operating system. This gives customers access to deep kernel and memory sensors, improving performance and anti-tampering, and enabling response actions on server machines.

Application Platform:

A key guiding principle for us on the Windows Server team is a relentless focus on the developer experience. Two key aspects to call out for the developer community are improvements to Windows Server containers and Windows Subsystem for Linux (WSL).

Since the introduction of containers in Windows Server 2016, we have seen great momentum in its adoption. Tens of millions of container images have been downloaded from the Docker Hub. The team learned from feedback that a smaller container image size will significantly improve experience of developers and IT Pros who are modernizing their existing applications using containers. In Windows Server 2019, our goal is to reduce the Server Core base container image to a third of its current size of 5 GB. This will reduce download time of the image by 72%, further optimizing the development time and performance.
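The size and download-time figures above can be related with a short calculation. This is only a sketch: it assumes download time scales linearly with image size, which the text does not state, and the variable names are illustrative.

```python
# Figures quoted in the text.
current_gb = 5.0           # current Server Core base image size
quoted_time_saving = 0.72  # quoted reduction in download time

# "A third of its current size".
target_gb = current_gb / 3
size_reduction_pct = (1 - target_gb / current_gb) * 100

# Image size implied by the quoted saving, ASSUMING download time
# is proportional to image size (an assumption, not from the source).
implied_gb = current_gb * (1 - quoted_time_saving)

print(f"target size: {target_gb:.2f} GB ({size_reduction_pct:.1f}% smaller)")
print(f"size implied by a 72% faster download: {implied_gb:.2f} GB")
```

Under that assumption, a one-third-size image accounts for roughly a 67% saving, so the quoted 72% suggests a target slightly below one third or other download optimizations.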

We are also continuing to improve the choices available when it comes to orchestrating Windows Server container deployments. Kubernetes support is currently in beta, and in Windows Server 2019, we are introducing significant improvements to compute, storage, and networking components of a Kubernetes cluster.

Feedback we constantly hear from developers is about the complexity of navigating environments with both Linux and Windows deployments. To address that, we previously extended Windows Subsystem for Linux (WSL) into insider builds for Windows Server, so that customers can run Linux containers side-by-side with Windows containers on a Windows Server. In Windows Server 2019, we are continuing on this journey to improve WSL, helping Linux users bring their scripts to Windows while using industry standards like OpenSSH, Curl & Tar.

Hyper-converged infrastructure (HCI):

HCI is one of the latest trends in the server industry today. According to IDC, the HCI market grew 64% in 2016 and Gartner says it will be a $5 billion market by 2019. This trend is primarily because customers understand the value of using x86 servers with high-performance local disks to run their compute and storage needs at the same time. In addition, HCI gives the flexibility to easily scale such deployments.

Customers looking for HCI solutions can use Windows Server 2016 and the Windows Server Software Defined program today. We partnered with industry leading hardware vendors to provide an affordable and yet extremely robust HCI solution with validated design. In Windows Server 2019 we are building on this platform by adding scale, performance, and reliability. We are also adding the ability to manage HCI deployments in Project Honolulu, to simplify the management and day-to-day activities on HCI environments.

Finally, Windows Server customers using System Center will be excited to know that System Center 2019 is coming and will support Windows Server 2019.

We have much more to share between now and the launch later this year. We will bring more details on the goodness of Windows Server 2019 in a blog series that will cover the areas above.

What’s new in Windows Server, version 1709 for the software-defined datacenter | BRK2278

Windows Server 2019 with no RDSH and Windows 10 Multi-user and even RDmi, where do we go?

The newest rumors and stories in my timeline suggest that the RDSH role is deprecated in Windows Server 2019. Windows Server 2019 has just been released as a preview version. Some people are installing it and finding that they can’t install the Remote Desktop Services role anymore. Together with stories about a multi-user Windows 10 version and Microsoft working on RDmi, rumors come easily. My thoughts on this are captured in this blog; they are thoughts only so far. The truth is out there but not available to us right now. Perhaps my thoughts are far-fetched, but they are what came to mind. There is an update already; I have woven it into the article.

Remote Desktop Session Host (RDSH) is a role of Remote Desktop Services (RDS). RDS is the backbone of a lot of virtual environments. Since the late 90s, we’ve seen Citrix and Microsoft progress their offerings based on this. You can’t deploy Citrix XenApp, VMware Horizon RDSH server or Microsoft RDSH without this role enabled. Many companies rely on this role. Multiple users can access applications or a desktop session on one server and work side by side without interfering with each other. It paved the way to a centralized desktop (before VDI came into play) with a reasonable TCO. One of the key benefits of this model was that data and application management was centralized.

The downside of the solution always was the fact that resources are shared, applications are not always supported and features like store apps are not supported. The performance was a challenge for some use cases and that’s one of the reasons VDI was introduced, a single user desktop with non-shared resources (shared on a different level).

Windows Server 2019
Soon after the Windows Server 2019 preview release became available, stories came out about the RDSH role being missing. I saw several stories about people trying to install the role but failing to do so. Of course, this is a preview, so we have to see whether the final version also has this limitation. Still, I can see no reason why a preview would lack a default role like this unless the role is being dropped. It seems that the RDSH role is to disappear and that customers will be offered other options; read on for those options.

Sign on the wall
There are signs on the wall that times are a changing. Let’s take a look at the different suspects in this case (watching a detective while writing). Windows 10 Multi-user and RDmi are the ones that come to mind.

MVPDays - New & Cool Tools! Management with Project Honolulu - Mike Nelson

Windows 10 Multi-user
Microsoft Windows 10 will be getting a multi-user version. So the initial thought was that they are transferring the RDS roles to Windows 10. It would make sense in that several features are easier to implement when running Windows 10. Features like access to Store apps and OneDrive on demand are accessible to Windows 10 users. That, however, is only true when you run a single-user Windows 10 platform; those features will have issues in a multi-user environment no matter the operating system. A multi-user Windows 10 replacing an RDSH server just to bring certain features seems far-fetched.

One reason I can think of is licensing. Server licenses are less expensive and transferring RDS to Windows 10 would force customers to acquire Windows 10 Desktop licenses with the CALs. For a lot of customers that would be a huge issue perhaps even getting them to think of moving to physical devices again. Microsoft announced that Windows Server 2019 might be more expensive and forcing people to RDS-VDI environments might hurt them more than they like to. Initially, I thought this was the reason for the missing role but perhaps there is more. This is still a valid option I think but one for the future when RDmi is a more common scenario.

Another announcement of Microsoft is RDmi, Remote Desktop modern infra. Another initial thought is about Citrix XenApp essentials and RDmi but that’s another topic. One I work on from the 1st of April. Back to the topic.

RDmi, Remote Desktop modern infrastructure, is the evolution of RDS and is offered as a .NET service running in Azure. The idea behind it is that all the roles you need to set up an RDS environment (given you want a Microsoft environment) are offered as a service. I won’t go deeper into RDmi right now; the intent of this article is not to explain RDmi. What I see from this offering is that Microsoft is moving RDS to Azure and enabling it to work with HTML5 clients as well. It enables more flexibility and disconnects some components from your network. There is far more to learn about this, but the link below gives very good insight.

More info is found at https://cloudblogs.microsoft.com/enterprisemobility/2017/09/20/first-look-at-updates-coming-to-remote-desktop-services/

There will be a migration strategy offered for customers when it goes live. We have to wait a bit for more info. There are some blogs online already, so do your “google” search.

Windows 10 Multi-user, RDmi or “old skool” RDSH, where do we go?
RDmi is a more interesting suspect: it brings modern features to RDS. It brings Azure into the picture and would offer customers a route to migrate to the new RDS offering without huge investments and testing. Not every customer is keen on moving their workload to the cloud, so that might be why Windows 10 multi-user mode is coming, although I wonder if customers are looking for that one.

I think, but that is just me, that multi-user Windows 10’s use case is different. I am not sure yet what that use case is, but it is not to massively replace RDSH. Migrating to Windows 10 would cost a lot of effort for customers, assuming they now run a server version for their desktop environment. The Windows 10 features would not be usable with multiple users working alongside each other.

Extending Windows Admin Center to manage your applications and infrastructure using modern browser-based technologies

So there are two offerings on the table, and if you ask me, I think there will be a campaign to move customers to RDmi. It won’t take away the burden of image management but will offer the roles as a service, relieving IT admins of that management. We’ve seen similar offerings from Citrix and VMware: take the management burden away and let IT admins take care of the image only. Customers that can’t or won’t move can still run an on-premises environment, presumably with Windows 10 in the future (1809). Microsoft is mapping out the future and its idea of how you offer RDSH: as a service, that is.

Because Microsoft has shifted to a more gradual upgrade of Windows Server, many of the features that will become available with Windows Server 2019 have already been in use in live corporate networks, and here are half a dozen of the best.

Enterprise-grade hyperconverged infrastructure (HCI)

With the release of Windows Server 2019, Microsoft rolls up three years of updates for its HCI platform. That’s because the gradual upgrade schedule Microsoft now uses includes what it calls Semi-Annual Channel releases – incremental upgrades as they become available. Then every couple of years it creates a major release called the Long-Term Servicing Channel (LTSC) version that includes the upgrades from the preceding Semi-Annual Channel releases.

Windows Admin Center

The LTSC Windows Server 2019 is due out this fall, and is now available to members of Microsoft’s Insider program.

While the fundamental components of HCI (compute, storage and networking) have been improved with the Semi-Annual Channel releases, for organizations building datacenters and high-scale software defined platforms, Windows Server 2019 is a significant release for the software-defined datacenter.

With the latest release, HCI is provided on top of a set of components that are bundled in with the server license. This means a backbone of servers running HyperV to enable dynamic increase or decrease of capacity for workloads without downtime. (For more on Microsoft HCI go here.)

GUI for Windows Server 2019

A surprise for many enterprises that started to roll out the Semi-Annual Channel versions of Windows Server 2016 was the lack of a GUI for those releases. The Semi-Annual Channel releases only supported ServerCore (and Nano) GUI-less configurations. With the LTSC release of Windows Server 2019, IT Pros will once again get their desktop GUI of Windows Server in addition to the GUI-less ServerCore and Nano releases.

Project Honolulu

With the release of Windows Server 2019, Microsoft will formally release their Project Honolulu server management tool.

Project Honolulu is a central console that allows IT pros to easily manage GUI and GUI-less Windows 2019, 2016 and 2012R2 servers in their environments.

The evolution of Windows Server: Project Honolulu and what's new in 1709

Early adopters have found the simplicity of management that Project Honolulu provides by rolling up common tasks such as performance monitoring (PerfMon), server configuration and settings tasks, and the management of Windows Services that run on server systems.  This makes these tasks easier for administrators to manage on a mix of servers in their environment.

Updates to server management with the Windows Admin Center (formerly Honolulu) & PowerShell Core

Improvements in security

Microsoft has continued to include built-in security functionality to help organizations address an “expect breach” model of security management.  Rather than assuming firewalls along the perimeter of an enterprise will prevent any and all security compromises, Windows Server 2019 assumes servers and applications within the core of a datacenter have already been compromised.

Windows Server 2019 includes Windows Defender Advanced Threat Protection (ATP), which assesses common vectors for security breaches and automatically blocks and alerts about potential malicious attacks. Users of Windows 10 have received many of the Windows Defender ATP features over the past few months. Including Windows Defender ATP on Windows Server 2019 lets them take advantage of data storage, network transport and security-integrity components to prevent compromises on Windows Server 2019 systems.

The battle to increase security continues unabated and in this version we get Windows Defender ATP Exploit Guard, which is an umbrella for four new features: Network protection blocks outbound access from processes on the server to untrusted hosts/IP address based on Windows Defender SmartScreen information. Controlled folder access protects specified folders against untrusted process access such as ransomware whereas Exploit protection mitigates vulnerabilities in similar ways to what EMET used to do. Finally, Attack Surface Reduction (ASR) lets you set policies to block malicious files, scripts, lateral movement and so on.

Windows Defender Advanced Threat Protection (ATP) is now available for Windows Server, as well, and can integrate with your current deployment.

These measures will increase the security of your Hyper-V hosts but another feature (also first seen in a SAC release) applies directly to virtualization deployments: Encrypted Networks in SDN. A single click when you create a new virtual network in the SDN stack will ensure that all traffic on that network is encrypted, preventing eavesdropping. Note that this does not protect against malicious administrators but curiously, Microsoft has promised such protection in forthcoming versions, bringing the network protection in line with the host security Shielded Virtual Machines offer.

Smaller, more efficient containers

Organizations are rapidly minimizing the footprint and overhead of their IT operations and eliminating more bloated servers with thinner and more efficient containers. Windows Insiders have benefited by achieving higher density of compute to improve overall application operations with no additional expenditure in hardware server systems or expansion of hardware capacity.

Windows Server 2019 has a smaller, leaner ServerCore image that cuts virtual machine overhead by 50-80 percent.  When an organization can get the same (or more) functionality in a significantly smaller image, the organization is able to lower costs and improve efficiencies in IT investments.

There's a lot of focus on hybrid cloud in this preview, which makes sense, given Microsoft's assertion that most businesses will be in a hybrid state for a long time to come. The focus on containers continues with much smaller images available for both the server core and Nano server images.

But the coolest feature yet is the ability to run Linux containers on Windows Server. This first saw light in one of the SAC releases and it makes a lot of sense. Remember that in Windows (unlike Linux) we have two flavors of containers, Windows Containers and Hyper-V Containers. For a developer they work exactly the same and it's a deployment choice (develop on normal containers and deploy in production in Hyper-V containers). The Hyper-V flavor gives you the security isolation of a VM although they're much smaller than a "real" VM. So, the next logical step was running a different OS in the container, in this case Linux. Following a tutorial, I was able to get a Linux  container up and running quickly.

Windows Subsystem for Linux

A decade ago, one would rarely mention Microsoft and Linux in the same breath as complementary platform services, but that has changed. Windows Server 2016 has open support for Linux instances as virtual machines, and the new Windows Server 2019 release makes huge headway by including an entire subsystem optimized for the operation of Linux systems on Windows Server.

The Windows Subsystem for Linux extends basic virtual machine operation of Linux systems on Windows Server, and provides a deeper layer of integration for networking, native filesystem storage and security controls. It can enable encrypted Linux virtual instances. That’s exactly how Microsoft provided Shielded VMs for Windows in Windows Server 2016, and Windows Server 2019 now adds native Shielded VMs for Linux.

Enterprises have found the optimization of containers along with the ability to natively support Linux on Windows Server hosts can decrease costs by eliminating the need for two or three infrastructure platforms, and instead running them on Windows Server 2019.

Because most of the “new features” in Windows Server 2019 have been included in updates over the past couple years, these features are not earth-shattering surprises.  However, it also means that the features in Windows Server 2019 that were part of Windows Server 2016 Semi-Annual Channel releases have been tried, tested, updated and proven already, so that when Windows Server 2019 ships, organizations don’t have to wait six to 12 months for a service pack of bug fixes.

Windows Admin Center

No discussion of the future of Windows Server is complete without mentioning the free, Web-based Windows Admin Center (WAC), formerly known as "Project Honolulu." It's going to be the GUI for managing Windows Server, including Hyper-V servers, clusters, Storage Spaces Direct and HCI clusters. It's got a lot of benefits over the current mix of Server Manager, Hyper-V Manager and Failover Cluster Manager (along with PowerShell) that we use today, including the simple fact that it's all in the one UI.

How to get started with Windows Admin Center

Updates to server management with the Windows Admin Center (formerly Honolulu) & PowerShell Core

Storage Replica & Migration

In Windows Server 2016 (Datacenter only) we finally got the missing puzzle piece in Microsoft's assault on SANs -- Storage Replica (SR). This directly competes with (very expensive) SAN replication technologies and lets you replicate from any volume on a single server or a cluster to another volume in another location (synchronously up to 150 km [90 miles for those of you in the United States], asynchronously anywhere on the planet). This is useful for creating stretched Hyper-V clusters for very high resiliency or for Disaster Recovery (DR) in general.
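To get a feel for why synchronous replication is tied to distance, here is a rough back-of-the-envelope sketch. It assumes light travels through fiber at roughly 200,000 km/s (about two-thirds of its speed in a vacuum) and that the cable runs in a straight line; both are simplifying assumptions of mine, not figures from Microsoft. It also ignores switches, routers and the storage itself, so real round-trip times will be higher.

```python
# Assumption: signal speed in optical fiber is about 200,000 km/s,
# which is 200 km per millisecond. Real links add equipment latency.
FIBER_KM_PER_MS = 200.0


def round_trip_ms(distance_km: float) -> float:
    """Minimum propagation delay for a write and its acknowledgment."""
    return 2 * distance_km / FIBER_KM_PER_MS


# Every synchronous write has to wait at least this long for the remote side.
print(round_trip_ms(150))  # 1.5
```

Because a synchronous write is not acknowledged until the remote copy confirms it, this delay is added to every single write, which is why long-haul replication is done asynchronously.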

In Windows Server 2019 Standard we're getting SR "Lite": a single volume per server (unlimited in Datacenter), a single partnership per volume (unlimited in Datacenter) and up to 2TB volumes (unlimited in Datacenter). These are the current limitations in the preview and voting is open to change this.
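As a planning aid, the three preview limits above can be written down as a small check. This is only an illustrative sketch: the `ReplicatedVolume` class and the `standard_edition_violations` function are hypothetical names of my own, not part of any Microsoft tooling, and I'm assuming "2TB" means binary terabytes.

```python
from dataclasses import dataclass

TB = 1024 ** 4  # assumption: "2TB" is read as binary terabytes


@dataclass
class ReplicatedVolume:
    """Hypothetical description of one volume replicated with SR."""
    name: str
    size_bytes: int
    partnerships: int  # replication partnerships this volume takes part in


def standard_edition_violations(volumes):
    """List the ways one server's replicated volumes exceed the preview limits."""
    problems = []
    if len(volumes) > 1:
        problems.append(f"{len(volumes)} replicated volumes (limit: 1 per server)")
    for vol in volumes:
        if vol.partnerships > 1:
            problems.append(f"{vol.name}: {vol.partnerships} partnerships (limit: 1)")
        if vol.size_bytes > 2 * TB:
            problems.append(f"{vol.name}: larger than 2 TB")
    return problems


ok = [ReplicatedVolume("D:", 1 * TB, 1)]
too_much = [ReplicatedVolume("D:", 3 * TB, 2), ReplicatedVolume("E:", 1 * TB, 1)]
print(standard_edition_violations(ok))        # no violations
print(standard_edition_violations(too_much))  # three violations
```

If the limits change after the preview, only the numbers in the function need updating.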

Hyper-V Replica is a different technology than SR. For instance, you could create a stretched Hyper-V cluster with SR as the transport mechanism for the underlying storage between the two locations and then use Hyper-V Replica for DR, replicating VMs to a third location or to Azure.

A totally new feature, Storage Migration Service, is coming in Windows Server 2019. Intended to solve the problem of migrating from older versions of Windows Server to 2019 or Azure, it's not directly related to Hyper-V, although you can of course use it from within VMs or to migrate data to Azure Stack.

Data Deduplication is now available for Storage Spaces Direct (S2D) with the ReFS filesystem, so you could be looking at saving up to 50 percent of disk space. Speaking of S2D, Microsoft now supports Persistent Memory (aka Storage Class Memory) which is essentially battery-backed DDR memory sticks, leading to storage with incredibly low latency. Also new is performance history for S2D, where you can get a history of performance across drives, NICs, servers, VMs, vhd/vhdx files, volumes and the overall cluster. You can either use PowerShell or Windows Admin Center to access the data.

Failover Clustering

One of the biggest gripes I hear from cluster administrators is the difficulty of moving a cluster from one domain to another (mergers are a common cause of this); this is being addressed in 2019. Using just two PowerShell cmdlets you can remove the cluster name account from the original Active Directory domain, shut down the cluster functionality, unjoin from the source domain and add all nodes to a workgroup, then join them to the new domain and create new cluster resources in the destination AD domain. This definitely adds flexibility around Hyper-V clusters and their domain status.

Speaking of clusters, most businesses I speak to tend to keep the number of nodes in their clusters relatively low (six, eight, 12 and 16 nodes), even though the max number of nodes is 64, and instead have more clusters. Each of these clusters is totally separate but that's going to change in Windows Server 2019. You'll be able to group several clusters together (Hyper-V, Storage and even Hyper-Converged), with a Master cluster resource running on one cluster, coordinating with a Cluster Set Worker in each cluster. You'll be able to Live Migrate VMs from one cluster to another. I can see this being useful for scaling out Azure Stack (currently limited to 12 nodes) and for bringing the concept of the Software-Defined Datacenter (SDDC) closer to reality.

Another minor but potentially vital detail is using a file share witness stored in DFS. This isn't and has never been supported but not everyone reads the documentation. Imagine a six-node cluster with three nodes in a separate building with a file share witness as the tie breaker for the quorum. You could end up in a situation where the network connection between the two buildings is severed and the three nodes on one side keep the cluster service (and thus the VMs) running because they can talk to the file share witness. But the other side has a DFS replicated copy of the same file share witness, so they, too, decide to keep the cluster service running (as they also have a majority of votes) and both sides could potentially be writing to back-end storage simultaneously, leading to serious data corruption. In Windows Server 2019 if you try to store a file share witness in DFS you'll get an error message and if it's added to DFS replication at some point in time later, it'll stop working.
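The vote counting behind that scenario can be sketched in a few lines. This is a deliberately simplified model of majority quorum (one vote per node plus one for the witness); it is not how the cluster service is implemented and it ignores features such as dynamic quorum. The `Partition` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """One side of a severed network link (hypothetical model)."""
    name: str
    nodes: int          # cluster nodes on this side that can see each other
    sees_witness: bool  # True if this side can reach a copy of the witness


def has_quorum(partition: Partition, total_nodes: int) -> bool:
    """A side keeps running only with a strict majority of all votes."""
    total_votes = total_nodes + 1  # every node plus one witness vote
    votes = partition.nodes + (1 if partition.sees_witness else 0)
    return votes > total_votes // 2


def split_brain(partitions, total_nodes: int) -> bool:
    """True if more than one side believes it holds the majority."""
    return sum(has_quorum(p, total_nodes) for p in partitions) > 1


TOTAL_NODES = 6
# Proper witness: only one building can reach it after the link is cut.
single_witness = [Partition("building A", 3, True), Partition("building B", 3, False)]
# Witness on DFS: each building sees its own replicated copy.
dfs_witness = [Partition("building A", 3, True), Partition("building B", 3, True)]

print(split_brain(single_witness, TOTAL_NODES))  # False
print(split_brain(dfs_witness, TOTAL_NODES))     # True
```

With seven votes in total, a side needs four to keep running. A replicated witness hands that fourth vote to both buildings at once, which is exactly the situation the new DFS check is meant to prevent.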

You can also create a file share witness that doesn't use an AD account for scenarios where a DC isn't available (DMZ), or in a workgroup/cross-domain cluster.

Hyper-converged infrastructure (HCI): In Windows Server 2019, HCI will get improvements in scale, performance, and reliability. The team is also adding the ability to manage HCI deployments in Project Honolulu, to simplify the management and day-to-day activities on HCI environments.

Windows Server 2019 will be integrated with Project Honolulu, a browser-based management solution. Microsoft aims to make it easier for enterprises to connect their existing deployments of Windows Server to Azure services.

“With Windows Server 2019 and Project Honolulu, customers will be able to easily integrate Azure services such as Azure Backup, Azure File Sync, disaster recovery, and much more so they will be able to leverage these Azure services without disrupting their applications and infrastructure,” wrote Erin Chapple, Director of Program Management, Windows Server.

Microsoft is enhancing the security in Windows Server 2019 with a three-point approach: protect, detect and respond. The company has extended Shielded VMs to support Linux VMs as well, protecting those VMs against malicious activity. The addition of Encrypted Networks will enable encryption of network segments to protect the network layer between servers.

Windows Server 2019 will have embedded Windows Defender Advanced Threat Protection (ATP) to detect attacks in the operating system. Sysadmins will have access to deep kernel and memory sensors, so that they can respond on server machines.

Under application platform, there will be improved orchestration for Windows Server container deployments. Windows Subsystem for Linux (WSL) support in the new version will enable Linux users to bring their scripts to Windows while using industry standards like OpenSSH, Curl, and Tar. There is also support for Kubernetes, which is currently in beta.

Windows Server 2019 reduces the size of the Server Core base container image from 5 GB to less than 2 GB. This will reduce the image download time by 72%, resulting in optimized development time and performance.
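The two figures quoted above can be related with some quick arithmetic. The sketch below assumes that download time scales in direct proportion to image size (constant bandwidth, no compression or layer caching), which is my simplification and not something Microsoft states.

```python
def reduction_percent(old_gb: float, new_gb: float) -> float:
    """Percentage by which the image shrinks going from old_gb to new_gb."""
    return (old_gb - new_gb) / old_gb * 100


def size_after_reduction(old_gb: float, percent: float) -> float:
    """Image size left after shrinking old_gb by the given percentage."""
    return old_gb * (1 - percent / 100)


# Shrinking from 5 GB to exactly 2 GB would be a 60% reduction.
print(round(reduction_percent(5, 2), 1))      # 60.0
# A 72% reduction from 5 GB corresponds to an image of about 1.4 GB.
print(round(size_after_reduction(5, 72), 2))  # 1.4
```

Under that assumption, the 72% figure is consistent with "less than 2 GB": it points to an image of roughly 1.4 GB.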

On the hyper-converged infrastructure (HCI) front, Microsoft said that it has added the ability in Windows Server 2019 to manage HCI deployments using Project Honolulu, which should simplify day-to-day management of HCI environments.

This is a significant change: it is helping organizations plan their adoption of Windows Server 2019 sooner than they might have adopted a major release in the past. It also brings significant improvements for enterprise datacenters, which stand to gain the security, scalability, and datacenter optimization so badly needed in today’s fast-paced environments.

Sign up for the Insiders program to access Windows Server 2019

We know you probably cannot wait to get your hands on the next release, and the good news is that the preview build is available today to Windows Insiders at https://insider.windows.com/en-us/for-business-getting-started-server/.

Join the program to ensure you have access to the bits. For more details on this preview build, check out the Release Notes.

We love hearing from you, so don’t forget to provide feedback using the Windows Feedback Hub app, or the Windows Server space in the Tech community.

Frequently asked questions

Q: When will Windows Server 2019 be generally available?

A: Windows Server 2019 will be generally available in the second half of calendar year 2018.

Q: Is Windows Server 2019 a Long-Term Servicing Channel (LTSC) release?

A: Windows Server 2019 will mark the next release in our Long-Term Servicing Channel. LTSC continues to be the recommended version of Windows Server for most of the infrastructure scenarios, including workloads like Microsoft SQL Server, Microsoft SharePoint, and Windows Server Software-defined solutions.

Q: What are the installation options available for Windows Server 2019?

A: As an LTSC release, Windows Server 2019 provides the Server with Desktop Experience and Server Core installation options – in contrast to the Semi-Annual Channel, which provides only the Server Core installation option and Nano Server as a container image. This will ensure application compatibility for existing workloads.

Q: Will there be a Semi-Annual Channel release at the same time as Windows Server 2019?

A: Yes. The Semi-Annual Channel release scheduled to ship at the same time as Windows Server 2019 will bring container innovations and will follow the regular support lifecycle for Semi-Annual Channel releases – 18 months.

Q: Does Windows Server 2019 have the same licensing model as Windows Server 2016?

A: Yes. Check more information on how to license Windows Server 2016 today in the Windows Server Pricing page. It is highly likely we will increase pricing for Windows Server Client Access Licensing (CAL). We will provide more details when available.

More Information: