• IBM Consulting

    DBA Consulting can help you with IBM BI and web-related work. IBM Linux is also part of our portfolio.

  • Oracle Consulting

    For Oracle-related consulting, database work, support, and migration, call DBA Consulting.

  • Novell/RedHat Consulting

    For all Novell SUSE Linux and SAP on SUSE Linux questions related to the OS and BI solutions. And, of course, also for the great Red Hat products such as Red Hat Enterprise Linux, JBoss middleware, and BI on Red Hat.

  • Microsoft Consulting

    For consulting services related to Microsoft Windows Server 2012 onwards, Microsoft Windows 7 clients and higher, and Microsoft cloud services (Azure, Office 365, etc.).

  • Citrix Consulting

    Citrix VDI-in-a-Box, desktop virtualization, and Citrix NetScaler security.

  • Web Development

    Web development: static websites, CMS websites (Drupal 7/8, WordPress, Joomla), responsive websites, and adaptive websites.

24 September 2018

Kubernetes for the Enterprise!

Announcing SUSE CaaS Platform 3

Containers for Big Data: How MapR Expands Containers Use to Access Data Directly

Every enterprise needs Kubernetes today, including yours.  But with the platform evolving so rapidly, it can be difficult to keep up.  Not to worry, SUSE can take care of that for you: SUSE CaaS Platform delivers Kubernetes advancements to you in an enterprise-grade solution.

SUSE and Big Data

SUSE today announced SUSE CaaS Platform 3, introducing support for a raft of new features and a special focus on the Kubernetes platform operator.  You can read all about it in the press release, but let’s hit on a few of the highlights here.  With SUSE CaaS Platform 3 you can:

Optimize your cluster configuration with expanded datacenter integration and cluster re-configuration options
Setting up your Kubernetes environment is easier than ever with improved integration of private (OpenStack) and public (Amazon Web Services, Microsoft Azure, and Google Cloud Platform) cloud storage, and automatic deployment of the Kubernetes software load balancer.

Persistent Storage for Docker Containers | Whiteboard Walkthrough

A new SUSE toolchain module also allows you to tune the MicroOS container operating system to support your custom hardware configuration needs. Now you can, for example, install additional packages required to run your own monitoring agent or other custom software.
Transform your start-up cluster into a highly available environment. With new cluster reconfiguration capabilities, you can switch from a single-master to a multi-master environment, or vice-versa, to accommodate your changing needs.

Manage container images more efficiently and securely with a local container registry
Download a container image from an external registry once, then save a copy in your own local registry for sharing among all nodes in your cluster. By connecting to an internal proxy rather than an external registry, and by downloading from a local cache rather than a remote server, you’ll improve security and increase performance every time a cluster node pulls an image from the local registry.
For still greater security, disconnect from external registries altogether and use only trusted images you’ve loaded into your local registry.
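The pull-through caching behaviour described above can be sketched in a few lines of Python; the image name and blob format here are invented purely for illustration:

```python
# Conceptual sketch of a pull-through container image cache: the first
# request fetches from the external registry, every later request is
# served from the local copy.
class LocalRegistry:
    def __init__(self, remote_fetch):
        self._remote_fetch = remote_fetch   # callable simulating an external pull
        self._cache = {}                    # image name -> image blob
        self.remote_pulls = 0

    def pull(self, image):
        if image not in self._cache:        # only the first pull leaves the cluster
            self._cache[image] = self._remote_fetch(image)
            self.remote_pulls += 1
        return self._cache[image]

registry = LocalRegistry(lambda name: f"blob-of-{name}")
for _ in range(3):                          # three cluster nodes pull the same image
    registry.pull("suse/sles:15")
print(registry.remote_pulls)                # prints 1: one remote download
```

The same shape explains the security benefit: once the cache is warm, the external registry can be disconnected entirely.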

Try out the new, lightweight CRI-O container runtime, designed specifically for Kubernetes, and introduced in CaaSP 3 as a tech preview feature. Stable and secure, CRI-O is also smaller and architecturally simpler than traditional container runtimes.

Simplify deployment and management of long running workloads through the Apps Workloads API. Promoted to ‘stable’ in upstream Kubernetes 1.9 code, the Apps Workloads API is now supported by SUSE.  This API generally facilitates orchestration (self-healing, scaling, updates, termination) of common types of workloads.
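The kind of orchestration the Apps Workloads API provides (self-healing, scaling) can be illustrated with a toy reconciliation loop; this is plain Python, not the real Kubernetes API, and the pod names are invented:

```python
# Toy reconciliation loop in the spirit of the Apps Workloads API: the
# controller compares the desired replica count with the running pods
# and converges the actual state toward the desired state.
def reconcile(desired, running):
    """Return the pod set adjusted toward the desired replica count."""
    running = set(running)
    while len(running) < desired:                 # scale up / self-heal
        running.add(f"pod-{len(running)}")
    while len(running) > desired:                 # scale down / terminate
        running.pop()
    return running

pods = reconcile(desired=3, running={"pod-0"})    # two pods crashed earlier
print(sorted(pods))                               # prints ['pod-0', 'pod-1', 'pod-2']
```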

Modern Big Data Pipelines over Kubernetes [I] - Eliran Bivas, Iguazio

With Kubernetes now a must-have for every enterprise, you’ll want to give SUSE CaaS Platform a serious look.  Focused on providing an exceptional platform operator experience, it delivers Kubernetes innovations in a complete, enterprise grade solution that enables IT to deliver the power of Kubernetes to users more quickly, consistently, and efficiently.

SUSE CaaS Platform also serves as the Kubernetes foundation for SUSE Cloud Application Platform, which addresses modern application developers’ needs by bringing the industry’s most respected cloud-native developer experience (Cloud Foundry) into a Kubernetes environment.

SUSE CaaS Platform

SUSE CaaS Platform is an enterprise class container management solution that enables IT and DevOps professionals to more easily deploy, manage, and scale container-based applications and services. It includes Kubernetes to automate lifecycle management of modern applications, and surrounding technologies that enrich Kubernetes and make the platform itself easy to operate. As a result, enterprises that use SUSE CaaS Platform can reduce application delivery cycle times and improve business agility.

SUSE, Hadoop and Big Data Update. Stephen Mogg, SUSE UK

SUSE is focused on delivering an exceptional operator experience with SUSE CaaS Platform.

HDFS on Kubernetes—Lessons Learned - Kimoon Kim

With deep competencies in infrastructure, systems, process integration, platform security, lifecycle management and enterprise-grade support, SUSE aims to ensure IT operations teams can deliver the power of Kubernetes to their users quickly, securely and efficiently. With SUSE CaaS Platform you can:

Achieve faster time to value with an enterprise-ready container management platform, built from industry leading technologies, and delivered as a complete package, with everything you need to quickly offer container services.

Simplify management and control of your container platform with efficient installation, easy scaling, and update automation.

Maximize return on your investment, with a flexible container services solution for today and tomorrow.

Episode 3: Kubernetes and Big Data Services

Key Features

A Cloud Native Computing Foundation (CNCF) certified Kubernetes distribution, SUSE CaaS Platform automates the orchestration and management of your containerized applications and services with powerful Kubernetes capabilities, including:

  • Workload scheduling places containers according to their needs while improving resource utilization
  • Service discovery and load balancing provides an IP address for your service, and distributes load behind the scenes
  • Application scaling up and down accommodates changing load
  • Non-disruptive Rollout/Rollback of new applications and updates enables frequent change without downtime
  • Health monitoring and management supports application self-healing and ensures application availability
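As a rough illustration of the service discovery and load balancing item above, here is a minimal round-robin distributor; the endpoint addresses are invented:

```python
import itertools

# Minimal sketch of the service abstraction: one stable entry point in
# front of several pod endpoints, with requests distributed round-robin
# behind the scenes.
class Service:
    def __init__(self, endpoints):
        self._cycle = itertools.cycle(endpoints)

    def route(self):
        return next(self._cycle)

svc = Service(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([svc.route() for _ in range(4)])  # wraps back to the first endpoint
```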

In addition, SUSE CaaS Platform simplifies the platform operator’s experience, with everything you need to get up and running quickly, and to manage the environment effectively in production. It provides:

  • Application ecosystem support with SUSE Linux container base images, and access to tools and services offered by SUSE Ready for CaaS Platform partners and the Kubernetes community
  • Enhanced datacenter integration features that enable you to plug Kubernetes into new or existing infrastructure, systems, and processes
  • A complete container execution environment, including a purpose-built container host operating system, container runtime, and container image registries
  • End-to-End security, implemented holistically across the full stack
  • Advanced platform management that simplifies platform installation, configuration, re-configuration, monitoring, maintenance, updates, and recovery
  • Enterprise hardening including comprehensive interoperability testing, support for thousands of platforms, and world-class platform maintenance and technical support

Cisco and SUSE have collaborated for years on solutions that improve efficiencies and lower costs in the data center by leveraging the flexibility and value of the UCS platform and the performance and reliability of the SUSE Linux Enterprise Server.

With focus and advancement in the areas of compute, storage and networking, Cisco and SUSE are now looking to help organizations tackle the challenges associated with the ‘5 Vs’ of Big Data:

1. Volume
2. Variety
3. Velocity
4. Veracity (of data)
5. Value

Ian Chard of Cisco recently published a great read untangling these challenges and pointing to areas that help harness the power of data analytics.
Article content below:

The harnessing of data through analytics is key to staying competitive and relevant in the age of connected computing and the data economy.

Analytics now combines statistics, artificial intelligence, machine learning, deep learning and data processing in order to extract valuable information and insights from the data flowing through your business.

Unlock the Power of Kubernetes for Big Data by Joey Zwicker, Pachyderm

Your ability to harness analytics defines how well you know your business, your customers, and your partners – and how quickly you understand them.

But it’s still hard to gain valuable insights from data. Collectively the challenges are known as the ‘5 Vs of big data’:

The Volume of data has grown so much that traditional relational database management software running on monolithic servers is incapable of processing it.

The Variety of data has also increased. There are many more sources of data and many more different types.

Velocity describes how fast the data is coming in. It has to be processed, often in real time, and stored in huge volume.

Veracity of data refers to how much you can trust it. Traditional structured data (i.e. in fixed fields or formats) goes through a validation process. This approach does not work with unstructured (i.e. raw) data.

Deriving Value from the data is hard due to the above.
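The veracity point can be made concrete with a small validation sketch; the schema and records below are invented for illustration:

```python
# Sketch of the veracity problem: structured records can be validated
# field by field against a schema, while raw unstructured input cannot,
# so it is tagged for downstream cleansing instead.
def validate(record, schema):
    if not isinstance(record, dict):          # unstructured / raw input
        return "needs-cleansing"
    ok = all(isinstance(record.get(f), t) for f, t in schema.items())
    return "trusted" if ok else "rejected"

schema = {"id": int, "amount": float}
print(validate({"id": 1, "amount": 9.99}, schema))   # prints trusted
print(validate({"id": "x", "amount": 9.99}, schema)) # prints rejected
print(validate("free-form log line", schema))        # prints needs-cleansing
```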

If you’re wrestling with the 5 Vs, chances are you’ll be heading to ExCeL London for the annual Strata Data Conference on 22-24 May 2018.

We’ll be there on Booth 316, together with our partners including SUSE, where we’ll be showcasing how progress in compute, storage, and networking, as well as in distributed data processing frameworks, can help to address these challenges.

1) The Infrastructure evolution
Compute demands are growing in direct response to data growth. More powerful servers, or more servers working in parallel – aka scale-out – are needed.

Deep learning techniques, for example, can absorb an insatiable amount of data, making a robust HDFS cluster a great way to achieve scale-out storage for the collection and preparation of the data. Machine learning algorithms can run on traditional x86 CPUs, but GPUs can accelerate these algorithms by up to a factor of 100.

New approaches to data analytics applications and storage are also needed because the majority of the data available is unstructured. Email, text documents, images, audio, and video are data types that are a poor fit for relational databases and traditional storage methods.

Google Cloud Dataproc - Easier, faster, more cost-effective Spark and Hadoop

Storing data in the public cloud can ease the load. But as your data grows and you need to access it more frequently, cloud services can become expensive, while the sovereignty of that data can be a concern.

Software-defined storage is a storage virtualisation technology that allows you to shift large amounts of unstructured data to cost-effective, flexible solutions located on-premises. This assures performance and data sovereignty while reducing storage costs over time.

You can use platforms such as Hadoop to create shared repositories of unstructured data known as data lakes. Running on a cluster of servers, data lakes can be accessed by all users. However, they must be managed in a way that’s compliant, using enterprise-class data management platforms that allow you to store, protect and access data quickly and easily.

2) Need for speed
The pace of data analytics innovation continues to increase. Previously, you would define your data structures and build an application to operate on the data. The lifetime of such applications was measured in years.

Today, raw data is collected and explored for meaningful patterns using applications that are rebuilt when new patterns emerge. The lifetime of these applications is measured in months – and even days.

The value of data can also be short-lived. There’s a need to analyse it at source, as it arrives, in real time. Data solutions that employ in-memory processing for example, give your users immediate, drill-down access to all the data across your enterprise applications, data warehouses and data lakes.

3) Come see us at Strata Data Conference
Ultimately, your ability to innovate at speed with security and governance assured comes down to your IT infrastructure.

Cisco UCS is a trusted computing platform proven to deliver lower TCO and optimum performance and capacity for data-intensive workloads.

85% of Fortune 500 companies and more than 60,000 organisations globally rely on our validated solutions. These combine our servers with software from a broad ecosystem of partners to simplify the task of pooling IT resources and storing data across systems.

Modern big data and machine learning in the era of cloud, docker and kubernetes

Crucially, they come with role- and policy-based management, which means you can configure hundreds of storage servers as easily as you can configure one, making scale-out a breeze as your data analytics projects mature.

If you’re looking to transform your business and turn your data into insights faster, there are plenty of reasons to come visit us on Booth 316:

4) Accelerated Analytics
If your data lake is deep and your data scientists are struggling to make sense of what lies beneath, then our MapD demo, powered by data from mobile masts, will show you how to cut through the depths and find the insights you seek, fast.

5) Deep learning with Cloudera Data Science Workbench
For those with a Hadoop cluster to manage their data lakes and deep learning framework, we’ll be demonstrating how to accelerate the training of deep learning models with Cisco UCS C240 and C480 servers equipped with 2 and 6 GPUs respectively. We’ll also show you how to support growing cluster sizes using cloud-managed service profiles rather than more manpower.

6) Get with the Cisco Gateway
If you’re already a customer and fancy winning some shiny new tech, why not step through the Gateway to grow your reputation as a thought leader and showcase the success you’ve had?

7) Find your digital twin
To effectively create a digital twin of the enterprise, data scientists have to incorporate data sources inside and outside of the data centre for a holistic 360-degree view. Come join our resident expert Han Yang for his session on how we’re benefiting from big data and analytics, as well as helping our customers to incorporate data sources from the Internet of Things and deploy machine learning at the edge and in the enterprise.

8) Get the scoop with SUSE
We’re set to unveil a new integration of SUSE Linux Enterprise Server and Cisco UCS. There’ll be SUSE specialists on our booth, so you can be the first to find out more about what’s in the pipeline.

What is Kubernetes?

Kubernetes is an open source system for automatically orchestrating and managing containerized applications.

AI meets Big Data

Designing applications using open source Linux containers is an ideal approach for building cloud-native applications for hosting in private, public or hybrid clouds. Kubernetes automates the deployment, management and scaling of these containerized applications, making the whole process easier, faster and more efficient.

Businesses of all types are looking for a new paradigm to drive faster innovation and agility, and this is changing forever how applications are architected, deployed, scaled and managed. Kubernetes has become widely embraced by almost everyone interested in dramatically accelerating application delivery with containerized and cloud-native workloads.

Kubernetes is now seen as the outright market leader by software developers, operations teams, DevOps professionals and IT business decision makers.

Manage Microservices & Fast Data Systems on One Platform w/ DC/OS

Kubernetes Heritage

Kubernetes was originally the brainchild of Google. Google has been building and managing container-based applications and cloud-native workloads in production and at scale for well over a decade. Kubernetes emerged from the knowledge and experience gained with earlier Google container management systems called Borg and Omega.

Extending DevOps to Big Data Applications with Kubernetes

Now an open source project, Kubernetes is under the stewardship of the Cloud Native Computing Foundation (CNCF) and The Linux Foundation. This ensures that the project benefits from the best ideas and practices of a huge open source community and avoids the danger of vendor lock-in.

Key Features:

  • Deploy applications rapidly and predictably to private, public or hybrid clouds
  • Scale applications non-disruptively
  • Roll out new features seamlessly
  • Lean and efficient use of computing resources
  • Keep production applications up and running with self-healing capabilities

SUSE and Kubernetes

SUSE believes Kubernetes will be a key element of the application delivery solutions needed to drive the enterprise business of the future.

Big data and Kubernetes

Here is a selection of SUSE products built using Kubernetes:

SUSE Cloud Application Platform brings advanced Cloud Foundry productivity to modern Kubernetes infrastructure, helping software development and operations teams to streamline lifecycle management of traditional and new cloud-native applications. Building on
SUSE CaaS Platform, SUSE Cloud Application Platform adds a unique Kubernetes-based implementation of Cloud Foundry, introducing a powerful DevOps workflow into a Kubernetes environment. Built on enterprise-grade Linux and with full Cloud Foundry and Kubernetes certification, it is an outstanding platform to support the entire development lifecycle for traditional and new cloud-native applications.

SUSE OpenStack Cloud makes it easy to spin up Kubernetes clusters in a full multi-tenant environment, allowing different users to have their own Kubernetes cluster. Customers can use either the built-in support for OpenStack Magnum or leverage SUSE CaaS Platform, which gives the added benefits of ready-to-run images, templates and heat automation. With these Kubernetes-as-a-Service capabilities, it’s no wonder that OpenStack users are reported to be adopting containers 3 times faster than the rest of the enterprise market.

SUSE CaaS Platform is a certified Kubernetes software distribution. It provides an enterprise-class container management solution that enables IT and DevOps professionals to more easily deploy, manage, and scale container-based applications and services. Using SUSE CaaS Platform, enterprises can reduce application delivery cycle times and improve business agility.

What's the Hadoop-la about Kubernetes?


Big data, long an industry buzzword, is now commonplace among most businesses. A 2014 survey from Gartner found 73 percent of organizations had already invested or planned to invest in big data by 2016. For many companies, the question now is not how to manage and harness data, but how to do it even more effectively. The next frontier for big data is to master speed. If you can’t analyze big data in real time, you lose much of the value of the information passing through databases.

What is fast data?

Fast Data with Apache Ignite and Apache Spark - Christos Erotocritou

While big data refers to the massive fire hose of information generated each hour, fast data refers to data that provides real-time insights. In many industries, especially the payment industry, making quick analyses of information is crucial to the bottom line. For example, fast data could prevent a massive breach that would release sensitive customer information. In this case, analyzing data in real time is far more important than storing it in massive quantities. When it comes to ecommerce fraud, the insights happening in the moment matter the most.
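A minimal sketch of such real-time scoring, assuming an invented velocity rule (flag a card that transacts too often within a sliding window); thresholds, card IDs and timestamps are all made up:

```python
from collections import deque

# Toy real-time fraud scorer: flag a card if it makes too many
# transactions inside a short sliding time window.
class VelocityCheck:
    def __init__(self, window_s=60, max_txns=3):
        self.window_s, self.max_txns = window_s, max_txns
        self.seen = {}                        # card -> deque of timestamps

    def score(self, card, ts):
        q = self.seen.setdefault(card, deque())
        while q and ts - q[0] > self.window_s:
            q.popleft()                       # drop events outside the window
        q.append(ts)
        return "flag" if len(q) > self.max_txns else "ok"

v = VelocityCheck()
results = [v.score("card-1", t) for t in (0, 10, 20, 30)]
print(results)  # the fourth hit inside 60 s is flagged: ['ok', 'ok', 'ok', 'flag']
```

The decision is made per event, as it arrives, rather than in an end-of-day batch.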

Kubernetes vs Docker Swarm | Container Orchestration War | Kubernetes Training | Edureka

As a Wired article put it, where in the past, gaining insights from big data was like finding a needle in a haystack, fast data is like finding the needle as soon as it’s dropped.

Fast data for payments

“For payment systems, decisions must be made in the sub-second range,” Richard Harris, head of international operations at Feedzai, recently told Payment Cards and Mobile. “Our clients typically require 20-50 millisecond response times. So we’ve overcome this by using technology founded in the Big Data era, such as Hadoop and Cassandra.”

Apache Spark on Kubernetes - Anirudh Ramanathan & Tim Chen

Payment processor First Data and Feedzai have teamed up to use machine learning to fight fraud. Feedzai monitors the company’s STAR Network, which enables debit payments for First Data’s clients.

As Todd Clark, Senior Vice President and Head of STAR Network and Debit Processing at First Data, explained, the combination of Feedzai’s machine learning software and First Data’s experience has made the STAR Network capable of scoring over 3,000 transactions per second.

“This big speed and accuracy advantage means the STAR network is less of an attractive target for fraud,” Harris said.

Infrastructure challenges

Not all systems are set up to handle fast data. Without the right tools to manage the data flow quickly, valuable insights are lost or gained too late to be of use. While many existing platforms can handle and store large quantities of data, most fall behind when it comes to analyzing the information in real time. To begin with, organizations need to move beyond systems that only allow batch processing, according to Wired. In that model, companies tell computers to analyze large batches of information, which are processed one at a time – similar to the way credit card bills are processed at the end of each month.
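The contrast between batch and streaming processing described above can be shown in a few lines:

```python
# Batch vs streaming: batch waits until the whole dataset has arrived,
# streaming updates a running aggregate after every event.
def batch_total(events):
    return sum(events)                 # one answer, after everything arrived

def stream_totals(events):
    total, out = 0, []
    for e in events:                   # an insight is available per event
        total += e
        out.append(total)
    return out

events = [5, 3, 7]
print(batch_total(events))    # prints 15
print(stream_totals(events))  # prints [5, 8, 15]
```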

With most companies now set up to gain insights from big data, the next step is to enable real-time insights. In the payment world, this means catching potential fraud as it’s happening, not waiting until it has already happened.

Beyond Hadoop: The Rise of Fast Data

Over the past two to three years, companies have started transitioning from big data, where analytics are processed after-the-fact in batch mode, to fast data, where data analysis is done in real-time to provide immediate insights. For example, in the past, retail stores such as Macy’s analyzed historical purchases by store to determine which products to add to stores in the next year. In comparison, Amazon drives personalized recommendations based on hundreds of individual characteristics about you, including what products you viewed in the last five minutes.

Containerized Hadoop beyond Kubernetes

Big data is collected from many sources in real time, but is processed after collection, in batches, to provide information about the past. Much of the value is lost if real-time streaming data is simply dumped into a database, because it can no longer be acted on as it arrives.

Super Fast Real-time Data Processing on Cloud-Native Architecture [I] - Yaron Haviv, iguazio

Modern applications need to respond to events happening now, to provide insights in real time. To do this they use fast data, which is processed as it is collected to provide real-time insights. Whereas big data provided insights into user segmentation and seasonal trending using descriptive (what happened) and predictive analytics (what will likely happen), fast data allows for real-time recommendations and alerting using prescriptive analytics (what should you do about it).

Big help for your first big data project

It’s clear. Today, big data is changing the way companies work. What hasn’t been clear is how companies should go about implementing big data projects.

Until now.

Our highly practical workbook is full of advice about big data that’ll help you keep your project on track. From setting clear goals to strategic resourcing and ideal big data architectures, we’ve covered everything you need to know about big data.

Streaming Big Data with Heron on Kubernetes Cluster

Read “The Big, Big Data Workbook” to gain insights into:

  • How to choose the right project and set up the right goals
  • How to build the right team and maximize productivity
  • What your data governance framework should look like
  • The architecture and processes you should aim to build

“The Big, Big Data Workbook” is a comprehensive guide to the practical aspects of big data and an absolute must-read if you’re attempting to bring greater insights to your enterprise.

More Information:

22 August 2018

Oracle Database 18.3.0 On-Premises Released in July

Oracle Database 18.3.0 on premises

Today we have a guest blogger, Dominic Giles, Master Product Manager for Oracle Database, providing us with insights into what to expect from Oracle Database 18c.

Oracle Database 18c

Oracle Database 18c's arrival marks a change in the way the world’s most popular database is released. It brings new functionality and improvements to features already available in Oracle Database 12c. In this blog, I'll highlight what you can expect from this new release and where you can get additional information but first let me address the new release model that the Database team has adopted.

Release schedule

Oracle Database 18c is the first version of the product to follow a yearly release pattern. From here onwards the Oracle Database will be released every year, along with quarterly updates. You can find more details on this change by visiting Oracle Support and taking a look at support Document 2285040.1, or on Mike Dietrich’s blog. If you’re confused as to why we’ve apparently skipped 6 releases, it may be simpler to regard “Oracle Database 18c” as “Oracle Database 12c Release 2”, where we’ve simply changed the naming to reflect the year in which the product is released.

Into the Future with Oracle Autonomous Database

We believe the move to a yearly release model and the simplification of the patching process will result in a product that introduces new smaller changes more frequently without the potential issues that a monolithic update brings.

New Release and Patching Model for Oracle Database


Building on a strong foundation

Oracle Database 18c, as I mentioned earlier, is the next iteration of Oracle Database 12c Release 2, and as a result it has a lot of incremental enhancements aimed at improving upon this important release. With that in mind, let’s remind ourselves what was in Oracle Database 12c Release 2.

Oracle Autonomous Data Warehouse Cloud Demo

The release itself focused on 3 major areas:

Multitenant is Oracle’s strategic container architecture for the Oracle Database. It introduced the concept of a pluggable database (PDB), enabling users to plug and unplug their databases and move them to other containers, either locally or in the cloud. The architecture enables massive consolidation and the ability to manage/patch/backup many databases as one. We introduced this architecture in Oracle Database 12c and extended its capabilities in Oracle Database 12c Release 2 with the ability to hot clone, online relocate, and provide resource controls for I/O, CPU, and memory on a per-PDB basis. We also ensured that all of the features available in a non-container database are available for a PDB (Flashback Database, Continuous Query, etc.).

Database In-Memory enables users to perform lightning-fast analytics against their operational databases without being forced to acquire new hardware or make compromises in the way they process their data. The Oracle Database enables users to do this by adopting a dual in-memory model where OLTP data is held both as rows, enabling it to be efficiently updated, and in a columnar form, enabling it to be scanned and aggregated much faster. This columnar in-memory format then leverages compression and software-in-silicon to analyze billions of rows a second, meaning reports that used to take hours can now be executed in seconds. In Oracle Database 12c Release 2 we introduced many new performance enhancements and extended this capability with new features that enabled us to perform In-Memory analytics on JSON documents, as well as significantly improving the speed at which the In-Memory column store is available to run queries after startup.
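A toy illustration of the dual row/columnar idea; the table, values, and query here are invented:

```python
# The same table kept two ways: as rows (cheap point updates) and as
# columns (fast scans and aggregation). The columnar copy is rebuilt
# on demand here purely for illustration.
rows = [
    {"id": 1, "region": "EU", "amount": 100.0},
    {"id": 2, "region": "US", "amount": 250.0},
    {"id": 3, "region": "EU", "amount": 50.0},
]

# Row format: update one record in place, as OLTP workloads do.
rows[0]["amount"] = 120.0

# Columnar format: one contiguous list per column, ideal for analytics.
columns = {k: [r[k] for r in rows] for k in rows[0]}
eu_total = sum(a for reg, a in zip(columns["region"], columns["amount"])
               if reg == "EU")
print(eu_total)  # prints 170.0
```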

Oracle Database Sharding, released in Oracle Database 12c Release 2, provides OLTP scalability and fault isolation for users that want to scale outside of the usual confines of a typical SMP server. It also supports use cases where data needs to be placed in a specific geographic location for performance or regulatory reasons. Oracle Sharding provides superior run-time performance and simpler life-cycle management compared to home-grown deployments that use a similar approach to scalability. Users can automatically scale up the shards to reflect increases in workload, making Oracle one of the most capable and flexible approaches to web-scale workloads for the enterprise today.
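Key-based shard routing of the kind described can be sketched as follows; the shard names and key format are invented, and a stable hash of the sharding key always maps the same key to the same shard:

```python
import hashlib

# Sketch of key-based shard routing: hash the sharding key and pick one
# of N shards, so a given customer always lands on the same shard.
def shard_for(key, shards):
    digest = hashlib.sha256(key.encode()).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]

shards = ["shard-a", "shard-b", "shard-c"]
assert shard_for("customer-42", shards) == shard_for("customer-42", shards)
print(shard_for("customer-42", shards))
```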

Oracle Database 18c (autonomous database)

Oracle 12c Release 2 also included over 600 new features, ranging from syntax improvements to features like improved Index Compression, Real-Time Materialized Views, Index Usage Statistics, improved JSON support, enhancements to Real Application Clusters, and many, many more. I’d strongly recommend taking a look at the “New Features Guide for Oracle Database 12c Release 2”, available below.

Oracle Autonomous Data Warehouse – How It Works

Incremental improvements across the board

As you’d expect from a yearly release, Oracle Database 18c doesn’t contain any seismic changes in functionality, but there are lots of incremental improvements. These range from syntax enhancements to improvements in performance; some will require that you explicitly enable them, whilst others will happen out of the box. Whilst I’m not going to be able to cover all of the many enhancements in detail, I’ll do my best to give you a flavor of some of these changes. To do this I’ll break the improvements into 6 main areas: Performance, High Availability, Multitenant, Security, Data Warehousing, and Development.

Oracle database in cloud, dr in cloud and overview of oracle database 18c


For users of Exadata and Real Application Clusters (RAC), Oracle Database 18c brings changes that will enable a significant reduction in the amount of undo that needs to be transferred across the interconnect. It achieves this by using RDMA, over the InfiniBand connection, to access the undo blocks in the remote instance. This feature, combined with a local commit cache, significantly improves the throughput of some OLTP workloads when running on top of RAC. This, together with all of the performance optimizations that Exadata brings to the table, cements its position as the highest-performance Database Engineered System for both OLTP and Data Warehouse workloads.

To support applications that fetch data primarily via a single unique key, Oracle Database 18c provides a memory-optimized lookup capability. Users simply need to allocate a portion of Oracle’s memory (SGA) and identify which tables they want to benefit from this functionality; the database takes care of the rest. SQL fetches are significantly faster as they bypass the SQL layer and utilize an in-memory hash index to reduce the number of operations that need to be performed to get the row. For some classes of application this functionality can result in upwards of a 4x increase in throughput with a halving of response times.
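A rough sketch of why a hash index beats a scan for single-key fetches; the table contents and names here are invented, and Python's dict stands in for the in-memory hash index:

```python
# An in-memory hash index maps the unique key straight to the row,
# avoiding a walk over every row of the table.
table = [(i, f"name-{i}") for i in range(100_000)]

def lookup_scan(key):                 # the slow path: inspect each row
    for k, v in table:
        if k == key:
            return v

index = {k: v for k, v in table}      # built once; O(1) per lookup after

def lookup_indexed(key):
    return index.get(key)

assert lookup_scan(99_999) == lookup_indexed(99_999) == "name-99999"
print(lookup_indexed(12_345))  # prints name-12345
```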


Oracle Database In-Memory gets a number of improvements as well. It now uses parallel lightweight threads to scan its compression units rather than a serial, process-driven scan. This is available for both serial and parallel scans of data and can double the speed at which data is read, improving the already exceptional scan performance of Oracle Database In-Memory. Alongside this feature, Oracle Database In-Memory can now hold Oracle Number types in their native binary representations (int, float, etc.). This allows the data to be processed by the vector units on processors such as Intel’s Xeon CPUs much faster than before; for some aggregation and arithmetic operations this can result in a staggering 40x improvement in performance.

To ease maintenance work for In-Memory, it’s also now possible to have tables and partitions automatically populated into, and aged out of, the column store. The database uses the Heat Map to do this: when the column store is under memory pressure, it evicts inactive segments if more frequently accessed segments would benefit from population.

Finally, In-Memory in Oracle Database 18c also allows you to place data from external tables in the column store, enabling you to execute high performance analytics on data outside of the database.

High Availability

Whether you are using Oracle Real Application Clusters or Oracle Data Guard, we continue to look for ways to improve the Oracle Database’s high-availability functionality. With Oracle Database 18c we’re rolling out a few significant upgrades.

Oracle Real Application Clusters also gets a hybrid sharding model. With this technology you can enjoy all of the benefits of a shared-disk architecture whilst leveraging some of the benefits that sharding offers. The Oracle Database will affinitize table partitions/shards to nodes in the cluster and route connections using the Oracle Database sharding API based on a shard key. The benefit of this approach is that it formalizes a technique application developers often use to improve buffer cache utilization and reduce the number of cross-instance pings. It also removes the punitive cost of cross-shard queries simply by leveraging RAC’s shared-disk architecture.

Sharding also gets some improvements in Oracle Database 18c in the form of “user-defined sharding” and “swim lanes”. Users can now specify how shards are defined using either the system-managed approach, hashing, or an explicit user-defined model of range or list sharding. The latter two approaches let users ensure that data is placed in a location appropriate to its access, whether to reduce the latency between the application and the database or simply to keep data in a specific data center to meet geographic or regulatory requirements. Sharded swim lanes make it possible to route requests through sharded application servers all the way to a sharded Oracle Database; users do this by having their routing layer call a simple REST API. The real benefit of this approach is that it can improve throughput and reduce latency whilst minimizing the number of connections the Oracle Database needs to manage.

For the users of Java in the Database we’re rolling out a welcome fix that will make it possible to perform rolling patching of the database.


Multitenant

Multitenant in Oracle Database 18c got a number of updates that continue to round out the overall architecture. We’re introducing the concept of a Snapshot Carousel, which lets you define regular snapshots of PDBs. You can then use these snapshots as a source for PDB clones from various points in time, rather than simply the most current one. The Snapshot Carousel might be ideal for a development environment, or to augment a non-mission-critical backup and recovery process.

I’m regularly asked if we support Multitenant container to container active/active Data Guard Standbys. This is where some of the primary PDBs in one container have standby PDBs in an opposing container and vice versa. We continue to move in that direction and in Oracle Database 18c we move a step closer with the introduction of “Refreshable PDB Switchover”. This enables users to create a PDB which is an incrementally updated copy of a “master” PDB. Users may then perform a planned switchover between the PDBs inside of the container. When this happens the master PDB becomes the clone and the old clone the master. It’s important to point out that this feature is not using Data Guard; rather it extends the incremental cloning functionality we introduced in Oracle Database 12c Release 2.

In Oracle Database 18c, Multitenant also got some Data Guard improvements. You can now automatically maintain standby databases when you clone a PDB on the primary; this operation ensures that the PDB, including all of its data files, is created on the standby database, significantly simplifying the process of providing disaster recovery for PDBs running inside a container database. We have also made it possible to clone a PDB from an Active Data Guard standby, which dramatically simplifies the work needed to provide copies of production databases for development environments.

Multitenant also got a number of small improvements that are still worth mentioning. We now support the use of backups performed on a PDB prior to it being unplugged and plugged into a new container. You can also expect upgrades to be quicker under Multitenant in Oracle Database 18c.


Security

The Oracle Database is widely regarded as the most secure database in the industry, and we continue to innovate in this space. In Oracle Database 18c we have added a number of small but important updates. A simple change that could have a big impact on the security of some databases is the introduction of schema-only accounts. This functionality allows a schema to own objects while preventing clients from logging in to it, reducing the attack surface of the database.
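A minimal sketch of such an account (the user name is hypothetical):

```sql
-- Schema-only account: owns objects, but no one can log in as it
CREATE USER app_owner NO AUTHENTICATION;
GRANT CREATE TABLE, CREATE PROCEDURE TO app_owner;
```

Because the account has no password or other credential, there is simply nothing to attack, yet it can still own tables, procedures and other objects.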


To improve the isolation of Pluggable Databases (PDBs), we are adding the ability for each PDB to have its own key store rather than one for the entire container. This also simplifies the configuration of non-container databases by introducing explicit parameters, removing the requirement to edit the sqlnet.ora file.


A welcome change for some Microsoft users is the integration of the Oracle Database with Active Directory. Oracle Database 18c allows Active Directory to authenticate and authorize users directly without the need to also use Oracle Internet Directory. In the future we hope to extend this functionality to include other third-party LDAP version 3–compliant directory services. This change significantly reduces the complexity needed to perform this task and as a result improves the overall security and availability of this critical component.

Data Warehousing

Oracle Database 18c’s support for data warehousing got a number of welcome improvements.

Whilst machine learning has gotten a lot of attention in the press and social media recently, it’s important to remind ourselves that the Oracle Database has had a number of these algorithms since Oracle 9i. So, in this release we’ve improved upon our existing capability by implementing some of them directly inside of the database without the need for callouts, as well as adding some more.

One of the compromises that data warehouse users have had to accept in the past was that if they wanted to use a standby database, they couldn’t use no-logging to rapidly load data into their tables. In Oracle Database 18c that no longer has to be the case. Users can choose between two modes that accommodate the loading of non-logged data. The first ensures that standbys receive non-logged data changes with minimal impact on loading speed at the primary, at the cost of allowing the standby to have transient non-logged blocks; these are automatically resolved by managed standby recovery. The second ensures all standbys have the data when the primary load commits, at the cost of throttling the speed of loading data at the primary, which means the standbys never have any non-logged blocks.
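The two modes are selected at the database level; a sketch of the relevant commands:

```sql
-- Favor loading speed at the primary; the standby may have transient
-- non-logged blocks that managed standby recovery resolves automatically
ALTER DATABASE SET STANDBY NOLOGGING FOR LOAD PERFORMANCE;

-- Favor data availability: all standbys have the data at commit time,
-- at the cost of throttled load speed at the primary
ALTER DATABASE SET STANDBY NOLOGGING FOR DATA AVAILABILITY;
```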


One of the most interesting developments in Oracle Database 18c is the introduction of polymorphic table functions. Table functions are a popular feature that lets a developer encapsulate potentially complicated data transformations, aggregations, security rules, etc. inside a function that, when selected from, returns data as if it were coming from a physical table. For very complicated ETL operations these table functions can be pipelined and even executed in parallel. The only downside of this approach was that you had to declare the shape of the returned data, i.e. its columns, as part of the function definition. With polymorphic table functions, the shape of the returned data is determined by the parameters passed to the function. This allows polymorphic table functions to be far more generic in nature, at the cost of a little more code.

One of my personal favorite features of this release is the ability to merge partitions online. This is particularly useful if you partition your data by some unit of time, e.g. minutes, hours, days or weeks, and at some stage, as the data becomes less frequently updated, you aggregate some of the partitions into larger ones to simplify administration. This was possible in previous versions of the database, but the table was inaccessible whilst it took place. In Oracle Database 18c you can merge your partitions online and maintain the indexes as well. This rounds out the list of online table and partition operations introduced in Oracle Database 12c Release 1 and Release 2, e.g. move table online, split partition online, convert table to partition online, etc.
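A sketch of an online merge (the table and partition names are illustrative):

```sql
ALTER TABLE sales
  MERGE PARTITIONS sales_jan, sales_feb INTO PARTITION sales_q1
  UPDATE INDEXES ONLINE;
```

The `UPDATE INDEXES` clause keeps the indexes usable, and `ONLINE` keeps the table available for DML while the merge runs.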


For some classes of queries, getting a relatively accurate approximate answer fast is more useful than getting an exact answer slowly. In Oracle Database 12c we introduced the function APPROX_COUNT_DISTINCT, whose accuracy is typically 97% or better but which can provide the result orders of magnitude faster. We added additional functions in Oracle Database 12c Release 2, and in 18c we provide further approximate aggregation (group-by) operations: APPROX_COUNT(), APPROX_SUM() and APPROX_RANK().
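For example, the 12c function mentioned above can stand in for an exact distinct count (the table and column names are illustrative):

```sql
-- Approximate, typically within a few percent, but much faster
-- than COUNT(DISTINCT customer_id) on a large table
SELECT APPROX_COUNT_DISTINCT(customer_id) AS approx_customers
FROM   sales;
```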

Oracle Spatial and Graph also sees some improvements in this release. We added support for graphs in Oracle Database 12c Release 2, and now in Oracle Database 18c you can use the Property Graph Query Language (PGQL) to simplify querying the data held within them. Performance was also boosted with support for Oracle Database In-Memory and list-hash partitioning.

We also added a little syntactic sugar for external tables. You can now specify the external table definition inline in a statement, so there is no longer any need to create definitions that are used once and then dropped.
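A hedged sketch of what an inline external table looks like (the directory, file, column names and access parameters are illustrative):

```sql
SELECT *
FROM   EXTERNAL (
         (product_id NUMBER, amount NUMBER, sold_date DATE)
         TYPE ORACLE_LOADER
         DEFAULT DIRECTORY load_dir
         ACCESS PARAMETERS (FIELDS TERMINATED BY ',')
         LOCATION ('sales.csv')
       ) sales_ext;
```

The definition lives entirely inside the query, so nothing needs to be created or dropped in the data dictionary.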


Development

As you’d expect, there are a number of Oracle Database 18c improvements for developers, and we are also updating our tools and APIs.

JSON is rapidly becoming the preferred format for application developers to transfer data between the application tiers. In Oracle Database 12c we introduced support that enabled JSON to be persisted to the Oracle Database and queried using dot notation. This gave developers a no compromise platform for JSON persistence with the power and industry leading analytics of the Oracle Database. Developers could also treat the Oracle Database as if it was a NoSQL Database using the Simple Oracle Document Access (SODA) API. This meant that whilst some developers could work using REST or JAVA NoSQL APIs to build applications, others could build out analytical reports using SQL. In Oracle Database 18c we’ve also added a new SODA API for C and PL/SQL and included a number of improvements to functions to return or manipulate JSON in the database via SQL. We’ve also enhanced the support for Oracle Sharding and JSON.


Global Temporary Tables are an excellent way to hold transient data used in reporting or batch jobs within the Oracle Database. However, their shape, determined by their columns, is persisted and visible across all sessions in the database. In Oracle Database 18c we provide a more flexible approach with Private Temporary Tables. These allow users to define a table whose very definition is visible only to a given session, or even just to a single transaction. This provides more flexibility in the way developers write code and can ultimately lead to better code maintenance.
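A sketch of a transaction-scoped private temporary table (note the required ORA$PTT_ name prefix; the rest of the name and columns are illustrative):

```sql
-- Visible only to this session; definition dropped at commit
CREATE PRIVATE TEMPORARY TABLE ora$ptt_staging (
  id  NUMBER,
  txt VARCHAR2(100)
) ON COMMIT DROP DEFINITION;
```

Using `ON COMMIT PRESERVE DEFINITION` instead keeps the table for the life of the session rather than the transaction.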

Oracle Application Express, Oracle SQL Developer, Oracle SQLcl and ORDS have all been tested with 18c and in some instances get small bumps in functionality, such as support for sharding.

We also plan to release a REST API for the Oracle Database. This will ship with ORDS 18.1 a little later this year.

And One Other Thing…

We’re also introducing a new mode for Connection Manager. If you’re not familiar with what Connection Manager (CMAN) does today, I’d recommend taking a look at its documentation. Basically, CMAN acts as a connection concentrator, enabling you to funnel thousands of sessions into a single Oracle Database. With the new mode introduced in Oracle Database 18c, it can do a lot more. It can now automatically route connections to surviving database resources in the event of an outage. It can redirect connections transparently if you relocate a PDB. It can load-balance connections across databases and PDBs whilst also transparently enabling connection performance enhancements such as statement caching and pre-fetching. And it can now significantly improve the security of incoming connections to the database.


All in all, an exciting improvement to a great networking resource for the Oracle Database.

Oracle Database Release 18c New Features Complete List:  


Oracle Database 18.3.0 has been available on Linux since July 23, 2018, and I wanted to take a quick look at the on-premises installation. I blogged about the Oracle 18c installation a few weeks ago, but that was a plain 18.1.0; this time I install the 18.3.0 on-prem edition for Linux.

Download Location:  


Upgrade process:


Oracle Database 18.3.0 installation on premises

Are there any differences between an 18.1.0 and an 18.3.0 installation? No, there aren’t (at least none that I noticed). The most important thing: you must unzip the downloaded file into your future destination directory.

In my case I unzip:

mkdir /u01/app/oracle/product/18
cd /u01/app/oracle/product/18
unzip /media/sf_TEAM/LINUX.X64_180000_db_home.zip
Then call the install script:


This short video demonstrates the installation process:

And finally run root.sh:

su root
passwd: oracle

cd /u01/app/oracle/product/18
./root.sh
That’s it.

One addition: if you wonder about the new environment variables ORACLE_BASE_HOME and ORACLE_BASE_CONFIG, those are used for the Read-Only Oracle Home feature introduced with Oracle 18c. Find more information in MOS Note 2409465.1 (Oracle 18c – Configuring Read Only Oracle Home / DBCA / Patching / Upgrade).

More Information:
If you’d like to try out Oracle Database 18c, you can do so with LiveSQL.



For more information on when Oracle Database 18c will be available on other platforms, please refer to Oracle Support Document 742060.1.

Installing Oracle Database 18c (18.1.0)
Upgrading to Oracle Database 18.3.0 on-prem (will be available on July 26, 2018)
Why does the Oracle 18.3.0 on premises include 1.4GB patches? (will be available on July 27, 2018)

20 July 2018

Capsule Neural Networks (CapsNets): A Better Alternative to Convolutional Neural Networks (CNNs)


Geoffrey Hinton and his team published two papers that introduced a completely new type of neural network based on so-called capsules. In addition, the team published an algorithm, called dynamic routing between capsules, that makes it possible to train such a network.


For everyone in the deep learning community this is huge news, for several reasons. First of all, Hinton is one of the founders of deep learning and an inventor of numerous models and algorithms that are widely used today. Secondly, these papers introduce something completely new, which is very exciting because it will most likely stimulate an additional wave of research and some very cool applications.


What is a CapsNet or Capsule Network?

What is a Capsule Network? What is a Capsule? Is a CapsNet better than a Convolutional Neural Network (CNN)? In this article I will address all of these questions about CapsNets, or Capsule Networks, introduced by Hinton.
Note: This article is not about pharmaceutical capsules. It is about capsules in the neural network and machine learning world.
There is one expectation of you as a reader: you need to be aware of CNNs. If you are not, I recommend this article on Hackernoon. Next I will run through a small recap of the relevant points of CNNs, so that you can easily follow the comparison below. So, without further ado, let’s dive in.

CNNs are essentially systems that stack a lot of neurons together. These networks have proven exceptionally good at image classification problems. Having a neural network map every pixel of an image directly would be computationally very expensive, so convolution is a method that simplifies the computation to a great extent without losing the essence of the data. A convolution is basically a lot of element-wise matrix multiplication followed by the summation of those results.
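To make that concrete, here is a minimal NumPy sketch of a single-channel convolution; the filter and image are made up for illustration:

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image; each output value is the
    element-wise product of the kernel and the patch, summed."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

edge = np.array([[1.0, 0.0, -1.0]] * 3)    # a simple vertical-edge filter
img = np.zeros((5, 5))
img[:, 2:] = 1.0                           # dark left half, bright right half
print(conv2d(img, edge))                   # strong response at the edge
```

The strongly negative values in the output mark where the dark-to-bright transition sits, which is exactly the "feature map" idea described above.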


After an image is fed to the network, a set of kernels (filters) scans it and performs the convolution operation, creating feature maps inside the network. These features then pass through activation layers and pooling layers in succession, repeating according to the number of layers in the network. Activation layers induce non-linearity in the network (e.g. ReLU), while pooling (e.g. max pooling) helps reduce training time: the idea of pooling is to create a “summary” of each sub-region. It also gives you a little positional and translational invariance in object detection. At the end, the network passes the result through a classifier, such as a softmax classifier, which outputs a class. Training happens via back-propagation of the error measured against labelled data, and the non-linearity also helps avoid vanishing gradients in this step.

What is the problem with CNNs?

CNNs perform exceptionally well when classifying images that are very close to the training set. If the images are rotated, tilted or otherwise differently oriented, however, CNNs perform poorly. This problem has typically been addressed by adding different variations of the same image during training. In a CNN, each layer understands the image at a more granular level. Let’s understand this with an example of classifying ships and horses. The innermost, first layer understands small curves and edges. The second layer might understand straight lines or smaller shapes, like the mast of a ship or the curvature of a tail. Higher layers start understanding more complex shapes, like an entire tail or a ship’s hull, and the final layers try to see a holistic picture, like the entire ship or the entire horse. We use pooling after each layer to keep computation within reasonable time frames, but in essence pooling also loses the positional data.

Pooling helps create positional invariance; without it, CNNs would fit only images or data very close to the training set. But this invariance also leads to false positives for images that contain the components of a ship in the wrong order: the system can match a jumbled arrangement of parts to a correct ship, even though a human observer clearly sees the difference.


This was never the intention of the pooling layer. Pooling was supposed to introduce positional, orientational and proportional invariance, but the method we use to get it is very crude: in reality it adds all sorts of positional invariance, leading to the dilemma of detecting a jumbled ship as a correct ship. What we needed was not invariance but equivariance. Invariance makes a CNN tolerant of small changes in viewpoint; equivariance makes a network understand a rotation or proportion change and adapt itself accordingly, so that the spatial positioning inside an image is not lost. A ship will still be a smaller ship, but the network will scale its internal representation to detect it. This leads us to the recent advancement of Capsule Networks.

Hinton himself stated that the fact that max pooling is working so well is a big mistake and a disaster:

Hinton: “The pooling operation used in convolutional neural networks is a big mistake and the fact that it works so well is a disaster.”

Of course, you can do away with max pooling and still get good results with traditional CNNs, but they still do not solve the key problem:

Internal data representation of a convolutional neural network does not take into account important spatial hierarchies between simple and complex objects.

In the example of a dog, the mere presence of two eyes, a mouth and a nose in a picture does not mean there is a face; we also need to know how these parts are oriented relative to each other.

What is a Capsule Network?

Every few days there is an advancement in the field of neural networks, with some brilliant minds working on it; you can pretty much assume every paper on this topic is ground-breaking or path-changing. Sara Sabour, Nicholas Frosst and Geoffrey Hinton recently released a paper titled “Dynamic Routing Between Capsules”, and when one of the godfathers of deep learning releases a paper, it is bound to be ground-breaking; the entire deep learning community is buzzing about it as you read this article. The paper covers capsules, CapsNet and a run on MNIST, a database of tagged handwritten-digit images, with results showing a significant increase in performance over the current state-of-the-art CNNs in the case of overlapping digits. In this paper the authors propose that the human brain has modules called “capsules” that are particularly good at handling different types of visual stimulus and at encoding properties like pose (position, size, orientation), deformation, velocity, albedo, hue and texture. The brain must have a mechanism for “routing” low-level visual information to the capsule it believes is best suited to handle it.


A capsule is a nested set of neural layers. In a regular neural network you keep adding more layers; in a CapsNet you add layers inside a single layer, or in other words nest one neural layer inside another. The state of the neurons inside a capsule captures the properties of one entity inside the image: a capsule outputs a vector whose length represents the existence of the entity and whose orientation represents the entity’s properties. The vector is sent to all possible parent capsules in the network. For each possible parent, a capsule computes a prediction vector by multiplying its own output by a weight matrix. Whichever parent has the largest scalar product of prediction vector and output strengthens its bond with the capsule, while the remaining parents weaken theirs. This routing-by-agreement method is superior to current mechanisms like max pooling, which routes based solely on the strongest feature detected in the lower layer. Apart from dynamic routing, the CapsNet paper introduces squashing: instead of applying a non-linearity to every neuron in a layer as in a CNN, you apply the squashing function to the vector output of each capsule.


The paper introduces a new squashing function. ReLU and similar non-linearities work well for individual neurons, but the paper found that this squashing function works best for capsules. It squashes the length of a capsule’s output vector: toward 0 if the vector is short, and toward (but never beyond) 1 if the vector is long. Dynamic routing adds some extra computational cost, but it delivers a definite advantage.
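The squashing function from the paper, v = (‖s‖² / (1 + ‖s‖²)) · (s / ‖s‖), is easy to sketch in NumPy; the input vectors below are made up for illustration:

```python
import numpy as np

def squash(s, eps=1e-9):
    """Capsule squashing: shrink short vectors toward length 0,
    push long vectors toward (but never past) length 1."""
    norm2 = np.sum(s * s, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

short = squash(np.array([0.01, 0.0]))   # stays near length 0
long_ = squash(np.array([100.0, 0.0]))  # length approaches 1
print(np.linalg.norm(short), np.linalg.norm(long_))
```

The output length can then be read directly as the probability that the capsule’s entity is present, which is exactly how the paper uses it.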

Now we need to remember that this paper is brand new and the concept of capsules has not been thoroughly tested. It works on MNIST data, but it still needs to be proven against much larger datasets across a variety of classes. There are already follow-ups to this paper that raise the following concerns:
1. It uses the length of the pose vector to represent the probability that the entity represented by a capsule is present. To keep the length less than 1 requires an unprincipled non-linearity that prevents there from being any sensible objective function that is minimized by the iterative routing procedure.
2. It uses the cosine of the angle between two pose vectors to measure their agreement for routing. Unlike the log variance of a Gaussian cluster, the cosine is not good at distinguishing between quite good agreement and very good agreement.
3. It uses a vector of length n rather than a matrix with n elements to represent a pose, so its transformation matrices have n² parameters rather than just n.

The current implementation of capsules has scope for improvement. But we should also keep in mind that the Hinton paper in the first place only says:

The aim of this paper is not to explore this whole space but to simply show that one fairly straightforward implementation works well and that dynamic routing helps.
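Purely as an illustration (a simplified sketch, not the authors’ reference implementation), the routing-by-agreement loop described above can be written in a few lines of NumPy; the capsule counts and dimensions are made up:

```python
import numpy as np

def squash(s, eps=1e-9):
    # Shrink short vectors toward 0, push long vectors toward length 1
    norm2 = np.sum(s * s, axis=-1, keepdims=True)
    return (norm2 / (1.0 + norm2)) * s / np.sqrt(norm2 + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route(u_hat, iterations=3):
    """u_hat[i, j]: prediction vector from lower capsule i for upper capsule j."""
    n_lower, n_upper, dim = u_hat.shape
    b = np.zeros((n_lower, n_upper))                # routing logits
    for _ in range(iterations):
        c = softmax(b, axis=1)                      # coupling coefficients
        s = (c[:, :, None] * u_hat).sum(axis=0)     # weighted sum per parent
        v = squash(s)                               # parent capsule outputs
        b = b + (u_hat * v[None, :, :]).sum(axis=-1)  # reward agreement
    return v

rng = np.random.default_rng(0)
v = route(rng.normal(size=(6, 3, 4)))  # 6 lower capsules, 3 upper, dim 4
print(v.shape)
```

Each iteration increases the coupling between a lower capsule and whichever parent its prediction agrees with most, which is the "routing by agreement" idea in miniature.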

Capsule Neural Networks: The Next Neural Networks? CNNs and Their Problems
Convolutional (‘regular’) neural networks are the latest hype in machine learning, but they have their flaws. Capsule neural networks are a recent development from Hinton that helps us solve some of these issues.

Neural networks may be the hottest field in machine learning. In recent years there have been many new developments improving neural networks and making them more accessible. However, these were mostly incremental, such as adding more layers or slightly improving the activation function, and did not introduce a new type of architecture.

Geoffrey Hinton is one of the founding fathers of many highly utilized deep learning algorithms, including many developments to neural networks; no wonder, given his background in neuroscience and artificial intelligence.


In late October 2017, Geoffrey Hinton, Sara Sabour and Nicholas Frosst published a research paper under Google Brain named “Dynamic Routing Between Capsules”, introducing a true innovation to neural networks. This is exciting: such a development has long been awaited, will likely spur much more research and progress, and is expected to make neural networks even better than they are now.


The Baseline: Convolutional Neural Networks

Convolutional Neural Networks (CNNs) are extremely flexible machine learning models that were originally inspired by principles from how our brains are theorized to work.
Neural networks utilize layers of “neurons” to process raw data into patterns and objects.
The primary building block of a CNN is the “convolutional” layer (hence the name). What does it do? It takes raw information from a previous layer, makes sense of patterns in it, and sends it onward to the next layer to make sense of a larger picture.

 If you are new to neural networks and want to understand them, I recommend:

  • Watching the animated videos by 3Blue1Brown.
  • For a more detailed textual/visual guide, you can check out this beginner’s blogpost
  • If you can deal with some more math and greater detail, you can instead read this guide from CS231n at Stanford. 

 In case you didn’t do any of the above, and plan to continue, here is a hand-wavy brief overview.

The Intuition Behind Convolutional Neural Networks

Let’s start from the beginning.
 The neural net receives raw input data. Let’s say it’s a doodle of a dog. When you see a dog, your brain automatically detects it as a dog. But to the computer, the image is really just an array of numbers representing the color intensity in each color channel. Let’s say it’s just a black-and-white doodle, so we can represent it with one array where each cell represents the brightness of a pixel from black to white.


Convolutional layers. The first convolutional layer maps the image space to a lower-dimensional space, summarizing what’s happening in each group of, say, 5x5 pixels: is it a vertical line? A horizontal line? A curve of what shape? This happens via element-wise multiplication of a small filter with each patch of the original image, summing the products to a single number.

This leads to the neuron, or convolutional filter. Each filter/neuron is designed to react to one specific form (a vertical line? a horizontal line? etc.). The groups of pixels from layer 1 reach these neurons and light up the ones that match their structure, according to how similar each patch is to what the neuron looks for.

Activation (usually “ReLU”) layers. After each convolutional layer, we apply a nonlinear activation layer, which introduces non-linearity to the system and enables it to discover nonlinear relations in the data as well. ReLU is a very simple one: it maps any negative input to 0 and keeps any positive input unchanged: ReLU(x) = max(0, x).
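In code this is a one-liner; the input values are made up for illustration:

```python
import numpy as np

def relu(x):
    # Map negatives to 0, keep positives unchanged: ReLU(x) = max(0, x)
    return np.maximum(0, x)

print(relu(np.array([-2.0, -0.5, 0.0, 1.5])))  # negatives become 0
```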

Pooling layers. Pooling reduces “unnecessary” information, summarizes what we know about a region, and lets the network continue refining information. For example, with “max pooling” the computer just keeps the highest value in each patch, so it knows “around these 5x5 pixels, the most dominant value is 255. I don’t know exactly in which pixel, but the exact location isn’t as important as the fact that it’s around there.” Notice: this is lossy; we lose information here. Capsule networks don’t have this operation, which is an improvement.
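A minimal NumPy sketch of 2x2 max pooling; the feature map is made up for illustration:

```python
import numpy as np

def max_pool(x, size=2):
    """Keep only the maximum of each size x size patch (stride = size)."""
    h, w = x.shape
    x = x[:h - h % size, :w - w % size]          # trim to a multiple of size
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

fmap = np.array([[1.0, 3.0, 2.0, 0.0],
                 [4.0, 2.0, 1.0, 1.0],
                 [0.0, 0.0, 5.0, 6.0],
                 [1.0, 2.0, 7.0, 8.0]])
print(max_pool(fmap))
```

Each 2x2 block collapses to its single largest value: the summary survives, but the exact position inside the block is discarded, which is precisely the information loss discussed above.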

Dropout layers. This layer “drops out” a random set of activations by setting them to zero. This makes the network more robust (much as exposure to dirt builds up your immune system, the network becomes more immune to small changes) and reduces overfitting. Dropout is only used while training the network.

Last Fully Connected Layer. For a classification problem, we want each final neuron to represent one class. It looks at the output of the previous layer (which, as we remember, should represent the activation maps of high-level features) and determines which features correlate most with a particular class.

SoftMax. This layer is sometimes added as another way to represent the outputs per class, which we can later pass into a loss function. Softmax turns the raw class scores into a probability distribution over the categories.

Usually, there are more layers which provide nonlinearities and preservation of dimensions (like padding with 0’s around the edges) that help to improve the robustness of the network and control overfitting. But these are the basics you need to understand what comes after.
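The softmax mentioned above can be sketched as:

```python
import numpy as np

def softmax(scores):
    """Turn raw class scores into a probability distribution."""
    shifted = scores - scores.max()  # subtract max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
# probs sums to 1; the largest score gets the largest probability.
```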

Capsule Network


 Now, importantly, these layers are connected only SEQUENTIALLY. This is in contrast to the structure of capsule networks.

What is The Problem With Convolutional Neural Networks?

If this interests you, watch Hinton's lecture explaining exactly what is wrong with them. Below you'll find a couple of key points that Capsule Networks improve on.

Hinton says that they have too few levels of substructure (nets are composed of layers, which are composed of neurons, and that's it), and that we need to group the neurons in each layer into “capsules”, like mini-columns, that do a lot of internal computation and then output a summary result.

Problems with CNNs and Introduction to capsule neural networks

Problem #1: Pooling loses information

CNNs use “pooling” or equivalent methods to “summarize” what's going on in smaller regions and make sense of larger and larger chunks of the image. This was a solution that made CNNs work well, but it loses valuable information.

 Capsule networks instead compute a pose (translational and rotational) relationship between smaller features to make up a larger feature.
 In CNNs, this loss of information leads to loss of spatial information.

Problem #2: CNNs don't account for the spatial relations between the parts of the image. Therefore, they also are too sensitive to orientation.

Subsampling (and pooling) loses the precise spatial relationships between higher-level parts like a nose and a mouth. The precise spatial relationships are needed for identity recognition.

(Hinton, 2012, in his lecture).

Geoffrey Hinton Capsule theory

 CNNs don't account for spatial relationships between the underlying objects. Their flat layers of neurons light up according to which objects they've seen, so they recognize the presence of such objects. But those activations are then passed through activation and pooling layers and on to the next layer of neurons (filters), without recognizing what the relations are between the objects identified in that single layer.
 They just account for their presence.

Hinton: Dynamic Routing Between Capsules

So a (simplistic) neural network will not hesitate to categorize both these dogs, Pablo and Picasso, as similarly good representations of a “corgi-pit-bull-terrier mix”.

Capsule Networks (CapsNets) – Tutorial

Problem #3: CNNs can't transfer their understanding of geometric relationships to new viewpoints.

This makes them overly dependent on the original image itself when classifying images as the same category.

 CNNs are great for solving problems with data similar to what they have been trained on. They can classify images, or objects within them, that are very close to things they have seen before.

 But if the object is slightly rotated, photographed from a slightly different angle (especially in 3D), tilted, or in another orientation than what the CNN has seen, the network won't recognize it well.

 One solution is to artificially create tilted representations of the images and add them to the “training” set. However, this still lacks a fundamentally more robust structure.

What is a Capsule?

Capsule Networks Explained in detail ! (Deep learning)

In order to answer this question, I think it is a good idea to refer to the first paper where capsules were introduced: “Transforming Autoencoders” by Hinton et al. The part that is important for understanding capsules is quoted below:

“Instead of aiming for viewpoint invariance in the activities of “neurons” that use a single scalar output to summarize the activities of a local pool of replicated feature detectors, artificial neural networks should use local “capsules” that perform some quite complicated internal computations on their inputs and then encapsulate the results of these computations into a small vector of highly informative outputs. Each capsule learns to recognize an implicitly defined visual entity over a limited domain of viewing conditions and deformations and it outputs both the probability that the entity is present within its limited domain and a set of “instantiation parameters” that may include the precise pose, lighting and deformation of the visual entity relative to an implicitly defined canonical version of that entity. When the capsule is working properly, the probability of the visual entity being present is locally invariant — it does not change as the entity moves over the manifold of possible appearances within the limited domain covered by the capsule. The instantiation parameters, however, are “equivariant” — as the viewing conditions change and the entity moves over the appearance manifold, the instantiation parameters change by a corresponding amount because they are representing the intrinsic coordinates of the entity on the appearance manifold.”

The paragraph above is very dense, and it took me a while to figure out what it means, sentence by sentence. Below is my version of the above paragraph, as I understand it:

Artificial neurons output a single scalar. In addition, CNNs use convolutional layers that, for each kernel, replicate that same kernel’s weights across the entire input volume and then output a 2D matrix, where each number is the output of that kernel’s convolution with a portion of the input volume. So we can look at that 2D matrix as the output of a replicated feature detector. Then all the kernels’ 2D matrices are stacked on top of each other to produce the output of a convolutional layer.

Then, we try to achieve viewpoint invariance in the activities of the neurons. We do this by means of max pooling, which consecutively looks at regions in the 2D matrix described above and selects the largest number in each region. As a result, we get what we wanted: invariance of activities. Invariance means that when the input changes a little, the output still stays the same; and activity is just the output signal of a neuron. In other words, when we shift the object we want to detect a little bit in the input image, the network’s activities (the outputs of its neurons) will not change, because of max pooling, and the network will still detect the object.

Dynamic routing between capsules

The mechanism described above is not very good, because max pooling loses valuable information and also does not encode the relative spatial relationships between features. We should use capsules instead, because they encapsulate all the important information about the state of the features they are detecting in the form of a vector (as opposed to the scalar that a neuron outputs).

[PR12] Capsule Networks - Jaejun Yoo

Capsules encapsulate all important information about the state of the feature they are detecting in vector form.

Capsules encode the probability of detection of a feature as the length of their output vector, and the state of the detected feature as the direction in which that vector points (the “instantiation parameters”). So when a detected feature moves around the image or its state somehow changes, the probability stays the same (the length of the vector does not change), but the vector’s orientation changes.
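The “squashing” nonlinearity from Hinton's Dynamic Routing Between Capsules paper makes this concrete: it rescales a capsule's raw output vector so that its length falls in (0, 1) and can be read as a probability, while its direction (the instantiation parameters) is preserved:

```python
import numpy as np

def squash(s, eps=1e-8):
    """Squash a capsule's raw output vector s:
    length is mapped into (0, 1) (a detection probability),
    direction is kept (the instantiation parameters)."""
    sq_norm = np.sum(s ** 2)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / (np.sqrt(sq_norm) + eps)

raw = np.array([3.0, 4.0])          # raw capsule output, length 5
v = squash(raw)                     # length ~0.96: confident detection
rotated = squash(np.array([4.0, 3.0]))
# Same length (same confidence), different direction (different pose):
assert np.isclose(np.linalg.norm(v), np.linalg.norm(rotated))
```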

Pt.1 – Capsules and routing techniques (part 1/2)

Imagine that a capsule detects a face in the image and outputs a 3D vector of length 0.99. Then we start moving the face across the image. The vector will rotate in its space, representing the changing state of the detected face, but its length will remain fixed, because the capsule is still sure it has detected a face. This is what Hinton refers to as activities equivariance: neuronal activities will change when an object “moves over the manifold of possible appearances” in the picture. At the same time, the probabilities of detection remain constant, which is the form of invariance that we should aim at, and not the type offered by CNNs with max pooling.

PR-012: Faster R-CNN : Towards Real-Time Object Detection with Region Proposal Networks

More Information:

“Understanding Dynamic Routing between Capsules (Capsule Networks)”

Understanding Hinton’s Capsule Networks. Part I: Intuition.   https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-i-intuition-b4b559d1159b

Understanding Capsule Networks — AI’s Alluring New Architecture.  https://medium.freecodecamp.org/understanding-capsule-networks-ais-alluring-new-architecture-bdb228173ddc

What is a CapsNet or Capsule Network?   https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc



A “weird” introduction to Deep Learning.  https://towardsdatascience.com/a-weird-introduction-to-deep-learning-7828803693b0

Faster R-CNN Explained   https://medium.com/@smallfishbigsea/faster-r-cnn-explained-864d4fb7e3f8

A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN   https://blog.athelas.com/a-brief-history-of-cnns-in-image-segmentation-from-r-cnn-to-mask-r-cnn-34ea83205de4

Convolutional Neural Network (CNN)   https://skymind.ai/wiki/convolutional-network

Capsule Neural Networks: The Next Neural Networks? Part 1: CNNs and their problems  https://towardsdatascience.com/capsule-neural-networks-are-here-to-finally-recognize-spatial-relationships-693b7c99b12

Understanding Hinton’s Capsule Networks. Part II: How Capsules Work    https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-ii-how-capsules-work-153b6ade9f66

Understanding Hinton’s Capsule Networks. Part III: Dynamic Routing Between Capsules   https://medium.com/ai³-theory-practice-business/understanding-hintons-capsule-networks-part-iii-dynamic-routing-between-capsules-349f6d30418

Understanding Hinton’s Capsule Networks. Part IV: CapsNet Architecture   https://medium.com/@pechyonkin/part-iv-capsnet-architecture-6a64422f7dce