Tuning up open source for telecoms

Senior execs from Cisco, Ericsson, Intel, Mavenir and Metaswitch (now acquired by Microsoft) cleared a lot of fog discussing developments and obstacles in operators’ move from virtual to cloud-native at a roundtable sponsored by Red Hat.

The group started by discussing open source models for service providers. As Timo Jokiaho, Red Hat’s global lead for Telco Partner Technology Development, noted, there are many open source development models, as well as different approaches to consuming open source and contributing to it.

In-house versus value of a vendor

William Crowe, Partner Solution Architect – Telco Sales EMEA at Intel, commented, “It largely comes down to the kind of capability and expertise the end customer has with open source.

“If a service provider has several hundred internal software engineering resources and they are comfortable to consume [the work done in the] upstream community, they have that choice, but few have that capability.

“Everybody else is more likely to benefit from taking a supported, integrated product from a vendor. The value a vendor offers is bringing leading-edge innovation to market quickly, and giving customers an easier migration path to new platform versions through automated tooling and so on.”

Giles Heron, Principal Engineer, Cisco, added, “The carriers need to benefit from what vendors create – so it’s more about everyone being motivated to get carriers involved since we need their input.”

He continued, “The wider dilemma for some who are worried about contributing to open source is the possibility that each individual or organisation that contributes won’t necessarily see a proportional return on their effort, but if everyone contributes, then everyone will get back more than they individually put in.”

Several people expressed concerns that companies can leverage competitors’ investment and expertise when it is shared in the community. However, as Jokiaho pointed out, “Enabling our competitors is exactly what Red Hat does every day. What we contribute and develop in the open source community – OpenStack for example – is available to every one of our competitors.”

Further, in the interests of protecting the integrity and ongoing compatibility of its portfolio, and therefore its customers’ investments, Red Hat’s policy is to never deviate from the work done in various open source communities. This means that its portfolio always has the resources and strength of the entire community behind it.

That is not the case when a vendor or operator diverges from the upstream community’s work, because that effectively turns open source into a proprietary solution, with all that implies, including vendor lock-in, and potential integration and upgrade issues.

Virtyt Koshi, Ph.D., Senior Vice President and General Manager EMEA at Mavenir, agreed, saying, “It can be a trap because customers lose innovation if they are not looking to open source for new ideas. You become a prisoner of one supplier – you get people who build a comfortable career on tuning existing solutions, but innovation is lacking.

“In my experience, pick any vendor you can think of and, if it wasn’t for open source, the total cost of ownership would be much higher.

“Another problem is that, in procurement, everyone is incentivised to commoditise everything. Then we talk about innovation – such as we’re now looking at with applications at the edge – but if you have only a compliant box instead of innovative software, it is very difficult to achieve gains.”

Community and commercial

The group agreed that many operators are in the early stages of understanding the difference between upstream and commercial open source. The roundtable attendees agreed that a good analogy is water. Rain is free, but collecting it, ensuring it is safe to drink, distributing it and recycling waste water costs money.

In the same way, the amount of engineering that goes into open source to enable innovation is not well appreciated by service providers, nor is the collaborative upstream development process, which has to be completed before developments are sent downstream to be leveraged for commercial use.

Crowe said, “They assume stuff just happens and it’s there for free and you can download it, but… we invest a huge amount of effort. All of our innovation in silicon is enabled first in open source, because we can get things done upstream and exposed, and then get people using it, feeding back and contributing to it.

“What we would ideally like to see is more telcos getting involved in upstream development and engineering – that is, operators as developers – rather than in their more traditional role of defining requirements.”

He continued, “Any innovation we create on a silicon platform is typically made available in open source first. It may be 12 to 18 months before it becomes available with a proprietary software stack.”

As an example, he highlighted that Intel contributed some critical features to Kubernetes that are essential for networking workloads – for example, Multus, the multi-network container network interface (CNI) plugin. They were upstreamed for development and have now been downstreamed into commercially supported distributions such as Red Hat OpenShift Container Platform, the company’s enterprise Kubernetes platform.

Crowe concluded, “The process for getting new technology upstreamed into community code and then downstreamed into commercially supported software is critical to understand.

“We need to make sure that the engineering effort into upstream contribution is enabling the right features that customers care about and that it leads to a material benefit to their operational model.”

Security at the forefront

Security is another critical advantage of working with an ecosystem of trusted technology partners. The power of such an ecosystem plays a big role in exceptional cases, such as the massive, so-called ‘side-channel attacks’, known as Spectre and Meltdown, that created worldwide headlines in 2018.

They overturned decades of received wisdom about how to design secure silicon, affecting chip makers ARM and Intel, among others. Red Hat’s Product Security team, which constantly monitors possible vulnerabilities, worked alongside other industry leaders to make mitigations available to customers as soon as possible after the issues were disclosed.

Crowe believes that the biggest security vulnerabilities service providers face today are an inevitable part of the shift to network functions virtualisation (NFV) and open architecture. He said, “Security is an area that needs huge education. There is technology capability in platforms that service providers simply don’t use. It goes back to a point raised earlier – that procurement folks don’t value this stuff and they strip it out of requirements.

“Operators need to have the right processes in place so that when a patch becomes available, they can operationalise it. Customers need to understand how to design security into business processes.”

The industry only wins when it acts together, which brought us to our next topic for discussion: how operators and vendors will succeed or fail together in the quest to deploy cloud-native infrastructure.

Stand or fall together

Gary Ashworth, Senior Director of Network Ecosystem Partner Sales at Red Hat, noted, “In the community, we’re seeing a lot more push from service providers who want a horizontal platform that can handle different workloads and different virtual network functions (VNFs) or containers to leverage their value.

“Some of the bigger ones, like AT&T, are driving their own ecosystem. Every day, we see the ecosystem becoming increasingly relevant.”

The most striking and radical instance of an ecosystem in action is Japan’s Rakuten Mobile network. A subsidiary of the Japanese-owned e-commerce giant of the same name, it has built the world’s first end-to-end cloud-native network from scratch.

This [was] scheduled to go live this spring and the revolutionary infrastructure has the same horizontal, open source platform across the network, from the disaggregated base stations to the core.

Red Hat, alongside an ecosystem of more than 50 vendor partners, brought this to reality, and Rakuten has already said it intends to take the technology and business model global – and gave details of plans to do this in early June.

Extending beyond the edge

A couple of participants believe, as was the case with OpenStack, they may be able to advance beyond their original boundaries and move open source contributions for telecoms use cases into the wider community. For example, Red Hat and Intel, among others, plan to invest heavily in developing the networking aspects of Kubernetes within the community.

Jokiaho said, “We are following the same path for Kubernetes as we did for OpenStack five years ago. OpenStack was not designed for NFV or even for telecommunications. Still, today, service providers rely on OpenStack for both NFV and edge computing following significant feature developments over the past few years.

“Kubernetes now has many of the same features as OpenStack, and has become critical for NFV use cases and 5G core specifically, which is where containers and microservices come in. Edge and Kubernetes will be a perfect match, I believe.”

A special case?

Most attendees agreed that telecoms are something of a special case – with needs unlike those of general enterprise networks. For example, they need to offer 99.999% network availability and maintain huge, legacy systems (such as highly integrated, monolithic OSS and BSS) which are expensive to run and stifle innovation.
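To put the “five nines” figure in perspective, the short calculation below converts an availability target into a yearly downtime budget. It is an illustrative sketch only: the function name is hypothetical, and it assumes a 365-day year and availability measured over the whole year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_minutes(target):.2f} minutes per year")
```

At 99.999%, the budget is a little over five minutes of downtime a year, against almost nine hours at 99.9%.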

Martin Taylor, Chief Technical Officer at Metaswitch Networks, said, “Telcos need a critical mass of technical capability, but over the last 20 years the technology capabilities within telcos have been totally hollowed out. The investment has mainly been in procurement. They know with NFV, for example, they have to staff up (but the right talent is hard for them to attract) or rely on their normal incumbent vendors.”

The larger Tier 1 carriers have a greater critical mass – sufficient resources to at least work on deploying horizontal platforms. Smaller operators are generally obliged to rely on vertical, integrated systems.

Crowe feels that the vendors are moving to make horizontal platform features more accessible: “I think the traditional Tier 1 network equipment vendors are now innovating in software development.

“For example, new use categories such as 5G core appear [in] many of the solutions [from] vendors with containerised offerings, following cloud-native principles. When NFV started, there was always going to be a lag, as software solutions that were already in existence had to be amortised and written down before they could start a new development phase, but there is strong evidence that it is happening across the vendor community.

“This raises some additional challenges, not least of which is that we don’t have an industry standardised agreement on what ‘cloud-native’ software architecture means for the telco cloud. This is where we need industry collaboration and education.”

Do it once, replicate everywhere

This lack of a common definition or understanding has big implications for the debate topic: the desirability and possibility of developing something once, then being able to replicate and deploy it everywhere. The idea is to make changes at scale and fast.

Richard Miedzinski, Telco Partner Account Director at Red Hat, outlined the challenge: “Most network services require a lot of testing and pre-production, so it can be months before you are in the field.” He feels that community members, such as Red Hat, need to be better aligned with field operations, where the community’s work is operationalised by operators.

Taylor said all the testing is, “partly because a new release of software on a traditional network function has hundreds of changes in it and any one could disrupt something. If you’re in a microservice environment where the piece of software you’re changing is individual and small, and you haven’t changed the API [application programming interface], the chances of a disruption are much lower.

“So instead of having this massive amount of testing in one go, you still have to test it before putting it into production, but you can mitigate by doing ‘canary testing’ and that sort of thing.”
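The ‘canary testing’ Taylor mentions can be sketched roughly as follows: a small, fixed share of traffic goes to the new release, and the release is promoted only if its error rate stays within a budget. This is a minimal illustration and not any vendor’s implementation; the function names, the hash-based bucketing and the thresholds are all assumptions made for the example.

```python
import hashlib


def pick_version(subscriber_id: str, canary_percent: int) -> str:
    """Deterministically send a fixed share of subscribers to the canary release."""
    digest = hashlib.sha256(subscriber_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


def should_promote(canary_errors: int, canary_requests: int, max_error_rate: float) -> bool:
    """Promote the canary only if it has taken traffic and stayed within the error budget."""
    if canary_requests == 0:
        return False  # no evidence yet, so keep the stable release
    return canary_errors / canary_requests <= max_error_rate


# Hypothetical usage: 5% of subscribers see the new microservice version.
print(pick_version("subscriber-0001", canary_percent=5))
print(should_promote(canary_errors=2, canary_requests=1000, max_error_rate=0.01))
```

Hashing the subscriber identifier keeps each subscriber on the same version between requests, which makes any regression easier to trace back to the change.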

Pods and interfaces

The consensus around the table was that the industry will come to a decision to meet the need for more than one interface for pods, which was recognised long ago. At MWC 2018, Telenor and Metaswitch published a white paper to demonstrate that containers are the only viable route to building and scaling 5G networks, based on a proof of concept (PoC) and a live trial that delivered voice-over-LTE (VoLTE) at a remote location in Norway.

In a commentary on the conference, Container Journal concluded, “Containers as a technology may have caught many network service providers off-guard. But the days when network services took weeks to months to provision are now officially over.”

The Telenor PoC integrated technology from multiple vendors, including Metaswitch’s containerised virtual IMS deployed on Red Hat OpenShift Container Platform, which performs container orchestration and has tools to automate the building and delivery of applications over public, private and hybrid cloud infrastructures.

The participants used Multus, a multi-network container network interface (CNI) plugin that provides multiple interfaces into a container pod and was originally proposed by Intel.
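In practice, a pod asks Multus for additional interfaces through the `k8s.v1.cni.cncf.io/networks` annotation, which lists the network attachments to add alongside the default pod network. The sketch below builds such a manifest; it is not the configuration used in the Telenor PoC, and the pod name, image and network names are hypothetical placeholders that would have to match network attachment definitions already created in the cluster.

```python
import json
from typing import List


def pod_with_extra_networks(name: str, image: str, networks: List[str]) -> dict:
    """Build a pod manifest that requests additional Multus-attached interfaces."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": name,
            # Multus reads this annotation: a comma-separated list of attachment names.
            "annotations": {"k8s.v1.cni.cncf.io/networks": ",".join(networks)},
        },
        "spec": {"containers": [{"name": name, "image": image}]},
    }


# Hypothetical example: separate signalling and media interfaces for one pod.
manifest = pod_with_extra_networks(
    name="vims-example",
    image="registry.example.com/vims:latest",
    networks=["signalling-net", "media-net"],
)
print(json.dumps(manifest, indent=2))
```

Separating signalling and media onto their own interfaces is one reason telecoms workloads need more than the single interface a pod gets by default.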

Jokiaho noted, “That PoC is a good example of how feature development in open source communities – Kubernetes in this case – works. We were looking at interfaces and we realised Multus was gaining the most momentum.

“That drives features to accelerate the data plane, which is what happened in OpenStack earlier and now is happening in Kubernetes too.”

CNTT’s mission

Still, progress on interfaces has not been as fast as some would like. The Common NFVI Telco Taskforce (CNTT) was set up in June 2019 to create a standard set of infrastructure profiles to streamline the many versions of NFV infrastructure (NFVi), so vendors won’t have to generate a different version of their VNFs for each one.

There is some confusion about the relationship between the Cloud Native Computing Foundation Telecom User Group (CNCF TUG) and CNTT, and about each group’s purpose. Taylor explained that CNCF specifically hosts Kubernetes’ development – in the same way that the Linux Foundation oversees development of the open source kernel. CNTT, at least initially, is focused on OpenStack.

Although CNCF TUG appears to have limited traction, perhaps surprisingly, CNTT statistics [in late 2019 showed] there is the equivalent of 280 full-time people contributing to it, according to Taylor. For some attendees, this raised the question of whether so much effort is best spent on OpenStack rather than cloud-native infrastructure.

Crowe sees real benefit in CNTT’s work to create profiles that enable the VNF’s decoupling from the infrastructure and to introduce more standard interfaces or APIs that allow multiple vendor implementations. These efforts enable differentiation and innovation, which creates a model around compliance and testing.

He said, “We view these efforts as important steps in the right direction. Once these models are adopted and implemented by the industry, it will give the telco ecosystem the potential to ingest new technologies at a pace similar to the major public cloud providers and enable the industry to address the automation – OpEx – challenges.”

Verification programmes

The CNTT profiles will be tested by the Open Platform for NFV (OPNFV) Verification Programs (OVP) and will benefit all parties, from NFVi vendors to VNF suppliers, service providers and their customers.

Crowe was optimistic: “For the initial wave of NFV enablement, the industry didn’t have a chance to immediately re-architect existing physical functions software or VNF software. Our initial goal with NFV was to work within OpenStack to expose many platform features that the VNF and orchestration layer needed to integrate with.

“Learning from the past, and benefiting from new technologies available, we are working with partners to enable this functionality in Kubernetes, but in a way that is architecturally much more rigorous on how platform details are exposed, with the goal of getting us to a simpler and easier onboarding and lifecycle management model.”

He compared this to what has gone before: from the outset, NFV was designed to use volume x86-based IT hardware and many workloads were duly moved, but by and large they were virtualised versions of the physical appliance, still running legacy software.

He said, “The service providers have seen limited value from it as they haven’t been able to address the OpEx challenge – with virtual appliances, there’s no common way to drive a horizontal infrastructure agenda. To date, the service providers have largely experienced CapEx benefits from being able to consolidate hardware vendors between the IT and network domains.”

The disaggregation opportunity

Hans Hammar, Head of Radio for Europe and Latin America at Ericsson, noted, “The disaggregation of hardware and software in the core network has been ongoing since 2012, but still the implementation of cloud-native core applications is not widespread. Now the industry is talking about Open RAN, and how to repeat what has been done in the core network in radio.

“Needing to lower the entry barrier [in the RAN] has been on the agenda for a long time and I think it actually could open up a bigger market for Ericsson software. Based on our experience from the core network, it will be a long journey though. Ericsson is partnering and collaborating to make this happen, but realising the efficiencies is less certain.

“There have already been rumblings in the industry about telcos swapping dependence on big vendors for one on system integrators – that wouldn’t make anyone happier.”

Jokiaho suggested radio units and radio-related software at the edge were two fundamentally different things, saying, “The main motivation for disaggregation of the base station is to move functionality from the antenna site to the edge data centre so the antenna site only has radio units, making it as simple as possible, cheaper and more energy-efficient. That’s one part of Open RAN and ORAN.

“The other advantage is to establish a standard interface between the remote radio and software functionality at the edge, so operators can acquire radio units and radio-related software from different vendors. I like to talk about this as the disaggregation of base stations rather than virtualising RANs, because vRAN puts the mindset back to virtualisation.

“It might not necessarily have anything to do with this whole disaggregation trend which, incidentally, represents an interesting opportunity for Red Hat and other software companies.” 

This article first appeared in the Q1 edition of Mobile Europe | European Communications and has been updated to reflect recent events, as indicated.