Rakuten Mobile’s DNA – template, replicate, automate, accelerate

Rahul Atri has “worked for two or three greenfield companies” including Reliance Jio, which totally disrupted the Indian market, and various vendors.

Rahul Atri is now Managing Director, Rakuten Mobile Singapore, Head of Products & Engineering, Rakuten Communications Platform (RCP), Rakuten Mobile, and works for the most disruptive greenfield of all. Some think the business model behind his particular area of responsibility, the RCP, is more revolutionary than the network build. Atri says Rakuten Mobile’s network is foundational to RCP’s success.

This article first appeared on FutureNet World and is reproduced with kind permission.

How do you go about building a communication infrastructure unlike any other in the world? Atri says that although Rakuten Mobile sees itself very much as an IT company rather than a telco, it faced considerable challenges. Hence, “when we began, we focused primarily on the basic components – people, process then technology. Automation for us has been a necessity and part of the culture from the start,” he says.

Most of his team comes from a software engineering background. Even so, “We invested a lot in people to build the entrepreneurship, mindset and DNA to think differently, not as typical telcos. Then we focused a lot on processes to define how we go about ‘solutioning’ anything, and how to make the process more digital. We didn’t start with operations on day one,” he explains.

Fundamental to automation

The team soon realised that automating everything demands precision: “For instance, if you want to auto-commission the RAN, you need to be very sure which server on which radio site, in what location and with what serial number is out there and that you want to push your configuration on,” Atri says.

He claims this is in contrast to operators’ often piecemeal approach to automation: they deploy everything then scramble to figure out exactly which inventory is where and what configuration is needed.
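
As a rough illustration of the precision Atri describes, the Python sketch below refuses to push a configuration unless the inventory record is complete. It is not Rakuten Mobile’s implementation: the `InventoryRecord` fields and the `missing_fields` and `can_auto_commission` helpers are hypothetical names chosen for this example, based only on the four details he lists (server, radio site, location, serial number).

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class InventoryRecord:
    """Hypothetical record for one server at a radio site."""
    server_id: Optional[str] = None
    site_id: Optional[str] = None
    location: Optional[str] = None
    serial_number: Optional[str] = None


def missing_fields(record: InventoryRecord) -> List[str]:
    """Return the names of fields that are empty or unset."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def can_auto_commission(record: InventoryRecord) -> bool:
    """Allow a configuration push only when every field is known."""
    return not missing_fields(record)


# Example values are invented for illustration.
complete = InventoryRecord("srv-01", "site-42", "Tokyo", "SN123456")
partial = InventoryRecord(server_id="srv-02", site_id="site-43")

print(can_auto_commission(complete))  # True
print(missing_fields(partial))        # ['location', 'serial_number']
```

The point of the gate is ordering: the inventory is verified before deployment rather than reconstructed afterwards.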

In the interests of speed, Rakuten Mobile was designing, building and running the network at the same time. Atri explains, “We realised the whole idea of automation is not about integrating a couple of systems with APIs north and southbound, but to manage the lifecycle from scratch; to do the operational day to day.

“Automation was [an] absolute necessity because we had to launch this network and that’s how we were able to do that in one and a half years. We launched 200 to 300 sites live on air every day. We auto-configured 20 to 30 edge clouds every day, and then operations, especially in this COVID situation, became a little easier for us because we invested heavily in automation.”

He states, “For us there was no playbook, no cheat code and no tools to configure the cloud-native network, and that’s…why I mentioned people and processes before the technology.”

Atri says, “[The team] went into details of each and every call flow, each and every integration. There’s not a single member in my team who doesn’t understand that when an OSS system talks to a MANO system, and a MANO system talks to a cloud, what parameters are exchanged between them. So we went into that level of detail and that’s how we kind of templatized everything, and this is the platform that we are now taking into the global arena”.

Open to the world

Rakuten Mobile is making its architecture and best practices available to operators round the world through the RCP to help them replicate Rakuten Mobile’s model and success. Rakuten Mobile will not discuss customers and prospective customers at the moment, but the RCP allows customers to visit an online marketplace where they can purchase and deploy everything they want to run their private, cloud-native, virtualised 5G mobile network, regardless of where in the world they are based.

Atri elaborates: “After the infrastructure, you have the cloud and you can deploy Kubernetes or OpenStack, whichever version [of them] you want, and on top of that you have the applications and orchestration. You can design the application and manage the lifecycle.

“The BSS/OSS layer runs on top of that. OSS is more than the ordinary OSS: we offer 45 systems…You can [choose and] register your vendor digitally, register their material hardware/software services, then you can do procurement – you can raise RFPs and track them.

“You have warehouse services, inventory management across hardware and the physical-logical topology. You then have the automation of configuration management, fault management, performance management and I could keep talking for another 45 minutes…”.

To get to this point with the RCP, Atri says the team “burned through a lot of nights focusing on how to build templates and standard interfaces, so now when they meet potential operator customers the RCP knows what to ask. It doesn’t need to speak to their vendors and the usual things, it just asks them to fill in the template and we know how to integrate the application from there.”

Atri says, “With the success of Rakuten in Japan, people are getting more open, they talk to us a lot more…Coming to the business side, we really think RCP is the pivotal point where the ecosystem and the industry will change.

“I personally love our network in Japan because it’s our baby, we built it from scratch and whatever technology we are taking to the world has been tried and tested in Japan”.

Rakuten Mobile runs Open RAN and virtual RAN (vRAN) commercially to carry 4 million users’ traffic on its end-to-end cloud network. Atri states, “[We] are managing the first realistic CI/CD [continuous integration/continuous deployment] pipeline of automation where you have auto-rollbacks and upgrades when you’re talking about the [biggest] number of edge clouds for a telecom network anywhere in the world, where you’re talking about having shifted everything from bare metal to a cloud-native approach on VNFs.”

Embracing the edge

While many telcos are shying away from or taking cautious steps towards the edge, Atri says that from the earliest stages of network design Rakuten planned to deploy regional data centres with “media services, caches, storage, and all the end services users want. It was always about customer experience, right from the start. It was about video, super low latency, and really superior service: being more reliable – all that stuff.”

He thinks the challenge for operators is that they are used to having perhaps three data centres, not maintaining and deploying thousands of them. Atri explains, “For us it comes to replicating templates. We have different regional data centres [that] are all [the] same size, the same capacity and same type of deployment.

“On the far edge, there are five or eight or 10 different types and sizes of edge data centres,” which are templated in terms of the number of racks, how many sites are terminating at the same edge, which version of cloud is involved, how many configurations, the cluster and port size, the number of IP addresses and so on.
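
A minimal Python sketch of what replicating templates might look like in practice follows. The `EdgeTemplate` class, the `instantiate` helper and every value in it are assumptions made for illustration. Only the kinds of parameter (racks, terminating sites, cloud version, cluster size, IP addresses) come from the description above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class EdgeTemplate:
    """Hypothetical template for one type of edge data centre."""
    name: str
    racks: int
    terminating_sites: int
    cloud_version: str
    cluster_size: int
    ip_addresses: int


# Illustrative values only; none of these numbers come from the article.
SMALL_EDGE = EdgeTemplate(
    name="small",
    racks=1,
    terminating_sites=10,
    cloud_version="v1",
    cluster_size=3,
    ip_addresses=32,
)


def instantiate(template: EdgeTemplate, site_name: str) -> dict:
    """Stamp out a deployment plan for a named site from a template."""
    plan = asdict(template)
    plan["site_name"] = site_name
    return plan


# Every site built from the same template gets identical parameters.
plans = [instantiate(SMALL_EDGE, site) for site in ("edge-a", "edge-b")]
for plan in plans:
    print(plan["site_name"], plan["racks"], plan["cluster_size"])
```

Because the template is frozen, a change has to be made by defining a new template rather than by editing one site, which keeps sites of the same type identical.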

Equipment is tracked from the warehouse to deployment, where the field engineer photographs each piece of equipment’s QR code to get the serial number and sends it back to base to provide a digital record. The equipment is allocated an IP address, and “I’m ready to do my upgrades or deploy cloud or anything,” Atri says. “Now for us, rolling out the feature and services is much faster.”
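
The tracking flow described here, from scanned serial number to digital record to IP address, can be sketched as below. This is an assumption-laden illustration, not the RCP’s actual design: `EquipmentRegistry`, `register_scan`, the serial numbers and the subnet are all hypothetical, and decoding the QR code is assumed to have happened already.

```python
import ipaddress
from typing import Dict


class EquipmentRegistry:
    """Hypothetical registry mapping scanned serial numbers to IP addresses."""

    def __init__(self, subnet: str) -> None:
        # Iterator over the usable host addresses in the subnet.
        self._free = iter(ipaddress.ip_network(subnet).hosts())
        self._by_serial: Dict[str, str] = {}

    def register_scan(self, serial_number: str) -> str:
        """Record a scanned serial number and allocate the next free IP.

        Scanning the same serial number again returns the address already
        assigned. Raises StopIteration if the subnet is exhausted.
        """
        if serial_number not in self._by_serial:
            self._by_serial[serial_number] = str(next(self._free))
        return self._by_serial[serial_number]


registry = EquipmentRegistry("10.0.0.0/29")  # example private subnet
print(registry.register_scan("SN-0001"))  # 10.0.0.1
print(registry.register_scan("SN-0002"))  # 10.0.0.2
print(registry.register_scan("SN-0001"))  # 10.0.0.1 (already registered)
```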

Do standards help or hinder?

Atri says in principle standards are good, but they can hinder innovation. This is particularly true when it comes to complex lifecycle management, like self-healing, because while in a data centre there is enough spare capacity to shift applications onto another server if there is a problem, it is not always possible at the edge, especially when the issue is power related or where it co-exists with other facilities.

Atri continues, “Also you have hardware and cloud to monitor, and virtual machines or containers, and vRAN. Self-healing is super easy where you only need to take care of one layer, but sometimes I have to correlate…up to the fourth layer if my KPIs aren’t right. These are practical examples and challenges, and I don’t see any practical standards that are mature enough.”

“On a scale of one to 10, I’d love to be standards compliant on eight, but still have the bandwidth to innovate”.

Raising the bar

Rakuten Mobile can carry out certain auto-upgrades across its whole network in eight minutes. While devising the solution to achieve this, Atri says, “We thought, ‘This can be a product where you can schedule things, they can auto-upgrade…’ and that’s how we think in here about automation and use cases.

“There’s a long, long journey we think about…to let the network run on its own,” but, nothing daunted, he adds, “The bar keeps going up. People were so happy with auto-configuration, then they want self-healing, then AI in the network to tell them everything that is happening, so we are pushing the bar every time, but there is still a lot to do.”

He concludes, “It will be very interesting to see [the impact of] AI and machine learning as we are only two or three years old as a network and there is a lot of network-related data that we are working on for other use cases such as energy saving – we are looking at how to save 30 or 40% – and customer satisfaction and customer rating…they are very important – we are spending a lot of effort and building a lot of platforms there.”