Oracle OCI: Episode 3: Inside NVIDIA and Oracle’s Partnership on AI and HPC in the Cloud
This podcast was originally published in Oracle Cloud Innovator Series
Oracle Cloud Infrastructure Blog
Welcome to Oracle Cloud Infrastructure Innovators, a series of occasional articles featuring advice, insights, and fresh ideas from IT industry experts and Oracle cloud thought leaders.
Oracle is now offering NVIDIA’s unified artificial intelligence (AI) and high performance computing (HPC) platform on Oracle Cloud Infrastructure.
I recently caught up with Karan Batta, who manages HPC for Oracle Cloud Infrastructure, to find out what this partnership means for Oracle customers who run performance-intensive workloads and are looking to move to the cloud. He also explains how Oracle makes it easy for customers to transfer NVIDIA HPC workloads to the cloud.
Listen to our conversation and read a condensed version:
Why is the partnership between Oracle and NVIDIA such a big deal?
Karan Batta: It’s a big deal in part because we are the first public cloud provider to support NVIDIA HGX-2, the company’s unified AI and HPC platform. But let’s talk about the GPU market for a minute. I would say that the GPU-accelerated market is going to be a huge portion of the future market. Obviously, it doesn’t make sense to move everything to a GPU. But certainly, a lot of computationally intensive tasks like risk modeling, DNA sequencing, and real-time analysis make sense for GPUs. The big use cases today are things like AI and ML, and in the future, it will be things like autonomous driving and weather simulation. Many tasks can benefit from GPUs.
Why did Oracle choose to partner with NVIDIA?
Batta: NVIDIA is the global leader right now in terms of not just the GPU hardware but the software ecosystem as well. They’ve done a fantastic job of growing their ecosystem around CUDA and different libraries such as cuDNN and cuML. What we’re trying to do at Oracle Cloud Infrastructure is enable the entire ecosystem on our platform. We’re not going to tell people to rip up their applications and use our APIs instead of anybody else’s, as other cloud providers do. If you’re already invested in the ecosystem, you’ll want to come to Oracle. Not only do we offer the best GPU infrastructure, but you also get the ecosystem along with it. As part of that effort, we also announced that we’ve integrated the NVIDIA GPU Cloud (NGC) container registry. NVIDIA essentially builds, manages, qualifies, certifies, benchmarks, tests, and publishes many containers for deep learning, ML, AI, and HPC, and now they’re moving into data analytics as well. We’re supporting all of that in our public cloud.
Are we certified for this?
Batta: Yes. Right now, we’re the only ones that have RAPIDS available on a public cloud certified through NGC. RAPIDS is a suite of open-source software libraries for executing data science training pipelines entirely on NVIDIA GPUs. It’s generally available and you can find documentation on NVIDIA’s and Oracle’s websites.
What do we offer in terms of making it easier for customers to transfer NVIDIA HPC workloads to Oracle Cloud Infrastructure?
Batta: We’ve made it much easier for customers to use the NVIDIA stack on top of Oracle. I think that is one of the biggest things that people are starting to notice. You can take any framework or application that is already running on GPUs and quickly run it on Oracle Cloud Infrastructure without changing the image or anything else. That’s true even if you have an on-premises image. You can run it para-virtualized on Oracle Cloud Infrastructure and it just works. On top of that, we are co-building this hardware with NVIDIA. We’re doing special things in regard to how we build that hardware and especially how we spec that hardware for different types of markets, whether it’s AI or a legacy HPC workload.
Can you tell me how many Oracle Cloud Infrastructure regions have these capabilities right now and what are the future plans?
Batta: This is available today in all of our regions. We have four major regions today: Virginia, Phoenix, London, and Frankfurt. And we’ve announced numerous new regions that will come online in the next 12 months in places like Korea, Japan, and India. We’re also going to have quite a few government regions along with additional regions in Europe and Asia-Pacific, so we are in this for the long term. All of these capabilities are going to be uniform across all of our regions.
Okay, I’m sold. I want to take this for a test drive. How do I try it out?
Batta: We offer $300 in free credits so you can go to our website and try it out. If you have additional questions or if you want to try out something different, feel free to reach out to me and my team. We’d be more than happy to guide you and make sure that you’re successful on Oracle Cloud Infrastructure.