Winner of the Data Streaming Company of the Year award in 2022, NASA has already been recognized for its advanced use of Apache Kafka®. What the organization has already achieved with data streaming is impressive, but the agency’s unique internal structure means its mission teams are adopting this technology at different paces.
At Current 2023: The Next Generation of Kafka Summit, Joe Foster, Cloud Computing Program Manager at NASA, joined Confluent co-founder and CEO Jay Kreps on stage to talk about NASA embracing cloud computing, the GCN project (that won them the Data Streaming Company of the Year award), and NASA’s new Data and Reasoning Fabric initiative aimed at enabling the full potential of future air mobility.
In this Q&A recap, find out about interesting projects NASA has been working on and how innovative technologies like cloud computing and data streaming are helping NASA advance into the future.
Jay: Tell us a little bit about yourself and your role at NASA.
Joe: When I started five years ago, I had a very simple mission: lowering the barrier to adoption of cloud computing technology. There were no ideas around migrating things from the data center, no application rationalization. It sounds easy enough, right?
But my hiring manager alluded to the challenges: A large geographically dispersed workforce, a large variety of mission types, we didn’t know what the requirements and demands were for each mission, and we didn't know who the customers were.
The good news? We had an operational system within ten months, and here we are four years into operations. We now support 155 projects across all ten NASA centers. It’s been pretty tremendous growth.
And like a lot of high-performing orgs… we’re constantly reevaluating our tech stack. We had stumbled across this self-organized Kafka community of practice about two years ago, and we said this is something we really want to invest time in. That led us to having conversations with Confluent about FedRAMP and how you can help us with the FedRAMP certification process.
Jay: Last year, you won the Data Streaming Company of the Year award for the GCN project. Can you tell us what that is and why it’s important to NASA?
Joe: GCN stands for General Coordinates Network. It’s a public collaboration platform run by NASA for the astronomy research community to share alerts and rapid communications about high-energy, multimessenger, and transient phenomena. It works on a publisher and subscriber model. Observatories all over the world publish alerts in real time when they witness any kind of transient phenomenon in the sky. And anyone can subscribe to those alerts, whether you are a NASA researcher, a researcher at a university, or even just someone who owns a telescope. When you receive one of those real-time alerts and you’re in the area of observation, you can go outside and witness the phenomenon. This allows us to crowdsource observations on these kinds of things. It’s a capability we didn’t have before.
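To make the publisher and subscriber model Joe describes a little more concrete, here is a minimal in-memory sketch in Python. It is not GCN's implementation, and it does not use Kafka or any GCN client. The `AlertBroker` class, the `Alert` fields, the topic name, and the observatory name are all hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class Alert:
    """A simplified alert; the field names are assumptions for this sketch."""
    topic: str
    observatory: str
    message: str


class AlertBroker:
    """Toy in-memory broker: publishers and subscribers only share topic names."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Alert], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Alert], None]) -> None:
        # Anyone can register interest in a topic without knowing who publishes to it.
        self._subscribers[topic].append(callback)

    def publish(self, alert: Alert) -> int:
        # Deliver the alert to every subscriber of its topic; return how many received it.
        callbacks = self._subscribers.get(alert.topic, [])
        for callback in callbacks:
            callback(alert)
        return len(callbacks)


# Usage: two independent subscribers receive the same alert.
broker = AlertBroker()
researcher_inbox: List[Alert] = []
hobbyist_inbox: List[Alert] = []
broker.subscribe("alerts.transient", researcher_inbox.append)
broker.subscribe("alerts.transient", hobbyist_inbox.append)

delivered = broker.publish(
    Alert(topic="alerts.transient", observatory="example-observatory", message="possible transient event")
)
print(f"alert delivered to {delivered} subscribers")
```

The point of the pattern is decoupling: the observatory publishing the alert never needs a list of who is listening, so new subscribers can join without any change on the publishing side. A production system would add durable storage, authentication, and network delivery, which this sketch leaves out.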
There’s another project I would like to mention: DAPHNE. We’re entering the era of petabyte-scale science, which is a challenge for the way that we have traditionally done business as an agency. Most IT organizations don’t have ground stations all over the world as part of their IT footprint, but NASA does. And the way we had traditionally done business is that each mission, as part of its build and commissioning process, would build its own mission-unique hardware and software. These would then get installed at the pedestal of those ground stations. It became a very challenging operations and maintenance process: the hardware had to be maintained over time, and oftentimes that fell to groups outside of the mission.
With DAPHNE, we’ve moved to standardized racks of hardware and we’ve offloaded most of the computational heavy work to the cloud using serverless processing. By doing this we’ve saved U.S. taxpayers hundreds of millions of dollars by avoiding those tech refresh costs.
Jay: I would love to hear about what’s next for NASA.
Joe: There’s some really exciting work happening right now.
Let’s imagine the future together. The year is 2035 … you reach over to your smart device and you order a cup of coffee. You say, “Hey, please have it delivered to my front door from Starbucks.” Five minutes later, you see a drone setting down and dropping that cup of coffee off on your front porch and it flies away without spilling a drop.
Now, what does it take for that drone to get from your nearby Starbucks to your front porch? What does the terrain look like, are there trees, what are the buildings, what are the surroundings, what’s the weather? That drone is gonna have to have millions of data points streaming to it every minute to make those kinds of calculations. And there’s not just one drone, there are hundreds of drones potentially doing this.
If this is the world you want to live in, it’s going to take a lot of R&D work to get there. And NASA is the agency that’s equipped to facilitate some of this R&D effort. We’ve started something called the Data and Reasoning Fabric, where we’re trying to democratize data by creating this data ecosystem.
As the Cloud Computing Program Manager at the Goddard Space Flight Center, Joe had unique insight into challenges that individual teams at NASA have faced when adopting new platforms. During this year’s Data in Motion Tour, Confluent Field CTO Will LaForest sat down with Joe to talk about his team’s role in cloud and data streaming adoption across the agency, as well as the importance of immediacy in NASA’s work.
In this Q&A recap, find out how and why the demand for data streaming has grown across the agency.
Will: Can you explain the overall structure of NASA? I think understanding that is really key to realizing how important data streaming has become to the bigger picture.
Joe: Like you mentioned, everybody knows NASA as a brand name, but not everybody understands the ins and outs of how we work. NASA has ten centers plus its headquarters in downtown DC. Our headquarters really operates as the policy shop and the money shop. The centers are where the work is actually done.
And it’s quite diverse, as each center can have multiple physical facilities located across the country. Each center has its own management structure and its own core capabilities. Being so distributed, we actually have to have distributed IT as well.
The vast majority of the money that flows into IT spending at NASA lives in individual mission directorates. Ultimately, that means that as government civil servants, my team and I are focused on coming up with new, innovative things to help mission teams with their goals. If we’re not making progress, those teams will just go do things by themselves.
Will: Tell us more about your job – what are your responsibilities as a Cloud Computing Manager at Goddard?
Joe: Around four and a half years ago, I became the first full-time civil servant at NASA dedicated to cloud computing. Before that, everyone else who worked on cloud computing at NASA was doing small-scale projects. It might have been a data center manager who backed some things up in the cloud or an application developer shop that wanted to work in the cloud.
But overall, there was this high barrier to entry. Users had to get an AWS account and then use vanilla, off-the-shelf tools and figure everything out on their own. So my charge was to find ways to accelerate the mission adoption of cloud. I didn't care what's in the data center. Migrating the data center was not my goal.
Instead, I was supposed to find ways to partner with mission teams and find ways to accelerate their adoption of the cloud. So we built something called the Mission Cloud Platform.
The challenge was that there’s no mandate to use the platform, we had to convince people to opt in. And three years in, despite starting with no plan and next to no budget, we have 145 projects across all ten NASA centers.
Will: That’s an amazing feat. And I think this idea of lowering the barrier to entry to adopt technology is going to be an ongoing theme of this conversation. Your responsibility is to get people into the cloud, but where does that fit into the overall strategy for NASA?
Joe: We're moving large volumes of data to the cloud. NASA has an edict to share its science data publicly to the maximum extent possible. To give you an idea of the scale, we now have 60 petabytes of data in the cloud. And for the month of January 2023 alone, we had almost 4 million compute hours in the cloud.
Getting data into the cloud and sharing that data with our channel partners is a high priority for us. At the end of the day, we don't want to just do “lift and shift.” We want to actually try to be innovative with things that we do. And data streaming actually feeds into that significantly.
Will: Sometimes it’s actually quite difficult to explain to people just how important immediacy is for specific missions. Often, it’s easier to understand why it matters for things like banking and transportation because people can relate more easily.
With astronomy, they might envision someone taking their time looking into their telescopes and writing papers. But in reality, seconds and even milliseconds make a huge difference. Can you explain more about the importance of immediacy in this field and how the appetite for data streaming grew within NASA?
Joe: Take a supernova—that’s some of the most transient activity in the sky. It would be there for 30 seconds. So getting that near real time alerting allows us to get as many observations of it as possible. That additional data provides context that’s extremely important, and it's yielding an immediate impact on the community as well.
As for our journey with data streaming, we wanted real-time analytics, and of course that led us to real-time data streaming and Kafka. That’s where we started, and we realized that it was going to be a pain to manage on our own. So we looked at Amazon MSK. Although that managed Kafka service would be a bit better than self-managing open-source Kafka, it still would have required a lot of labor on our part.
Our lives have been a lot better and easier using the managed cloud service from Confluent.
Will: From what I’ve heard, Confluent’s cloud service has actually become really important to NASA broadly, not just for individual projects like GCN. Can you touch on that and why there’s so much internal demand for data streaming?
Joe: Well, when I first met up with the team on the GCN project, I was somewhat surprised to learn that there was a Kafka community of practice that had self-organized inside the agency. At the time, there were already around 15 projects that were all talking to each other about lessons learned and sharing the struggles of using open-source Kafka versus solutions like Confluent Cloud.
So going back to what I explained earlier—my team’s job is to provide services that help the mission teams and ensure they’re not off doing their own thing. It was pretty much a no-brainer decision, so we started looking at how we could design or bring in an enterprise cloud service to fill this need.
The value proposition was strong, and we already had this existing community of practice. We knew we needed to sponsor this project and bring this on as an enterprise service so we could then offer it more broadly across the organization.
And to learn from an even larger community of Kafka practitioners and contributors, and hear about interesting Kafka use cases, watch Current 2023 keynotes and sessions on demand now.