OpenStack Podcast #20: Joe Arnold

From days of setting up internet in his dorm and almost being kicked out of college for it, Joe Arnold’s tech roots are strong, and his passion is contagious. In the latest OpenStack Podcast, join the founder and CEO of SwiftStack as he discusses:

OpenStack Swift: What is it?
How the Enterprise Storage market is changing
How SwiftStack adds value on top of OpenStack
Standalone OpenStack Swift for particular use cases and industries
New O’Reilly Book, “OpenStack Swift”, and how you can get a copy

You can follow the podcast and see the past and future guest schedule at @openstackpod and follow Joe Arnold at @joearnold.

https://www.youtube.com/watch?v=82K9tpgqUps

For a full transcript of the interview, click read more below.

Jeff Dickey: All right. We’re on.

Niki Acosta: (Singing) We were talking about playing music to start off the podcast because we used to do that before we got lazy, but we’re here and we didn’t have music. That’s okay because we have an amazing guest with us today. My name is Niki Acosta with Cisco.

Jeff Dickey: I’m Jeff Dickey with Redapt.

Niki Acosta: Mr. Arnold, one of the nicest guys in OpenStack for sure without a doubt. Please introduce yourself.

Joe Arnold: Too nice Niki. Hi, my name is Joe Arnold and I’m one of the co-founders of SwiftStack and what we do is we make OpenStack Swift which, for those who don’t know, is an object storage component that is part of OpenStack. That’s who I am.

Niki Acosta: We’re going to get into your background story but first we were marveling at the amazing view you have behind you today. Tell us about your view.

Joe Arnold: Well, we’ve been crammed together. It’s so fun starting a startup because every step along the way and particularly in the early days, you change offices all the time. We went from my basement to a shared work station. When we had a co-working space, it was one desk and we managed to recruit John Dickinson from Rackspace. He and his wife walk in and they look at this co-working space and it’s dingy, it’s dark and five dudes all stranded around one desk. He’s like, “Oh my God. What did I get myself into?” We’ve upgraded through the years and now we’re in a great space in downtown San Francisco. It’s beautiful. There’s plenty of empty desks for future hires that we’re bringing onto the team, but it’s neat to be at a growth inflection.

Jeff Dickey: That’s awesome.

Niki Acosta: We’ll definitely get into that later for sure. Jeff, you can do the honors of asking the first question here. The first real question.

Jeff Dickey: All right. Well, so the real question is, are people going to be stopping by your basement in 10 years to take pictures of where SwiftStack started?

Joe Arnold: Yeah, I don’t know. We’re hard at work and office is one of those necessary evils and we’ve been the master of subletting a sublet. I don’t know.

Jeff Dickey: I’m so happy you guys are growing. That’s awesome. We like to start off with just who you are and how you got into technology so take us through that journey from young Joe getting into technology and how you got into OpenStack from there.

Joe Arnold: I’ve always been pretty geeky. It depends on how far back you want to go. I think when I was a freshman and even in high school, I knew, okay, here’s the degree I want to get. Here’s the university I want to go to. Get a computer science major. Get some business training. When I got to university, that was around the time when web development was happening and I really got into it. The web was just being turned on. I’d gotten myself into trouble a couple of times.

One time, and I don’t know if you remember, but rewind in your head to when you had to use a dial-up modem. That was back when you did file sharing and there was really no high-speed internet. Well, even in the dorms in my university, they didn’t have wired internet. I couldn’t believe it and so I’m like, “I’m going to fix this.” Wired up the whole dorm at night when no one was looking. One day, they discovered it and they kicked me out of the dorms. I got put on disciplinary probation but it subsequently landed me a job running some of the labs and IT infrastructure in the computer science department, which is pretty cool.

I worked there for awhile and then I just got really obsessed about web and web infrastructure and web servers and things like that but I needed to have a project. Our university didn’t register their .com address. They registered the .edu version of the address. So I registered it. Put up a website. Had fun with it. They didn’t like that. They sued me and tried to kick me out of college but I made a deal with them and they let me stay in the University.

I don’t know. I’ve been going back and forth of trying to build stuff, experiment with stuff through my whole career starting even back then.

Jeff Dickey: That’s good. That’s awesome.

Niki Acosta: We’ve done this twenty times and that is by far the best one we’ve ever heard.

Joe Arnold: When I donated the name back to the University, I did it in such a way that I got a plaque on the donor board in the computer science department. It’s HP, Cisco, and some rich people, rich person and then my name as a college kid donating to the University for the stupid domain name.

Jeff Dickey: That’s awesome.

Niki Acosta: Awesome.

Jeff Dickey: Yeah. It sounds like you’re very unique, just from hearing about that background, with both a technical and a business side to you. Negotiating and building?

Joe Arnold: Yeah, you learn a lot about trademark law when you’re forced to. From there you do what a lot of folks did around 2000. It was, hey, let’s build a web development shop and start working on that. I quickly realized that you got bored of those projects and I knew I had to go to Silicon Valley so I moved down there. Did a number of companies, one of which became a part of Aruba, which is WiFi, and we were doing network management. That’s where I met a few of my co-founders for SwiftStack. Then I got an opportunity to work at Yahoo. Spent some time in Bangalore in India. Had my first daughter there and then went to a company called Engine Yard. This is really where I learned a lot about open source and how to build a business around it.

For those who don’t know, what we did at Engine Yard was we worked on Ruby on Rails. We worked on Ruby itself. This was around 2006, 2008. 2009, stuff started getting started, so think the introduction of the iPhone. Suddenly there’s this big push for how do we build and launch these web applications? We basically took it from a, we can help you deploy and manage and operate and scale your Ruby on Rails application. How do you do that? How do you be an open source company, contribute to all these projects while still running a business around it?

We started doing managed hosting. That was one way to do it. Then Amazon, with the web services side, they were just getting started. Around 2008 is the time frame when they did that investment. They came up, “Hey guys, come up to Seattle.” They put us in a conference room and they basically started parading a bunch of the engineers who were building Amazon and like, “Hey, how can you use this? We’re just about to launch EC2.” What we did was we built a product like a web service that allowed you to manage your EC2 environment and then the layer that we put over it was, here’s how to get a Ruby on Rails application running on that Amazon infrastructure. There wasn’t a word for it at the time but it just emerged to be called a platform as a service. Heroku was our chief competitor around that time and we just got a whole heck of a lot of experience running and learning how to use cloud computing infrastructure and build and deploy applications on it. That was a really cool experience to have. I was running engineering for them when I was there.

Niki Acosta: Then OpenStack, how did you make that jump to OpenStack? Whose brilliant idea was it? Was it the kid who almost got kicked out of college twice?

Joe Arnold: Well, Randy Bias had a lot to do with this. He always falls into the mix here. I got an awesome opportunity to be able to work with Randy Bias. That was right around the time when OpenStack was getting launched and built. I didn’t get a chance to go to the very first Austin Summit but I’ve been to all the other ones since. What Randy was able to do was go, okay, let’s get this infrastructure up and running and let’s get our hands dirty around this and work with some folks who are deploying it. We went to Korea Telecom and Internap and I was stuck in this, here’s this OpenStack project. Go make it work.

What I gravitated towards really early was OpenStack Swift. I just loved the technology. I read everything I could about it. Dove into the source code. I just really got enamored with it. A lot of that had to do with I thought I saw something there that wasn’t so obvious to folks where everyone else who first jumped into the OpenStack compute environment really started to take on just Nova and compute and networking and block storage and all those aspects. But the experience I had way before this was, whoa, we had to deploy applications in Amazon. These are big applications. This was Groupon, Seeking Alpha, New Relic, and some of them could go on Amazon, some of them couldn’t go on Amazon. We had to understand what the different tools you had in those environments were. One of them was object storage. When we went to go to deploy customers in that environment, you couldn’t just say, “Well, here’s your distributed file system to use to store documents or profile photos.”

What we did was we changed how they store that data by using Amazon S3 and object storage in order to get their applications to work. When Swift just came out, that was a direct response to S3 from Rackspace. Rewind to when OpenStack was launched, it was Nova and Swift. Swift came from Rackspace. That was already a production object storage system that was up and running and serving customers and so there was a semblance, a kernel, if you will, of something that was already hardened to a certain extent in a production environment. That also attracted me.

Once we got it up and running in these environments, now I know okay, applications can be built around this. You can scale really well. Then we started doing operational testing around it. Like literally going out and pulling power plugs out of servers. Niki I heard you talking about this on the last podcast. That was like a huge proof point for me just because being in an operational seat dealing with storage when there’s outages is such a hard thing to deal with and the system had the kernel to be able to survive those types of major operational events was pretty cool. That’s really what got me sucked into Swift.

After working with Randy for a little bit of time, then I did the leap and started SwiftStack.

Niki Acosta: For people who don’t know what SwiftStack is or maybe for people who’ve heard of it but haven’t really taken the dive, tell us what SwiftStack is and does? Is it a managed offer? Is it all open source?

Joe Arnold: Yeah, all right. Here’s what we do. Number one: We’re an object storage company and that means that for those who are building an application or have a lot of data to manage, an object storage is a great way to store that because it can scale. It’s easy to manage and to operate. Those are what were some of the key tenets. Low cost because you can run it on standard or volume scale hardware. What we do is we work on OpenStack Swift which is an engine, would probably be the best way to describe this. It’s open source. Then what we’ve done is we’ve built out deployment. We built out management, automation and we do that via this thing what we call a controller. The SwiftStack controller. That is what manages and operates that environment. We’ll do things like we go into customer environments.

We can talk about a couple here. Maybe it might be a good way to bridge in this, like with Jeff, with Redapt, we’re working with Ancestry.com. I think it’s good to explain what object storage is. With Ancestry, they’re storing and they’re serving all sorts of documents and images and scanned records of one type or another. What that means is that there’s a lot of simultaneous connections going into that storage environment and they’re serving this content directly out to web pages, mobile devices. Object storage is perfect for that. What we do at SwiftStack is we turn that standard hardware into a software defined storage platform that allows them to manage and scale that environment.

Now, there’s no OpenStack in the rest of Ancestry yet but they can use the storage system independently of the rest of OpenStack. A distinction here to make is when you go and you want to just consume object storage, consume Swift, you don’t necessarily need to plug that in to the rest of a compute environment that’s up and running. Take for example, Time Warner Cable which is… they’re all in OpenStack. They’re doing compute, they’re doing networking, they’re all in. We’re a component which supports that with backup images and archives and snapshots, things like that. But we’re also a storage target for the next generation of things like Video on Demand or the time-shifted cloud DVR products that they’re building out. Usually we’re brought in to solve the storage problem and sometimes, OpenStack is involved in terms of the compute side but sometimes, it’s just, hey we have a huge amount of data that we need to store. We have a large volume of users that we need to get up, get storing and serving data from. That’s where we get pulled in.

From a product perspective, what we’ve done is we’ve taken it away from being support or services. When you go to deploy an open source project, and we experienced this in the early days even back at Engine Yard with Ruby on Rails. It really takes a lot of expertise to set it up, deploy it, scale it. In OpenStack, in infrastructure, it’s no different. It takes a lot of specialized knowledge to know how to get the systems up and running, how to integrate with the hardware that it needs to run on, how to manage that, how to tune that and then even once you do all that, you have to still plug in to your existing environment. You have to go, all right, how am I going to monitor this thing? Am I going to write my own SNMP traps when drives fail? Am I going to integrate this into my Active directory and LDAP environment? How am I going to do upgrades?

You have to hit point by point all of these little things that don’t necessarily appear at the surface when you first get started with it, which is great. We love people pulling the open source code and getting their hands dirty. We have a book that teaches people how to get it up and running, just the open source bits. It’s not something that we discourage but what we found is that when people get serious and they’re running it in production, it’s something that they want to turn to a company that specializes in that. Has the tools and the software already in place so that they can just plug it in and integrate it and then they can go. That’s where a lot of the draw for us comes from.

Niki Acosta: We partnered with you. At Metacloud, now Cisco OpenStack Private Cloud, we’re partners with you. We rely on you guys to come in and help us with folks who are really serious about needing object storage, which there are many people who do. One of the questions we have on the Metacloud side, which I’m sure you do too, is “If the public cloud already does this, if Rackspace Cloud Files already does this, if Amazon S3 already does this, then why would I turn to do this in a private environment?”

Joe Arnold: Great question. There’s a couple of things. Cost is a huge reason why people will pull back out of the public cloud or reconsider using the public cloud because when you begin to operate at very large scales, and by very large, I don’t mean horrendous scale, I mean a few racks worth of equipment. A few hundred terabytes, a petabyte…which in our world isn’t a tremendous amount of storage…it can be really expensive to run on the public cloud. S3 in particular. It’s the one that we model it out against. We have spreadsheets, TCO calculations that we help people do because often times they’re being challenged from up above. They have a CTO or CIO who’s saying, “Well, let’s investigate what the public cloud has to offer?” That’s a totally valid thing to go out and do. When you actually go pencil the cost out, it can be a lot less expensive to bring the storage and even some of the compute environment on premises. That’s one.

The second thing is, sometimes there’s some data workflows that really just can’t go over public internet or if you did, it would be exorbitantly expensive. Can you imagine you’re shooting 4K video which has this crazy bit depth? There’s so much data going into the workflow. There’s more cameras. If you try to put that over a wire and put that into a public cloud provider, the costs are just game over. The time it takes to upload it is just crazy. You need to develop these workflows on premises just so you can get the speed of the workflow down. We’re seeing this across the board from video production to research and new equipment that’s just producing data…just a tremendous amount of data out of some of the scientific equipment to even some of the security footage, and storing and retaining and archiving that. That’s category number two why people do on premises.
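To put rough numbers on the upload-time point, here is a back-of-the-envelope sketch. The data sizes and link speeds are hypothetical values picked only for illustration (none come from the interview), and the calculation ignores protocol overhead and contention, so real transfers would take longer.

```python
def upload_hours(data_terabytes: float, link_megabits_per_sec: float) -> float:
    """Hours needed to push data over a link, ignoring protocol overhead."""
    bits = data_terabytes * 1e12 * 8                 # decimal terabytes -> bits
    seconds = bits / (link_megabits_per_sec * 1e6)   # Mbit/s -> bit/s
    return seconds / 3600


# Hypothetical footage sizes and uplink speeds, for illustration only.
for terabytes, mbps in [(1, 100), (10, 1000)]:
    hours = upload_hours(terabytes, mbps)
    print(f"{terabytes} TB over {mbps} Mbit/s: about {hours:.1f} hours")
```

Even under these idealized assumptions, a single terabyte over a 100 Mbit/s uplink takes the better part of a day, which is the kind of delay that pushes these workflows on premises.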

Number three, there’s security and compliance reasons and very, very valid reasons why they don’t want to put data in public cloud. Regulated industries, things like health care and finance. Those industries, they still want to be on the cutting edge of building an application. They still want to maintain agility for the developers as they’re building out these applications. They have to provide all these tools. They have to give them instances on demand. They have to give them object storage. They have to give them these tools so they can deploy, launch, integrate, build, scale, just as fast as folks in other industries… but they’re highly regulated. They have to do everything on premises and they have to be building out these internal clouds for them to use so that they can get to market.

Niki Acosta: One of the early use cases that came to mind when I was back at Rackspace that I think I was asking John, your CTO, to help me with was a bank that was basically moving to an online banking system and so they were, obviously they scanned your checks or whatever, but they needed to store these checks for 10 years or something. They didn’t want to do it in the public cloud. You’re not going to put someone’s copies of checks in a public cloud. The trust wasn’t there and to some extent may still not be there putting that stuff out of your 4 walls. I see you guys have done a lot in the media space which… there’s a lot obviously happening with mobile and media but how do you guys play well in that space?

Joe Arnold: There’s really two worlds that need to be solved or two really big pain points. It’s first just the amount of data that’s ingesting. You have just a tremendous volume of data that’s being created and that needs to be stored. The original copies need to be stored and while it’s fine to produce something, put it on a tape and stick it on a shelf, what’s really neat is what’s happening on the content distribution side. If you think about when you go to produce and then deploy … Sorry, I’m using programmer words for this but when they go to deploy the product that they have, it’s in a certain language, certain frame rate, built for certain devices. Well, let’s say a year later they want to go back. We’re going to re-translate this into Japanese or another language. What they can do is they can just pull the original source files out into their workflow editors, make those changes and then re-cut the thing and then re-distribute it. What object storage allows them to do is instead of just keeping a linear version of that whole production they’ve made, they can actually keep the project in individual files and then just re-hydrate that back into their environment. It allows them to do more, get more value out of that content that they’ve already produced. That’s one.

Then on the backend is around distribution. This is actually a place where object storage really shines because it can be a content delivery machine. Even if it’s feeding a content delivery network, it’s really good because it’s speaking HTTP already. What that means is that you can serve that content out. Let’s say it’s a long tail content like Ancestry.com, long tail content. Time Warner Cable, you’re going to have some popular shows. The popular shows you can feed into whatever the content delivery mechanisms are. You can put caching in there to speed the access but that long tail stuff can be served out of the system directly and because it speaks native web protocols, it just makes building applications around that much easier.

Niki Acosta: You’re making object storage sound so fun. You’re just so passionate about this. It’s really awesome.

Joe Arnold: Yeah, I know. We have a workshop around … Pinterest? Do you know that application? That web application.

Niki Acosta: It’s my demographic. I’m the target demographic for Pinterest.

Joe Arnold: What we built was something entirely written in Swift and we called it Swinterest as a programming exercise to teach people how to use object storage. It has different users and you can upload photos and you can rewrite, tag them and things like that because there’s things like metadata associated with objects. That can be fed into a search index. You can have different users. All of this can be entirely written in Swift because you can store this data, you can store the metadata, you have multiple users in it but it’s a fun exercise to show people the power of how these objects …
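As an illustration of the metadata idea described here, the following is a minimal sketch, not the Swinterest workshop code. Swift carries custom object metadata in HTTP headers prefixed with `X-Object-Meta-`; everything else below (the `Tags` key, the comma-separated tag format, the photo names, and the in-memory dictionary standing in for a container listing) is an assumption made up for the example.

```python
from collections import defaultdict

META_PREFIX = "X-Object-Meta-"  # Swift's header prefix for custom object metadata


def metadata_headers(metadata):
    """Turn a plain dict into the headers that would accompany an object upload."""
    return {META_PREFIX + key: value for key, value in metadata.items()}


def build_tag_index(objects):
    """Build a tag -> object names index from each object's metadata headers."""
    index = defaultdict(set)
    for name, headers in objects.items():
        raw_tags = headers.get(META_PREFIX + "Tags", "")
        for tag in raw_tags.split(","):
            tag = tag.strip()
            if tag:
                index[tag].add(name)
    return index


# Hypothetical photos; in a real deployment these headers would come from the cluster.
photos = {
    "alice/cat.jpg": metadata_headers({"Tags": "cats, funny"}),
    "bob/dog.jpg": metadata_headers({"Tags": "dogs, funny"}),
}
index = build_tag_index(photos)
print(sorted(index["funny"]))
```

The point of the sketch is only that the tags travel with the object itself, so a search index can be rebuilt from the stored objects without a separate database of record.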

Jeff Dickey: We’ve talked about the different use cases and some of the stuff. You guys are involved in some pretty large scale projects around this object storage. What are some lessons that you’ve learned from deploying SwiftStack at scale? What are some of the things you’ve learned and obstacles you’ve overcome?

Joe Arnold: Hardware matters and what’s important is the flexibility of that hardware. Often times people think that you can take something like a storage system and just put any old hardware in it. That’s true in a technical sense. John Dickinson, the project technical leader for Swift, he has a blog post up that he did on his personal website where he got Swift running on a Raspberry Pi. That’s awesome. You can totally go out and do that. What we found is for people to have a good solid experience, particularly in the enterprise, people want a piece of hardware to consume. They want the software to be installed very easily on that environment and then they want to be up and running and have the ability to support that over time. Hardware’s important because you want to know the use case, so if I’m going to go in an archive workload where I want price-per-gigabyte as the overwhelming factor, then okay great, here’s how to set this up. Or if people are setting up a situation where they want to serve lots of content, a high-throughput environment, then that configuration’s going to be a little bit different. That’s one.

Networking is the second thing that is important. You have to understand the data flows because we’re not deploying a single rack storage environment. It’s multi-rack. Almost every customer is multi-data center for either serving data or remote office or for disaster recovery and so having the understanding of what can go over the WAN link between those environments, the capacity of that WAN link, configuring which networks are used for data transfer versus serving data. That would be the second thing: if you’re taking on a Swift project, make sure you spend some time and understand how that’s going to be laid out.

Jeff Dickey: It’s interesting you bring up a disaster recovery scenario. I’m not obviously the foremost expert on Swift but does that sort of ability to have DR, where does it happen? Does it happen at the hardware level? Does it happen at the OpenStack level? Does it happen at the application level? Can it happen at all levels?

Joe Arnold: We tend not to do it at the hardware level. We’re in discussions with some more traditional storage vendors. There’s a moment. There’s like, “Whoa, you guys spend all your time thinking about how to deal with hardware failures. That’s all your exceptions. That’s what your head space is at. We spend all of our time thinking about how to prevent failures from happening.” It’s a little bit of a different mindset. What you do is you rely on either replicas or erasure coded parity bits being distributed across a lot of different places and then you leverage that so when there’s failure on a piece of equipment or you can’t route to a certain location, then you’re still okay because you can serve or store data in these places where you can have access to.
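The replica idea can be sketched in a few lines. This is a toy placement scheme, not Swift's actual ring algorithm: the zone names, the hash-and-walk placement, and the replica count of three are all assumptions chosen to show why spreading copies across failure domains keeps data readable when one location is unreachable.

```python
import hashlib

# Hypothetical failure domains: two data centers with two racks each.
ZONES = ["dc-east/rack1", "dc-east/rack2", "dc-west/rack1", "dc-west/rack2"]


def replica_zones(object_name, replicas=3):
    """Pick distinct zones for an object by hashing its name (toy placement)."""
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(ZONES)
    return [ZONES[(start + i) % len(ZONES)] for i in range(replicas)]


def is_readable(object_name, failed_zones):
    """An object is readable if at least one replica sits in a healthy zone."""
    return any(zone not in failed_zones for zone in replica_zones(object_name))


# Simulate an entire data center going dark.
east_down = {"dc-east/rack1", "dc-east/rack2"}
names = [f"photos/img-{i}.jpg" for i in range(100)]
print(all(is_readable(name, east_down) for name in names))
```

Because three replicas land in three distinct zones and only two zones fail, every object keeps at least one reachable copy, which is the property being described in the answer above.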

That’s just a different mindset on how you build out a storage environment. You think about all those corner cases, not just about a single rack but how does it affect multiple data centers, multiple racks of equipment. For disaster recovery… maybe disaster recovery isn’t the right word for it, because what people are really doing is they’re putting their infrastructure in two different data centers and using them both. Most of the time, the reason why they want to do this, it starts out with disaster recovery as a reason. What that often turns into is, hey, how can we provide a better experience for our users, whether they’re internal users or they have an application they’ve built? What we can do then is we can send users who are on the East Coast versus the West Coast to two different data centers and they can get a better experience and better response times. You can service them much more quickly. Yes, when one of those data centers does go dark, then we can just route everything to one of the other data centers and then the application can pick up where it left off.

There’s another thing too: it’s the difference between object storage and how you build applications. That infrastructure layer is dealing with the failures. It’s not like you have to build into your application, oh now, I need to go over to this data center. You don’t have to put that burden on the developers. The infrastructure team can take that on when they’re using object storage, using Swift. Then that infrastructure can deal with it and the application doesn’t necessarily need to. It does have to know how to use object storage instead of a file but the trade-offs are just much better for application developers.

Niki Acosta: I think you’re onto something there for sure. A lot of what we’re hearing, especially in regards to platform as a service, is at the end of the day, people are just trying to make things easier for developers and I’m sure you see when you have this discussion, especially in the enterprise, you probably see all the light bulbs going off, like, “Whoa, we can do all that stuff? We can use two data centers at the same time?”

Joe Arnold: It’s funny too because the operators, they love it. They just get excited about the direction and the road map and the future of the next generation of things that are coming out. Then they go and stand it up and they have to go, all right, what can I do with it? There’s usually two phases. One is just taking on operator workloads. Things like back-ups and archives and snapshots. I can’t tell you how many petabytes of just database back-ups we’re storing or virtual machine back-ups that we’re storing. It’s a ton. The reason is because the operators can go, all right, I’m buying into this and they get it up and running. They get it deployed and then they just start unloading all of that type of storage into this environment. That’s the first step. The next step is, okay, let’s evangelize this technology. Let’s start getting it worked into the workflow of what’s already there.

That’s actually the tricky bit because you have developers who can come on and they love it but you might not have existing applications that support the object API. They bought into this direction. They want everything object in the future but they have this gap in between where not all the applications speak object. One of the things that we’ve built as part of the product that we license is a file system gateway. What the gateway does is it mediates files to objects and objects to files. That lets them not have to crack open every single application they have. That means they don’t need to change the whole world but it still allows them to get bought into this vision of where they want to take their infrastructure.

Here’s the pain, right? You have operators who are under pressure to do in effect more with less or they need to accelerate the, I hate to use the word twice, agility of the teams that they’re trying to support. What they’re often times turning to is this notion of infrastructure as a service and delivering that themselves. This IT as a service. That’s the thing that they’re piecing together. That’s this portfolio of tools that they’re trying to stitch together. They’re picking a platform as a service component. They’re picking out an infrastructure or compute as a service and storage is being thrown into that mix too. Then, all of those different components, they need to plug back into different things that they need in order to support that. So it has to support things like authentication. It has to plug into things like charge back and utilization. It has to plug back into operations. These are all really important things that folks need to take into account when they’re getting this stuff up and running.

Jeff Dickey: One of the things I want to talk about too was the differences between the object store offerings at OpenStack. It seems like object store is the only thing I know of that kind of competes in OpenStack. You’ve got Ceph and Swift. What are some of the differences?

Joe Arnold: First off, object storage doesn’t do block storage. The difference is, if you want to think about it, there’s three types of storage. I’m greatly oversimplifying, right? There’s block storage, which exports sectors, if you will, to computers; it allows you to put a file system on it. There’s file systems which allow you to put files and that’s kind of what you’d see when you mount something on a desktop. Then object, which is mostly delivering content via HTTP or web protocols. What you need in order to run a database or a virtual machine is usually block storage. The reason why you need to do that is you need to have a very strong, consistent view of what that storage looks like because when you go to update a database record, you need to know, all right, no one else touch this thing. I’m about to update it. You do that update and commit that thing down. If you had what we call an eventually consistent model, meaning you could write data in and then there could be some percolation time. There are scenarios where there would be some percolation time. Then that database application’s not going to work that well. That operating system run time’s not going to behave well.
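The "percolation time" in an eventually consistent model can be shown with a toy simulation. This is an illustrative sketch only and does not reflect how Swift or any particular system replicates data; the class name, the three in-memory replicas, and the explicit `replicate()` step are all assumptions invented to make the stale-read window visible.

```python
class EventuallyConsistentStore:
    """Toy store: writes land on one replica and reach the others later."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self.pending = []  # (replica_index, key, value) updates not yet applied

    def write(self, key, value):
        self.replicas[0][key] = value              # acknowledged immediately
        for i in range(1, len(self.replicas)):
            self.pending.append((i, key, value))   # percolates later

    def read(self, key, replica):
        return self.replicas[replica].get(key)

    def replicate(self):
        """Apply queued updates in order; stands in for background replication."""
        for i, key, value in self.pending:
            self.replicas[i][key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("balance", 100)
store.replicate()
store.write("balance", 50)

stale = store.read("balance", replica=2)   # old value: update has not arrived yet
store.replicate()
fresh = store.read("balance", replica=2)   # replicas have now converged
print(stale, fresh)
```

A database that read the stale value and acted on it would corrupt its own records, which is why that kind of workload wants the strongly consistent view that block storage provides.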

That’s an example of a storage system being built for supporting those types of workloads. You do different things to support them. For Ceph for example, it supports, has an object interface and it has a block interface that folks use. It does present those two different views, but it’s more optimized for that block interface than the object interface. They make trade-offs. I don’t want to paint anyone into a corner per se but you make trade-offs when you’re designing your architecture. Either you’re going to provide a consistent view of the storage or you’re going to provide an eventually consistent view of the storage. When you try to be all things to all folks, then there’s trade-offs that you run into.

Niki Acosta: Just say it. What do you really want to say Joe? I see you struggling there with what you want to say and what you’re saying.

Joe Arnold: Yeah, no. Look, what I do know is when folks start getting into production, that’s when we get called in. This is more from seeing stuff out in the field. When things go into production, when things get into multi-data center, you’ve got to architect the system so it can survive those types of situations. If you don’t, then you’re going to have a lot of down VMs and you’re not going to be able to support the traffic that you’re going to need.

Niki Acosta: You’re not giving me any names when you say when you go into these and someone’s [crosstalk 00:40:04].

Joe Arnold: No. Here’s another way to think about it. It’s like you get the best tool for the job. Here’s the thing. Take your hobby of choice. Let’s say you’re a rock climber and you’re the best rock climber. You’re going to be really discriminating about the different tools that you use. I’m not a rock climber so I have no idea, right? You’re going to know all the different components, like this type of belay equipment. You’re going to just be really honed in on what’s good and what’s bad. What we’re finding is that the people who are getting started with OpenStack, particularly these early adopters, these mountain climbers, if you will, in the data center, they’re very discerning over the systems. They know what works and what doesn’t work. They’re going to go for the best of breed across the board, right? They’re going to pick the best tool here, the best tool here, the best tool here, because that’s what they know works.

That’s where we get pulled in. We get pulled in when it’s object storage, it’s large footprint, it’s multi-data center. Not to mix the metaphors too much, but if you go to them and say, “Hey, here’s this combo printer fax photocopier machine,” they’re going to go, “Yeah. We’re trying to run a book publisher.” That’s not going to do the job at all. Even though it has each of those individual functions, what these folks need is the best of breed. By nature, when we’re adopting technology early on, the best of breed products are going to win, right? That’s how it works. Yeah sure, maybe a few years down the road technologies catch up and there’s some consolidation that happens. Fine, right? But at this stage, it’s all about best of breed.

Jeff Dickey: I’m trying to figure out that kind of tipping point for Swift versus SwiftStack, right? Where do you see most of these calls? Are you getting folks early on in POCs, or do you find you’re getting more calls from people who are struggling in production?

Joe Arnold: We’ll get calls in the POC phase certainly. I think storage customers are more comfortable in starting with the POC directly. That’s almost always where we’ve started. I would say occasionally there are people who are running in production and they go, all right, we learned and let’s bring in the big guns so to speak. I would say that would be the obvious second category.

Niki Acosta: Or they get their AWS bill and they go, “Oh crap.”

Joe Arnold: Yeah, that’s when we definitely get the AWS bill holy crap moment from folks.

Jeff Dickey: Do you have any folks doing both? Doing [crosstalk 00:43:21] Amazon and then …

Joe Arnold: Yeah, totally. Well, not on Amazon. People will be using public cloud to the extent that they’re going to, again, another word, federate, right, across both. We don’t really see it like that, but for there to be different tiers that exist in different places and to be able to support that, absolutely. We’ve done a lot of work to support the S3 API. It’s great because that gets applications in the door. Oftentimes, you run across applications that have been built with the S3 API in mind, but then, okay, now we get on premises with them, we’ve sucked in a bunch of data, now they go, “It’d be really nice to be able to use storage policies and have different tiers of storage for different folks.” “Ah, okay. Well, you want to do that? Here’s how to do that with the Swift API. Here’s how we’ll set that up.” Or they want to do more sophisticated things on the application side: put in middleware to process data as it’s going in and out of the system, or have more sophisticated ACL capabilities. There’s just 80 features I could probably list off that go through all the different things. There’s features that are above and beyond what S3 provides, which is what’s needed for an on-premises deployment. That’s why we’re strongly pushing this Swift API as well, but just getting object storage across the line is a big one for us.
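
Editor’s note: As a rough illustration of the storage policies Joe mentions, the Swift API lets a client choose a policy when it creates a container by sending an X-Storage-Policy header on the container PUT. The sketch below only builds that request with Python’s standard library and does not send it. The endpoint URL, the token, the container name, and the policy name "gold" are all hypothetical; real policy names are defined by whoever operates the cluster.

```python
from urllib.request import Request


def build_create_container_request(storage_url, token, container, policy):
    """Build (but do not send) a Swift container PUT that selects a storage policy.

    All argument values used below are placeholders for illustration.
    """
    return Request(
        url=f"{storage_url.rstrip('/')}/{container}",
        method="PUT",
        headers={
            "X-Auth-Token": token,         # auth token obtained separately
            "X-Storage-Policy": policy,    # policy is fixed at container creation
        },
    )


request = build_create_container_request(
    storage_url="https://swift.example.com/v1/AUTH_demo",  # hypothetical endpoint
    token="hypothetical-token",
    container="backups",
    policy="gold",                                          # hypothetical policy name
)
print(request.get_method(), request.full_url)
print(request.header_items())
```

Objects uploaded into that container would then be stored according to the chosen policy, which is how one cluster can offer different tiers to different users.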

Niki Acosta: You guys are doing a good job at that. When you look at the various projects and you see who the experts are at each project, of course you guys are the undisputed champions of Swift. That wasn’t a plug. He’s not paying me to say that, by the way. This is a question I like to ask most of our guests, but how do you balance what you contribute back versus what you keep back as your secret sauce? I know you guys just did a hackathon. I saw some of the features that you guys are working on for that. Where do you draw that line? Is it hard to draw that line?

Joe Arnold: No. It hasn’t been hard at all. Everything that’s a part of the storage run-time is in open source. Every single bit. There’s huge advantages to this. For example, the recently developed erasure coding which was by the way in progress for a long time because we had a lot of people involved with that. What it means is that you don’t get a feature that’s just singly championed by a single company. You get a community behind that set of features. For example with erasure coding, you have Box which has some developers on staff who really know encoding schemes very well. You have Intel who is putting development into Swift itself. Plus they’ve created libraries that are very efficient at dealing with erasure codes on their chips. Then we have our development that is going into actually writing and storing the data. That development is done as a team, as a group of people all moving that forward and it’s in open source.

I think that’s also different with Swift as a project. It’s pretty unique because you have a lot of developers that are contributing to it. At the hackathon, there were eight companies and five countries represented, and everyone was moving the ball forward together. What I think has been great about Swift is it’s been able to take on a few projects and have different people from multiple companies all work on that capability together and then get it released into open source. I don’t know. It’s just a really amazing thing to watch as a community.

Then you asked about the dividing line. What we put into our product is mostly around the enablement side. That’s maybe a fuzzy way of saying something more descriptive: dealing with networking, getting the load balancing right, dealing with authentication and deployment and upgrading and capacity management. All of that operational tooling is what we put in the commercial product. Then the storage engine, the storage runtime, is all open source. That’s how we divide.
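
Editor’s note: For readers unfamiliar with the erasure coding Joe mentions in this answer, the idea is to split an object into fragments plus some extra parity data so that a lost fragment can be rebuilt without keeping full copies. The sketch below uses a single XOR parity fragment, which is a deliberately simplified stand-in and not the scheme Swift uses; the function names and the choice of four data fragments are assumptions for illustration, and a single parity fragment can only recover one lost fragment.

```python
def split_with_parity(data: bytes, k: int):
    """Split data into k equal-size fragments plus one XOR parity fragment."""
    fragment_len = -(-len(data) // k)                # ceiling division
    padded = data.ljust(fragment_len * k, b"\0")     # zero-pad the tail
    fragments = [
        padded[i * fragment_len:(i + 1) * fragment_len] for i in range(k)
    ]
    parity = bytearray(fragment_len)
    for fragment in fragments:
        for position, byte in enumerate(fragment):
            parity[position] ^= byte
    return fragments, bytes(parity)


def rebuild_missing(fragments, parity, missing_index):
    """Rebuild one lost fragment by XOR-ing the parity with the survivors."""
    rebuilt = bytearray(parity)
    for index, fragment in enumerate(fragments):
        if index == missing_index:
            continue                                  # this one is lost
        for position, byte in enumerate(fragment):
            rebuilt[position] ^= byte
    return bytes(rebuilt)


fragments, parity = split_with_parity(b"object storage payload", k=4)
lost = fragments[2]
damaged = list(fragments)
damaged[2] = None                                     # simulate a failed drive
print(rebuild_missing(damaged, parity, missing_index=2) == lost)
```

Here five fragments are stored for four fragments’ worth of data, compared with the three full copies a triple-replication scheme would keep. Production erasure codes tolerate more than one loss at the cost of more parity fragments and more computation.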

Niki Acosta: You guys are the intelligent gateway to Swift.

Joe Arnold: No. Shoot. It’s still Swift. What we ship is one hundred percent open source Swift. When push comes to shove, it’s like, “Okay, folks we’re engaged with. If you don’t like us, we’ve got to earn our keep. You want to unhitch us, well, we better be doing a good job and providing the value that you need, because it is an open source project.” We just like to shift down that ability to get started. Even with the people who are doing small scale, we like to be in there. Anyone can go to our website and sign up for a testing and development account and download the software and get it running and build an application around it, put some storage in it, aim your backup target at it. We love the small environments, and then all the tooling that we have helps people scale up and grow over a period of time. I don’t know. It’s pretty clear where we draw the line.

Niki Acosta: It’s a pretty good business model. Storage doesn’t really ever decrease in size. It seems like everyone wants to hang onto everything forever now.

Joe Arnold: Yeah, but it’s changing so much. If you look at what’s happening in the data center, everything is going open source. How is a traditional storage model going to survive? I think that kind of acted like the specialization in the data center. Sure, I think they’re very good at block devices to run databases, run virtual machines. That I see as pretty solid over the next few years. When it comes to something like a filer, whoa, you’re in trouble, because you have stuff like object storage emerging, running on this standard hardware.

The cost model for us is so much different, and we pair up with different vendors who are providing the hardware. We pair up with different OpenStack companies, emerging and established companies, and we’re able to do deployments, and the cost of these environments, price per terabyte, operational costs, the amount of professional services you need, just goes way down. What you’ve done is you’ve simplified the architecture, you’ve simplified the operating model, you’ve dramatically reduced the cost of the underlying hardware. That’s a real business model challenge, I think, for the people providing this kind of tier, this class of storage. It’s going to be hard.

Niki Acosta: Jeff, are you as excited as I am?

Jeff Dickey: Oh, yeah. Absolutely.

Niki Acosta: So contagious. I’m just like, “Yes!” We have five minutes left. What do you want to say?

Joe Arnold: Can I plug the book? My book?

Niki Acosta: Yes. Buy the book.

Joe Arnold: I’m going to show a picture of it. It’s called OpenStack Swift. Published by O’Reilly, and it has my name on it, but I totally cheated and the whole company wrote it. We’ve had such a let, learning, lead attitude in the company. We have workshops that teach people how to run Swift. We have operator training on how to get Swift up and running. Not just with SwiftStack but even just pure open source bits. That’s been part of the culture. It’s been part of the team. I’d encourage anyone who’s really interested in Swift, come check out our website. Go to swiftstack.com/book. If you have an interesting enough project, send us your contact information and we’ll send you a copy of the book; otherwise you can check it out on O’Reilly’s site or Amazon.

Jeff Dickey: That’s great.

Niki Acosta: You got Swinterest. You got Swinterest too.

Joe Arnold: The Swinterest. Oh yeah. That’s on our Github account. Yeah.

Jeff Dickey: I have your book, the “Software Defined Storage with OpenStack Swift.”

Joe Arnold: Ooh, that’s an oldie.

Jeff Dickey: I’ve got the physical one.

Joe Arnold: Nice.

Jeff Dickey: What’s new? What’s different on the …

Joe Arnold: What’s new is mostly on the building application side. By the way, for those who haven’t done an O’Reilly book, they are awesome to work with. I highly recommend it. If you’re trying to get your name out there, it’s great because you can just pull a team together. You can work with O’Reilly and they can work through it. We did a self-published book, which is what, Jeff, you’re referring to, and we talked about deploying, setting up. Much of that is the same, and then what we added, not only did we get all the editorial review, but we also added a significant amount related to building applications. If you’re a developer, or if you want to hand this to someone on the development team on how to use Swift, how to write middleware within Swift, how to use the API, there’s a lot more there in the book.

Niki Acosta: It’s killer.

Jeff Dickey: That’s awesome. Yeah. Well, I’ve got the Amazon page up right now so I’ll grab that. Everyone, it’s “OpenStack Swift: Using, Administering … ”

Niki Acosta: All right, show us the book again. Show us the book.

Joe Arnold: Here it is.

Niki Acosta: There we go. OpenStack Swift.

Jeff Dickey: Yep, grab the book.

Niki Acosta: O’Reilly. What kind of animal is that by the way? Is it some random…

Joe Arnold: It is a Swift.

Niki Acosta: It’s a Swift. Oh, perfect.

Joe Arnold: You don’t get to pick. You don’t get to pick which animal you get. You have to petition and hope and beg, but they gave us the swift.

Niki Acosta: That landed in a good spot for you guys. Two questions left. We’ve got another minute. Number one, what are you most excited about in the future and number two, who do you want to see on the show?

Joe Arnold: I’m super excited about next generation drives. This is me putting my storage geek hat on. I love what Seagate is doing with Kinetic drives. It’s amazing. It transforms how data centers are going to be architected for storage. I feel like somebody came down upon the drive manufacturers and said: “We have built this drive just for SwiftStack or for OpenStack Swift,” and it’s really going to transform things, and not in the too-distant future either. It’s coming fast and it’s going to be very cool.

Two other folks. Let’s see. Matt Haines from Time Warner Cable. I didn’t ask him, so maybe he won’t appreciate me saying this, but he’s an awesome guy to hear from. Their vision for the future, how they’re building out IT as a service for the rest of their organization. I just think he has a great story to tell about what he’s doing on the infrastructure side there. I’d recommend him. Another one, and as a startup guy, I always love hearing about other folks’ stories as well. There’s a newly launched company called Platform9 which has a very similar model to what we’re doing, although we do more fully on-premises work. What Platform9 has done is they’ve built a management platform that’s in the cloud to manage OpenStack environments, and one of the founders is Sirish Raghuram, and he also has some interesting stories to tell you. He came out of VMware. I’ll send you the rest.

Niki Acosta: Matt and Sirish, you guys have an open invite to join the show. We’d love to have you. Especially, we love user stories. If you’ve got a good user story, we would absolutely love to see you here.

Joe Arnold: Matt is a superuser.

Niki Acosta: You have any final words? I don’t know about you, but this has been super fun. Joe, you were such a pleasure, and you’ve got this amazing view behind you that makes me super jealous.

Jeff Dickey: It’s a great view.

Niki Acosta: I’ve got like a doggie door like stuff back there. Oh!

Joe Arnold: My daughters fight over the OpenStack cards, by the way. They love Niki Acosta, and then when I was driving into work the other day, my littlest daughter’s like, “Is Lauren Sell going to be there?” I’m like, “No. She doesn’t work at SwiftStack.” But they have eaten up all of the OpenStack cards you made.

Niki Acosta: Yay! That’s awesome.

Jeff Dickey: That’s great. Thank you so much for being on. Can I swing by the office tomorrow?

Joe Arnold: Absolutely.

Jeff Dickey: I want to check it out.

Joe Arnold: Come on by.

Jeff Dickey: All right.

Jeff Dickey: Thank you so much. Thank you Joe for being on the show.

Joe Arnold: Thanks Jeff and thanks Niki.

Jeff Dickey: [Crosstalk 00:57:12] a lot of fun and learned a lot.

Niki Acosta: You’re hiring? Real quick, we didn’t talk about that. You’re hiring.

Joe Arnold: We’re hiring. Swiftstack.com/jobs. Both people to help out. Technical folks to help out with customers and people doing development work are the two big areas for us right now.

Niki Acosta: Yay! Go to their website. Check them out. Thank you. Thank you Joe for joining us today. We will get in touch with Matt and Sirish. As always, thanks to our loyal viewers who give us a reason to keep on doing this every week. We really appreciate you guys.

Jeff Dickey: Yeah. [Crosstalk 00:57:47].

Niki Acosta: Who do we have next week Jeff?

Jeff Dickey: That’s a good question. Do I have a next week setup? Give me one second here. We’ve got Mike Metral.

Niki Acosta: Mike Metral, old co-worker from Rackspace. Good stuff.

Jeff Dickey: He’s got some interesting stuff going on that we’re going to talk about. We’ve got a bunch of guests lined up after. You can go to openstackpod.com and check out the guests we have coming up, and again, thank you so much, Joe, for coming on the show. We’ll see you in Vancouver, right?

Joe Arnold: Absolutely.

Jeff Dickey: Okay, good.

Niki Acosta: Thanks Joe. Bye everybody.

Jeff Dickey: Yeah, thanks Joe.

Joe Arnold: Thank you. All right. Bye Jeff.
