Cloud Engineering – Software Engineering Daily

About this podcast
Technical interviews about software topics.

Podcasts like “Cloud Engineering – Software Engineering Daily”

Hackers – Software Engineering Daily
Security – Software Engineering Daily
JavaScript – Software Engineering Daily
Latest episodes
yesterday
A backend application can have hundreds of services written in different programming frameworks and languages. Across these different languages, log messages are produced in different formats. Some logging is produced in XML, some is produced in JSON, some is in other formats. These logs need to be unified into a common format, and centralized for any developer who wants to debug. The popularity of Kubernetes is making it easier for companies to build this kind of distributed application, where services written in different languages are communicating over a network, with a variety of log message types. Fluentd is a tool for solving this problem of log collection and unification. In today’s episode, Eduardo Silva joins the show to describe how Fluentd is deployed to Kubernetes, and what the role of Fluentd is within a Kubernetes logging pipeline. We also discuss the company where Eduardo works–Treasure Data. The story of Treasure Data is unusual. The team started out doing log management, but has found itself moving up the stack, into marketing analytics, sales analytics, and customer data management. This story might be useful for any open source developer thinking about how to evolve a project into a business. Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Azure Container Service simplifies the deployment, management and operations of Kubernetes. Eliminate the complicated planning and deployment of fully orchestrated containerized applications with Kubernetes. You can quickly provision clusters to be up and running in no time, while simplifying your monitoring and cluster management through auto upgrades and a built-in operations console. Avoid being locked into any one vendor or resource. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/acs. Your company needs to build a new app, but you don’t have the spare engineering resources. There are some technical people in your company who have time to build apps–but they are not engineers. OutSystems is a platform for building low-code apps. As an enterprise grows, it needs more and more apps to support different types of customers and internal employee use cases. OutSystems has everything that you need to build, release, and update your apps without needing an expert engineer. And if you are an engineer, you will be massively productive with OutSystems. Find out how to get started with low-code apps today–at OutSystems.com/sedaily. There are videos showing how to use the OutSystems development platform, and testimonials from enterprises like FICO, Mercedes Benz, and SafeWay. OutSystems enables you to quickly build web and mobile applications–whether you are an engineer or not.
Check out how to build low-code apps by going to OutSystems.com/sedaily. The octopus: a sea creature known for its intelligence and flexibility. Octopus Deploy: a friendly deployment automation tool for deploying applications like .NET apps, Java apps and more. Ask any developer and they’ll tell you it’s never fun pushing code at 5pm on a Friday then crossing your fingers hoping for the best. That’s where Octopus Deploy comes into the picture. Octopus Deploy is a friendly deployment automation tool, taking over where your build/CI server ends. Use Octopus to promote releases on-prem or to the cloud. Octopus integrates with your existing build pipeline–TFS and VSTS, Bamboo, TeamCity, and Jenkins. It integrates with AWS, Azure, and on-prem environments. Reliably and repeatedly deploy your .NET and Java apps and more. If you can package it, Octopus can deploy it! It’s quick and easy to install. Go to Octopus.com to trial Octopus free for 45 days. That’s Octopus.com
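To make the unification problem concrete, here is a minimal Python sketch (not Fluentd itself, and with made-up log formats and service names): one service emits JSON, another emits plain text, and both get normalized into a single record shape that could be shipped to a central store.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical inputs: one service logs JSON, another logs plain text.
json_line = '{"ts": "2018-01-14T12:00:00Z", "level": "error", "msg": "payment failed"}'
text_line = "2018-01-14 12:00:01 WARN checkout: cart service slow"

TEXT_PATTERN = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>\w+) (?P<service>\w+): (?P<msg>.*)"
)

def normalize_json(line, service):
    """Parse a JSON log line into the common record shape."""
    record = json.loads(line)
    return {
        "time": record["ts"],
        "level": record["level"].upper(),
        "service": service,
        "message": record["msg"],
    }

def normalize_text(line):
    """Parse a plain-text log line into the same common record shape."""
    match = TEXT_PATTERN.match(line)
    if match is None:
        # Unparseable lines are kept, just tagged as raw.
        return {"time": datetime.now(timezone.utc).isoformat(),
                "level": "UNKNOWN", "service": "unknown", "message": line}
    return {
        "time": match.group("ts").replace(" ", "T") + "Z",
        "level": match.group("level").upper(),
        "service": match.group("service"),
        "message": match.group("msg"),
    }

# Every record now has the same shape, ready to ship to a central log store.
unified = [normalize_json(json_line, service="payments"), normalize_text(text_line)]
for record in unified:
    print(json.dumps(record))
```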
Jan. 13, 2018
Kubernetes has become the standard way of deploying new distributed applications. Most new internet businesses started in the foreseeable future will leverage Kubernetes (whether they realize it or not). Many old applications are migrating to Kubernetes too. Before Kubernetes, there was no standardization around a specific distributed systems platform. Just like Linux became the standard server-side operating system for a single node, Kubernetes has become the standard way to orchestrate all of the nodes in your application. With Kubernetes, distributed systems tools can have network effects. Every time someone builds a new tool for Kubernetes, it makes all the other tools better. And it further cements Kubernetes as the standard. Google, Microsoft, Amazon, and IBM each have a Kubernetes-as-a-service offering, making it easier to shift infrastructure between the major cloud providers. We are likely to see Digital Ocean, Heroku, and longer tail cloud providers offer a managed, hosted Kubernetes eventually. In this editorial, I explore the following questions: Why is this happening? What are the implications for developers? How are cloud providers affected? What are the new business models that are possible in a Kubernetes-standardized world?

Software Standards

Standardized software platforms are both good and bad. Standards allow developers to have expectations around how their software will run. If a developer builds something for a standardized platform, they can make smart estimations about the total addressable market for that piece of software. If you write a program in JavaScript, you know that it will run in everyone’s browser. If you create a game for iOS, you know that everyone with an iPhone will be able to download it. If you build a tool for profiling garbage collection in .NET, you know that there is a large community of Windows developers with memory issues who can buy your software. Standardized proprietary platforms can lead to massive profit returns for the platform provider. In 1995, Windows was such a good platform that Microsoft could sell a CD in a cardboard box for $100. In 2018, the iPhone is so good that Apple can take 30% from all app sales on its platform. Proprietary standards lead to fragmentation. Your iPhone app does not run on my Kindle Fire. I can’t use your Snapchat augmented reality sticker on Facebook Messenger. My favorite digital audio workstation only runs on Windows–so I have to keep a Windows computer around just to make music. When developers see this fragmentation, they groan. They imagine the greedy capitalists, who are making money at the expense of software quality. Developers think, “why can’t we all just get along? Why can’t we have things be open and free?” Developers think, “we don’t need proprietary standards. We can have open standards.”

(Chart: Growth of Apache, part of the LAMP (Linux, Apache, MySQL, PHP) Stack)

This happened with Linux. These days, new server side applications mostly run on Linux. There was a time when that was controversial (see the far left hand side of the chart above). More recently, we saw a newer open standard with Docker. Docker gave us an open, standardized way of packaging, deploying, and distributing a single node. This was hugely valuable! But for all of the big problems that Docker solved, it highlighted a new problem–how should we be orchestrating these nodes together? After all–your application is not just a single node.
You know you want to be deploying a Docker container–but how are your containers communicating with each other? How are you scaling up instances of these containers? How are you routing traffic between container instances?

Container Orchestration

After Docker became popular, a scramble of open source projects and proprietary platforms emerged to solve the problem of container orchestration. Mesos, Docker Swarm, and Kubernetes each offered a different set of abstractions for managing containers. Amazon ECS offered a proprietary managed service that took care of installation and scaling of Docker containers for AWS users. Some developers did not adopt any orchestration platform, and used BASH, Puppet, Chef, and other tools to script their deployments. Whether a developer was deploying their container by using an orchestration platform or a script, Docker sped up development, and made things more harmonious between developers and operations. As more developers deployed containers with Docker, the importance of the orchestration platform was becoming clear. A container is as fundamental to a distributed system as an object is to an object oriented program. Everyone wanted to be using a container platform, but there was a struggle between these platforms, and it was hard to see which would come out on top–or if it would be a decades long struggle, like iOS vs. Android. This struggle between the different container orchestration platforms was causing fragmentation–not because any of the popular orchestration frameworks were proprietary (Swarm, Kubernetes, and Mesos are all open source), but because each container orchestration community had invested so much in their own system. So, from 2013 – 2016, there was some anxiety among Docker power users. Choosing a container orchestration framework is a huge bet–and if you chose the wrong orchestration system, it would be like opening a movie store and choosing HD DVD over Blu-ray.

(Image: these pictures of container ships falling over never get old. Credit: Huffington Post)

The war between container orchestrators felt like a winner-take-all affair. And as in any war, there was a fog that was hard to see through. When I was reporting on container orchestration wars, I recorded podcast after podcast with container orchestration experts, where I would ask some form of the question, “so, which container orchestration system is going to win?” I did this until someone I respect told me that asking about who was going to “win the orchestration wars” was a less interesting question than evaluating the technical tradeoffs between the orchestrators. Looking back, I regret buying into the narrative of a war between container orchestrators. As the debates about container orchestrators raged on, the smartest people in the room were mostly having calm, scientific discussions–even when reporters like me were getting worked up, and thinking that this was a story about tribalism. The conflicts between container orchestrators were not about tribalism–they were more about differences of opinion, and developer ergonomics. OK, maybe the container orchestration war wasn’t only about differences of opinion. There is a boatload of money to be made in this space. We are talking about contracts with billion dollar legacy organizations–banks, and telcos, and insurance companies–who are making their way onto the cloud. If you are in the business of helping telcos move onto the winning platform, business is good.
If you champion the wrong platform, you end up with a warehouse full of HD-DVDs. The time when the conflict was worst was near the end of 2016, when there were rumors about Docker potentially forking, so that Docker the company could change the Docker standard to fit better with their container orchestration system Docker Swarm–but even in those times, it would have made sense to be an optimist. Creative destruction is painful, but it is a sign of progress–and in the struggle for container orchestration dominance, there was a ton of creative destruction. And when the dust cleared around the end of 2016, Kubernetes was the clear winner. Today, with Kubernetes becoming the safe choice, CIOs feel more comfortable adopting container orchestration at their enterprises–which makes vendors feel more comfortable investing in Kubernetes-specific tools to sell to those CIOs. This brings us to the present: The open source developers are rowing in the same direction, excited about what to build. Major enterprises (both new and legacy) are buying into Kubernetes. The major cloud providers are ready with low-cost Kubernetes-as-a-service. The numerous vendors of monitoring, logging, security, and compliance software are thrilled because the underlying software stack that they have to integrate with is becoming more predictable.

Going Multicloud

Today, the most lucrative provider of proprietary backend developer infrastructure is Amazon Web Services. Developers do not view AWS resentfully, because AWS is innovative, empowering, and cheap. If you are paying AWS a lot of money, that probably means your business is doing well. With AWS, developers do not feel the level of lock-in that was administered by the dominant proprietary platforms of the 90s. But there is some lock-in. Once you are deeply embedded in the AWS ecosystem, using services like DynamoDB, Amazon Elastic Container Service, or Amazon Kinesis, it becomes a daunting task to move away from Amazon. With Kubernetes creating an open, common layer for infrastructure, it becomes theoretically possible to “lift and shift” your Kubernetes cluster from one cloud provider to another. If you decided to lift-and-shift your application, you would have to rewrite parts of your application to stop using the Amazon-specific services (like Amazon S3). For example, if you wanted an S3 replacement that would run on any cloud, you could configure a Kubernetes cluster with Rook, and start to store objects on Rook with the same APIs that you would use to store them on S3. This is a nice option to have, but I have not yet heard of anyone actually lifting and shifting their application away from a cloud–except for Dropbox, and their migration was so epic it took two and a half years. Certainly there is someone out there other than Dropbox who spends so much money on Amazon S3 that they will consider spinning up their own object store, but it would be a huge effort to do such a migration.

(Kubernetes can be used for lifting and shifting–but more likely will be used to have a familiar operating layer in different clouds)

Kubernetes probably won’t be a tool for widespread lifting and shifting any time soon. A more likely scenario is that Kubernetes will become a ubiquitous control plane that enterprises will use on multiple clouds. NodeJS is a useful analogy. Why do people like NodeJS for their server side applications? It’s not necessarily because Node is the fastest web server. It’s because people like to use the same language on both the client and the server.
Just like NodeJS lets you move between your client and server code without having to switch languages, Kubernetes will let you switch between clouds without having to change how you think about operations. On each cloud, you will have some custom application code running on Kubernetes that interacts with the managed services provided by that cloud. Companies want to be multi-cloud–partly for disaster recovery, but also because there is actual upside to having access to managed services on the different clouds. One emerging pattern is to split infrastructure between AWS for user traffic and Google Cloud for data engineering. One company that uses this pattern is Thumbtack: At Thumbtack, the production infrastructure on AWS serves user requests. The log of transactions that occur gets pushed from AWS to Google Cloud, where the data engineering occurs. On Google Cloud, the transaction records are queued in Cloud PubSub, a message queueing service. Those transactions are pulled off the queue and stored in BigQuery, a system for storage and querying of high volumes of data. BigQuery is used as the data lake to pull from when orchestrating machine learning jobs. These machine learning jobs are run in Cloud Dataproc, a managed service that runs Apache Spark. After training a model in Google Cloud, that model is deployed on the AWS side, where it serves user traffic. On the Google Cloud side, the orchestration of these different managed services is done by Apache Airflow, an open source tool that is one of the few pieces of infrastructure that Thumbtack does have to manage themselves on Google Cloud. Today, Thumbtack uses AWS for serving user requests and Google Cloud for data engineering and queueing in PubSub. Thumbtack trains its machine learning models in Google, and deploys them to AWS. This is just the way things are today. Thumbtack might eventually use Google Cloud for user-facing services as well.

(A multi-cloud data engineering pipeline from a Japanese company)

More companies will gradually move towards multiple clouds–and some of them will manage a unique Kubernetes cluster on each cloud. You might have a GKE Kubernetes cluster on Google to orchestrate workloads between BigQuery, Cloud PubSub, and Google Cloud ML, and you might have an Amazon EKS cluster to orchestrate workloads between DynamoDB, Amazon Aurora, and your production NodeJS application. Cloud providers are not replaceable commodities. The services provided by the different clouds will get increasingly exotic and differentiated. Enterprises will benefit from having access to the different services on the different cloud providers.

Distributed Systems Distribution

Services like Google BigQuery and AWS Redshift are popular because they give you a powerful, scalable, multi-node tool with a simple API. Developers often choose these managed services because they are so easy to use. You don’t see these types of paid tools for single nodes. I don’t pay for NodeJS, or React, or Ruby on Rails. Tools for a single node are much easier to operate than tools for a distributed system. It’s harder to deploy Hadoop across servers than to run a Ruby on Rails application on my laptop. With Kubernetes, this is going to change. If you are using Kubernetes, you can use a distributed systems package manager called Helm. This is like npm for Kubernetes applications. If you are using Kubernetes, you can use Helm to easily install a complicated multi-node application, no matter what cloud provider you are on.
Here’s a description of Helm: Helm helps you manage Kubernetes applications — Helm Charts helps you define, install, and upgrade even the most complex Kubernetes application. Charts are easy to create, version, share, and publish — so start using Helm and stop the copy-and-paste madness. A package manager for distributed systems. Amazing! Let’s take a look at what I can install.

(Screenshot of installable Helm charts. Not pictured: WordPress, Jenkins, Kafka)

Distributed systems are hard to set up. Ask anyone who has set up their own Kafka cluster. Helm is going to make installing Kafka as easy as installing a new version of Photoshop on your MacBook. And you will be able to do this on any cloud. Before Helm, the closest thing to a distributed systems package manager (that I am aware of) was the marketplace that you find on AWS or Azure, or the Google Cloud Launcher. Here again we see how proprietary software can lead to fragmentation. Before Helm, there was no standard, platform-agnostic way of one-click installing Kafka. You could find a way to one-click install Kafka on AWS, Google, or Azure. But each of these installations had to be written independently, for that specific cloud provider. And to install Kafka on Digital Ocean, you need to follow a 10-step tutorial. Helm represents a cross-platform system for distributing multi-node software on any Kubernetes instance. You could use an application installed with Helm in any cloud provider or on your own hardware. You could easily install Apache Spark or Cassandra–systems that are notoriously difficult to set up and operate. Helm is a package manager for Kubernetes–but it also looks like the beginnings of an app store for Kubernetes. With an app store, you could sell software for Kubernetes. What kind of software could you sell? You could sell enterprise distributions of distributed systems platforms like Cloudera Hadoop, Databricks Spark, and Confluent Kafka. New monitoring systems like Prometheus and multi-node databases like Cassandra that are hard to install. Maybe you could even sell higher-level, consumer-grade software like Zendesk. The idea of a self-hosted Zendesk sounds crazy, but someone could build that, and sell it in the form of a proprietary binary, for a flat fee instead of a subscription. If I sell you a $99 Zendesk-for-Kubernetes, and you can easily run it on your Kubernetes cluster on AWS, you are going to end up saving a lot of money on support ticketing software. Enterprises often run their own WordPress to manage a company blog. Is the software for Zendesk that much more complicated than WordPress? I don’t think so–but enterprises are much more scared of managing their own help desk software than managing their own blogging software. I run a pretty small business but I subscribe to a LOT of different software-as-a-service tools. An expensive WordPress host, an expensive CRM for advertising sales, MailChimp for the newsletter. I pay for these services because they are super reliable and secure, and they are complex multi-node applications. I wouldn’t want to host my own. I wouldn’t want to support them. I don’t want to be responsible for technical troubleshooting when my newsletter fails to send. I want to run less software. In contrast, I’m not afraid that my single-node software is going to break. Software that I use from a single node tends to be much less expensive because I don’t have to buy it as a “service”. The software that I use to write music has a 1-time fixed cost. Photoshop has a 1-time fixed cost.
I pay for the electricity to run my computer, but other than that I have no ongoing capital expense to run Photoshop. When multi-node applications are as reliable as single-node applications, we will see changes in the pricing models. Maybe someday I will be able to purchase a Zendesk-for-Kubernetes. The Zendesk-for-Kubernetes will give me everything I need from a help desk–it will spin up an email server, it will give me a web frontend to manage tickets. And if anything goes wrong, I can pay for support when I need it.

(Zendesk is a fantastic service–but it would be cool if it had a fixed pricing model)

Metaparticle

With Kubernetes, it becomes easier to deploy and manage distributed applications. With Helm, it becomes easier to distribute those applications to other users. But it’s still pretty hard to develop distributed systems. This was the focus of a recent CloudNativeCon/KubeCon keynote by Brendan Burns, called “This Job is Too Hard: Building New Tools, Patterns, and Paradigms to Democratize Distributed Systems Development”. In his presentation, Brendan presented a project called Metaparticle. Metaparticle is a standard library for cloud-native development, with the goal of democratizing distributed systems. In an introduction to Metaparticle, Brendan wrote: Cloud native development is bespoke, complicated and limited to a small number of expert developers. Metaparticle changes this by introducing a set of utilities in familiar programming languages that meet the developer where they are and enables them to begin to develop cloud-native systems using familiar language features. Metaparticle builds on top of Kubernetes primitives to make distributed synchronization easier as well. It supplies language independent modules for locking and leader election as easy-to-use abstractions in familiar programming languages. After decades of distributed systems research and application, patterns have emerged about how we build these systems. We need a way to do locking of a variable, so that two nodes will not be able to write to that variable in a nondeterministic fashion. We need a way to do master election, so that if the master node dies, the other nodes can pick a new node to orchestrate the system. Today, we use tools like etcd and Zookeeper to help us with master election and locking–and these tools have an onboarding cost. Brendan illustrates this with the example of Zookeeper, which is used by both Hadoop and Kafka to do master election. Zookeeper takes significant time and effort to learn how to operate. During the construction of both Hadoop and Kafka, the founding engineers of those projects engineered their systems to work with Zookeeper to maintain a master node. If I am writing a system to do distributed mapreduce, I would like to avoid thinking about node failures and race conditions. Brendan’s idea is to push those problems down into a standard library–so the next developer who comes along with a new idea for a multi-node application has an easier time. Important meta-point: Metaparticle is built with the assumption that you are on Kubernetes. This is a language-level abstraction that is built with an assumption about the underlying (distributed) operating system–which once again brings us back to standards. If everyone is on the same distributed operating system, we can make big assumptions about the downstream users of our projects. This is why my mind is blown by Kubernetes. It is a layer of standards across a world of heterogeneous systems.
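To make the leader election pattern concrete, here is a small conceptual Python sketch. The in-memory LeaseStore below is a stand-in for the compare-and-swap and TTL primitives that etcd, Zookeeper, or Kubernetes would provide in a real system; it is not Metaparticle’s actual API, just the shape of the pattern.

```python
import time

class LeaseStore:
    """Toy stand-in for the lease primitive that etcd, Zookeeper, or Kubernetes provides."""
    def __init__(self):
        self._holder = None
        self._expires_at = 0.0

    def try_acquire(self, node_id, ttl_seconds):
        now = time.monotonic()
        # The lease can be taken if nobody holds it, it expired, or we already hold it.
        if self._holder is None or now >= self._expires_at or self._holder == node_id:
            self._holder = node_id
            self._expires_at = now + ttl_seconds
            return True
        return False

    def holder(self):
        return self._holder if time.monotonic() < self._expires_at else None

def elect(store, node_id, ttl_seconds=5.0):
    """Each node periodically calls this; whoever holds the lease acts as the leader."""
    if store.try_acquire(node_id, ttl_seconds):
        return f"{node_id} is the leader"
    return f"{node_id} is a follower (leader is {store.holder()})"

store = LeaseStore()
print(elect(store, "node-a"))   # node-a acquires the lease and leads
print(elect(store, "node-b"))   # node-b sees an unexpired lease and follows
# If node-a crashes and stops renewing, the lease expires and node-b takes over.
```

The point of a standard library like Metaparticle is that an application developer calls something of this shape instead of operating the lease machinery themselves.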
Serverless Workloads

Functions as a service (often called “serverless” functions) are a powerful, cheap abstraction that developers can use alongside Kubernetes, on top of Kubernetes, and in some cases instead of Kubernetes altogether. Let’s quickly review the modern landscape of serverless applications, and then consider the relationship between serverless and Kubernetes. Quick review on functions as a service: Functions as a service are deployable functions that run without an addressable server. Functions as a service deploy, execute, and scale with a single call by the developer. Until you make that call to the serverless function, your function is not running anywhere–so you are not using up resources other than the database that is storing your raw code. When you call a function as a service, your cluster will take care of scheduling and running that function. You don’t have to worry about spinning up a new machine and monitoring that machine, and spinning the machine down once it becomes idle. You just tell the cluster that you want to run a function, and the cluster executes it and returns the result. When you “deploy” a serverless function, the function code is not actually deployed. Your code sits in a database in plain text. When you call the function, your code is being taken out of the database entry, loaded onto a Docker container, and executed.

(Diagram from https://medium.com/openwhisk/uncovering-the-magic-how-serverless-platforms-really-work-3cb127b05f71)

AWS Lambda pioneered this idea of a function-as-a-service in 2014. Since then, developers have been thinking of all kinds of use cases. For a comprehensive list of how developers are using serverless, there is a shared Google Doc created by the CNCF Serverless Working Group (34 pages at the time of this article). From my conversations on Software Engineering Daily, these functions as a service have shown at least two clear applications: compute that can scale up quickly and cheaply in response to bursty workloads (for example, Yubl’s social media scalability case study), and event-driven “glue code” with variable workload frequency (for example, an event sourcing model with a variety of database consumers). To create a FaaS platform, a cloud provider provisions a cluster of Docker containers called invokers. Those invokers wait to get blobs of code scheduled onto them. When you make a request for your code to execute, there is a period of time that you have to wait for that code to be loaded onto an invoker and executed. This time spent waiting is the “cold start” problem. The cold start problem is one of the tradeoffs that you make if you decide to run part of your application on FaaS. You don’t pay for the uptime of a server that isn’t doing any work–but when you want to call your function, you have to wait for your code to be scheduled onto an invoker. On AWS there are invokers designated for incoming AWS Lambda requests. On Microsoft Azure there are invokers designated for Azure Functions requests. On Google Cloud there are invokers reserved for Google Cloud Functions. For most developers, using the function-as-a-service platforms from AWS, Microsoft, Google, or IBM will be fine. The costs are low, and the cold start problem is not an issue for most applications. But some developers will want to get costs even lower. Or they might want to write their own scheduler that defines how code gets scheduled onto invoker containers. These developers can roll their own serverless platform.
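As a rough illustration of the invoker model, here is a toy Python sketch. The code store and in-process exec below stand in for real infrastructure (an actual FaaS platform isolates every invocation in a container), but it shows why the first call pays a cold start penalty and later calls do not.

```python
import time

# Stand-in for the database that stores raw function code as plain text.
code_store = {
    "resize-image": "def handler(event):\n    return {'resized': event['name']}\n",
}

# Warm invokers keep compiled functions around so repeat calls skip the cold start.
warm_invokers = {}

def invoke(function_name, event):
    handler = warm_invokers.get(function_name)
    if handler is None:
        # Cold start: fetch the source, "provision" an invoker, compile the code.
        source = code_store[function_name]
        time.sleep(0.5)  # simulated container startup delay
        namespace = {}
        exec(compile(source, function_name, "exec"), namespace)
        handler = namespace["handler"]
        warm_invokers[function_name] = handler
    return handler(event)

start = time.perf_counter()
print(invoke("resize-image", {"name": "cat.png"}))   # cold: pays the startup delay
print(f"cold call took {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
print(invoke("resize-image", {"name": "dog.png"}))   # warm: near-instant
print(f"warm call took {time.perf_counter() - start:.2f}s")
```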
Open source FaaS projects such as Kubeless let you provision your own serverless cluster on top of Kubernetes. You can define your own pool of invokers. You can determine how containers are scheduled against jobs. You can decide on how to solve the cold start problem for your own cluster. These open source FaaS projects for Kubernetes are just one type of resource scheduler. They are just a preview of other custom schedulers we will see on top of Kubernetes. Developers are always building new schedulers in order to build more efficient services on top of those schedulers. So–what other types of schedulers will be built on top of Kubernetes? Well, as they say, the future is already here, but it’s only available as an AWS managed service. AWS has a new service called Amazon Aurora Serverless–which is a database that scales storage and compute up and down automatically. From Jeff Barr’s post about Amazon Aurora Serverless: When you create an Aurora Database Instance, you choose the desired instance size and have the option to increase read throughput using read replicas. If your processing needs or your query rate changes you have the option to modify the instance size or to alter the number of read replicas as needed. This model works really well in an environment where the workload is predictable, with bounds on the request rate and processing requirement. In some cases the workloads can be intermittent and/or unpredictable, with bursts of requests that might span just a few minutes or hours per day or per week. Flash sales, infrequent or one-time events, online gaming, reporting workloads (hourly or daily), dev/test, and brand-new applications all fit the bill. Arranging to have just the right amount of capacity can be a lot of work; paying for it on a steady-state basis might not be sensible. Because storage and processing are separate, you can scale all the way down to zero and pay only for storage. I think this is really cool, and I expect it to lead to the creation of new kinds of instant-on, transient applications. Scaling happens in seconds, building upon a pool of “warm” resources that are raring to go and eager to serve your requests. We are not surprised that AWS can build something like this, but it’s hard to imagine how it could emerge as an open source project—until Kubernetes. This is the type of system that random developers could build on top of Kubernetes. If you wanted to build your own serverless database on top of Kubernetes, you’ve got a number of scheduling problems to solve. You need different resource tiers for networking, storage, logging, buffering, and caching. For each of those different resource tiers, you need to define how resources will get scheduled to scale up and down in response to demand. Just like Kubeless offers a scheduler for the small bits of functional code, we might see other custom schedulers that people use as building blocks for bigger applications. And once you actually build your serverless database, perhaps you could sell it on the Helm app store as a $99 one-time purchase.

Conclusion

I hope you have enjoyed this tour through some Kubernetes history and speculation about the future. As 2018 begins, here are some of the areas we’d like to explore this year: How do people manage deployment of machine learning models on Kubernetes? We did a show with Matt Zeiler where he discussed this, and it sounded complicated. Is Kubernetes used for self-driving cars? If so, do you deploy one single cluster to manage the entire car?
What does a Kubernetes IoT deployment look like? Does it make sense to run Kubernetes across a set of devices with intermittent network connections? What are the new infrastructure products and developer tools that will be built with Kubernetes? What are the new business opportunities? Kubernetes is a fantastic tool for building modern application backends–but it is still just a tool. If Kubernetes fulfills its mission, it will eventually fade into the background. There will come a time when we talk about Kubernetes like we talk about compilers or operating system kernels. Kubernetes will be lower level plumbing that is not in the purview of the average application developer. But until then, we’ll continue to report on it.
Jan. 13, 2018
Kubernetes has become the standard way of deploying new distributed applications. Most new internet businesses started in the foreseeable future will leverage Kubernetes (whether they realize it or not). Click here to read the full “Gravity of Kubernetes” editorial by Jeff Meyerson.
Jan. 12, 2018
Kubernetes has become the standard system for deploying and managing clusters of containers. But the vision of the project goes beyond managing containers. The long-term goal is to democratize the ability to build distributed systems. Brendan Burns is a co-founder of the Kubernetes project. He recently announced an open source project called Metaparticle, a standard library for cloud-native development: Metaparticle builds on top of Kubernetes primitives to make distributed synchronization easier… It supplies language independent modules for locking and leader election as easy-to-use abstractions in familiar programming languages. After decades of distributed systems research and application, patterns have emerged about how we build these systems. We need a way to lock a variable, so that two nodes will not be able to write to that variable in a nondeterministic fashion. We need a way to do master election, so that if the master node dies, the other nodes can pick a new node to orchestrate the system. We know that just about every distributed application needs locking and leader election–so how can we build these features directly into our programming tools, rather than bolting them on? With Kubernetes providing a standard operating system for distributed applications, we can start to build standard libraries that assume we have access to underlying Kubernetes primitives. Instead of calling out to external tools like Zookeeper and etcd, a standard library like Metaparticle will abstract them away. An example: if I am writing a system to do distributed mapreduce, I would like to avoid thinking about node failures and race conditions. Brendan’s idea is to push those problems down into a standard library–so the next developer who comes along with a new idea for a multi-node application has an easier time. Brendan Burns currently works as a distinguished engineer at Microsoft, and he joins the show to discuss why it is still hard to build distributed systems and what can be done to make it easier. This is the second time we have had Brendan on the show. The first time he came on, he discussed the history of Kubernetes, and some of the design decisions of the system. This episode was more about the future. Full disclosure: Microsoft (where Brendan is employed) is a sponsor of Software Engineering Daily. Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Azure Container Service simplifies the deployment, management and operations of Kubernetes. Eliminate the complicated planning and deployment of fully orchestrated containerized applications with Kubernetes. You can quickly provision clusters to be up and running in no time, while simplifying your monitoring and cluster management through auto upgrades and a built-in operations console. Avoid being locked into any one vendor or resource. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. 
Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/acs. An in-house team of engineers spent thousands of hours developing the Casper mattress. As a software engineer, you know what that kind of development and dedication is like. The result is an exceptional product, one that you’d use and recommend to your friends. You deserve an exceptional night’s rest so you can continue building great software. Casper combines supportive memory foams for a sleep surface that’s got just the right sink and just the right bounce. Plus, its breathable design sleeps cool to help you regulate your temperature through the night. And, buying a Casper mattress is completely risk free. Casper offers free delivery and free returns with a 100-night home trial. If you don’t love it, they’ll pick it up and give you a full refund. As a special offer to Software Engineering Daily listeners, get $50 toward select mattresses by visiting casper.com/sedaily and using code SEDAILY at checkout. Terms and conditions apply. Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.
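To see why the locking problem from this episode matters, here is a purely conceptual Python sketch of two nodes updating the same value. The versioned compare-and-swap cell below stands in for what etcd or Zookeeper (or a library like Metaparticle) would provide; it illustrates the pattern and is not any real client API.

```python
class VersionedValue:
    """Toy compare-and-swap cell, standing in for a key in etcd or Zookeeper."""
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def write(self, new_value, expected_version):
        # The write only succeeds if nobody else wrote since we read.
        if expected_version != self.version:
            return False
        self.value = new_value
        self.version += 1
        return True

def increment(cell, node_name):
    """What each node does: read, modify, and retry until its write is not stale."""
    while True:
        value, version = cell.read()
        if cell.write(value + 1, expected_version=version):
            print(f"{node_name} wrote {value + 1} (version {version + 1})")
            return

counter = VersionedValue(0)
increment(counter, "node-a")  # succeeds at version 1
increment(counter, "node-b")  # re-reads first, so it builds on node-a's write instead of clobbering it
```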
Jan. 11, 2018
You are requesting a car from a ridesharing service such as Lyft. Your request hits the Lyft servers and begins trying to get you a car. It takes your geolocation, and passes the geolocation to a service that finds cars that are nearby, and puts all those cars into a list. The list of nearby cars is sent to another service, which sorts the list of cars by how close they are to you, and how high their star rating is. Finally, your car is selected, and sent back to your phone in a response from the server. In a “microservices” environment, multiple services often work together to accomplish a user task. In the example I just gave, one service took a geolocation and turned it into a list, another service took a list and sorted it, and another service sent the actual response back to the user. This is a common pattern: Service A calls service B, which calls service C, and so on.  When one of those services fails along the way, how do you identify which one it was? When one of those services fails to deliver a response quickly, how do you know where your latency is coming from? The solution is distributed tracing. To implement distributed tracing, each user level request gets a request identifier associated with it. When service A calls service B, it also hands off the unique request ID, so that the overall request can be traced as it passes through the distributed system (and if that doesn’t make sense–don’t worry, we explain it again during the show). Ben Sigelman began working on distributed tracing when he was at Google and authored the “Dapper” paper. Dapper was implemented at Google to help debug some of the distributed systems problems faced by the engineers who work on Google infrastructure. A request that moves through several different services spends time processing on each of those services. A distributed tracing system measures the time spent in each of those services–that time spent is called a span. A single request that has to hit 20 different services will have 20 spans associated with it. Those spans get collected into a trace. A trace can be evaluated to look at the latencies of each of those services. If you are trying to improve the speed of a distributed systems infrastructure, distributed tracing can be very helpful for choosing where to focus your attention. The published Google papers of ten years ago often turn out to be the companies of today. Some examples include MapReduce (which formed the basis of Cloudera), Spanner (which formed the basis of CockroachDB), and Dremel (which formed the basis of Dremio). Today, a decade after he started thinking about distributed tracing, Ben Sigelman is the CEO of Lightstep, a company that provides distributed tracing and other monitoring technologies. Lightstep’s distributed tracing model still bears a resemblance to the same techniques described in the paper–so I was eager to learn the differences between open source versions of distributed tracing (such as OpenZipkin) and enterprise providers such as Lightstep. The key feature of Lightstep that we discussed: garbage collection. If you are using a distributed tracing system, you could be collecting a lot of traces. You could collect a trace for every single user request. Not all of these traces are useful–but some of them are very useful. Maybe you only want to keep track of traces that take an exceptionally long latency. Maybe you want to keep every trace in the last 5 days, and destroy them over time. 
So, the question of how to manage the storage footprint of those traces was as interesting as the discussion of distributed tracing itself. Beyond the distributed tracing features of his product, Ben has a vision for how his company can provide other observability tools over time. I spoke to Ben at Kubecon–and although this conversation does not talk about Kubernetes specifically, this topic is undoubtedly interesting to people who are building Kubernetes technologies. Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Today’s episode of Software Engineering Daily is sponsored by Datadog. With infrastructure monitoring, distributed tracing, and now logging, Datadog provides end-to-end visibility into the health and performance of modern applications. Datadog’s distributed tracing and APM generates detailed flame graphs from real requests, enabling you to visualize how requests propagate through your distributed infrastructure. See which services or calls are generating errors or contributing to overall latency, so you can troubleshoot faster or identify opportunities for performance optimization. Start monitoring your applications with a free trial and Datadog will send you a free T-shirt! softwareengineeringdaily.com/datadog. Azure Container Service simplifies the deployment, management and operations of Kubernetes. Eliminate the complicated planning and deployment of fully orchestrated containerized applications with Kubernetes. You can quickly provision clusters to be up and running in no time, while simplifying your monitoring and cluster management through auto upgrades and a built-in operations console. Avoid being locked into any one vendor or resource. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/acs. Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.
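Here is a minimal Python sketch of the bookkeeping described in this episode: one trace ID per user request, handed from service to service, with a span recorded for each hop. The service functions and ID scheme are invented for illustration and are not Dapper’s or Lightstep’s actual APIs.

```python
import time
import uuid
from contextlib import contextmanager

trace_records = []  # where finished spans are collected for this process

@contextmanager
def span(trace_id, operation, parent_id=None):
    """Record how long one operation within a trace took."""
    span_id = uuid.uuid4().hex[:8]
    start = time.perf_counter()
    try:
        yield span_id
    finally:
        trace_records.append({
            "trace_id": trace_id,
            "span_id": span_id,
            "parent_id": parent_id,
            "operation": operation,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })

def sort_cars(trace_id, parent_id, cars):
    with span(trace_id, "sort-cars", parent_id):
        return sorted(cars, key=lambda c: c["distance"])

def find_nearby_cars(trace_id, parent_id):
    with span(trace_id, "find-nearby-cars", parent_id) as span_id:
        cars = [{"id": "car-2", "distance": 1.2}, {"id": "car-1", "distance": 0.4}]
        # The trace ID travels with the request into the next service call.
        return sort_cars(trace_id, span_id, cars)

# One user request gets one trace ID that every downstream call shares.
trace_id = uuid.uuid4().hex
with span(trace_id, "handle-ride-request") as root_id:
    result = find_nearby_cars(trace_id, root_id)

for record in trace_records:
    print(record)
```

Collected together, the spans for one trace show where the request spent its time, which is exactly what you inspect when hunting latency.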
Jan. 10, 2018
Since Kubernetes came out, engineers have been deploying clusters to Amazon. In the early years of Kubernetes, deploying to AWS meant that you had to manage the availability of the cluster yourself. You needed to configure etcd and your master nodes in a way that avoided having a single point of failure. Deploying Kubernetes on AWS became simpler with an open-source tool called kops (short for Kubernetes Operations). Kops automates the provisioning and high-availability deployment of Kubernetes. In late 2017, AWS released a managed Kubernetes service called EKS. EKS allows developers to run Kubernetes without having to manage the availability and scaling of a cluster. The announcement of EKS was exciting, because it means that all of the major cloud providers are officially supporting Kubernetes. Arun Gupta is a principal open source technologist at AWS, and he joins the show to explain what is involved in deploying and managing a Kubernetes cluster. Arun describes how to operate a Kubernetes cluster, including logging, monitoring, storage, and updates. If you are convinced that you want to use Kubernetes, but you aren’t sure yet how you want to deploy it, this will be useful information for you. We also discussed how Amazon built EKS, and some of the architectural decisions they made. AWS has had a managed container service called ECS since 2014. The development of ECS was instructive for the AWS engineers who built EKS. Amazon wanted to make EKS able to integrate with both open source tools and the Amazon managed services. Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors The octopus: a sea creature known for its intelligence and flexibility. Octopus Deploy: a friendly deployment automation tool for deploying applications like .NET apps, Java apps and more. Ask any developer and they’ll tell you it’s never fun pushing code at 5pm on a Friday then crossing your fingers hoping for the best. That’s where Octopus Deploy comes into the picture. Octopus Deploy is a friendly deployment automation tool, taking over where your build/CI server ends. Use Octopus to promote releases on-prem or to the cloud. Octopus integrates with your existing build pipeline–TFS and VSTS, Bamboo, TeamCity, and Jenkins. It integrates with AWS, Azure, and on-prem environments. Reliably and repeatedly deploy your .NET and Java apps and more. If you can package it, Octopus can deploy it! It’s quick and easy to install. Go to Octopus.com to trial Octopus free for 45 days. That’s Octopus.com This episode of Software Engineering Daily is sponsored by Datadog. Datadog integrates with technologies like AWS, Docker, and Kubernetes, so you can track every component of your container infrastructure in one place. See across all your servers, containers, apps, and services with rich visualizations, sophisticated alerting, distributed tracing and APM. You can even drill down to get deep insights into your containers’ health, resource usage, and deployment with Datadog’s new Live Containers view. Start monitoring your container cluster with a free trial! As a bonus, Datadog will send you a free T-shirt.  softwareengineeringdaily.com/datadog Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. 
With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.
Jan. 9, 2018
A single user request hits Google’s servers. A user is looking for search results. In order to deliver those search results, that request will have to hit several different internal services on the way to getting a response. These different services work together to satisfy the user request. All of these services need to communicate efficiently, they need to scale, and they need to be secure. Services need to have a consistent way of being “observable”–allowing logging and monitoring. Services need to have proper security. Since every service wants these different features (like communication, load balancing, security), it makes sense to build these features into a common system that can be deployed to every server. Louis Ryan has spent his years at Google working on service infrastructure. During that time, he has seen massive changes in the way traffic flows through Google. First, the rise of Android, and all of the user traffic from mobile phones. And second, the rise of Google Cloud Platform, which meant that Google was now responsible for nodes deployed by users outside of Google. These two changes–mobile and cloud–led to an increase in the amount of traffic and the type of traffic. All of this traffic leads to more internal services communicating with each other. How does service networking change in such an environment? Google’s adaptation to the new networking conditions is to introduce a “service mesh”. A service mesh is a network for services. It provides observability, resiliency, traffic control, and other features to every service that plugs into it. Each service needs to plug into the service mesh. In Kubernetes, services connect to the mesh through a sidecar. Let me explain the term “sidecar.” Kubernetes manages its resources in pods, and each pod contains a set of containers. You might have a pod that is dedicated to responding to any user that is requesting a picture of a cat. Within that pod, you not only have the container that serves the cat picture–you also have other “sidecar” containers that help out an application container. You could have a sidecar that gets deployed next to your application container that handles logging, or a sidecar that helps out with monitoring, or network communications. If you are using the Istio service mesh, that means that you are using a sidecar called Envoy. Envoy is a sidecar called a “service proxy” that provides configuration updates, load balancing, proxying, and lots of other benefits. If we get all that out of Envoy, why do we need a separate abstraction of a “service mesh”? Because it helps to have a tool that aggregates and centralizes all the different communications among these proxies. Every service gets a sidecar for a service proxy. Every service proxy communicates with the centralized service mesh. Louis Ryan joins this episode to explain the motivations for building the Istio service mesh, and the problems it solves for Kubernetes developers. For the next two weeks, we are covering exclusively the world of Kubernetes. Kubernetes is a project that is likely to have as much impact as Linux–and it is very early days. Whether you are an expert in Kubernetes or you are just starting out, we have lots of episodes to fit your learning curve. To find all of our old episodes about Kubernetes (including a previous show about Istio), download the Software Engineering Daily app for iOS or for Android. 
In other podcast players, only the 100 most recent episodes are available, but in our apps you can find all 650 episodes–and there is also plenty of content that is totally unrelated to Kubernetes! Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Azure Container Service simplifies the deployment, management and operations of Kubernetes. Eliminate the complicated planning and deployment of fully orchestrated containerized applications with Kubernetes. You can quickly provision clusters to be up and running in no time, while simplifying your monitoring and cluster management through auto upgrades and a built-in operations console. Avoid being locked into any one vendor or resource. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/acs. This episode of Software Engineering Daily is sponsored by Datadog. Datadog integrates seamlessly with container technologies like Docker and Kubernetes, so you can monitor your entire container cluster in real time. See across all your servers, containers, apps, and services in one place, with powerful visualizations, sophisticated alerting, distributed tracing and APM. Start monitoring your microservices today with a free trial! As a bonus, Datadog will send you a free T-shirt. Visit softwareengineeringdaily.com/datadog to get started. Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available.
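Here is a conceptual Python sketch of what a service proxy, as discussed in this episode, adds around every call: retries and uniform telemetry, without the application container knowing about it. Envoy is a separate process configured by the mesh’s control plane, so this in-process wrapper (with made-up service names) only illustrates the idea.

```python
import time

class FlakyCatPictureService:
    """Hypothetical application container: fails once, then recovers."""
    def __init__(self):
        self.calls = 0

    def handle(self, request):
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("upstream reset")
        return {"picture": f"cat-{request['size']}.jpg"}

def proxy_call(service_name, request, handler, retries=2):
    """Wrap a service call the way a sidecar proxy would: time it, retry it, log it."""
    last_error = None
    for attempt in range(retries + 1):
        start = time.perf_counter()
        try:
            response = handler(request)
            latency_ms = (time.perf_counter() - start) * 1000
            # Uniform telemetry for every service, with no changes to the service's own code.
            print(f"[proxy] {service_name} ok attempt={attempt} latency={latency_ms:.1f}ms")
            return response
        except ConnectionError as exc:
            last_error = exc
            print(f"[proxy] {service_name} failed attempt={attempt}: {exc}")
    raise ConnectionError(f"{service_name} unavailable after {retries + 1} attempts") from last_error

service = FlakyCatPictureService()
print(proxy_call("cat-pictures", {"size": "large"}, service.handle))
```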
Jan. 8, 2018
Docker was released in 2013, and popularized the use of containers. A container is an abstraction for isolating a well-defined portion of an operating system. Developers quickly latched onto containers as a way to cut down on the cost of virtual machines–as well as isolate code and simplify deployments. Developers began deploying so many containers that they needed a centralized way to manage the containers. Then came the rise of the container orchestration framework. A container orchestration framework manages the containers running across a cluster of machines, scheduling them onto the underlying operating system resources. Twitter had been using the open source orchestration framework Apache Mesos to manage resources since 2010. So when developers started looking for a way to manage containers, many of them chose Mesos. Meanwhile, another container orchestration system came onto the scene: Docker Swarm. Swarm is a tool for managing containers that came from the same people who originally created Docker. Shortly thereafter, Kubernetes came out of Google. Google had been using containers internally with their cluster manager Borg. Kubernetes is a container orchestration system that was built with the lessons learned from managing resources at Google. The reason I recount this history (as I have in a few past episodes) is to underscore that there were a few years when there was a lot of debate about the best container orchestration system to use. Some people preferred Swarm, some preferred Mesos, and some preferred Kubernetes. Because of the lack of standardization, the community of developers who were building infrastructure in this space was put in a tough position: should you pick a specific orchestration framework and go all in, building tools for only that one framework? Should you try to build tools compatible with all three frameworks, and attempt to satisfy Kubernetes, Mesos, and Swarm users? Or should you sit out altogether and build nothing? The fracturing of the community led to healthy debates, but it slowed down innovation, because different developers were moving in different directions. Today, the container community has centralized: Kubernetes has become the most popular container orchestration framework. With the community centralizing on Kubernetes, developers are able to comfortably bet big on open source projects like Istio, Conduit, Rook, Fluentd, and Helm, each of which we will be covering in the next few weeks. The centralization on Kubernetes also makes it easier to build enterprise companies, which no longer have to think about which container orchestration framework to support. There is a wide array of Kubernetes-as-a-service providers offering a highly available runtime–and a variety of companies offering observability tools to make it easier to debug distributed systems problems. Despite all of these advances, Kubernetes is less usable than it should be. Operating a Kubernetes cluster still feels like operating a distributed system. Hopefully someday, operating a Kubernetes cluster will be as easy as operating your laptop computer. To get there, we need improvements in Kubernetes usability. Today’s guest Joe Beda was one of the original creators of the Kubernetes project. He is a founder of Heptio, a company that provides Kubernetes tools and services for enterprises. I caught up with Joe at KubeCon 2017, and he told me about where Kubernetes is today, where it is going, and what he is building at Heptio. Full disclosure–Heptio is a sponsor of Software Engineering Daily. For the next two weeks, we are covering exclusively the world of Kubernetes.
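To make the phrase “container orchestration” a little more concrete, here is a minimal, hypothetical sketch using the Kubernetes Python client (not anything from the interview): you declare a desired state–an image and a replica count–and the orchestrator’s job is to keep the cluster converged on that declaration, rescheduling containers when machines fail.

```python
# A hypothetical illustration of declarative container orchestration: declare
# the desired state (image + replica count) and let the orchestrator keep the
# cluster converged on it. The image name and labels are made up for the example.
from kubernetes import client, config

def desired_state(replicas: int = 3) -> client.V1Deployment:
    container = client.V1Container(name="web", image="example/web:1.0")
    pod_template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "web"}),
        spec=client.V1PodSpec(containers=[container]),
    )
    spec = client.V1DeploymentSpec(
        replicas=replicas,                                           # "run three copies"
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=pod_template,
    )
    return client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name="web"),
        spec=spec,
    )

if __name__ == "__main__":
    config.load_kube_config()   # reads the local kubeconfig
    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=desired_state())
```

The point is not the specific API, but the model: you state what should be running, and the orchestration framework continuously reconciles the cluster toward that state.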
Kubernetes is a project that is likely to have as much impact as Linux. Whether you are an expert in Kubernetes or you are just starting out, we have lots of episodes to fit your learning curve. To find all of our old episodes about Kubernetes, download the Software Engineering Daily app for iOS or for Android. In other podcast players, only the most recent 100 episodes are available, but in our apps you can find all 650 episodes–and there is also plenty of content that is totally unrelated to Kubernetes! Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Azure Container Service simplifies the deployment, management and operations of Kubernetes. Eliminate the complicated planning and deployment of fully orchestrated containerized applications with Kubernetes. You can quickly provision clusters to be up and running in no time, while simplifying your monitoring and cluster management through auto upgrades and a built-in operations console. Avoid being locked into any one vendor or resource. You can continue to work with the tools you already know, such as Helm, and move applications to any Kubernetes deployment. Integrate with your choice of container registry, including Azure Container Registry. Also, quickly and efficiently scale to maximize your resource utilization without having to take your applications offline. Isolate your application from infrastructure failures and transparently scale the underlying infrastructure to meet growing demands—all while increasing the security, reliability, and availability of critical business workloads with Azure. Check out the Azure Container Service at aka.ms/acs. This episode of Software Engineering Daily is sponsored by Datadog. Datadog integrates seamlessly with more than 200 technologies, including Kubernetes and Docker, so you can monitor your entire container cluster in one place. Datadog’s new Live Container view provides insights into your containers’ health, resource consumption, and deployment in real time. Filter to a specific Docker image, or drill down by Kubernetes service to get fine-grained visibility into your container infrastructure. Start monitoring your container workload today with a 14-day free trial, and Datadog will send you a free T-shirt! softwareengineeringdaily.com/datadog The octopus: a sea creature known for its intelligence and flexibility. Octopus Deploy: a friendly deployment automation tool for deploying applications like .NET apps, Java apps and more. Ask any developer and they’ll tell you it’s never fun pushing code at 5pm on a Friday and then crossing your fingers hoping for the best. That’s where Octopus Deploy comes into the picture. Octopus Deploy is a friendly deployment automation tool, taking over where your build/CI server ends. Use Octopus to promote releases on-prem or to the cloud. Octopus integrates with your existing build pipeline–TFS and VSTS, Bamboo, TeamCity, and Jenkins. It integrates with AWS, Azure, and on-prem environments. Reliably and repeatedly deploy your .NET and Java apps and more. If you can package it, Octopus can deploy it! It’s quick and easy to install. Go to Octopus.com to trial Octopus free for 45 days. That’s Octopus.com
Jan. 5, 2018
In the first 10 years of cloud computing, a set of technologies emerged that every software enterprise needs: continuous delivery, version control, logging, monitoring, routing, and data warehousing. These tools were built into the Cloud Foundry project, a platform for application deployment and management. As we enter the second decade of cloud computing, another new set of technologies is emerging as useful tools. Serverless functions allow for rapid scalability at a low cost. Kubernetes offers a control plane for containerized infrastructure. Reactive programming models and event sourcing make an application more responsive and simplify the interactions between teams who are sharing data sources. The job of a cloud provider is to see new patterns in software development and offer tools that help developers implement those patterns. Of course, building these tools is a huge investment. If you’re a cloud provider, your customers are trusting you with the health of their application. The tool that you build has to work properly, and you have to help customers figure out how to leverage the tool and resolve any breakages. Onsi Fakhouri is the senior VP of R&D for cloud at Pivotal, a company that provides software and support for Spring, Cloud Foundry, and several other tools. I sat down with Onsi to discuss his strategy for determining which products Pivotal chooses to build. There are a multitude of engineering and business elements that Onsi has to consider when allocating resources to a project. Cloud Foundry is used by giant corporations like banks, telcos, and automotive manufacturers. Spring is used by most enterprises that run Java, including most of the startups that I have worked at in the past. Cloud Foundry has to be able to run on-premises and on cloud providers like AWS, Google, and Microsoft. Pivotal also has its own cloud, Pivotal Web Services, and all of these stakeholders have different technologies that they would like to see built. Onsi’s job is to determine which ones have the highest net impact, make decisions about them, and allocate resources accordingly. I interviewed Onsi at SpringOne Platform, a conference organized by Pivotal, which, full disclosure, is a sponsor of Software Engineering Daily. This week’s episodes are all conversations from that conference, and if there’s a conference that you think I should attend and cover, let me know. Whether you like this format or not, I would love to get your feedback. We have some big developments coming for Software Engineering Daily in 2018 and we want to have a closer dialogue with the listeners. Please send me an email, [email protected] or join our Slack channel. Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries.
Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available. IBM Cloud gives you all the tools you need to build cloud native applications. Use IBM Cloud Container service to easily manage the deployment of your Docker containers. For serverless applications, use IBM Cloud Functions for low-cost, event-driven scalability. If you like to work with a fully managed platform as a service, IBM Cloud Foundry gives you a cloud operating system to control your distributed application. IBM Cloud is built on top of open source tools, and integrates with all the third party services that you need to build, deploy, and manage your application. To start building with IBM today, go to softwareengineeringdaily.com/IBM and sign up for a free Lite account. With the Lite account, you can start building apps for free, and try numerous cloud services with no time restrictions. Check it out at softwareengineeringdaily.com/IBM. Indeed Prime flips the typical model of job search and makes it easy to apply to multiple jobs and get multiple offers. Indeed Prime simplifies your job search and helps you land that ideal software engineering position. Candidates get immediate exposure to top companies with just one simple application to Indeed Prime. Companies on Prime’s exclusive platform message candidates with salary and equity upfront. Indeed Prime is 100% free for candidates – no strings attached. Sign up now at indeed.com/sedaily. You can also put money in your pocket by referring your friends and colleagues. Refer a software engineer to the platform and get $200 when they get contacted by a company, and $2,000 when they accept a job through Prime! Learn more at indeed.com/prime/referral.
Jan. 3, 2018
Cloud Foundry is an open-source platform as a service for deploying and managing web applications. Cloud Foundry is widely used by enterprises that are running applications built using Spring, a popular web framework for Java applications, but developers also use Cloud Foundry to manage apps built in Ruby, Node, and other programming languages. Cloud Foundry includes routing, message brokering, service discovery, authentication, and other application-level tooling for building and managing a distributed system. Some of the standard tooling in Cloud Foundry was adopted from Netflix open-source projects, such as Hystrix, a circuit breaker system, and Eureka, a service discovery server and client. When a developer deploys their application to Cloud Foundry, the details of what is going on are mostly abstracted away, which is by design. When you’re trying to ship code and iterate quickly for your organization, you don’t want to think about how your application image is being deployed to underlying infrastructure. You don’t want to think about whether you’re deploying a container or a VM. But if you use Cloud Foundry enough, you might have become curious about how Cloud Foundry schedules and runs application code. BOSH is a component of Cloud Foundry that sits between the infrastructure layer and the application layer. Cloud Foundry can be deployed to any cloud provider because of BOSH’s well-defined interface. BOSH has the abstraction of a “stemcell,” which is a versioned operating system image wrapped in packaging for whatever infrastructure-as-a-service is running underneath. With BOSH, whenever a VM gets deployed on your underlying infrastructure, that VM gets a BOSH agent. The agent communicates with the centralized component of BOSH called the director, which acts as the leader of the distributed system. (A toy sketch of this agent-and-director pattern appears at the end of this entry, after the sponsor notes.) Rupa Nandi is a director of engineering at Pivotal, where she works on Cloud Foundry. In this episode, we talked about scheduling and infrastructure, the relationship between Spring and Cloud Foundry, and the impact of Kubernetes, which Cloud Foundry has integrated with so that users can run Kubernetes workloads on Cloud Foundry. I interviewed Rupa at SpringOne Platform, a conference organized by Pivotal, which, full disclosure, is a sponsor of Software Engineering Daily, and this week’s episodes are all conversations from that conference. Whether you like this format or not, I would love to get your feedback. We have some big developments coming for Software Engineering Daily in 2018 and we want to have a closer dialogue with the listeners. Please send me an email, [email protected] or join our Slack channel. We really want to know what you’re thinking: what you would like to hear more about, what you’d like to hear less about, and who you are. Transcript Transcript provided by We Edit Podcasts. Software Engineering Daily listeners can go to weeditpodcasts.com/sed to get 20% off the first two months of audio editing and transcription services. Thanks to We Edit Podcasts for partnering with SE Daily. Please click here to view this show’s transcript. Sponsors Sponsoring today’s podcast is Datadog, a cloud-scale monitoring and analytics platform. Datadog integrates with more than 200 technologies, including Cloud Foundry, Docker, Kubernetes, and Kafka, so you can get deep visibility into every layer of your applications and infrastructure—in the cloud, on-premises, in containers, or wherever they run.
With rich dashboards, machine learning-powered alerts, and distributed request tracing, Datadog helps teams resolve issues quickly and release new features faster. Start monitoring your dynamic cloud infrastructure today with a 14-day trial. Listeners of this podcast will also get a free T-shirt for trying Datadog! softwareengineeringdaily.com/datadog Simplify continuous delivery with GoCD, the on-premise, open source, continuous delivery tool by ThoughtWorks. With GoCD, you can easily model complex deployment workflows using pipelines and visualize them end-to-end with the Value Stream Map. You get complete visibility into and control of your company’s deployments. At gocd.org/sedaily, find out how to bring continuous delivery to your teams. Say goodbye to deployment panic and hello to consistent, predictable deliveries. Visit gocd.org/sedaily to learn more about GoCD. Commercial support and enterprise add-ons, including disaster recovery, are available. IBM Cloud gives you all the tools you need to build cloud native applications. Use IBM Cloud Container service to easily manage the deployment of your Docker containers. For serverless applications, use IBM Cloud Functions for low-cost, event-driven scalability. If you like to work with a fully managed platform as a service, IBM Cloud Foundry gives you a cloud operating system to control your distributed application. IBM Cloud is built on top of open source tools, and integrates with all the third party services that you need to build, deploy, and manage your application. To start building with IBM today, go to softwareengineeringdaily.com/IBM and sign up for a free Lite account. With the Lite account, you can start building apps for free, and try numerous cloud services with no time restrictions. Check it out at softwareengineeringdaily.com/IBM.
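Returning to the BOSH description above: as promised, here is a toy sketch of the agent-and-director pattern. It is not BOSH’s actual protocol or API–just a small illustration of the idea that every VM runs an agent which reports to one centralized director, and the director (the leader of the distributed system) notices which VMs have stopped reporting.

```python
# A toy illustration (not BOSH's real protocol) of the agent/director pattern
# described above: every VM runs an agent that reports its state to one
# centralized director, and the director decides which VMs need attention.
import time
from dataclasses import dataclass, field

@dataclass
class Director:
    """Central leader: tracks the last heartbeat received from every agent."""
    heartbeats: dict = field(default_factory=dict)

    def report(self, vm_id: str, job_state: str) -> None:
        # Record the agent's reported state and when we heard from it.
        self.heartbeats[vm_id] = (job_state, time.time())

    def unresponsive(self, timeout_seconds: float = 60.0) -> list:
        # Any VM that has not reported within the timeout is suspect.
        now = time.time()
        return [vm for vm, (_, seen) in self.heartbeats.items()
                if now - seen > timeout_seconds]

@dataclass
class Agent:
    """Runs on each VM; periodically reports the VM's job state to the director."""
    vm_id: str
    director: Director

    def heartbeat(self, job_state: str = "running") -> None:
        self.director.report(self.vm_id, job_state)

if __name__ == "__main__":
    director = Director()
    agents = [Agent(vm_id=f"vm-{i}", director=director) for i in range(3)]
    for agent in agents:
        agent.heartbeat()
    print("unresponsive VMs:", director.unresponsive(timeout_seconds=60.0))
```

In the real system, the director also drives deployments and repairs, but the core relationship is the same: many agents reporting upward, one leader holding the global view.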