AWS re:Invent 2021 – Architecting for Sustainability
What time is it? It’s time to decarbonize your company.
In this video from Amazon Web Services (AWS) re:Invent 2021, Adrian Cockcroft of Amazon takes us on a deep dive into sustainability transformation challenges, how to overcome them, and why your company should sign the Climate Pledge.
Techniques recommended by the AWS Well-Architected Framework and direction on reducing the energy and carbon impact of AWS architectures
About user patterns, software design, and AWS service considerations, which organizations of any size can apply to their workloads
How Starbucks optimized its Kubernetes cluster, increased spot usage, and focused on digital sustainability across the organization
Who is this for?
0:00 (gentle music) Okay, hi, everybody. Thank you for coming to ARC325,
0:06 Architecting for Sustainability. I'm Adrian Cockcroft, VP of Sustainability Architecture for Amazon.
0:14 People ask me, "What does that mean?" It means this. This is actually kind of the title I have
0:21 for talking about architecting for sustainability. I was previously with AWS working on open source,
0:29 and things like that. Earlier this year, I switched to working for Amazon, but focused on helping AWS and AWS customers
0:38 with their sustainability needs. We'll have Steffen up next, talking about the Well-Architected
0:44 guide for sustainability, the pillar, and then Drew talking about how Starbucks has optimized their architecture for sustainability.
0:52 I'll come back at the end just for a few words to wrap it up. So that's the overview of it.
1:00 Starting off talking about sustainability transformation. What do we mean by that? How are we partnering on sustainability?
1:07 Then Steffen will talk about how to architect. We'll get the Starbucks talk. Then I'll talk a little bit at the end about the call to action
1:14 and next steps for builders. So, sustainability transformation.
1:20 What does this mean? Well, let's start talking about sustainability to start with. There's really four areas to sustainability.
1:27 One is decarbonizing everything we do. The second one is water usage. How do we use water effectively,
1:34 make sure we're not polluting, and keep it clean? Then there's social responsibility. This is about knowing your supply chain,
1:41 knowing that the labor practices are good, that there's no forced labor or child labor. And then the circular economy.
1:47 How do we make sure there's zero going to landfill, and there's a lot of recycling? Today, we're focusing really
1:53 on the carbon part of this problem.
1:59 So sustainability is becoming an increasing trend across the business.
2:05 We're seeing customer demand for this. We're seeing increased government regulation, driving all kinds of businesses
2:11 to worry about sustainability when they didn't use to. We're seeing a lot of employee demand, a lot of enthusiasm from people that want to do something.
2:19 Something like "I'm going to leave the world a better place for your children" is a common sort of phrase that we hear.
2:25 We're seeing investors that want to invest in things which are helping the planet
2:32 rather than destroying the planet. And we're also seeing sustainability as competitive positioning across product lines.
2:39 Everyone knows what that looks like in the grocery store, you see all of these brandings.
2:45 But what we've been doing for the last decade or so, what I've been working on is digital transformation.
2:51 And what that means to me is that we were able to connect to our customers and everything we made,
2:58 and we were able to use that to transform our businesses, and build new kinds of services,
3:03 and new kinds of capabilities. And speed everything up using Cloud. So that's digital transformation.
3:09 You should all be pretty familiar with that by now. But what we have to do now is use those same techniques,
3:14 but use them to decarbonize everything we do. Measure it, decarbonize it, and do it at speed.
3:21 Do it quickly using the same techniques that we used for digital transformation.
3:27 And it turns out that companies that figured out digital transformation and sustainability transformation
3:33 are doing much better in the market. The combination together is a very powerful thing.
3:41 But there's challenges. How do you identify what your carbon emission hotspots are?
3:47 How do you reduce energy and water use in your operations? How can you innovate faster to achieve this transformation?
3:54 And how do you collaborate with your supply chain and your value chain to reduce carbon emissions
4:00 end to end?
4:05 And then the other thing that's different: digital transformation happens in months or years. It's never fast enough, but it's months to years.
4:12 Sustainability, we're talking about decades. And we say, well, you may have heard of people saying,
4:20 "Well, we have a pledge. We're going to be carbon neutral in 2050." How many people are still gonna be working
4:26 at that company in 2050 that are making that commitment now? You can make all kinds of promises
4:32 you don't have to deliver on for 30 years. So how can you deliver on a result that's decades away
4:37 and know that you're speeding up the process of getting there? So I'm gonna talk a bit about how Amazon's approaching that,
4:45 and then a bit more about how AWS is approaching that.
4:51 About two years ago, we announced The Climate Pledge. This is to take the Paris Agreement,
4:58 which is a 2050 net-zero goal, and bring it in 10 years earlier to 2040.
5:03 And to have a number of companies around the world, join us in that pledge.
5:09 Now, around Earth Day, which was in April this year, we announced that we'd passed 100 signatories
5:16 in the Climate Pledge. And at COP26, a month or so ago, we announced we'd passed 200 companies
5:23 in the Climate Pledge. So this is growing very quickly, and I'd encourage you, if your company has a goal to be sustainable,
5:30 take a look at The Climate Pledge, reach out through your account team, reach out to me.
5:35 We can help figure out what it takes to help you get to a 2040 commitment.
5:41 Because the more people we can get to 2040, the more impact we can have on the overall goal of getting the carbon for 2050 down
5:51 to where it needs to be. There's gonna be plenty of people that miss 2050, right? We need to have a lot of people that are early
5:58 to balance the people that are gonna be late. So you need a path, oh.
6:06 So we've committed to net-zero carbon, and then there's a path to 100% renewable energy by 2025
6:11 that Amazon as a whole is committed to as well. And we also have this $2 billion Climate Pledge fund,
6:17 which has invested in a number of smaller startups plus a company called Rivian. You may have seen the truck downstairs,
6:23 and Rivian has done very well recently. So, The Climate Pledge has three principles:
6:29 Regular reporting, carbon elimination, and credible offsets. And the signatories come from many industries, many countries,
6:36 and many large brands from around the world. So what do you get out of signing this?
6:43 So it's collaboration, we're sharing ideas. We're trying to find a path to carbon neutrality.
6:51 But these are the companies who are the leaders and innovators in their markets that have figured out how to get to an earlier commitment,
6:58 and then there are other companies going, "Well, I'd like to get there, but I don't see how to get there."
7:05 So what we want to do is use these leaders and innovators, document how they did it.
7:10 Discuss how they did it. Collaboratively work together to help get there quickly and then bring the next wave of people
7:18 into The Climate Pledge. So that's what we're doing. But also if you're a Climate Pledge signatory,
7:25 and investors are looking at you, there are these ESG funds, environmental, social, and governance funds,
7:31 which are looking for investing in companies which have a good sustainability record. So it helps your branding in that sense.
7:39 And then there's always opportunities to lead by sharing technologies, and this is really to accelerate the progress.
7:48 If you look at how we've got there, for the last three years, we've published our annual report.
7:55 You can find it on the website. Incredible detail on what we're doing, and across all of Amazon.
8:01 This is not just AWS. We report on the entire company of Amazon, and AWS is just a piece of that.
8:08 We've ordered these 100,000 Rivian electric delivery vehicles. We've reduced the weight of outbound packaging,
8:14 saved a million tons of cardboard and things like that. And we're also the largest buyer of renewable energy in the world.
8:21 And we recently announced, just a few days ago, that we're now up to 12 gigawatts of energy under contract.
8:29 We're building new headquarters, and that's got LEED Platinum, it's sustainable.
8:34 These new buildings have no gas supply to the building. They're fully electric. And they're built out of low-carbon concrete,
8:40 low-carbon steel. So building is one of those areas: every company has buildings,
8:45 everyone needs to decarbonize how they build their buildings, and how they operate their buildings. It's one of the big areas we're working on.
8:52 And then if you have an Echo or a Fire device, we're actually working to offset the energy you use
8:58 in your house to run those devices. We know which ones are turned on. We know how much power they use. We add that up, and we're basically offsetting the energy
9:06 footprint of our devices.
9:11 If we look at AWS, we've done a lot of work on water
9:17 around data center cooling, and evaporative cooling, and recycling the water. So there's a very advanced set of processes there
9:25 around water management. We've also optimized the energy that comes into the data center from the edge,
9:33 and it gets supplied to the building. There's a lot of conversion losses before it actually gets to the CPU
9:40 that's going to run your workloads. We reduced those energy losses by 35%. And part of that was by going to a distributed UPS
9:47 that actually lets you optimize everything deeply in the system
9:52 rather than having a UPS for the entire building. Obviously, we've been using the renewable energy
9:58 that Amazon as a whole has been buying up to provide low-carbon energy for data centers.
10:05 We've got a low-carbon concrete data center design. You've heard a lot about Graviton this week,
10:10 Graviton3 and Graviton2. Very good performance per watt. So you can reduce your carbon footprint
10:17 by just moving to Graviton. And I'll talk a bit more about the Amazon Sustainability Data Initiative.
10:22 This launched in 2018, and it's an Open Data Program.
10:28 Just going back to that carbon neutral data centre, there's a European organization,
10:33 the Climate Neutral Data Centre Pact. We've joined that, and we're building this commitment
10:39 to proactively lead the transition to a climate-neutral economy. But data is central to addressing the climate crisis.
10:47 It's growing exponentially from new sources. You heard in the keynote, certainly about satellites.
10:54 There are more and more satellites, downloading more and more high-resolution data, as well as all the instrumentation we're getting
11:00 from the Internet of Things, and all the metering we're doing to figure out where our carbon footprint goes.
11:06 It's very diverse data. There's lots of different people using it, and lots of different applications for it.
11:13 So we have a few programs that can help here. And we have an Open Data Program that in general,
11:19 if you have a data set, you own a dataset, and you'd like to share it with the world. Maybe it's petabytes in size.
11:25 That's fairly expensive to share it with the world. But if you enter our Open Data Program,
11:31 we will zero out the cost of that S3 bucket to you. You still own the data, it's your bucket,
11:36 but you share that to the world. We zero out the cost, and we cover all the transfer and usage costs.
11:42 So, it's free for you and it's free to use. So we're taking that approach, and we have a very large number of data sets
11:48 in this program. A subset of that is the Amazon Sustainability Data Initiative,
11:53 which is more focused on climate data. And then we have a marketplace, AWS Data Exchange.
12:01 So ASDI is there to reduce the cost, time, and technical barriers to dealing with these
12:07 very large data sets. Lots of different data sources here,
12:12 Digital Globe, NASA, the UK Met Office. We have climate models,
12:20 climate data, air quality, historical weather records, forecasts, all kinds of indicators.
12:26 So if you're trying to do something that's more sort of global, trying to figure out what's going on in your region,
12:31 you can actually aggregate all these data sets together. We had a hackathon, the Code Green hackathon, on Monday,
12:38 where we had teams come in and work on combining some of these data sets to build some new capabilities.
12:47 We've also recently announced the first full climate models, the things that are predicting
12:53 what's gonna happen, that we brought to the Cloud with NCAR.
12:58 And SilverLining is the nonprofit that's working with NCAR to make this happen.
13:06 Then Data Exchange provides data sources as a marketplace.
13:13 So that's AWS as your sustainability partner. But what problems are you trying to solve? What are your biggest challenges related to sustainability?
13:20 How are you solving these issues? We want to identify the cases, move to AWS,
13:28 and optimize those workloads. Lots of different use cases. Our professional services team has been working
13:34 across all these different areas, and work through your account teams,
13:40 if you're working on one of these, and we can bring in some experienced teams that have built this for other organizations already.
13:47 One of those case studies is Carrier with the cold chain, keeping refrigeration working end-to-end through the entire chain of activities.
13:55 Another one is using machine learning to optimize the energy and water usage for Coca-Cola.
14:04 So lots of different ways. We can migrate workloads to AWS. And we even co-develop products with our customers
14:11 through the industry products group. And one example there is Vector,
14:16 who are building a metering system that can track the energy usage of their customers.
14:23 So that's what I've got. And we should look now at optimizing Cloud workloads.
14:30 And I'm gonna bring up Steffen now to talk about that.
14:36 Yeah, thank you. There you go. Thank you, Adrian. Yeah, I would like to talk about how you architect for sustainability,
14:43 and reduce your environmental impact.
14:48 When we talk about environmental impact, one important factor for climate change and global warming is the emission
14:55 of pollutants and greenhouse gases. One way to categorize and quantify these emissions
15:03 is the Greenhouse Gas Protocol, which divides the emissions into three scopes.
15:09 Emissions are measured in carbon dioxide equivalents to also factor in other gases that contribute to global warming.
15:18 Scope one is direct emissions. Fuel, wood, anything which burns, or emits greenhouse gases.
15:25 To reduce that, you would electrify everything. For example, by switching to electric vehicles,
15:31 or moving from gas ranges to induction ranges. Scope two is emissions through purchased electricity.
15:40 Even if you electrify everything, the energy still has a carbon intensity,
15:45 because it's usually sourced, for example, from a mix of wind, and solar, and coal.
15:53 And those emissions contribute to scope two. To reduce that, you would use as much
15:59 renewable power as possible, and store the renewable energy,
16:05 for example, in batteries, so it is there when you need it.
16:11 Scope three is indirect emissions, all the way up and down your supply chain.
16:17 Scope three depends a lot on your production depth, the products and services you're delivering,
16:22 and how you deliver them. Let's transfer these three scopes to data centers.
16:29 Looking at a typical data center, you see the energy is coming from the grid.
16:34 It's usually a mix of renewables and fossil fuels. And the carbon in the energy contributes to scope two.
16:43 Then you have a diesel generator for backup if the grid is not available, and its direct emissions contribute to scope one.
16:50 And then there are things which need to be built and delivered to the facilities, like the building itself,
16:57 and also equipment like racks and servers. And the carbon emitted for this contributes to scope three.
17:06 Zooming in close on the data center, you see that the energy is distributed in the facilities
17:12 to the cooling, to the servers, and to charge the uninterruptible power supplies.
17:19 The servers are doing a lot, but if you just look from outside at these boxes
17:27 using the energy, they draw electricity and emit it as heat. And this heat needs to be dissipated by cooling.
17:35 And on an abstract level, every data center looks like this, Cloud or on-premises.
17:41 But let's look at how the Cloud helps to increase the resource efficiency,
17:46 and reduce the carbon footprint. First, let's go back to something
17:52 you're already likely familiar with, the shared responsibility model of security, which you share with AWS.
18:00 It says, AWS is responsible for the security of the Cloud, like physical data centers, managed services,
18:07 and customers are responsible for the security in the Cloud using these managed services to, for example,
18:15 configure a firewall in the VPC, or encrypt data.
18:21 And we can apply the same concept of shared responsibility to sustainability.
18:29 AWS designs things and builds things, like data centers, data halls, racks, and servers,
18:37 and takes care of the material from construction to recycling. AWS, as you have seen in Adrian's part,
18:45 purchases the energy, and then ensures that the energy is used efficiently, and that other resources
18:52 like water for cooling are used efficiently. And lastly, AWS service teams manage services,
19:00 and take care to optimize them for sustainability. And the customer's responsibility on top of that
19:07 is making architectural decisions: selecting the services, using the services,
19:12 and determining which code is running on these, and how efficient that is.
19:19 What does the sustainability of the Cloud look like, and how does it compare to on-premises operations?
19:26 A typical data center is a mix of technology, often underutilized,
19:31 and with a lot of wasted energy and older equipment.
19:37 For most organizations, running data centers and IT equipment is not their core competency,
19:43 and as such, they have relatively less experience with, and investment in, end-to-end efficiency improvements
19:50 of their operations. In contrast to that, if you look at an AWS data center,
19:59 the optimization begins with the purchase of energy, as you have seen in Adrian's part.
20:07 Amazon is the largest corporate purchaser of renewable energies, and is far from being done here,
20:14 as you saw in the recent announcements this week. But we are also investing in innovating
20:20 to increase the energy efficiency. For example, the decentralized uninterruptible power supplies,
20:26 and we also minimize the energy we use for cooling
20:32 by using direct evaporative cooling or cooling with outside air.
20:39 We manage centralized services, for example, for network and storage. And each service has a dedicated service team,
20:47 which is managing the capacity according to the demand of the customers.
20:53 And each team works on optimizing and reducing the cost of operating the services,
21:01 and ultimately the energy and carbon footprint of the systems used to run AWS.
21:09 Take the EC2 teams. They manage the fleet of instances used by customers, but if the instances are not used by customers,
21:16 the servers are still there. Of course, the service teams cannot dial down the infrastructure to the exact demand.
21:24 There must be spare capacity, so it is there when you need it. That's why they incentivize the use of spare EC2 capacity
21:32 with a discount of up to 90% off On-Demand pricing with EC2 Spot Instances.
21:40 And this increases the overall utilization, and resource efficiency of the data centers.
21:48 Or take a higher-level service like AWS Lambda. When Lambda was launched at re:Invent seven years ago,
21:55 it was running with dedicated EC2 instances per customer, so that it met the desired level of isolation and security.
22:03 Only four years later, it was running on all-new servers: bare metal instances controlled by EC2 and Nitro,
22:10 with the virtualization technology called Firecracker, which allows the launch of a new secure microVM
22:17 in a fraction of a second. And this of course also reduces the overhead of the whole service.
22:24 And this is just one example of how AWS service teams can optimize under the hood.
22:31 You can find more information on how AWS helps to reduce the carbon emissions in these public reports
22:39 from 451 Research for US, APAC, and Europe.
22:44 One important factor of course, for the reduction is the growing share of renewable energies.
22:50 But it's also very important how this energy is used.
22:55 In the report for Europe, which was released just last month, you can see how the energy use
23:03 can be reduced by almost 80% by moving from on-premises data centers to the Cloud
23:12 through more efficient facilities, more efficient servers, and the overall higher utilization of the Cloud.
23:20 But what can customers do in their part of the responsibility?
23:26 Over the last years, we've published five pillars in the AWS Well-Architected Framework,
23:33 and they capture best practices for operational excellence, security, reliability, performance, and cost.
23:40 And these all have to support business needs for fast time to value for features,
23:48 and the delivery of products. However, today we have new problems,
23:53 and we noticed that the ocean is rising. There's a climate crisis affecting our businesses,
24:00 and we need to think about sustainability. Essentially, we need to optimize for the delivery
24:07 of sustainable products. So that's why today we announced sustainability
24:14 as the sixth pillar for the AWS Well-Architected Framework to meet this need.
24:21 The sustainability pillar enhances the framework to provide a way for you to consistently measure
24:27 architectures against best practices, and identify areas for improvement.
24:32 And over time, it helps us to do our bit towards
24:37 mitigating the climate crisis. The practice of sustainability when building Cloud workloads
24:45 is to understand and quantify the impacts, and apply best practices to reduce these impacts.
24:52 And the term sustainability covers a broad field, including social, economic,
24:59 and environmental aspects. Like this talk, the pillar focuses on the environmental impacts,
25:06 and especially on energy consumption and
25:11 resource efficiency, as this is an important lever architects have to improve and reduce resource usage.
25:21 How can you improve? Basically, the improvement process that is outlined
25:28 in the Well-Architected pillar is an iterative process. It's similar to how we have already been optimizing for cost
25:35 and performance for decades. First, you need to be aware of your KPIs,
25:42 and the performance against these KPIs. And you also need a goal for improvements.
25:49 You identify targets like looking at all your applications in your application landscape,
25:54 and then you prioritize them by impact and by usage type, for example.
26:00 Then you evaluate specific improvements. You form hypotheses, like implementing auto-scaling,
26:07 doing right-sizing, making an architectural change. Then you experiment, you deploy to production,
26:13 and then you measure the results. Depending on the success, you either roll back unacceptable outcomes,
26:20 or you spread the word. You look at ways to replicate the success to other workloads.
26:27 And as long as you provide application teams with metrics, they can optimize against that.
26:34 Which metrics make sense? I need metrics that can be measured frequently,
26:39 that are delivered in a timely manner, and which I can break down to application teams
26:44 so that application teams can say, yeah, this release we did last Monday
26:49 had a bad or good impact on our metrics. And metrics need to align with resource usage,
26:57 so that they are tangible for application teams. These metrics here are metrics I can draw from AWS services.
27:06 I can look at resources such as compute, network, and storage
27:11 and it is important to not look only at one metric, since they often compete with each other.
27:18 For example, I can store my data in just one region,
27:23 but then to serve my customers globally, I will have more data transfer.
27:28 Or I can reduce my compute resources by storing more data,
27:33 caching more data. And in the same way that the metrics compete with each other,
27:40 they also compete with traditional non-functional requirements.
27:46 Essentially, your sustainability KPIs are yet another nonfunctional requirement.
27:52 And as such, you consider trade-offs, like adjusting costs. Moving processing work from your end user devices
27:59 into your application backend may increase the cost of your application,
28:05 but it will also lower the demands on your end user devices. And maybe your application will not be the reason
28:12 that this iPhone from last year needs to be upgraded.
28:17 And you can also make a trade off regarding the quality of the results.
28:23 Sometimes it's okay to say there are hundreds of results instead of giving an exact figure.
28:30 And we should avoid the pitfall of just looking at the resources. When your usage is the same, but at the same time,
28:38 your user base is decreasing, that's a cause for alarm. That's why you should factor in business metrics
28:45 to measure efficiency in resources per unit of work, such as vCPU hours per connected vehicle mile,
28:54 or megabytes of storage per customer. And this way, you normalize KPIs to compare the performance over time.
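A minimal sketch of the normalization described above, assuming a hypothetical `Period` record and made-up numbers (none of these names or figures come from the talk). It illustrates why flat resource usage can still hide an efficiency regression once it is divided by a business metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    label: str
    vcpu_hours: float     # resource usage, e.g. taken from usage reports
    units_of_work: float  # business metric, e.g. connected vehicle miles


def vcpu_hours_per_unit(period: Period) -> float:
    """Return resource usage normalized by the business metric."""
    if period.units_of_work <= 0:
        raise ValueError("units_of_work must be positive")
    return period.vcpu_hours / period.units_of_work


# Hypothetical numbers: usage is flat while the amount of work served drops.
periods = [
    Period("month 1", vcpu_hours=12_000, units_of_work=4_000_000),
    Period("month 2", vcpu_hours=12_000, units_of_work=3_000_000),
]

for p in periods:
    print(f"{p.label}: {vcpu_hours_per_unit(p):.4f} vCPU hours per unit of work")
```

In this made-up case the raw vCPU hours are identical in both months, but the normalized KPI rises from 0.0030 to 0.0040, which is the kind of signal the unit-of-work metric is meant to surface.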
29:04 You need to get dashboards established for the KPIs at the team level so that each team can own
29:10 their goals and can optimize independently.
29:16 This here is just one way to do this. Most of you are already familiar with the AWS Cost and Usage Reports for the sake of cost showbacks,
29:25 but as the name implies, they are a source for usage as well. And you can combine that with Amazon CloudWatch metrics
29:33 and your business metrics and query that with Amazon Athena. Start by establishing a few KPIs,
29:39 for example, in Amazon QuickSight, and start with KPIs where you have a large influence
29:46 in the shared responsibility model, for example, EC2 compute, and experiment based on the feedback
29:53 from your application teams. We have looked now at the process and the KPIs
29:59 you need to establish, or that are suggested, in the sustainability pillar,
30:04 and these help us to identify workloads
30:09 where you should invest with further deep dives and reviews.
30:15 And the pillar also highlights best practices you should apply from these five areas:
30:20 user behavior, software and architecture, hardware, data management,
30:25 and the development and deployment process. And I would like to go
30:30 over these best practice areas in detail.
30:36 First, user behavior best practices. They describe how you can align your workload
30:41 to your customers' usage to meet your sustainability goals.
30:48 There are a lot of suggestions here, but in short, look at when, where,
30:53 and how your customers are using your workload. What they don't use is, of course,
30:59 potential for optimization, and you can get rid of unused resources.
31:05 And revisit the service levels you promise to your external, but also internal customers.
31:13 I want to show an example of the impact of architectural decisions and SLAs
31:19 on resource efficiency. Consider that the SLA of your application requires
31:24 an immediate failover, and your service is already running in two Availability Zones for high availability.
31:33 To implement the immediate failover, you will have to have the capacity of one AZ running
31:41 and available in the other AZ, so that when you have a failover,
31:47 the other AZ can take over the additional workload. But as you see, to be able to successfully accomplish that,
31:55 you have to have 50% of the capacity just waiting for the rare case of a failover,
32:03 which translates to 50% utilization. Now, if you run across three Availability Zones,
32:10 you will need less spare capacity in the other two Availability Zones
32:18 to handle the workload. In this case here, you can use up to two thirds
32:23 of the capacity for your workload. So when a failover happens,
32:28 your reserve capacity in the other two Availability Zones can take over.
32:35 So as you see, you can run with higher utilization rates across three Availability Zones and still have enough capacity.
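The arithmetic behind the two-zone and three-zone comparison can be sketched as follows. This is a simplified model under stated assumptions: load is spread evenly, all zones have equal capacity, and exactly one zone fails at a time. The function names are illustrative, not from the talk.

```python
from fractions import Fraction


def max_safe_utilization(zones: int) -> Fraction:
    """Highest steady-state utilization at which the surviving zones can
    still absorb the full load after one zone fails (immediate failover).

    With total load L and n equal zones of capacity C, surviving a single
    zone failure requires (n - 1) * C >= L, so L / (n * C) <= (n - 1) / n.
    """
    if zones < 2:
        raise ValueError("need at least two zones to fail over")
    return Fraction(zones - 1, zones)


def idle_reserve(zones: int) -> Fraction:
    """Share of provisioned capacity held back for the failover case."""
    return 1 - max_safe_utilization(zones)


for n in (2, 3, 4):
    print(f"{n} zones: utilization up to {max_safe_utilization(n)}, "
          f"reserve {idle_reserve(n)}")
```

Under these assumptions, two zones allow at most 1/2 utilization and three zones allow 2/3, matching the figures in the talk.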
32:45 If you want to optimize resiliency for sustainability, revisit the SLA:
32:51 does it need to have an immediate failover capability? If not, you can use cold capacity for failover.
32:58 The trade-off you will have to consider is that it takes a longer time to spin up the resources, and your users may have to wait
33:05 until their request is fulfilled. However, the upside is utilization.
33:11 You don't have to have all that extra capacity waiting for the rare case of a fail over.
33:21 One recommendation is to review the SLAs and negotiate impact-friendly SLAs.
33:26 Yes, there might be negative effects like slower response times, but the upside is a higher resource efficiency.
33:38 And for those stakeholders who can be convinced by US dollars: yes, fewer resources means less cost.
33:47 The next area for best practices is reducing your impact by making changes to the software and architecture.
33:54 And to sum up the previous slide, it is work less, store less, do work more efficiently,
34:02 drive up the utilization by reducing idling resources,
34:07 use sustainable scheduling strategies, steer work to more sustainable regions.
34:14 To make it more concrete, let's pick the example of logging in your application. Review the details, the log level, and the retention.
34:21 Use efficient formats and compression that match the structure and the type of your logging.
34:28 Use asynchronous, lock-free logging, so your logging does not become the bottleneck. And when you don't care
34:34 when your log analytics need to run, distribute that work over time.
34:40 You can imagine that many customers are running their analytics workloads at night at the full hour.
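One possible way to apply the logging suggestions above, sketched with Python's standard library only. The helper name `build_async_logger`, the logger names, and the file layout are assumptions for illustration. The request path only enqueues records, a background thread writes them, the log level filters out detail, and the sink is gzip-compressed. Note that `queue.Queue` uses locks internally, so this shows asynchronous hand-off rather than a strictly lock-free design.

```python
import gzip
import logging
import logging.handlers
import os
import queue
import tempfile


def build_async_logger(name, path, level=logging.WARNING):
    """Return (logger, listener, stream); the caller stops the listener
    and closes the stream when done."""
    log_queue = queue.Queue(-1)                       # unbounded hand-off queue
    stream = gzip.open(path, "wt", encoding="utf-8")  # compressed text sink
    sink = logging.StreamHandler(stream)
    sink.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    listener = logging.handlers.QueueListener(log_queue, sink)
    listener.start()                                  # background writer thread

    logger = logging.getLogger(name)
    logger.setLevel(level)                            # drop detail below this level
    logger.propagate = False
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    return logger, listener, stream


def demo(name="demo.async"):
    path = os.path.join(tempfile.mkdtemp(), "app.log.gz")
    logger, listener, stream = build_async_logger(name, path)
    logger.debug("verbose detail that is filtered out")
    logger.warning("disk usage above threshold")
    listener.stop()   # processes queued records, then joins the thread
    stream.close()    # finalizes the gzip file
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return fh.read()


print(demo())
```

Only the warning reaches the compressed file; the debug record is discarded at the logger before it is ever queued or stored.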
34:48 And as regions have different carbon intensities, select regions near Amazon renewable energy projects,
34:55 or pick regions where the grid has a published carbon intensity
35:01 that is lower than in other regions. This curve here shows the resources used by an application.
35:11 You see an average use and also peaks when the work load
35:16 is all done at the same time. However, the resources and the energy use
35:23 is not indicated by this curve, but it's indicated by the provision capacity line.
35:29 And that provision capacity is needed to handle the peaks. Now let's consider that we can drive down these peaks
35:37 by distributing your workloads over time, or by implementing a queue to smoothen the workload.
35:43 We achieve a better ratio from the peaks to the average use and can dial down the provision capacity.
35:51 This results in less resources and also less energy consumed.
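The effect of a queue on provisioned capacity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hourly demand numbers, the `rate` cap, and the function names are hypothetical and do not come from the talk or from any AWS tool.

```python
from statistics import mean


def peak_to_average(series):
    """Ratio of the busiest interval to the mean; 1.0 means perfectly flat."""
    return max(series) / mean(series)


def smooth_with_queue(demand, rate):
    """Serve at most `rate` units per interval; unserved work waits in a queue.

    Returns the work processed per interval and the backlog left at the end.
    """
    backlog = 0
    processed = []
    for arriving in demand:
        backlog += arriving
        done = min(backlog, rate)
        processed.append(done)
        backlog -= done
    return processed, backlog


# Hypothetical hourly demand with one burst (arbitrary work units).
demand = [10, 10, 80, 10, 10, 10, 10, 10]
smoothed, leftover = smooth_with_queue(demand, rate=25)

print("capacity needed without queue:", max(demand))
print("capacity needed with queue:   ", max(smoothed))
print("peak-to-average before:", round(peak_to_average(demand), 2))
print("peak-to-average after: ", round(peak_to_average(smoothed), 2))
```

The same total work is done in both cases; the cost of the lower capacity line is that the burst takes several intervals to drain, which is the kind of relaxed SLA discussed earlier.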
35:57 Next area for best practices from the pillar is minimize the amount of hardware
36:04 that is needed to provision and deploy. In general, select the most efficient hardware
36:11 for your individual workload. A general recommendation is to do right-sizing:
36:16 not only select the right ratio of compute to memory and the size of the instance,
36:23 but also features like GPUs and instance storage, and also the processor type.
36:29 One example is Graviton, already mentioned by Adrian.
36:34 The Graviton2 processor has been launched already in 2019, Graviton3 this week
36:40 and it remains the most power efficient processor AWS offers.
36:45 The adoption can be as easy as switching a managed service like Amazon Relational Database Service or OpenSearch Service to the corresponding Graviton type.
36:55 And you can also move your code to Graviton with EC2 and also with Lambda.
37:02 Next in the pillar are best practices around data management and storage
37:08 and processing practices. For our data, we should think strategically
37:14 about the type of storage that we use. Some data needs to be accessed fast and often,
37:21 some data is never read once written. And if you decide for storage or service features
37:27 with a relaxed durability, availability or response time,
37:32 AWS can use this information to make trade offs and improve the resource efficiency
37:40 and energy efficiency. The example here of Amazon S3,
37:46 S3 One Zone-Infrequent Access allows AWS to not send all objects over the wire
37:52 to another Availability Zone. And in the same way, the extended retrieval times of Glacier allow tradeoffs
38:00 to use less energy and less infrastructure for operations.
38:05 The general recommendation is look for opportunities to move your data to cold storage tiers,
38:12 and also take a look at the announcements. This week, we also made two interesting announcements
38:20 about EBS snapshots and S3 for further storage tiers that you can leverage.
38:29 Let me point out the case of an internal service AWS uses for log archival to S3,
38:34 to highlight the relevance of compression. The service teams looked for opportunities
38:40 to reduce the resource use and experimented with different compression algorithms.
38:47 Of course, the results will vary by the trade offs you're willing to make between speed and ratio
38:53 and the types of data you compress, but the compression in this example here improved significantly coming from LZ4 and gzip
39:02 to Zstandard, and as many small things add up, this reduced the required storage
39:09 by overall one exabyte. And finally one area for the best practices by the pillar,
39:16 look at your software life cycle to support improvements for resource efficiency.
39:21 That includes of course, switching off test and dev environments when you do not use them,
39:28 but also make it safe and easy to introduce and validate improvements of your own applications
39:35 for resource efficiency and third party libraries. One example I would like to pick here
39:47 is the library a Node.js application uses to call an AWS service. And as the many copies of these libraries add up,
39:54 the team wanted to decrease the size of the libraries. With version three, they implemented
39:59 a modular approach so that you can cherry pick the clients that are needed for your application,
40:05 which resulted in a potential reduction of the size by 75%.
40:11 And in further iterations, they trimmed the unnecessary code and build artifacts
40:16 to reduce the packaged and install size by another 50%.
40:23 Now we had a quick rundown of the process and the best practices from the Well-Architected pillar document.
40:30 To get more practical, let's hear from Drew Engelson, Director of Engineering at Starbucks Technology,
40:36 how he and his team implemented sustainability as a non-functional requirement.
40:43 Thank you. Thank you. Thank you, Steffen, thank you, Adrian.
40:49 And welcome everybody. To inspire and nurture the human spirit,
40:56 one person, one cup, one neighborhood at a time. This is the Starbucks mission statement.
41:02 It serves as a guiding light, a guidepost, and a directional opportunity
41:09 for us to understand and make decisions hopefully for the betterment of our customer experience and the planet.
41:17 You'll see, there's no mention of coffee there. It's about people, it's about humanity.
41:22 Yes, we sell coffee, but we really sell moments of joy.
41:31 So Starbucks was founded in 1971. So for those who are really quick at math, that makes this our 50th anniversary.
41:38 And over that time, we have made many investments in sustainability. I'll highlight just a few.
41:45 So in 1985, we were one of the first retailers to offer a 10 cent discount for bringing in a reusable cup.
41:52 And that was just one year into Starbucks selling coffee in cups. Prior to that, we were selling bulk coffee and tea.
42:00 In 2008, LED bulbs, if you remember,
42:06 had a cold feeling; they weren't warm and cozy like they are today. So we worked with General Electric
42:11 to help develop the right feel for LED bulbs and put them in all of our stores.
42:20 And in 2019, we updated our cold drink lids to eliminate the need for straws,
42:26 reducing the overall plastic by about 9% and increasing the recyclability.
42:32 And this past September, Starbucks was recognized by the Environmental Protection Agency
42:37 as a green power leader for having 100% renewable energy in our company-owned stores.
42:45 And our CEO, Kevin Johnson, routinely highlights the importance to our business of being planet positive.
42:52 So trying to set the context that Starbucks itself really believes in sustainability
42:58 and being green. Let's take a shift and look at what my team does. So I find it best to start here, and hopefully
43:05 this is familiar to some of you: the Starbucks app.
43:10 It's a very popular app for ordering your Starbucks coffee for getting rewarded with stars and spending stars.
43:18 I'll remind you what I remind my mother all the time, I don't build the app.
43:23 This app is powered by several domains of APIs or engines, as I call them.
43:29 So the first is the loyalty engine, or Starbucks Rewards, which allows customers to earn stars as they drink coffee
43:36 and eventually earn enough to spend them on other caffeinated beverages.
43:43 Second is the commerce engine. When you go and place that order with your phone, all those call requests are coming through our platform.
43:50 We handle approximately two and a half million mobile orders on a daily basis in North America alone.
43:59 Now these systems were built in the last two or three or four years. It was an amazing greenfield opportunity
44:06 to start from scratch, to build custom software Cloud native, using the best of breed technologies
44:14 that helped us deliver on these capabilities. I'll give a quick overview of what some of that looks like.
44:20 We run on AWS. We use an event-driven microservice architecture.
44:25 So in general, requests come in, they get handed off to Kafka, and then they're processed asynchronously wherever possible.
44:33 Now the engineering teams are building services; a microservice is the smallest unit of the application
44:41 that can be built and these services get deployed into Kubernetes pods,
44:46 and we run just enough pods to be able to deliver our services to our customers.
44:52 You see a lot of these things on this diagram were part of our optimization efforts and we'll talk about some of that in just a minute.
45:00 So Starbucks has a great green mission. It's a mission that resonates very well with me
45:08 and gets me very excited to go to work in the morning. I still felt a bit of a gap in my day-to-day life.
45:15 We're off building the rewards program. We're building the commerce engine. Starbucks is saying, and setting audacious green goals
45:23 but I felt there was a gap in making the link from what we do to helping participate
45:30 in those broader statements. So I asked myself, what is the environmental impact of our systems?
45:36 Could, and should, we be doing better? What is our carbon footprint and what metrics should we be looking at
45:42 as we think about reducing our impact? So I started out by reaching out
45:48 to our global sustainability team. Those are the folks who are responsible for calculating and reporting
45:53 the overall carbon footprint for Starbucks. I learned some very interesting things.
45:59 One, they use a very coarse spend-based model for estimating annual carbon emissions.
46:05 I also learned that over 20% of our carbon footprint is attributed to dairy
46:11 and less than 1% is attributed to Starbucks technology.
46:16 So one of my key takeaways: well, I'm definitely putting oat milk in my matcha lattes from now on.
46:23 Technology has very low impact compared to other parts of the business.
46:29 While that 1% seems low, we are a very large business. So 1% of a lot is still quite a big number.
46:37 And we have opportunities to improve that as well. Another interesting takeaway
46:42 is that because technology tends to be efficient by nature, we can leverage technology to actually offset
46:49 and build projects, to reduce our carbon footprint outside of technology.
46:56 I also learned that this annual spend model isn't granular enough or timely enough for me to make architectural decisions,
47:03 to be able to make a change, test, and see how it impacted our carbon footprint.
47:11 So I had in mind this dashboard: what if we can get near real-time metrics, if we can perform a change and test
47:20 and see whether or not we actually had positive or negative impact on our actual carbon footprint
47:27 or our customer experience. So we did this during hack week. I partnered with our sustainability team
47:34 analyzing AWS usage data. Now I had no way to get any access to our actual carbon emissions.
47:41 So we were left with figuring what are the right proxy metrics to look at to get a good understanding of our impact?
47:49 Well, some easy ones are costs, cost is directly related to how much you're consuming in the Cloud.
47:55 So if we can get costs down, we're probably having a lower impact on the planet. And similar to that, we look at CPU hours normalized
48:03 based on the size of an instance. So making many assumptions and educated guesses,
48:09 we built a dashboard that kind of gave us some numbers and some directional ideas
48:15 about how well we're doing. Knowing the numbers, the absolute numbers were completely garbage, but relative from one month to the next to the next,
48:24 we can figure out what direction we were going. And directional is really what we're looking for.
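A directional proxy metric of this kind can be sketched as follows. This is a minimal illustration, assuming that "normalized" means weighting instance hours by vCPU count; the usage records and function names are hypothetical and are not Starbucks' actual dashboard logic.

```python
def normalized_cpu_hours(usage):
    """Sum instance hours weighted by instance size (vCPU count).

    `usage` is a list of (vcpus, hours) pairs, one per instance.
    """
    return sum(vcpus * hours for vcpus, hours in usage)


def relative_change(previous, current):
    """Fractional change from one period to the next; negative means a reduction."""
    return (current - previous) / previous


# Hypothetical usage for two consecutive months: (vCPUs, hours running).
october = [(2, 720), (8, 300)]
november = [(2, 720), (8, 210)]

before = normalized_cpu_hours(october)
after = normalized_cpu_hours(november)
print(f"normalized CPU hours: {before} -> {after}")
print(f"month-over-month change: {relative_change(before, after):+.1%}")
```

The absolute values say little about real emissions, which matches the point made in the talk; only the sign and rough size of the change from one month to the next are meant to be read.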
48:29 Now, maybe perhaps more interesting than the actual carbon footprint, which I was really still going after is what opportunities do we have
48:36 for being even more efficient. So we built in, everything we can think of that would,
48:41 we could do to potentially reduce consumption, be greener. Some of these are very obvious
48:47 and there are so many other areas that are quite nuanced. And we created this green score to show us what the gap is
48:53 for what other opportunities we have to be greener.
48:59 At that point, we were entirely on our own and just making stuff up and doing the best we can. We then reached out to AWS's sustainability team
49:06 who was extremely helpful to help us validate some of our assumptions to make sense and provide a lot of recommendations
49:14 and some data for us to help make better decisions. The one thing we did,
49:21 we went through a very early version of the sustainability pillar, the well-architected review.
49:26 We had been through many traditional well-architected reviews in the past and knew that we were pretty tightly optimized
49:33 just from a performance and cost perspective. My intuition told me that we were also having a pretty good, lower impact on the environment
49:42 as a result of those optimizations we'd already done. We also got an early version
49:49 of some of the carbon footprint tool data sent to us and learnt that from 2019 to 2020,
49:56 our actual carbon footprint was reduced by about 32%. Now, throughout this entire time,
50:03 our business was growing like crazy. We had several record breaking mobile order days in 2020.
50:10 And we went from 50 stores with our commerce engine to every store in North America by the end of that year.
50:21 So I shared this with our leadership, our CTO, and our chief sustainability officer
50:26 that, hey, look here, we reduced our footprint. And by the way, we saved money and our business
50:31 has grown like crazy. And his response was, this is a big challenge.
50:37 And technology actually offers some really interesting opportunities here where we can divorce the carbon reduction
50:44 from business growth. Business grows, carbon goes down in these particular cases,
50:50 the more we optimized. That's a legitimate win-win situation right there.
50:56 So looking back a little bit, now, some of this, again, with hindsight, when we sort of just did the analysis and how do we pull some of that off?
51:05 So first of all, it was extremely important to have great observability, great service level objectives. We couldn't really know if we were improving or not
51:13 if we didn't know where our targets were, where we were headed.
51:19 So we depend heavily on Datadog in our case, but whatever tool does the job will do,
51:25 but it's important to have good observability and have an understanding of what your customers expect
51:31 from their experience and what your SLAs are. We also knew that we were quite efficient by design.
51:37 Cloud Native was cleaner, reportedly up to 88% cleaner than a traditional data center.
51:45 We rely heavily on Kubernetes, which allows us to densely pack our services onto the infrastructure beneath it,
51:51 really getting us very high utilization. We use binary protocols,
51:57 gRPC is super efficient for communicating between our services.
52:03 Many of our services are written in very efficient Scala.
52:09 And we also with cost in mind are constantly optimizing.
52:14 So I like to just say right-size everything all the time, make sure that if we are not utilizing resource,
52:20 but we're paying for it, let's get rid of it. We don't need that. Right-size instance types:
52:26 we learned a lot about making sure that we can match an EC2 instance type
52:32 with a workload, and there are so many EC2 instance types that there's one that's perfect for the job.
52:38 And many things along those lines. You can see some examples here of where we do some auto-scaling tuning.
52:44 We found that some of our services were asking for more compute than what they actually needed.
52:50 So based on the data, we were able to turn that down a little bit and have the auto-scaling curve much more closely
52:55 and more smoothly match the actual demand curve. Some less obvious learnings.
53:01 These are slightly more nuanced areas, and were already hinted at, actually, by my predecessors here.
53:07 Some regions are greener than others lower carbon regions.
53:12 So if we can get access and understand where those are, all other things being equal, we might as well run these workloads
53:18 in those lower carbon regions. We've heard a lot about Graviton this week.
53:25 We switched some of our instance types to Arm. And I think the cost was about the same, performance was the same.
53:31 We did it for no other reason than it was more energy efficient. Spot is an interesting one as well.
53:38 From a customer perspective, we're still using as much compute as we were, whether we use spot or not.
53:44 It's an obvious target for optimizing by costs, but we believe it's the right thing to do
53:49 in addition to the cost savings, it allows the Cloud provider as we saw earlier, to get much higher utilization
53:56 of their infrastructure, it's already running, let's just use it. And another one we learned,
54:01 which was also something you don't normally think about is, we have backup jobs running on a regular basis.
54:07 And by default, it might run at the top of each hour, very predictably. But again, if every single customer of Amazon Web Services
54:16 did that, and they do, I think you'd have a crazy amount of peak.
54:21 So we went and added kind of randomization to the time that some of these jobs run.
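Randomizing a job's start time, often called adding jitter, can be sketched like this. It is a minimal illustration only: the 30-minute window, the function name, and the example date are assumptions, and the talk does not say how Starbucks implemented it.

```python
import random
from datetime import datetime, timedelta


def jittered_start(scheduled, max_jitter_minutes=30, rng=None):
    """Return `scheduled` delayed by a random offset of up to `max_jitter_minutes`.

    Passing a seeded `random.Random` as `rng` makes the result reproducible.
    """
    if rng is None:
        rng = random.Random()
    offset_seconds = rng.uniform(0, max_jitter_minutes * 60)
    return scheduled + timedelta(seconds=offset_seconds)


# Hypothetical backup job nominally scheduled at the top of the hour.
top_of_hour = datetime(2021, 12, 1, 2, 0)
print("scheduled:", top_of_hour)
print("actual:   ", jittered_start(top_of_hour))
```

Each job then lands somewhere within the window instead of exactly on the hour, which spreads the combined load of many jobs over time.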
54:30 Now, we talk mostly about Cloud but we're not unique here. We are a part of a much larger ecosystem. Data centers, Data transfer, Vendors,
54:38 Cloud, End-user devices, each one of these areas, no matter what part of the business you're in,
54:44 if you're making decisions about how to configure data center, how are you gonna build an application,
54:51 how are you going to write a contract with the vendor, there is something you can do to help improve the sustainability of your workloads.
55:00 So, as we already said, moving the data to the Cloud is much more efficient.
55:06 You can favor vendors that have strong sustainability statements put out there in public.
55:12 Data transfer is interesting because we have to move data back and forth, but let's make fewer calls. Let's make smaller calls.
55:18 Let's shorten those calls and reduce our impact overall. I find end-user devices
55:24 would be one of the most interesting areas here, because whatever it is you're building, a website or a mobile app, it lands on your end user's screen.
55:31 And there's so many different ways to interpret and understand how energy is consumed on that screen.
55:37 So for example, I like dark mode. Screens consume energy and sometimes the brighter they are,
55:43 the more drain on the battery it is. Some colors require, on some screens require more energy.
55:50 So dark mode is one way to potentially reduce the energy need for displaying a simple application or whatever it is.
55:56 Blocking bots: bad bots hitting your website are just complete waste that you don't want to have, needless to say,
56:02 and even good bots, you can tell them don't come back for another week, don't come every day.
56:08 Lazy loading, reducing CPU, reducing page weight, these are all opportunities we have for improving.
56:14 Looking at a website, for example, there's all sorts of tools out there for helping to understand, how green a website is.
56:21 And you look at some of these Safari tools at the bottom, you can see the CPU profile spike.
56:27 It might be interesting to look at what's causing that spike. Sometimes when you hear your laptop fan
56:33 whirring and whirring and whirring, it might be your laptop saying help, I can't do this anymore.
56:44 Let's not auto play a video that is off the screen below the fold anyway.
56:50 So there's a lot of things to think about. Some of these might seem like small negligible tweaks,
56:55 but at a large scale with lots of volume, it can have a major impact overall. Plus you probably get a snappier user experience.
57:04 So real quick, I think we're low on time here. Overall, I believe that we should factor in carbon
57:10 as just a cost and part of our total cost of ownership as I like to call it,
57:15 and please, I beg for forgiveness already. TCO2.
57:21 If we were to consider carbon as the thing we're optimizing for, and that's what it costs,
57:26 we're making trade-offs. If carbon was the thing, what would we do? Would we make different choices?
57:32 In the Cloud, it's easy 'cause we tend to also align very closely with costs, but that's not always the case
57:38 in different parts of the stack. So what do we do? I think it's very important to get the word out
57:45 just in your company, start having conversations, get people excited about this. I find people are already excited about this
57:51 when I bring it up. We recently started a greener technology guild at Starbucks, across the organization, to help bring ideas together
57:59 and see how we can start measuring and really have an impact overall.
58:04 Make sure you include green goals in your projects as a non-functional requirement. Target low-hanging fruits,
58:10 accelerate migration to the Cloud. There's lots of things we can do. And when you have your technology in order,
58:17 you can use technology outside of technology to have a better impact. And here's just an example of one of the reusable cup programs.
58:24 Kiosks, you can drop a cup in, it gets washed automatically, and you can get a couple stars as a reward for that.
58:32 And lastly, a couple great people I met along the way. ClimateAction.tech is a great Slack community.
58:38 And through that group, I learned about Tom Greenwood's "Sustainable Web Design" book.
58:43 It's a great book and I greatly enjoyed it. So I'm going to hand it back to Adrian to wrap us up.
58:55 Thanks very much. Thanks Drew, and really appreciate all the effort you put in to come here
59:01 and present for us and the wonderful work that you've been doing.
59:06 So just to sort of divide this up as a summary: if you're on the development side, you can optimize code, choose faster languages,
59:14 efficient algorithms, do a bunch of things to just speed up your code, speed up the way you're building it.
59:20 Then on the operations side, you can configure systems to have the right instance types, get high utilization, automation,
59:28 things like auto-scaling. Worry about these over-specified requirements, archive and delete data sooner,
59:34 deduplicate, be very careful about times and locations. So all of these different things,
59:40 but make sure you're doing this at scale. And you don't want to spend a lot of time optimizing something that is actually so small
59:47 that you're spending more carbon thinking about it than you're actually saving. So this really makes a huge difference at scale,
59:53 but don't worry about the small things. Just want to mention this was announced yesterday,
1:00:01 the AWS Customer Carbon Footprint Tool; it's gonna come soon.
1:00:06 It includes the full cost that we'll show you how your sustainability investments go down over time. And we've talked a little bit about
1:00:13 sort of choosing regions. So just very, very broad sort of things that you'll see here. Europe is pretty low carbon already.
1:00:20 Like there's really not a lot you can do there to reduce the carbon. I mean, it's worth saving the energy,
1:00:26 but it's largely a low carbon environment. The US, we're buying lots of energy.
1:00:32 Carbon is going down really quickly. So yeah, it's on a good path.
1:00:39 Asia is more problematic. There's a lot of very high carbon sources and it's very hard to buy renewable power there.
1:00:47 And this is all very well known in the industry, but just to think about, if you're going to optimize a workload first
1:00:54 deploy that optimization in Asia, if you have an Asian workload; that will make more difference than deploying it in Europe.
1:01:01 Just think about it in those terms. And when you get access to the tool, you'll be able to see the actual behavior and where your carbon is going.
1:01:08 So that's pretty much it, just a call to action. One of the things you can do when you go back
1:01:14 to your companies is just start asking questions when you're doing planning discussions. When you set goals for 2022,
1:01:20 what are you doing about sustainability? Find areas where there's the biggest opportunity,
1:01:25 to make a difference, really focus on where this is going to be a big thing. And then collect, share
1:01:31 your sustainability optimization learnings, come back to re:Invent next year and tell us the same stories
1:01:37 that we've heard from Starbucks from Drew. What have you done and how have you done it and what techniques have you learnt?
1:01:44 So thank you very much. There are some other sessions you can go look up from re:Invent and we had a hackathon
1:01:51 and a few other special screenings of videos that you can go find as well.
1:01:56 So thank you very much. (audience applauding)