<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[Posts tagged 'AWS' — Typing with mittens on]]></title><description><![CDATA[Rachel Evans writes about tech, Denmark, and probably other stuff]]></description><link>https://rachelevans.org/blog/tag/aws/</link><image><url>https://rachelevans.org/blog/assets/favicon.png</url><title>Posts tagged &apos;AWS&apos; — Typing with mittens on</title><link>https://rachelevans.org/blog/tag/aws/</link></image><generator>RSS for Node</generator><lastBuildDate>Wed, 18 Feb 2026 09:07:06 GMT</lastBuildDate><atom:link href="https://rachelevans.org/blog/tag/aws/rss/" rel="self" type="application/rss+xml"/><pubDate>Wed, 18 Feb 2026 09:07:06 GMT</pubDate><copyright><![CDATA[Copyright 2026 Rachel Evans]]></copyright><language><![CDATA[en-gb]]></language><managingEditor><![CDATA[Rachel Evans]]></managingEditor><webMaster><![CDATA[Rachel Evans]]></webMaster><ttl>180</ttl><item><title><![CDATA[Not helpful, “aws s3 sync”]]></title><description><![CDATA[The aws s3 sync "--metadata-directive" option: what does it do? Does it work? AWS themselves aren't clear on the matter...]]></description><link>https://rachelevans.org/blog/not-helpful-aws-s3-sync/</link><guid isPermaLink="false">b8c55e82c0e8</guid><category><![CDATA[technology]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rachel Evans]]></dc:creator><pubDate>Mon, 24 Apr 2017 08:54:24 GMT</pubDate><content:encoded><![CDATA[<p>I have a requirement to change the Content-Type / Cache-Control headers of a load of objects in S3. At the API level, there&#x27;s no way of modifying the metadata of an existing object — rather, you create a new object with the desired metadata. 
Of course, if this new object is in the same bucket and has the same key as the old object, it&#x27;ll effectively overwrite it. You don&#x27;t have to re-upload your data if you don&#x27;t want to — you can copy the data from the old object to the new one.</p><p>Instead of using the API directly, various tools already exist which encapsulate this behaviour. For example, the aws command line offers “aws s3 sync”. So I&#x27;m wondering if “aws s3 sync” might be the tool for the job.</p><p>But then we come to this gem in the help text:</p><blockquote>--metadata-directive (string) Specifies whether the metadata is copied from the source object or replaced with metadata provided when copying S3 objects. Note that if the object is copied over in parts, the source object&#x27;s metadata will not be copied over, no matter the value for --metadata-directive, and instead the desired metadata values must be specified as parameters on the command line. Valid values are COPY and REPLACE. If this parameter is not specified, COPY will be used by default. If REPLACE is used, the copied object will only have the meta- data values that were specified by the CLI command. Note that if you are using any of the following parameters: --content-type, content-lan- guage, --content-encoding, --content-disposition, --cache-control, or --expires, you will need to specify --metadata-directive REPLACE for non-multipart copies if you want the copied objects to have the speci- fied metadata values.</blockquote><p>Apart from being horrible to read, there&#x27;s a big problem with this. Note the phrases “Note that if the object is copied over in parts” and “for non-multipart copies”: the behaviour varies depending on whether or not multipart copies are in use.</p><p>So, <em>are</em> multipart copies in use?</p><p>Well, we&#x27;re not told. 
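</p><p>(An aside: at the API level, at least, the behaviour is explicit. Here&#x27;s a sketch of the CopyObject arguments for copying an object over itself with replaced metadata, shaped for boto3&#x27;s copy_object; the bucket, key and header values below are placeholders, not anything from a real system:)</p>

```python
def replace_copy_args(bucket, key, *, content_type=None, cache_control=None):
    """Build CopyObject arguments that copy an object over itself,
    replacing (not copying) its metadata."""
    args = {
        "Bucket": bucket,
        "Key": key,
        "CopySource": {"Bucket": bucket, "Key": key},
        "MetadataDirective": "REPLACE",
    }
    if content_type is not None:
        args["ContentType"] = content_type
    if cache_control is not None:
        args["CacheControl"] = cache_control
    return args

# With real credentials, you would then run something like:
#   import boto3
#   boto3.client("s3").copy_object(**replace_copy_args(
#       "my-bucket", "some/key",
#       content_type="text/html", cache_control="max-age=300"))
# (A single-call copy only works up to 5GB; beyond that it has to
# be a multipart copy.)
```

<p>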
The S3 maximum size for non-multipart uploads is 5GB, so we know that for objects over 5GB, multipart uploads <em>must</em> be used, because that&#x27;s the only option. But for smaller objects?</p><p>¯\_(ツ)_/¯</p><p>So the help text explaining <code>--metadata-directive</code> tells us that the behaviour of this option can vary, depending on an implementation detail which is not revealed to us.</p><p>Here&#x27;s my attempt to reword that help text to be (a) clearer, and (b) more honest:</p><pre>--metadata-directive (string)

Valid values are COPY (which is the default), and REPLACE. Specifies
whether the metadata is copied from the source object (&quot;COPY&quot;), or
replaced with metadata provided on the command line (&quot;REPLACE&quot;) when
copying S3 objects.

Note that &quot;COPY&quot; does not work if multipart uploads are used, which is
definitely the case for objects larger than 5GB, and might be the case
for smaller objects too — good luck!</pre><p>Not helpful.</p>]]></content:encoded></item><item><title><![CDATA[The AWS S3 Inventory Service: don't end the destination prefix with “/”]]></title><description><![CDATA[If you end the destination prefix with "/", then you'll end up with an unusable manifest.]]></description><link>https://rachelevans.org/blog/the-aws-s3-inventory-service-dont-end-the-destination-prefix-with-slash/</link><guid isPermaLink="false">8630dfb13c88</guid><category><![CDATA[technology]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rachel Evans]]></dc:creator><pubDate>Fri, 21 Apr 2017 10:00:16 GMT</pubDate><content:encoded><![CDATA[<div><p>This started out as a longer blog post, but then a lot of it boiled down to “read the fine documentation, Rachel”. So here&#x27;s the short version.</p><p>Launched in December 2016, S3&#x27;s Inventory Service is an alternative to using the ListObjects / ListObjectsV2 APIs for enumerating the objects in a bucket. You put an inventory configuration to your bucket (broadly speaking: which bit of S3 to list, where to put the results, and how often to do it), then sit back and wait for S3 itself to do all the hard work, so you don&#x27;t have to. 
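</p><p>For what it&#x27;s worth, “putting an inventory configuration” is a single API call. Here&#x27;s a sketch of the configuration document, shaped as the dict you&#x27;d pass to boto3&#x27;s put_bucket_inventory_configuration (the IDs, ARN and prefix are placeholders):</p>

```python
def inventory_configuration(config_id, dest_bucket_arn, prefix):
    """Build an InventoryConfiguration for boto3's
    put_bucket_inventory_configuration. Refuses a trailing "/" on
    the destination prefix, since S3 inserts its own "/" separators
    when building the output keys."""
    if prefix.endswith("/"):
        raise ValueError("don't end the destination prefix with '/'")
    return {
        "Id": config_id,
        "IsEnabled": True,
        "IncludedObjectVersions": "Current",
        "Schedule": {"Frequency": "Weekly"},
        "Destination": {
            "S3BucketDestination": {
                "Bucket": dest_bucket_arn,  # e.g. "arn:aws:s3:::my-inventories"
                "Format": "CSV",
                "Prefix": prefix,           # e.g. "s3-inventories"
            }
        },
    }
```

<p>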
Great!</p><p>The <a href="http://docs.aws.amazon.com/AmazonS3/latest/dev/storage-inventory.html#storage-inventory-location" rel="noopener ugc nofollow" target="_blank">documentation states where the inventory output goes</a>:</p><pre><span><em>destination-prefix</em>/<em>source-bucket</em>/<em>config-ID</em>/<em>YYYY-MM-DDTHH-MMZ</em>/manifest.json<br/><em>destination-prefix</em>/<em>source-bucket</em>/<em>config-ID</em>/<em>YYYY-MM-DDTHH-MMZ</em>/manifest.checksum</span></pre><p>And for the sake of brevity, let&#x27;s cut to the chase: if you end your prefix with a “/” (either accidentally, or because like me you think you&#x27;re being smart whereas in fact you simply haven&#x27;t read the docs — good going, Rach), then due to a bug in the S3 Inventory service, your inventory will not be usable.</p><p>Specifically, I ended up with objects in S3 with keys like this:</p><pre>s3-inventories//media/rachel-test-inventory/data/6eabc318-5ee0-41d9-b32b-a12b40a6f271.csv.gz
s3-inventories//media/rachel-test-inventory/data/b7dff5ea-c83d-4879-bc2a-0d0ced298356.csv.gz</pre><p>whereas the manifest I got contained this (line breaks added for clarity):</p><pre>{
  &quot;files&quot;: [
    {
      &quot;key&quot;: &quot;s3-inventories/media/rachel-test-inventory/
                data/6eabc318-5ee0-41d9-b32b-a12b40a6f271.csv.gz&quot;,
      &quot;size&quot;: 16486333,
      &quot;MD5checksum&quot;: &quot;3c94f6eed1fc3c2d057c098f355afffc&quot;
    },
    {
      &quot;key&quot;: &quot;s3-inventories/media/rachel-test-inventory/
                data/b7dff5ea-c83d-4879-bc2a-0d0ced298356.csv.gz&quot;,
      &quot;size&quot;: 20147436,
      &quot;MD5checksum&quot;: &quot;f0b39e0d85f0f5fb11bc5be73ecc26cf&quot;
    }
  ]
}</pre><p>The problem being that those double-slashes in the keys have become single slashes. On a Linux-ish filesystem, this would make no difference; on S3, it makes all the difference. The keys given in the manifest simply do not exist.</p><p>tl;dr: There&#x27;s a bug in the S3 Inventory service which means that manifests are broken if the destination prefix ends with “/”. Solution: don&#x27;t end your destination prefixes with “/”.</p></div>]]></content:encoded></item><item><title><![CDATA[Save money and be tidy with s3-upload-cleaner]]></title><description><![CDATA[Amazon Web Services (AWS) S3 is a popular, highly-scalable object storage service. It's used by a lot of big companies, including the one I work for. But it's very easy gradually to accumulate billable "invisible" storage.]]></description><link>https://rachelevans.org/blog/save-money-and-be-tidy-with-s3-upload-cleaner/</link><guid isPermaLink="false">7043b8b5332e</guid><category><![CDATA[technology]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rachel Evans]]></dc:creator><pubDate>Tue, 01 Dec 2015 12:56:02 GMT</pubDate><content:encoded><![CDATA[<p>Amazon Web Services (AWS) S3 is a popular, highly-scalable object storage service. It&#x27;s used by a lot of big companies, <a href="http://www.computerweekly.com/news/2240219866/Case-study-How-the-BBC-uses-the-cloud-to-process-media-for-iPlayer" rel="noopener ugc nofollow">including the one I work for</a>.</p><p>Getting data — especially large files — into S3 uses a mechanism called Multipart Uploads. For example, to upload a multi-gigabyte file to S3, you might make a sequence of calls like so:</p><ol><li>CreateMultipartUpload</li><li>UploadPart (1 .. n times)</li><li>CompleteMultipartUpload</li></ol><p>On the “complete” call, S3 assembles your parts together to form a single object, that then appears in the bucket. 
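</p><p>Sketched against boto3&#x27;s method names, that sequence looks roughly like this (a simplified sketch: no error handling, and a real uploader would stream each part rather than hold them all in memory):</p>

```python
def multipart_upload(s3, bucket, key, bodies):
    """CreateMultipartUpload / UploadPart / CompleteMultipartUpload,
    in order. On failure, a real uploader should call
    abort_multipart_upload rather than leave the parts behind."""
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    for number, body in enumerate(bodies, start=1):
        resp = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload_id,
                              PartNumber=number, Body=body)
        parts.append({"PartNumber": number, "ETag": resp["ETag"]})
    return s3.complete_multipart_upload(
        Bucket=bucket, Key=key, UploadId=upload_id,
        MultipartUpload={"Parts": parts})
```

<p>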
Or, you can call “AbortMultipartUpload” to abandon it, and throw away the parts.</p><p>So what&#x27;s the catch?</p><p>The catch is that it&#x27;s very easy to forget to ever call either CompleteMultipartUpload or AbortMultipartUpload. And if you neither complete nor abort the upload, then any parts you have uploaded just sit around in S3, waiting. Forever. It&#x27;s relatively hard to <em>see</em> those parts, mind — they don&#x27;t show up in the regular bucket listing. But they are there, and they are costing you money.</p><p>So what&#x27;s the solution?</p><p>Enter <code>s3-upload-cleaner</code>. Simply put, it scans your buckets looking for stale (that is, started a long time ago) incomplete multipart uploads — the premise being, if you haven&#x27;t completed an upload after, say, a week, then you never will — and aborts them. Thus, periodically running s3-upload-cleaner keeps your account&#x27;s multipart uploads under control, and helps keep your bill down.</p><p>(I&#x27;m a little surprised that this isn&#x27;t a native feature of S3, and to be honest, I expect that one day, it will be.)</p><p>Here it is running for a single bucket, and finding nothing to clean:</p><pre>$ sudo apt-get install nodejs npm
$ npm install s3-upload-cleaner aws-sdk
$ export AWS_ACCESS_KEY_ID=…
$ export AWS_SECRET_ACCESS_KEY=…
$ nodejs ./node_modules/s3-upload-cleaner/example/minimal.js
Running cleaner
Clean bucket my-bucket-name
Bucket my-bucket-name is in location eu-west-1
Bucket my-bucket-name is in region eu-west-1
Running cleaner for bucket my-bucket-name
$</pre><p>The code comes with a minimal bootstrap script, though you are encouraged to use your own if you wish.</p><p>To call out a few of its features:</p><ul><li>it&#x27;s multi-region aware (it will attempt to process all of your buckets, no matter what region they are in);</li><li>it can be configured to process only some buckets, or only some regions, or only some keys;</li><li>the threshold for what counts as “stale” is configurable — the minimal bootstrap script uses 1 week as the cutoff age;</li><li>when a stale upload is found, it emits logging data in json form;</li><li>it can be run in “dry run” mode, where all the scanning and logging is performed, but the abort itself is not.</li></ul><p>Finally, here&#x27;s an example of one of its log entries:</p><pre>[
  {
    &quot;event_name&quot;: &quot;s3uploadcleaner.clean&quot;,
    &quot;event_timestamp&quot;: &quot;1448495889.529&quot;,
    &quot;bucket_name&quot;: &quot;my-bucket-name&quot;,
    &quot;upload_key&quot;: &quot;bigfile.mpg&quot;,
    &quot;upload_initiated&quot;: &quot;1447888220000&quot;,
    &quot;upload_storage_class&quot;: &quot;STANDARD&quot;,
    &quot;upload_initiator_id&quot;: &quot;arn:aws:iam::123456789012:user/SomeUser&quot;,
    &quot;upload_initiator_display&quot;: &quot;SomeUser&quot;,
    &quot;part_count&quot;: &quot;135&quot;,
    &quot;total_size&quot;: &quot;2831189760&quot;,
    &quot;dry_run&quot;: &quot;true&quot;
  }
]</pre><p>s3-upload-cleaner typically only takes a few seconds to run, and doesn&#x27;t need to be run very often, so this makes it perfect to run via a scheduled AWS Lambda function.</p><p>You can find the <a href="https://github.com/rvedotrc/node-s3-upload-cleaner" rel="noopener ugc nofollow">code on github</a> and the <a href="https://www.npmjs.com/package/s3-upload-cleaner" rel="noopener ugc nofollow">package on npm</a>.</p>]]></content:encoded></item><item><title><![CDATA[Why I gave the AWS keynotes a miss]]></title><description><![CDATA[The keynote presenters at AWS re:Invent need to focus more on the audience, and less on big launches.]]></description><link>https://rachelevans.org/blog/why-ill-pass-on-the-aws-keynotes/</link><guid isPermaLink="false">4b9c8166bf3</guid><category><![CDATA[AWS]]></category><category><![CDATA[presentations]]></category><dc:creator><![CDATA[Rachel Evans]]></dc:creator><pubDate>Thu, 13 Nov 2014 10:08:40 GMT</pubDate><media:content url="https://rachelevans.org/blog/content/images/2014/11/aws-reinvent-keynote.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://rachelevans.org/blog/content/images/2014/11/aws-reinvent-keynote.jpg" alt="Why I gave the AWS keynotes a miss"><p>Fancy going to a three-hour-long presentation which is in parts self-contradictory, and includes half a dozen product launches with no coherent target audience? I promise, you <em>will</em> find some of it boring.</p><p>No? Not tempted?</p><p>Me neither.</p><p>Last year I went to the <a href="https://reinvent.awsevents.com/" rel="noopener ugc nofollow">Amazon Web Services “re:Invent” conference</a> in Las Vegas. It&#x27;s five days of a rather tightly packed mix of certification sessions, training, product launches, all sorts of “breakout” sessions (some by Amazon themselves, and some by guest speakers — i.e. 
customers), the vendor expo, and of course what I&#x27;ll somewhat euphemistically call the “after-hours” events.</p><p>The breakout sessions cover a wide range of topics — individual AWS technologies, both at basic and advanced levels; and how customers are using AWS, in all sorts of diverse industry sectors. Each session is about 45 minutes long, and you can pick and choose which ones to go to. Great!</p><h2>Keynotes</h2><p>The “keynotes” though, are a very different affair. On two successive days, the keynotes (yes, plural: two different keynotes) are each 90 minutes long.</p><p>For me, a keynote should be a summary of the key themes, typically 30–45 minutes long (I&#x27;m sure I&#x27;ve seen a definition of the word, to this effect), so Amazon&#x27;s keynotes immediately ring at least two alarm bells:</p><ul><li>Why are there two keynotes? It&#x27;s not just two opportunities to see the same speech: it&#x27;s two different presentations. Why are there two different summaries of the key themes?</li><li>Why is each one so long? <em>Each</em> keynote seems to be two to three times longer than is typical, and if the keynotes are <em>different</em> then this means that in total it&#x27;s effectively up to six times longer.</li></ul><p>If your summary is three hours long, then you need to summarise your summary.</p><p>But the explanation of course is that AWS “keynotes” <em>aren&#x27;t</em> keynotes. Yes, they include a summary of some themes; but then they also include some more in-depth look at those themes, and the launches of some new AWS products that tie in with those themes. They&#x27;re really an all-in-one mega-presentation, split into two halves.</p><h2>Information overload</h2><p>Where Amazon get it right with the “breakout” sessions, they get it very wrong with their keynotes: they include a diverse range of subjects all in the same presentation (e.g. 
software deployment, and also financial accounting) so that it&#x27;s highly unlikely that <em>anyone</em> is going to find the whole presentation interesting. And even if it <em>was</em> all interesting, ninety minutes is <em>too damn long</em>.</p><p>While they do at least break it into two halves, on successive days, that doesn&#x27;t go anything like far enough.</p><p>I&#x27;d love it if Amazon would have the keynotes and product launches follow the same model as the breakout sessions:</p><ul><li>The keynote (yes, <em>one</em> keynote), 45 minutes long, which can be a <em>summary</em> of the themes, and <em>brief</em> “teaser” announcements of the product launches;</li><li>Product launch presentations, probably around three separate sessions, each around 45 minutes long, grouped by approximate theme.</li></ul><p>Launching two separate products related to software development? That&#x27;s one session. Two products related to finance? That&#x27;s another. Two unrelated product launches left over? Well, lump &#x27;em together in a third session. (Not ideal, but still way better than the current approach).</p><p>Then we&#x27;ll be free to pick and choose which sessions we go to. Each session will be more focussed on one audience, who will therefore be more <em>engaged</em>. And the worst case is that you go to one of the “mixed” product launch sessions, and find one half interesting, and the other half not. Time wasted: 20 minutes (compared to an hour or two, currently).</p><h2>Summing up</h2><p>The keynote sessions are at odds with the style of the rest of AWS re:Invent.
Whereas most of the week is dynamic, punchy, focussed, and fast-moving, the keynotes come across as over-long self-indulgent ramblings, which include what <em>should</em> be gems of interest, but hidden amongst far too much other content seemingly designed to help the attendees catch up on the sleep lost due to after-hours indulgences.</p><p>By restructuring the keynotes and product announcements into a series of separate sessions, the Amazon “big name” presenters can up their game, and reinvent themselves as the focus of something interesting — so that we might just be tempted enough to go along and listen.</p>]]></content:encoded></item><item><title><![CDATA[Managing AWS CloudFormation templates using stack-fetcher]]></title><description><![CDATA[At the AWSUKUG meetup in September I talked about Video Factory, and a tool we've created for managing stack templates.]]></description><link>https://rachelevans.org/blog/managing-aws-cloudformation-templates-using-stack-fetcher/</link><guid isPermaLink="false">4d798d406fd0</guid><category><![CDATA[technology]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rachel Evans]]></dc:creator><pubDate>Tue, 28 Oct 2014 17:14:47 GMT</pubDate><media:content url="https://rachelevans.org/blog/content/images/2014/10/bridge-building.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://rachelevans.org/blog/content/images/2014/10/bridge-building.jpg" alt="Managing AWS CloudFormation templates using stack-fetcher"><p>Last month at the <a href="http://www.meetup.com/AWSUGUK/events/194314272/" rel="noopener ugc nofollow">AWSUKUG meetup</a> I talked about Video Factory, and there was a little section there where I spoke about <a href="http://www.slideshare.net/rvedotrc/bbc-iplayer-bigger-better-faster/53" rel="noopener ugc nofollow">the tooling that we use</a> to manage all of our components.
One of the tools, “stack-fetcher”, generated quite a bit of interest from the audience, and there was interest in open-sourcing it. I definitely want to do this — but we&#x27;re not quite there yet.</p><p>For now, though, I can talk about where stack-fetcher is right now, and what direction I want to take it in.</p><h2>The problem space</h2><p>“<a href="http://aws.amazon.com/cloudformation/" rel="noopener ugc nofollow">AWS CloudFormation</a> gives developers and systems administrators an easy way to create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion,” says the documentation. As a developer, you do this by creating a template (JSON which defines one or more desired <a href="http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-template-resource-type-ref.html" rel="noopener ugc nofollow">resources</a>), then submitting that template to CloudFormation — either via the API, or via something which wraps the API (e.g. the <a href="https://console.aws.amazon.com/cloudformation/home" rel="noopener ugc nofollow">web console</a>). Then CloudFormation goes and creates or updates your stack to match your template.</p><p>As a developer who loves automation and consistency, this leaves you with several problems:</p><ul><li>How do I generate the template JSON?</li><li>How do I generate the other JSON required by the stack (e.g. parameter values)?</li><li>If I was to push that JSON to CloudFormation — i.e. 
apply the change — how do I know what changes I&#x27;m actually pushing?</li><li>Can I push some changes but not others?</li><li>Once I know what I want to push, how do I do so?</li></ul><h2>A little BBC Media Services history</h2><p>To put all of the above into a specific story: in BBC Media Services, we found during the development of Video Factory that we were managing more and more stacks, and by the start of this year we had something like 100 stacks to manage in each of our three environments.</p><p>By January 2014, we had a system for generating the JSON, but different people ran the relevant tools in different ways, therefore sometimes yielding differing results. And once the JSON had been generated, we had no way of knowing in what way it was different from the stack&#x27;s existing template, so we didn&#x27;t know what we were actually changing. And finally, we had no consistent approach for actually updating the stacks with the new template — mostly we were using the web console, but not always in the same way. And even then: it&#x27;s a <em>web console</em>, so that&#x27;s just awful from a productivity and automation point of view.</p><p>Thus, stack-fetcher was created, to address all of the above problems.</p><h2>The workflow</h2><p>Once you&#x27;ve updated your source files, the workflow to update a stack consists of three steps:</p><ul><li>Run “stack-fetcher”. This generates a set of three files: <em>current</em>, <em>generated</em>, and <em>next</em>.</li><li>Use your favourite diff/merge tool to compare the <em>current</em>, <em>generated</em> and <em>next </em>files, making whatever changes you wish to <em>next</em>.</li><li>Run “stack-updater” to push <em>next</em> into CloudFormation.</li></ul><h2>The workflow in action</h2><p>Here&#x27;s a demo of a simple change, illustrating the basic workflow, and some of stack-fetcher&#x27;s strengths.</p><p>Before running stack-fetcher, we have two stacks, “resource” and “component”.
The first diff has already been applied: a queue was added to the resource stack. These screenshots show the second diff being applied: to modify the IAM policy defined in the “component” stack, such that access is granted to the queue in the resource stack.</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-before.png" class="kg-image" alt="screenshot of a terminal session"/><figcaption>Before running stack-fetcher</figcaption></figure><p>We then run <em>stack-fetcher </em>(in this example, “int” is the environment in question — integration). <em>stack-fetcher</em> retrieves the existing stack, generates the desired template, and compares the two. The summary shows “resource: same” (all in sync), and “component: DIFFERENT (20 lines)” (there are 20 lines of differences).</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-output.png" class="kg-image" alt="screenshot of a terminal session"/><figcaption>The output of stack-fetcher</figcaption></figure><p>stack-fetcher has generated three template files per stack: <em>current, generated, </em>and <em>next</em>. Here we see the three files compared, using vimdiff:</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-three-files-top.png" class="kg-image" alt="screenshot of a terminal session"/></figure><p>and the bottom half of the same files:</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-three-files-bottom.png" class="kg-image" alt="screenshot of a terminal session"/></figure><p>You can see that “generated” (in the middle column) has some sections that “current” doesn&#x27;t — these are for the policy change we&#x27;re trying to make. But you can also see that “current” has some lines that “generated” doesn&#x27;t.
This is because in this example, the stack in CloudFormation started off not in sync with our local copy (for example, maybe someone applied a change but neglected to commit the corresponding source).</p><p>So now we modify “next” (the right-hand file) to match whatever changes we want to apply. In this example we choose to pull in the new lines, but elect not to remove the extra, unexpected ones:</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-merge.png" class="kg-image" alt="screenshot of a terminal session"/><figcaption>Merging the desired template into “next”, in the right-hand column</figcaption></figure><p>After saving these changes (remember, we didn&#x27;t modify “current” or “generated” — only “next”), we run <em>stack-updater</em>:</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-apply-1.png" class="kg-image" alt="screenshot of a terminal session"/><figcaption>Running stack-updater (first time)</figcaption></figure><p><em>stack-updater</em> now warns us that it has detected a new parameter on the template (“MattressFailQueueArn” in this example): it adds this parameter, with the default value from the template, to the description file; then invites us to check this and edit the description file if we wish.</p><p>In this case the default is fine, so we just run <em>stack-updater</em> again:</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-apply-2.png" class="kg-image" alt="screenshot of a terminal session"/><figcaption>Running stack-updater (second time)</figcaption></figure><p>Now <em>stack-updater</em> very clearly shows us the diffs between <em>current</em> and <em>next</em>: that is, if we elect to proceed, <em>these are the changes that we&#x27;re actually about to make</em>.</p><p>After confirming that we&#x27;re OK with this, <em>stack-updater</em> 
applies these changes, using the CloudFormation UpdateStack API:</p><figure class="kg-card kg-image-card"><img src="https://rachelevans.org/blog/content/images/2014/10/stack-fetcher-apply-3.png" class="kg-image" alt="screenshot of a terminal session"/><figcaption>Applying the changes using stack-updater</figcaption></figure><p><em>stack-updater</em> polls the stack&#x27;s status, waiting for it to reach a terminal state (i.e. not “in progress”). The stack events are displayed as they occur.</p><p>In this case the stack update completes successfully, and <em>stack-updater</em>&#x27;s work is done.</p><h2>In more detail</h2><p>stack-fetcher is a name given to a collection of scripts, one of which is itself called “stack-fetcher”. The other script that is intended to be manually invoked is “stack-updater”. There are other scripts, but one of the goals of stack-fetcher is to invoke and orchestrate those other scripts so that the user doesn&#x27;t generally have to think about them.</p><h3>stack-fetcher</h3><p>stack-fetcher&#x27;s job is to generate a set of three outputs:</p><ul><li><em>current</em> is the existing stack, fetched from CloudFormation</li><li><em>generated</em> is the stack that you want, generated from your codebase</li><li><em>next</em> is what you&#x27;re going to push back to CloudFormation using “stack-updater”</li></ul><p>When stack-fetcher runs, <em>next</em> is generated simply as a copy of <em>current</em> — that is, if you don&#x27;t edit the <em>next</em> file, then you won&#x27;t push any changes.</p><p>To make <em>generated</em>, stack-fetcher runs a series of scripts. 
Currently, this step is rather BBC-specific: we invoke <em>./generate-templates</em> with PYTHON_LIB set to point to part of the stack-fetcher codebase; if there&#x27;s a <em>transform</em> script, then the json is then filtered through this; then there&#x27;s a <em>cosmos-cloudformation-postproc</em> script which post-processes the json in various ways — primarily, providing defaults for the stack&#x27;s parameters.</p><p>To make <em>current</em>, stack-fetcher needs to know what stack name it should work with — and again, currently calculating this stack name is fairly BBC-specific. Once entered, the stack name is remembered via the <em>./stack_names.json</em> file, so you don&#x27;t have to calculate or enter it again. Once the stack name is known, the existing stack template and descriptor are fetched, and saved as <em>current</em>.</p><p>After this, stack-fetcher <em>normalises</em> both <em>current</em> and <em>generated</em>. The purpose of the normalisation is partly to make the files more readable, but also to get rid of differences that are meaningless. As well as whitespace reformatting and sorting object keys, the normalisation also includes CloudFormation-specific elements, such as sorting parameters, tags and outputs; removing empty arrays, if that would mean the same thing; and even re-ordering statements within IAM Policies.</p><p><em>next </em>always starts off as a copy of <em>current</em>, so that by default no changes are pushed.</p><p>Finally, stack-fetcher compares <em>current</em> and <em>generated</em> and shows a simple summary: they&#x27;re either the “SAME” or “DIFFERENT” (or, if the stack doesn&#x27;t exist yet, “NEW”); then shows some help text describing what to do next.</p><h3>diff/merge</h3><p>The help text displayed by stack-fetcher suggests using <em>vimdiff</em> to compare and edit the files, but of course you can use whatever tools you wish. 
The goal of this step is to update <em>next</em> to reflect what you want pushed back into CloudFormation (whilst leaving the <em>current</em> and <em>generated</em> files unchanged).</p><p>You may wish to simply review that <em>generated</em> is exactly what you want, then copy <em>generated</em> over <em>next</em> (this is probably what you want, ideally); or, you can cherry-pick, and perform more complex merges.</p><h3>stack-updater</h3><p>Once you&#x27;ve updated <em>next</em> to be as desired, you invoke <em>stack-updater</em>, with exactly the same arguments as you did for <em>stack-fetcher</em>.</p><p>If there are any differences between the set of parameters declared in the stack template, and the set of parameters passed in the stack descriptor, then stack-updater shows those differences (e.g. “You&#x27;re passing a parameter called X but it doesn&#x27;t exist”), automatically applies corrections (e.g. removing the no-longer-existent parameter), then stops, so that you can check its changes before re-running stack-updater.</p><p>Assuming the stack already exists, then stack-updater now diffs <em>current</em> against <em>next</em> — that is, it shows you the changes you&#x27;re about to push. It also displays the differences between the stack&#x27;s parameter defaults, and the actual parameter values you&#x27;re passing, so you can check which ones you&#x27;re overriding. 
(If the stack doesn&#x27;t currently exist, then this step is skipped, and the subsequent confirmation step reminds you that you&#x27;re about to create the stack).</p><p>It then asks for confirmation to proceed, and if you say yes, the change is pushed using the CloudFormation “update stack” (or “create stack”) API, and then stack-updater polls the stack status, waiting for completion.</p><p>Finally there&#x27;s another BBC-specific step, wherein the stack can be registered in Cosmos, our deployment manager.</p><h2>Dependencies</h2><p>stack-fetcher is written in ruby, and uses the aws-sdk gem.</p><h2>Benefits</h2><p>By using this tool, we have realised several benefits:</p><ul><li>speed: Using this tool is much quicker than using the several tools that we used before. There are fewer commands to type, with fewer options to remember. And probably most importantly, you never have to leave your terminal.</li><li>consistency: By automating more of the process, and by normalising the output, we now achieve more consistency: by which I mean between developers, between environments, and between components.</li><li>understanding: This tool makes it very obvious what changes you&#x27;re about to apply to live (or whatever environment you&#x27;re updating) — no more blind pasting of a load of json and hoping for the best — which means fewer mistakes.</li></ul><p>All of which means: this tool has helped us to be more productive.</p><h2>Next steps</h2><p>We need to separate out the BBC-specific parts from the rest, so that we can offer this tool to a wider audience.</p><p>I&#x27;d like to make the “generation” phase more uniform: run a series of executables (bash, ruby, whatever — the tool should not care), where the first executable receives null input, and each subsequent tool filters the output of the previous one. 
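For illustration only — none of this exists yet — the driving loop for such a chain might look something like this in ruby, where the filter names are hypothetical:

```ruby
require 'open3'

# Run a chain of filter executables. The first receives empty (null)
# input on stdin; each subsequent filter receives the previous
# filter's stdout. The final output is returned.
def run_chain(filters)
  filters.reduce("") do |input, filter|
    output, status = Open3.capture2(filter, stdin_data: input)
    raise "filter #{filter} failed" unless status.success?
    output
  end
end

# Hypothetical usage — these filter scripts don't exist:
#   run_chain(%w[./make-template ./customise-for-env ./fill-defaults])
```

The appeal of this shape is that the tool itself stays trivial: each filter can be written in any language, and adding an environment-specific step is just adding one more executable to the list.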
So for example you might have filters which do: make the basic template; customise it for this environment; fill in parameter defaults.</p><p>I don&#x27;t have any news yet of <em>when</em> this might happen, but I certainly <em>want</em> it to happen. Please drop me a line via a comment or <a href="https://twitter.com/rvedotrc" rel="noopener ugc nofollow">on twitter</a> if you have thoughts on this — I&#x27;d love to hear your feedback.</p>]]></content:encoded></item><item><title><![CDATA[Personal highlights from the AWS Enterprise Summit]]></title><description><![CDATA[Yesterday I attended the AWS Enterprise Summit in London — I've chosen my highlights, and reflected on the summit as a whole.]]></description><link>https://rachelevans.org/blog/personal-highlights-from-the-aws-enterprise-summit/</link><guid isPermaLink="false">e491a8cdc60a</guid><category><![CDATA[technology]]></category><category><![CDATA[AWS]]></category><dc:creator><![CDATA[Rachel Evans]]></dc:creator><pubDate>Wed, 22 Oct 2014 22:32:59 GMT</pubDate><content:encoded><![CDATA[<div><p>Yesterday I attended the <a href="https://aws.amazon.com/aws-summit-2014/enterprise-summit-oct/" rel="noopener ugc nofollow">AWS Enterprise Summit in London</a>. I&#x27;ve already written about how <a rel="noopener" href="https://rachelevans.org/blog/amazon-web-services-fails-at-diversity/">it was very poor, from a diversity perspective</a>. But, it wasn&#x27;t all bad: some of the content was rather good...</p><h2>All hail the snail</h2><p>The first customer presentation was given by <a href="https://twitter.com/jodbod" rel="noopener ugc nofollow">John O&#x27;Donovan</a> of the <a href="http://www.ft.com/" rel="noopener ugc nofollow">Financial Times</a>. 
He told a fascinating and engaging story of the changing world in which they found themselves: with <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/6" rel="noopener ugc nofollow">print distribution in decline</a>, they needed to refocus on the net — and on future platforms and <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/10" rel="noopener ugc nofollow">devices yet to come, whatever they are</a>. John&#x27;s presentation had a great balance of <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/26" rel="noopener ugc nofollow">information</a>, <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/20" rel="noopener ugc nofollow">insight</a>, and <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/16" rel="noopener ugc nofollow">humour</a>.</p><p>A particular highlight for me — and, judging by the reaction, I&#x27;m going to guess for many other engineers in the audience — was <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/32" rel="noopener ugc nofollow">Chaos Snail</a>. “Like Chaos Monkey, but more chilled”, its job is to slow down I/O on certain instances, to test how software reacts to such degraded conditions. I asked John later if this tool has already been, or will be, open sourced — he says they&#x27;ve had a few requests for this, so yes they will. Good news!</p><p>John also talked about <a href="http://www.slideshare.net/AmazonWebServices/going-cloud-first-at-the-ft/35" rel="noopener ugc nofollow">Tagbot</a>, which locates and terminates untagged instances (“My team loves turning stuff off”, he said). Sounds like a blend between Chaos Monkey and Conformity Monkey.</p><h2>Maximum support</h2><p>After lunch we heard from <a href="https://twitter.com/brentjaye" rel="noopener ugc nofollow">Brent Jaye</a>, VP of AWS Support. 
He emphasised the value of Trusted Advisor as a way of identifying problems, and how they&#x27;re keen on building quick-fix facilities into the web console. (For example: if a volume hasn&#x27;t been backed up for a long time, then highlight this as a potential problem, and show a “backup” button right there).</p><p>“We&#x27;re in the business of you spending less money with us”, he said — which has a nice ring to it.</p><p>Brent also spoke of the value of integrating AWS and the customer&#x27;s support system; and of using Trusted Advisor and AWS Support not just via the console, but by their respective APIs. (John O&#x27;Donovan would, I&#x27;m sure, agree: earlier on he said “We don&#x27;t buy a product unless it has an API”. +1 on that).</p><p>Finally Brent spoke of the importance of engaging with AWS Support <em>early</em>, not just when there&#x27;s a problem.</p><h2>Auntie adapts</h2><p>Next up, <a href="https://twitter.com/rob_shield" rel="noopener ugc nofollow">Robert Shield</a> from <a href="http://www.bbc.co.uk/iplayer/" rel="noopener ugc nofollow">BBC iPlayer</a> spoke about Video Factory: how it uses AWS, the benefits realised over the previous platform, and how the BBC&#x27;s Operations function has adapted with the use of the cloud.</p><p>(I work with Robert, on the same team — I presented <a href="http://www.slideshare.net/rvedotrc/bbc-iplayer-bigger-better-faster" rel="noopener ugc nofollow">the Video Factory story</a> to the AWS UK User Group last month. 
So of course it should be assumed that I&#x27;m biased :-) )</p><p>However, it was obvious that the audience enjoyed it: Rob talked of the <a href="http://www.slideshare.net/AmazonWebServices/evolving-operations-for-bbc-i-player/4" rel="noopener ugc nofollow">benefits of smaller, simpler components</a>; of <a href="http://www.slideshare.net/AmazonWebServices/evolving-operations-for-bbc-i-player/6" rel="noopener ugc nofollow">how much data Video Factory shifts into S3 every day</a>; and of the importance of <a href="http://www.slideshare.net/AmazonWebServices/evolving-operations-for-bbc-i-player/11" rel="noopener ugc nofollow">automation</a> and <a href="http://www.slideshare.net/AmazonWebServices/evolving-operations-for-bbc-i-player/12" rel="noopener ugc nofollow">consistency</a>.</p><p>By re-architecting for smaller, simpler, more easily understandable components, he said, each part also became more reliable, and thus <a href="http://www.slideshare.net/AmazonWebServices/evolving-operations-for-bbc-i-player/19" rel="noopener ugc nofollow">people were more willing to look after the system</a>.</p><h2>News from the cloud</h2><p>The last customer presentation was from <a href="https://www.linkedin.com/pub/chris-birch/0/524/96b" rel="noopener ugc nofollow">Chris Birch</a> of <a href="http://www.news.co.uk/what-we-do/" rel="noopener ugc nofollow">News UK</a>. Like John and Robert before him, Chris told an entertaining and engaging story.</p><p>Much of News UK&#x27;s business is about Sunday publications, and combined with their “paywall” (he didn&#x27;t call it that, but that&#x27;s what the rest of us know it as), this meant that their traffic was sharply spiked around Sunday mornings. And the old system could handle <a href="http://www.slideshare.net/AmazonWebServices/news-uk-our-journey-to-cloud/6" rel="noopener ugc nofollow">only 17 transactions per second</a>! 
But of course things were <em>much</em> faster on the cloud.</p><p>Part of Chris&#x27; talk was about the importance and the difficulty of assessing the Total Cost of Ownership — which is needed to <a href="http://www.slideshare.net/AmazonWebServices/news-uk-our-journey-to-cloud/10" rel="noopener ugc nofollow">make the business case for moving to the cloud</a>. One thing I found very interesting was the idea that an application&#x27;s “App Book” (documentation on what it is, etc) should also document the app&#x27;s TCO.</p><p>There was also a nice section where Chris said that <a href="http://www.slideshare.net/AmazonWebServices/news-uk-our-journey-to-cloud/14" rel="noopener ugc nofollow">48% of their instances had no tags</a>, so it wasn&#x27;t clear what the instances were doing. However Chris also said that “It&#x27;s really boring switching stuff off”, which I have to say I <em>completely </em>disagree with!</p><h2>The two-pizza team</h2><p>Two of the speakers (sorry, I forget which ones exactly) mentioned the idea of the “two-pizza team”. Basically: a team which requires more than two pizzas to feed will have communication problems. I like this concept — it&#x27;s a good rule of thumb that definitely matches my own experience.</p><h2>And the others…</h2><p>You may notice that I only wrote about four of the ten speakers. That&#x27;s because the other speakers very much failed to hold my attention. I enjoyed the customer talks, all of which were interesting and engaging, and got a great reaction from the audience; but the talks from the partners, and from Amazon themselves (with the exception of Brent), seemed to be aimed very much at CxO level — at “suits”, one might say — and as such really weren&#x27;t my thing at all.</p><p>So I saw it as a summit of two opposing audiences: CxO versus techies. 
If the event were larger, it would make more sense to split it into two events, or into two tracks within one.</p><p>As it is, it seems to me that most people would have found half of the talks less than engaging — but it&#x27;s only a one-day event, so that&#x27;s not such a burden.</p><h2>Wrapping up</h2><p>Overall I really enjoyed the day — the CxO-style talks weren&#x27;t for me, and I didn&#x27;t explore the partner and sponsor stands; but the customer presentations were great, and I had a good chat or two with AWS staff, and I loved swapping stories with the other attendees.</p><p>Oh, and there was <a href="https://twitter.com/pipoe2h/status/524611579145637889" rel="noopener ugc nofollow">highly practical swag</a>!</p><p>I think I&#x27;ll be back — maybe not every time, but it was a good day, and I&#x27;d be happy to do it again sometime. See you there!</p></div>]]></content:encoded></item></channel></rss>