AWS S3 .NET Client High Memory Usage


Originally published by Indy Singh on June 21st 2018

Reducing AWS S3 .NET client LOH allocations by 98%

Contents

Problem discovery
Why is it a problem?
Introducing the best magic number — 81,920
Idle hands
Just one more thing
TLDR — Give me the good stuff
Footnotes

Problem discovery

One of the things we do at Codeweavers is help people find their next vehicle. That usually involves customers seeing what vehicle they are buying — I mean, would you buy a car without seeing what it looks like? The application that holds this responsibility is the worst offender for obscene amounts of allocations, time spent in GC, and generally eating RAM like the Cookie Monster eats, well… cookies.

Every now and then we like to take a memory dump of this application from our production environment. We have done this enough times that we have automated the most common diagnostic steps we take and bundled them into a little tool¹ called ADA (Automated Dump Analysis). If you are interested, you can find the tool here and all the code discussed in this article here.

One of the analysers we run dumps all the byte[] arrays found on the Large Object Heap (LOH). After running that analyser against our eight-gigabyte memory dump, we found several hundred byte[] arrays with a length of 131,096 or 131,186. Well, that is pretty odd. Opening some of the files in Notepad just presented us with lots of random characters. Throwing the scientific method out of the window for a second, I decided to mass-rename all the dumped byte[] arrays to *.jpg, and hey presto, some of the files were now displaying thumbnails! On closer inspection, around 50% of the files were images. The other 50% failed to open as an image at all. Opening a handful of the non-image files in Notepad showed that they all had a line similar to this right at the beginning of the file:

0;chunk-signature=48ebf1394fcc452801d4ccebf0598177c7b31876e3fbcb7f6156213f931b261d

Okay, this is beginning to make a little more sense.
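As a back-of-the-envelope check (my own sketch, not from the original article): the two observed lengths differ by exactly 90 bytes, which is precisely the per-chunk framing overhead of AWS Signature Version 4 streaming ("aws-chunked") uploads, where each chunk is sent as `<hex-length>;chunk-signature=<64-hex-sha256>\r\n<payload>\r\n`:

```csharp
using System;

// Sketch: per-chunk framing overhead of an aws-chunked (SigV4 streaming) upload.
// Each chunk on the wire looks like:
//   <hex-length>;chunk-signature=<64-hex-sha256>\r\n<payload>\r\n
int ChunkedOverhead(int payloadLength)
{
    int hexDigits = payloadLength.ToString("x").Length;    // 131096 -> "20018" -> 5 digits
    int signatureHeader = ";chunk-signature=".Length + 64; // 17 + 64 = 81
    return hexDigits + signatureHeader + 2 + 2;            // + header CRLF + trailing CRLF
}

Console.WriteLine(ChunkedOverhead(131_096)); // 5 + 81 + 4 = 90, and 131,096 + 90 = 131,186
```

So the 131,186-byte buffers are plausibly the 131,096-byte image buffers re-wrapped with the chunk signature framing before upload.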
The byte[] arrays that have a length of 131,096 are pure images. The byte[] arrays that are not images have a length of 131,186 and have a chunk-signature line before the rest of the contents. I would guess the signature is a SHA256 hash of the contents.

Before we go any further, it is worth establishing how busy this application is with image processing. All of our image processing is distributed across our farm using AWS SNS and SQS. Using CloudWatch Metrics we can see that easily (chart not reproduced here). Okay, so fairly busy.

It is worth noting that before any performance-centric work is carried out, always establish how often the code is hit and what the current costs are. If a code path has a high cost (e.g. takes twenty seconds) but is only hit once a day, then it is not worth investigating. However, if the same code path is hit a lot (e.g. a million times a day) then it is definitely worth investigating.

At this point I had two culprits in mind. We have already established that the application in question does a lot of image processing.
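To make that prioritization rule concrete, a quick illustrative calculation (the numbers below are hypothetical, not measurements from the article):

```csharp
using System;

// Sketch: total daily cost = per-hit cost x hits per day.
// A "fast" hot path can dwarf a "slow" rare path in aggregate.
double slowButRare = 20.0 * 1;          // 20 s path, hit once a day   -> 20 s/day
double fastButHot  = 0.005 * 1_000_000; // 5 ms path, hit 1M times/day -> 5,000 s/day
Console.WriteLine($"{slowButRare} s/day vs {fastButHot} s/day");
```

The hot path costs roughly 250 times more per day, which is why hit frequency has to be measured before picking an optimization target.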

This article first appeared on hackernoon.com.