
Optimizing Data Centers Through Machine Learning

Google has published a paper outlining its approach to using machine learning, a neural network to be specific, to reduce energy consumption in its data centers. Joe Kava, VP, Data Centers at Google, also has a blog post explaining the background and their approach. Google has one of the best data center designs in the industry and takes its PUE (power usage effectiveness, the ratio of total facility energy to IT equipment energy) numbers quite seriously. I blogged about Google's approach to optimizing PUE almost five years ago! Google has come a long way, and I hope they continue to publish such valuable information in the public domain.



There are a couple of key takeaways.

According to coverage of Joe's presentation at Data Centers Europe 2014:
As for hardware, the machine learning doesn’t require unusual computing horsepower, according to Kava, who says it runs on a single server and could even work on a high-end desktop.
This is a great example of a small data Big Data problem. This neural network is a supervised learning approach: you create a model with certain attributes, then assess and fine-tune the collective impact of those attributes to achieve a desired outcome. Unlike an expert system, which emphasizes an upfront logic-driven approach, a neural network continuously learns from the underlying data and is tested against its predicted outcome. The quality of the outcome does not depend on how large your data set is, as long as it is large enough to include the relevant data points with a good history. The "Big" part of Big Data misleads people into believing they need a fairly large data set to get started. This optimization debunks that myth.
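To make the "single server or high-end desktop" point concrete, here is a minimal sketch of a supervised neural network trained on a few hundred rows using nothing but the Python standard library. Everything in it is an assumption made for illustration: the two input attributes, the synthetic PUE-like formula, the network size, and the learning rate are hypothetical and are not taken from Google's paper.

```python
import math
import random


def make_synthetic_data(n, rng):
    """Hypothetical data: a PUE-like target built from two normalized inputs."""
    rows = []
    for _ in range(n):
        load = rng.random()  # normalized IT load (assumed attribute)
        temp = rng.random()  # normalized outside temperature (assumed attribute)
        pue = 1.1 + 0.2 * temp - 0.1 * load + 0.05 * temp * load  # made-up relationship
        rows.append(([load, temp], pue))
    return rows


def init_network(n_inputs, n_hidden, rng):
    """One hidden layer with small random weights and zero biases."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    """Return the hidden activations (tanh) and the linear output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(net["w1"], net["b1"])
    ]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs=200, lr=0.05):
    """Plain stochastic gradient descent on the loss 0.5 * (output - target)^2."""
    for _ in range(epochs):
        for x, y in data:
            hidden, out = forward(net, x)
            err = out - y  # derivative of the loss with respect to the output
            for j, h in enumerate(hidden):
                # Backpropagate through tanh using w2[j] before it is updated.
                grad_hidden = err * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= lr * err * h
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * grad_hidden * xi
                net["b1"][j] -= lr * grad_hidden
            net["b2"] -= lr * err
    return net


rng = random.Random(0)
data = make_synthetic_data(200, rng)
train_set, test_set = data[:150], data[150:]  # hold out rows to test predictions

net = init_network(n_inputs=2, n_hidden=4, rng=rng)
loss_before = mean_squared_error(net, test_set)
train(net, train_set)
loss_after = mean_squared_error(net, test_set)
print(f"held-out MSE before: {loss_before:.5f}, after: {loss_after:.5f}")
```

The held-out rows play the role of "tested against its predicted outcome": the model is judged on data it did not see during training, and the whole exercise involves only 200 data points.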

The other fascinating part of Google's approach is that they are not only using machine learning to optimize the PUE of current data centers but also planning to use it to design future data centers more effectively.

As with many other physical systems, there are certain attributes that you have operational control over and can change fairly easily, such as cooling systems and server load, but there are quite a few attributes that you only control during the design phase, such as the physical layout of the data center and its climate zone. If you decide to build a data center in Oregon, you can't simply move it to Colorado. These neural networks can significantly help with those upfront, irreversible decisions that are not tunable later on.

One of the challenges with neural networks, or for that matter many other supervised learning methods, is that it takes a lot of time and precision to perfect (train) the model. Joe's description of it as "nothing more than series of differential calculus equations" downplays the model. Neural networks are useful when you know what you are looking for, in this case a lower PUE. In many cases you don't even know what you are looking for.

Google mentions identifying 19 attributes that have some impact on PUE. I wonder how they shortlisted these attributes. In my experience, unsupervised machine learning is a good place to shortlist attributes before moving on to supervised machine learning to fine-tune them. Unsupervised machine learning combined with supervised machine learning can yield even better results, if used correctly.
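As a rough sketch of that two-step idea, the example below first shortlists attributes without ever looking at the target (dropping near-constant and redundant columns), then fits a simple supervised model on what is left. This is one possible way to do it and not a description of how Google chose its 19 attributes. The column names, the thresholds, the synthetic data, and the use of linear regression in place of a neural network are all assumptions chosen to keep the example short.

```python
import random


def variance(col):
    mean = sum(col) / len(col)
    return sum((v - mean) ** 2 for v in col) / len(col)


def pearson(a, b):
    """Pearson correlation between two equally long columns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = (sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)) ** 0.5
    return cov / norm if norm else 0.0


def shortlist(columns, min_variance=1e-4, max_corr=0.95):
    """Unsupervised step: uses only the attributes, never the target."""
    kept = []
    for name, col in columns.items():
        if variance(col) < min_variance:
            continue  # nearly constant, carries no information
        if any(abs(pearson(col, columns[k])) > max_corr for k in kept):
            continue  # almost a duplicate of an attribute we already kept
        kept.append(name)
    return kept


def fit_linear(columns, names, target, epochs=2000, lr=0.5):
    """Supervised step: batch gradient descent on mean squared error."""
    n = len(target)
    weights = {name: 0.0 for name in names}
    bias = 0.0
    for _ in range(epochs):
        errors = [
            bias + sum(weights[k] * columns[k][i] for k in names) - target[i]
            for i in range(n)
        ]
        for k in names:
            weights[k] -= lr * sum(e * columns[k][i] for i, e in enumerate(errors)) / n
        bias -= lr * sum(errors) / n
    return weights, bias


rng = random.Random(1)
n = 300
load = [rng.random() for _ in range(n)]
temp = [rng.random() for _ in range(n)]
columns = {  # hypothetical attributes
    "it_load": load,
    "outside_temp": temp,
    "it_load_copy": [2.0 * v + 0.01 * rng.random() for v in load],  # redundant sensor
    "stuck_sensor": [0.5] * n,  # constant reading
}
pue = [1.1 + 0.2 * t - 0.1 * l for l, t in zip(load, temp)]  # made-up relationship

selected = shortlist(columns)
weights, bias = fit_linear(columns, selected, pue)
print("shortlisted attributes:", selected)
print("fitted weights:", weights, "bias:", bias)
```

In a real setting the unsupervised step would more likely be clustering or a dimensionality reduction technique, and the supervised step would be the neural network itself. The point of the sketch is only the ordering: prune the attribute list first, then spend training effort on the attributes that survive.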
