Automating Infrastructure De-provisioning

At my last job I was asked to consolidate and automate the de-provisioning of enterprise infrastructure, primarily Virtual Machines. I believe that web apps are the way of the future, so I decided to achieve this feat with a “simple” Django application, though given the requirements listed below, it would be anything but simple.

Requirements

  • Authenticate users against Active Directory
  • Live search VMware vSphere for Virtual Machines
  • Automatically remove user-specified VMs
  • Clean up records of deleted VMs in other systems
  • Allow additional tasks to be added at a later date
  • Allow users to search for in-progress and completed decommissions
  • Have a full-featured API
  • Be fast

Technologies Used

  • Python 3.6
  • Django
  • Django Rest Framework
  • Django Channels
  • JavaScript
  • jQuery
  • Docker
  • Redis

Here’s a quick rundown of the functionality:

The Login Page displaying an “Invalid Permissions” error

The user is authenticated against AD via a custom Django Authentication Backend.  The backend also checks that the authenticated user has permission to modify certain objects in vSphere.
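
A custom Django backend like this implements the standard authenticate()/get_user() interface.  As a minimal sketch of that shape (the AD bind and the vSphere permission check are injected as hypothetical stand-ins here, since the real backend's internals aren't shown; the actual class would subclass django.contrib.auth.backends.BaseBackend and talk to LDAP and vSphere directly):

```python
# Sketch of a custom authentication backend in the style of
# django.contrib.auth.backends.BaseBackend.
class ADBackend:
    def __init__(self, bind, has_vsphere_access):
        # Injected so the sketch can run without a live AD/vSphere.
        self._bind = bind
        self._has_vsphere_access = has_vsphere_access

    def authenticate(self, request, username=None, password=None):
        """Return a user identifier on success, None on failure."""
        if not username or not password:
            return None
        if not self._bind(username, password):        # AD credential check
            return None
        if not self._has_vsphere_access(username):    # vSphere permission check
            return None
        return username
```

Returning None for a user who authenticates but lacks vSphere permissions is what produces the “Invalid Permissions” error shown above.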

Decommission Request with three VMs

The primary purpose of the application is to allow users to build “Decommission Requests”.  A Decommission Request is composed of one or more VMs and associated options.  Users can search vCenter for VMs to add to the Request.  As a safety check, the application displays additional information about each VM, including its IP address and operating system.
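
The shape of a Decommission Request can be sketched roughly like this (in the real app these would be Django models; the field names here are assumptions):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class VirtualMachine:
    name: str
    ip_address: str        # shown to the user as a safety check
    operating_system: str  # determines which decommission steps run

@dataclass
class DecommissionRequest:
    requested_by: str
    vms: List[VirtualMachine] = field(default_factory=list)
    options: dict = field(default_factory=dict)
    status: str = "queued"  # queued -> in progress -> complete
```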

After submitting the request, the application will queue the Decommission Request and redirect the user to the “Your Requests” page.

Live display of tasks applied to a VM

Decommissions are dequeued and performed by Task Workers, additional Python processes that run in a separate container.  By default, five workers are spawned per Worker container, but this value can be modified with environment variables, and the Worker container can be scaled up seamlessly.
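
Reading the worker count from the environment with a default of five might look like the following (the variable name WORKER_COUNT is an assumption):

```python
import os
import multiprocessing

def worker_count(default=5):
    # Each Worker container spawns this many task-worker processes.
    return int(os.environ.get("WORKER_COUNT", default))

def start_workers(target):
    # 'target' is the dequeue-and-decommission loop each worker runs.
    procs = [multiprocessing.Process(target=target, daemon=True)
             for _ in range(worker_count())]
    for p in procs:
        p.start()
    return procs
```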

A single worker can decommission one VM at a time, so as long as a Decommission Request contains no more VMs than there are available workers, all of its VMs can be decommissioned concurrently.  Decommissions are built and executed as a series of steps that depends on which OS the VM is running.
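
That step-building approach can be sketched as a mapping from OS family to an ordered list of step functions (the step names here are illustrative guesses, not the app's actual steps):

```python
def power_off(vm): return f"powered off {vm}"
def unjoin_domain(vm): return f"unjoined {vm} from the AD domain"
def remove_from_dns(vm): return f"removed {vm} from DNS"
def delete_from_vsphere(vm): return f"deleted {vm} from vSphere"

# Hypothetical: Windows VMs get an extra domain-unjoin step;
# the remaining steps are shared.
STEPS = {
    "windows": [power_off, unjoin_domain, remove_from_dns, delete_from_vsphere],
    "linux":   [power_off, remove_from_dns, delete_from_vsphere],
}

def decommission(vm_name, os_family):
    # Execute each step in order, collecting status messages for the UI.
    return [step(vm_name) for step in STEPS[os_family]]
```

Because each step only needs the VM and its OS family, new cleanup tasks can be added later by appending another function to the relevant list.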

As a Worker process is executing a decommission, the status of that decommission is displayed, live, on the UI via WebSockets.  If there are warnings or errors during a step, these are also displayed.

At any time, a user can immediately cancel a decommission, but steps that have already been performed are not rolled back.

The Application’s Full-featured API displaying Decommissioned Machines


Initial Thoughts on Solid

I recently stumbled upon Solid, a technology that claims to “re-orient the web” by allowing people to own their own data.  I think it’s a great idea, but there are problems with it.

In a few words, Solid stores all your social media data (Facebook posts, YouTube comments, etc.) in a self-hosted Solid Pod.  Participating social media sites just link to your data.  If you delete a piece of data from your Pod, it disappears everywhere, because every participating site stores only a reference to your data.

The biggest barrier to entry for a common person is the self-hosting aspect.  To use Solid correctly, in my opinion, you must host your own Solid Pod.  Solid is all about having complete control over your own data and allowing a Solid provider to manage your Pod partially defeats the purpose of the technology.

Other problems with Solid that must be addressed before widespread adoption include Scalability, Caching, and Duplication concerns.

Scalability

If you are a popular social media user, and all your social media data is stored within your Solid Pod, your Pod may have difficulty serving consumers of your data.  The reference Solid Server implementation is in Node.js, and Node is pretty fast, but I don’t recall there being anything that allows your Pod to scale.  A Pod that can serve 25k requests per second may be more than enough for someone with an intermediate level of internet presence and popularity, but for celebrities it is insufficient.

Caching

To combat the scalability problem, social media sites may cache your Solid data.  This poses another problem.  Cached data is now no longer controlled by the owner of the data and defeats the idea behind Solid.  How long is data cached? One minute? One hour?  The longer the data is cached, the less control you have over it.

Duplication

Duplication concerns are similar to caching concerns, but have to do with purposely nonconforming sites duplicating your content.  Having control over your own data because it is all served from a server you control is possible in theory, but any site that can see your data can simply duplicate it.  What prevents a bad actor from copying any Solid data it can reference and preventing that data from ever being erased?

So, what’s great about Solid?

All your data in one place.  Think about it!  Solid allows you absolute control over what applications do to your data.  Imagine a single, correct repository of all your medical data.  Every time you go to a new doctor, you grant read/write access to your entire medical history.  Records can easily be shared between hospitals and providers because everything is in one place.

Imagine a future where all your governmental identity data is in one place.  Starting a new job and need two forms of ID? Just grant them read access to your Driver’s License and Birth Certificate.  New Driver’s License?  The DMV already has write access to your License, so it can just update the current version.


A Code Golf Discord bot that executes user submitted code

I can only write really great or really bad JavaScript for so long, so when I need to take a break, I’ve been spending some time on the Code Golf Stack Exchange site. Code Golf is the art of solving a small computing problem in the shortest amount of code possible, measured in bytes.  Since participants tend to use single-letter variable names and often remove all the spaces, Code Golf code is typically nearly unreadable, and this appeals to me greatly. I liked it so much that I created a #code channel in a friend group Discord server and posted a couple of challenges.  Then I thought about how cool it would be to have a bot just compile and execute the code that people post.

So I made one.

It’s written in Python 3.7 because Python is the best language, and runs in a Docker container for ease of use.  It’s actually very simple and just needs a Discord bot token and volume access to the host machine’s Docker socket.

When a user posts a message that uses Discord’s markdown syntax highlighting,

the bot spins up the appropriate docker container, executes the code, and prints the output to the same channel.
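
Extracting the language tag and the code body from such a message can be done with a regular expression; a minimal sketch:

```python
import re

# Matches Discord's fenced code blocks: ```lang\ncode```
CODE_BLOCK = re.compile(r"```(\w+)\n(.*?)```", re.S)

def parse_code_block(message):
    """Return (language, code) from the first fenced block, or None."""
    m = CODE_BLOCK.search(message)
    if not m:
        return None
    return m.group(1), m.group(2)
```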

Execution is relatively safe.  The container cannot use more than 50MB of memory, it runs for a maximum of 10 seconds, and container storage is removed after execution.
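
Those limits map directly onto docker run flags; a sketch that shells out to the Docker CLI (the helper names are mine, not the bot's):

```python
import subprocess

def build_docker_cmd(image, mem_limit="50m"):
    # --rm removes container storage after execution;
    # --memory caps the container's RAM.
    return ["docker", "run", "--rm", f"--memory={mem_limit}", image]

def run_sandboxed(image, timeout=10):
    """Run the image and return its output, enforcing the time limit."""
    try:
        result = subprocess.run(build_docker_cmd(image),
                                capture_output=True, text=True,
                                timeout=timeout)
        return result.stdout
    except subprocess.TimeoutExpired:
        return "(execution timed out)"
```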

Supported languages can be added and configured by editing a config file (example below) and restarting the container.  User-posted code is substituted in for the {} placeholder.
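
A hypothetical config in this style (the image names, language keys, and command templates are all assumptions) might map each language tag to an image and a command template:

```python
# Hypothetical language config: each entry names the Docker image to run
# and a command template; user-posted code replaces the {} placeholder.
LANGUAGES = {
    "python": {"image": "python:3.7-alpine", "command": 'python -c "{}"'},
    "javascript": {"image": "node:alpine", "command": 'node -e "{}"'},
}

def build_command(language, code):
    entry = LANGUAGES[language]
    return entry["image"], entry["command"].replace("{}", code)
```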

The discord-codebot image is available on Docker Hub.