In February of 2011 we relocated back to the East Coast; we packed everything we owned into a big yellow truck and moved it all the way across the country. We made it a road trip, and wound our way from Mountain View, CA to Philadelphia over the course of about two weeks. In the process I ended up with about 4,000 miles' worth of time on the road, which is a lot of time to think and reflect on a great many things...including performance testing.
In my mind I translated our cross-country move into a metaphor for how information systems work. Suppose all of our worldly possessions in the back of the truck were actually just data (instead of our stuff): how would the experience of moving it all relate to how we process data in computing systems?
Storage: Let’s be clear, our stuff was just fine sitting in our house in California. It was happy there. It was in a state of rest. Or at least, most of it wasn't moving about very much. Each item had its own place and we knew where most things were located. Some of our things were in use, some were on display, and other, infrequently used items were packed away. And some of our things were kept in places where we could manipulate them, use them or move them around quite a lot, like our car keys, our clothes, or our dishes. The same holds for computing systems: data is stored across many mechanisms to enable different types of access, depending on frequency of use and on the speed or convenience of retrieval.
Temporary Storage: After we decided to move back east, the first effort was packing everything up into boxes. Some of our things didn't need to be changed very much to be packed and prepared for the move, but they did need to be pulled from their in-house storage state and put into some type of temporary storage state. The act of packing was tough work - it took processing power to manipulate our stuff. Likewise, when we retrieve data from a disk volume or database table, the storage system or database must first put our data into some type of container or dataset object.
Manipulation: As we were packing up our stuff, some of it had to be drastically dismantled to fit into the truck; it had to be changed or manipulated in order to be moved from one temporary state (a box) into another temporary state (the truck). This manipulation of our stuff took a lot of time, just as it does in computing systems that must transform or change the data format. Even a small aggregation or conversion takes a long time when it is applied to a lot of data (stuff). Manipulation of our stuff also included the packing, the compression of stuff into smaller spaces, much like how we compress data before it moves between systems. On the other end of the trip in Philadelphia we had to reverse all the manipulations (assembly, decompression, un-packing), which again took a lot of time and processing power.
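To make the packing and un-packing cost concrete, here is a minimal sketch using Python's standard `zlib` module. The payload and the `timed` helper are invented for illustration; real data will compress differently and the timings depend on the machine.

```python
import time
import zlib


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical payload: repetitive text compresses well, like towels in a box.
payload = b"plates, cups, bowls, towels, books\n" * 50_000

# Packing on the California end.
packed, pack_seconds = timed(zlib.compress, payload)

# Un-packing on the Philadelphia end.
unpacked, unpack_seconds = timed(zlib.decompress, packed)

print(f"original: {len(payload)} bytes, packed: {len(packed)} bytes")
print(f"pack: {pack_seconds:.4f}s, unpack: {unpack_seconds:.4f}s")
```

The smaller size is what makes the movement stage cheaper, but both ends of the trip pay for it in processing time.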
Movement: This stage of the move was easier than packing and loading. While we were moving we didn't take anything out of the truck or boxes - it was left as-is for the long journey to Philadelphia. There were limitations on how fast we could drive and how much stuff would fit into the truck. And not only did we have a truck, we had a car trailer on the back, had the car on the trailer, and inside the car we had a disassembled motorcycle! It was quite a rig. So, we had limitations on capacity and speed, and there were only certain routes we could take because of the size of our truck/trailer combo. We did consider one alternative: our stuff would have travelled faster had we divided all of it into smaller, individual containers and shipped them all separately to our destination.
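The one-big-truck versus many-small-containers trade-off can be sketched with some back-of-the-envelope arithmetic. This is a deliberately idealized model: the function name, the numbers, and the assumption that every parallel stream gets the full bandwidth (each container rides with its own carrier) are all hypothetical.

```python
def transfer_seconds(total_mb, bandwidth_mb_per_s, latency_s, streams=1):
    """Estimate the time to move total_mb split evenly across parallel streams.

    Assumes each stream has its own full bandwidth and that all streams
    start at the same time, so the total time is the time of one stream.
    """
    per_stream_mb = total_mb / streams
    return latency_s + per_stream_mb / bandwidth_mb_per_s


# One big truck: everything travels as a single load.
one_truck = transfer_seconds(total_mb=1000, bandwidth_mb_per_s=10, latency_s=0.5)

# Four smaller containers shipped separately, in parallel.
four_containers = transfer_seconds(
    total_mb=1000, bandwidth_mb_per_s=10, latency_s=0.5, streams=4
)

print(f"one truck: {one_truck:.1f}s, four containers: {four_containers:.1f}s")
```

In practice the streams usually share a route, so the gain is smaller than this model suggests, and splitting and reassembling the load adds manipulation cost of its own.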
Display: As we put our stuff into boxes and then, once we landed, took it back out of the boxes, we had to examine many of the things to decide what to do with them. We had to see things in order to make a decision about them - while they were being manipulated in packing or un-packing, we had to display them to ourselves to decide where they belonged in our new place. You might know this feeling when you look at an unmarked cardboard box that is taped shut and you can’t remember what’s inside it. The only way to know what “data” is inside is to open the box and display the contents.
Reflection during the long cross-country drive led me to recognize four natural states for stuff: storage, manipulation, movement, and display. And as I examined information moving between these different states, I noted that we have the same base concepts in performance engineering: assessing the costs of storage, movement, manipulation, and display of information.
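As a rough illustration of what assessing cost per state might look like, the sketch below tallies the time one transaction spends in each of the four states. The `State` enum, the `cost_by_state` helper, and the trace values are all hypothetical, invented only to show the bookkeeping.

```python
from collections import defaultdict
from enum import Enum


class State(Enum):
    STORAGE = "storage"
    MANIPULATION = "manipulation"
    MOVEMENT = "movement"
    DISPLAY = "display"


def cost_by_state(steps):
    """Sum the cost (here, milliseconds) spent in each state."""
    totals = defaultdict(float)
    for state, cost_ms in steps:
        totals[state] += cost_ms
    return dict(totals)


# Hypothetical trace of one request; the numbers are made up.
trace = [
    (State.STORAGE, 12.0),       # read rows from a database
    (State.MANIPULATION, 3.5),   # serialize rows into a response
    (State.MOVEMENT, 40.0),      # send the response over the network
    (State.MANIPULATION, 2.5),   # parse the response in the client
    (State.DISPLAY, 8.0),        # render the result
]

totals = cost_by_state(trace)
for state in State:
    print(f"{state.value:>12}: {totals.get(state, 0.0):.1f} ms")
```

Note that a state can appear more than once in a single transaction; here manipulation happens on both sides of the movement, just as packing and un-packing did.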
How many times is information manipulated into temporary storage on its way to or from a display state? And is there a conceptual model (one that does not involve trucks, trailers, or boxes) that could help us formulate new language for transactional analysis? In an analysis of end-to-end distributed system performance, data must flow through the system, entering and exiting the four states. “Awesome!” I thought.
But my next thought was: how much did it cost for our stuff to change state? How much did it cost in dollars, and how much did it cost in terms of energy or effort? (e.g. muscles and ibuprofen...lots of ibuprofen...)