Idempotency – a key to better code
Recently I found a term that intrigued me. Idempotency.
From the web I saw this definition that I liked: In computing, an idempotent operation is one that has no additional effect if it is called more than once with the same input parameters.
Much of my computing experience has been dedicated to the transformation of data. I have written countless routines that transform data in one way or another, from constructing schedule applications to assigning inventory to orders, and I support many applications that manage data with a variety of syntaxes and data structures. Invariably, the sections of code that keep me up at night and cause me the most angst in support are those that were developed without Idempotency incorporated into their design.
When you write a routine that operates on a data set and do not consider what happens if that routine runs again, it will eventually cause you grief in a support call. I learned long ago that operations of this sort must be designed so that their effect on the data does not accumulate. They must be ‘smart’ enough not to cause errors when they are run again, because in complex systems the programmer may not control when the routines run (think jobs or a multi-user UI), and even when the programmer does, the routine may end up being called again for other reasons.
If the routine does not adhere to the concept of Idempotency, it can be very tricky to understand how the data got into the state it is in when the user calls. These are often my most difficult troubleshooting issues. So I read with keen interest about a concept defined well enough to help me keep it in the forefront when designing new applications.
Some examples where Idempotency is critical are: netting inventory, re-scheduling activities, parsing address data into fields, and in some cases, adding records to a record set. In all these examples, the code needs to be aware of whether the transformation operation was already performed.
Adding Records to a data set: Let’s say you are accumulating records in a data set from various data sources, like names from each department’s employee databases. If you have already appended the finance department’s data to the master table, then appending it again will cause duplicates. Obviously there are many techniques to prevent duplicates in a database, but let’s explore how Idempotency can help. If the appending routine is designed with Idempotence, it can be run anytime, and as many times as you like without adverse effect (like duplicates). To incorporate this into the append routine, ensure your data set has a text field to hold the name of the source of the data. I usually put in the name of the action query or stored procedure that creates the records. Then the first part of the routine can query the data set to see if this action has been run previously, and if so, either terminate or remove the records before executing the append of new records. In this way, running the routine multiple times for the finance department will replace the finance department’s names in the master table.
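The replace-then-append approach described above might be sketched like this in Python with an in-memory SQLite database. The table name `master`, the column names, and the source label `append_finance` are hypothetical, chosen only for illustration.

```python
import sqlite3

def append_from_source(conn, source, names):
    """Remove any rows previously written under `source`, then append `names`.

    Because the old rows are deleted first, running this twice with the
    same input leaves the table in the same state as running it once.
    """
    with conn:  # delete and insert are committed together, or not at all
        conn.execute("DELETE FROM master WHERE source = ?", (source,))
        conn.executemany(
            "INSERT INTO master (name, source) VALUES (?, ?)",
            [(name, source) for name in names],
        )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE master (name TEXT, source TEXT)")

finance_names = ["Ada", "Grace"]  # made-up sample data
append_from_source(conn, "append_finance", finance_names)
append_from_source(conn, "append_finance", finance_names)  # safe to repeat

row_count = conn.execute("SELECT COUNT(*) FROM master").fetchone()[0]
print(row_count)
```

The other option mentioned above, terminating when the source has already been loaded, would swap the DELETE for a check that returns early. The replace variant has the advantage that a re-run also picks up corrections in the source data.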
Netting Inventory: When dealing with inventory, I typically read the value from the system of record, and netting happens there. However, let’s say you need to carry a book inventory value in your local system, and net that inventory as the user enters adjustments to it every day. The netting logic can be complex. It begins with the starting inventory, and adjustments are accumulated and applied to become a new starting inventory value. If the adjustments are applied to the starting inventory more than once, the value will drift away from reality, making it unusable. To prevent this, and apply the concept of Idempotency, I carry three inventory fields: Inventory (both the starting inventory and the resulting adjusted inventory), Original Inventory, and Adjustments to Inventory. When the adjustment field changes via the UI, I replace the Original Inventory field with the contents of the Inventory field. After this I can apply the transformation (repeatedly) to the entire data set to calculate Inventory = Original + Adjustment. Additionally, I time-stamp the record when the transformation is applied, and when the Adjustment is entered. The UI can compare the Adjustment time-stamp to the Transformation time-stamp to see how to treat the Adjustment, either as a replace or an accumulation. If the Adjustment time-stamp is later than the Transformation time-stamp, the Transformation has not yet been run to use this Adjustment. In this case, the UI might accumulate any new user adjustment into the field. If the transformation has already been run, then the UI would replace the Adjustment.
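A minimal sketch of the three-field scheme, in Python, could look like the following. The class and function names are hypothetical, and a simple counter stands in for real time-stamps so that the ordering of events is unambiguous; a production system would use actual date-time values.

```python
from dataclasses import dataclass
from itertools import count

_clock = count(1)  # assumption: a logical clock standing in for time-stamps

@dataclass
class InventoryRecord:
    inventory: int        # starting value and, after netting, the result
    original: int         # snapshot of inventory taken when an adjustment is entered
    adjustment: int = 0
    adjusted_at: int = 0      # when the adjustment was last entered
    transformed_at: int = 0   # when netting was last applied

def enter_adjustment(rec, amount):
    """UI side: record a user adjustment."""
    # Pending means netting has not yet consumed the current adjustment.
    pending = rec.adjusted_at > rec.transformed_at
    rec.original = rec.inventory
    if pending:
        rec.adjustment += amount   # accumulate onto the unapplied adjustment
    else:
        rec.adjustment = amount    # previous one was already applied: replace
    rec.adjusted_at = next(_clock)

def apply_netting(rec):
    """Transformation: safe to run any number of times."""
    rec.inventory = rec.original + rec.adjustment
    rec.transformed_at = next(_clock)

rec = InventoryRecord(inventory=100, original=100)
enter_adjustment(rec, 5)
apply_netting(rec)
apply_netting(rec)        # repeat run does not add the 5 again
after_first = rec.inventory
enter_adjustment(rec, -3)
enter_adjustment(rec, 1)  # entered before netting ran, so it accumulates to -2
apply_netting(rec)
after_second = rec.inventory
```

The key point is that `apply_netting` computes the result from `original` and `adjustment` rather than adding to the current `inventory`, so repeating it cannot make the value drift.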
Aspen SCM (MIMI) Scheduling routines: Another area where Idempotency is important is in some scheduling techniques in Aspen SCM Plant Scheduler. Sometimes it is useful to move all the activities off one or more reactor facilities to a temporary holding place (similar to a queue) so they can be re-scheduled one by one on the best reactor at the most appropriate time. This is a powerful technique that lets the routine prioritize activities to meet customer demand and maximize capacity utilization on the reactors. However, if Idempotency is not considered during the design of this routine, the results can be devastating to the quality of the schedule. Let’s say the routine fails during its re-scheduling portion. The reactors are partially filled, and the temporary holding place is loaded with activities. Since multiple reactors are the source of the activities, the temporary holding facility would be overloaded in time, having activities that extend beyond the end of the scheduling horizon. Executing the routine again when starting in this state would erase all of the activities on the temporary holding place, thus erasing much of the schedule. Incorporating Idempotency into the routine would mean considering the path to recovering these activities in the case of a failure or re-running the routine.
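To make the recovery idea concrete, here is a toy Python model. It is not Aspen SCM code and uses none of its API: reactors are plain lists, activities are hypothetical (name, hours) pairs, and "best reactor" is simplified to "least loaded". The one design choice that matters is that the routine never clears the holding place at the start; whatever a failed run left there is treated as work still to be scheduled.

```python
def reschedule(reactors, holding, fail_after=None):
    """Move every activity to `holding`, then place them one at a time.

    `fail_after` is a test hook (an assumption of this sketch) that
    simulates a crash after that many placements.
    """
    # Step 1: add to the holding place instead of erasing it, so leftovers
    # from an earlier failed run are kept.
    for activities in reactors.values():
        holding.extend(activities)
        activities.clear()
    holding.sort()  # fixed order, so the outcome does not depend on history

    # Step 2: place activities one by one on the least-loaded reactor.
    placed = 0
    while holding:
        if fail_after is not None and placed >= fail_after:
            raise RuntimeError("simulated failure during re-scheduling")
        activity = holding[0]
        best = min(reactors, key=lambda r: (sum(h for _, h in reactors[r]), r))
        reactors[best].append(activity)
        holding.pop(0)
        placed += 1

reactors = {"R1": [("A", 4), ("B", 2)], "R2": [("C", 3)]}
holding = []
try:
    reschedule(reactors, holding, fail_after=1)  # crashes part-way through
except RuntimeError:
    pass
reschedule(reactors, holding)  # re-run recovers the stranded activities
recovered = {r: list(a) for r, a in reactors.items()}
reschedule(reactors, holding)  # a further run changes nothing
```

In this model, no activity is lost after the simulated failure, and running the routine again on a finished schedule reproduces the same schedule.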
It turns out there are several related terms that are interesting as well. Again from the web:
NULLIPOTENT: If an operation has no side effects, like purely displaying information on a web page without any change in a database (in other words you are only reading the database), we say the operation is NULLIPOTENT. All GETs should be nullipotent. Otherwise, use POST.
IDEMPOTENT: A message in an email messaging system is opened and marked as “opened” in the database. One can open the message many times but this repeated action will only ever result in that message being in the “opened” state. This is an idempotent operation.
NON-IDEMPOTENT: If an operation always causes a change in state, like POSTing the same message to a user over and over, resulting in a new message sent and stored in the database every time, we say that the operation is NON-IDEMPOTENT.
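The three definitions can be illustrated with a small Python sketch of the mailbox example. The `Mailbox` class and its method names are hypothetical, written only to show the difference in behavior when each operation is repeated.

```python
class Mailbox:
    def __init__(self):
        self.messages = []  # each message: {"body": str, "opened": bool}

    def read(self, index):
        """Nullipotent: returns data and changes nothing."""
        return self.messages[index]["body"]

    def mark_opened(self, index):
        """Idempotent: the first call changes state, repeats change nothing more."""
        self.messages[index]["opened"] = True

    def post(self, body):
        """Non-idempotent: every call stores another message."""
        self.messages.append({"body": body, "opened": False})

box = Mailbox()
box.post("hello")
box.post("hello")        # same input, yet the state grows again
box.mark_opened(0)
box.mark_opened(0)       # no further effect
body = box.read(0)       # no effect at all
message_count = len(box.messages)
```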
Reading about and exploring these terms has reinforced and put a name to a concept that through experience I have come to understand has major consequences. Now that I can name the concept, hopefully I can be more concise in explaining to others the need this concept addresses, and write better code too.
Jim Piermarini – Profit Point Inc.
Profit Point is helping several large chemical manufacturers upgrade their many Aspen SCM scheduling models with the goal of achieving long-term supportability in the new Aspen architecture of ver 8.5. An Aspen SCM (MIMI) Upgrade is no small undertaking, but we have been helping people manage, support, and enhance their scheduling models for over 20 years.
I have seen many Mimi scheduling models over the last 20 years, in many different businesses, and it is still amazing to me how well these scheduling models work. Their superior applicability is primarily due to the creativity of their original modelers and their efforts to incorporate all the important aspects of the plants they schedule, and most that I have seen have remained relevant and useful all these years. Their longevity is due in no small part to the flexibility of the scheduling environment, Aspen SCM (AKA Mimi). This flexibility allows for many minor changes to the tool as equipment characteristics change or are upgraded, as the business needs change, or indeed as the scheduler changes. The new version retains the flexibility that has kept Aspen SCM scheduling models relevant today.
In previous version changes, Aspen SCM has always been backward compatible, meaning that with nominal effort a newer Aspen SCM version would open an older version’s scheduling model. This was true up to ver 8.x, released earlier this year. With this version, the older scheduling models, especially those that were developed in house, will not be able to function properly without a more substantial effort. Version 8.x brings a new XML-based architecture and with it a new look and feel, more compatible with today’s applications. In addition, it has some useful new features that can make scheduling easier. More information is available here: https://www.aspentech.com/products/aspen-plant-scheduler/ Aspen SCM remains, in my opinion, the best tool for the job of scheduling plants of all types and sizes. This new version is no break from that long history of being the best; indeed, it has just been made even better.
With plants around the world, our customers trust Profit Point to upgrade their effective scheduling models to the latest version of Aspen SCM (Mimi) so they can enjoy many more years of effective scheduling at their plants.
We love doing this work. Call us if you are facing the same upgrade challenge; we may be able to help get you going.