In this blog entry I want to focus on one piece of predictive analytics: predictive model maintenance. It is not a favorite topic, but it has to be done, and it should be considered before you even build your models.
You have to make sure that the predictive models you build are still performing the way you need them to a month, six months, or a year later, so you can keep making those important business decisions. Model maintenance provides an opportunity to evaluate model accuracy and update the model if it is no longer accurate.
But let’s begin with planning for model maintenance. What does that look like?
Building Your Models
First, develop your models with maintenance in mind. Pick an algorithm that is fairly straightforward and easy to update. This means you may have to compromise a little between a model that is as accurate as possible and a model that is reasonably uncomplicated and easier to implement. Don’t choose the most complicated model just because it is one-tenth of a percent more accurate if it could be much harder to maintain! It is better to make those trade-offs in the beginning; a simpler model can really speed up your time to market for model implementation and cut down on maintenance time. Also, try to table-drive as much as you can (that is, keep things like variable weights in tables rather than hard-coded in the scoring logic), as this will make archiving and maintenance much easier.
When developing your models, make sure the whole team is involved. Often, the people who build the models are not the ones who implement and maintain them. It is important to keep communication open between teams so that future maintenance goes smoothly. You want the folks who know your tools, systems and environments to advise during the model development process. You also want to make sure that model owners are kept up to date on any changes to the data collection process so they can adjust the models accordingly.
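As a rough illustration of what table-driving can look like, here is a minimal Python sketch. It assumes a logistic-style score whose weights live in a coefficient table rather than in the scoring code. The table layout, the variable names (`tenure_years`, `num_products`) and the weights are all hypothetical, chosen only for the example; in practice the table would more likely sit in your database.

```python
import math

# Hypothetical coefficient table. Each row is one weight for one model version,
# so a refresh becomes a data change rather than a code change.
COEFFICIENT_TABLE = [
    {"model_version": "v1", "variable": "intercept", "weight": -1.0},
    {"model_version": "v1", "variable": "tenure_years", "weight": 0.5},
    {"model_version": "v1", "variable": "num_products", "weight": 0.25},
]


def load_weights(table, model_version):
    """Collect the weights for one model version into a dict."""
    return {
        row["variable"]: row["weight"]
        for row in table
        if row["model_version"] == model_version
    }


def score(record, weights):
    """Apply the weights to one record and return a value between 0 and 1."""
    linear = weights.get("intercept", 0.0)
    for variable, weight in weights.items():
        if variable != "intercept":
            linear += weight * record[variable]
    return 1.0 / (1.0 + math.exp(-linear))


weights_v1 = load_weights(COEFFICIENT_TABLE, "v1")
example_score = score({"tenure_years": 2, "num_products": 0}, weights_v1)
```

With this layout, updating a weight means adding rows for a new model version, and the old rows stay behind as an archive of what the previous version did.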
Lastly, save all of your model building materials! There is a huge time and knowledge loss if the original builder leaves and takes the code and build knowledge. You almost always have to do a rebuild when this happens.
Model Reviews and Ongoing Maintenance
So when is the best time for model maintenance? Ideally, schedule regular reviews and refreshes of your models. You can review all of your models at once, or set a separate schedule for each model. Your data has to stay reliable, or your model performance will decrease.
Models should also be evaluated for maintenance after:
- Changes to any model inputs
- Changes to user interfaces
- Changes in the quality of the data
- Changes in environment
- Changes in macroeconomic conditions
- Changes in competitor influences
- Changes to the population
- Changes to customer profiles
- Changes to product offerings
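One way to notice that a model input or the scored population has shifted is to compare its distribution at build time with its distribution today. The sketch below uses the population stability index (PSI), a common industry measure that this post does not itself prescribe. The bin counts and the 0.25 review threshold are assumptions for illustration (0.25 is an often-quoted rule of thumb, not a fixed standard), so treat the cut-off as something to tune for your own models.

```python
import math


def population_stability_index(expected_counts, actual_counts, floor=1e-6):
    """Compare two binned distributions; 0.0 means the bin shares are identical.

    expected_counts: counts per bin when the model was built.
    actual_counts: counts per bin in the recent data, using the same bins.
    floor: small share used in place of zero so the logarithm stays defined.
    """
    expected_total = sum(expected_counts)
    actual_total = sum(actual_counts)
    psi = 0.0
    for expected, actual in zip(expected_counts, actual_counts):
        expected_share = max(expected / expected_total, floor)
        actual_share = max(actual / actual_total, floor)
        psi += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return psi


def needs_review(psi, threshold=0.25):
    """Flag the input for review; the threshold is an assumed rule of thumb."""
    return psi > threshold


# Hypothetical example: an input that was split 50/50 at build time is now 25/75.
shift = population_stability_index([50, 50], [25, 75])
flagged = needs_review(shift)
```

A check like this only tells you that something moved. Deciding whether the cause is data quality, the customer base, or the market still takes a person who knows the business.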
Once you’ve tested and reviewed your models, you have to determine what kind of maintenance you need to do. There are two types of model maintenance:
Refresh/refit: This is an instance where you are using the same variables the model used before, but you only need to tweak the weights of those variables. You are not evaluating everything in the model, but simply making minimal changes to a particular aspect of the model. This type of maintenance requires very few system changes and allows you to leverage your existing implementation strategy.
Rebuild: This is a situation where you want to reconsider your model inputs and/or algorithm. Obviously, this is a much higher level of effort than just doing a refresh, and could potentially be as much effort as the initial model build. You may even have changes to the implementation. But if your model and its performance have degraded significantly, this may be the best option.
You have to gauge your effort level based on what is needed to bring that model back to its highest performance level.
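To make the refresh/refit case concrete, here is a small sketch under stated assumptions: a one-variable linear model fitted by ordinary least squares, with made-up data. The input variable stays the same and only the weights are re-estimated on recent data. A real model would have more inputs and likely a different algorithm, but the idea carries over.

```python
def fit_simple_linear(xs, ys):
    """Estimate an intercept and slope by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {"intercept": intercept, "slope": slope}


# Hypothetical data: the relationship at build time and the same input a year later.
original_weights = fit_simple_linear([0, 1, 2, 3], [1, 3, 5, 7])
refreshed_weights = fit_simple_linear([0, 1, 2, 3], [2, 5, 8, 11])
```

Because the input and the model form are unchanged, a refresh like this only swaps in new weights. A rebuild would mean revisiting which inputs and which algorithm to use in the first place.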
Once you’ve made your model changes, you need to test the new model and communicate the changes to your users. If you have stored historical scores from your models, be sure to tag them with a model version number so you know which scores came from the original version and which came from the new version. With a major model enhancement, scores may change drastically or even take on an entirely different meaning.
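Version tagging can be as simple as storing one extra field with every score. The sketch below is one possible layout, not a required one; the record fields and the version labels ("v1", "v2") are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ScoreRecord:
    """One stored score, tagged with the model version that produced it."""
    customer_id: str
    score: float
    model_version: str
    scored_on: date


def scores_for_version(records, model_version):
    """Return only the scores produced by the given model version."""
    return [r for r in records if r.model_version == model_version]


# Hypothetical history spanning a model change.
history = [
    ScoreRecord("C001", 0.42, "v1", date(2013, 1, 15)),
    ScoreRecord("C002", 0.77, "v1", date(2013, 1, 15)),
    ScoreRecord("C001", 0.18, "v2", date(2013, 7, 15)),
]
v1_scores = scores_for_version(history, "v1")
```

Keeping the version next to the score means a report can compare like with like, instead of reading a drop from 0.42 to 0.18 as a change in the customer when it may only reflect the new model.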
Again, model maintenance is not particularly a favorite activity for anyone, but it is something that you really must do, and in fact, plan for before you even build your model. You really can’t afford to let your models degrade. Your customers, conditions and environments are changing every day. Your models need to reflect those changes so your business can keep up.
About Hillary Bliss
Hillary Bliss is a Senior ETL Consultant at Decision First Technologies, and specializes in data warehouse design, ETL development, statistical analysis, and predictive modeling. She works with clients and vendors to integrate business analysis and predictive modeling solutions into the organizational data warehouse and business intelligence environments based on their specific operational and strategic business needs. She has a master’s degree in statistics and an MBA from Georgia Tech.