Least Squares Linear Predictive Report with Cognos Report Studio

In a large corporation, things move extremely slowly. At my most recent job, if we wanted more disk space, it would take about 2 months to actually see the space. First there were the requests and signoffs, then the provisioning and people who ask you to justify the need, then the ordering and finally the administration.

Anyway, on achillies heel of databases is disk space. you can have a perfectly optimized database that has redundancy and high reliability, but run out of disk space and you are dead. So a team member wrote a 10 line PERL script to log the disk space hourly for each drive on a machine to a simple database table. We did this for multiple machines.

My job was predictive analytics. We needed to take the data for a given machine and drive, and determine the trend of usage and project it 1,3 and 6 months into the future and create actionable information.

So, using Report Studio, I did something I had never done before, I performed a least squares regression on disk data, then using the parameters that were calculated, added 3 calculations that projected those values into the future.

Remembering my Slope math of y=mx+b got me started. b is the trend offset vertically and m is the factor that tells us how much y for each x. I needed those two values to project into the future.

This was a parenthesis nightmare. I solved the problem like I tell everyone I train to build complex calculations. Add incremental steps of the whole function as separate calculations. It makes troubleshooting easier, and bouncing out to Excel for a comparison a breeze. To say it an easier way, it allows you to internally diagnose your function.

Anyway, (I forget exact details),you have to get the square root of sum of the squared differences or something like that. I must of had 6 or 8 individual columns and naming got tricky ( you can have parenthesis in names in Report Studio and that helped)

Once I got the parameters correct, the rest was additional columns applying the above formula with a multiplication factor (the number of months).

Later, I moved the whole query over to Event Studio to make an early warning alert using the same mater that would send us an email when we had a problem 6 months out, or 3 or 1.

Why not use SPSS? Interestingly, SPSS could do the least squares and draw the line, but I was not able to get the parameters of their calculation to do the projection, I’m not sure if this is still true (I did the least squares about 3 years ago)