Aligning daily variables with utility metered data - Kumaradevan
16187
page-template,page-template-full_width,page-template-full_width-php,page,page-id-16187,ajax_fade,page_not_loaded,,qode-title-hidden,qode-child-theme-ver-1.0.0,qode-theme-ver-10.1.1,wpb-js-composer js-comp-ver-5.0.1,vc_responsive

Aligning daily variables with utility metered data

Written by San (SPSS, Excel & Stata – Data mining and Econometric Modeller)

Generally, utility meters are read at different point in time, at irregular intervals. In Sydney, residential utility meters are read four times a year on a rolling basis, with a meter reading cycle typically taking anytime between 80 and 110 days. This means that the meter reads taken for all households for a particular quarter can measure consumption over quite different periods of time. This sort of rolling water meter readings is common practice in large cities but researchers tend to ignore this fact due to difficulties in handling them.

 

The issue here is that it is difficult to match other independent variables, such as weather observations, with water meter recordings. As there is no defined start and end dates in recording water meters, there is no neat alignment with explanatory variables either. A major challenge in modelling household level data is accounting for the differences in timing of individual meter reads.

 

To overcome this issue, first the quarterly read meter readings are converted to daily meter readings. Similarly, the value of each explanatory variable is also converted on a daily basis, for example, the daily rainfall specific to each household. Then all datasets which are based on daily records are joined together. Finally the daily records are aggregated back to the original quarterly timeframe, to calculate the average value of the explanatory variable experienced by the household.

 

The above process requires a high level of programming skills and high computing power, as more than billion records, along with all explanatory variables, are manipulated.

 

Here is a sample SPSS syntax to create daily records which can be used to MATCHed at later stage.

COMPUTE startDate = DATE.DMY(1,1,2004).

* number of days between 1 Oct 2003 and July 2004 = 2010.

LOOP #cnt = 0 TO 2100.

. COMPUTE fullDate = DATESUM(startDate, #cnt, “days”, ‘closest’).

. XSAVE OUTFILE= !temp3 / DROP =startDate .

END LOOP.

GET FILE = !temp3.

FORMATS fullDate (DATE12).

SELECT IF ( fullDate GE DATE.DMY(1,01,2004) ).

SELECT IF ( fullDate LE DATE.DMY(1,07,2009) ).

SORT CASES BY N_PROP fullDate.

SAVE OUTFILE = !temp3.

 

Example:

 

MATCH FILES

FILE = !temp3

/ TABLE = *

/ BY N_PROP fullDate.

 

Perform all required calculations. Finally AGGREGATE the whole data set by the subject and the MeterReadSequenceNumber.

 

If you find many sibjectIDs have same meter reading period, then you may want to drop the subjectID to speed up the process.

 

AGGREGATE

/OUTFILE=* /PRESORTED /BREAK= subjectID MeterReadSequenceNumber

/Var 1 = MEAN(Var1)

:

:

/nDays = N.

 

 

Please note that above SPSS codes are only a guide. Please do not hesitate contact me if you need a hand.