;---------------------------------------------------------------------------- ; \$Id: lc_lagcorr.pro,v 1.4 2002/04/26 02:05:50 johnny Exp johnny \$ ;+ ; NAME: ; LC_LAGCORR ; ; PURPOSE: ; Calculate one or several lag correlation maps. The map is the correl- ; ation of each point in a timeseries of 2-D slices (dataset "two", i.e. ; in_datatwo) with a single point timeseries (dataset "one", i.e. ; in_dataone_mean). ; ; Signficance is calculated using one-tail t-test with a measure of ; effective degree of freedom (EDOF) from Livezey & Chen (1983) and Chen ; (1982). Locations in in_datatwo where correlation or significance ; cannot be calculated are returned as NaN in Result. ; ; CATEGORY: ; Statistics. ; ; CALLING SEQUENCE: ; Result = LC_LAGCORR(in_dataone_mean, in_datatwo, in_lags) ; ; INPUTS: ; in_dataone_mean First timeseries, the reference timeseries that every ; x-y point in in_datatwo will be correlated to. A 1-D ; vector of NT elements. Should be some sort of floating ; type. Not changed by procedure. ; ; in_datatwo Second timeseries, of x-y slices. Dimensioned ; (NX,NY,NT) where NX is the number of points in the ; "x"-direction, NY is the number of points in the "y"- ; direction, and NT is the number of time points (same ; as in in_dataone_mean). See "Notes" section below for ; more information on in_datatwo. Should be some sort ; of floating type. Not changed by procedure. ; ; in_lags Lags (k) at which to calculate the correlation ; coefficient. 1-D long array of NLAG elements. In ; units of the timeseries time step. Not changed by ; procedure. Lag k is defined such that covariance at ; lag k is: ; ; Cov(Z1(t),Z2(t+k)) = ; E[(Z1(t)-mean{Z1})(Z2(t+k)-mean{Z2})] ; ; where Z1 and Z2 are timeseries from "one" and "two," ; respectively. ; ; KEYWORD PARAMETERS: ; EDOF Values of effective degrees of freedom at each (x,y) ; spatial point in in_datatwo. Dimensioned (NX,NY). ; Created by function and overwrites. ; ; SIG_LEVELS The level of significance of the correlation ; coefficient calculated at each (x,y) point in ; in_datatwo, expressed as a probability (i.e. in the ; interval [0,1] (the array sigdata_all). Thus, if the ; value of Sig_Levels is 0.95 or higher, the correlation ; coefficient is significant at the 95% confidence ; level. Dimensioned (NX,NY,NLAG). Created by function ; and overwrites. ; ; OUTPUTS: ; Result Correlation coefficient (the array correlmap_all) at ; each (x,y) point in in_datatwo, for each lag. ; Dimensioned (NX,NY,NLAG). Has value in the interval ; [-1,1]. ; ; FILE DEVICE I/O: ; None. ; ; COMMON BLOCKS: ; None. ; ; EXAMPLE: ; Create some example data: ; dataone = [ 0.14, 0.92, 0.48 , 1.44, -0.45 \$ ; , 0.41, 0.57, -0.52, 0.28, -0.47 ] ; datatwo = SIN(0.2*FINDGEN(3,2,10)) ; lags = [-1, 1] ; corr = LC_LAGCORR(dataone, datatwo, lags) ; PRINT, corr ; IDL prints: ; 0.13504242 0.17972478 0.21650849 ; 0.24462751 0.26363680 0.27323013 ; ; -0.065169703 -0.089100609 -0.10907557 ; -0.12465870 -0.13558459 -0.14166381 ; ; MODIFICATION HISTORY: ; - 6 Mar 2002: Orig. ver. Johnny Lin, CIRES/University of Colorado. ; Email: air_jlin@yahoo.com. Based on the program in the file ; main.pro,v 1.10 2002/02/14 01:22:23, which operates on CCM3 output. ; Passed minimally passable tests. ; - 25 Apr 2002: Added ability to handle NaN input and timeseries which ; create NaN output. Passed minimally passable tests. ; ; NOTES: ; - Written for and maily tested in IDL 5.3. Last few digits of example ; output will be slightly different in IDL 5.4 and 5.5. ; - References: ; * Regarding EDOF procedure: Livezey & Chen (1983), "Statistical field ; significance and its determination by Monte Carlo techniques," Monthly ; Weather Review, pp. 46-59, and Chen (1982), "Fluctuations in northern ; hemisphere 700 mb height field associated with the southern oscill- ; ation," Monthly Weather Review, pp. 808-823. ; * Regarding t-test formula: Brooks and Carruthers (1953), Handbook of ; Statistical Methods in Meteorology, p. 220. ; - NB: Mid-way through this function the second timeseries array is ; transposed to speed-up memory access. ; - Both input timeseries are assumed to be regular in time (i.e. the same ; timesteps throughout). EDOF is set such that its maximum possible value ; is NT. ; - Even though in_datatwo is dimensioned as a timeseries of x-y slices, ; the function does not make use of any of the information that can be ; implied through gridding. As far as its concerned, each (x,y) vector ; can be anywhere in space. However, the 3rd dimension of in_datatwo must ; be time. ; - It appears that T_PDF (used in this procedure) gives the PDF for a ; one-tail t-test while T_CVF computes the cutoff value for a two-tailed ; t-test. ; - If any element in in_dataone_mean is NaN, all correlation values re- ; turned will be NaN, since C_CORRELATE is unable to process NaN values ; as "missing values." Also, if the variance of either of the timeseries ; is zero, the correlation coefficient is undefined and will return a ; value of NaN and give an arithmetic error. ; - Keywords are optional unless otherwise noted. ; - No procedures called with _Extra keyword invoked. ; - No user-written procedures called. ;- ; Copyright (c) 2002 Johnny Lin. For licensing, distribution conditions, ; and contact information, see http://www.johnny-lin.com/lib.html. ;---------------------------------------------------------------------------- FUNCTION LC_LAGCORR, in_dataone_mean, in_datatwo, in_lags \$ , EDOF = out_edof \$ , SIG_LEVELS = out_sig_levels \$ , _EXTRA = extra ; ----------------- Initial Declarations and Error Checks ------------------ ON_ERROR, 0 lags = in_lags ;- protect input variables (this is esp. dataone_mean = in_dataone_mean ; b/c the transpose alters datatwo) datatwo = in_datatwo dims = SIZE(datatwo) NXtwo = dims[1] ;- x dim. for dataset for timeseries "two" NYtwo = dims[2] ;- y dim. for dataset for timeseries "two" NTtwo = dims[3] ;- number of timesteps in timeseries "two" NT = N_ELEMENTS(dataone_mean) ;- number of timesteps in timeseries "one" NLAG = N_ELEMENTS(lags) ;- number of lags to do calcs. for lag_type = SIZE(lags, /Type) if ((lag_type ne 2) and (lag_type ne 3)) then MESSAGE, 'error--bad lag type' if ((NT mod 2) ne 0) then MESSAGE, 'error--NT not even' if (NT ne NTtwo) then MESSAGE, 'error--bad NTs' if (dims[0] ne 3) then MESSAGE, 'error--bad datatwo dims' ; --------------------------- Calculate EDOFs ---------------------------- PRINT, 'LC_LAGCORR: Calculate EDOF' tau = DBLARR(NXtwo,NYtwo) templag = DINDGEN(NT-1L)+1.d0 ;- lag values from +1 to +(NT-1) Cxx = C_CORRELATE(dataone_mean, dataone_mean, templag, /Double) PRINT, 'LC_LAGCORR: Transpose dataset two' datatwo = TRANSPOSE(TEMPORARY(datatwo),[2,0,1]) ;- make NT dim. first dim. for iy=0L,NYtwo-1L do begin ;- calculate PRINT, 'LC_LAGCORR: Calculate tau for y row', iy ; tau value for ix=0L,NXtwo-1L do begin Cyy = C_CORRELATE(datatwo[*,ix,iy], \$ datatwo[*,ix,iy], \$ templag, /Double) tau[ix,iy] = TOTAL(Cxx * Cyy) endfor endfor tau = (TEMPORARY(tau) * 2.d0) + 1.d0 ;- tau is the "effective time if ( (WHERE(tau le 0.d0))[0] ne -1 ) then \$ ; between indep. samples," in MESSAGE, 'error-tau le 0 somewhere' ; units of 1 timestep NTd = DOUBLE(NT) EDOF = NTd / tau toohigh = WHERE(EDOF gt NTd) ;- values of EDOF > NT are if (MIN(toohigh) ne -1) then \$ ; set to NT EDOF[toohigh] = NTd edof_to_output = EDOF ;- this will be output in the "Prepare Output" ; section at the end of the function ; ---------------------- Calculate Lag Correlations ----------------------- correlmap_all = DBLARR(NXtwo,NYtwo,NLAG) ;- correl. coeff. and signif. sigdata_all = DBLARR(NXtwo,NYtwo,NLAG) ; levels for all lags for ilag=0L,NLAG-1 do begin ;-> loop through lags (begin) lagvalue = lags[ilag] ;- the lag we're considering PRINT, 'LC_LAGCORR: Calc. correl./sig. map lag =',lagvalue correlmap = DBLARR(NXtwo,NYtwo) ;- temp arrays for this lag sigdata = DBLARR(NXtwo,NYtwo) for iy=0L,NYtwo-1L do \$ for ix=0L,NXtwo-1L do \$ correlmap[ix,iy] = C_CORRELATE(dataone_mean, \$ datatwo[*,ix,iy], \$ lagvalue, /Double) lagvalue = lagvalue[0] ;- restore lagvalue to be a scalar since it appears ; C_CORRELATE chgs. it into a single elem. vector tvalue = correlmap / SQRT( ( 1.d0 - (correlmap*correlmap) ) / EDOF ) for iy=0L,NYtwo-1L do begin for ix=0L,NXtwo-1L do begin goodpt = ( FINITE(tvalue[ix,iy]) and FINITE(EDOF[ix,iy]) ) if (goodpt eq 1) then \$ sigdata[ix,iy] = T_PDF( tvalue[ix,iy] \$ ;- signif. levels as ,EDOF[ix,iy] ) \$ ; probabil. value else \$ sigdata[ix,iy] = !VALUES.F_NAN endfor endfor correlmap_all[0,0,ilag] = correlmap[*,*] ;- transfer from temp arrays to sigdata_all[0,0,ilag] = sigdata[*,*] ; full array endfor ;-> loop through lags (end) ; ---------------------------- Prepare Output ----------------------------- out_edof = TEMPORARY(edof_to_output) ;- output EDOF (optional) out_sig_levels = TEMPORARY(sigdata_all) ;- output signif. levs. (optional) Result = TEMPORARY(correlmap_all) ;- return correl. coeff. RETURN, Result END ;=== end of function === ; ========== end of file ==========