;----------------------------------------------------------------------------
; $Id: rpca_la00.pro,v 1.12 2002/11/09 04:28:55 johnny Exp $
;+
; NAME:
; RPCA_LA00
;
; PURPOSE:
; Obliquely rotated principal component analysis (PCA) using a revised
; Promax method, similar to that of Lin and Arakawa (2000).
;
; CATEGORY:
; Statistics.
;
; CALLING SEQUENCE:
; Result = RPCA_LA00(in_data)
;
; INPUTS:
; in_data: Input data of n objects (e.g. observations in time)
; at p variables (e.g. locations), floating array
; dimensioned (p,n). E.g. each column is n observa-
; tions at a single location. "Objects" and "varia-
; bles" is RJ93 notation. Equivalent to Z (prior to
; any transformations) in LA00, p. 3574, and X in RJ93,
; eqn. 3.1. Unchanged by function.
;
; INPUT KEYWORD PARAMETERS:
; CORREL_PCA If keyword set to true, the initial PCA (LA00, eqn.
; 3.6) is done using correlations of (optionally
; transformed) input data. Default is for keyword to
; be false, with initial PCA calculated using covar-
; iances. Unchanged by function.
;
; ISIGN: If keyword set to true, the magnitude of the ensemble
; mean in the (optionally transformed) input timeseries
; is reduced using the Isign function method (LA00,
; Fig. A1). Note that this operation on the ensemble
; mean is the main feature that sets apart the method
; of LA00 from a regular Promax rotated PCA function.
; Unchanged by function.
;
; N_PC_RETAINED: Keyword value is the number of EOF/PCs (factors) we
; wish to retain (r_input) from the initial PCA (LA00,
; eqn. 3.6) for the Promax rotation analysis (integer).
; Note that this only sets an upper bound of the number
; of EOF/PCs used as there might be fewer EOF/PCs
; "defined" than the value of the keyword (see r_input
; entry in "Notes" in this header block for details).
; If keyword is undefined, r_input is set to all p
; variables (e.g. locations). Unchanged by function.
;
; PC_REGRESS: In "true factor analysis" (TFA) as oppsed to PCA, the
; PCs (factor scores) should be estimated by regression
; instead of directly calculated from LA00, eqn. 3.7,
; because in TFA residuals are not assumed to be small.
; If this keyword is true, regression estimation fol-
; lowing RJ93, eqn. 7.9, is used to calculate Result.
; Default is for keyword to be false and to directly
; calculate Result from LA00, eqn. 3.7 (in the form
; given in RJ93, p. 224). See RJ93, pp. 78-79 for
; further details of how TFA differs from PCA.
;
; REMOVE_MEAN: If keyword set to true, the mean over all n objects
; (e.g. observations in time) for each of the p var-
; iables (e.g. locations) is removed to transform the
; input data such that the function sees a zero mean
; for each p over all n. Keywords Remove_Mean and
; Standardize cannot both be set true; either one or
; neither can be true. Unchanged by function.
;
; STANDARDIZE: If keyword set to true, each of the p variables (e.g.
; locations) is standardized by removing the mean and
; dividing by the standard deviation over all n objects
; (e.g. observations in time) for each of the p var-
; iables (i.e. for each p variable, z = (x-MEAN(x)) /
; STDDEV(x)). Keywords Remove_Mean and Standardize
; cannot both be set true; either one or neither can
; be true. Unchanged by function.
;
; OUTPUT KEYWORD PARAMETERS:
; EOF_LOADINGS: Matrix of obliquely rotated EOFs (factor loadings,
; A* in LA00), floating array dimensioned (r_input,p).
; See "Notes" in this header block for the definition
; of r_input. Each column is an EOF. Because the
; Promax method produces structure and pattern matrices
; that are all based upon a row normalized target
; matrix B (RJ93, p. 220) and column normalized ref-
; erence and pattern transformation matrices, even if
; the orthogonal, unrotated EOFs of the input data are
; non-normalized, the Promax EOFs will be "normalized"
; (see references for PROMAX_HW64) and thus the output
; Eof_Loadings also are always "normalized". Note that
; if not all r_input EOFs are "defined," all columns
; greater than the last "defined" EOF are set to NaN.
; Created by function.
;
; PC_VAR_FRAC: Fraction of variance explained by each obliquely
; rotated EOF/PC (factor), dimensioned (r_input).
; Note that in calculating these values, the joint
; contributions are attributed to the EOF/PC that is
; lesser, e.g. the joint communiality for EOF/PCs 1 &
; 2 is attributed to EOF/PC 1, not EOF/PC 2. Created
; by function.
;
; MEAN_LOC: Returns the mean over the input objects (e.g. time-
; series) for each p variable (e.g. location), float-
; ing vector dimensioned (p). Created by function.
;
; SDEV_LOC: Returns the standard deviation over the input ob-
; jects (e.g. timeseries) for each p variable (e.g.
; location), floating vector dimensioned (p). Cre-
; ated by function.
;
; OUTPUTS:
; Result: Matrix of obliquely rotated PCs (factor scores, F*
; in LA00), floating array dimensioned (r_input,n).
; See "Notes" in this header block for the definition
; of r_input. Each column is a PC. If in_data has
; been transformed by Standardize or Remove_Mean,
; the PCs returned by the function are amplitudes
; derived from the transformed input data; if in_data
; has not been transformed, the PCs are amplitudes
; derived from the input in_data. In either case, if
; Isign is true, Result is adjusted to return the PC
; values to the signs that make the PCs consistent
; with (transformed not including applying Isign, if
; applicable) input data signs. Note that if not all
; r_input PCs are "defined," all columns greater than
; the last "defined" PC are set to NaN. Created by
; function.
;
; FILE DEVICE I/O:
; None.
;
; COMMON BLOCKS:
; None.
;
; EXAMPLE:
; None.
;
; REFERENCES:
; - Harman, H. H. (1960), Modern Factor Analysis (Chicago: The Univer-
; sity of Chicago Press), 469 pp. Abbreviated H60 in this function.
; - Lin, C. and A. Arakawa (2000), "Empirical determination of the basic
; modes of cumulus heating and drying profiles," J. Atmos. Sci., Vol.
; 57, No. 21, pp. 3571-3591. Abbreviated LA00 in this function.
; - Reyment, Richard A., and K. G. Joreskog (1993), Applied Factor
; Analysis in the Natural Sciences, 2nd Edition (Cambridge: Cambridge
; University Press), ISBN 0-521-41242-0 (hardback), 371 pp. Abbrevia-
; ted RJ93 in this function. See also MATLAB code by L. F. Marcus in
; the Appendix of this book.
; - Richman, M. B. (1986), "Rotation of principal components," Journal
; of Climatology, Vol. 6, pp. 293-335. Abbreviated R86 in this
; function.
;
; MODIFICATION HISTORY:
; - 8 Nov 2002: Orig. ver. Johnny Lin, CIRES/University of Colorado.
; Email: air_jlin@yahoo.com. Passed passably adequate tests.
;
; NOTES:
; - Written for IDL 5.3.
; - Value of r_input: This is the number of unrotated EOF/PCs (factors)
; that we *want* retained for calculating the Promax rotation. The
; *actual* number that is retained in those calculations (which we call
; r, after R86 notation; r is equivalent to k in RJ93, eqn. 7.2 etc.)
; is determined later in the program as r depends on how many "defined"
; EOF/PCs the initial PCA by PCOMP will return; r is set to the smaller
; between r_input and the number of "defined" EOF/PCs returned by PCOMP.
; A "defined" EOF/PC is one where the EOF is non-NaN, non-zero eigen-
; values, and non-zero length. If the input keyword N_Pc_Retained is
; set, r_input is set to the value of N_Pc_Retained. If N_Pc_Retained
; is not set, r_input is set to p (all possible EOF/PCs).
; - See discussion in the "Regular PCA" section in the code below regard-
; ing my choice in how I decompose the transformed data Z into PC and
; EOF matrices F and A, as I do a few things differently than discussed
; in the text of LA00 after eqn. 3.6.
; - In LA00, Appendix A, the Promax method is described as operating on
; orthogonally rotated F* instead of A*, which is at odds with R86,
; p. 320 and RJ93, p. 220 in which Promax operates on the loadings
; (EOF) matrix (A* in LA00). RJ93, pp. 222-225, also indicates that
; once you've rotated A to get A*, you obtain the rotated scores (PC)
; matrix F* through direct or regression estimation (in LA00 notation).
; Thus, in the Promax part of this function, we use the version de-
; scribed by R86 and RJ93 instead of LA00, Appendix A. Finally, the
; version of Promax used in this function is slightly different from
; the version described by R86 and RJ93 in that the B matrix (RJ93,
; p. 220) is not column normalized. This is done to conserve total
; variance through rotation; additional discussion is found in the
; comments for PROMAX_HW64. From a linear algebra standpoint, this
; shouldn't affect the least squares method used to obtain the
; reference structure transformation matrix.
; - Output units: As RJ93 (p. 227) points out, the EOF/PC (factors)
; are composite "variables" that are linear combinations of the input
; variables. Thus, if the input data variables do not have the same
; units, the EOF/PCs will express some sort of composite units. If
; the input data series is transformed (e.g. Standardize is set),
; the EOF/PCs will be based on the transformed input. The EOFs
; returned, regardless of whether the input data is transformed,
; are based upon a Promax method which uses row-normalized EOFs from
; a standard PCA (see comments in PROMAX_HW64), and thus the EOFs
; should be thought of as being in "normalized" arbitrary units.
; If the input data is not transformed (and all variables have the
; same units), it probably isn't incorrect to think of the units as
; being applied to the PCs, because the PCs were backed out based
; upon the EOFs and the input data. But definitely if the input data
; was standardized, the PCs are unitless and are themselves standard-
; ized (RJ93, p. 227).
; - Remember that array dimension notation in IDL (row-major) is diff-
; erent than in standard linear algebra notation (column-major). In
; this function, every time array dimensioning is described, it fol-
; lows the IDL form. The sources in References, I think, almost al-
; ways follow the standard linear algebra form.
; - All keyword parameters are optional unless otherwise stated.
; - No procedures called with _Extra keyword invoked.
; - Non built-in functions and procedures called: PROMAX_HW64.
;-
; Copyright (c) 2002 Johnny Lin. For licensing, distribution conditions,
; and contact information, see http://www.johnny-lin.com/lib.html.
;----------------------------------------------------------------------------
FUNCTION RPCA_LA00, in_data $
, CORREL_PCA = in_correl_pca $
, ISIGN = in_isign $
, N_PC_RETAINED = in_n_pc_retained $
, PC_REGRESS = in_pc_regress $
, REMOVE_MEAN = in_remove_mean $
, STANDARDIZE = in_standardize $
, EOF_LOADINGS = out_eof_loadings $
, PC_VAR_FRAC = out_pc_var_frac $
, MEAN_LOC = out_mean_loc $
, SDEV_LOC = out_sdev_loc $
, _EXTRA = extra
; -------------------- Error Check and Parameter Setting --------------------
COMPILE_OPT IDL2
ON_ERROR, 0
epsilon = (MACHAR()).eps ;- set machine precision tolerance
dims = SIZE(in_data)
if (dims[0] ne 2) then MESSAGE, 'error--bad input dimensions'
p = dims[1] ;- set number of locations (a.k.a. variables, in RJ93)
n = dims[2] ;- set number of obs. in time (a.k.a. objects, in RJ93)
if (MIN([p,n]) lt 2) then MESSAGE, 'error--too few pts.'
X = in_data ;- protect input and create matrix X, which is Z
; in LA00, p. 3574, prior to any transformation
; (e.g. removal of mean); X is dimensioned (p,n).
in_type = dims[3] ;- set input data type
;- set the number of unrotated EOF/PCs (factors) we want retained in
; calculating the rotation (r_input, see "Notes" in header block for
; more information:
if (KEYWORD_SET(in_n_pc_retained) eq 1) then begin
r_input = in_n_pc_retained
if (ABS(r_input) gt p) then MESSAGE, 'error--bad r_input'
endif else begin
r_input = p
endelse
;- set flag whether or not to use covariance in calculating the initial
; PCA (covar_pca=1 means use covariance in that calculation); the flag
; is set to false if keyword Correl_Pca is true:
if (KEYWORD_SET(in_correl_pca) eq 1) then begin
covar_pca = 0
endif else begin
covar_pca = 1
endelse
; ----------- Calculate A Few Statistical "Scales" For Input Data -----------
;
; Section input: X
; Section output: X (unchanged), mean_value, sdev_value
;
; Selected variable key:
; X Input data, dimensioned (p,n).
; mean_value Mean over the input timeseries for each p variable (e.g.
; location), dimensioned (p).
; sdev_value Std. dev. over the input timeseries for each p variable
; (e.g. location), dimensioned (p).
mean_value = MAKE_ARRAY(p, Type=in_type)
sdev_value = MAKE_ARRAY(p, Type=in_type)
for ip=0,p-1 do begin
mean_value[ip] = MEAN(X[ip,*])
sdev_value[ip] = STDDEV(X[ip,*])
endfor
; -------------------------- Transform Input Data ---------------------------
;
; Section input: X (undefined on section exit), mean_value, sdev_value
; Section output: Z, mean_value (unchanged), sdev_value (unchanged)
;
; Selected variable key:
; X Input data, dimensioned (p,n).
; Z Transformed (optionally) input data, i.e. the original
; data transformed by removing the mean or other operations
; (if coded in the future), dimensioned (p,n).
; mean_value Mean over the input timeseries for each p variable (e.g.
; location), dimensioned (p).
; sdev_value Std. dev. over the input timeseries for each p variable
; (e.g. location), dimensioned (p).
;
; Transformation done in this section: remove mean or standardize,
; depending on the values of the input keywords.
Z = TEMPORARY(X) ;- set initial Z to X
;- flag that is true (=1) if more than 1 input data transformation
; keyword set and is false (=0) if none or 1 such keyword is set:
mult_key = ( (KEYWORD_SET(in_remove_mean) eq 1) and $
(KEYWORD_SET(in_standardize) eq 1) )
case 1 of
mult_key: MESSAGE, 'error--multiple transformation keywords set'
(KEYWORD_SET(in_remove_mean) eq 1): $ ;- remove mean from Z
begin
PRINT, 'RPCA_LA00: Removing mean over objects'
for ip=0,p-1 do begin
Z[ip,*] = Z[ip,*] - mean_value[ip]
endfor
end
(KEYWORD_SET(in_standardize) eq 1): $ ;- standardize Z
begin
PRINT, 'RPCA_LA00: Standardizing input'
for ip=0,p-1 do begin
Z[ip,*] = ( Z[ip,*] - mean_value[ip] ) $
/ sdev_value[ip]
endfor
end
else: PRINT, 'RPCA_LA00: Input data not transformed'
endcase
; ----------- Reduce Magnitude of Data's Ensemble Mean (optional) -----------
;
; Section input: Z
; Section output: Z (changed, if option is selected), Isign_val
;
; If keyword Isign is true, Z is operated on by multiplying by Isign to
; decrease the magnitude of the ensemble mean of the data (LA00, p. 3589).
;
; Selected variable key:
; Z Transformed (optionally) input data, dimensioned (p,n).
; Isign_val Sign function (LA00, eqn. A.3), dimensioned (p,n).
; s Vector s at i-1 (LA00, eqn. A.2), dimensioned (p).
; s_mi Vector s_i- (LA00, eqn. A.2), dimensioned (p).
; s_pl Vector s_i+ (LA00, eqn. A.2), dimensioned (p).
; Z Data, after any transformation(s), dimensioned (p,n).
if (KEYWORD_SET(in_isign) eq 1) then begin
PRINT, 'RPCA_LA00: Reduce ensemble mean magnitude'
s = MAKE_ARRAY(p, Type=in_type) ;- initialize s vector to 0
Isign_val = REPLICATE(1, p,n) ; and Isign_val to 1
for i=0,n-1 do begin
s_pl = s + REFORM(Z[*,i]) ;- calc. s_i+ and s_i- at each time
s_mi = s - REFORM(Z[*,i]) ; value n
if (NORM(s_pl) gt NORM(s_mi)) then begin ;- test norms of s_i+ and
s = TEMPORARY(s_mi) ; s_i- and change from or
Isign_val[*,i] = -1 ; keep Isign_val to default
endif else begin ; depending on the result
s = TEMPORARY(s_pl)
endelse
endfor
Z = Isign_val * TEMPORARY(Z) ;- adjust Z
tmp = TEMPORARY(s) ;- deallocate s
endif
; ------------------------------- Regular PCA -------------------------------
;
; Section input: Z
; Section output: A, eigenval, Z (unchanged)
;
; Selected variable key:
; A Unrotated EOFs, dimensioned (p,p). EOFs (factor
; loadings) of Z are column vectors of A.
; A_trans Transpose of A.
; eigenval The eigenvalues corresponding to the EOF/PCs (factors)
; represented by A and F, dimensioned (p).
; F Unrotated PCs, dimensioned (p,n). PCs (factor scores)
; of Z are column vectors of F. Note this variable
; is not passed out on section exit because it's not
; needed later on in the function; instead it's
; deallocated before section exit.
; Z Data, after any transformation(s), dimensioned (p,n).
;
; Algorithm: I decompose Z into F and A, using eqn. 3.6 in LA00. Re-
; member though that any given F and A solution is not unique. Below
; are the differences in my implementation of F and A from that de-
; scribed in LA00:
;
; (1) LA00 appears to have a typo in the sentence after eqn. 3.6, where
; the "rows" of F are described as the PCs of Z. It should be the
; "columns" of F are the PCs of Z, which would then be consistent
; with RJ93, eqn. 3.1 and R86, p. 315. In this function PCs are
; also columns of F.
; (2) LA00 describes the EOFs in A as being scaled to unit length. In
; this function, I instead follow the practice (e.g. RJ93, p. 215)
; where EOFs are scaled so the sum of squares of the values in one
; EOF equals the eigenvalue for that EOF; this way I'll be able to
; use later EOFs to calculate variance explained by that EOF/PC
; (factor), which I can't do if all EOFs are scaled to the same
; length.
; (3) Note that for an A as described in (2) above, the corresponding F
; that enables one to to reproduce Z is not the one that's output
; from PCOMP; you need to transform it. FYI, one such transforma-
; tion that seems to do the job (it'll get you an error of 0-10% of
; Z for F ## TRANSPOSE(A)) is for F to be scaled so that it has
; zero mean and unit standard deviation, for each variable (e.g.
; location). However, since F is not needed after this section, I
; don't worry about it.
;
; I choose to meet (2) to enable me to calculate variances explained by
; each EOF/PC (factor), even after rotation (when I no longer have eigen-
; values). The built-in IDL function PCOMP returns A such that condition
; (2) is met. Finally, note that Z is going to be used later on in the
; function, which is why I note that it is output from this section
; unchanged.
A_trans = 1 ;- initialize variable for PCOMP
eigenval = 1
tmp = PCOMP( Z $ ;- calc. PCA, EOF, eigenvalues
, Coefficients=A_trans $
, Covariance=covar_pca $
, Eigenvalues=eigenval )
tmp = 0 ;- dealloc. since is uneeded
A = TRANSPOSE(TEMPORARY(A_trans)) ;- calc. A
eigenval = REFORM(eigenval) ;- redimension eigenval
; -------------------------- Find Promax Solution ---------------------------
;
; Section input: A, eigenval (both undefined on section exit)
; Section output: A_star, F_star, phi_p, r, tot_var_orig
;
; Calculate Promax solution of oblique EOFs (A_star) based on the un-
; rotated EOFs (A) then calculate oblique PCs (F_star) using Z and A_star
; (see RJ93, p. 224).
;
; Selected variable key:
; A On section input, unrotated EOFs, dimensioned (p,p).
; EOFs (factor loadings) of Z are column vectors of A.
; In the section, only the first r EOFs are used in
; calculations.
; A_star Oblique EOFs, dimensioned (r,p). The r oblique EOFs
; (factor loadings) are column vectors of A_star.
; This is the primary pattern loadings from the Promax
; method.
; F_star Oblique PCs, dimensioned (r,n). Oblique PCs (factor
; scores) of Z are column vectors of F_star.
; phi_p Correlation between primary factors, floating array
; dimentioned (r,r). Each [i,j] element is the
; correlation between oblique EOF/PCs (factors) i and
; j. This is phi_p in RJ93, p. 221.
; r Number of EOF/PCs to actually use in rotation calcs.
; r_corr Correlation matrix, dimensioned (p,p). Same as R in
; RJ93, p. 224, except I divide by n-1 instead of n,
; which is an unbiased estimator for covariance (the
; numerator of the correlation). Note algorithm for
; r_corr assumes that Z is standardized.
; tot_var_orig The total variance in a row-normalized version of A,
; if A isn't already row-normalized (scalar).
; Z Data, after any transformation(s), dimensioned (p,n).
;
; Note that exponent m is manually set to 2 in the Promax routine call
; in this section, following LA00, p. 3590. Notation for section input/
; output variables follow LA00.
;- set r to the smaller of the number of non-NaN EOF/PCs (r_non_nan),
; number of non-zero eigenvalues (eig_non_0), number of non-zero length
; EOF/PCs (len_non_0), and r_input:
r_non_nan = ( HISTOGRAM(FINITE(A[*,0])) )[1]
tmp = TEMPORARY(eigenval[0:r_non_nan-1])
eig_non_0 = ( HISTOGRAM( tmp/TOTAL(tmp) le 2.0*epsilon ) )[0]
tmp = SQRT( TOTAL(A[0:r_non_nan-1,*]^2, 2) )
len_non_0 = ( HISTOGRAM( tmp/MAX(tmp) gt 2.0*epsilon ) )[1]
r = r_non_nan < eig_non_0 < len_non_0 < r_input
if (r lt 2) then MESSAGE, 'error--too few good EOFs'
;- retain r EOFs (factor loadings) and calc. total variance of row
; normalized retained A (tot_var_orig, for comparison later on):
A = TEMPORARY(A[0:r-1,*])
tmp = SQRT(TOTAL(A^2, 1, /Nan))
tmp = REBIN(REFORM(tmp, 1,p), r,p)
tot_var_orig = TOTAL((A/tmp)^2, /Nan)
tmp = 0
;- obliquely rotate retained EOFs:
tmp = PROMAX_HW64( TEMPORARY(A), M=2, P_P=P_p, Phi_P=phi_p )
A_star = TEMPORARY(P_p)
;- calculate obliquely rotated PCs (factor scores): if keyword
; Pc_Regress is set true, we estimate using regression; if not, PCs
; are calculated directly; note algorithm for r_corr requires that
; Z is standardized:
if (KEYWORD_SET(in_pc_regress) eq 1) then begin
if (KEYWORD_SET(in_standardize) eq 0) then $
MESSAGE, 'error--Z may not be standardized'
r_corr = (1.0 / FLOAT(n-1)) * (TRANSPOSE(Z) ## Z)
F_star = Z ## INVERT(r_corr, status) ## A_star
endif else begin
F_star = Z ## A_star ## INVERT(TRANSPOSE(A_star) ## A_star, status)
endelse
if (status ne 0) then $ ;- check if matrix inversion
MESSAGE, 'error--bad inversion' ; to calc. F_star went ok
; ------------------- Multiply PC by Isign_val (optional) -------------------
;
; Section input: F_star, Isign_val
; Section output: F_star (changed, if option is selected)
;
; If keyword Isign is true, multiply each PC by corresponding Isign_val
; to ensure LA00, eqn. 3.7 still holds (see LA00, p. 3590).
;
; Selected variable key:
; F_star Oblique PCs, dimensioned (r,n). Oblique PCs (factor
; scores) of Z are column vectors of F_star.
; Isign_val Sign function (LA00, eqn. A.3), dimensioned (p,n).
if (KEYWORD_SET(in_isign) eq 1) then begin
F_star = TEMPORARY(F_star) * Isign_val[0:r-1,*]
endif
; ------ Calc. Fraction Var. Explained by Each Rotated EOF/PC (Factor) ------
;
; Section input: A_star, phi_p, tot_var_orig
; Section output: pc_var_frac
;
; Selected variable key:
; A_star Oblique EOFs, dimensioned (r,p). The r oblique EOFs
; (factor loadings) are column vectors of A_star.
; pc_var_frac The fraction of the variances of all the variables
; (e.g. locations) given by each EOF/PC (factor).
; Vector dimensioned (r). This is the sum of the
; direct and joint contributions to the communality,
; divided by the total communality. Note that in
; calculating pc_var_frac, the joint contributions are
; attributed to the factor that is lesser, e.g. the
; joint communiality for factors 1 & 2 is attributed to
; factor 1, not factor 2.
; phi_p Correlation between primary factors, floating array di-
; mentioned (r,r). Each [i,j] element is the correlation
; between oblique EOF/PCs (factors) i and j. This is
; phi_p in RJ93, p. 221.
; tot_contrib Total contributions of each oblique EOF/PC (factor)
; to the variance of all the variables (e.g. locations),
; dimensioned (r,r). Same as array V in H60, eqns.
; 13.28 and 13.29 and text immediately below eqn. 13.29.
; Each column is the direct and joint contributions to
; variable variance by a EOF/PC (factor). The direct
; contributions are on the diagonal and the joint
; contributions between that column's factor and all
; other "greater" factors are below the matrix diagonal.
;
; Because unrotated EOF/PCs (factors) are orthogonal to each other, the
; communality is given only by the direct contributions term (RJ93, eqn.
; 3.14). Oblique factors require calculation of both the direct and
; joint contributions term (H60, eqn. 13.27) over all variables (e.g.
; locations). Note that the joint contributions terms in RJ93, eqns.
; 3.12 and 3.13 are incorrect (they forget the factor 2 that comes out
; from the expansion of c_i^2).
tot_contrib = MAKE_ARRAY(r,r, Type=in_type) ;- initialize tot_contrib
idx = LINDGEN(r) ;- calc. direct contrib.
tot_contrib[idx,idx] = TOTAL(A_star^2, 2)
for qq=1,r-1 do begin ;- calc. joint contrib.
for pp=0,qq-1 do begin
tot_contrib[pp,qq] = 2.0 * phi_p[pp,qq] $
* TOTAL(A_star[pp,*] * A_star[qq,*])
endfor
endfor
;- calculate fraction of variances attributed to each EOF/PC (factor):
pc_var_frac = TOTAL(tot_contrib, 2) / TOTAL(tot_contrib)
;- check total variance of the retained EOF/PCs is conserved through
; rotations:
tmpx = TOTAL(tot_contrib)
tmpa = TOTAL(tot_var_orig)
if (NOT ( ABS(tmpx-tmpa) le (ABS(tmpx+tmpa)*2.0*epsilon) )) then $
MESSAGE, 'error--total variance not conserved'
; ------------------------------ Prepare Output -----------------------------
out_eof_loadings = MAKE_ARRAY( r_input, p $ ;- initialize EOF,
, Type=in_type $ ; PC, & var. frac.
, Value=!VALUES.F_NAN ) ; output arrays and
out_pc_var_frac = MAKE_ARRAY( r_input $ ; fill with NaN
, Type=in_type $
, Value=!VALUES.F_NAN )
Result = MAKE_ARRAY( r_input, n $
, Type=in_type $
, Value=!VALUES.F_NAN )
out_mean_loc = TEMPORARY(mean_value) ;- output means (optional)
out_sdev_loc = TEMPORARY(sdev_value) ;- output std. dev. (optional)
out_eof_loadings[0:r-1,*] = TEMPORARY(A_star) ;- output EOFs and var.
out_pc_var_frac[0:r-1] = TEMPORARY(pc_var_frac) ; frac. (both optional)
Result[0:r-1,*] = TEMPORARY(F_star) ;- output PCs
RETURN, Result
END ;=== end of function ===
; ========== end file ==========