In terms of "bang-for-the-buck," IDL's where and MATLAB's
find functions are arguably
the single most important functions available in those languages.
Combined with the ability to easily extract any arbitrary list of
elements, IDL and MATLAB enable scientists to write compact and
readable data analysis code.
For instance, let's say we have an array of one month of
daily temperatures (T), and we want to extract
all temperatures greater than 280 K and put them into a
separate array (bigT).
The IDL code to do this would be:
bigpts = where(T gt 280.0)
bigT = T[bigpts]
To replace the elements of T specified by bigpts with
some other value (say 5.0), in IDL just type:
T[bigpts] = 5.0
How can we do this in Python? The key is that, instead of creating an array that contains the indices of the array elements that meet our condition, we create a mask which has values of 1 or 0, depending on whether the corresponding element in the original data array meets the condition.
We create an array of 28 points of temperature data:
import Numeric
T = Numeric.sin(Numeric.arange(28, typecode=Numeric.Float32)) * 5.0 + 279.0
T has an amplitude of 5 K, oscillating around a
mean of 279 K.
There are a variety of ways to create masks in Numeric.
For instance, to create a mask of points greater than
280 K, you can use the Numeric.where function:
bigpts_mask = Numeric.where(T > 280.0, 1, 0)
or one of the element-wise logical and comparison functions:
bigpts_mask = Numeric.greater(T, 280.0)
To extract these elements into the array bigT, use Numeric.compress:
bigT = Numeric.compress(bigpts_mask, T)
To replace the values in the original data array where
bigpts_mask is true (i.e. 1), use the
Numeric.putmask command. For instance, to replace all the
elements where T is greater than 280 K with 5 K:
Numeric.putmask(T, bigpts_mask, 5.0)
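Numeric has since been superseded by NumPy, so the same mask, extract, and replace steps can be sketched there. This is only an illustration and assumes NumPy is installed and imported as np; the name T_replaced is introduced here so that the original T is left untouched.

```python
import numpy as np

# 28 daily temperatures: amplitude 5 K around a mean of 279 K
T = np.sin(np.arange(28, dtype=np.float32)) * 5.0 + 279.0

# Boolean mask: True where the temperature exceeds 280 K
bigpts_mask = T > 280.0

# Extract the matching elements; np.compress(bigpts_mask, T) gives the same result
bigT = T[bigpts_mask]

# Replace the masked elements with 5.0, working on a copy (assumed name)
T_replaced = T.copy()
np.putmask(T_replaced, bigpts_mask, 5.0)
```

In NumPy the comparison T > 280.0 already yields a boolean mask, so a separate call to a where or greater function is optional.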
Numeric.putmask can also replace array members
element by element, repeating the replacement array if necessary:
data = Numeric.array([3,5,-2,4,6,1,-5,-8], typecode=Numeric.Float32)
mask = [1,1,1,0,0,1,0,1]
newdata = [-999,-888]
Numeric.putmask(data, mask, newdata)
After the call, data is:
array([-999., -888., -999., 4., 6., -888., -5., -888.],'f')
Since newdata is shorter than data,
when doing the
replacement based on
mask, Python repeats newdata as
many times as needed to make a replacement array of the same size
as data, then applies the "virtual" replacement array
at places where
mask is true.
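The repetition rule can be spelled out in plain Python. The function below is a hypothetical helper written only to illustrate the behavior described above; it is not how putmask is actually implemented, and it returns a new list rather than modifying its input in place.

```python
def putmask_sketch(data, mask, values):
    """Return a copy of data with masked positions replaced, cycling values.

    Position i takes values[i % len(values)], which is the same as tiling
    values out to the length of data and then applying it where mask is true.
    """
    result = list(data)
    for i, flag in enumerate(mask):
        if flag:
            result[i] = values[i % len(values)]
    return result


data = [3, 5, -2, 4, 6, 1, -5, -8]
mask = [1, 1, 1, 0, 0, 1, 0, 1]
newdata = [-999, -888]

replaced = putmask_sketch(data, mask, newdata)
# replaced == [-999, -888, -999, 4, 6, -888, -5, -888]
```

Note that the replacement value depends on the element's position in data, not on how many masked elements came before it.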
bigpts is an array of the indices in T
that satisfy the condition in the where function.
In IDL the indices are converted to "equivalent" 1-D values,
regardless of the dimension of
T, so they uniquely
refer to elements in
T, as long as the dimensions of
T do not change. Thus, in IDL bigpts
is a 1-D array (regardless of the dimension of T).
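For comparison, a similar "equivalent 1-D index" idea can be sketched in NumPy, the successor to Numeric. This assumes NumPy is installed; the small 2-D array is made up for illustration. NumPy flattens in row-major order by default, whereas IDL varies the first index fastest, so the index values themselves will generally differ between the two.

```python
import numpy as np

# A small 2-D temperature array (made-up values for illustration)
T = np.array([[279.0, 281.5, 278.2],
              [283.1, 277.9, 280.4]])

# 1-D indices into the flattened array where the condition holds
flat_idx = np.flatnonzero(T > 280.0)

# Use the flat indices to pull out the matching elements
bigT = T.ravel()[flat_idx]

# Convert the flat indices back to (row, column) pairs if needed
rows, cols = np.unravel_index(flat_idx, T.shape)
```

As in IDL, flat_idx is 1-D no matter how many dimensions T has, and it stays valid only while the shape of T is unchanged.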
Notes: Thanks to Mike Steder for the help!