# "Search-and-Replace" In an Array

## Question

How do you search for elements in an array that meet a certain test, and then replace or select those elements (like the `where` function in IDL and the `find` function in MATLAB)?

In terms of "bang for the buck," the IDL `where` and MATLAB `find` functions are arguably the most important functions available in those languages. Combined with the ability to easily extract any arbitrary list of elements, they enable scientists to write compact and readable data analysis code. For instance, let's say we have an array of one month of daily temperatures (`T`), and we want to extract all temperatures greater than 280 K and put them into a separate array (`bigT`). The IDL code to do this would be:

```
bigpts = where(T gt 280.0)
bigT = T[bigpts]
```

To replace the elements of `T` specified by `bigpts` with some other value (say 5.0), in IDL just type: `T[bigpts] = 5.0`.

How can we do this in Python? The key is that instead of creating an array that contains the indices of the array elements that meet our condition, we create a mask that has values of 1 or 0, depending on whether the corresponding element in the original data array meets the condition.
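To make the distinction concrete, here is a minimal plain-Python sketch that uses ordinary lists rather than any array library. The temperature values and variable names are illustrative assumptions, not part of the original example.

```python
# Illustrative temperatures in kelvin (assumed values, not from the FAQ).
T = [279.0, 281.5, 283.9, 278.2, 280.1]

# Index approach (what IDL's `where` returns): positions that pass the test.
bigpts = [i for i, t in enumerate(T) if t > 280.0]

# Mask approach: one 1/0 flag per element, same length as T.
bigpts_mask = [1 if t > 280.0 else 0 for t in T]

# Both describe the same selection of elements.
bigT_from_indices = [T[i] for i in bigpts]
bigT_from_mask = [t for t, m in zip(T, bigpts_mask) if m]
```

The index list is as long as the number of matches, while the mask is always as long as the data, which is what lets it be applied element by element.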

We create an array of 28 points of temperature data:

```
import Numeric

# 28 daily temperatures in kelvin
T = Numeric.sin(Numeric.arange(28, typecode=Numeric.Float32)) * 5.0 + 279.0
```

where `T` has an amplitude of 5 K, oscillating around a mean of 279 K.

There are a variety of ways to create masks in `Numeric`. For instance, to create a mask of points greater than 280 K, you can use the `Numeric.where` command:

```
bigpts_mask = Numeric.where(T > 280.0, 1, 0)
```

or element-wise logical and comparison functions:

```
bigpts_mask = Numeric.greater(T, 280.0)
```

To extract these elements into the array `bigT`, use the `Numeric.compress` function:

```
bigT = Numeric.compress(bigpts_mask, T)
```
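For readers without `Numeric` at hand, the same mask-based selection can be sketched on plain sequences with the standard library's `itertools.compress`. Note that its argument order is the reverse of `Numeric.compress`: data first, mask second. The temperature values below are illustrative assumptions.

```python
from itertools import compress

# Illustrative temperatures in kelvin (assumed values, not from the FAQ).
T = [279.0, 281.5, 283.9, 278.2, 280.1]

# Boolean mask: True where the element exceeds 280 K.
bigpts_mask = [t > 280.0 for t in T]

# Keep only the elements of T whose mask entry is true.
bigT = list(compress(T, bigpts_mask))
```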

To replace the values in the original data array `T` for which `bigpts_mask` is true (i.e. 1), use the `Numeric.putmask` command. For instance, to replace all the points where `T` is greater than 280 K with 5 K:

```
Numeric.putmask(T, bigpts_mask, 5.0)
```

`Numeric.putmask` can also be used in replacing array members element by element, repeating the replacement array if necessary. For instance:

```
data = Numeric.array([3,5,-2,4,6,1,-5,-8], typecode=Numeric.Float32)
newdata = [-999,-888]
mask = [1,1,1,0,0,1,0,1]  # 1 marks the elements to replace
Numeric.putmask(data, mask, newdata)
```

gives

```
>>> data
array([-999., -888., -999., 4., 6., -888., -5., -888.],'f')
```

Note that `newdata` is shorter than `data`; when doing the replacement based on `mask`, `Numeric.putmask` repeats `newdata` as many times as needed to make a replacement array of the same size as `data`, then applies this "virtual" replacement array at the places where `mask` is true.
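The repetition rule can be sketched in plain Python. The helper `putmask_sketch` below is a hypothetical illustration of the behavior described above, not the `Numeric` implementation, and the mask used is the one implied by the output shown (elements 0, 1, 2, 5, and 7 are replaced).

```python
def putmask_sketch(data, mask, values):
    """Hypothetical helper: where mask[i] is true, set data[i] in place
    to values[i % len(values)], i.e. values is repeated to cover data."""
    if not isinstance(values, (list, tuple)):
        values = [values]  # treat a scalar as a one-element sequence
    n = len(values)
    for i, flag in enumerate(mask):
        if flag:
            data[i] = values[i % n]

data = [3.0, 5.0, -2.0, 4.0, 6.0, 1.0, -5.0, -8.0]
mask = [1, 1, 1, 0, 0, 1, 0, 1]  # assumed: inferred from the output above
putmask_sketch(data, mask, [-999.0, -888.0])
```

The replacement value is chosen by the element's position in `data`, not by how many replacements have been made so far, which is why element 5 receives -888 rather than -999.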

Footnotes:

`bigpts` is an array of the indices in `T` that satisfy the condition in the `where` call. In IDL the indices are converted to "equivalent" 1-D values, regardless of the dimension of `T`, so they uniquely refer to elements in `T`, as long as the dimensions of `T` do not change. Thus, in IDL `bigT` is a 1-D array (regardless of the dimension of `T`).

Here, "mask" does not mean a "masked array," which is an array that has missing or invalid elements; masked arrays are accommodated by the `MA` or `MV` modules.

Notes: Thanks to Mike Steder for the help!