# MATLAB Programming/Strings

## Declaring Strings

Strings are declared using single quotes:

```
>> fstring = 'hello'
fstring =
hello
```

Including a single quote in a string is done this way:

```
>> fstring = ''''
fstring =
'
>> fstring = 'you''re'
fstring =
you're
```

## Strings as a Character Array

Strings in MATLAB are arrays of characters. To see this, execute the following code:

```
>> fstring = 'hello';
>> class(fstring)
ans = char
```

Because strings are arrays, many array manipulation functions work on them, including size and transpose. Strings can also be indexed to access specific elements.
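As a small sketch of this point (the variable names are illustrative only), the usual array functions and indexing expressions apply directly to a character array:

```matlab
fstring = 'hello';

sz    = size(fstring);        % [1 5]: one row, five columns
col   = fstring';             % transpose gives a 5-by-1 column of characters
first = fstring(1);           % 'h'
tail  = fstring(2:end);       % 'ello'
rev   = fstring(end:-1:1);    % 'olleh'
```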

Performing arithmetic operations on character arrays converts them into doubles.

```
>> fstring2 = 'world';
>> fstring + fstring2
ans = 223   212   222   216   211
```

These numbers come from the ASCII code of each character in the two arrays, added element by element. You can see the codes for a single string by using the double function to turn the character array into an array of doubles.

```
>> double(fstring)
ans = 104   101   108   108   111
```

The 'char' function can turn an array of integer-valued doubles back into characters. Passing a non-integer value causes MATLAB to drop the fractional part (that is, to round down):

```
>> char(104)
ans = h
>> char(104.6)
ans = h
```
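The round trip between double and char can be combined with arithmetic. The sketch below is only an illustration: it relies on the fact that in ASCII the lowercase letters sit 32 above the uppercase ones, and it assumes the input contains lowercase letters only.

```matlab
fstring = 'hello';

codes      = double(fstring);   % [104 101 108 108 111]
upperCodes = codes - 32;        % shift 'a'..'z' down to 'A'..'Z'
shouted    = char(upperCodes);  % 'HELLO'
```

In practice the built-in upper function does the same job and leaves non-letters alone.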

## Special String Functions

Since MATLAB strings are character arrays, some special functions are available for working with entire strings rather than just their individual characters:

### deblank

deblank removes trailing white space from the end of the string. Leading and embedded white space is left alone.
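A short sketch of deblank on a padded string. Note that it only strips white space at the end; strtrim, shown for comparison, strips both ends.

```matlab
padded = '  hello   ';

trimmedEnd  = deblank(padded);   % '  hello' (leading spaces are kept)
trimmedBoth = strtrim(padded);   % 'hello'   (both ends stripped)
```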

### findstr

findstr(bigstring, smallstring) checks whether the smaller string is contained in the bigger one and, if it is, returns the starting index of each occurrence. Otherwise it returns []. The MATLAB documentation recommends the newer strfind function in its place.
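The same kind of search can be sketched with strfind, which takes the text to search first and the pattern second. The example strings are made up for illustration.

```matlab
bigstring = 'abracadabra';

idx   = strfind(bigstring, 'abra');   % [1 8]: start of each occurrence
none  = strfind(bigstring, 'xyz');    % []: no occurrence
found = ~isempty(idx);                % true when at least one match exists
```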

### strrep

strrep(string1, replaced, replacement) replaces all instances of replaced in string1 with replacement and returns the result; string1 itself is not modified.
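A minimal sketch of strrep, using a made-up sentence:

```matlab
string1 = 'the cat sat on the mat';

result    = strrep(string1, 'the', 'a');   % 'a cat sat on a mat'
unchanged = strrep(string1, 'dog', 'x');   % no match, so the text comes back as-is
```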

### strcmp

Strings, unlike numeric arrays, do not compare as a whole with the relational operators: == compares character by character and raises an error if the two strings differ in length. To compare strings, use the strcmp function as follows:

```
>> string1 = 'a';
>> strcmp(string1, 'a')
ans = 1
>> strcmp(string1, 'A')
ans = 0
```

Note that MATLAB strings are case sensitive so that 'a' and 'A' are not the same. In addition the strcmp function does not discard whitespace:

```
>> strcmp(string1, ' a')
ans = 0
```

The strings must be exactly the same in every respect.

If the inputs are numeric arrays then the strcmp function will return 0 even if the values are the same. Thus it's only useful for strings. Use the == operator for numeric values.

```
>> strcmp(1,1)
ans = 0
```
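To make the difference between == and strcmp concrete, here is a small sketch with made-up strings. strcmpi, the case-insensitive variant of strcmp, is included for comparison.

```matlab
a = 'cat';
b = 'car';

elementwise = (a == b);              % [1 1 0]: one result per character
sameString  = strcmp(a, b);          % 0: the strings differ
sameNoCase  = strcmpi('Cat', 'CAT'); % 1: strcmpi ignores case

% 'cat' == 'horse' raises an error because the arrays differ in length,
% whereas strcmp simply reports that the strings are not equal.
differentLen = strcmp('cat', 'horse');   % 0
```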

## Displaying values of string variables

If all you want to do is display the value of a string, you can omit the semicolon as is standard in MATLAB.

If you want to display a string in the command window in combination with other text, one way is to use array notation combined with either the 'display' or the 'disp' function:

```
>> fstring = 'hello';
>> display( [ fstring 'world'] )
helloworld
```

MATLAB doesn't put the space in between the two strings. If you want one there you must put it in yourself.
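A minimal sketch of adding the space by hand, and of mixing in a number with num2str. The variable names are illustrative.

```matlab
fstring = 'hello';

greeting = [fstring ' ' 'world'];   % explicit space between the two parts
disp(greeting)                      % prints: hello world

n = 3;
label = ['count: ' num2str(n)];     % numbers must be converted with num2str first
disp(label)                         % prints: count: 3
```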

This syntax is also used to concatenate two or more strings into one variable, which allows insertion of unusual characters into strings:

```
>> fstring = ['you' char(39) 're']
fstring = you're
```

Any other function that returns a string can also be used in the array.

You can also use the "strcat" function to concatenate strings. With two strings it does the same thing as above, but it is especially useful with a cell array of strings because it lets you concatenate the same thing to all of the strings at once. Unfortunately you can't use it to append white space to a character array, because strcat discards trailing whitespace from character array inputs (whitespace inside cell array elements is kept). Here's the syntax for this use.

```
>> strCell = {'A', 'B'};
>> strcat(strCell, '_')
ans =
A_
B_
```
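The whitespace behaviour of strcat is easy to trip over, so here is a short sketch comparing it with square-bracket concatenation. It assumes the documented rule that trailing whitespace is removed from character array inputs but not from cell array inputs.

```matlab
s1 = strcat('hello ', 'world');     % 'helloworld': trailing space of a char input is dropped
s2 = ['hello ' 'world'];            % 'hello world': brackets keep every character
s3 = strcat({'hello '}, 'world');   % {'hello world'}: cell inputs keep their whitespace
```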

Finally, although MATLAB doesn't have a printf function you can do essentially the same thing by using 1 as your file identifier in the fprintf function. The format identifiers are essentially the same as they are in C.

```
>> X = 9.2;
>> fprintf(1, '%1.3f\n', X);
9.200
```

The "9.200" is printed to the screen. fprintf is convenient compared to display because you don't have to call num2str on every number in a string; just put the appropriate format identifier where you want the number to appear.

```
>> X = 9.2;
>> fprintf(1, 'The value of X is %1.3f meters per second \n', X);
The value of X is 9.200 meters per second
```
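If you want the formatted text in a variable rather than on the screen, sprintf accepts the same format identifiers and returns the string. A minimal sketch:

```matlab
X = 9.2;

msg = sprintf('The value of X is %1.3f meters per second', X);
% msg now holds 'The value of X is 9.200 meters per second'

fprintf('%s\n', msg);   % with no file identifier, fprintf writes to the screen
```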

## Cell arrays of strings

In many applications (particularly those where you are parsing text files, reading Excel sheets with text, etc.) you will encounter cell arrays of strings.

You can use the function "iscellstr" to tell if all of the elements in a given cell array are strings or not.

```
>> notStrCell = {'AA', []};
>> iscellstr(notStrCell)
ans = 0
```

This is useful since functions that work with cell arrays of strings will fail if provided with something that's not a cell array of strings. In particular, they all fail if any elements of the provided cell array are the empty array ( [] ), which is frustrating if the provided text file contains empty cells. You must check for this case before calling cellstr manipulation functions.
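One possible way to guard against empty cells, sketched here with made-up data, is to replace each [] with the empty string '' before calling the string functions:

```matlab
rawCell = {'AA', [], 'B'};                   % hypothetical data with one empty cell

isEmptyCell = cellfun('isempty', rawCell);   % [0 1 0]
cleanCell = rawCell;
cleanCell(isEmptyCell) = {''};               % swap [] for the empty string

ok = iscellstr(cleanCell);                   % 1: now a valid cell array of strings
```

Whether an empty string is the right stand-in depends on the application; dropping the empty cells with rawCell(~isEmptyCell) is another option.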

Searching a cell array of strings can be done with the "strmatch", "strfind", and "regexp" functions. Strmatch looks for a string within a cell array of strings whose first characters exactly match the string you pass to it, and returns the index of all strings in the array for which it found a match. If you give it the 'exact' option, it will only return the indexes of elements that are exactly the same as what you passed. For example:

```
>> strCell = {'Aa', 'AA'};
>> strmatch('A', strCell)
ans = 1, 2
>> strmatch('A', strCell, 'exact')
ans = []
>> strmatch('Aa', strCell, 'exact')
ans = 1
```
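The same two searches can also be sketched with strncmp and strcmp, which accept a cell array of strings and return one logical value per cell. This is an alternative formulation rather than a description of how strmatch works internally; the third cell is added only to show a non-match.

```matlab
strCell = {'Aa', 'AA', 'bA'};

prefixIdx = find(strncmp(strCell, 'A', 1));   % [1 2]: strings whose first character is 'A'
exactIdx  = find(strcmp(strCell, 'Aa'));      % 1: strings exactly equal to 'Aa'
```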

Strfind looks for a specific string within a cell array of strings, but it tries to find it in any part of each string. For each element x of the given cell array of strings, it will return an empty array if there is no match found in x and the starting index (remember, strings are arrays of characters) of all matches in x if a match to the query is found.

```
>> strCell = {'Aa', 'AA'};
>> strfind(strCell, 'A')
ans = % answer is a cell array with two elements (same size as strCell):
1         % Index of the beginning of string "A" in the first cell
1  2      % Index of each instance of the beginning of string "A" in the second cell
>> strfind(strCell, 'a')
ans =
2         % Index of the beginning of string "a" in the first cell
[]        % No match in the second cell
```

The "cellfun" / "isempty" combination is very useful for identifying cases where the string was or was not found. You can use the find function in combination with these two functions to return the index of all the cells in which the query string was found.

```
>> strCell = {'Aa', 'AA'};
>> idxCell = strfind(strCell, 'a');
>> isFound = ~cellfun('isempty', idxCell); % "0" where a cell of idxCell is empty (no match) and "1" otherwise
>> foundIdx = find(isFound)
foundIdx = 1
```

The find function also has some other options, such as returning only the first or last n matches, for example find(isFound, 1, 'first'). See the documentation for details.

The regexp function works the same way as strfind, but instead of looking for strings literally, it tries to find matches within the cell array of strings using regular expressions. Regular expressions are a powerful way to match patterns within strings (not just specific strings within strings). Entire books have been written about regular expressions, so they cannot be covered in detail here. However, some good resources online include regular-expressions.info and the MATLAB documentation for the MATLAB-specific syntax. Note that MATLAB implements some, but not all, of the extended regular expressions available in other languages such as Perl.
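As a brief sketch of regexp on a cell array of strings (the file names are hypothetical), the pattern \d+ matches a run of digits and \.txt$ matches a literal ".txt" at the end of a string:

```matlab
strCell = {'file01.txt', 'notes.doc', 'file27.txt'};   % hypothetical file names

startIdx = regexp(strCell, '\d+');                    % start index of each digit run, per cell
digits   = regexp(strCell, '\d+', 'match', 'once');   % first digit run in each cell, or empty
isTxt    = ~cellfun('isempty', regexp(strCell, '\.txt$'));   % which names end in .txt
```

Here startIdx{2} and digits{2} are empty because 'notes.doc' contains no digits, and isTxt is true for the first and third names only.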

Unfortunately, older MATLAB releases lack built-in functions for some string operations that are common in other languages, such as string splitting; newer releases add functions such as strsplit and strjoin. For anything still missing, a web search will usually turn up a suitable implementation.
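Where they are available, two built-in routes to splitting are regexp with the 'split' option and strtok. The sketch below assumes a release that supports the 'split' option and uses a made-up comma-separated string.

```matlab
csvLine = 'alpha,beta,gamma';                 % hypothetical comma-separated text

parts = regexp(csvLine, ',', 'split');        % {'alpha', 'beta', 'gamma'}
[firstTok, rest] = strtok(csvLine, ',');      % 'alpha' and ',beta,gamma'
```

strtok peels off one token at a time, so it is usually called in a loop, while the regexp form returns all the pieces at once.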