MATLAB Programming/Strings

From Wikibooks, open books for an open world

Declaring Strings

Besides numbers, MATLAB can also manipulate strings. They should be enclosed in single quotes:

 >> fstring = 'hello'
 fstring =
 hello

To include a single quote inside a string, type two single quotes in a row:

 >> fstring = ''''
 fstring = 
 '
 >> fstring = 'you''re'
 fstring =
 you're

An important thing to remember about strings is that MATLAB treats them as arrays of characters. To see this, try executing the following code:

 >> fstring = 'hello';
 >> class(fstring)
 ans = char

Therefore, many of the array manipulation functions will work the same with these arrays as any other, such as the 'size' function, transpose, and so on. You can also access specific parts of it by using standard indexing syntax.
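
As a minimal sketch of the point above, the following applies ordinary array operations to a string. The variable names (sz, firstThree, reversed, column) are illustrative only and do not come from the original text.

```matlab
fstring = 'hello';

sz = size(fstring);              % [1 5]: one row, five characters
firstChar = fstring(1);          % 'h'
firstThree = fstring(1:3);       % 'hel'
reversed = fstring(end:-1:1);    % 'olleh'
column = fstring';               % transpose gives a 5-by-1 character array
```

Because the string is a 1-by-5 row vector, indexing with a range returns a shorter string, and transposing it produces a column of single characters.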

Attempting to perform arithmetic operations on character arrays converts them into doubles.

 >> fstring2 = 'world';
 >> fstring + fstring2
 ans = 223   212   222   216   211

These numbers are the element-by-element sums of the numeric character codes (the ASCII values, for these letters) of the two arrays. The codes themselves can be obtained by using the 'double' function to turn a character array into an array of doubles.

 >> double(fstring)
 ans = 104   101   108   108   111

The 'char' function turns an array of integer-valued doubles back into characters. Passing a non-integer value causes MATLAB to round down:

 >> char(104)
 ans = h
 >> char(104.6)
 ans = h
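
Putting 'double' and 'char' together, arithmetic on character codes can be converted back into text. The snippet below is only an illustrative sketch; the variable names are made up for this example.

```matlab
codes = double('hal');        % [104 97 108]
shifted = char(codes + 1);    % each code moved up by one: 'ibm'
upperA = char('a' - 32);      % 'a' is 97 and 'A' is 65, so this gives 'A'
```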

Since MATLAB strings are treated as character arrays, there are some special functions for working with an entire string at once rather than with its individual characters:

findstr(bigstring, smallstring) looks to see if a small string is contained in a bigger string, and if it is, returns the starting index of each occurrence of the smaller string. Otherwise it returns []. Newer MATLAB documentation recommends strfind instead.
strrep(string1, replaced, replacement) replaces all instances of replaced in string1 with replacement.
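
A short sketch of searching and replacing follows. It uses strfind (covered later on this page) in place of findstr, on the assumption that a reasonably recent MATLAB release is in use. Note the argument order: text first, then the pattern. The sample sentence and variable names are illustrative.

```matlab
bigstring = 'the cat sat on the mat';

idx = strfind(bigstring, 'at');            % [6 10 21]: start of every 'at'
missing = strfind(bigstring, 'dog');       % empty, because there is no match
replaced = strrep(bigstring, 'at', 'ow');  % 'the cow sow on the mow'
```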

Displaying values of string variables

If all you want to do is display the value of a string, you can omit the semicolon as is standard in MATLAB.

If you want to display a string in the command window in combination with other text, one way is to use array notation combined with either the 'display' or the 'disp' function:

 >> fstring = 'hello';
 >> display( [ fstring 'world'] )
 helloworld

MATLAB doesn't put the space in between the two strings. If you want one there you must put it in yourself.

This syntax is also used to concatenate two or more strings into one variable, which allows insertion of unusual characters into strings:

 >> fstring = ['you' char(39) 're']
 fstring = you're

Any other function that returns a string can also be used in the array.

You can also use the "strcat" function to concatenate strings. With two strings it does the same thing as the bracket syntax above, but it is especially useful with a cell array of strings because it lets you concatenate the same thing onto all of the strings at once. Be careful when adding white space: for character array inputs, strcat removes trailing whitespace from each argument before joining them (the contents of cell array inputs are left unchanged). Here's the syntax for the cell array use.

 >> strCell = {'A', 'B'};
 >> strcat(strCell, '_')
 ans =
 A_
 B_
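
The whitespace behaviour of strcat is easy to trip over, so here is a small sketch contrasting it with bracket concatenation. The variable names are illustrative; the behaviour shown is that strcat strips trailing whitespace from character array arguments but not from the contents of a cell array.

```matlab
a = strcat('hello ', 'world');     % 'helloworld': trailing space stripped
b = ['hello ' 'world'];            % 'hello world': brackets keep the space
c = strcat({'hello '}, 'world');   % {'hello world'}: cell contents untouched
```

If a separating space is needed between plain strings, bracket concatenation is the simpler choice.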

Finally, although MATLAB doesn't have a printf function, you can do essentially the same thing with the fprintf function by using 1 (standard output) as the file identifier; if the file identifier is omitted, fprintf also prints to the screen. The format identifiers are essentially the same as they are in C.

 >> X = 9.2;
 >> fprintf(1, '%1.3f\n', X);
 9.200

The "9.200" is printed to the screen. fprintf is convenient compared to display because you don't have to call num2str on all of the numbers in a string; just put the appropriate format identifier in the place you want the value to appear.

 >> X = 9.2;
 >> fprintf(1, 'The value of X is %1.3f meters per second \n', X);
 The value of X is 9.200 meters per second
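
Several values of different types can be formatted in one call. The sketch below also uses sprintf, which is not mentioned above: it accepts the same format string as fprintf but returns the text instead of printing it. The variable names and the message are made up for illustration.

```matlab
name = 'widget';
count = 3;
price = 4.5;

% %s takes a string, %d an integer, %.2f a number with two decimals
fprintf(1, '%s: %d items at %.2f each\n', name, count, price);

% Same formatting, but captured in a variable instead of printed
msg = sprintf('%s: %d items at %.2f each', name, count, price);
```

The fprintf call prints "widget: 3 items at 4.50 each", and msg holds the same text without the trailing newline.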

Cell arrays of strings

In many applications (particularly those where you are parsing text files, reading Excel sheets with text, etc.) you will encounter cell arrays of strings.

You can use the function "iscellstr" to tell if all of the elements in a given cell array are strings or not.

 >> notStrCell = {'AA', []};
 >> iscellstr(notStrCell)
 ans = 0

This is useful because functions that work with cell arrays of strings will fail if given something that is not a cell array of strings. In particular, they fail if any element of the cell array is the empty array ( [] ), which is frustrating when the source text file contains empty cells. Check the input with iscellstr, and replace or remove the offending cells, before calling the cellstr manipulation functions.
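
One possible way to guard against empty cells is sketched below. It assumes that replacing each [] with the empty string '' is acceptable for your data; rawCell, emptyMask and hits are illustrative names.

```matlab
rawCell = {'AA', [], 'Ab'};     % e.g. read from a file with an empty cell

if ~iscellstr(rawCell)
    % Replace empty cells with the empty string '', which is a valid string
    emptyMask = cellfun(@isempty, rawCell);
    rawCell(emptyMask) = {''};
end

isClean = iscellstr(rawCell);   % true once every element is a string
hits = strfind(rawCell, 'A');   % now safe: {1, [], 1}
```

This only handles empty cells. A cell holding a number, for instance, would still make iscellstr return false and would need its own treatment.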

Searching a cell array of strings can be done with the "strmatch", "strfind", and "regexp" functions. Strmatch looks for strings within a cell array of strings whose first characters exactly match the string you pass to it, and returns the indices of all strings in the array for which it found a match. If you give it the 'exact' option, it returns only the indices of elements that are exactly the same as what you passed. (Newer MATLAB documentation marks strmatch as not recommended and points to alternatives such as strncmp.) For example:

 >> strCell = {'Aa', 'AA'};
 >> strmatch('A', strCell)
 ans = 1, 2
 >> strmatch('A', strCell, 'exact')
 ans = []
 >> strmatch('Aa', strCell, 'exact')
 ans = 1

Strfind looks for a specific string within a cell array of strings, but it tries to find it in any part of each string. For each element x of the given cell array of strings, it will return an empty array if there is no match found in x and the starting index (remember, strings are arrays of characters) of all matches in x if a match to the query is found.

 >> strCell = {'Aa', 'AA'};
 >> strfind(strCell, 'A')
 ans = % answer is a cell array with two elements (same size as strCell):
   1         % Index of the beginning of string "A" in the first cell
   1  2      % Index of each instance of the beginning of string "A" in the second cell
 >> strfind(strCell, 'a')
 ans =
   2  % 'a' starts at index 2 of the first cell
   [] % 'a' is not found in the second cell

The "cellfun" / "isempty" combination is very useful for identifying cases where the string was or was not found. You can use the find function in combination with these two functions to return the index of all the cells in which the query string was found.

 >> strCell = {'Aa', 'AA'};
 >> idxCell = strfind(strCell, 'a');
 >> isFound = ~cellfun('isempty', idxCell); % For each cell: 0 if no match was found there, 1 otherwise
 >> foundIdx = find(isFound)
 foundIdx = 1

Note that strfind returns every match it finds; to keep only the first or last match in a string, index into the result with (1) or (end). See the documentation for details.

The regexp function works the same way as strfind, but instead of looking for strings literally, it tries to find matches within the cell array of strings using regular expressions. Regular expressions are a powerful way to match patterns within strings (not just specific strings within strings). Entire books have been written about regular expressions, so they cannot be covered in as much detail here. However, some good resources online include regular-expressions.info and the MATLAB documentation for the MATLAB-specific syntax. Note that MATLAB implements some, but not all, of the extended regular expressions available in other languages such as Perl.
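
As a small illustration, the sketch below looks for runs of digits in a cell array of made-up file names. The pattern '\d+' (one or more digits) and the 'match' option are standard regexp features; the data and variable names are hypothetical.

```matlab
strCell = {'file01.txt', 'notes.md', 'file27.txt'};

% Starting index of each run of digits, one cell per input string
startIdx = regexp(strCell, '\d+');           % {5, [], 5}

% The matched text itself, returned as nested cell arrays
matches = regexp(strCell, '\d+', 'match');   % {{'01'}, {}, {'27'}}

% Same cellfun/isempty idiom as with strfind
isFound = ~cellfun(@isempty, startIdx);      % [1 0 1]
```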

MATLAB historically lacked built-in functions for some string operations that are common in other languages, such as string splitting. Newer releases provide strsplit and strjoin, and regexp with the 'split' option can also split a string. For older releases, many such functions can be found with a web search.
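
For the splitting case specifically, the following sketch shows two approaches that work in releases that provide them: regexp with the 'split' option, and the strsplit function (added in R2013a). The input string and variable names are illustrative.

```matlab
csvLine = 'alpha,beta,gamma';

partsA = regexp(csvLine, ',', 'split');   % {'alpha', 'beta', 'gamma'}
partsB = strsplit(csvLine, ',');          % same result via strsplit

nParts = numel(partsA);                   % 3
second = partsA{2};                       % 'beta'
```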

Comparing strings

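Unlike numeric scalars, strings are not compared as a whole by the relational operators, because MATLAB treats each string as an array of characters: == compares element by element, returns one logical value per character, and raises an error if the two strings have different lengths. To get the comparison you probably intended, use the strcmp function as follows: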

 >> string1 = 'a';
 >> strcmp(string1, 'a')
 ans = 1
 >> strcmp(string1, 'A')
 ans = 0

Note that MATLAB strings are case sensitive so that 'a' and 'A' are not the same. In addition the strcmp function does not discard whitespace:

 >> strcmp(string1, ' a')
 ans = 0

The strings must be exactly the same in every respect.
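
To make the difference concrete, here is a minimal sketch comparing two equal-length strings both ways. strcmpi, the case-insensitive variant of strcmp, is not mentioned above and is included only as a pointer. The variable names are illustrative.

```matlab
s1 = 'hello';
s2 = 'help!';

elementwise = (s1 == s2);      % [1 1 1 0 0]: one result per character
wholeMatch = strcmp(s1, s2);   % false: the strings differ as a whole

% s1 == 'hi' would raise an error because the lengths differ,
% whereas strcmp simply returns false in that case.
shortMatch = strcmp(s1, 'hi');           % false

caseBlind = strcmpi('Hello', 'hello');   % true: case is ignored
```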

If the inputs are numeric arrays then the strcmp function will return 0 even if the values are the same. Thus it's only useful for strings. Use the == operator for numeric values.

 >> strcmp(1,1)
 ans = 0