Character Vectors and Factors

Dipping back into “Statistics for Linguists” by Bodo Winter for a bit.

The example given is gender <- c('F', 'M', 'F', 'F'); apparently you can use either single or double quotes. When you type gender at the console to display it, R responds with double quotes either way.
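A quick sketch of the quoting behaviour, for anyone following along at the console. The name gender_double is made up here purely for the comparison; it isn't from the book.

```r
# Single and double quotes create identical character vectors
gender <- c('F', 'M', 'F', 'F')
gender_double <- c("F", "M", "F", "F")   # hypothetical name, for comparison only

same <- identical(gender, gender_double)
same            # TRUE

# Printing always shows the strings in double quotes
print(gender)
```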

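You can use class(gender) to confirm that this is indeed a character vector, and you can address elements individually using square-bracket notation, just as with numeric vectors, so gender[2] returns "M". You can also put logical statements in the square brackets, which Bodo mentions as though it’s been covered before, though I don’t recall it being. Anyhow, gender[gender == 'F'] returns the three elements that are F. I’d have thought it would be more useful to return the indices of those elements, so that if you had a matching vector of names you could use something like names[gender[gender == 'F']] to get the elements of names corresponding to those in gender that are F. I tried this and it just returned the first element of names three times. (That is what happens when gender is already a factor: R indexes by the factor’s underlying integer codes, and F is level 1. With a plain character vector you’d get NA three times instead, because R would look for elements named "F".) What does work is names[gender == 'F'], which seems less clear at first: the comparison gender == 'F' produces a TRUE or FALSE for each element, and R keeps the elements of names where it’s TRUE. If you do want the indices themselves, which(gender == 'F') returns them.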
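Here is a small sketch of the indexing described above. The vector people and its contents are invented for illustration (and named people rather than names to avoid confusion with R's built-in names() function); nothing here is from the book beyond gender itself.

```r
gender <- c('F', 'M', 'F', 'F')
# Hypothetical companion vector: one name per element of gender
people <- c('Anna', 'Ben', 'Cara', 'Dina')

class(gender)              # "character"
gender[2]                  # "M"
gender[gender == 'F']      # "F" "F" "F"

# Logical indexing: keep the elements of people where the test is TRUE
f_people <- people[gender == 'F']        # "Anna" "Cara" "Dina"

# which() returns the matching positions themselves
f_positions <- which(gender == 'F')      # 1 3 4
people[f_positions]                      # same result as f_people

# Indexing by the character values looks for elements *named* "F": all NA
by_character <- people[gender[gender == 'F']]

# Indexing by a factor uses its integer codes (F is level 1),
# so this returns the first element three times
gender_factor <- as.factor(gender)
by_factor <- people[gender_factor[gender_factor == 'F']]
```

The last two lines are the traps: neither gives an error, they just quietly return the wrong thing.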

Next we’re introduced to factors, but not what they are. gender <- as.factor(gender) converts the vector gender to a factor. The key change seems to be that when you display the vector on the console R doesn’t put the letters in quotes and returns an additional line that reads Levels: F M. From the description it looks like the values are tokenised, i.e. stored as integer codes that point into a fixed set of labels. The levels() function displays the valid levels for the vector, and if you try to replace a value with one that isn’t a valid level, e.g. gender[3] <- 'not declared', then you get a warning message (not an error) and the element is replaced with NA. If you need to add a new level you can do so by assigning to levels(), using the c() function to populate it: levels(gender) <- c('F', 'M', 'not declared'). After that, gender[3] <- 'not declared' works.
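The factor steps above can be sketched as follows. This is only a minimal walk-through using the same gender example; the helper variables codes_before and was_na are names introduced here to hold intermediate results.

```r
gender <- as.factor(c('F', 'M', 'F', 'F'))

levels(gender)                        # "F" "M"
codes_before <- as.integer(gender)    # 1 2 1 1, the underlying integer codes

# Assigning a value that is not a level triggers a warning and stores NA
gender[3] <- 'not declared'
was_na <- is.na(gender[3])            # TRUE

# Add the new level first, then the assignment succeeds
levels(gender) <- c('F', 'M', 'not declared')
gender[3] <- 'not declared'
gender
```

Assigning to levels() this way relies on the existing levels staying in the same order; listing them in a different order would relabel the existing values rather than just add a new one.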

Published by stephenboothuk

A former Oracle DBA, then Technical Business Analyst and now I'm not sure what I am. If you want to find out more about me, my LinkedIn profile can be found at: http://www.linkedin.com/in/stephenboothuk
