0. The Basics of R: Practical 0
Selecting subsets
The command unique() returns the distinct entries of a variable. This can be very useful for exploring large data sets. For the gapminder data it can, for example, help to find out for how many countries we have data. To determine the length of a vector, you can use the command length().
Copy the column country from G into a vector called country. Subsequently, apply the commands unique() and length() to find how many countries the dataset contains.
Use the following commands:
country <- G$country
country_unique <- unique(country)
length(country_unique)
Alternatively, you could do this in one command:
length( unique(G$country) )
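To see what each of the two functions contributes, here is a minimal sketch on a small made-up data frame. The name toy and all of its values are invented for illustration; it is only a hypothetical stand-in for G. unique() returns the distinct values themselves, and length() then counts them.

```r
# Hypothetical stand-in for G: invented rows, for illustration only
toy <- data.frame(
  country = c("Norway", "Norway", "Chile", "Chile", "Kenya"),
  year    = c(2002, 2007, 2002, 2007, 2007),
  stringsAsFactors = FALSE
)

# unique() keeps each value once, in order of first appearance
toy_countries <- unique(toy$country)

# length() counts how many distinct values are left
n_countries <- length(toy_countries)

print(toy_countries)
print(n_countries)
```

Printing the intermediate vector toy_countries is a quick way to check that the count refers to the values you expect.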
You can also use the values of one variable to make selections from the dataset. For this, logical operators such as == can be used. The following commands, for example, mark all rows in G that belong to Europe, and then use the result to take a subset of the column country (which is stored in a new vector countryEurope).
inEurope <- G$continent == 'Europe'
countryEurope <- G$country[inEurope]
# equivalent to the above:
countryEurope <- G$country[G$continent == 'Europe']
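It can help to look at the intermediate logical vector. The sketch below uses a small invented data frame (toy, a hypothetical stand-in for G) to show that the comparison yields one TRUE or FALSE per row, and that indexing with it keeps only the positions that are TRUE.

```r
# Hypothetical stand-in for G: invented rows, for illustration only
toy <- data.frame(
  country   = c("Norway", "Chile", "France", "Kenya"),
  continent = c("Europe", "Americas", "Europe", "Africa"),
  stringsAsFactors = FALSE
)

# The comparison is evaluated for every row and gives a logical vector
in_europe <- toy$continent == "Europe"
print(in_europe)

# Indexing with a logical vector keeps the elements where it is TRUE
europe_countries <- toy$country[in_europe]
print(europe_countries)

# TRUE counts as 1 in sum(), so this gives the number of matching rows
n_europe_rows <- sum(in_europe)
print(n_europe_rows)
```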
Make a vector with r data for the year E and the continent r.
You can do this in a few steps:
1) Select all data for E
Save this in a new dataframe GE.
GE <- G[G$year == E, ]
The selection between the square brackets means: 1) select all rows from G for which G$year is E and 2) (after the ,) use all columns.
2) Select all rows for which the continent is r.
The syntax is the same as in the first step, but now uses GE to start with.
GE_r <- GE[GE$continent == "r", ]
3) Select the column r
For the third step, select from the dataframe that contains only data from r in E (created in step 2).
GE_r_r <- GE_r$r
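The three steps can also be combined into a single selection by joining the two conditions with & (element-wise logical AND). The sketch below is only an illustration under stated assumptions: toy is an invented stand-in for G, and the year 2007, the continent "Asia" and the column name lifeExp are assumed example values, not the ones from the exercise.

```r
# Hypothetical stand-in for G: all rows and values are invented
toy <- data.frame(
  country   = c("Japan", "Japan", "India", "Chile"),
  continent = c("Asia", "Asia", "Asia", "Americas"),
  year      = c(2002, 2007, 2007, 2007),
  lifeExp   = c(82.0, 82.6, 64.7, 78.6),
  stringsAsFactors = FALSE
)

# Step by step, as in the exercise
toy_2007      <- toy[toy$year == 2007, ]                   # rows for one year, all columns
toy_2007_asia <- toy_2007[toy_2007$continent == "Asia", ]  # then rows for one continent
stepwise      <- toy_2007_asia$lifeExp                     # then a single column

# The same selection in one command: both conditions must hold for a row
combined <- toy$lifeExp[toy$year == 2007 & toy$continent == "Asia"]

# Both routes give the same vector
print(identical(stepwise, combined))
```

The stepwise version is easier to inspect while learning, because each intermediate data frame can be printed; the combined version avoids creating the intermediate objects.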