### 0. The Basics of R: Practical 0

### Selecting subsets

The command `unique()` returns the unique entries in a variable. This can be very useful for exploring large data sets. For the gapminder data it can, for example, help to find out for how many countries we have data. To determine the length of a vector, you can use the command `length()`.
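As a minimal sketch of what the two commands return, the following uses a small made-up vector. The name `x` and its values are invented for illustration and are not part of the gapminder data.

```r
# Hypothetical example vector; the values are made up for illustration
x <- c("a", "b", "a", "c", "b")

# unique() returns the distinct entries, in order of first appearance
unique(x)   # "a" "b" "c"

# length() returns the number of entries in a vector
length(x)   # 5
```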

Copy the column **country** from **G** into a vector called **country**. Subsequently, apply the commands `unique()` and `length()` to find how many countries the dataset contains.

Use the following commands:

country <- G$country

country_unique <- unique(country)

length(country_unique)

Alternatively, you could do this in one command:

length( unique(G$country) )

You can also use the values of one variable to make selections from the dataset. For this, logical operators like `==` can be used. The following commands, for example, mark all rows in G that belong to Europe, and then use the result to take a subset of the country column (stored in a new vector countryEurope).

inEurope <- G$continent == 'Europe'

countryEurope <- G$country[inEurope]

# equivalent to the above:

countryEurope <- G$country[G$continent == 'Europe']
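To see what the comparison itself produces, here is a small sketch on a toy data frame. The name `G_toy` and its four rows are hypothetical stand-ins for G, chosen only so the example runs without the gapminder data.

```r
# Hypothetical toy data frame standing in for G
G_toy <- data.frame(
  country   = c("Norway", "Kenya", "Spain", "Peru"),
  continent = c("Europe", "Africa", "Europe", "Americas"),
  stringsAsFactors = FALSE
)

# The comparison returns one TRUE/FALSE value per row
inEurope <- G_toy$continent == "Europe"
inEurope                  # TRUE FALSE TRUE FALSE

# Indexing with the logical vector keeps only the TRUE positions
G_toy$country[inEurope]   # "Norway" "Spain"

# sum() counts the TRUE values, i.e. the number of matching rows
sum(inEurope)             # 2
```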

Make a vector with **lifeExp** data for the year **1962** and the continent **Africa**. You can do this in a few steps:

**1)** Select all data for **1962**. Save this in a new dataframe **G1962**.

G1962 <- G[G$year == 1962, ]

The selection between the square brackets means: 1) select all rows from G for which G$year is 1962 and 2) (after the comma) use all columns.

**2)** Select all rows for which the continent is **Africa**. The syntax is the same as in the first step, but now uses **G1962** to start with.

G1962_Africa <- G1962[G1962$continent == "Africa", ]

**3)** Select the column **lifeExp**. For the third step, select from the dataframe that contains only data from **Africa** in **1962** (created in step 2).

G1962_Africa_lifeExp <- G1962_Africa$lifeExp
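The three steps can also be collapsed into a single selection by combining both conditions with the element-wise AND operator `&`. This is one possible alternative rather than the approach the practical prescribes. The sketch below checks that both routes give the same vector on a toy data frame; `G_toy` and all of its values are invented for illustration.

```r
# Hypothetical toy data frame; all values are invented for illustration
G_toy <- data.frame(
  continent = c("Africa", "Africa", "Europe", "Africa"),
  year      = c(1962, 1967, 1962, 1962),
  lifeExp   = c(40, 42, 70, 45),
  stringsAsFactors = FALSE
)

# Stepwise, following steps 1) to 3)
toy1962        <- G_toy[G_toy$year == 1962, ]
toy1962_Africa <- toy1962[toy1962$continent == "Africa", ]
stepwise       <- toy1962_Africa$lifeExp

# One step: a row is kept only if both conditions are TRUE
combined <- G_toy$lifeExp[G_toy$year == 1962 & G_toy$continent == "Africa"]

identical(stepwise, combined)  # TRUE
```

The stepwise version leaves intermediate dataframes that can be inspected or reused, while the combined version is shorter when only the final vector is needed.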
