
The **sample()** function in R allows you to take a random sample of elements from a dataset or a vector, either with or without replacement.

The basic syntax for the sample() function is as follows:

**sample(x, size, replace = FALSE, prob = NULL)**

**x**: a dataset or vector from which to choose the sample

**size**: size of the sample

**replace**: should sampling be with replacement? (this is FALSE by default)

**prob**: a vector of probability weights for obtaining the elements of the vector being sampled

*The complete documentation for sample() can be viewed in R by running ?sample.*
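As a rough illustration of the **prob** argument described above, the sketch below draws from a small vector using unequal weights. The vector `fruits`, the weights, and the seed value are hypothetical and chosen only for this example.

```r
# hypothetical vector and weights, chosen only for illustration
fruits <- c("apple", "banana", "cherry")
weights <- c(0.7, 0.2, 0.1)

# fix the seed so the draw can be repeated
set.seed(1)

# draw 1000 values with replacement, favoring "apple"
draws <- sample(fruits, size = 1000, replace = TRUE, prob = weights)

# share of each value in the sample; it should be close to the weights
prop.table(table(draws))
```

With a large sample, the observed shares should land near 0.7, 0.2, and 0.1, although the exact figures depend on the seed.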

The following examples illustrate practical uses of sample().

**Generating a Sample from a Vector**

Suppose we have a vector *a* with 10 elements in it:

    #define vector a with 10 elements in it
    a <- 1:10

To generate a random sample of 5 elements from vector *a* without replacement, we can use the following syntax:

    #generate random sample of 5 elements from vector a
    sample(a, 5)

    #[1] 3 1 4 7 5

It's important to note that each time we generate a random sample, we will likely get a different set of elements.

    #generate another random sample of 5 elements from vector a
    sample(a, 5)

    #[1] 1 8 7 4 2

If we would like to be able to replicate our results and work with the same sample each time, we can use **set.seed()**.

    #set.seed(some random number) to ensure that we get the same sample each time
    set.seed(122)

    #define vector a with 10 elements in it
    a <- 1:10

    #generate random sample of 5 elements from vector a
    sample(a, 5)

    #[1] 10 9 2 1 4

    #reset the seed, then generate another random sample of 5 elements from vector a
    set.seed(122)
    sample(a, 5)

    #[1] 10 9 2 1 4
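One detail worth checking is that the seed has to be set again immediately before each call that should reproduce the same draw. The sketch below is a minimal check of that behavior; the variable names `first_draw`, `second_draw`, and `third_draw` are hypothetical, and the vector is assumed to hold the values 1 through 10.

```r
# vector used for this sketch (values 1 through 10 are an assumption)
a <- 1:10

# set the seed, then draw a sample
set.seed(122)
first_draw <- sample(a, 5)

# reset the same seed before drawing again
set.seed(122)
second_draw <- sample(a, 5)

# both draws contain the same elements in the same order
identical(first_draw, second_draw)
#[1] TRUE

# without resetting the seed, the next draw continues the random stream
# and will generally differ from the first two
third_draw <- sample(a, 5)
```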

We can also use the argument **replace = TRUE** so that we are sampling with replacement. This means that each element in the vector can be chosen to be in the sample more than once.

    #generate random sample of 5 elements from vector a using sampling with replacement
    sample(a, 5, replace = TRUE)

    #[1] 10 10 2 1 6
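A practical consequence of sampling with replacement is that the sample can be larger than the vector itself, which is not possible without replacement. The following sketch illustrates this, assuming the vector holds the values 1 through 10; the names `big_sample` and `result` are hypothetical.

```r
# vector used for this sketch (values 1 through 10 are an assumption)
a <- 1:10

# with replacement, the sample can be larger than the vector itself
big_sample <- sample(a, 15, replace = TRUE)
length(big_sample)
#[1] 15

# 15 draws from only 10 distinct values must contain at least one repeat
anyDuplicated(big_sample) > 0
#[1] TRUE

# without replacement, asking for more than 10 elements raises an error;
# tryCatch() captures the error message instead of stopping the script
result <- tryCatch(
  sample(a, 15),
  error = function(e) conditionMessage(e)
)
result
```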

**Generating a Sample from a Dataset**

Another common use of the sample() function is to generate a random sample of rows from a dataset. For the following example, we will generate a random sample of 10 rows from the built-in R dataset **iris**, which has 150 total rows.

    #view first 6 rows of iris dataset
    head(iris)

    #  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
    #1          5.1         3.5          1.4         0.2  setosa
    #2          4.9         3.0          1.4         0.2  setosa
    #3          4.7         3.2          1.3         0.2  setosa
    #4          4.6         3.1          1.5         0.2  setosa
    #5          5.0         3.6          1.4         0.2  setosa
    #6          5.4         3.9          1.7         0.4  setosa

    #set seed to ensure that this example is replicable
    set.seed(100)

    #choose a random vector of 10 elements from all 150 rows in iris dataset
    sample_rows <- sample(1:nrow(iris), 10)

    #choose the 10 rows of the iris dataset that match the row numbers above
    sample <- iris[sample_rows, ]
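The two steps described above (choosing row numbers, then selecting the matching rows) can also be combined into a single expression. This is only a sketch of one possible shorthand; the name `iris_sample` is hypothetical.

```r
# set the seed so the same rows are selected on each run
set.seed(100)

# sample 10 row numbers and subset the data frame in a single step
iris_sample <- iris[sample(nrow(iris), 10), ]

# confirm that the result contains 10 rows
nrow(iris_sample)
#[1] 10
```

Here `sample(nrow(iris), 10)` relies on the fact that passing a single positive number n to sample() draws from the integers 1 through n.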

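Note that if you copy and paste the above code into your own R console, you should get the exact same sample, since we used **set.seed(100)** to ensure that we get the same sample each time. Results can differ between R versions older than 3.6.0 and newer ones, because the default sampling algorithm changed in R 3.6.0.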