The essentials
Numerical approximations
Approximating \(\pi\)
What do you mean?
Pseudorandom numbers, logistic map, and chaos!?
Simulating ecology
Ecological communities
Species interactions & coexistence
Note: This workshop is about simulations, not programming. Don’t worry about the details; instead, focus on building intuition and a mental model.
There are only a few things you will need to run a simulation: 1) generate a sequence of numbers, 2) have some way to store data, 3) be able to manipulate that data, and 4) have imagination 😄
Before any simulation can happen, we need to generate some numbers or a range of numbers. Think of it as the “spacetime” or canvas in which the reality of your simulation unfolds. Oftentimes this is a sequence of ordered integers or a random sequence of numbers drawn from some probability distribution.
These are the commonly used functions that you can use to generate such numbers in R:
Ordered sequence: seq()
Random sequence
Uniform distribution: runif()
Normal distribution: rnorm()
Examples
seq(from = 1, to = 10, by = 2)
## [1] 1 3 5 7 9
runif(n = 5, min = 0, max = 1)
## [1] 0.5581363 0.4001734 0.3617694 0.5873508 0.2044456
rnorm(n = 5, mean = 5, sd = 1)
## [1] 5.681982 4.245183 3.589591 4.480622 4.185396
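Note that runif() and rnorm() draw new numbers every time they run, so the outputs above will differ on your machine. If you want a simulation to be reproducible, you can fix the random number generator’s state with set.seed() first (the seed value 42 here is arbitrary):

```r
set.seed(42)                    # fix the random number generator's state
runif(n = 3, min = 0, max = 1)  # three "random" numbers...

set.seed(42)                    # resetting the same seed...
runif(n = 3, min = 0, max = 1)  # ...reproduces exactly the same draws
```

This is handy when you want to share a simulation and have others see the same results.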
Once we have generated some numbers, and later when we run the actual simulation, we need some way to store the data.
In R, the commonly used data structures are:
Vectors - a sequence of items of the same type (integers, numerics, characters)
vec <- c(1, 2, 3)
vec
## [1] 1 2 3
Matrices or Arrays - generalizations of vectors to multiple dimensions
mat <- matrix(data = 1:9, nrow = 3, ncol = 3)
mat
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
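Individual entries are addressed by [row, column]; note that matrix() fills the data column by column by default. A quick sketch:

```r
mat <- matrix(data = 1:9, nrow = 3, ncol = 3)

mat[2, 3]  # the entry in row 2, column 3
## [1] 8

mat[, 1]   # leaving the row index blank returns the whole first column
## [1] 1 2 3
```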
Data frames - a collection of variables organized in a table. Each column is a variable and each row is an observation/datum
df <- data.frame(varA = c(1, 5, 10), varB = c('blue', 'red', 'green'))
df
## varA varB
## 1 1 blue
## 2 5 red
## 3 10 green
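Columns of a data frame can be pulled out by name with $, and rows can be filtered with a logical condition. A small sketch using the data frame above:

```r
df <- data.frame(varA = c(1, 5, 10), varB = c('blue', 'red', 'green'))

df$varA             # extract a column by name
## [1]  1  5 10

df[df$varA > 1, ]   # keep only the rows where varA exceeds 1
```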
Lists - the “meta” data structure, which can store objects of any type. It can be a list of lists, a list of vectors, a list of data frames, or a list of any combination of data structures!
list(vec, mat, df)
## [[1]]
## [1] 1 2 3
##
## [[2]]
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 2 5 8
## [3,] 3 6 9
##
## [[3]]
## varA varB
## 1 1 blue
## 2 5 red
## 3 10 green
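To get things back out of a list, double brackets [[ ]] extract the element itself, while single brackets [ ] return a (sub)list; named elements can also be accessed with $. A minimal sketch (the names here are made up):

```r
vec <- c(1, 2, 3)
my_list <- list(numbers = vec, letters = c("a", "b"))

my_list[[1]]     # double brackets extract the element itself
## [1] 1 2 3

my_list$letters  # named elements can also be pulled out with $
## [1] "a" "b"
```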
In general, loops are used to do repetitive tasks. Although R has ways of doing repetitive tasks without loops (via vectorized operations), loops are particularly useful for iterating functions that are time-dependent. If your simulation has some time-dependent component, use loops! The two types of loops you can do are for loops and while loops. Sometimes either approach can work, but the main difference between them is that with for loops you usually know a priori the number of iterations your program will run. Sometimes, however, you don’t know how many iterations your program will run for. Instead, you might have some condition or criterion that determines when the program stops (e.g. the accuracy of an approximation).
For loops - iterate (i) through some sequence and apply some function at each iteration
for (i in seq_along(vec)){  # seq_along() is safer than 1:length(vec) for empty vectors
  print(vec[i])
}
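A for loop is exactly the pattern most time-dependent simulations follow: pre-allocate storage, then let each step depend on the previous one. A minimal sketch (the starting size of 100 and growth rate of 1.1 are made-up values):

```r
n_steps <- 10
pop <- numeric(n_steps)  # pre-allocate storage for the whole time series
pop[1] <- 100            # arbitrary initial population size

for (t in 2:n_steps){
  pop[t] <- pop[t - 1] * 1.1  # each step depends on the previous step
}

pop  # the stored time series: exponential growth
```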
While loops - while a condition is true, keep iterating
val <- 1
while (val < 10){
  # some function
  print(val)
  # update the test variable, or the loop will never end
  val <- val + 1
}
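A concrete version of “stop when a criterion is met”: keep halving a value until it drops below some tolerance. We don’t know in advance how many halvings that takes, which is exactly when a while loop shines (the tolerance of 0.001 here is an arbitrary choice):

```r
val <- 1
tol <- 0.001  # stop once val drops below this tolerance
n_iter <- 0   # count how many iterations it takes

while (val >= tol){
  val <- val / 2
  n_iter <- n_iter + 1
}

n_iter
## [1] 10
```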
Some inspiration from the great explainer
“Our imagination is stretched to the utmost, not, as in fiction, to imagine things which are not really there, but just to comprehend those things which are there.” -Richard Feynman
Sometimes an analytical solution is hard to derive or is simply not available. One way to get around this is to approximate the solution using things we already know how to do.
Believe it or not, mathematics used to be an experimental science. Before the concept of infinity or calculus was a thing, people had no way of dealing with curves. But curves are everywhere! To estimate the circumference of a circle, the Greek polymath Archimedes started by fitting a hexagon (a shape with straight lines!) inside a circle. Since he knew that the distance from each corner of the polygon to the center of the circle is equal to the radius (r), he could apply the Pythagorean theorem to get the distance between adjacent corners. Summing up those distances then gives an approximation of the circle’s circumference. But the approximation from a hexagon is crude. Instead, Archimedes continued adding more edges to the polygon, and with each added edge, the approximation got closer and closer to the actual value.
Instead of drawing polygons and solving the Pythagorean equation by hand(!), we can now have computers do it for us!
Suppose we have a circle inscribed within a unit square, like so:
The value of \(\pi\) is then equal to 4 times the ratio of the circle’s area to the square’s area:
\[ \pi = 4 * \frac{Area_{circle}}{Area_{square}} \]
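To see where the factor of 4 comes from: for a circle of radius \(r\) inscribed in a square of side \(2r\),

\[ \frac{Area_{circle}}{Area_{square}} = \frac{\pi r^2}{(2r)^2} = \frac{\pi}{4} \quad\Rightarrow\quad \pi = 4 * \frac{Area_{circle}}{Area_{square}} \]

so if we can estimate that ratio of areas, we can estimate \(\pi\).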
Obviously, anyone who’s taken elementary geometry would tell you “but we need \(\pi\) to get the area of the circle!” Instead of using equations, we can fill up the space (that makes up the shapes) with quantities that we already know and use those to approximate area. It’s like estimating the volume of a vase by adding water from a graduated cylinder. Instead of water, we can fill up that space with points of data on a computer. For example, we can use the seq() function to fill up the space between 0 and 1, and the number of points within that range gives us some quantitative information about the space. Since the circle is smaller than the square, some points will lie within the circle while others fall outside. The number of points that lie within and outside of the circle therefore gives us an approximation of the ratio between the areas of the circle and the square. The more points we use, the closer we get to the actual value. Just as we could fill our vase with discrete, coarse objects like marbles, pebbles, or coins instead of water, coarser objects simply reduce the resolution of our approximation.
res <- 100 # resolution; the higher, the better the approximation
points <- seq(from = 0, to = 1, length.out = res)
xy_points <- expand.grid(x = points, y = points) #gives a pairwise combination of each value
# Plot with points
circle_square_p +
  geom_point(data = xy_points, aes(x = x, y = y), size = 0.5, alpha = 0.3) +
  lims(x = c(0, 1), y = c(0, 1))
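To finish the approximation, we count how many grid points fall inside the circle and plug that ratio into the formula. A sketch, assuming the circle is centered at (0.5, 0.5) with radius 0.5 (i.e. inscribed in the unit square), and repeating the grid setup from above so the chunk runs on its own:

```r
res <- 100 # resolution; the higher, the better the approximation
points <- seq(from = 0, to = 1, length.out = res)
xy_points <- expand.grid(x = points, y = points)

# a point lies inside the circle if its distance from the
# center (0.5, 0.5) is at most the radius 0.5
inside <- (xy_points$x - 0.5)^2 + (xy_points$y - 0.5)^2 <= 0.5^2

# fraction of points inside ~ ratio of areas, so multiply by 4
pi_approx <- 4 * mean(inside)
pi_approx  # close to pi; increase res for a better approximation
```

Try increasing res and watch the estimate converge toward the true value of \(\pi\).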