2.1 Problem

Well, what about randomness and distributions? Is randomness even relevant to distributions?

Yes! Randomness is fundamental to numerous other concepts, such as independence, variability, sampling, random variables, distributions, random walks, re-randomisation, and much more (see Batanero et al., 2016; New Zealand Ministry of Education, 2012).

The connection between random phenomena and structured distributions appears to be rarely taught, but it is essential to how everyday events can be expressed statistically. This is just one example of how misconceptions about randomness could affect understanding in other areas of statistics.

Learning Aims:

  1. Consider how random events might be expressed over an area using knowledge of distributions.
  2. Explore the relationship between small samples, large samples, and distributions.
  3. Consider how randomness can affect waiting times and build into different distributions.
  4. Recognise the process of random sampling from distributions.


2.2 Spatial Randomness

Introduction

Spatial randomness differs slightly from the typical sequence-based examples we usually see. This is where we consider random events over an area.

For this section, we’re going to be exploring spatial randomness with raindrops. Imagine you are watching the rain as it falls on the pavestones outside. The pavestones make a 20 by 20 grid, comprising 400 squares.




  • Discussion time! Before moving on, talk in pairs or groups about what 50 raindrops on the pavestones might look like if they left a mark. What assumptions might you make?
  • When you have discussed what this might look like, consider how you might go about simulating this… Hint: what kind of distribution might you use?

See some discussion examples on rainfall (uanga) and glow-worms (pūrātoke)!

Community Time!
In this sub-section, Spatial Randomness, we are learning about how randomness can be seen over an area.
Ask your community about their experiences with seeing random phenomena, like rainfall and glow-worms, and any stories relating to this.


Simulating Rainfall

Let’s have a bit more of an explore. Using the plot below, change the kind of distribution used in the simulation to produce a spatial plot. This plot represents the pavestones, with the dots recording where each of the 50 raindrops falls.

  • For each plot, discuss what assumptions you might make about the way the rain is falling. Some of the distributions might not be appropriate for this example. Which ones do you think they might be?
  • Which distribution do you think worked best?

More information on these distributions here.
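If you would like to try the rainfall simulation away from the plot, here is a minimal sketch in Python. It is not the code behind the plot on this page. The 20 by 20 grid and the 50 raindrops come from the text; the function names, the seed, and the `spread` value for the normal version are assumptions made for illustration.

```python
import random
from collections import Counter

GRID_SIZE = 20   # the pavestones form a 20 by 20 grid (from the text)
N_DROPS = 50     # number of raindrops (from the text)

def simulate_uniform(n_drops, grid_size, rng):
    """Each drop is equally likely to land anywhere on the grid."""
    return [(rng.random() * grid_size, rng.random() * grid_size)
            for _ in range(n_drops)]

def simulate_normal(n_drops, grid_size, rng, spread=3.0):
    """Drops cluster around the centre; `spread` is an assumed standard deviation."""
    centre = grid_size / 2
    drops = []
    while len(drops) < n_drops:
        x, y = rng.gauss(centre, spread), rng.gauss(centre, spread)
        if 0 <= x < grid_size and 0 <= y < grid_size:  # discard drops off the grid
            drops.append((x, y))
    return drops

def count_per_square(drops):
    """Count how many drops land on each 1 by 1 pavestone."""
    return Counter((int(x), int(y)) for x, y in drops)

rng = random.Random(2020)  # fixed seed so the sketch is repeatable
uniform_drops = simulate_uniform(N_DROPS, GRID_SIZE, rng)
normal_drops = simulate_normal(N_DROPS, GRID_SIZE, rng)

uniform_counts = count_per_square(uniform_drops)
normal_counts = count_per_square(normal_drops)
print("Pavestones hit (uniform):", len(uniform_counts))
print("Pavestones hit (normal): ", len(normal_counts))
```

Comparing how many different pavestones get hit under each version is one way to put numbers on the assumptions you discussed above.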


Short-run & Long-run

Our next exploration will look at some examples of short-run and long-run simulations.

  • Before you play with the slider, which distribution do you think this might be?

Drag the slider to change the number of observations.

  • Was your distribution prediction correct?

See some discussion examples on short run and long run simulations!
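The short-run versus long-run idea can also be sketched in code. The example below draws a small and a large sample and measures how far each one sits from the distribution it came from. The standard normal is only a stand-in here, since the page does not say which distribution the slider uses, and the sample sizes, seed, and function names are assumptions.

```python
import math
import random

def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def max_cdf_gap(sample):
    """Largest gap between the sample's empirical CDF and the normal CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        theory = normal_cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x, so check both sides.
        gap = max(gap, abs(theory - i / n), abs((i + 1) / n - theory))
    return gap

rng = random.Random(1)  # fixed seed so the sketch is repeatable
short_run = [rng.gauss(0, 1) for _ in range(20)]
long_run = [rng.gauss(0, 1) for _ in range(20000)]

short_gap = max_cdf_gap(short_run)
long_gap = max_cdf_gap(long_run)
print(f"20 observations:     gap = {short_gap:.3f}")
print(f"20,000 observations: gap = {long_gap:.3f}")
```

A smaller gap means the sample looks more like the underlying distribution, which is what the slider shows visually as you add observations.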


2.3 Waiting Times

Introduction

Waiting times are another example where we can consider randomness.

For this section, we’re going to be exploring waiting times using tweets. Imagine you’re procrastinating, scrolling on Twitter, waiting for updates from celebrities.

  • Discussion time! Before moving on, talk in pairs or groups about how often you think a celebrity (any celebrity you like) might post a tweet. Then explore some examples from the six celebs below.
  • Discussion time! Not all tweets will be random - sometimes celebrities or companies have timed releases of updates. These might show up as regularly spaced times; for example, every six hours. Setting these aside, discuss whether you think the times when tweets are posted are random or not.

This is also an interesting analysis: Trump Tweets: Android vs. Apple

See some examples on the randomness of tweets!

Community Time!
In this sub-section, Waiting Times, we are learning about how randomness can be seen over time. The context here will reference Twitter, but there are many examples of random waiting times.
Ask your community about their experiences and stories about random times between events, for example when out fishing, or with natural disasters and weather patterns.


Exploring with Scampy

To explore the waiting times distribution and distribution of counts, get the data for your favourite celebrity in the example above using the buttons below and have a play around on Scampy.

This is a tool developed at the University of Auckland by the Department of Statistics. It actually started my PhD! My honours dissertation piloted this tool and from that research, my PhD thesis grew!

I recommend having an explore of the data on Scampy and then coming back here to explore the randomness going on in the example.

See a discussion example using Scampy!
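Waiting times and counts are two views of the same events, and a small simulation can make that concrete. The sketch below is not how Scampy works and does not use the celebrity data. It assumes, purely for illustration, events arriving at a constant average rate with exponential waiting times; `RATE_PER_HOUR`, `HOURS`, and the seed are made-up values.

```python
import random
from collections import Counter

rng = random.Random(7)   # fixed seed so the sketch is repeatable
RATE_PER_HOUR = 2.0      # assumed average number of events per hour
HOURS = 1000             # assumed length of the simulated record

# Build event times by adding up random exponential waiting times.
event_times = []
clock = rng.expovariate(RATE_PER_HOUR)
while clock < HOURS:
    event_times.append(clock)
    clock += rng.expovariate(RATE_PER_HOUR)

# View 1: waiting times between consecutive events.
waits = [later - earlier for earlier, later in zip(event_times, event_times[1:])]

# View 2: number of events in each one-hour window.
per_hour = Counter(int(t) for t in event_times)
counts = [per_hour.get(hour, 0) for hour in range(HOURS)]

mean_wait = sum(waits) / len(waits)
mean_count = sum(counts) / len(counts)
print("mean wait (hours):  ", round(mean_wait, 3))
print("mean count per hour:", round(mean_count, 3))
```

Plotting `waits` and `counts` as two separate histograms gives the same pair of pictures you can look for in the real data: one for the time between events and one for how many events fall in each window.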


Waiting Times to Distributions

When we have enough data, random observations typically build into a distribution. It takes a few tweets to get there though!

In May 2020, @TheRealDonaldTrump tweeted 549 times! Suppose you only get notifications when @TheRealDonaldTrump tweets. Then, on 1 May 2020, your phone would buzz 13 times. If we speed this up, it would sound like this:

We are interested in the time between the tweets (the lines between tweets shown in the video). If we plot the time between each tweet, we get the following plot. Use the scale to add more observations to the graph.

  • Discussion time! What did you notice? What kind of distribution might @TheRealDonaldTrump’s tweets follow?
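The "time between tweets" is simply the difference between each timestamp and the one before it. Here is a minimal sketch of that calculation in Python. The timestamps are invented for illustration and are not real tweet data; the function name and the choice of minutes as the unit are also assumptions.

```python
from datetime import datetime

# Invented timestamps for illustration only (not real tweet data).
event_times = [
    datetime(2020, 1, 1, 8, 15),
    datetime(2020, 1, 1, 8, 22),
    datetime(2020, 1, 1, 9, 40),
    datetime(2020, 1, 1, 9, 41),
    datetime(2020, 1, 1, 13, 5),
]

def waiting_times_minutes(times):
    """Return the gaps, in minutes, between consecutive events."""
    ordered = sorted(times)
    return [(later - earlier).total_seconds() / 60
            for earlier, later in zip(ordered, ordered[1:])]

gaps = waiting_times_minutes(event_times)
print(gaps)  # [7.0, 78.0, 1.0, 204.0]
```

Five events give four waiting times, so there is always one fewer gap than there are events.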

See a discussion example on this data! Then, let’s test some possible distributions. To do this, we’re going to use Anna’s Goodness-of-Fit tool. Click the button below to show the data and then paste this into the sample data box in the tool.

  • Which distribution fit @TheRealDonaldTrump’s tweets best?

How about this one?

  • What do you think? Does this fit better?

See a discussion example!
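If you are curious about what a goodness-of-fit check involves, one common approach is to sort the data into bins and compare the counts you observe with the counts a candidate distribution would expect. The sketch below is not necessarily what the Goodness-of-Fit tool does. It uses simulated stand-in data rather than the tweet data, and the exponential is just one candidate that is often tried for waiting times. The bin edges, seed, and function names are assumptions.

```python
import math
import random

def exponential_bin_probs(edges, rate):
    """Probability that an exponential(rate) value falls in each bin.

    Bins run between consecutive edges; the last bin is open-ended.
    """
    cdf = [1 - math.exp(-rate * edge) for edge in edges]
    probs = [cdf[i + 1] - cdf[i] for i in range(len(edges) - 1)]
    probs.append(1 - cdf[-1])  # everything beyond the final edge
    return probs

def observed_bin_counts(sample, edges):
    """Count how many sample values fall in each bin."""
    counts = [0] * len(edges)
    for x in sample:
        # Index of the last edge that is less than or equal to x.
        idx = max(i for i, edge in enumerate(edges) if edge <= x)
        counts[idx] += 1
    return counts

def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over the bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

rng = random.Random(42)  # fixed seed so the sketch is repeatable
sample = [rng.expovariate(0.5) for _ in range(500)]  # simulated stand-in data

rate_hat = 1 / (sum(sample) / len(sample))  # rate estimated from the sample mean
edges = [0, 1, 2, 4, 8]                     # assumed bin edges
observed = observed_bin_counts(sample, edges)
expected = [len(sample) * p for p in exponential_bin_probs(edges, rate_hat)]
statistic = chi_square(observed, expected)

print("observed:", observed)
print("expected:", [round(e, 1) for e in expected])
print("chi-square statistic:", round(statistic, 2))
```

A smaller statistic means the observed counts sit closer to what the candidate distribution expects, so repeating the comparison with a different candidate gives a rough way to ask which one fits better.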


Getting a Random Sample

Now that we have looked at building distributions from random observations, we want to explore the reverse process - getting a random sample from a distribution!

We will continue using the @TheRealDonaldTrump tweets.

For our sample, we want the probability of selecting any given observation to remain constant. To do this, we will sample with replacement, which means we put any observations we draw back in the pile for the chance to be drawn again. The following code will give us five random observations with replacement from the @TheRealDonaldTrump waiting time data.

## [1]  9.33  7.28 17.90  0.12  0.03
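Sampling with replacement can be sketched in a few lines of Python. This is a stand-in rather than the code that produced the output above, which looks like it came from R. The waiting-time values below are invented for illustration, since the real data are not reproduced here, and the seed is arbitrary.

```python
import random

# Invented stand-in values; not the real waiting-time data.
waiting_times = [0.5, 1.2, 3.4, 0.1, 7.8, 2.2, 15.0, 0.9]

rng = random.Random(123)  # fixed seed so the sketch is repeatable

# choices() samples WITH replacement: every draw picks from the full list,
# so the same observation can turn up more than once.
with_replacement = rng.choices(waiting_times, k=5)

# sample() samples WITHOUT replacement, shown here only for contrast:
# once an observation is drawn it cannot be drawn again.
without_replacement = rng.sample(waiting_times, k=5)

print("with replacement:   ", with_replacement)
print("without replacement:", without_replacement)
```

With replacement, each observation has the same chance of being picked on every draw, which is the constant probability we were after.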

Let’s see an example of sampling 5 random observations:

This site has been created as part of my PhD thesis on perceptions of randomness. I am always keen for feedback, so please email me any thoughts you have via amy.renelle@auckland.ac.nz. Thank you to my supervisors, Dr. Stephanie Budgett and Dr. Rhys Jones, for their guidance throughout my project. I would also like to thank Anna Fergusson for her help inspiring and creating this website. You can find the references for this site here.