Random Walks#

  • Author:

  • Date:

  • Time spent on this assignment:

Remember to execute this cell to load numpy and matplotlib.

import numpy as np
import matplotlib.pyplot as plt
def resetMe(keepList=[]):
    # clear all user-defined variables except those in keepList (plus resetMe, np, and plt)
    ll=%who_ls
    keepList=keepList+['resetMe','np','plt']
    for name in keepList:
        if name in ll:
            ll.remove(name)
    for name in ll:
        pattern="^"+name+"$"
        %reset_selective -f {pattern}
    ll=%who_ls
    return

Exercise 1: Unbiased Random Walks#

  • List of collaborators:

  • References you used in developing your code:

In 1827, the botanist Robert Brown noticed that pollen grains floating in water would bounce around randomly, but was confused as to the reason. In 1905, Einstein successfully described the motion of the pollen particles, dubbed Brownian motion, as a random walk resulting from microscopic particles bombarding the large pollen grain. For the first half of this unit, we will study the random motion of Brown’s pollen grains in one and two dimensions.

First, let’s visualize a random walk in one-dimension. Let’s say that we are looking at a pollen particle stuck in a tube full of water so that it can only move left or right.

[Figure: a pollen grain confined to a one-dimensional tube of water]

We mark its original position as \(x_0=0 \mu m\). Every 30 seconds, we take a picture and record the position of the particle \(x_t\) in \(\mu m\). What we would record is something like this:

[Figure: the recorded positions \(x_t\) of the particle, sampled every 30 seconds]

In simulating this random walk, we assumed that each time we measured the particle’s position, it depended only on its previous position 30 seconds earlier, plus some random noise \(\delta\):

\[x_{t+1} = x_t + \delta.\]

In python, we can model the random jumps we see using numpy’s random number generators:

x[t+1] = x[t] + averageJumpDistance * np.random.randn()

where averageJumpDistance is the typical distance the pollen particle jumps every 30 seconds, which I set to 6 \(\mu m\). The np.random.randn() command generates a random number from the Gaussian (or Normal) distribution with mean zero and standard deviation one.
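
As a concrete example of this update rule, here is a minimal sketch that generates a single walk with a for loop (the variable names are placeholders, not required ones):

nSteps = 1000                                 # 1,000 jumps of 30 seconds each
averageJumpDistance = 6.0                     # microns
x = np.zeros(nSteps + 1)                      # x[0] = 0 is the starting position
for t in range(nSteps):
    x[t+1] = x[t] + averageJumpDistance * np.random.randn()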

a. Plot a random walk#

Generate your own random walk and plot its location as a function of time. Use the same parameters as in the above figure, i.e., simulate \(6 \mu m\) jumps every 30 seconds for 30,000 seconds, or 1,000 jumps.

The first thing you will notice is that random walks can vary a lot. Yours will probably look very different from the one in the above figure.

Generate another random walk using your code and notice that it too is different from your first walk.

On your plot, go ahead and generate 10 different random walks.

To accomplish this, it’s perfectly fine to use a bunch of for loops. If you are feeling fancy, it is possible to do this efficiently using np.random.randn and np.cumsum, as in the sketch below.
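
For reference, here is a minimal sketch of the vectorized version, generating and plotting ten walks at once (the array layout is my choice, not a required one):

times = 30 * np.arange(1001)                  # seconds: positions at 0, 30, ..., 30000
jumps = 6 * np.random.randn(10, 1000)         # 10 walks x 1,000 gaussian jumps
walks = np.hstack([np.zeros((10, 1)), np.cumsum(jumps, axis=1)])
for w in walks:                               # one curve per walk
    plt.plot(times, w)
plt.xlabel('time (s)')
plt.ylabel('x (µm)')
plt.show()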

Answer (start)
### ANSWER ME
Answer (end)

b. Random Walk Averages#

Since each random walk is different, we need to instead look at properties of the distribution of random walks to learn anything about them. The simplest thing to look at is the average.

Generate 2,000 random walks and evaluate the average position \(\langle x_k \rangle\) after the \(k\)'th jump, for \(k\) running from 0 to 1000.

Before you compute the average positions, think about what you expect. Once you see the result, does it agree with your expectations? Does it make sense to you? Make sure you pay attention to the scale compared to the scale of a single random walk.

What do I mean by averaging over walks? For example, say that we have two random walks, each made of 1,000 steps. I store the positions of the random walks as two length 1,000 arrays walk1 and walk2. Averaging over the two walks, the average position after each jump is an array I can calculate as

average_array = (walk1 + walk2) / 2.0

There’s a more graceful way to do this than storing all the arrays, then calculating the average at the end. Instead, while generating each walk, add its position after each jump into a single accumulator array. After you are done generating walks, divide each element in your accumulator array by the number of walks. Let me illustrate what I mean with a table of hypothetical data, for 2,000 random walks, each of 1,000 jumps.

| which random walk | \(x_0\) (µm) | \(x_1\) (µm) | \(x_2\) (µm) | … | \(x_{999}\) (µm) |
|---|---|---|---|---|---|
| 0 | 0 | 0.48 | 7.4 | … | 0.81 |
| 1 | 0 | -1.5 | 1.3 | … | 7.5 |
| 2 | 0 | -3.3 | -2.6 | … | 6.8 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 1,999 | 0 | 88.5 | -42.2 | … | 74.1 |
| averages | \(\langle x_0\rangle=0\) | \(\langle x_1\rangle=37.8\) | \(\langle x_2\rangle=45.1\) | … | \(\langle x_{999}\rangle=0.51\) |
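
Here is a minimal sketch of this accumulator approach (walk generation as in part a; the names are placeholders):

nWalks, nSteps = 2000, 1000
accumulator = np.zeros(nSteps + 1)            # running sum of positions after each jump
for i in range(nWalks):
    walk = np.concatenate([[0.0], np.cumsum(6 * np.random.randn(nSteps))])
    accumulator += walk                       # add this walk's positions, jump by jump
average_x = accumulator / nWalks              # <x_k> for k = 0, ..., 1000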

Here’s my plot:

[Figure: the average position \(\langle x_k \rangle\) as a function of jump number]
Answer (start)
### ANSWER ME
Answer (end)

c. Computing average distance#

As you should have noticed, the average is not telling you very much. This is because you’re just as likely to move left as right. Therefore, on average, you’re not going anywhere.

We need to look at something else. In some sense, we want to know how far away you are from the origin. One conceptually nice way to quantify this (although not necessarily the best) is to plot the average absolute value of your location, \(\langle |x_k| \rangle\). Go ahead and start by making this plot.

Answer (start)
### ANSWER ME
Answer (End)

A slightly better quantity, which answers a similar question, is the standard deviation of your random walks:

\[ \sqrt{\frac{1}{N} \sum_{j=1}^N \left(x_k^{(j)} - \langle x_k\rangle\right)^2} = \sqrt{\langle x_k^2\rangle - \langle x_k\rangle^2}, \]

where \(x_k^{(j)}\) is the position after the \(k\)'th jump in the \(j\)'th walk and \(N\) is the number of walks.

For the \(k\)'th jump, this root-mean-square (RMS) displacement tells us how far the pollen particle’s position will typically have drifted by the time it executes the \(k\)'th jump.

All you need to do is calculate the average value of \(x_k^2\) and the average value of \(x_k\). So each time you generate a walk, add the position of each jump into an accumulator array and the square of the position into another accumulator array. At the end, divide the contents of each bin in your accumulator arrays by the number of walks.
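
A minimal sketch with two accumulator arrays, following the recipe above (names are placeholders):

nWalks, nSteps = 2000, 1000
sum_x = np.zeros(nSteps + 1)                  # accumulates x_k
sum_x2 = np.zeros(nSteps + 1)                 # accumulates x_k^2
for i in range(nWalks):
    walk = np.concatenate([[0.0], np.cumsum(6 * np.random.randn(nSteps))])
    sum_x += walk
    sum_x2 += walk**2
rms = np.sqrt(sum_x2/nWalks - (sum_x/nWalks)**2)   # sqrt(<x_k^2> - <x_k>^2)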

Write code to compute the RMS displacement as a function of time using 2,000 random walks and plot the result. You should get something similar to the following curve:

[Figure: RMS displacement of 2,000 random walks as a function of time]
Answer (Start)
### ANSWER ME
Answer (End)

We see that, on average, there is a very clear trend for the pollen particle to drift farther away over time. Since the average position \(\langle x_k \rangle\) is zero, we can write

\[\mathrm{RMS}_k = \sqrt{\langle x_k^2\rangle - \langle x_k \rangle^2} \approx \sqrt{\langle x_k^2 \rangle} \propto \sqrt{t_k}\]

The square root here is very important and characterizes all diffusive motion. Note that a free particle moving with constant velocity will have \(x\propto t\). Such motion is sometimes called ballistic motion. The main takeaway is that diffusive motion is much slower than ballistic motion. Intuitively, this makes sense since, in diffusive motion, a particle is being severely slowed down by random bombardments from its environment.

🦉 In addition to the plot above, make another log-log plot (of the same data) and show:

  • you get a line on this log-log plot (this shows that \(x \propto t^{\alpha}\))

  • Find the value of \(\alpha\) by computing the slope of your (log-log) line using np.polyfit, and show that the displacement goes as \(t^{1/2}\), i.e. \(\alpha \approx 1/2\) (see the sketch below).
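
A minimal sketch of the fit, assuming the rms array from the previous part (we skip \(k=0\), where the log is undefined):

k = np.arange(1, len(rms))                    # jump number, skipping k = 0
plt.loglog(k, rms[1:])
plt.xlabel('jump number k')
plt.ylabel('RMS displacement (µm)')
plt.show()
alpha, intercept = np.polyfit(np.log(k), np.log(rms[1:]), 1)
print(alpha)                                  # should come out close to 0.5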

Answer (start)
### ANSWER ME
Answer (End)

d. The full distribution#

So far we’ve been looking at some summary statistics of the distribution. These are good pieces of information to start figuring out what’s going on. But, at some point it is helpful to look at the full distribution.

Let’s focus here on what the distribution looks like after 2 steps, 10 steps, 100 steps, and 500 steps.

To do this, you want to plot a histogram of your distribution - i.e. something like

plt.hist(locs[:,100],bins=50)

Plot all four histograms on the same plot.

You should see that the distribution always looks gaussian, with the width of the gaussian growing as the number of steps grows.
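
A minimal sketch, assuming the walk positions are stored in an array locs with one row per walk (as in the snippet above):

for k in [2, 10, 100, 500]:
    plt.hist(locs[:, k], bins=50, alpha=0.5, label=str(k) + ' steps')
plt.legend()
plt.show()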

Answer (Start)
### ANSWER ME
Answer (End)

e. Diffusion Equations#

Previously in this course, we’ve worked with diffusion equations. Now we’ve moved to stochastic random walks. The diffusion equations we’ve worked with are deterministic, so obviously they can’t match each stochastic walk. But the distribution as a function of time is something you might imagine could be correctly matched by a differential equation. This is in fact the case: the distribution of stochastic random walks obeys the diffusion equation:

\[ \frac{\partial \rho}{\partial t} = D \frac{\partial^2 \rho}{\partial x^2}\]

Here \(D=a^2/2\) is the diffusion constant where \(a\) is our step size (in this case 6).

Write some code to simulate this diffusion equation from the initial condition \(\rho(x,0)=\exp[-x^2/(2\sigma^2)]\) where \(\sigma=6\). This is where our random walks are after their first step.

To simulate this differential equation, you are going to want to let \(\rho\) be a density of random walks represented by a vector. I set up my \(\rho\) by doing

dx=0.1
size=4000                      # number of grid points; an assumed value, giving x from -200 to 200
points=np.arange(size)
xs=dx*(points-size//2)         # spatial grid centered on x = 0
rho=np.exp(-xs**2/(2*6**2))    # gaussian with sigma = 6

You can then plot it by doing

plt.plot(xs,rho/(np.sum(rho)*dx))

(we scale it so that the integral of our density is 1.0).

Having this as your initial condition, you now want to time-step your density. To do this, you should compute \(\frac{\partial^2 \rho}{\partial x^2}\) and then step \(\rho\) forward in time. Use a time step \(dt=dx^2/100\).

(It may or may not be helpful to use np.roll(rho,1) and np.roll(rho,-1))
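
Here is a minimal sketch of the explicit time-stepping, assuming the xs, rho, and dx defined above (the helper name evolve is mine, not part of the assignment):

D = 6**2 / 2                                  # diffusion constant a^2/2 with a = 6
dt = dx**2 / 100                              # time step from the text

def evolve(rho, t_final):
    # repeatedly apply drho/dt = D d^2rho/dx^2 using centered finite differences
    for step in range(int(round(t_final / dt))):
        lap = (np.roll(rho, 1) - 2*rho + np.roll(rho, -1)) / dx**2
        rho = rho + dt * D * lap
    return rho

rho_at_2 = evolve(rho, 2.0)                   # density at t = 2.0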

Compare the distribution that you get at times \(t=2.0\), \(t=10.0\), and \(t=30.0\) against the histograms you got from the random walks (remembering to also scale the density of the random-walk histograms, like plt.hist(locs[:,10],bins=50,density=True)).

Answer (Start)
### ANSWER ME
Answer (End)

f. Discrete Walks#

So far we’ve been doing random walks with gaussian steps. We would like to see what changes if instead we did discrete walks - i.e. at each step we are going to walk either left exactly 6 units or right exactly 6 units.
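
One simple way to generate such steps is with np.random.choice; a minimal sketch (nWalks and nSteps are placeholder names):

nWalks, nSteps = 2000, 1000
steps = 6 * np.random.choice([-1, 1], size=(nWalks, nSteps))   # every jump exactly ±6
locs_discrete = np.hstack([np.zeros((nWalks, 1)), np.cumsum(steps, axis=1)])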

Write code to do this random walk and then plot histograms, on separate plots, at

  • 2 steps (with the gaussian histogram on top of it)

  • 100 steps (with the gaussian histogram on top of it).

i.e. for the 2 step histogram I wrote

plt.hist(locs_discrete[:,2],bins=10)
plt.hist(locs[:,2],bins=10,alpha=0.3)
plt.show()

For the 2-step histogram you should see a qualitative difference. But once you get to 100 steps, the distribution looks essentially the same for the discrete and gaussian steps. This is a form of universality (and something we will see even more starkly in the two-dimensional case below).

Answer (Start)
### ANSWER ME
Answer (End)

Exercise 2: Random Walks in Two Dimensions#

  • List of collaborators:

  • References you used in developing your code:

Now, let’s consider a two-dimensional random walk. We put our pollen particle on a petri dish covered with a thin layer of water so that it can move in a plane.

[Figure: a pollen grain random-walking in two dimensions on a petri dish]

a. 2D Walks#

We mark the pollen grain’s initial position as \((x_0,y_0)=(0,0) \mu m\). We measure every 30 seconds and see each component of the pollen particle’s position \(x_k,y_k\) jump from its previous position. Since each component jumps a typical distance of 6 \(\mu m\), we expect the typical total jump distance to be \(\approx 6\sqrt{2}\) \(\mu m\).

Write code that generates 2,000 two-dimensional gaussian random walks of 5,000 steps. Start by plotting y vs x for 10 such random walks.

Do the same thing for 10 discrete (up,down,left,right) random walks.
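
A minimal sketch for the gaussian case (the array shape is my choice, not a required one):

nWalks, nSteps = 2000, 5000
jumps2d = 6 * np.random.randn(nWalks, nSteps, 2)   # gaussian jumps in x and y
walks2d = np.cumsum(jumps2d, axis=1)               # shape (nWalks, nSteps, 2)
for w in walks2d[:10]:                             # y vs x for ten walks
    plt.plot(w[:, 0], w[:, 1])
plt.axis('equal')
plt.show()

For the discrete case, one option is jumps2d = 6 * dirs[np.random.randint(4, size=(nWalks, nSteps))] with dirs = np.array([[0,1],[0,-1],[1,0],[-1,0]]).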

Answer (start)
### ANSWER ME
Answer (end)

b. Self Similarity#

Brownian random walks are self-similar. As you zoom into pieces of the figure, it looks qualitatively the same on all scales. Make a random walk of 128,000 steps. Then plot (on separate plots)

  • the random walk

  • the first half of the random walk (multiplying the \(x\) and \(y\) by 2)

  • the first quarter of the random walk (multiplying the \(x\) and \(y\) by 4)

  • the first eighth of the random walk (multiplying the \(x\) and \(y\) by 8)

  • the first 1/16 of the random walk (multiplying the \(x\) and \(y\) by 16)

  • the first 1/32 of the random walk (multiplying the \(x\) and \(y\) by 32)

Notice that these plots all look qualitatively similar. This is what I get:

[Figure: the walk replotted at each of the six zoom levels listed above]
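
Here is a minimal sketch of the zoom-and-rescale loop, assuming the walk is stored as arrays xs and ys of length 128,000:

n = len(xs)
for k in range(6):                            # 1, 1/2, 1/4, ..., 1/32 of the walk
    cut, scale = n // 2**k, 2**k
    plt.figure()
    plt.plot(scale * xs[:cut], scale * ys[:cut])
    plt.title('first 1/' + str(2**k) + ' of the walk, magnified ' + str(2**k) + 'x')
plt.show()
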
Answer (start)
### ANSWER ME
Answer (end)

c. Standard Deviation#

Like in the 1D case, we expect the average \(\langle x_k \rangle\) to be zero. Let’s just go ahead and plot \(\langle r \rangle = \langle \sqrt{x_k^2 + y_k^2} \rangle\) for 1,000 copies of both the gaussian and discrete walk on the same plot.

You should obtain a curve similar to the one from the one-dimensional case, but somewhat larger in amplitude. Even in two dimensions, the displacement goes as the square root of time. Show this by also plotting it on a log-log plot and fitting the slope.

It turns out that you also see this behavior in three dimensions. In general, the RMS displacement always scales as the square root of time for any physical system undergoing diffusive motion.
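
A minimal sketch of the averaging and the log-log fit, assuming the walks2d array from part a:

r = np.sqrt(walks2d[:, :, 0]**2 + walks2d[:, :, 1]**2)   # distance from origin
r_mean = r.mean(axis=0)                                  # <r_k>, averaged over walks
k = np.arange(1, len(r_mean) + 1)                        # jump number
alpha, intercept = np.polyfit(np.log(k), np.log(r_mean), 1)
print(alpha)                                             # should come out close to 0.5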

Answer (Start)
### ANSWER ME
Answer (End)

d. Rotational Invariance#

Now we are going to show that in the 2D case, the discrete walks restore the rotational invariance of the gaussian walk.

To do this, we want to plot the locations of the 2,000 2D walkers after \(t=1000\) steps.

Do this for both the discrete and gaussian walks.

You should see that (1) they look the same, and (2) more interestingly, the discrete case looks rotationally invariant (like a circle) even though each step can only go up, down, left, or right. This is an emergent rotational symmetry.
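
A minimal sketch of the scatter plot, again assuming the walks2d array from part a:

plt.scatter(walks2d[:, 999, 0], walks2d[:, 999, 1], s=2)   # all walkers after 1,000 steps
plt.axis('equal')
plt.show()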

Answer (start)
### ANSWER ME
Answer (end)

e. Animation (extra credit: 5 points)#

Animate the two-dimensional random walk. You may find the animation code from previous assignments helpful.

Exercise 3: Diffusion Limited Aggregation#

In the previous exercises we learned some properties of random walks. In this exercise, we will apply random walks to deposition, modeling it with diffusion limited aggregation (DLA).

a. DLA onto a core point at the origin#

In DLA, we do the following:

  • Start with a particle at some fixed point (say the center of a box).

  • Repeat the following:

    • Add a new particle someplace on the boundary (anywhere on the four walls).

    • Move this particle using a discrete random walk (left, right, up, down). If it tries to move in the direction of one of the walls, it shouldn’t stick (but also shouldn’t go anywhere).

    • Once it is touching (up or down or left or right) a currently stuck particle, it should stick.

To accomplish this, I used a two-dimensional numpy array to keep track of the locations of the stuck particles and the currently moving particle.

At the end, you want to go ahead and plot the location of your stuck particles. To do this, use

plt.matshow(A)

where A is your numpy grid.

Start with a box of size \(100 \times 100\).
Use 1000 different particles.

This simulation is a little slow. On colab, it takes me approximately 2 minutes. For testing you might want to turn down the number of different particles that you use.

It is also interesting to see the picture when you have a larger box (\(200 \times 200\)), but this is much slower (and not required as part of the assignment).
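
For reference, here is a minimal sketch of one possible implementation (the release-and-walk logic here is one choice among several, not necessarily the required one):

L = 100
A = np.zeros((L, L), dtype=int)
A[L//2, L//2] = 1                             # seed particle at the center
moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # up, down, left, right

def release_particle(A):
    # start on a random site of one of the four walls
    side, p = np.random.randint(4), np.random.randint(L)
    x, y = [(0, p), (L-1, p), (p, 0), (p, L-1)][side]
    while True:
        # stick as soon as any neighbor is occupied
        for dx, dy in moves:
            nx, ny = x + dx, y + dy
            if 0 <= nx < L and 0 <= ny < L and A[nx, ny] == 1:
                A[x, y] = 1
                return
        # otherwise take one random step, refusing moves that leave the box
        dx, dy = moves[np.random.randint(4)]
        nx, ny = x + dx, y + dy
        if 0 <= nx < L and 0 <= ny < L:
            x, y = nx, ny

for i in range(1000):
    release_particle(A)
plt.matshow(A)
plt.show()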

Answer (start)
### ANSWER ME
Answer (end)

b. DLA onto a surface (extra credit: 5 points)#

In the previous section you did DLA onto a core point. If you are interested in understanding deposition onto a surface, you want to modify your DLA so that particles come in at the top and stick either at the bottom or when they touch any previously stuck particle. Implement this.

Answer (start)
### ANSWER ME
Answer (end)

Exercise 4: Path Integral Monte Carlo (extra credit)#

(Extra credit: 20 points)

  • List of collaborators:

  • References you used in developing your code:

(Some of the explanations here are abbreviated, even for an extra credit. I am sneakily having you do Quantum Monte Carlo (QMC) here. If you want to do this extra credit [particularly after part a.], you should probably come talk to me.)

a. Fixed Points#

Imagine we were to produce the walks in exercise 2 and then throw out any walk that didn’t end at \((x,y)=(5,5)\). This would induce a probability distribution over walks. Unfortunately, this would be very inefficient. Instead, figure out (read about Lévy flights and Brownian bridges) how to generate the same probability distribution efficiently. Write code that does so and then plot 10 paths between two fixed points (you can choose where they are fixed).
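
For reference, here is a minimal sketch of the standard sequential construction of a one-dimensional Brownian bridge, where each gaussian jump is conditioned on the required endpoint (the 2D version applies this independently to \(x\) and \(y\)); this is one known approach, not necessarily the intended one:

def brownian_bridge(x_start, x_end, n_steps, a=6.0):
    # sample a gaussian walk of n_steps jumps conditioned to end at x_end
    xs = [x_start]
    for t in range(n_steps):
        m = n_steps - t                       # jumps remaining, including this one
        mean = xs[-1] + (x_end - xs[-1]) / m  # drift toward the required endpoint
        var = a**2 * (m - 1) / m              # conditional variance of this jump
        xs.append(np.random.normal(mean, np.sqrt(var)))
    return np.array(xs)                       # xs[-1] == x_end exactly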

b. Connected Points#

Start with 100 connected (in order) points, all at \((0,0)\) (i.e. \(x_0=(0,0); x_1=(0,0); \ldots\)). We will have \(x_{99}\) connected to \(x_0\). Choose a window of ten points at random (maybe 33–43) and use the algorithm you described in 4a to regenerate points \(x_{34},x_{35},\ldots,x_{42}\). Then pick another window and do this again. Repeat this process 1000 times and

  • make a plot of your points (connected by lines)

  • Report the average distance between consecutive points.

  • Make a histogram of the distances between adjacent points.

c. Fixed Region#

Repeat b. with the constraint that you should simply not rebuild your window if, after making the new points, any of the points fall outside a box that extends \(-1 \leq x \leq 1\) and \(-1 \leq y \leq 1\).

Graph the probability that points are found at given places in the box.

d. 1D#

Redo this for one dimension.

**Acknowledgements:**

  • Problems 1abc, 2ac: George Gollin, Ryan Levy, and Eli Chertkov (original); Bryan Clark (modifications)

  • Problems 1de 2b: Bryan Clark (original)

  • Problems 3,4: Bryan Clark (original)


© Copyright 2023