Hello everyone! If you remember from my last post (Radar Images and Interpreting the Data Part 1-2), I discussed the image processing I was completing for my research. It involved using IDL scripts and ArcGIS zonal statistics to create and quantify UAVSAR-L (24 cm wavelength) polarization echo images and circular polarization ratio (CPR) data. I mentioned that I was going to replicate data plots that Campbell (2012) published in his paper on high circular polarization ratio data. In this post I will talk about what methods I used to plot the data and summarize the results.
To illustrate the difference in CPR between lava flows at Holuhraun, we need to be able to view the distribution of data and its mean. Histograms are perfect for this and more than one lava flow type can be plotted into the same figure for comparison. Normally, I would use excel to plot my graphs, but I decided to use a different program, one that can create and structure graphs in an appropriate way and that allows me to manipulate the axis, ticks, error bars and legend to my preference. I decided to use Python, in particular, matplotlib. Matplotlib is a Python 2D plotting library where you can create figures and graphs with journal publication quality. On the matplotlib website (https://matplotlib.org/) is a plethora of examples that you can copy and edit. I used their histogram base code and edited it to fit my data. I wanted to show you all a copy of the histogram code I used (a quick point however, I am still very new to Python so it does look a bit messy. I am planning on adding 'for loop' functions to reduce the number of command lines and make the script a lot neater):
I left some notes under each command line to explain what its purpose was for the script.
import matplotlib.pyplot as plt
- imports the Python 2D plotting library at the variable plt import numpy as np
- numpy is a library that adds support for large, multi-dimensional arrays and matrices. It also allows high level mathematical functions to operate on the arrays. import pandas as pd
- imports data structure and analysis tolls into the script. It is an open source BSD-licensed library.
d = pd.read_csv(r'C:\Users\gtolomet\Documents\Holuhraun\Rubbly Lava Flow Histogram CPR.csv') Rubbly_CPR = d['Rubbly CPR']
- reads the data in the csv marked in the directory (d). Tell the script to only read the data under the column named Rubbly CPR. The data is then assigned to the variable Rubbly_CPR. s = pd.read_csv(r'C:\Users\gtolomet\Documents\Holuhraun\Spiny Lava Flow Histogram CPR.csv')
Spiny_CPR = s['Spiny CPR']
- reads the data in the csv marked in the directory (s). Tell the script to only read the data under the column named Spiny CPR. The data is then assigned to the variable Spiny_CPR. m = pd.read_csv(r'C:\Users\gtolomet\Documents\Holuhraun\Mix Lava Flow Histogram CPR.csv') Mix_CPR = m['Mix CPR']
- reads the data in the csv marked in the directory (m). Tell the script to only read the data under the column named Mix CPR. The data is then assigned to the variable Mix_CPR.
#legend = ['Rubbly Lava', 'Spiny Lava', 'Mix']
- adds the following strings to the legend (commented out at the moment).
bin_edges = [0.15, 0.16, 0.17, 0.18, 0.19, 0.2, 0.21, 0.22, 0.23, 0.24, 0.25, 0.26, 0.27, 0.28, 0.29, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36, 0.37, 0.38, 0.39, 0.4, 0.41, 0.42, 0.43, 0.44, 0.45, 0.46, 0.47, 0.48, 0.49, 0.5, 0.51, 0.52, 0.53, 0.54, 0.55, 0.56, 0.57, 0.58, 0.59, 0.6, 0.61, 0.62, 0.63, 0.64, 0.65, 0.66, 0.67, 0.68, 0.69, 0.7, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82, 0.83, 0.84, 0.85]
- The histogram bin edges are assigned. Determines the number of bins that will be created in the histogram graph. There is a command to input a number of bins but with the data I am using it does not present the data in high resolution (too few bars).
fig, axs = plt.subplots(3, 1, sharex=True)
- stacking histogram graphs on top of each other. The plt.subplots(3, 1, sharex=True) is broken down like this; subplots means multiple data plots, the 3 means the 3 rows of histograms will be created and the 1 means only one column will be created. The sharex=True tells the script to share the x-axis only).
#plt.hist([Rubbly_CPR], color=['orange'])
- plt.hist plots a single histogram graph (commented out at the moment). The variable in [] is the data that will be plotted.
axs[0].hist([Rubbly_CPR], color=['darkgreen'], bins=bin_edges, edgecolor='black', alpha=1.0) axs[1].hist([Spiny_CPR], color=['darkgoldenrod'], bins=bin_edges, edgecolor='black', alpha=1.0) axs[2].hist([Mix_CPR], color=['maroon'], bins=bin_edges, edgecolor='black', alpha=1.0)
- Each axs[0] represents the histograms. The [x] is the row number (since this is a python language, all lists start at value 0 not 1). The colour (yes, I am now using British spelling for colour. I had to use the American spelling in the code or else it would not work...) for each histogram is assigned, the bins are told to equal the bin_edges written above and the outlines of the bins are assigned a black colour.
- plots a normalized curve over the histograms. "Still working out the bugs for this command line so I am not using it at the moment."
plt.xlabel("CPR")
- plots the x-axis title. The title is defined as a string in the ().
#plt.ylabel("Number of Pixels")
- plots the y-axis title. The title is defined as a string in the (). Not needed when plotting stacked histograms. Only useful when plotting a single histogram. Command axs[I].set_ylabel() below is used for stacked histograms sharing the x-axis.
for i in range(3): axs[i].set_ylabel("No. of Pixels")
- sets the y-axis title for each histogram. The for loop goes through value i which is row number representing of each histogram.
axs[0].legend(['Rubbly']) axs[1].legend(['Spiny']) axs[2].legend(['Mix'])
- assigns a legend onto each histogram instead of grouping the legend onto one histogram. This is makes the graphs less messy.
locs = ['best']
- places the legends in the best location on the histograms. I.e. away from plots, bins, axis, titles and the histogram boundaries.
plt.xticks(np.arange(0.1, 1, step=0.1))
- plots the ticks on the x-axis every 0.1 interval from 0.1 to 1.
plt.setp(axs, yticks=range(0, 4000, 500))
- plots the ticks on the x-axis every 0.1 interval from 0.1 to 1. plt.setp is used instead of plt.yticks because we have to plot the ticks on more than one y-axis. yticks are plotted every 500 intervals from 0 to 4000.
axs[0].set_title('Lava Flow CPR - Holuhraun')
- Sets the title for the histograms above the first histogram. Prevents the title from appearing in the wrong place, such as the bottom histogram.
axs[0].axvline(Rubbly_CPR.mean(), color='k', linestyle='dashed', linewidth=2) axs[1].axvline(Spiny_CPR.mean(), color='k', linestyle='dashed', linewidth=2) axs[2].axvline(Mix_CPR.mean(), color='k', linestyle='dashed', linewidth=2)
- creates a dashed vertical positioned at the mean of each histogram.
axs[0].text(0.65, 2000, r'Rubbly Mean = 0.48') axs[1].text(0.666, 2000, r'Spiny Mean = 0.41') axs[2].text(0.689, 2000, r'Mix Mean = 0.45')
- the mean from each histogram is written onto each histogram. The x, y numbers represent the location of the text in the histograms.
plt.show()
- shows the results after running the script.
The product of the script is shown below. The CPR data from the rubbly, spiny and mix (more than one lava morphology (Voigt et al. 2018)) lava flows are plotted into histograms that are stacked on top of each other. This way, we can observe the data and summarize their differences and similarities.
For a first attempt, I think the histograms came out quite nice!
From the results, the first thing I noticed was that the distribution of data was relatively similar. None of the histograms showed any skews in the data and are not narrowing around the mean. The main difference between them is the frequency of data. The spiny and rubbly lava data has a higher frequency than the mix lava data because the CPR were taken from a larger area. This was not intentional. The rubbly and spiny lava flows are the dominant lava flow types at Holuhraun and cover a majority of the lava field. It was easier to extract more rubbly and spiny CPR data than the mix. Despite the difference in frequency, the histograms reported similar CPR mean values, between 0.4 and 0.5. With similar CPR means, the three lava flows would appear indistinguishable under UAVSAR L-band data. I was not surprised when I saw that the CPR meansvalues were similar because when I was in the field I noticed that there was little discrepancy between the decimetre-scale surface roughness of the rubbly and spiny lava flows (both looked equally as rough). I did not get to study the mix lava flows in the field because they were far within the centre of the lava field and therefore too treacherous to reach. However, Voigt et al. (2018) describes the mix surface morphology in detail and their descriptions indicate a surface with similar roughness to the spiny and rubbly lava flows. Even though the lava flows are different in regards to emplacement and surface roughness texture, they appear similar under L-band radar, making them indistinguishable. This reminded me about my research on Craters of the Moon (COTM) lava flows where I discuss how ambiguous interpretations about lava flow emplacement can arise when using radar data. I wish I could cite the paper but it is still under review (Planetary and Space Sciences journal, Volcanic Analogue NASA FINESSE special issue). I will be sure to cite it as soon as the review process is over and it becomes publically available.
The lava flows appear indistinguishable under L-band radar yes, but what about under high resolution topography data collected using a mobile laser system (MLS) LiDAR instrument?
At the moment, I do not have a lot of data points because I am still waiting on more LiDAR point cloud data from our research associate and collaborator Dr Antero Kukko from the Finnish Geospatial Institute. I can show you what I have plotted thus far but I am going to be holding off interpretations until I have all of LiDAR data.
This scatter plot shows the root-mean-square (RMS) slope (Cs) vs the CPR of the lava flows. The plot was created using a Python 2D plotting matplotlib script, similar to the histogram script presented at the start of this post. I have decided to leave explaining what Cs is and why I am using it to study surface roughness for my next blog post. Right now, all you need to know is the greater Cs is the rougher the surface is at the centimetre scale (the slope between points along a profile varies a lot more). What you will notice at first glance is the spiny lava is slightly scattered along the Cs axis but the rubbly lava is not. At the centimetre-scale, it appears the spiny lava has some variation is surface roughness. Some areas are smoother than others. It is difficult to interpret the data right now because some of the LiDAR scans that I am waiting for were taken from areas at Holuhraun with different surface textures. The results will either cluster around the plots I have at the moment or they will scatter. Until I know for sure, I will not talk about this any further.
Question For You Guys
I wanted to ask for your opinions about a potential next step in my research involving radar and LiDAR data.
I have AIRSAR (Air Synthetic Aperture Radar) and LiDAR data collected from my other field site COTM and was thinking about plotting the CPR and Cs data from those datasets and comparing them to the Holuhraun data. The issue I have with this thought is that the AIRSAR data has a slightly lower resolution than the UAVSAR data. Their incidence angles are also different (I believe the UAVSAR data was taken at a slightly higher incidence angle, which would increase the CPR values).
Do you think it is a good idea to compare the radar and LiDAR results of these two sites despite these differences in radar instrument specifications?
See you all next time!
Web address to previous blog post: https://gavintolometti.wixsite.com/gavinonthemoon/single-post/2019/10/30/Radar-Images-and-Interpreting-the-Data-–-Part-2