Hello everyone, and welcome back to my blog!
This week, I met with Mr. Swain, a graduate student colleague of my professor, who is in charge of guiding me through my internship project. On Monday, he sent me a database of around 370 bacterial species and their many characteristics, including cell shape, cell length and width, colony morphology, colony pigmentation, and oxygen preference. For the first part of my project, I need to research the various cell shapes of the bacteria and calculate each cell's volume and surface area to find its surface area-to-volume ratio. The goal is to compare these ratios and see how they relate to the metabolism of each species. Throughout this week, I have been grouping all of the species into about four different shapes, each with its own formulas for volume and surface area:
- Vibrios (curved rods)
Picture creds: Airen_Creation
While this task doesn’t involve coding in R just yet, performing the calculations in Excel will help me understand how to better organize and render databases. In two weeks, I will have another meeting with Mr. Swain about what to do next.
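To give a feel for the surface area-to-volume calculation, here is a rough sketch for two idealized shapes. These are standard textbook idealizations and my own assumption, not necessarily the formulas used in the project: a coccus treated as a sphere of radius r, and a straight rod treated as a cylinder with hemispherical caps, with width w and total length l. The example dimensions are made up for illustration and do not come from the database.

```latex
% Coccus modeled as a sphere of radius r (assumed idealization)
\[
SA = 4\pi r^{2}, \qquad V = \tfrac{4}{3}\pi r^{3}, \qquad \frac{SA}{V} = \frac{3}{r}
\]

% Rod modeled as a cylinder with hemispherical caps,
% width w and total (end-to-end) length l (assumed idealization)
\[
SA = \pi w l, \qquad V = \pi w^{2}\left(\frac{l}{4} - \frac{w}{12}\right)
\]

% Hypothetical example: sphere with r = 0.5 um, rod with w = 1 um and l = 3 um
\[
\left.\frac{SA}{V}\right|_{\text{sphere}} = \frac{3}{0.5} = 6\ \mu\text{m}^{-1},
\qquad
\left.\frac{SA}{V}\right|_{\text{rod}} = \frac{3\pi}{2\pi/3} = 4.5\ \mu\text{m}^{-1}
\]
```

In this made-up example, the small sphere has more surface area per unit of volume than the longer rod of the same width, which is the kind of difference the ratio comparison is meant to pick up.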
Besides the internship, I have made significant progress in learning R. I had originally planned to complete one course per week, but since last Friday, I have finished both courses 2 and 3, on Data Visualization and Probability! The Data Visualization course first introduced key concepts such as types of distributions, z-scores, means, standard deviations, and types of graphs. It then showed how to build data visualizations in R using the ggplot2, dplyr, and gapminder packages. Using functions such as data.frame(), ggplot(aes()), geom_point()/geom_abline(), geom_qq(), filter(), facet_grid(), and mutate(), I learned how to manipulate my data and display it as density curves, histograms, and boxplots. In the course on probability, I learned about functions for calculating combinations/permutations and running repeated simulations of random sampling. With pnorm(), qnorm(), rnorm(), replicate(), sample(), expand.grid(), and sapply(), I was able to run automated simulations of up to 10,000 repeated trials to make inferences about probability in situations like playing roulette, winning the lottery, and calculating losses from foreclosures.
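As an illustration of the kind of repeated simulation described above, here is a minimal sketch using sample(), replicate(), and pnorm(). It is not taken from the course: the setup (an American roulette wheel with 18 red pockets out of 38, a $1 bet on red), the seed, and the names n_bets, B, and player_total are all assumptions chosen for the example.

```r
# Assumed setup: $1 bets on red on an American roulette wheel (18 red of 38 pockets)
set.seed(1)      # arbitrary seed so the run is repeatable
n_bets <- 100    # bets per simulated session (hypothetical)
B <- 10000       # number of simulated sessions

# Total winnings for one session: +1 for each win, -1 for each loss
player_total <- function(n) {
  outcomes <- sample(c(1, -1), size = n, replace = TRUE,
                     prob = c(18 / 38, 20 / 38))
  sum(outcomes)
}

# Repeat the session B times and collect the totals
totals <- replicate(B, player_total(n_bets))

# Simulated share of sessions in which the player ends ahead
sim_prob_ahead <- mean(totals > 0)

# Normal approximation for comparison
p <- 18 / 38
mu <- n_bets * (p - (1 - p))                  # expected total winnings
sigma <- sqrt(n_bets) * 2 * sqrt(p * (1 - p)) # standard error of the total
approx_prob_ahead <- 1 - pnorm(0, mean = mu, sd = sigma)

print(c(simulated = sim_prob_ahead, normal_approx = approx_prob_ahead))
```

The two numbers will not match exactly, since the totals can only take even values while the normal curve is continuous, but they should land in the same neighborhood.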
Next week, I am excited to start the fourth course, on Inferences and Modeling. Hopefully, I can keep up my progress and finish two more courses on top of completing the calculations for all 370 bacterial species. Stay tuned for more updates on my coding adventure!