Posts

Showing posts from February, 2025

Project: Update

I would like to have access to a data set that shows how full the Case shuttles are at each time and day. If I had this, I could sync it with weather data to determine whether weather conditions have an impact on how full the shuttles are. A data set like this is not publicly available, so I have been emailing people who work for transportation and safety at CWRU to find out whether this data set exists and whether I can have access to it. It is not clear who can provide me this data, so I have emailed two people so far. One has answered and said he can't help me, and the other has not responded. I also talked to a shuttle driver who said that the percent capacity shown for each shuttle on the Spartan Ride app is not reliable. I will ask a few more people, but if it turns out this data set does not exist or is not accessible, then I will have to pivot slightly. I would most likely move to the RTA, for which a ridership data set exists. However, I would expect slightly different resu...

R: Current Knowledge on Modeling

Statistics exist to help make sense of a data set. However, a statistical model should only be used if your data satisfy the assumptions that the model makes. For example, you should only fit a least-squares line if the data is reasonably linear. The same is true for any other type of fit, such as quadratic or exponential. If you don’t look at your data beforehand and fit a line to widely dispersed data, you will produce an output that makes no sense.

I found Gigerenzer’s Mindless Statistics to be very interesting on this topic. He stated that the founders of statistics had strict debates about statistics and held conflicting ideas. None of them thought that there should be one method for all hypotheses. I have been guilty of simply applying the “null ritual” in the past. Some questions do not require the Fisher and Neyman-Pearson hybrid. Instead, the model should be thought more cr...
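As a rough illustration of the point about linear fits, here is a minimal R sketch. The data is simulated and entirely hypothetical (it is not from any class data set), and the variable names are my own. It fits a least-squares line to one roughly linear series and to one widely dispersed series, then compares how much of the variation each line explains.

```r
# Hypothetical, simulated data (not from the post).
set.seed(1)
x <- 1:50
y_linear  <- 2 * x + rnorm(50, sd = 2)        # roughly linear in x
y_scatter <- rnorm(50, mean = 50, sd = 30)    # widely dispersed, unrelated to x

# Look at the data before fitting anything.
plot(x, y_linear, main = "Roughly linear")
plot(x, y_scatter, main = "Widely dispersed")

# lm() will happily fit a line to both series.
fit_linear  <- lm(y_linear ~ x)
fit_scatter <- lm(y_scatter ~ x)

# R-squared: share of the variation that the fitted line explains.
r2_linear  <- summary(fit_linear)$r.squared
r2_scatter <- summary(fit_scatter)$r.squared
print(c(linear = r2_linear, scatter = r2_scatter))
```

Both calls return a line without complaint, which is why plotting first matters: only the first fit describes the data.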

Homework: Cognitive Diversity in Deciding University

 One huge decision that many people have to make is which university they will attend after they graduate high school. People will usually first think about what they want to get out of college, then apply to programs that fill those wants. After they apply, each school accepts or rejects the student, and then the student must decide where to go. This is an important decision and can be thought of from many different perspectives of decision making. One way to think about it is Bayesian thinking. If your goal is to become a doctor, what is the probability you will achieve your goal? Will that probability increase given that you go to school A or school B? What is the probability that you will have good opportunities to achieve the things you want to at a given school?  This decision can also be thought through as rational choice, although the game is very complicated, has many moving parts, and is not the best way to think about this decision. Player...

Project: Deep Research Prompt

I am writing a prompt to put into deep research to have better direction when constructing my project.  Act as an elite researcher who is a specialist in the science of decision-making, specifically specializing in transportation decisions. I need a thoroughly detailed research paper for Case Western Reserve University to determine the best way to improve the quality of their shuttle service based on the needs of their riders. Generate a comprehensive, detailed research report on how weather, rush hours, and frequency of the service impact a student's decision to take the school's shuttle system. Consider looking at how weather, rush hour, and frequency of service impact ridership on public transit on the city scale, and then apply it to the university scale. Include how bus availability impacts their decision. For example, if the bus is running more frequently, is a passenger more likely to consider waiting at the stop compared to a less frequent bus? In other words, include ...

Homework: Chi Square testing in R

Image
I made a table of Trust and Donation results as shown below: The way to interpret the table is that there are only 5 options each for Trust and Donation. There are 93 data entries where someone picked 0 for both Trust and Donation, 12 where someone picked 1 for Trust and 0 for Donation, etc. Notice how many of the cells in the table have small values or 0's. I then ran a chi-squared test on the table. Below are the results: To interpret these results, we pick a significance level. A typical significance level is 0.05. If the calculated p-value is less than 0.05, we reject the null hypothesis. In other words, we would conclude that there is a statistically significant association between the way people played "Trust" and "Donation". If the calculated p-value is greater than 0.05, we would fail to reject the null, meaning that there is not enough evidence to show that there is a significant association. In this case, the p-value is much smaller than 0.05, meaning that there is...
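For readers who want to try the same kind of test, here is a minimal R sketch. The responses are simulated and hypothetical; they are not the class data, and the names `trust`, `donation`, and `tab` are my own. Because a table like this can have many small cells, the sketch also looks at the expected counts and runs the Monte Carlo version of the test that `chisq.test()` offers through `simulate.p.value`.

```r
# Hypothetical, simulated responses on a 0-4 scale (NOT the class data).
set.seed(42)
trust <- sample(0:4, 200, replace = TRUE,
                prob = c(0.5, 0.2, 0.15, 0.1, 0.05))
# Make donation depend on trust so the two are associated.
donation <- pmin(4, pmax(0, trust + sample(-1:1, 200, replace = TRUE)))

# Cross-tabulate the two choices.
tab <- table(Trust = trust, Donation = donation)
print(tab)

# Standard chi-squared test of independence. R warns when expected
# counts are small, so the warning is suppressed here and the expected
# counts are inspected directly instead.
result <- suppressWarnings(chisq.test(tab))
print(result$p.value)
print(round(result$expected, 1))

# Monte Carlo p-value, which does not rely on the large-sample approximation.
result_mc <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)
print(result_mc$p.value)
```

If many expected counts fall below about 5, the simulated p-value is usually the safer one to report.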

Project: How weather and time of day impact students' decisions to ride the shuttle

 I want to know how weather and time of day affect the way people decide to take the CWRU shuttle. The CWRU shuttle is used by many on campus to get from one end to the other. However, the CWRU campus is still small enough to walk across. Some students think the shuttles are unreliable, but this unreliability may be due to the way the shuttles are structured. There is no set schedule, which means that it is very difficult to plan your day around catching a shuttle. Each shuttle runs on its own time frame. Some only run at rush hours or are more frequent during rush hours, some only run at night, and some only run during the winter months. Sometimes the shuttles overfill and stop accepting people. I would like to study how people decide to ride the shuttle based on changing conditions. I would like to determine how the shuttles can best accommodate people's habits so that the shuttles can be more reliable.

Homework: Bayesian Thinking

  I took a quick test to determine how good I am at Bayesian thinking.  (1) A doctor selects you at random to be tested for disease D. 1 in 10,000 people in the USA have disease D. The test is 99% accurate, meaning that the probability of a false positive is 1%. The probability of a false negative is 0.  You test positive. What is the new probability that you have disease D? I first thought that since the test has a high accuracy and there are no false negatives, the probability that I have the disease after testing positive would be pretty high. However, I was surprised to see that the probability of having the disease given that I tested positive is only about 1%, which is very low.  A test with no false negatives and a very low false positive rate does not make a positive result reliable on its own, because we do not know for sure whether someone has the disease. Since the disease is very rare, it is much more likely that any given person does not have it. This skews the calculations fo...
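The arithmetic behind that roughly 1% answer can be checked with Bayes' rule. Below is a minimal R sketch using only the numbers stated in the question; the variable names are my own.

```r
# Numbers taken from the question above.
prior          <- 1 / 10000   # P(D): 1 in 10,000 people have the disease
sensitivity    <- 1           # P(positive | D): no false negatives
false_positive <- 0.01        # P(positive | no D): 1% false positive rate

# Total probability of testing positive, with or without the disease.
p_positive <- sensitivity * prior + false_positive * (1 - prior)

# Bayes' rule: P(D | positive) = P(positive | D) * P(D) / P(positive)
posterior <- sensitivity * prior / p_positive
print(posterior)   # a little under 0.01, i.e. about 1%
```

Out of 10,000 people tested, about 1 has the disease and about 100 healthy people test positive anyway, so a positive result is roughly a 1-in-101 chance of being a true case.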

Survey

Mysteries in Decision Making that Interest Me

Mystery 1: How does a constantly changing landscape affect our everyday decisions? This involves constantly updating what we know about the world, trying to guess what the outcome of each decision would be, and determining what risks are worth absorbing. Decision theory addresses this exact mystery by determining the probability of type I and type II errors and comparing that to an acceptable level of risk. Since humans are not good at Bayesian thinking, we are not good at calculating type I and II errors in decision theory, so it might be interesting to determine to what extent we use decision theory in our decision-making process. I would survey people about what their acceptable level of risk is for errors and then ask them a series of questions with various conditions to determine how good they are at determining the risk of error in those situations.

Mystery 2: What is the interplay between logical thinking and emotions? We know that people are n...

Homework: mean, median, variance, and standard deviation

Image
  Above is a bar chart of the number of player ones who picked each option. Above is the console output from my code to find the mean, median, variance, and standard deviation for choices in the Donation game. The bar graph for Donation looks very similar to the bar graph for Trust. The dollar sign ($) makes R look at just the specified column of the data frame. na.rm = TRUE means that missing (NA) values are removed before the calculation, so only non-missing values are considered. Without it, if there are any missing values, these commands would all give "NA" as an output.  
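Here is a minimal R sketch of those four commands. The data frame `games` and its values are hypothetical stand-ins for the class data; only the `$` column selection and the `na.rm = TRUE` argument come from the description above.

```r
# Hypothetical data frame with one missing value (NOT the class data).
games <- data.frame(Donation = c(0, 2, NA, 4, 2))

# Without na.rm = TRUE, a single NA makes the result NA.
print(mean(games$Donation))

# With na.rm = TRUE, the NA is dropped before the calculation.
donation_mean   <- mean(games$Donation, na.rm = TRUE)
donation_median <- median(games$Donation, na.rm = TRUE)
donation_var    <- var(games$Donation, na.rm = TRUE)
donation_sd     <- sd(games$Donation, na.rm = TRUE)

print(c(mean = donation_mean, median = donation_median,
        variance = donation_var, sd = donation_sd))
```

The standard deviation is the square root of the variance, so the last two numbers always move together.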

Homework: Making csv files in R

Image
  Above is the graph that I created this week. It has 10 data points, each labeled with an ID 1 through 10. The axes are ID log10(ID) versus ID^2. Each of these data points also has a small amount of normally distributed noise added to the result. I chose these functions because they are both common runtime-analysis equations and I wanted to show graphically that the quadratic function grows faster than the logarithmic function. The red line is the y=x line. The data points I created are below y=x, which means that the x values increase faster than the y values. That means that the quadratic function on the x axis grows faster than the logarithmic function on the y axis. Attached is my code. It was fairly convenient to add all of these values directly into R because there were not many data points. However, I would not do it if I had a lot of data points because I would have to manually add all of the values in the first column first before I can add any of the rules in the other...
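A setup like this one can be sketched in R as follows. This is not the attached code. It assumes the y axis is ID * log10(ID), which is one reading of the description (swap in log10(id) if the plain logarithm was meant). The noise level, the column names, and the file name `growth.csv` are also my own assumptions.

```r
# Sketch only: column names, noise level, and file name are assumptions.
set.seed(7)
id <- 1:10
dat <- data.frame(
  ID        = id,
  quadratic = id^2 + rnorm(10, sd = 0.1),             # x axis: ID^2 plus noise
  loglinear = id * log10(id) + rnorm(10, sd = 0.1)    # y axis: ID * log10(ID) plus noise
)

# Write the data to a csv file in a temporary folder, then read it back.
csv_path <- file.path(tempdir(), "growth.csv")
write.csv(dat, csv_path, row.names = FALSE)
reloaded <- read.csv(csv_path)

# Plot the points, label each with its ID, and add the y = x line in red.
plot(reloaded$quadratic, reloaded$loglinear,
     xlab = "ID^2", ylab = "ID * log10(ID)")
text(reloaded$quadratic, reloaded$loglinear, labels = reloaded$ID, pos = 3)
abline(a = 0, b = 1, col = "red")
```

Building the columns from a vector like `id` avoids typing each value by hand, so the same code works whether there are 10 data points or 10,000.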