2023-11-02
Q: How much time are we expected to spend on the case studies?
A: That’s hard to say. I would recommend spending a bit of time after each lecture making sure you understand the code presented. It will eventually be included in your final report, so you’ll need to understand, describe, and explain it. After the case study has been presented, I would expect a few hours from each group member to complete the extension and write the report. Last year, students typically reported spending 4-6h on case studies (with a big range around that median).
Q: For the general project plan, how much time should we budget for working on this?
A: Students report spending ~10h on their final project.
Q: Are we allowed to work with some of our case study partners for a final project?
A: Absolutely! My hope is through the case studies students will get to know one another a bit and hopefully want to work together again!
Source: https://academic.oup.com/clinchem/article/59/3/478/5621997
Evidence suggests recent smoking and/or blood THC concentrations 2–5 ng/mL are associated with substantial driving impairment, particularly in occasional smokers.
As of 2021…
Various approaches:
Focus here: Can we identify a biomarker of recent use?
Which compound, in which matrix, and at what cutoff is the best biomarker of recent use?
Source: Hoffman et al.
Participants were:
Source: Fitzgerald et al.
Source: Hoffman et al.
Three matrices: whole blood (WB), oral fluid (OF), and breath (BR)
Variables:
ID | participant identifier
Treatment | placebo, 5.90%, 13.40%
Group | Occasional user, Frequent user
Timepoint | indicator of which point in the timeline the participant’s collection occurred
time.from.start | number of minutes from consumption
You’ll have access once your groups/repos are created… (today I want people to follow along; there will be time to try on your own soon!)
Where We’re Headed…
Results from: Hubbard et al. (2021) Biomarkers of Recent Cannabis Use in Blood, Oral Fluid and Breath
…and if there’s time PPV and Accuracy post 3h
Source: Fitzgerald et al.
# Recode treatment and group labels, then standardize column names
OF <- OF |>
  mutate(Treatment = fct_recode(Treatment,
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)"),
         Group = fct_recode(Group,
                            "Occasional user" = "Not experienced user",
                            "Frequent user" = "Experienced user")) |>
  janitor::clean_names() |>
  rename(thcoh = x11_oh_thc,
         thcv = thc_v)
❓ What’s this accomplishing?
WB <- WB |>
  mutate(Treatment = fct_recode(Treatment,
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)")) |>
  janitor::clean_names() |>
  rename(thcoh = x11_oh_thc,
         thccooh = thc_cooh,
         thccooh_gluc = thc_cooh_gluc,
         thcv = thc_v)
BR <- BR |>
  mutate(Treatment = fct_recode(Treatment,
                                "5.9% THC (low dose)" = "5.90%",
                                "13.4% THC (high dose)" = "13.40%"),
         Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)"),
         Group = fct_recode(Group,
                            "Occasional user" = "Not experienced user",
                            "Frequent user" = "Experienced user")) |>
  janitor::clean_names() |>
  rename(thc = thc_pg_pad)
❓ We’re doing very similar things across three similar (albeit different) datasets. What would be a better approach?
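One possible answer is to move the shared steps into a function and call it once per dataset. The sketch below is only an illustration: the name clean_shared, its recode_group argument, and the toy data frame are hypothetical, and it assumes the dplyr, forcats, and janitor packages are installed. The dataset-specific rename() calls would still be applied afterwards.

```r
library(dplyr)
library(forcats)

# Hypothetical helper: apply the recoding steps that the datasets have in common.
# recode_group is an assumed switch, since the WB code above leaves Group untouched.
clean_shared <- function(df, recode_group = TRUE) {
  df <- df |>
    mutate(Treatment = fct_recode(Treatment,
                                  "5.9% THC (low dose)" = "5.90%",
                                  "13.4% THC (high dose)" = "13.40%"),
           Treatment = fct_relevel(Treatment, "Placebo", "5.9% THC (low dose)"))
  if (recode_group) {
    df <- df |>
      mutate(Group = fct_recode(Group,
                                "Occasional user" = "Not experienced user",
                                "Frequent user" = "Experienced user"))
  }
  janitor::clean_names(df)
}

# Toy data standing in for one of the real datasets (made-up values)
toy <- data.frame(
  Treatment = c("Placebo", "5.90%", "13.40%"),
  Group = c("Not experienced user", "Experienced user", "Experienced user"),
  time.from.start = c(-10, 25, 60)
)

toy_clean <- clean_shared(toy)
names(toy_clean)
levels(toy_clean$treatment)
```

With a helper like this, a fix to the treatment labels only has to be made in one place rather than three.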
We’ll need these later in our functions
# Each row defines one collection window: (start, stop] in minutes, plus its label
timepoints_WB <- tibble(start = c(-400, 0, 30, 70, 100, 180, 210, 240, 270, 300),
                        stop = c(0, 30, 70, 100, 180, 210, 240, 270, 300,
                                 max(WB$time_from_start, na.rm = TRUE)),
                        timepoint = c("pre-smoking", "0-30 min", "31-70 min",
                                      "71-100 min", "101-180 min", "181-210 min",
                                      "211-240 min", "241-270 min",
                                      "271-300 min", "301+ min"))
…and in BR and OF
timepoints_BR <- tibble(start = c(-400, 0, 40, 90, 180, 210, 240, 270),
                        stop = c(0, 40, 90, 180, 210, 240, 270,
                                 max(BR$time_from_start, na.rm = TRUE)),
                        timepoint = c("pre-smoking", "0-40 min", "41-90 min",
                                      "91-180 min", "181-210 min", "211-240 min",
                                      "241-270 min", "271+ min"))
timepoints_OF <- tibble(start = c(-400, 0, 30, 90, 180, 210, 240, 270),
                        stop = c(0, 30, 90, 180, 210, 240, 270,
                                 max(OF$time_from_start, na.rm = TRUE)),
                        timepoint = c("pre-smoking", "0-30 min", "31-90 min",
                                      "91-180 min", "181-210 min", "211-240 min",
                                      "241-270 min", "271+ min"))
assign_timepoint
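A minimal sketch of what a helper like assign_timepoint could look like, written to be consistent with how it is called with map_chr and a timepoints table. The body is an assumption, not a confirmed definition: it treats each window as (start, stop], which matches labels such as "0-30 min" followed by "31-70 min", and it returns NA when the time is missing or falls outside every window. The toy_timepoints table is made up for illustration.

```r
# Assumed implementation: look up the label of the window containing x
assign_timepoint <- function(x, timepoints) {
  if (is.na(x)) {
    return(NA_character_)
  }
  # Windows are treated as (start, stop]: start excluded, stop included
  hit <- timepoints$timepoint[x > timepoints$start & x <= timepoints$stop]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# Small illustrative lookup table (first three windows only)
toy_timepoints <- data.frame(
  start = c(-400, 0, 30),
  stop = c(0, 30, 70),
  timepoint = c("pre-smoking", "0-30 min", "31-70 min"),
  stringsAsFactors = FALSE
)

assign_timepoint(-10, toy_timepoints)  # falls in the pre-smoking window
assign_timepoint(30, toy_timepoints)   # upper bound is included in "0-30 min"
assign_timepoint(45, toy_timepoints)
assign_timepoint(NA, toy_timepoints)   # missing time stays missing
```

Returning NA_character_ rather than a bare NA keeps the result type consistent, which matters when the function is applied with map_chr.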
🧠 What’s a UDF? What do you think this is doing?
WB <- WB |>
  mutate(timepoint_use = map_chr(time_from_start,
                                 assign_timepoint,
                                 timepoints = timepoints_WB),
         timepoint_use = fct_relevel(timepoint_use, timepoints_WB$timepoint))
# let's get a sense for what this did
levels(WB$timepoint_use)
[1] "pre-smoking" "0-30 min" "31-70 min" "71-100 min" "101-180 min"
[6] "181-210 min" "211-240 min" "241-270 min" "271-300 min" "301+ min"
Note: the map_* functions allow you to apply a function across multiple “things” (here: to each value of time_from_start, i.e. once per row of the dataframe)
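To make the note concrete, here is a small standalone example of map_chr, assuming the purrr package is installed. The function label_minute and its cutoff argument are hypothetical; the point is only that map_chr calls the function once per element, passes extra named arguments through (as with timepoints = above), and returns a character vector.

```r
library(purrr)

# Hypothetical function: label a single time value relative to a cutoff
label_minute <- function(x, cutoff) {
  if (x < cutoff) "pre-smoking" else "post-smoking"
}

minutes <- c(-5, 12, 45)

# label_minute is called three times, once per element of minutes;
# cutoff = 0 is forwarded to every call
labels <- map_chr(minutes, label_minute, cutoff = 0)
labels
```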
❓What do you think the above is doing?
OF <- OF |>
  mutate(timepoint_use = map_chr(time_from_start,
                                 assign_timepoint,
                                 timepoints = timepoints_OF),
         timepoint_use = fct_relevel(timepoint_use, timepoints_OF$timepoint))
BR <- BR |>
  mutate(timepoint_use = map_chr(time_from_start,
                                 assign_timepoint,
                                 timepoints = timepoints_BR),
         timepoint_use = fct_relevel(timepoint_use, timepoints_BR$timepoint))
❓What do you think the above is doing?
Cleaned/wrangled files as CSVs:
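Note: saving to CSV can lose the “type” of an object (e.g., factor levels and their order), so factors may need to be re-created after reading the files back in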
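A short base R illustration of this point, using made-up toy data and a temporary file rather than the real case study files. It shows a factor coming back from a CSV as plain character, and one way to restore the intended level order.

```r
# Toy data; the level order is deliberately not alphabetical
tp_levels <- c("pre-smoking", "0-30 min", "31-70 min")
toy <- data.frame(
  id = 1:3,
  timepoint_use = factor(c("0-30 min", "pre-smoking", "31-70 min"),
                         levels = tp_levels)
)

# Round trip through a CSV file
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)
toy_back <- read.csv(path, stringsAsFactors = FALSE)

class(toy$timepoint_use)       # factor before writing
class(toy_back$timepoint_use)  # character after reading back

# Re-create the factor with the intended level order
toy_back$timepoint_use <- factor(toy_back$timepoint_use, levels = tp_levels)
levels(toy_back$timepoint_use)
```

If preserving types matters more than having a portable text file, saveRDS() and readRDS() keep factors intact.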