top of page
Completed
MIT Koch Institute
TIMELINE
FIELD
ROLE
2025
Data science &
Student Researcher
STATUS
Completed
Computational biology
OVERVIEW
WHAT I DID




TAKEAWAYS
The visuals, as a synthesis, demonstrate that a high-iron diet impacts genes that have the potential to alter cancer outcomes. As demonstrated by the heatmaps, the introduction of iron creates noticeable gene-expression separation, indicating that high iron has a consistent transcriptional effect in PDAC. Specifically, it affects genes like Hmox1, Cbr3, and especially Gsta3––all of which are upregulated oxidative stress response genes, which fits into the biological context that anti-oxidative measures are taken to reduce ferroptosis (iron-driven, regulated cell death) when iron is introduced to the system.
This data can be interpreted two ways. For one, there is clear defensive action against ferroptosis, indicating that high-iron is indeed putting stress on cancerous cells. However, in this case, since anti-oxidative genes are upregulated and others like Tfrc (which regulates cellular iron intake) are downregulated, the PDAC cells are demonstrating defense and adaptation, which is concerning––treatment aims to neutralize cancer cells, not make them more resilient. Further observation, including an analysis of tumor size, may be necessary for positive/negative classification of a high-iron diet.
library(DESeq2)
library(tidyverse)
library(dbplyr)
library(EnhancedVolcano)
library(ComplexHeatmap)
data <- read.delim("/Users/seciluluderya/Desktop/salmon.merged.gene_counts.tsv")
tpm <- read.delim("/Users/seciluluderya/Desktop/salmon.merged.gene_tpm.tsv")
data$gene_name <- NULL
head(data)
countData <- column_to_rownames(data, var = "gene_id") %>% round()
colData <- data.frame(column = colnames(countData),
group = c("PDAC_Control","PDAC_Control", "PDAC_Control", "PDAC_iron", "PDAC_iron", "PDAC_iron", "WT_control", "WT_control", "WT_control", "WT_iron", "WT_iron", "WT_iron"))
dds <- DESeqDataSetFromMatrix(countData = countData,
colData = colData,
design = ~ group)
dds <- DESeq(dds)
res <- results(dds, contrast = c("group","PDAC_iron","PDAC_Control")) %>% as.data.frame()
res <- na.omit(res)
res[res$padj > 0.05, "direction"] <- "ns"
res[res$padj < 0.05 & res$log2FoldChange > 1, "direction"] <- "up"
res[res$padj < 0.05 & res$log2FoldChange < -1, "direction"] <- "down"
res$direction <- ifelse(res$padj > 0.05 | abs(res$log2FoldChange) < 1 , "ns",
ifelse(res$padj < 0.05 & res$log2FoldChange >1, "up", "down"))
group_by(res, direction) %>% summarise(count = n())
#BREAK
pdf("./VolcanoPlot2.pdf", width = 8, height = 6)
EnhancedVolcano(res,
lab = rownames(res),
x = 'log2FoldChange',
y = 'pvalue')
dev.off()
#Create variable for results without the not significant ones (gene names)
#Get actual numeric data for those values
onlySig <- res[res$direction != "ns",]
onlySigColumns <- rownames_to_column(onlySig, var = "gene_id")
onlySigColumns <- onlySigColumns[order(onlySigColumns$log2FoldChange), ]
onlySigColumnNames <- onlySigColumns$gene_id
tpm <- tpm[match(onlySigColumnNames, tpm$gene_id), ]
tpmSig <- tpm[tpm$gene_id %in% onlySigColumnNames, ]
tpmSig$gene_name <- NULL
rownames(tpmSig) <- NULL
tpmSig <- column_to_rownames(tpmSig, var = "gene_id")
onlySigMatrix <- tpmSig[, c(1:6)] %>% as.matrix()
onlySigMatrix <- onlySigMatrix[rowSums(onlySigMatrix) > 0,]
onlySigMatrix <- log2(onlySigMatrix + 1)
Heatmap(onlySigMatrix, show_row_names = F)
count_matrix <- countData[c(1:1000),c(1:6)] %>% as.matrix()
count_matrix <- count_matrix[rowSums(count_matrix) > 0,]
count_matrix <- log2(count_matrix + 1)
heatmap(count_matrix)
vsd <- vst(dds, blind = FALSE) # variance stabilizing transformation
# Quick DESeq2 PCA plot
plotPCA(vsd, intgroup = "group")
EnhancedVolcano(
res,
lab = rownames(res), # gene names
x = 'log2FoldChange', # x-axis
y = 'pvalue', # y-axis
pCutoff = 0.05, # significance threshold
FCcutoff = 1, # log2 fold-change threshold
pointSize = 2.5,
labSize = 3,
title = 'Simple Volcano Plot'
)
bottom of page