Biosensor Data | My Data Science Website


Completed

Brown University

Cleaning and Analysis of Patient Data from Biosensor Technology

Cleaning and analysis of a patient's TAC (Transdermal Alcohol Concentration) readings over a time interval, using pandas DataFrames in Python.

Timeline: 2025
Field: Data science
Role: Student Researcher
Status: Completed

Background Information

Biosensors are devices that record data at repeated intervals. The one in this project measures alcohol levels, reported as TAC (Transdermal Alcohol Concentration), and is intended for individuals struggling with substance abuse. The data comes from a wearable biosensor in the form of a wristband or watch. It measures alcohol concentration through the skin (hence "transdermal" in TAC) and records it for interpretation. The ultimate goal of this technology is to alert users when their alcohol levels get alarmingly high, enabling self-moderation and protecting them from dangerous levels of intoxication. In the future, if this technology becomes highly accurate, it could even be used to determine whether users are sober enough for activities like driving.

My Task

Interpreting data is crucial to developing this biosensor technology and improving its accuracy. My role was to assist with this: working with pandas DataFrames in Python, I took a real biosensor dataset, cleaned it of gaps, artifacts, and nonwear, and provided a final analysis of the data's usability.
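
To give a sense of what a gap check can look like, here is a minimal sketch. It is not the project's actual code: the function name `find_gaps`, the `expected` sampling interval, and the `tolerance` factor are all assumptions made for illustration. It treats any pause between consecutive readings that is noticeably longer than the expected interval as a gap.

```python
import pandas as pd


def find_gaps(timestamps: pd.Series, expected: pd.Timedelta,
              tolerance: float = 1.5) -> pd.DataFrame:
    """Return one row per gap: the readings on either side and the time between them.

    `expected` and `tolerance` are hypothetical parameters; the real sampling
    interval of the device is not stated in this write-up.
    """
    ts = pd.to_datetime(timestamps).sort_values().reset_index(drop=True)
    delta = ts.diff()                       # time since the previous reading
    is_gap = delta > expected * tolerance   # the first row is NaT, so never a gap
    return pd.DataFrame({
        "gap_start": ts.shift(1)[is_gap],
        "gap_end": ts[is_gap],
        "duration": delta[is_gap],
    }).reset_index(drop=True)
```

Called on a column of timestamps, it returns an empty table when the readings are evenly spaced and one row for each interruption otherwise.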

Code Overview

Cleaning this data required thorough planning and organization. After importing the appropriate libraries, setting display options, and reading in the data file, I wrote three functions for detecting unusable data: one for gaps, one for nonwear, and one for artifacts (unusual spikes or dips in the graph). As the graph pictured to the right shows, the artifact function flagged a spike over this interval. To clean it, I took three steps: erase the data surrounding the spike, create a line of best fit for that interval, and fill the gap with that line. After cleaning the data, I plotted the improved graph, with a line in place of the previous artifact. It is displayed below:

As you can see, the artifact in the original dataset has been erased and replaced with a line of best fit. With the spike gone, the vertical axis covers a smaller range, which is why the rest of the graph appears scaled up.
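
The erase, fit, and replace steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the function names, the rolling-median spike test, its `window` and `max_deviation` settings, and the choice to fit the line to a few readings on either side of the erased interval (`context`) are all hypothetical.

```python
import numpy as np
import pandas as pd


def flag_spikes(tac: pd.Series, window: int = 11,
                max_deviation: float = 20.0) -> pd.Series:
    """Flag readings that sit far from the local rolling median.

    `max_deviation` is in the same units as the TAC readings (assumed value).
    """
    median = tac.rolling(window, center=True, min_periods=1).median()
    return (tac - median).abs() > max_deviation


def replace_with_best_fit(tac: pd.Series, start: int, stop: int,
                          context: int = 5) -> pd.Series:
    """Replace positions start..stop (inclusive) with a least-squares line
    fitted to up to `context` readings on each side of that interval.

    Assumes the neighbouring readings contain no missing values.
    """
    cleaned = tac.astype(float).copy()
    x = np.arange(len(cleaned))
    # Step 1: erase the data surrounding the spike
    cleaned.iloc[start:stop + 1] = np.nan
    # Step 2: fit a straight line to the readings next to the erased interval
    lo = max(start - context, 0)
    hi = min(stop + context + 1, len(cleaned))
    neighbours = np.r_[lo:start, stop + 1:hi]
    slope, intercept = np.polyfit(x[neighbours],
                                  cleaned.to_numpy()[neighbours], deg=1)
    # Step 3: fill the erased interval with the fitted line
    cleaned.iloc[start:stop + 1] = slope * x[start:stop + 1] + intercept
    return cleaned
```

A rolling median is used for detection because a short spike barely moves it, so the spike itself stands out. The positions where `flag_spikes` returns `True` give the `start` and `stop` to pass to `replace_with_best_fit`.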

Click HERE to see my complete, organized code

Analysis

After handling the artifact, I analyzed several features of the graph and their significance in a real-world context. This involved some math, the results of which are shown below:

Percentage of nonwear data: 0.0%
Percentage of useless and later altered data: 0.05%
Percentage of values outside of plausible TAC range: 0.03%
Peak TAC value: 99.72 ug/L
Rise duration: 5.43 hours
Fall duration: 6.57 hours
Rise rate: 18.35 ug/L per hour
Fall rate: 15.19 ug/L per hour
TAC level average: 32.37 ug/L

I also wrote a general analysis:

The data collected for this alcohol event is trustworthy and can be used as valid information. Aside from one artifact, which represents only 0.05% of the data, there were no other critical issues such as gaps or nonwear. The biosensor did record a negative TAC value in some instances (not physically possible), but those represent only 0.03% of the data, so they are not a cause for concern.

The data does suggest that there was a drinking event. Around 10-11pm on March 5, we start to see an increase in TAC with a rise rate of 18.35 for about 5.43 hours. Later, we see the TAC level begin to fall at a rate of 15.19 for about 6.57 hours.

The user seems to be drinking at a moderate level, with a peak TAC of 99.72 ug/L and an average of 32.37 ug/L throughout the day, so no warnings are needed on that front. Though the artifact could simply have been a fluke in the device, it may be useful to notify the user in case it had an external cause such as misuse.
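
For readers curious how figures like the peak, rise and fall durations, rates, and the share of implausible values can be derived, here is one possible sketch. It is not the project's actual code, and the definitions are assumptions: the event is taken to start at the first reading above a hypothetical `threshold` and end at the last one, each rate is the change in TAC divided by the hours elapsed, and "implausible" is taken to mean negative.

```python
import pandas as pd


def summarize_event(tac: pd.Series, threshold: float) -> dict:
    """Summarize one drinking event from TAC readings indexed by timestamp.

    Assumes a DatetimeIndex, a single event, and a peak that is neither the
    first nor the last elevated reading (otherwise a duration would be zero).
    """
    elevated = tac[tac > threshold]
    start, end = elevated.index[0], elevated.index[-1]
    peak_time = tac.idxmax()
    peak = tac.loc[peak_time]
    one_hour = pd.Timedelta(hours=1)
    rise_hours = (peak_time - start) / one_hour
    fall_hours = (end - peak_time) / one_hour
    return {
        "peak": peak,
        "rise_hours": rise_hours,
        "fall_hours": fall_hours,
        "rise_rate": (peak - tac.loc[start]) / rise_hours,
        "fall_rate": (peak - tac.loc[end]) / fall_hours,
        "mean": tac.mean(),
        # Share of readings below zero, as a percentage
        "pct_negative": 100 * (tac < 0).mean(),
    }
```

Different choices of threshold or event boundaries will shift the durations and rates, so the numbers are only comparable between datasets when the same definitions are used.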

The Future

TAC-tracking biosensor technology is a new and exciting field of health research. While it is already bettering people's lives, there are still many design improvements to consider for these biosensors. Many of them, however, cannot be achieved without precise data collection strategies, which are the backbone of the device's purpose: collecting data at intervals.
At Brown, my peers and I debated different ways to approach data collection. Our instructor told us, funnily enough, that in the early development stages of the technology, the researchers themselves would drink to collect accurate data. We all agreed, however, that this would not be a sustainable practice once the biosensors became widely used.
Some of our ideas to help with accuracy included:
- Having users write a note before drinking
- Having users track their drinks with photographs through an app
- Having users record their meals throughout the day
- Combining TAC tracking with sleep and heart rate data

We also discussed more physical design-based elements, like reducing the tracker's size or making it look more like a watch to prevent social stigma.

Overall, my experience at Brown as well as my project taught me that you can learn so much when you fully immerse yourself in a topic and commit to overcoming setbacks and confusion along the way.

Full Code

Above is the bulk of the important data-cleaning and analysis related code. The rest includes some function execution and print statements.

Secil Uluderya · © 2026


datanet.blog
