SafeSight Analytics – Industrial Safety Dashboard

An NLP-driven incident analysis dashboard, built in Looker Studio, that clusters safety reports and identifies severity drivers.

SafeSight Analytics: AI-Driven Safety Incident Analysis for Methanex

 

  • Data Science / Industrial Safety Analytics

  • This project was completed as part of the Google Cloud × Methanex × UBC Sauder Hackathon 2026

  • In collaboration with Christy Hoang, Doreen Zhu, Lena Enkhbayar, and Monica Li

Objective

Industrial safety incidents are often investigated individually, making it difficult to identify recurring patterns across locations and operations.

The goal of this project was to apply data science and natural language processing (NLP) to historical incident reports in order to:

  • Identify recurring safety patterns across sites

  • Detect high-risk operational conditions

  • Predict incident severity from narrative reports

  • Provide data-driven prevention insights

The ultimate aim is to shift safety management from reactive investigation to proactive risk prevention.

-----------------------------------------------------------------------------------------------------------------------------------

Primary Tools

Python (NLP & Machine Learning)

  • TF-IDF text vectorization

  • BERTopic clustering

  • Random Forest classification

Google Cloud Ecosystem

  • BigQuery (data storage and analytics)

  • Looker Studio (interactive safety dashboard)

AI & Application Layer

  • Claude AI (text extraction and incident analysis)

  • Streamlit (interactive severity prediction app)
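To make the first Python tool above concrete, here is a minimal sketch of TF-IDF weighting written with only the standard library. It is not the project's code: the sample reports are made up, the `tokenize` and `tfidf` helpers are hypothetical, and the formula is the plain textbook form (term frequency times log of N over document frequency). Library implementations such as scikit-learn's use a smoothed variant, so their numbers would differ.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many reports does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up incident narratives, for illustration only.
reports = [
    "Worker slipped on wet floor near the loading dock",
    "Valve leak detected during routine inspection",
    "Worker slipped on ladder during valve inspection",
]

vectors = tfidf(reports)

# Terms unique to one report score higher than terms shared across reports.
for term in ("floor", "worker"):
    print(term, round(vectors[0][term], 4))
```

In the first report, "floor" outweighs "worker" because "worker" also appears in the third report. This is the property that lets a downstream model, such as the Random Forest classifier listed above, focus on distinctive wording.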

-----------------------------------------------------------------------------------------------------------------------------------

Overview

Safety reports were originally stored as unstructured PDF documents, making them difficult to analyze systematically.

Key challenges included:

  1. Incidents are reviewed case by case

  2. Limited ability to identify cross-site patterns

  3. Data silos preventing organizational learning

  4. No predictive insight into high-severity risk scenarios

To address this, we built a data pipeline that converted unstructured incident reports into structured datasets for analysis.
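The conversion step can be sketched as follows. This is an illustration under stated assumptions, not the team's pipeline: it assumes the PDF text has already been extracted, and it swaps in a simple regular-expression parser where the project used Claude AI for text extraction. The field labels (`Site`, `Date`, `Severity`, `Description`), the `IncidentRecord` schema, and the sample text are all hypothetical.

```python
import csv
import io
import re
from dataclasses import asdict, dataclass, fields


@dataclass
class IncidentRecord:
    """Hypothetical schema for one structured incident row."""
    site: str
    date: str
    severity: str
    description: str


# Matches lines such as "Severity: High" (assumed report layout).
FIELD_PATTERN = re.compile(
    r"^(Site|Date|Severity|Description):\s*(.*)$", re.MULTILINE
)


def parse_report(text):
    """Turn labelled report text into a record; missing fields stay empty."""
    found = {
        label.lower(): value.strip()
        for label, value in FIELD_PATTERN.findall(text)
    }
    return IncidentRecord(
        **{f.name: found.get(f.name, "") for f in fields(IncidentRecord)}
    )


def to_csv(records):
    """Serialize records to CSV text, ready to load into a table."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[f.name for f in fields(IncidentRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


# Made-up report text, for illustration only.
sample_text = """Site: Plant A
Date: 2025-01-15
Severity: High
Description: Valve leak detected during routine inspection
"""

record = parse_report(sample_text)
csv_text = to_csv([record])
print(csv_text)
```

Once every report shares one schema, the rows can be loaded into a warehouse table and compared across sites, which a collection of separate PDFs does not allow.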
