Abstract Pathway

USA Wildfires

Client

CareerFoundry

Year

2021


Deriving insights from a historical US wildfires data set.
 

"...select the data set to analyse, with the goal of conducting an exploratory visual analysis in Python...use various advanced analytical approaches to uncover insights...the results of which will be presented in a Tableau dashboard/storyboard."

Self-Guided Context

  • This analysis was commissioned to explore data originally collected to support the US national Fire Program Analysis (FPA) system over a 27-year period from 1992 to 2018.
     

  • Main aims of the dashboard:

    • Provide insight into trends

    • Explore data regarding causes

    • Suggest possible ways to reduce the number of wildfires and / or the acres burned


Key Questions

  • What are the leading causes of wildfires?

  • Where do fires occur?

  • Are there any trends?

  • What can be recommended to limit the impact of wildfires?


Tools

  • Excel
     

  • Python
     

  • Tableau


Data


Skills

  • Python:

    • Importing Libraries

    • Importing and Exporting datasets

    • Data wrangling & merging

    • Visualisations with Python libraries

    • Data correlation

    • Regression analysis

    • Machine learning
       

  • Tableau:

    • Various graphs including combination, treemaps & temporal

    • Creating a storyboard

Further Python & Tableau Analysis

Although I undertook advanced analysis in Python, including methods to find correlations, I was not able to derive significant insights from it.

​

Using a few of the data subsets I created in Python and several types of graphs, I was able to derive a number of insights in Tableau relating to causes, location (state) and fire size.


Understanding the Data & Initial Exploration

I chose the data set for two reasons:

 

Firstly, as the impact of wildfires is being increasingly felt across the world, I wanted to gain some understanding of their causes.

 

Secondly, I was keen to use a larger data set from which I was hoping to derive numerous insights. The original data set contained >2 million records.


Although I was able to determine an upward trend in the area of land affected by wildfires, drilling further down into the data became quite challenging due to the number of records and columns. My initial Python data exploration did not yield much by way of insights.

 

To overcome this, I created various subsets of the data based on location, causes and size of the fires.
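The subsetting step described above can be sketched in pandas as follows. This is only a minimal illustration, not the project's actual code: the column names (`state`, `cause`, `year`, `acres`), the sample rows and the size-class boundaries are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample rows; the real data set has >2 million records
# and its column names may differ from the ones assumed here.
fires = pd.DataFrame({
    "state": ["CA", "AK", "TX", "CA", "AK"],
    "cause": ["Natural", "Natural", "Accidental", "Malicious", "Natural"],
    "year": [2014, 2015, 2016, 2017, 2018],
    "acres": [0.5, 12000.0, 150.0, 6000.0, 40.0],
})

# Subset by location: keep only fires recorded in one state.
alaska = fires[fires["state"] == "AK"]

# Subset by cause: keep only fires with a natural cause.
natural = fires[fires["cause"] == "Natural"]

# Subset by size: bin the burned area into classes (assumed boundaries),
# using left-closed intervals so 5,000 acres falls in the top class.
bins = [0, 100, 5000, float("inf")]
labels = ["<100", "100-4,999", "5,000+"]
fires["size_class"] = pd.cut(fires["acres"], bins=bins, labels=labels, right=False)
large = fires[fires["size_class"] == "5,000+"]

# Total acres burned per size class, ready to export for charting.
acres_by_class = fires.groupby("size_class", observed=True)["acres"].sum()
print(acres_by_class)
```

Each subset could then be written out with `DataFrame.to_csv` and loaded into Tableau as a smaller, more focused data source.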

Insights & Concluding Thoughts

The number of large fires has been increasing over time

Over the last 10 years, the main causes of wildfires (other than Other/Unknown) have been:

Natural / Accidental Industry / Accidental

Although small in number, the largest fires (5,000+ acres) have caused the most damage by a large margin: more than 122 million acres of burned land

Over the last 5 years, Alaska had the largest area of land impacted by natural cause wildfires.

Concluding Thoughts:
 

• Wildfires cause mass devastation, impacting large populations, wildlife and the environment as a whole
 

• Although the number of wildfires has not significantly grown, they are getting larger and impacting larger areas of land
 

• Dealing with wildfires costs the government billions of dollars, money that could be spent elsewhere
 

• Can any further preventative measures be taken to deal with the root causes?

Key Learnings

Technical learnings
 

Importing a SQL database into Python
 

Due to its size (>2 million records), the data set I chose was only available as a SQL database. Although I had the option to use SQL to create a subset of the data before exporting it to Excel and then importing it into Python for analysis, I opted to learn how to connect the SQL database directly to Python, using resources outside the scope of the course. I chose to do this because the knowledge will be useful for future projects.
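One way such a direct connection might look is sketched below. It assumes the database is SQLite (the page does not say which SQL engine was used) and uses a small in-memory database with a hypothetical `fires` table so the example runs on its own; with a real file, the file path would be passed to `sqlite3.connect` instead of `":memory:"`.

```python
import sqlite3

import pandas as pd

# Assumption: a SQLite database. The in-memory database and the "fires"
# table below are hypothetical stand-ins for the real wildfire data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fires (state TEXT, year INTEGER, acres REAL)")
conn.executemany(
    "INSERT INTO fires VALUES (?, ?, ?)",
    [("CA", 2017, 6000.0), ("AK", 2015, 12000.0), ("TX", 2016, 150.0)],
)
conn.commit()

# Filter in SQL so only the rows of interest are loaded into memory,
# then let pandas build the DataFrame straight from the query result.
query = """
    SELECT state, year, acres
    FROM fires
    WHERE acres >= ?
    ORDER BY year
"""
large_fires = pd.read_sql_query(query, conn, params=(5000,))
conn.close()

print(large_fires)
```

Filtering inside the query keeps the intermediate Excel export out of the workflow and avoids loading all records when only a subset is needed.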
 

 

Categorical values in K-means clustering machine learning
 

A K-means clustering algorithm helps identify groups in data sets that would often not be easily visible. One limitation of the technique is that categorical data has to be converted to numerical values. Although I understood the requirement, my tutor pointed out that the method I had used to convert the categorical data was incorrect. The feedback gave me the opportunity to learn another technique in Python: 'one-hot encoding'. Again, this is something that will be useful for future work.
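The sketch below illustrates why the encoding choice matters for a distance-based method such as K-means. It contrasts one-hot encoding with plain integer codes, which is a common alternative shown here only for comparison; the page does not state which method was originally used. The `cause` column and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical column with three unordered categories.
causes = pd.DataFrame({"cause": ["Natural", "Accidental", "Malicious"]})

# Integer codes (for contrast only): categories are numbered alphabetically,
# which imposes an order and unequal spacing that has no real meaning.
codes = causes["cause"].astype("category").cat.codes.to_numpy(dtype=float)

# One-hot encoding: one 0/1 column per category, so no category is
# numerically "closer" to another.
one_hot = pd.get_dummies(causes["cause"], dtype=float)


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


code_dist = pairwise_distances(codes)
one_hot_dist = pairwise_distances(one_hot.to_numpy())

print(code_dist)     # distances differ between pairs of categories
print(one_hot_dist)  # every pair of distinct categories is equally far apart
```

Because K-means assigns points to the nearest centroid by Euclidean distance, the equal spacing produced by one-hot encoding avoids treating some causes as more similar than others purely because of how they were numbered.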

​

​

Other learnings

 

Large data insights and data exploration through visualisation

 

I purposefully chose a large data set with the expectation of quickly finding interesting insights. However, it soon became clear that working with a large data set also comes with the challenge of having too much detail and not being able to derive much from analyses. To overcome this, I created various subsets of the data before conducting more basic data exploration through visualisations in Tableau. This was key in finding interesting trends and correlations in the data and formed the basis of the dashboard I created. This is something I will now have in mind earlier on when I next work with a large data set.

​

Recommendations

A 3-Pronged Approach:

1) Introduce Laws & Regulations (Malicious & Accidental Industrial Causes):
 

  • Review and revise existing national and state laws which safeguard against wildfires
     

  • Increase the severity of punishments, i.e. penalties and jail terms, for people convicted of arson-related offences to act as a deterrent

2) Create Public Awareness (Accidental & Recreational Causes):
 

• Launch a public campaign to raise awareness of the impact of wildfires. Look to target specific populations, e.g. gun owners and campers

3) Leverage the Latest Technology:

 

• Use the latest drone technology to identify wildfires early on and to help contain fires where possible

 

• Make best use of live satellite images and data to identify and monitor higher-risk states, e.g. California and Arkansas


© 2022 By Farid Chehraz. Proudly created with Wix.com
