Prediction Mapping Using GIS Data and Advanced ML Algorithms
eXtreme Gradient Boosting, K-Nearest Neighbour, Naïve Bayes, and Random Forest for Predicting Geo-Hazards and Air Pollution
Description
In this course, four supervised machine learning classification techniques are applied to remote sensing and geospatial data to address two different types of prediction problems:
Project 1: Multi-class target prediction (multi-labeled data treated as a multi-class classification problem). The target (Y) has 3 labeled classes instead of numbers, for example names, descriptions, or ordinal values (small, large, X-large), and yields multiple output maps. Examples:
An increase of a specific type of species in certain areas and its relationship with surrounding conditions.
Prediction of air pollution limits (Good, Moderate, Unhealthy, Hazardous, etc.).
Complex disease types: potential risk factors and their effects on the disease are investigated to identify those that can be used to develop prevention or intervention strategies.
Course application: Prediction of the concentration of particulate matter with a diameter of less than 10 µm (PM10).
This project was published as a research article using similar materials and a major part of the analysis (with slight modifications to the code): "Demystifying uncertainty in PM10 susceptibility mapping using variable drop-off in extreme-gradient boosting (XGB) and random forest (RF) algorithms" in the journal Environmental Science and Pollution Research.
Project 2: Binary target prediction. The target has 2 classes: Yes and No, Slide and No slide, Happened and Not happened, Contaminated and Clean. Examples:
Flooded areas and their contributing factors, such as topographic and climate data.
Climate change related consequences and their driving factors, such as urban heat islands and their relationship with land use.
Oil spills: polluted and non-polluted areas.
Course application: Landslide susceptibility mapping in a landslide-prone area.
If you enrolled in my previous course using ANN, you can compare the outcomes, as the same landslide data is used here.
Finally, all the measured data (training and testing) are used to produce the prediction map, which can be used in further GIS analysis, presented directly to decision makers, or included in research articles for SCI journals.
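As a rough illustration of how labeled samples become a prediction map, the sketch below classifies every cell of a tiny raster with a K-nearest-neighbour vote. It is not the course's code: the two normalized features (slope and wetness), the training samples, and the 2 x 3 grid are hypothetical values chosen only to show the idea.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, point, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda pair: math.dist(pair[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training samples: (normalized slope, normalized wetness).
# Label 1 = slide, 0 = no slide.
train_X = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.7),
           (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
train_y = [1, 1, 1, 0, 0, 0]

# Hypothetical 2 x 3 raster: each cell holds the same two features.
grid = [
    [(0.85, 0.75), (0.5, 0.6), (0.15, 0.2)],
    [(0.75, 0.9), (0.4, 0.3), (0.25, 0.15)],
]

# Classify every cell to build the prediction map, row by row.
prediction_map = [[knn_predict(train_X, train_y, cell) for cell in row]
                  for row in grid]

for row in prediction_map:
    print(row)
```

In a real workflow the grid would come from raster layers and the features would need consistent scaling, since KNN distances are sensitive to the units of each variable.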
This course is considered the most advanced in terms of analysis models and output maps, combining (1) machine learning algorithms and the geospatial domain; (2) freely available remote sensing data in data-scarce environments.
IMPORTANT:
LaGriSU Version 2023_03_09 is available to download for free on GitHub
(search for /Althuwaynee/LaGriSU_Landslide-Grid-and-Slope-Units-QGIS_ToolPack)
*LaGriSU: automatic extraction of training/testing thematic data using Grid and Slope Units
Best regards
Omar AlThuwaynee
What You Will Learn!
- Two applications of susceptibility prediction mapping in GIS: 1) landslide prediction maps, 2) ambient air pollution prediction maps
- Step-by-step analysis of ML algorithms for classification: eXtreme Gradient Boosting (XGBoost), K-nearest neighbour (KNN), Naïve Bayes (NB), and Random Forest (RF)
- Run classification-based algorithms on training data and assess model accuracy, Kappa index, variable importance, and sensitivity analysis of explanatory and response data
- Hyper-parameter optimization procedure and application
- Model accuracy testing and validation using the confusion matrix, and results validation using the area under the ROC curve (AUC)
- Produce prediction maps using raster and vector datasets
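The validation measures listed above (confusion matrix, accuracy, Kappa index, and AUC) can be illustrated with a minimal Python sketch. It is not the course's own code: the labels, the predicted scores, and the 0.5 decision threshold are made-up values for illustration, and the AUC is computed with the simple pairwise-ranking definition rather than by plotting the ROC curve.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

def cohen_kappa(matrix):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    total = sum(sum(row) for row in matrix)
    observed = accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

def roc_auc(y_true, scores, positive):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical test samples and model scores (probability of "slide").
y_true = ["slide", "slide", "slide", "no_slide", "no_slide", "no_slide"]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = ["slide" if s >= 0.5 else "no_slide" for s in scores]  # assumed threshold

matrix = confusion_matrix(y_true, y_pred, labels=["slide", "no_slide"])
acc = accuracy(matrix)
kappa = cohen_kappa(matrix)
auc = roc_auc(y_true, scores, positive="slide")
print(matrix, round(acc, 3), round(kappa, 3), round(auc, 3))
```

Accuracy and Kappa depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the measures are usually reported together.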
Who Should Attend!
- All students, researchers and professionals who are interested in using data mining with GIS data
- All students, researchers and professionals who work on health [virus susceptibility, noise maps, epidemic expansion, infectious disease, famine]
- All students, researchers and professionals who work on hazards [flooding, landslides, geological hazards, drought, air pollution, etc.]