Sunday, January 27, 2019

Risk Mgmt Unit 3: Predicting Risk: Approaches using Artificial Intelligence and Machine Learning

Upon successful completion of this unit, learners will be able to identify how to use artificial intelligence and machine learning to predict levels and types of risk, both known and unknown. Links to open source platforms, languages, and computing environments are provided. It is not necessary to learn the computing languages or to develop new code or programs; the goal of this unit is to familiarize learners with these tools so that they can work effectively in teams with data scientists, domain experts, and financial decision-makers.

Unit Presentation:

Video:   https://screencast-o-matic.com/watch/cqVZht3OAh



PDF (contains links to readings, etc.) 
http://zenzebra.net/risk/risk-management-nash-pt3.pdf  


Scenario 3:  Predicting Risk: Approaches using Artificial Intelligence and Machine Learning

Julia, Patricio, and Reyna are part of a team that is tasked with classifying old shallow-water offshore wells in the Gulf of Mexico in new ways that will help them develop a plan to boost production. 


They feel very fortunate in that around a million geological and production records have been scanned, and they cover the 150 or so wells in the field.  It’s a treasure trove of data, and they want to incorporate it with the new data in order to develop a profile of the best wells, as well as the good, mediocre, and underperforming wells.

Your Task: Help Julia, Patricio, and Reyna develop a plan to analyze the data, and then help them determine where, when, and how they can use artificial intelligence and machine learning to create profiles.

Here are a few things to consider:
 How will you select the data to use?
 How will you organize it?


What does it mean for a well to be:
  Excellent
  Good
  Mediocre
  Bad
 

What are the attributes or clusters of characteristics you’ll use?
 What approach will you use to select data?
  To clean the data?
  To analyze the data?
 What kind of AI / ML approach will you use?
 How will you use the results?
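
One simple way to make "Excellent / Good / Mediocre / Bad" concrete is to rank the wells on a single metric and cut the ranking at its quartiles. The sketch below is only an illustration: the well names, the metric (average daily production), and every value are made up, and a real profile would combine several attributes rather than one.

```python
from statistics import quantiles

# Hypothetical wells and average daily production (bbl/day); values are made up.
production = {
    "W-01": 420.0, "W-02": 35.0, "W-03": 180.0, "W-04": 95.0,
    "W-05": 610.0, "W-06": 12.0, "W-07": 250.0, "W-08": 140.0,
}

def tier_wells(metric_by_well):
    """Label each well by the quartile its metric falls in."""
    q1, q2, q3 = quantiles(metric_by_well.values(), n=4)
    tiers = {}
    for well, value in metric_by_well.items():
        if value >= q3:
            tiers[well] = "Excellent"
        elif value >= q2:
            tiers[well] = "Good"
        elif value >= q1:
            tiers[well] = "Mediocre"
        else:
            tiers[well] = "Bad"
    return tiers

tiers = tier_wells(production)
for well in sorted(tiers):
    print(well, production[well], tiers[well])
```

Quartile cuts always put roughly a quarter of the wells in each tier, so the team may prefer fixed thresholds if "Bad" should mean something absolute, such as production below an economic limit.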


Readings:

Overview thoughts / concepts

Lists of uses of AI / Machine Learning in the energy industry
 Upstream
  Classify wells using your own unique set of criteria
  Identify high-value (or potential high-value) blocks
 Midstream
  Classify infrastructure (pipelines, etc) with your own criteria
  Predict overall performance and the location of bottlenecks
 Downstream
  Refining
  Retail / distribution
 Wind energy
  Identify high-value, high-return new locations
  Identify small businesses that would benefit from local energy
 Solar energy

Workflow for machine learning (in general)


● Pinpoint the problem you want to solve.
● Identify the data you’ll need to use
● Collect the data
● Clean the data
● Organize your data (put it into a model; if structured, you may use open-source tools such as those from Apache Hadoop)
● Find a model
● Develop algorithms (May use repositories and also cloud-based interfaces)
● Train the model
● Test with data sets
● Reality check
● Decision points
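
The middle of this workflow (organize, train, test, reality check) can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records are synthetic, the two features (porosity and water cut) and the labels are hypothetical, and a nearest-centroid classifier stands in for whatever model the team would actually choose.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Collect / organize: synthetic records of (porosity, water cut) -> label.
records = []
for _ in range(40):
    records.append(((random.gauss(0.25, 0.03), random.gauss(0.20, 0.05)), "good"))
    records.append(((random.gauss(0.12, 0.03), random.gauss(0.70, 0.05)), "poor"))

# Hold back 25% of the records so the model is tested on data it has not seen.
random.shuffle(records)
cut = int(len(records) * 0.75)
train, test = records[:cut], records[cut:]

def fit_centroids(rows):
    """Train: average the feature values of each label."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {label: tuple(sum(col) / len(col) for col in zip(*feats))
            for label, feats in by_label.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest (squared Euclidean distance)."""
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(features, centroids[lab])))

centroids = fit_centroids(train)

# Test: fraction of held-out records labelled correctly.
accuracy = sum(predict(centroids, f) == lab for f, lab in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The reality check is the step the code cannot do: a high score on tidy synthetic data says nothing about real wells until a domain expert confirms that the predictions make geological and operational sense.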
 

How do I clean data?
 What is “dirty” data? 
  Does not make sense
  Bad labels
  Incorrect formatting
  Too many “nulls”
  Part of the data in a different order or different columns
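
As a rough illustration, each kind of "dirty" data listed above can be turned into a simple check. The sketch below assumes a hypothetical well export; the column names, the list of valid status labels, and all values are made up.

```python
import csv
import io

# Hypothetical raw export; column names and values are made up.
raw = """well_id,status,spud_date,depth_ft
W-01,Producing,1998-03-14,8450
W-02,producing ,14/03/1999,
W-03,Shut-in,2001-07-02,-120
W-04,,2003-11-30,9100
"""

VALID_STATUS = {"Producing", "Shut-in", "Abandoned"}  # assumed label set

def audit(row):
    """Return a list of the problems found in one record."""
    problems = []
    if any(value.strip() == "" for value in row.values()):
        problems.append("null")
    if row["status"].strip() and row["status"] not in VALID_STATUS:
        problems.append("bad label")
    parts = row["spud_date"].split("-")
    if not (len(parts) == 3 and len(parts[0]) == 4
            and all(p.isdigit() for p in parts)):
        problems.append("incorrect date format")
    depth = row["depth_ft"].strip()
    if depth and float(depth) <= 0:
        problems.append("does not make sense")  # a depth must be positive
    return problems

report = {row["well_id"]: audit(row)
          for row in csv.DictReader(io.StringIO(raw))}
for well_id, problems in report.items():
    print(well_id, problems or "clean")
```

An audit like this only flags records. Deciding whether to fix, fill, or drop them is a separate step that usually needs someone who knows the field.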

Brendon Bailey’s Guide:  Use Excel or Python to Clean Data?

Use Excel if:
 You have fewer than 1 million records
 You need to do the job quickly and easily
 There is a logical pattern to cleaning the data, and it is easy enough to implement using Excel functions
 The logical pattern to cleaning the data is hard to define, and you need to clean the data manually

When you might use Python or another scripting language:

Use Python if:
 You need to document your process
 You plan on doing the job on a repeat basis
 There is a logical pattern to cleaning the data, but it is hard to implement with Excel functions
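
To show what "document your process" and "repeat basis" can look like in practice, here is a minimal sketch in which each cleaning rule is a small named function and every change is written to a log. The field names, the label corrections, and the accepted date formats are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical label corrections; extend the table as new variants turn up.
STATUS_FIXES = {"producing": "Producing", "shut in": "Shut-in", "shut-in": "Shut-in"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")  # accepted input formats (assumed)

def clean_status(value):
    """Trim whitespace and map known variants to one spelling."""
    key = value.strip().lower()
    return STATUS_FIXES.get(key, value.strip())

def clean_date(value):
    """Rewrite any accepted format as ISO YYYY-MM-DD; None if unparseable."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None

def clean_record(record, log):
    """Apply every step to one record and log each change that was made."""
    cleaned = dict(record)
    for field, step in (("status", clean_status), ("spud_date", clean_date)):
        new_value = step(record[field])
        if new_value != record[field]:
            log.append((record["well_id"], field, record[field], new_value))
        cleaned[field] = new_value
    return cleaned

log = []
record = {"well_id": "W-02", "status": "producing ", "spud_date": "14/03/1999"}
cleaned = clean_record(record, log)
print(cleaned)
print(log)
```

Because the rules live in code, the same script can be rerun on next month's data, and the log records exactly what was changed and why.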


Brendon Bailey. “Data Cleaning 101.” Towards Data Science.
 https://towardsdatascience.com/data-cleaning-101-948d22a92e4

 
Where do you keep the data?
 Cloud solutions (for example, Google Cloud or Amazon Web Services (AWS))

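Software for risk analytics (commercial and free / open source):


Spotfire (commercial; http://www.spotfire.com)
Qlik (free Spotfire alternative; Qlik.com)
Jupyter Notebook (free, open source; https://jupyter.org/), which supports many languages, including:
 Python (via IPython)
 R
 C++
 Julia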


A Gallery of interesting Jupyter Notebooks (ready to share)
https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks


How do we predict where and when high-risk situations may take place?
 Analyze data
 Probabilistic analysis (Spotfire, etc.)
 Using geospatial elements
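
Probabilistic analysis can also be tried outside a commercial tool. The sketch below is a small Monte Carlo simulation, offered only as an illustration: the three sources of downtime and their ranges are invented, and they are assumed to be independent, which real risk factors often are not.

```python
import random
from statistics import quantiles

random.seed(42)  # fixed seed so the run is repeatable

def simulate_downtime_days():
    """One possible year: the sum of three uncertain downtime sources."""
    weather = random.triangular(2, 30, 8)     # low, high, most likely (made up)
    equipment = random.triangular(0, 20, 5)
    logistics = random.triangular(0, 10, 2)
    return weather + equipment + logistics

trials = [simulate_downtime_days() for _ in range(10_000)]

# Nine cut points split the trials into ten equal groups.
deciles = quantiles(trials, n=10)
pct10, pct50, pct90 = deciles[0], deciles[4], deciles[8]
prob_over_30 = sum(t > 30 for t in trials) / len(trials)

print(f"10th / 50th / 90th percentile: {pct10:.1f} / {pct50:.1f} / {pct90:.1f} days")
print(f"Chance of more than 30 days of downtime: {prob_over_30:.1%}")
```

The output is a range with probabilities attached rather than a single number, which is usually a more honest basis for a decision.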

What is the ideal combination of variables or factors to tell us when / where / how conditions are ideal for a) optimization; b) an accident or problem?
 Use multivariate analysis
 Bring together all risk factors: geological, logistical, political, economic, legal, environmental, etc.
 Weight them by importance (assign a percentage)
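
The weighting step can be sketched as a weighted average. The weights, the 0 to 10 rating scale, and the two prospects below are hypothetical; in practice the team would agree on them with domain experts.

```python
# Hypothetical weights (percent of total importance); they must sum to 100.
weights = {"geological": 30, "logistical": 15, "political": 15,
           "economic": 20, "legal": 10, "environmental": 10}

# Hypothetical ratings for two prospects, from 0 (low risk) to 10 (high risk).
prospects = {
    "Block A": {"geological": 7, "logistical": 3, "political": 2,
                "economic": 5, "legal": 2, "environmental": 6},
    "Block B": {"geological": 4, "logistical": 6, "political": 8,
                "economic": 4, "legal": 5, "environmental": 3},
}

def risk_score(ratings, weights):
    """Weighted average of the factor ratings, on the same 0-10 scale."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100 percent")
    return sum(ratings[factor] * weight
               for factor, weight in weights.items()) / 100

scores = {name: risk_score(ratings, weights)
          for name, ratings in prospects.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.2f}")
```

A single score is easy to compare but hides which factor drives it, so it helps to keep the individual ratings alongside the total.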


https://www.kinetica.com/wp-content/uploads/2017/09/OilGas_jt1.0mn.pdf
https://medium.com/syncedreview/how-ai-can-help-the-oil-industry-b853dda86be6

Learn and Use Machine Learning

TensorFlow: https://www.tensorflow.org/tutorials/keras/


TensorFlow Machine Learning Cookbook: https://github.com/nfmcclure/tensorflow_cookbook

AI and Probabilistic Models

Part I
https://medium.com/tensorflow/industrial-ai-bhges-physics-based-probabilistic-deep-learning-using-tensorflow-probability-5f215c791863


Part II
https://medium.com/tensorflow/predicting-known-unknowns-with-tensorflow-probability-industrial-ai-part-2-2fbd3522ebda


Bougher, Benjamin Bryan. (2016)  Machine Learning Applications to Geophysical Data Analysis. Open Collections. University of British Columbia.
https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0308786


Bougher, Ben B. (2016) Using the scattering transform to predict stratigraphic units from well logs. Seismic Laboratory for Imaging and Modeling (SLIM), The University of British Columbia, Vancouver

https://www.slim.eos.ubc.ca/Publications/Public/Journals/CSEGRecorder/2016/bougher2015CSEGust/bougher2015CSEGust.html

Data:  Trenton Black River gamma ray logs

Methodology:  supervised learning ("uses labelled datasets to train a classifier to make predictions about future data" (Bougher, 2016))

Methodology - what's the algorithm?  Bougher applies a scattering transform and then feeds the result to a K-Nearest Neighbours (KNN) classifier.
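
To get a feel for the KNN step, here is a minimal sketch in plain Python. It is not Bougher's code: the two features per sample (a mean gamma ray reading and a measure of variability) are hand-picked stand-ins for the scattering transform output, and the labels and values are made up.

```python
from collections import Counter
from math import dist

# Hypothetical training set: (mean gamma ray in API units, variability) -> label.
# These two numbers only stand in for the scattering transform features.
training = [
    ((35.0, 4.0), "carbonate"), ((40.0, 5.0), "carbonate"), ((38.0, 3.5), "carbonate"),
    ((110.0, 12.0), "shale"), ((120.0, 15.0), "shale"), ((105.0, 10.0), "shale"),
]

def knn_predict(training, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(training, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

low_gamma = knn_predict(training, (42.0, 4.5))     # near the low-gamma examples
high_gamma = knn_predict(training, (115.0, 11.0))  # near the high-gamma examples
print(low_gamma, high_gamma)
```

KNN has no training phase beyond storing the labelled examples, which is one reason it pairs well with a feature extraction step that does the hard work. In a real application the features would normally be scaled first so that no single one dominates the distance.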

How can I do this?

Using convolutional neural networks to solve a mineral prospectivity mapping problem
Framing the exploration task as a supervised learning problem, the geological, geochemical and geophysical information can be used as training data, and known mineral occurrences can be used as training labels. The goal is to parameterize the complex relationships between the data and the labels such that mineral potential can be estimated in under-explored regions using available geoscience data.
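
The supervised framing can be illustrated without a neural network. The sketch below swaps in logistic regression for the convolutional network, purely to show how features and labels fit together. The grid cells, the two features, and all values are hypothetical.

```python
import math

# Hypothetical grid cells: (geochemical anomaly, geophysical anomaly), scaled 0-1.
# Label 1 = known mineral occurrence, 0 = none known. All values are made up.
cells = [
    ((0.9, 0.8), 1), ((0.8, 0.7), 1), ((0.7, 0.9), 1), ((0.85, 0.6), 1),
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0), ((0.15, 0.25), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(cells, epochs=2000, rate=0.5):
    """Fit weights and bias by stochastic gradient descent on the log loss."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in cells:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(z) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
    return weights, bias

def potential(model, features):
    """Estimated probability (0-1) that a cell hosts mineralization."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

model = train(cells)
high = potential(model, (0.75, 0.8))  # under-explored cell resembling occurrences
low = potential(model, (0.2, 0.2))    # under-explored cell resembling barren ground
print(f"{high:.2f} {low:.2f}")
```

A convolutional network follows the same pattern of training data in and potential out, but it also learns from the spatial arrangement of neighbouring cells, which this stand-in ignores.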

Granek, Justin. (2016). Application of Machine Learning Algorithms to Mineral Prospectivity Mapping. Open Collections. University of British Columbia.
https://open.library.ubc.ca/cIRcle/collections/ubctheses/24/items/1.0340340




For more information about the courses (and this full course), please contact me. 


