Getting started with deep learning

Jul 22, 2021 01:52 · 8423 words · 40 minute read

Hello and welcome to this Australian BioCommons webinar on getting started with deep learning. My name is Melissa Burke and I’m the Australian BioCommons Training and Communications Officer, and I will also be your host for this webinar. This webinar is part of a series in which we aim to share useful information about the latest digital techniques, data and tools for the life sciences community. Each month we hear from our national and international peers on a bioinformatics topic that we hope will help Australian researchers to achieve their best medical, agricultural and environmental research.

00:43 - Before we begin we would like to take a moment to acknowledge the traditional owners and custodians of the lands on which we meet today. In my case, in Meanjin or Brisbane, this is the Turrbal and Jagera people. We pay our respects to their ancestors and their descendants, who continue cultural and spiritual connections to Country, and we recognise their valuable contributions to Australian and global society. Today we’re thrilled to welcome Dr Titus Tang to speak to us about getting started with deep learning.

Titus is a Senior Deep Learning Engineer at the Data Science and AI Platform at Monash University. He facilitates the development and delivery of various deep learning and AI training events at Monash, and also provides advice and hands-on assistance to researchers who are looking to apply deep learning and AI techniques in their research. Welcome to the webinar, Titus, and I’ll now hand over to you to start your presentation. Thanks Melissa for the introduction.

I’ll just bring up my screen. All right, so, thanks Melissa and Christina from Australian BioCommons for this opportunity to speak here today. My resume probably isn’t as prestigious as those of the various speakers that you usually have presenting at these webinars. While I do have a PhD, it isn’t directly in deep learning, as the field of deep learning was in its relative infancy when I started my PhD. I hope that doesn’t make me sound much, much older than I am.

However, I now do consider myself a deep learning engineer, and almost everything that I have come to learn about deep learning was done through self-learning, largely off the internet. And so I thought perhaps my experience of learning something completely new, mostly by myself, and then converting it into a practical skill, which is essentially what my job as a deep learning engineer entails, could be meaningful to others, and this is what I want to talk about today.

So Getting Started with Deep Learning is a short and concise introduction to what deep learning is, and to the key steps you could take, just like I did, in order to get started in the field. You can think of today’s content as my personalised checklist of things you need to know or to prepare for in order to get started with deep learning. This is naturally spoken from my perspective, so please take away from today’s talk what you will and adapt it for yourself.

03:27 - So in today’s talk I’m going to assume close to zero technical knowledge about deep learning. I’ll be speaking about deep learning at a very, very high level and covering steps that you could take in order to get started with deep learning hands-on. Along the way I’ll be providing pointers to various resources. Oftentimes, links to those resources will be included in the slides, and I believe Melissa will be sharing a link to these slides so that you can access those links if you wish.

This lecture is not about discussing in depth how things work inside the black box that is deep learning. We only have about 40-45 minutes to talk about the various things that I want to cover today, and so we just do not have the time to go into a technical discussion about how to implement things at the lowest level. Very importantly, I want to emphasise that today’s talk is not about machine learning in general but specifically about deep learning, a subset of machine learning.

04:36 - So let’s start with the question: how is deep learning useful? If we want to learn about something, we probably want it to be useful for our research or for our work. You would probably have seen lots and lots of examples of AI in the news or in the literature, and I could confidently say that an overwhelming majority of new advances in artificial intelligence these days are based on deep learning. So here are just a small handful among the vast collection of examples that you can look up online.

First off is OpenAI’s GPT-3. OpenAI is an AI research firm based in, I believe, San Francisco, and they released, perhaps about a year ago, what was at the time the world’s most powerful language model: an AI that is able to generate text almost at the level of human experts. GPT-3 here stands for Generative Pre-trained Transformer 3, a huge AI model with 175 billion parameters. It is a huge model that takes literally months to train, but it is able to perform tasks at the human expert level.

Then we have DeepMind, which is another AI research firm, bought over by Google, and they have been focusing on, among other things, reinforcement learning artificial intelligences. They have often showcased these technologies in the field of competitive gaming. For example, they have developed AI bots that are able to play the game of StarCraft and the game of Dota at the professional level, beating out 99.8% of human players, which is itself a great achievement. And then I’m sure you have heard about self-driving cars and self-driving trucks. These have been advancing greatly over the last few years, and I’m willing to stick my neck out and say that, in the context of urban and well-developed roads, I’m willing to trust an AI self-driving car over the average human.

And the last example here is just a point to highlight that deep learning is being applied to pretty much any field you can think of these days, and naturally you can guess that weather prediction is one of them; you can be confident that deep learning has been applied to predicting weather very successfully. Now, moving on to something closer to home, is AlphaFold, developed by DeepMind at Google. AlphaFold was released perhaps about two years ago, if I recall correctly. It is an AI that predicts the 3D structure of a protein based solely on its genetic sequence.

This problem is important to solve because the structure of a protein influences how it behaves in an organism, and malformation of this structure often leads to various kinds of diseases. Now, we know that modelling the physical structure of a protein is difficult because you have to model the interactions of the various components of that structure relative to each other, but on the other hand genetic sequencing is relatively easier, and so what AlphaFold has done is to reframe the problem of physical modelling into a form that is easier to solve.

What this basically means is that they have collected a bunch of proteins, sequenced them genetically, and measured their physical structures, and then they have a model that learns to take in the genetic sequence as input and output the physical structure. More specifically, they have developed a deep neural network that predicts distances between pairs of amino acids and the angles between the chemical bonds that connect those acids, which is a great step forward in the field of physical modelling.

09:12 - Something at a much smaller scale, but still very useful, is DeepLabCut. DeepLabCut is an open source Python package that allows you to do 3D markerless pose estimation using deep learning and neural networks. Effectively, if you’re working with animals in a lab and you would like to track the pose and the movements of individual limbs of an animal, what you could do is use this package to teach a neural network to track these different parts of an animal as they move around in their environment.

So this package is, as I mentioned, available as an open source Python package which you could quickly adopt into your research workflow.

10:02 - And something more towards the realm of science fiction is brain-to-text communication. Here we have a user who imagines writing a character, and a neural network is then trained to read electrical signals from the motor cortex of the person and to translate these electrical signals into the characters that the user is thinking of writing. The researchers who wrote this paper were able to demonstrate users writing at about 90 characters per minute with an accuracy greater than 99% using a general purpose autocorrect, and this is comparable to the average able-bodied person typing on a smartphone at about 115 characters per minute.

So this is great for people who have limited physical abilities and aren’t able to type on a phone, for example.

11:03 - Now, I probably don’t need to spend too much time talking about examples of deep learning. You’re already here, so you’re probably familiar with what deep learning can do. Let’s move on to the next question: what is deep learning? I always like to use this example to introduce deep learning, where we compare deep learning with what I would refer to as traditional software development. So let’s say you are a traditional software developer or a programmer, and your boss has tasked you with this simple task.

Given an image, you have to write a computer program that takes in this image, processes it however you define it to, and spits out an answer as to whether that image contains a dog or a cat. And the question I’m going to pose to you here is: how would you go about solving this problem from the traditional software point of view? Normally when I pose this question in the various workshops that I run, I get responses like: let’s try to differentiate between a dog and a cat by detecting pointy ears.

Pointy ears seems to be the most common answer that I get, or the shape of the nose, the shape of the face or the snout, the size of the eyes, the presence of whiskers and so on. Now let’s drill down into one of these specific examples, and let’s just use pointy ears. Keep in mind that computers aren’t really the smartest things, in the sense that they do exactly what you tell them to do. So you cannot just tell a computer “pointy ears”, because the computer has no understanding of these human concepts that we have constructed.

So you really need to go down to the lowest level possible, and at the lowest level an image is effectively just an array of numbers, where each pixel is just three numbers, usually representing the red, green and blue channels of that pixel. Now, how would you go about writing some software that takes these numbers and processes them in some form to extract what we understand as pointy ears? And what are pointy ears? Well, you could define a pointy ear as being an ear that has two straight lines that form an acute angle, for example.
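To make the “an image is just numbers” point concrete, here is a minimal sketch in Python, assuming NumPy and Pillow are installed (the libraries and the file name are illustrative choices of mine, not from the talk):

```python
import numpy as np
from PIL import Image  # Pillow; any image-loading library would do

# Load a photo and view it as the raw array of numbers the computer sees.
img = np.array(Image.open("cat.jpg"))  # "cat.jpg" is a hypothetical file

print(img.shape)  # e.g. (480, 640, 3): height x width x RGB channels
print(img[0, 0])  # the top-left pixel: just three numbers, e.g. [142 87 60]
```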

But then you need to define what an acute angle is and what a straight line is, or in fact you need to define what a line is, because the computer has no concept of a line. And even if you’re able to define the concept of a line to a computer, and obviously people have done that, you then encounter situations where you don’t see a pointy ear on a cat. Maybe the ears are occluded, maybe the ears aren’t pointy, maybe the ears consist of more than two lines, or maybe you see a different part of the cat and can’t see the ears at all.

So hopefully you get the point that traditional software development is very useful in many senses, but it is also very limited in many fundamental ways. Even in this case of differentiating between images of dogs and cats, something that a three or four year old could probably do very well, experienced traditional software developers have trouble writing computer programs that are able to solve this problem in a general fashion, let alone more complex problems involving different kinds of objects.

What I’ve talked about here can be somewhat oversimplified into this simple flow chart. First of all, you start with observing the data, and we didn’t really need to do that here because we have looked at lots and lots of dogs and cats throughout our lives, so that part is probably the easiest part. Next we need to identify a bunch of features, which is again relatively intuitive for us to do: we just have to look at the images and pick out the salient features in those images.

And the next step here is probably the most difficult, if not impossible, part of this whole process. We want to translate these human concepts that we have extracted from our data into low-level, machine-executable instructions that a computer could work on, and as we have discussed, this process is difficult and oftentimes impossible from a traditional software development point of view. And so this is where deep learning comes in, and this is where it is so powerful.

With deep learning you can see that the process here is very different. We start off with collecting a bunch of data, and we then move on to labelling the data. By label, I mean to take every single data point that we have and to associate it with a human concept; so, for example, in this case, a dog or a cat. We then take all that data and we feed it into a neural network. Now, what’s different here is that we do not tell the neural network what features to look out for. It is the neural network’s job to say: given this data, these are the features that I should extract in order to solve the problem that I’m requested to solve.

So you can think of the neural network here as automating a process. With traditional software development, what we are trying to do is automate the process of executing instructions: we are taking a human concept, we are converting it to low-level instructions, and it is the machine’s job to execute these instructions automatically. With deep learning we are taking this process of automation one step further. We are saying: let’s not only automate the process of executing instructions, let’s also automate the process of defining what those instructions are. And that replaces the hand-coding of features, which was previously the impossible part of that process.

So if we take the burden of hand-coding features off ourselves, then as deep learning engineers our job now shifts towards the left-hand portion of the graph, which is the collection, the curation and the management of data sets.

17:21 - So in other words, as a deep learning engineer, whenever we are given a problem, our question is not to ask what features I should look out for in my data. Rather, our objective is to ask what kinds of data I should prepare, collect and label, such that a neural network can do its job best.

17:48 - So that’s deep learning at a very, very high level. But you would have seen the terms artificial intelligence, machine learning and deep learning thrown around and being used somewhat interchangeably, so I just want to clarify these terms here. Artificial intelligence can be seen as the broad umbrella that encompasses everything, and, using my own definition, I like to define it as any human-designed system that provides output in response to input that is, number one, consistent and, number two, beneficially biased.

Now, I’m using the term beneficially biased because I find the words right or correct to be a bit too strong in this context. Artificial intelligence systems include pretty much everything in AI, but that also includes rule-based systems and any artificial biological organisms we might create in the future. Within the subset of artificial intelligence is machine learning. Machine learning is basically the development of a machine that autonomously learns biases from data, what is right and what is wrong, and draws borders between what is right and wrong.

So you might have heard of terms such as support vector machines, decision trees, random forests and so on; all of these fall under the umbrella of machine learning, or what I would call traditional machine learning. Deep learning is a very specific subfield of machine learning, and whenever it comes to deep learning we always work with deep neural networks.

19:20 - So what are deep neural networks? Well, the name implies that they are modelled off some kind of neural network, or specifically modelled off the human brain. The human brain, a large network in itself, consists of many, many different neurons interconnected with each other. An artificial neural network takes pretty much the same structure, where we have artificial neurons, also known as perceptrons or cells or whatever you like to call them. Each of these takes in inputs from the external world or from various other cells, processes those inputs, and then outputs its processed inputs to the next layer, to the next bunch of cells.
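As a rough sketch of what one such artificial neuron computes, here is a minimal example in Python using NumPy (the ReLU activation and the specific numbers are illustrative assumptions, not something prescribed in the talk):

```python
import numpy as np

def neuron(inputs, weights, bias):
    """One artificial neuron: weight the inputs, sum them, then apply a
    non-linear activation (here a ReLU) before passing the result on."""
    weighted_sum = np.dot(inputs, weights) + bias
    return max(0.0, weighted_sum)  # ReLU: negative sums become zero

# Three inputs from the outside world (or from three upstream cells).
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8, 0.1, -0.4])  # in a real network, learned during training
print(neuron(x, w, bias=0.2))
```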

20:08 - Now, obviously, in a network you would have multiple of these cells, these neurons, interconnected with each other, and in an artificial neural network this is typically how we structure them. Here, each circle represents a single cell, and we have cells interacting with each other. Each colour here represents a single layer, and usually cells in a layer perform the same task; that being the case, there is usually very little need for cells within a layer to communicate with each other.

And so what you see here is a lack of lines between cells of the same colour, but lots of lines, or connections, between cells of subsequent layers. So in the neural network you naturally have an input layer that receives input from the world; this could be, for example, the pixels of an image. You have an output layer that provides the final prediction that the model is making, and then you can have a bunch of hidden layers in between that process all that information.
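A minimal sketch of this input/hidden/output structure, assuming TensorFlow’s Keras API (my choice of framework here; the layer sizes are arbitrary):

```python
from tensorflow import keras

# A toy fully connected network: an input layer of 4 numbers, two hidden
# layers, and a 2-way output (e.g. dog vs cat probabilities).
model = keras.Sequential([
    keras.layers.Input(shape=(4,)),               # input layer
    keras.layers.Dense(16, activation="relu"),    # hidden layer 1
    keras.layers.Dense(16, activation="relu"),    # hidden layer 2
    keras.layers.Dense(2, activation="softmax"),  # output layer
])
model.summary()  # prints each layer and how they connect
```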

The word deep, in the name deep learning or deep neural networks, comes from the fact that, as a community, we have figured out that the more layers a neural network has, the more powerful it is, assuming that data is not a limitation. And if that is the case, the best neural networks are often very deep, layer-wise; hence the word deep.

21:32 - And so these days it’s common to have neural networks in the order of tens of layers or even hundreds of layers. So that’s the structure of a neural network in a nutshell, at a very high level. I think the big question we want to ask here is how a neural network learns, such that it is able to eventually elicit artificially intelligent behaviour. I’d like to informally call this process a bit of trial and error.

The formal process is called gradient descent. This blue box here you can consider as a neural network. It takes input from the world, which could be an image of a dog for example, and it processes that input to provide some prediction, our output here. Now, if you recall from before, whenever we create a data set we also provide labels associated with each data point in that data set. What we then do is compare the output of the model with the label that we have; the label indicates whether the output of the model is right or wrong.

So, for example, you could have a model predict a number between 0 and 1 representing a dog or a cat, with the label being zero for a dog and one for a cat. You compare the outputs with the labels; if it is a cat, the label will be one. If your model got it wrong and said 0.2, it means that it is 80% confident that it is a dog and 20% confident that it is a cat, so it is sort of 80% wrong. What we will do then is take the difference between the outputs and the labels, which is formally known as the loss, and we will use this loss as a guide to change the behaviour of the neural network, such that on its next iteration, given the same input, it would provide a slightly better output.

Our goal here is to change this behaviour of the neural network in small incremental steps, such that the output approaches the label. And once it is able to predict the label exactly, you will have accomplished your task.

23:53 - So this process of tweaking the behaviour of every single cell in the neural network, such that its output approaches the label, is called backpropagation, and the mathematical technique that we use is called gradient descent. Pretty much the majority of the advances in AI that you see in the news or in the literature revolve around deep learning, and pretty much all of deep learning falls back down to this one fundamental concept: backpropagation via gradient descent.

Now, I don’t have the time to go into all the details and all the maths about how this works, but if you’d like to look it up, I’ll be pointing to resources about this over the next few slides. You could look up the terms backpropagation and gradient descent, also known as stochastic gradient descent.
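For a feel of the trial-and-error loop described above, here is a deliberately tiny gradient descent sketch in Python with NumPy, fitting a single weight on made-up data (everything here, including the learning rate, is an illustrative assumption):

```python
import numpy as np

# Toy problem: learn w in y = w * x. The "data" follows y = 3x,
# so the labels are the right answers the model should approach.
x = np.array([1.0, 2.0, 3.0])
labels = np.array([3.0, 6.0, 9.0])

w = 0.0  # start with a blank guess
for step in range(100):
    outputs = w * x                              # the model's predictions
    loss = np.mean((outputs - labels) ** 2)      # how wrong we are (the loss)
    grad = np.mean(2 * (outputs - labels) * x)   # slope of the loss w.r.t. w
    w -= 0.01 * grad                             # small step downhill (learning rate 0.01)

print(w)  # approaches 3.0: the model has "learned" the relationship
```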

24:50 - So how do we get started with deep learning? What are the basic steps you want to take to get started? You can think of this slide as a bit of a checklist to map your progress as you get started on this journey. This slide is also pretty much a preview of my next ten or so slides; I will go through each of these items and give you pointers as to where you could go in order to get more information.

25:22 - So first of all, you need to understand a few concepts in general data science and then in deep learning. If, for example, you have already built traditional machine learning models to use in your research, you will probably already be familiar with the various concepts on the left here: things such as data cleaning and wrangling, for when your data is noisy or messy. We need to understand the differences between training, validation and test data sets.

This is important in order to reduce bias while training your model. You need to understand the difference between overfitting and underfitting, especially if you want a model that is independent and fair in its judgement. You need to be aware of various common evaluation metrics. Accuracy is probably the one everybody’s familiar with, but you need to be familiar with other metrics such as recall (sensitivity), precision, F1, whatever metrics are out there, or even metrics that you design yourself that are sensible for your research.
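As a small illustration of data splitting and these metrics, here is a sketch in Python using scikit-learn on made-up data (the random features, labels and predictions are stand-ins, purely for demonstration):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score

# Fake dataset: 100 samples with 5 features each, binary labels.
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)

# Hold out data the model never trains on, to get an unbiased estimate.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Pretend predictions from some trained model, just to show the metrics.
y_pred = np.random.randint(0, 2, size=len(y_test))
print(precision_score(y_test, y_pred),
      recall_score(y_test, y_pred),
      f1_score(y_test, y_pred))
```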

Be familiar with terms such as unsupervised versus supervised learning, which refer to the completeness of the labels you have in your training data, and understand the differences between regression, classification and clustering, among others. With deep learning, I’ve briefly covered the general structure of the neural network, the cells and layers and so on, and I’ll cover a little bit more over the next few slides, but in general, just understand how a neural network is structured and its key components.

Understand how a neural network learns: its loss function, how it performs backpropagation and stochastic gradient descent. And understand that deep learning and neural networks are oftentimes referred to as a black box, where you have lots of knobs and dials that you have to turn in order to get the model to perform right. These knobs and dials are referred to as hyperparameters, and there are quite a few standard hyperparameters that you should be familiar with when tuning your model, such as the batch size, the number of epochs and the learning rate.

When it comes to constructing neural networks, I like to think of it as the plumbing system in your house, where you start off with the input mains, the water flows through various parts of your house, and then it ends up at, for example, the kitchen sink tap as output. As a plumber, your job here is to make sure that data flows from input to output in a smooth and correct manner, and that means getting pipes of the correct shapes and sizes to go from your front door to your kitchen sink.

And I think this is the point that is often underemphasised when it comes to constructing and debugging neural networks that go wrong. You need to pay attention to the shapes of individual layers in your neural network, among other things of course, such as the input shapes, the order of the dimensions of the data that you need to pass into your models, the output shapes of the models and so on. Focusing on this aspect of the construction of neural networks really helps to give you a low-level, hands-on understanding of the flow of information through that pipeline, if you’re interested in exploring in that direction.
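To show what checking the plumbing can look like in practice, here is a sketch assuming Keras (the architecture is a toy example of mine; the point is inspecting the shape at every joint):

```python
from tensorflow import keras

# A toy convolutional model; the comments trace the "pipe" shapes layer by layer.
model = keras.Sequential([
    keras.layers.Input(shape=(64, 64, 3)),         # a 64x64 RGB image flows in
    keras.layers.Conv2D(8, 3, activation="relu"),  # -> (62, 62, 8)
    keras.layers.MaxPooling2D(),                   # -> (31, 31, 8)
    keras.layers.Flatten(),                        # -> (7688,)
    keras.layers.Dense(2, activation="softmax"),   # -> (2,) flows out
])

model.summary()  # prints every layer's output shape: your plumbing diagram
```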

29:05 - Here are some deep learning resources that I would recommend you check out in order to familiarise yourself with the various concepts that I mentioned on the previous slide. First of all, I would recommend you check out your local university for various workshops, events and seminars on the topic, such as this one here today. Oftentimes these events are held for free for staff at the university, so that’s a very good resource to take advantage of.

If you’re the kind of person who prefers self-learning in your own time online, here are some really great channels that introduce the fundamental concepts in a very clear and precise way. 3Blue1Brown and Brandon Rohrer provide very good explanations of the fundamentals. Two Minute Papers provides, as the name indicates, two-minute summaries of the latest research happening in the field. I do have to say that, because this is a YouTube channel, these topics tend to be biased towards ones that are a bit more newsworthy, so to speak.

If you like to read blogs and articles online, towardsdatascience.com is a great page you could go to to learn about pretty much anything in machine learning and deep learning. And if you have the endurance and stamina to sit through a ten-week or shorter course, do remember that MOOCs exist; feel free to take full advantage of them.

30:45 - playground.tensorflow.org is an interactive, fun website that you could use to visualise how data flows through a very small toy neural network. I highly recommend that you check it out after you understand what is happening behind the scenes. And last but not least, and I will spend the next few slides talking about this, is Galaxy; I’ll leave that for just a bit. Once you’re familiar with the fundamental concepts and would like to explore something more advanced, here are a few channels on YouTube which are great for the more technical discussions.

And last but not least, here is this website, whose name I will refrain from pronouncing because I’ve actually never heard someone pronounce it in person, and so I just don’t want to embarrass myself. It is an open source repository of a lot of research papers, and it is my go-to place to look up the latest papers in deep learning.

31:56 - So Galaxy, as indicated on the website, is a scientific workflow, data integration and publishing platform that aims to make computational biology accessible to research scientists who do not have programming experience. Now, they have a lot of content up there, but I just want to narrow in on the machine learning content that they have, as shown right here, and in particular the deep learning tutorials that they have up there.

These deep learning tutorials come in three parts, and they cover the three most basic forms of neural networks: feed-forward neural networks, which work best with tabular data; recurrent networks, which work best with time series data; and convolutional networks, which work best with spatially related data such as images. There is also an introduction to deep learning in general terms that introduces you to the high-level concepts. So I highly recommend that you check out these resources. In the introduction to deep learning tutorial they have this table here that summarises the various research directions that you might be working on and the candidate models that you could use in your research, and it covers the basics: CNNs (convolutional networks), recurrent networks and deep neural networks.

And then of course there is more advanced material, such as GANs (generative adversarial networks), variational autoencoders and graph convolutional networks, that you could explore once you’re more familiar with the field.

33:37 - In general, this is how I would summarise the deep learning workflow. If you recall, as a deep learning engineer you are a curator of data, so you start off with collecting and labelling your data. Labelling the data is perhaps the most effort-consuming part of the whole process, and it is often a deal breaker in many situations. You will also have to select a model. Now, I’m using the term select rather than design a model, because there has been a lot of work done up to this date where researchers have spent lots of time designing the best models to solve a problem.

So my recommendation is to not reinvent the wheel: use a model off the shelf that you have gotten from somewhere and use that in your research as a start, and once you have a better idea of what exactly you need, you can then start fine-tuning the model to suit your needs. Now, I have drawn this graph in a largely linear fashion, but in practice that is often not the case; the real world is messy and non-linear. So oftentimes you will train your model, you validate it, and you realise that it is not the model that is the problem, it is the data that is the problem, and so you need to go back right to the start to have a look at your data and perhaps collect more balanced data sets.
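A sketch of what “use a model off the shelf, then fine-tune” can look like, assuming Keras and an ImageNet-pretrained ResNet50 (the five-class head and all the specifics here are hypothetical choices of mine):

```python
from tensorflow import keras

# Start from an off-the-shelf backbone trained on ImageNet rather than
# designing your own; weights="imagenet" downloads the learned features.
backbone = keras.applications.ResNet50(
    weights="imagenet", include_top=False, pooling="avg",
    input_shape=(224, 224, 3))
backbone.trainable = False  # freeze it: reuse the features as-is at first

# Bolt on a small head for your own problem (say, 5 classes of cells).
model = keras.Sequential([
    backbone,
    keras.layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# Later, once this works, unfreeze some layers and fine-tune them.
```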

I like to say that deep learning is part art and part science. Like I said, there are a lot of knobs and dials that you need to turn and tweak in order to get the model to work right, and on the art side of things, I’d generally recommend prioritising in this fashion. Put the turning of knobs aside and start with data volume and quality: make sure that you have the right amount of data, and make sure that the quality of the data is good enough for training a model.

Once you are sure that is resolved, then look at your loss function. A loss function is basically a mathematical instruction, I would say, that tells the model how to learn from the data, and if your instructions are wrong, the model will obviously learn a different objective than the one you intended. So check your loss function. Once you’re sure your loss function is correct, you can start playing around with your model architecture and the size of the model.

And only at the very end do you look at tuning the dials to tweak the performance of your model.

36:08 - Oftentimes, people who are new to deep learning start in the reverse direction and start tweaking dials right away, and the reward for that usually isn’t great. Now, there are many, many ways you could implement deep learning solutions without having to program. Galaxy is one of those approaches, where you can train a neural network model with minimal programming skills. You could also use other commercial software such as Matlab to train a neural network, but if you wish to have something customisable to your needs, then you probably need to pick up some form of programming.

And if you’re working with deep learning, I would say around 95% of deep learning is done in Python.

37:05 - I won’t go too much into this, but just a couple of notes. First of all, be familiar with Python packages. If you need to write code that does a specific task that you think somebody else has probably needed done before, you can safely guess that there is a Python package out there that already does it for you. So don’t reinvent the wheel: use code written by someone else. Some commonly used packages for deep learning include NumPy, pandas, Matplotlib, OpenCV, SciPy and perhaps scikit-learn, although the last two are more applicable to traditional machine learning than to deep learning.
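As a small taste of how these packages typically slot together before any deep learning happens, here is a sketch with pandas and NumPy (the file and column names are entirely hypothetical):

```python
import numpy as np
import pandas as pd

# Typical wrangling: load a table, clean it, and hand the numbers
# to your model as plain NumPy arrays.
df = pd.read_csv("measurements.csv")  # hypothetical data file
df = df.dropna()                      # drop rows with missing values
features = df[["length", "width"]].to_numpy(dtype=np.float32)
labels = df["species"].astype("category").cat.codes.to_numpy()
```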

37:46 - And then comes the big question, which is perhaps one of the most frequently asked: should I use TensorFlow or PyTorch? For those of you who are not familiar, TensorFlow and PyTorch, among others, are software packages, or rather pre-written code that other people have developed, that allow you to build deep neural networks, train them and evaluate them with minimal effort. In other words, you don’t need to write your own code from scratch; all you need to do is use the pre-written code available in these packages.

TensorFlow and PyTorch are the two most popular. Which one to use? Well, I’m unfortunately not able to answer that question for you; the two of them are very similar. Here’s a graph showing their adoption over time to give you a bit of historical context. TensorFlow was the first to come out, and at the time TensorFlow 1, that’s version one, was the best in the field, and so everybody went with TensorFlow 1. And then PyTorch came out. PyTorch originated from Torch, which was written in a less-used programming language called Lua. They created a Python version of it because everybody in machine learning was using Python, hence PyTorch. And PyTorch was a great improvement over TensorFlow version 1, and so everybody started migrating towards PyTorch.

39:20 - At that time there happened to be a lot of newcomers to the field of deep learning, and all those newcomers also went on board with PyTorch. TensorFlow finally woke up and realised that they were losing the battle, and they came up with TensorFlow version 2, which is what we have now and which is very similar to PyTorch. So at this stage I would say TensorFlow version 2 and PyTorch are pretty much similar. There are small differences when you’re working with more advanced features, but to answer the question of should I use TensorFlow or PyTorch: it is more likely that you would choose whatever your colleagues are using rather than any specific feature that might exist in one package but not the other.

That is, unless you’re looking for much more advanced features; for example, I am aware that the facilities for distributed parallel training of deep models across multiple GPUs are used slightly differently in TensorFlow and PyTorch.

40:26 - Regardless of which package you use, both websites have excellent tutorials and boilerplate code that you could use to kickstart your programming journey in building these models. So I highly encourage you to check out these resources. Included in these resources are Google Colab notebooks that allow you to run this code with single clicks, which is really useful if you want to play around with the code without having to set up any software in your own environment.
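For a flavour of what that boilerplate looks like, here is a minimal end-to-end Keras sketch on made-up data; note how the hyperparameters mentioned earlier (learning rate, batch size, epochs) all appear as ordinary arguments (the values chosen here are arbitrary):

```python
import numpy as np
from tensorflow import keras

# Fake data standing in for a real labelled dataset.
X = np.random.rand(200, 4).astype("float32")
y = np.random.randint(0, 2, size=200)

model = keras.Sequential([
    keras.layers.Input(shape=(4,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(2, activation="softmax"),
])

# The knobs and dials show up here: learning rate, batch size and
# number of epochs are all hyperparameters you can tune.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, batch_size=32, epochs=5, validation_split=0.2)
```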

41:03 - All right, so with all the theory in place and all the tools available for you to start building models, the next step you could take is to start training your own models, as a first step to familiarising yourself with the process. The first thing you’re going to look at is your data set, and I’m going to emphasise that good quality data is essential for training a model. If you provide garbage in, you will get garbage out.

So you do want to think about what types of data you will need, how much data you would need or can collect, and the quality of the data; make sure that it is balanced. For example, if you’re training a model to differentiate between dogs and cats and you have 99 images of dogs and one image of a cat, that isn’t a very good data set. Think about how you will collect the data, how you will store the data, and how you will label the data.

Labelling the data, like I said, is probably the biggest challenge you will face, so it’s therefore something that you should pay careful attention to.
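A trivial but genuinely useful sanity check on balance before training, sketched in plain Python (the label list is a stand-in for your real labels):

```python
from collections import Counter

# Count how many examples you have of each class: an extreme imbalance
# like 99 dogs to 1 cat shows up immediately here.
labels = ["dog"] * 99 + ["cat"]  # stand-in for your real label list
print(Counter(labels))           # Counter({'dog': 99, 'cat': 1})
```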

42:08 - Next, you want to pick the right neural network architecture. In this diagram here I’ve somewhat simplified the structure of the neural network: you need to be concerned about your input shape and output shape, because they determine the inputs and outputs of your network. In the middle we have the network itself, which consists of a backbone and a head, and I will elaborate on this later. And then, finally, you will train your neural network on a loss function, basically a mathematical function that tells the neural network what to do.

So choose the right neural network for the problem at hand. When it comes to the backbone, the backbone usually depends on the type of data that you have. If it’s images, you want to go with a convolutional neural network. If it’s time series data, such as language, waveforms, signals and so on that come in a linear, time-based fashion, you want to look at recurrent networks, LSTMs (long short-term memory networks) and transformer models.

And if you’re dealing with very simple tabular, categorical or numerical data, then very standard fully connected layers, or what is just referred to as deep neural networks, would suffice.

43:28 - The head and the loss function depend on the objective that you want the model to perform. So if you want a model to perform classification, then you need a head that is suitable for classification, along with a cross-entropy loss function. If you want the model to perform image segmentation, you want to look at a U-Net architecture, and if you want a model to perform object detection, which is effectively drawing bounding boxes on an image, you need to look at something else again.
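To make the head/loss pairing concrete, here is a small Keras sketch contrasting a classification head with a regression head (the class count and the specific loss choices are illustrative assumptions):

```python
from tensorflow import keras

# Classification: one output per class, softmax activation, and a
# cross-entropy loss that rewards putting probability on the right class.
classification_head = keras.layers.Dense(3, activation="softmax")
classification_loss = keras.losses.SparseCategoricalCrossentropy()

# Regression, by contrast: a single linear output with a squared-error
# loss. Same backbone could feed either; only the head and the
# "instructions" (the loss) change with the objective.
regression_head = keras.layers.Dense(1)
regression_loss = keras.losses.MeanSquaredError()
```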

Once you’re familiar with these basics, you can look at other popular branches of deep learning, including GANs (generative adversarial networks), what we normally refer to as creative AI. You can look at reinforcement learning, where you have AI agents that interact with a simulated environment, or you can look at unsupervised, semi-supervised or self-supervised learning, all various forms of model training that work with limited labels.

44:31 - And last but not least, and I believe this will be my last slide here: pick the right computing hardware. Deep learning is a very compute-intensive workflow. If you’re just experimenting with toy models, you could just use your laptop or desktop PC. It will obviously take literally an order of magnitude longer to train your models, but if you’re really just working with toy models or with tutorials, that is usually sufficient. But to train serious models, you probably need to look at the use of GPUs (graphics processing units) or TPUs (tensor processing units).

So the first option is to get a GPU for your desktop PC. This is very convenient: you have your own customisable compute resource that you can use whenever you wish, but obviously there is limited performance. I think the most GPUs you could have in a desktop PC is probably around four. The second option is to use free cloud compute resources such as Google Colaboratory. This is free to use, but there are of course limitations, such as the fact that you can only use one GPU and you can only run a script for a limited number of hours at a time.

The third option is to buy your own mini supercomputer, if you have hundreds of thousands of dollars to burn; for example, the NVIDIA DGX A100, which is I believe the third generation of the DGX, has five petaflops of compute at a cost of about US$200,000. You can also rent compute time on cloud services such as Google Cloud Platform, AWS and Azure, but the costs do add up significantly. The last time I checked, running a GPU full-time for about one to three months basically incurs the cost of the entire GPU; of course, you’re paying for other services as well, such as the accessibility of the resource, storage, CPU compute and other things, but costs do add up.
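Whichever option you choose, a one-liner worth knowing is how to check what hardware your framework can actually see; sketched here with TensorFlow (PyTorch has an equivalent, torch.cuda.is_available()):

```python
import tensorflow as tf

# List the GPUs TensorFlow can see; an empty list means you're on CPU only.
print(tf.config.list_physical_devices("GPU"))
```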

And last but not least, you might have HPC resources available at your institution to access for free or through membership-based payment, so do check that out. In the interest of time I’m just going to cut it short here and open up for questions. I’ll hand it back to you, Melissa. So yes, we do have time for questions. If you have a question for Titus, please write it into the Q&A box, and I can see there are already some questions coming through there. So the question that we are going to start with is: we’ve talked a little bit about the difference between deep learning, machine learning and AI, and the question is, are there applications where traditional machine learning is a better alternative than deep learning? So we often like to jokingly call deep learning, or neural networks, a black box, and it’s largely true.

It is a black box because usually these models are large, they are able to build very complex models around the data that they learn from, and so it is not easy to decipher how these models make decisions. In situations where you need your models to be interpretable, where you need to know that the model made this decision because of these various factors, deep learning is problematic in that sense. And there are traditional machine learning approaches that do better in that sense, such as decision trees and linear regression; the trade-off here is effectively model complexity versus explainability, and I think that is a trade-off where, at this stage, we can’t get the best of both worlds.

48:32 - Thank you for that answer. So the next question we have on the list here is again related to when you should choose different techniques, and the question is: do you have any general advice on what types of problems, especially in terms of the quality and amount of available training data, are a good case for applying deep learning? Can you repeat the question again? This person is asking whether or not they should use deep learning when their training data is potentially sparse relative to the number of parameters to be determined, or if it is a bit noisy as well.

49:14 - Yeah, all right. So the fundamental ideas behind deep learning were invented perhaps three or four decades ago; it is not a very new concept. Deep learning has only taken off over the last decade or so because of the improvements in big data and in compute power. Deep learning is really powerful when you have a lot of data. If you’re working with very sparse or very noisy data, noise is something that we could potentially overcome, but if you have really small data sets that do not give a complete representation of the various possibilities that could occur in the world, then you’re going to have challenges with deep learning.

Exactly how much data you will need for each particular problem, I would put under the art part rather than the science part of deep learning, so it really comes down to just giving it a try: see how it works, see what the biases are, and see if those biases can be overcome.

50:22 - Okay, so that is unfortunately all that we have time for today. We have a lot more questions coming through, but we are going to have to wrap it up. So thank you very much, Titus, for sharing your expertise with us today, and thank you to everybody who’s come along as well. Before you leave, I have a couple more things to tell you about which may be of use to you, so if you just bear with me a moment, I will share my screen again. Okay, so first things first: as mentioned, this is one of many webinars that we regularly run at Australian BioCommons, and we have quite a few coming up in August that might be of interest to you.

The next one will be in the first week of August, and the details will be up on the website shortly; it will focus on Nextflow Tower and will be a demonstration of how you can use that software. We also have a webinar on getting started with R, so if you’re very new to bioinformatics and you want to kickstart your bioinformatics journey, this is the one to come along to. And then later in that same week we have two webinars where we’ll be exploring the types of compute that are available to Australian researchers and the different options that you have, and we’ll also be looking at what you need to do in order to prepare and submit an application to the NCMAS scheme for access to the high performance computing facilities in Australia.

So details of all of those webinars can be found on the BioCommons website underneath the events tab. Thank you again, Titus, and thank you again to all of you for joining us today. Finally, I would like to acknowledge our funding: the Australian BioCommons is enabled by NCRIS funding via Bioplatforms Australia. We hope that you’ve enjoyed this webinar and we hope to see you again soon, but in the meantime, enjoy the rest of your day, and bye for now. Thanks, bye.