Maier Unleashed - Artificial Intelligence - A Bitter-Sweet Symphony of Modelling
Apr 1, 2020 08:54 · 1582 words · 8 minute read
Hi everybody! So I have this small video that I want to create today, where I want to introduce you a bit to the recent past of AI and why I think that general models are useful, but model-driven approaches may also be very interesting for future research. In the past, we have seen tremendous progress in artificial intelligence, and not just very recently; we have seen it over decades. Of course, we are building on all these great abstractions that people have invented over the millennia, such as matrix multiplication. However, this progress was not achieved steadily: there were many significant ups and downs along the way, and in some of these phases people were even afraid to openly commit to the term "artificial intelligence", as the reputation of the field was severely damaged. Anybody working on AI was considered a dreamer in some of these periods. "And that's how I thought maybe I can multiply my tiny little bit of creativity into infinity."
01:31 - So this also led to different names and subdivisions of the field: sometimes you heard people talking about machine learning, some say they do data mining, and others do pattern recognition when they are very close to the signal. There was a recent blog post by Richard Sutton, "The Bitter Lesson", in which he identified the lack of computational power as one of the main reasons for those ups and downs. I really liked his analysis, so this is why I wanted to show you this small graph here. With increasing compute power, these general models were really able to achieve much better results than the handcrafted expert models. However, at some point these general models became computationally too expensive, and until Moore's law struck again, further progress using general approaches became impossible. "A good theory of problem-solving under limited resources, like here in this universe or on this little planet, has to take into account these limited resources."
02:45 - You can see this here in this figure: compute power is increasing, and at the point where the generalized models simply need too much compute power, we essentially experience a backlash. Then we have to go back to something else, where we again include models in our approaches, just to achieve the goals that we want to achieve in a computationally more efficient way. Based on this analysis, the conclusion is drawn that the incorporation of expert knowledge is a waste of resources, as one only needs to wait until enough compute power emerges, and then we can solve the next milestone of artificial intelligence. If you can build a machine that learns to solve more and more complex problems, a more and more general problem solver, then you basically have solved all the problems, at least all the solvable problems. This conclusion is quite radical and deserves a little more analysis; hence, here is a short historical perspective, which is in fact a short summary of three articles that have been published recently on KDnuggets.

The first AI hype started in the 1950s and led to important developments in the field. Minsky developed the first neural network computer, called the Stochastic Neural Analog Reinforcement Calculator (SNARC), which was inspired by biological design and mapped neurons onto an electrical machine. A further important development was the perceptron, which made biologically inspired neurons trainable in a computer program. Another important development of that era was the invention of rule-based grammar models that were employed to solve the first simple natural language processing tasks. The concept of the Turing test also falls into this period of AI. With these great developments, high expectations of AI were growing faster and faster. Yet these concepts could not live up to the expectations and failed in many daily-life applications such as machine translation. These doubts were further supported by theoretical observations, for example that
the perceptron is not able to learn the logical exclusive-or (XOR) function: XOR is not linearly separable and therefore cannot be modeled by a perceptron. As a result, funding in AI was tremendously cut, which also became known as the AI winter.

In the 80s, the fascination with AI returned. "In the 80s, I thought about how to build this machine that learns to solve all these problems." This second boom was fostered by the development of further important techniques, such as the backpropagation algorithm that made multi-layer perceptrons trainable, and the theoretical observation that a neural network with a single hidden layer is already a universal function approximator. Recurrent nets were developed, and reinforcement learning also gained traction. Another breakthrough of that time was the development of statistical language models that gradually started to replace rule-based systems. Even deep convolutional networks were already explored at the time. Yet the ever-growing computational demand, aggravated by numerical instabilities, resulted in long training times. At the same time, other model-based techniques such as support vector machines and ensembling emerged that gradually reduced the importance of neural networks, as the same or better results could be obtained in less time. In particular, convex optimization, which guarantees a global minimum, and variational methods became important concepts, effectively putting an end to the second AI high. At that time, neural networks were known to be inefficient and numerically unstable, and many researchers would no longer invest their time in this direction, as other methods were more efficient and could process data much faster. "The stuff that works best is really simple."

The third AI period is essentially the one that we are currently experiencing, and it is again driven by many important breakthroughs. "Because of self-improvement, that is really the pinnacle of that, where you then not only learn how to improve on that problem, but you also improve the way the machine improves, and you also improve the way it improves the way it improves itself. And that was my 1987 diploma thesis, which was all about that." We have seen computers beat world-class Go players; they create art, and we have to rewrite lots of science fiction, because we always considered computers unable to generate aesthetic pieces of art, and suddenly they can do it; and we tackle tasks that previously appeared close to impossible, such as image captioning, where we convert entire images into text.

A major recurring theme in this present period is that handcrafting and feature engineering are no longer required, and deep learning algorithms solve everything out of the box. In fact, important concepts have been discovered that were able to generalize over many state-of-the-art approaches. Deep convolutional networks have replaced multi-scale approaches like SIFT and wavelet theory. Also, trainable convolutional neural networks have been demonstrated to outperform the so-called mel frequency cepstral coefficients, the type of feature computation that had been done in speech recognition for decades; they were there for almost 50 years. Now it seems that much of the theory that has been developed in the past has become obsolete, and there is very little theory behind the best solutions that we have at the moment. So we don't need these things anymore, we just do everything with neural networks, right? But you have to keep in mind that the network design is often inspired by the classic feature extraction models, and this is what people tend to forget. The learned image processing features show distinct similarities to wavelet transforms, and audio processing networks still form implicit filter banks. As such, the knowledge is still there, but it is encoded in a different form. Still, one has to acknowledge that the trainable versions of previous algorithms clearly outperform their predecessors. Our current analysis
of course lacks a third component that is an important driver of our current success in AI, and that is the availability of digital data for the training of these machine learning algorithms. For the far future, we can already speculate that such knowledge-based AI will be replaced in years to come by truly general AI that is able to create, maintain, and reuse such models on its own, as Sutton and others suggest. "It has to have the learning algorithm in a representation that allows it to inspect it and modify it, so that it can come up with a better learning algorithm. I call that meta-learning." In my opinion, however, approaches that combine domain knowledge with deep learning are not in vain, because the next level of generalization will be achieved by lessons learned from the model-driven era. While developing model-driven AI, we will understand how to build good trainable AI solutions and finally create automated methods to achieve the same. Of course, it is difficult to predict, especially about the future. Either way, be it general model-free approaches or model-driven science, we are looking at exciting years to come for machine learning, pattern recognition, and data mining.

So I hope you liked this little video. If you liked it, you can leave a like, you can subscribe, or you can also leave some comments and ask further questions that I will happily take. Thank you for your attention, and if you really liked it, I might be making more of these videos.
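Appendix: the XOR argument mentioned above (the theoretical blow that helped end the first AI hype) is easy to verify yourself. The following is a minimal NumPy sketch of my own, not code from the video: a single linear threshold unit trained with the classic perceptron rule never fits XOR, while a hand-set network with one hidden layer represents it exactly.

```python
# Sketch: XOR is not linearly separable, so a perceptron cannot learn it,
# but a single hidden layer suffices. Illustration only; weights for the
# two-layer net are set by hand rather than trained.
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])  # XOR labels

# 1) Classic perceptron learning rule: it cycles forever on XOR
#    and can never classify all four points correctly.
w, b = np.zeros(2), 0.0
for _ in range(1000):
    for xi, yi in zip(X, y):
        pred = 1.0 if xi @ w + b > 0 else 0.0
        w += (yi - pred) * xi
        b += (yi - pred)
acc_linear = np.mean([(1.0 if xi @ w + b > 0 else 0.0) == yi
                      for xi, yi in zip(X, y)])

# 2) One hidden layer: h1 counts active inputs, h2 fires only for AND;
#    output = h1 - 2*h2 reproduces XOR exactly.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([0.0, -1.0])
h = np.maximum(0.0, X @ W1 + b1)     # ReLU hidden layer
out = h @ np.array([1.0, -2.0])
acc_mlp = np.mean((out > 0.5) == (y == 1))

print(acc_linear)  # stays below 1.0, no matter how long we train
print(acc_mlp)     # 1.0: the hidden layer makes XOR separable
```

This is exactly the gap between the first and second AI periods in miniature: the 1950s perceptron is stuck, and the 1980s multi-layer network with backpropagation (or, here, hand-set weights) closes the gap.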