Ram Dayal Goyal - Artificial Intelligence and Machine Learning Scientist

A Technology Leader and Professional in Artificial Intelligence (AI), Machine Learning (ML) and Data Science

I have been passionate about artificial intelligence since my graduation days (1993-97). In those days, terms such as artificial intelligence, machine learning, pattern recognition and neural networks were mostly confined to academic research. I was focused and determined to specialize fully in all this wonderful stuff, irrespective of the job market. I would say I was most fortunate :-) to be thoroughly educated in all the facets of artificial intelligence, and I have been using them continuously throughout my career of 20+ years. Here I present my views on a few very important points/questions on this topic:

Confusion of Terms AI/ML/Data-Science/Data-Mining
Various Facets of Artificial-Intelligence/Machine-Learning
Is R/Python mandatory for AI/ML
What is Needed to Start Learning AI/ML
Big Data Analytics
Deep Learning
Setting and Meeting Appropriate Business Expectations

Confusion of Terms AI/Machine Learning/Data Science

Many people ask me about the difference, and many people define these terms in their own words. None of them is right or wrong; ultimately they convey the same things in different ways. In my thinking:

Computer programming was created to do repetitive computational tasks automatically. Statisticians made a lot of progress in "analysing" business data for better business decisions. Still, it was felt that tasks which are easy for human beings, such as cognition and game playing, are extremely difficult (or nearly impossible) for machines (computers).

Altogether, we wanted our machines to have such capabilities (intelligence) so that they could assist us to the maximum possible extent. They should be able to compute beyond what has been provided as baseline data. Thus, "inductive learning" algorithms came into existence. Subsequently our great engineers, scientists, mathematicians and doctors came together and created a mathematical model of neurons and their connected architecture, with phenomenal cognitive capabilities. Later on, fuzzy computation etc. also arrived, and modern machines became more and more intelligent co-workers.

In this way the term "Artificial Intelligence" was coined. "Machine Learning" is the study of how to make a machine capable of learning as humans do. Thus, in my view, "Machine Learning" and "Artificial Intelligence" are in fact interchangeable terms. In the modern era, with more focus on the various angles of business and profitability, the term "Data Science" was coined. Prior to that, "Data Mining" was used in the same context. The purpose was to make intelligent decisions from data, just as a human being would. Thus, in a true sense, "Data Science" is also interchangeable with "Machine Learning". But a few people consider it a sub-field of Machine Learning because of its strong focus on business data only.

So, rather than getting into such confusion, one must focus on the actual problem solving, which may fall into various facets as explained below.

Various Facets of AI/ML

Consider Artificial Intelligence or Machine Learning as a broader term, or a set, containing many subsets, each dedicated to one or more subfields such as:

Computer Vision
- Machine's capability of "interpreting" the visual content - images and videos
Neural Networks
- Interconnected architecture of artificial neurons performing specialized tasks of classification
Pattern Recognition
- Identification of "meaningful characteristics" in data
Natural Language Processing
- "Understanding/Interpretation" of language we speak/write
Fuzzy Logic
- A big leap from Crisp Set theory to Fuzzy-Set - accommodation of real life terms like "low, high, medium, less, more etc."
Combinatorics
- Analysis of various combinations of a particular problem space - Puzzles/Gaming etc.
Expert System
- Intelligent rule based system designed to work for specific hierarchical problem space
Search Engine and Information Retrieval
- Searching for information in a big pool of data with human like intelligence
Data Mining
- Overall analysis and decision making w.r.t. a given pool of data
etc.
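
As one small illustration of the facets listed above, the step from crisp sets to fuzzy sets can be sketched in a few lines of Python. This is only a minimal sketch under my own assumptions: the triangular membership shape, the temperature example, the function names and all the numeric thresholds are hypothetical choices for illustration, not a standard definition.

```python
def triangular(x, left, peak, right):
    """Degree (0.0 to 1.0) to which x belongs to a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def crisp_high(temp_c):
    """Crisp set: a temperature is either 'high' or not (assumed cut-off: 30 C)."""
    return temp_c >= 30.0


def fuzzy_temperature(temp_c):
    """Fuzzy sets: the same temperature can be partly 'medium' and partly 'high'.

    The breakpoints below are illustrative assumptions only.
    """
    return {
        "low": triangular(temp_c, -10.0, 5.0, 20.0),
        "medium": triangular(temp_c, 10.0, 20.0, 30.0),
        "high": triangular(temp_c, 20.0, 35.0, 50.0),
    }


if __name__ == "__main__":
    print(crisp_high(25.0))         # the crisp answer is a plain yes/no
    print(fuzzy_temperature(25.0))  # the fuzzy answer is a degree per term
```

With a crisp set, 29.9 degrees is simply "not high"; with fuzzy sets, the same reading belongs to "medium" and "high" to different degrees, which is closer to how we use these words in real life.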

Is R/Python Mandatory for AI/ML

During one of my talks, an engineer asked me, "Should I read a book on artificial intelligence in Python?" I replied that if he wanted to study Python, then any book on Python was enough, and if he wanted to study AI/ML, then any high-level programming language was good enough. This is because AI/ML are just concepts and can be coded in any high-level language such as Java or C++. The one advantage is that rich libraries of many frequently used components are available in R, Python etc., though not in these languages alone. For the various facets of AI, the community has developed many helpful libraries. I recommend that an engineer first gain a deeper understanding of the concepts and then make a judicious decision about selecting one or more of them.

What is Needed to Start Learning AI/ML

Very good algorithmic/data-structure knowledge with complexity analysis
Good understanding of problems/domain at hand
Moderate to good knowledge of algebra, calculus, probability, statistics. This can be gathered in parallel while learning other core concepts.
Visualization knowledge is a plus but not mandatory. It is a complementary skill.
Start with simple data classification problems
I recommend implementing at least one algorithm yourself; later on, play with the libraries of your choice.
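
To make the last two points concrete, here is one possible first exercise: a k-nearest-neighbours classifier written from scratch on a tiny classification problem. It is a sketch only. The choice of k-nearest neighbours, the toy data points, the labels and the function names are all my assumptions for illustration; any other simple algorithm would serve equally well.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Label the query by majority vote among its k nearest training points.

    Sorting all n training points costs O(n log n) per query, which is fine
    for a learning exercise but worth analysing before using on larger data.
    """
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["small", "small", "small", "large", "large", "large"]

if __name__ == "__main__":
    print(knn_predict(points, labels, (1.2, 1.4)))
    print(knn_predict(points, labels, (6.4, 6.2)))
```

Nothing here depends on Python in particular; the same few functions could be written in Java or C++, which is the point made in the previous section.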

Big Data Analytics

This has become a big need and a very hot topic in the last few years. Consider data so large in volume, variety or velocity that one machine is not able to store or process it. We then have to use special methods to store, update, retrieve and process data scattered across different machines. Hadoop, with its distributed file system, is one such system: it handles all this overhead of big data, so that we feel as if the data were lying on one machine only. Moreover, we need to use the computing power of multiple connected machines. All that has led to the evolution of "Distributed Computing".

Ideally, Machine Learning (Data Science, Artificial Intelligence) is independent of the size of the data, but when dealing with big data, AI/ML algorithms are designed in a special way to handle the complexity of the computations. Consider "Spark": this distributed computing framework provides capabilities to handle large volumes of data, and its machine learning libraries help AI/ML experts solve problems involving big data. Beware that knowledge of Hadoop/Spark and knowledge of AI/ML are two entirely different things.
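
What "designed in a special way" can mean is easiest to see on a very small statistic. The sketch below computes a mean and variance when the data is split into partitions, as it would be across machines: each partition is reduced to a small summary, and only the summaries are combined. It is plain Python that merely simulates partitions with lists; it does not use any Hadoop or Spark API, and the function names and sample numbers are my own assumptions.

```python
def summarize_partition(values):
    """Per-partition step: reduce the raw values to (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))


def combine(a, b):
    """Merge two partition summaries; no raw data needs to move between machines."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(partitions):
    """Population mean and variance computed from partition summaries only.

    Note: the sum-of-squares formula is simple but can lose precision on
    very large or badly scaled data; it is used here only for clarity.
    """
    total = (0, 0.0, 0.0)
    for part in partitions:
        total = combine(total, summarize_partition(part))
    count, total_sum, total_sq = total
    mean = total_sum / count
    variance = total_sq / count - mean * mean
    return mean, variance


if __name__ == "__main__":
    # Hypothetical data, imagined as living on three different machines.
    partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
    print(mean_and_variance(partitions))
```

The statistics themselves are unchanged; only the way the computation is organised differs, which is why knowing the platform and knowing AI/ML remain separate skills.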

Analytics means getting insights from a pool of data which help decision makers achieve better profitability. It starts with simple statistics from the data and advances towards more complex information/data mining. But again, it is the same as Data Science dealing with a large amount of data.

Deep Learning

Again, a new fancy term has been coined by the community for "Neural Networks". Earlier, due to lack of computing power, most neural-network-based solutions revolved around a maximum of 3 layers of connected neurons. That solved many major problems too. But now we have tremendous computing power and memory, and hence we can easily experiment with "deeper" architectures. One new thing that is evolving in this field (deep learning) is "learning features automatically" rather than computing them a priori. Certainly this research may yield many new insights into "machine learning", but the baseline remains the same.
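
The point that the baseline remains the same can be seen in a small sketch: the same forward-pass routine serves a shallow network and a deeper one, and only the list of layers grows. This is an illustration under my own assumptions. The layer sizes, the tanh activation, the fixed weight pattern and the function names are hypothetical, and no training (and therefore no automatic feature learning) is shown.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then tanh activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every layer in order; depth is just len(layers)."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


def make_layer(n_in, n_out, scale=0.1):
    """Build a layer with a fixed, arbitrary weight pattern (illustrative only)."""
    weights = [[scale * ((i + j) % 3 - 1) for j in range(n_in)] for i in range(n_out)]
    return weights, [0.0] * n_out


# A small network of the older style, and a deeper one built from the same parts.
shallow = [make_layer(4, 3), make_layer(3, 2)]
deep = [make_layer(4, 8), make_layer(8, 8), make_layer(8, 8),
        make_layer(8, 8), make_layer(8, 2)]

if __name__ == "__main__":
    sample = [1.0, 0.5, -0.5, 2.0]
    print(forward(sample, shallow))
    print(forward(sample, deep))
```

What changed over the years is mainly how many layers we can afford to compute and train, not the building block itself.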

Setting Appropriate Business Expectations

I am surprised when I find organizations that claim to use AI/ML but do not get the benefits or the expected return on investment (ROI). Sometimes they claim to use AI/ML but the reality is different. Quite often we hear "We will use AI/ML and will solve this". I have seen people use the term "understand" freely for "user behaviour", "natural language documents" etc., yet fail to provide the corresponding ground reality and the finer details needed for production-ready programming. AI/ML systems are just normal computer programs: not magic, and not even tools which one can buy and deploy directly in any system. At the other end, a few people think that just making a machine learning library API call will solve the problem fully. So a crisp definition and scope of the AI/ML tasks is needed for any kind of problem to get real benefit.

Understanding the problem domain fully, analysing all the possible human ways of solving it, first finding different options to assist those, and then diving deeper to reach more and more automation is what will make an AI/ML practitioner truly beneficial to any organization. All business strategies are time-bound and milestone-oriented, and hence the solution should also be broken into small and clear goals rather than trying to build a "human brain" and ultimately failing.