Tag: Statistics

Top ML Reddit Discussions, NLP Roadmap & Much More!
Top 5 Machine Learning GitHub Repositories & Reddit Discussions. Why do we include Reddit discussions in this series? I have personally found Reddit an incredibly rewarding platform for a number of reasons: rich content, top machine learning/deep learning experts taking the time to propound their thoughts, a stunning variety of topics, open-source resources, etc. […]

Math Heavy Topics in Data Science!
1. Support-vector machine: Given a set of training examples, each marked as belonging to one of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier (although methods such as Platt scaling exist to use SVM in a probabilistic […]

Analyzing Diabetes Patterns amongst Indians, A Beginner’s Guide to Pearson’s Correlation Coefficient, Deep Learning in Cyber Security & Much More!
1. Juicing out the Diabetes Patterns amongst Indians using Machine Learning: The data indicates that developing countries will see a 266% increase in their diabetic population. The score of the training model was a magnificent 100%, which means it classified all the elements correctly, as is evident as a result […]

Interview with a Kaggle Master & More
1. Exclusive Interview with 2x Kaggle Master Gilles Vandewiele! “I think one of the nice things about the data science field is that it is so multidisciplinary and that anyone who aspires to become a data scientist can do so.” – Gilles Vandewiele Golden words! As a beginner in data science, this quote gives me […]

Top 5 Data Science & Analytics Online Courses
There is an increasing demand for data science experts in different industries today. In this data-driven economy, it is only natural for data to be a valuable asset in the efficient working of an organization. However, finding a job in this sector requires you to have a set of certain skills and some solid educational […]

Resources to learn Linear Regression
Linear regression shows the linear relationship between the independent (predictor) variable and the dependent variable. It is one of the simplest statistical regression methods used for predictive analysis in machine learning. How is a math equation used in building a linear regression model? Did you know that one equation helps in building a linear regression model in the machine learning world? Yes, you heard it right. Since our school days, we have come across the equation of the straight line.
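To make the straight-line idea concrete, here is a minimal sketch of fitting a line to data by ordinary least squares. The symbols m (slope) and c (intercept), the helper name fit_line, and the data points are assumptions chosen for illustration; they do not come from the original post.

```python
# Minimal sketch: fit the straight line y = m*x + c by ordinary least squares.
# The names m, c, fit_line and the sample data are illustrative assumptions.

def fit_line(xs, ys):
    """Return the slope m and intercept c of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = (sum of co-deviations of x and y) / (sum of squared deviations of x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y)
    c = mean_y - m * mean_x
    return m, c

# Made-up points that lie exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, c = fit_line(xs, ys)
print(m, c)  # prints: 2.0 1.0

# Predict the dependent variable for a new predictor value
x_new = 5.0
print(m * x_new + c)  # prints: 11.0
```

With real, noisy data the points will not sit exactly on a line, and the same formulas return the line that minimizes the sum of squared vertical errors.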

10 DATA SCIENCE SPECIALIZATIONS
It is only natural that the field of Data Science has a growing number of specializations. Knowing them will help you understand which area you eventually want to work in.
1. DATA MINING: Data mining, also known as knowledge discovery in data (KDD), is the process of finding and extracting anomalies and discovering patterns in large data sets to predict outcomes. In simple terms, the main aim of data mining is to extract information with intelligent methods and transform it into a comprehensible structure.
2. DATA VISUALIZATION: Data visualization is the domain that deals with the graphic representation of data through visual elements like charts, graphs, maps, and other visualization tools. It provides a convenient way of understanding trends, outliers, and patterns in data.
3. DATA PROCESSING: Data processing is when you collect data and transform it into useful information. We can also call it the manipulation of data by computers, including output formatting or transformation.
4. DATA CONSULTANCY: Data consultants provide a wide range of methods that optimize business intelligence by leveraging existing data. The work also involves educating companies or clients about various aspects of data technology.
5. MARKET DATA ANALYTICS: It looks into the depths of consumer segments, buying patterns, competition, and the economic environment. Through it, we can identify the strengths, weaknesses, opportunities, and potential threats of a company.
6. CYBERSECURITY DATA ANALYSIS: Cybersecurity uses data science to protect software and devices from cyberattacks. Experts in this area produce intelligence to improve the security and privacy of an organization's data against external and internal threats.
7. DATA ARCHITECTURE: Data architecture refers to how an organization collects, stores, transforms, distributes, and uses data. These days it is important for organizations to have a centralized data architecture in accordance with industry standards.
8. DATA ENGINEERING: Data engineering is the practice of designing and building data systems for collecting, storing, and analyzing data at scale. A data engineer's primary job is to design, manage, and optimize the flow of data through databases across the organization.
9. BUSINESS INTELLIGENCE AND DATA ANALYTICS: It describes the strategies, technologies, and tools that companies use to obtain important business information. Data analytics is a subset that provides data management solutions to understand current data and gain relevant insights.
10. COGNITIVE MACHINE LEARNING: Cognitive computing systems work with humans and advise them in making informed decisions. The aim is to use the best algorithm to arrive at an accurate action or result.

Decision Tree From Scratch!! Part I
Introduction In this blog post, I am going to talk about a powerful supervised learning algorithm that is often used in Machine Learning competitions. It is called the Decision Tree algorithm. It can be used for both classification & regression tasks. In this post, I will discuss the need for tree-based algorithms, the basics of […]

Logistic Regression in Machine Learning (from Scratch !!)
Introduction In this blog post, I would like to continue my series on “building from scratch.” I will discuss a linear classifier called Logistic Regression. This blog post covers the following topics: Basics of a classifier, Decision Boundaries, Maximum Likelihood Principle, Logistic Regression Equation, Logistic Regression Cost Function, and Gradient Descent Algorithm. After the discussion of […]

Probability Distributions that every Data Scientist must know
Introduction The probability of an event tells us how likely it is that the event will occur. The applications of probability begin with the numbers p0, p1, p2… that give the probability of each possible outcome. There are dozens of famous and useful possibilities for p. I will discuss four of them in this post. Before going […]