1. Juicing out the Diabetes Patterns amongst Indians using Machine Learning
- The data indicate that developing countries will witness a 266% increase in their diabetic population.
- The model scored 100% on the training set, which means it classified every training element correctly, as the confusion matrix confirms.
- Both the training and testing datasets were balanced.
- In the confusion matrix analysis for the testing dataset, far fewer elements were misclassified than with the Decision Tree model.
- Machine Learning models, if combined properly with knowledge of anatomy and physiology, clinical parameters, laboratory parameters, and medicines, can prove to be a game-changer in the ongoing fight against diabetes.
Categories: Diabetes, Machine Learning
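To make the confusion-matrix reading above concrete, here is a minimal sketch of how such a matrix and the resulting score can be computed. The labels and predictions are hypothetical and are not taken from the article's dataset; the function names are illustrative only.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix with rows = actual class and columns = predicted class."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(actual, predicted)] for predicted in labels] for actual in labels]

def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total

# Hypothetical labels (assumption): 1 = diabetic, 0 = non-diabetic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 1]

cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
print(cm)            # off-diagonal cells are the misclassified elements
print(accuracy(cm))  # a score of 1.0 would mean an empty off-diagonal
```

A 100% score on training data alone says little about generalization, which is why the testing-set confusion matrix is the more informative of the two.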
2. Exploratory Analysis Using SPSS, Power BI, R Studio, Excel & Orange
Case: Please carry out an Exploratory Data Analysis and create a compelling story based on the given dataset; also predict which Article will be more popular in the near future.
- The visualizations (publishing by day and popularity by day) show that Mashable usually publishes fewer articles on weekends, possibly because people read less on weekends. Saturday and Sunday may be their only days off, and they may prefer to relax or travel instead of reading articles.
- Popularity was then compared across themes, based on data from the past two years. The visualizations show that Business articles were less popular on the Mashable website.
- The visualizations show that Lifestyle articles were more popular on the Mashable website.
- The visualizations also show that Social Media articles were more popular on the Mashable website.
Categories: Data Exploration
Link to the entire article: https://www.analyticsvidhya.com/blog/2020/12/exploratory-analysis-using-spss-power-bi-r-studio-excel-orange/
3. Beginner’s Guide to Pearson’s Correlation Coefficient
- Correlation between continuous variables can be computed in Python. The article’s scatterplot shows that as carlength, curbweight, and carwidth increase, the price of the car also increases.
- In other words, these three variables are positively correlated with car price. A value of ‘r’ equal or close to +1 or -1 means that the data points lie on or near the line of best fit.
- You need to consider outliers that are unusual on only one variable, called ‘univariate outliers’, as well as those that are unusual on both variables, known as ‘multivariate outliers’.
- If we plot age against loan amount, we can see a correlation between a person’s age and the loan amount given to that person: as age increases, the loan amount decreases, and vice versa.
Categories: Pearson’s correlation coefficient
Link to the entire article: https://www.analyticsvidhya.com/blog/2021/01/beginners-guide-to-pearsons-correlation-coefficient/
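As a rough illustration of the points above, the following sketch computes Pearson’s r from its definition (covariance divided by the product of the standard deviations). The numbers are made up for illustration and are not from the article’s car or loan data.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data (assumption): heavier cars cost more.
curb_weight = [2000, 2400, 2800, 3200, 3600]
price = [9000, 11500, 15000, 17500, 22000]

# Hypothetical data (assumption): loan amount falls linearly with age.
age = [25, 35, 45, 55, 65]
loan_amount = [50, 40, 30, 20, 10]

print(pearson_r(curb_weight, price))  # positive, close to +1
print(pearson_r(age, loan_amount))    # negative: points lie exactly on a falling line
```

Because r is sensitive to outliers, it is worth inspecting a scatterplot before trusting the number.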
4. Stock Prices Prediction Using Machine Learning and Deep Learning Techniques (with Python codes)
- Instead of using the simple average, we will be using the moving average technique which uses the latest set of values for each prediction.
- There is not a huge difference in the RMSE value, but a plot of the predicted and actual values should provide a clearer understanding.
- Although the predictions using this technique are far better than those of the previously implemented machine learning models, they are still not close to the real values.
- There are a number of time series techniques that can be implemented on the stock prediction dataset, but most of these techniques require a lot of data preprocessing before fitting the model. LSTM is one of them.
- This article implements LSTM as a black box and checks its performance on the particular data.
Categories: Auto Arima, KNN, Linear Regression, LSTM, Moving Average, Facebook Prophet, Python, Stock Market Analysis, Stock Prediction, Time Series, Time-Series Forecasting
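The moving average idea and the RMSE comparison mentioned above can be sketched as follows. This is a simplified illustration under stated assumptions: the prices, the window size, and the choice to feed each prediction back into the window are all assumptions of this sketch, not details confirmed by the summary.

```python
import math

def moving_average_forecast(history, window, steps):
    """Predict `steps` future values, each as the mean of the latest `window` values."""
    values = list(history)
    predictions = []
    for _ in range(steps):
        next_value = sum(values[-window:]) / window
        predictions.append(next_value)
        values.append(next_value)  # the prediction joins the window for the next step
    return predictions

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical closing prices (assumption), split into training and validation parts.
train = [10.0, 12.0, 14.0, 16.0]
valid = [18.0, 20.0]

preds = moving_average_forecast(train, window=2, steps=2)
print(preds)
print(rmse(valid, preds))
```

On a steadily rising series the forecast lags behind the real values, which is one reason an RMSE figure should be read alongside a plot of predicted versus actual prices.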
5. Using the Power of Deep Learning for Cyber Security (Part 1)
- Deep learning is not a silver bullet that can solve all the InfoSec problems because it needs extensive labeled datasets.
- Automatic differentiation is used to calculate the gradients needed to compute the weights used in the network. The authors of the paper “Inferring Application Type Information from Tor Encrypted Traffic” extracted burst volumes and directions to create an HMM that detects the TOR applications that might be generating that traffic.
- However, the architecture uses a plethora of other meta-information that can be obtained to classify the traffic. For example, if one needs to train a classifier to detect the application used by TOR, then only the output layer needs retraining, and all the other layers can be kept the same.
Categories: Cyber Security, Deep Learning, Deep Learning Security, Infosec
Link to the entire article: https://www.analyticsvidhya.com/blog/2018/07/using-power-deep-learning-cyber-security/
6. DeepMind’s Computer Vision Algorithm Brings the Power of Imagination to Build 3D Scenes from 2D Images
- Without having properly labeled data, the model might as well not exist! Often to train complex models, we have to manually tag and annotate the images to be used by the algorithm.
- In other words, the AI algorithm is able to use the 2D images to understand or “imagine” how the object looks from various angles (which are not seen in the image).
- The GQN has the ability to learn about the shape, size, and color of the object independently and can then combine all these features to form an accurate 3D model.
- Furthermore, the researchers were able to use this algorithm to develop new scenes without having to explicitly train the system as to what object should go where.
- GQN is not limited to tagging and annotating images; it can also be used by autonomous robots to gain a better understanding of their surroundings.
Categories: Artificial Intelligence, Computer Vision, DeepMind, Google, Object Detection
Link to the entire article: https://www.analyticsvidhya.com/blog/2018/06/google-ai-create-3d-objects-using-2d-snapshots/
7. Part 15: Step by Step Guide to Master NLP – Topic Modelling using NMF
- In this method, each individual word in the document-term matrix is taken into consideration, but the one with the highest weight is considered the topic for a set of words.
- There are several methods for measuring the distance, but this blog post discusses the following two popular methods used by Machine Learning practitioners.
- Let’s discuss each of them one by one in a detailed manner, by clicking on the link below.
Categories: Topic Modelling using NMF
Link to the entire article: https://www.analyticsvidhya.com/blog/2021/06/part-15-step-by-step-guide-to-master-nlp-topic-modelling-using-nmf/
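The highest-weight idea above can be sketched with a tiny NMF written from scratch. Several things here are assumptions of this sketch rather than details from the article: the multiplicative-update rule, the use of squared reconstruction error as the distance, the toy document-term counts, and the step of picking each document’s dominant topic from the weight matrix.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Sum of squared differences between V and its approximation W x H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for v_row, wh_row in zip(v, wh) for x, y in zip(v_row, wh_row))

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factor V (docs x terms) into non-negative W (docs x topics) and H (topics x terms)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # Update H: H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        numer = matmul(wt, v)
        denom = matmul(matmul(wt, w), h)
        h = [[h[k][j] * numer[k][j] / (denom[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # Update W: W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        numer = matmul(v, ht)
        denom = matmul(w, matmul(h, ht))
        w = [[w[i][k] * numer[i][k] / (denom[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h

# Hypothetical document-term counts (assumption): rows are documents, columns are terms.
doc_term = [
    [3, 2, 4, 0, 0, 0],
    [2, 3, 3, 0, 0, 0],
    [0, 0, 0, 4, 3, 2],
    [0, 0, 1, 3, 2, 4],
]

w, h = nmf(doc_term, n_topics=2)
# For each document, the topic with the highest weight is taken as its topic.
dominant_topic = [max(range(len(row)), key=row.__getitem__) for row in w]
print(dominant_topic)
print(squared_error(doc_term, w, h))
```

In practice a library implementation would normally be used instead; the loop above is only meant to show where the topic weights come from.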