Data Science ENG Archives - Meccanismo Complesso

Statsmodels – the Python library for statistics

Statsmodels is an open-source library that offers a wide range of tools for estimating statistical models, running statistical tests, and visualizing data.

Data Science ENG / Statistics ENG

The Sign Test, a nonparametric method with Python

The Sign Test is a nonparametric method used to compare two related samples or to test whether the median of a population differs from a specific value. It is especially useful when the data does not meet the assumptions necessary for parametric tests, such as normality of the distribution.
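As a rough illustration of the paired-sample case, the sketch below counts positive and negative differences, discards ties, and computes an exact two-sided p-value from the Binomial(n, 0.5) distribution. The function name `sign_test` and the before/after values are made up for this example; it uses only the standard library and is not taken from the article.

```python
from math import comb

def sign_test(x, y):
    """Two-sided sign test for paired samples (hypothetical helper)."""
    # Keep only the non-zero differences; ties carry no sign information.
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    n_pos = sum(d > 0 for d in diffs)
    n_neg = n - n_pos
    # Under H0 the number of positive signs follows Binomial(n, 0.5).
    k = min(n_pos, n_neg)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return n_pos, n_neg, p_value

# Illustrative measurements taken before and after a treatment.
before = [72, 68, 75, 80, 66, 71, 77, 69, 74, 70]
after = [75, 70, 74, 85, 69, 71, 80, 72, 78, 73]

n_pos, n_neg, p_value = sign_test(before, after)
print(n_pos, n_neg, p_value)
```

With eight positive signs, one negative sign, and one tie, the p-value comes out just under 0.04, so this toy data set would reject the null hypothesis of no median difference at the 5% level.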

Data Science ENG / Statistics ENG

Calculating Measures of Dispersion in Statistics with Python

Measures of dispersion describe the variability, or spread, of the values in a data set. In other words, they show how far the data deviate from the mean or another central value. These measures are critical because they reveal the distribution and consistency of the data, helping analysts understand the nature and characteristics of a data set.
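A minimal sketch of a few common dispersion measures, using Python's standard `statistics` module. The data values are invented for illustration, and the interquartile range relies on the default "exclusive" method of `statistics.quantiles`, so other libraries may return slightly different quartiles.

```python
import statistics

data = [4, 8, 6, 5, 3, 7, 9, 6]  # illustrative values

data_range = max(data) - min(data)           # distance between the extremes
pop_variance = statistics.pvariance(data)    # divides by n
sample_variance = statistics.variance(data)  # divides by n - 1
sample_std = statistics.stdev(data)          # square root of the sample variance

# Quartiles with the default "exclusive" method; IQR = Q3 - Q1.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

print(data_range, pop_variance, sample_variance, sample_std, iqr)
```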

Data Science ENG / Statistics ENG

Non-Parametric Statistics

Non-parametric statistics is a branch of statistics that focuses on the analysis of data without making rigid assumptions about their distribution.

Data Science ENG / Statistics ENG

Calculating centrality measures with Python: Mean, Median and Mode

Centrality measures, such as mean, median, and mode, are fundamental in descriptive statistics.
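For a quick sense of how the three measures differ, here is a small sketch with the standard `statistics` module. The data set is invented, and `multimode` is used alongside `mode` because a data set can have more than one most frequent value.

```python
import statistics

data = [2, 3, 3, 5, 7, 10]  # illustrative values

mean_value = statistics.mean(data)      # arithmetic average
median_value = statistics.median(data)  # middle value (average of the two middle ones here)
mode_value = statistics.mode(data)      # most frequent value
all_modes = statistics.multimode(data)  # list of every most frequent value

print(mean_value, median_value, mode_value, all_modes)
```

In this example the mean (5) is pulled upward by the value 10, while the median (4) and the mode (3) are not, which is why the three measures are usually reported together.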

Data Science ENG / Statistics ENG

The Cumulative Distribution Function (CDF) in Python

The Cumulative Distribution Function (CDF) is a mathematical function that gives the probability that a random variable is less than or equal to a certain value. In other words, the CDF summarizes the probability distribution of a random variable. In Python, you can compute the CDF with libraries such as NumPy, SciPy, or Statsmodels. These libraries provide methods to calculate the CDF for different probability distributions, such as the normal, binomial, and Poisson distributions.
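To keep the example dependency-free, the sketch below uses the standard library's `statistics.NormalDist` rather than the libraries named above; with SciPy installed, `scipy.stats.norm.cdf` plays the same role. The helper `ecdf` and the sample values are hypothetical and only illustrate the definition P(X <= x).

```python
from statistics import NormalDist

def ecdf(sample, x):
    """Empirical CDF: fraction of observations less than or equal to x."""
    return sum(v <= x for v in sample) / len(sample)

# Theoretical CDF of a standard normal distribution.
std_normal = NormalDist(mu=0.0, sigma=1.0)
p_below_zero = std_normal.cdf(0.0)   # half of the mass lies below the mean
p_below_196 = std_normal.cdf(1.96)   # roughly 97.5% of the mass

# Empirical CDF of a small illustrative sample.
sample = [1, 2, 2, 3, 5]
p_empirical = ecdf(sample, 2)        # 3 of the 5 values are <= 2

print(p_below_zero, p_below_196, p_empirical)
```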

Data Science ENG / Statistics ENG

Joint probability and Union probability

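Joint probability and union probability are fundamental concepts in probability theory that describe different relationships between events: the joint probability P(A ∩ B) is the probability that two events both occur, while the union probability P(A ∪ B) is the probability that at least one of them occurs.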
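The difference can be made concrete with a single roll of a fair die. In this sketch the events A ("the result is even") and B ("the result is greater than 3") are chosen purely for illustration, and exact fractions are used so that the inclusion-exclusion identity P(A ∪ B) = P(A) + P(B) - P(A ∩ B) can be checked without rounding.

```python
from fractions import Fraction

omega = set(range(1, 7))              # outcomes of a fair six-sided die
A = {x for x in omega if x % 2 == 0}  # event A: result is even
B = {x for x in omega if x > 3}       # event B: result is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))

p_joint = prob(A & B)  # both A and B occur
p_union = prob(A | B)  # at least one of A, B occurs

# Inclusion-exclusion ties the two quantities together.
inclusion_exclusion_holds = p_union == prob(A) + prob(B) - p_joint

print(p_joint, p_union, inclusion_exclusion_holds)
```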

Data Science ENG / Machine Learning ENG

Ensemble Learning: Unity is strength in Machine Learning

Ensemble Learning is a Machine Learning technique in which multiple models are combined to improve the overall performance of the system. Rather than relying on a single model, Ensemble Learning aggregates the predictions or classifications of several models. This technique takes advantage of the diversity of the models in the ensemble to reduce the risk of overfitting and improve the generalization of the results.
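The benefit of diversity can be seen in a deliberately idealised toy example of majority voting. The three "classifiers" below are not trained models: their predictions are fabricated by flipping the true labels at different positions, so each one is 70% accurate but none makes the same mistake as another. All names are hypothetical.

```python
from collections import Counter

def flip(labels, wrong_idx):
    """Copy binary labels, flipping those at the given positions."""
    return [1 - v if i in wrong_idx else v for i, v in enumerate(labels)]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def majority_vote(*predictions):
    """Pick, for every sample, the label predicted by most models."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

# Three simulated models whose errors fall on different samples.
pred_a = flip(y_true, {0, 1, 2})
pred_b = flip(y_true, {3, 4, 5})
pred_c = flip(y_true, {6, 7, 8})

ensemble = majority_vote(pred_a, pred_b, pred_c)
individual = [accuracy(p, y_true) for p in (pred_a, pred_b, pred_c)]
ensemble_acc = accuracy(ensemble, y_true)

print(individual, ensemble_acc)
```

Because the errors never overlap, every mistake is outvoted and the ensemble is perfect here. Real models make correlated errors, so the gain in practice is smaller.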

Data Science ENG / Machine Learning ENG

Linear Regression with Elastic Net in Machine Learning with scikit-learn

Elastic Net is a linear regression technique that adds a regularization term combining the L1 penalty (as in Lasso regression) and the L2 penalty (as in Ridge regression). It is therefore based on the linear regression model, with these penalties added to improve the model's performance, especially when the predictors are multicollinear or when you want to perform variable selection.
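A minimal sketch with scikit-learn's `ElasticNet` on synthetic data, assuming NumPy and scikit-learn are installed. The data set, the seed, and the hyperparameter values (`alpha=0.1`, `l1_ratio=0.5`) are arbitrary choices for illustration, not recommendations from the article. Two of the three features are built to be nearly collinear, which is the situation the combined penalty is meant to handle.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
n = 200

x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)  # nearly collinear with x1
x3 = rng.normal(size=n)              # unrelated to the target
X = np.column_stack([x1, x2, x3])
y = 3.0 * x1 + 0.1 * rng.normal(size=n)

# alpha sets the overall penalty strength; l1_ratio mixes L1 (1.0) and L2 (0.0).
model = ElasticNet(alpha=0.1, l1_ratio=0.5)
model.fit(X, y)

r2 = model.score(X, y)  # R^2 on the training data
print(model.coef_, model.intercept_, r2)
```

Setting `l1_ratio` closer to 1 makes the model behave more like Lasso, and closer to 0 more like Ridge.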

Data Science ENG / Machine Learning ENG

Linear regression with Lasso in Machine Learning with scikit-learn

Lasso (Least Absolute Shrinkage and Selection Operator) regression is a linear regression technique that uses L1 regularization to improve generalization and perform variable selection. Because the L1 penalty can shrink some coefficients exactly to zero, Lasso keeps only the most important variables, helping to create more interpretable and generalizable models.
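The variable-selection effect can be sketched with scikit-learn's `Lasso` on synthetic data, assuming NumPy and scikit-learn are installed. The data, the seed, and `alpha=0.1` are illustrative assumptions: only the first two of ten features actually influence the target, so the L1 penalty is expected to drive the remaining coefficients to zero.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_samples, n_features = 200, 10

X = rng.normal(size=(n_samples, n_features))
# Only features 0 and 1 carry signal; the other eight are pure noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=n_samples)

model = Lasso(alpha=0.1)  # alpha controls the strength of the L1 penalty
model.fit(X, y)

selected = np.flatnonzero(model.coef_)  # indices of non-zero coefficients
print(model.coef_)
print(selected)
```

The surviving coefficients are slightly smaller in magnitude than the true values 3 and -2, because the same penalty that removes the noise features also shrinks the informative ones.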

Category: Data Science ENG