What Is a P-Value?

The p-value is the probability of observing a test statistic at least as extreme as the one calculated from the data, assuming the null hypothesis is true.

Null Hypothesis (H0): In statistical hypothesis testing, we start with a null hypothesis, which is a statement that there is no effect or no difference. It acts as the default assumption.

Alternative Hypothesis (Ha): This is the opposite of the null hypothesis and represents what we are trying to find evidence for. It states that there is a real effect or difference.

Test Statistic: To evaluate the null hypothesis, we calculate a test statistic from the data. Which statistic we use depends on the specific test we are performing.

Probability Distribution: We compare the test statistic to the probability distribution it would follow if the null hypothesis were true, chosen to suit the data and the test.

Example:

Imagine that we have a magical coin and we want to check whether it is a fair coin or whether it is biased towards landing on heads.

The null hypothesis is that the coin is fair, meaning it has an equal chance of landing on heads or tails each time we flip it.

The alternative hypothesis is that the coin is not fair and is biased towards landing on heads. This is what we want to find out.

Now we flip the coin 10 times and keep track of how many times it lands on heads. Let’s say it lands on heads 8 times out of 10 flips.

The p-value is a number that helps us decide whether the coin is fair, based on the results we got (8 heads out of 10 flips).

If the coin is fair, we expect it to land on heads about 5 times out of 10 flips, because each flip is a 50-50 chance. So we use the p-value to see how likely it is to get a result at least as extreme as 8 heads by random chance if the coin is actually fair.

If the p-value is very low (say, less than 0.05), it means a result as extreme as 8 heads would be very unlikely if the coin were fair, so we have evidence against the null hypothesis.

But if the p-value is high (say, more than 0.05), it means a result as extreme as 8 heads is reasonably likely to happen by chance even if the coin is fair. We can say, “This result could happen by luck, so we can’t be sure if the coin is biased or not.”

So the p-value gives us a way to measure how compatible the data we collected are with our default assumption (the null hypothesis).
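
The coin example can be worked through numerically. The sketch below assumes a one-sided test (biased towards heads) and models the number of heads in 10 flips of a fair coin with a binomial distribution. The function name binomial_upper_tail is made up for this illustration and is not taken from any library.

```python
from math import comb


def binomial_upper_tail(heads: int, flips: int, p_heads: float = 0.5) -> float:
    """Return P(X >= heads) when X ~ Binomial(flips, p_heads).

    Hypothetical helper for this example: it sums the probabilities of
    every outcome at least as extreme as the observed number of heads.
    """
    return sum(
        comb(flips, k) * p_heads**k * (1 - p_heads) ** (flips - k)
        for k in range(heads, flips + 1)
    )


# Observed result from the example: 8 heads out of 10 flips.
# Under the null hypothesis the coin is fair, so p_heads = 0.5.
p_value = binomial_upper_tail(heads=8, flips=10)
print(f"one-sided p-value = {p_value:.4f}")  # 56/1024, about 0.0547
```

Under these assumptions the p-value is about 0.055, which is slightly above 0.05. So with only 10 flips, even 8 heads is not quite enough to reject the fair-coin hypothesis at that threshold.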

Linear Regression

I learnt about linear regression, a statistical approach that allows us to study the relationship between two continuous variables.

Mathematically, we can write the expression for linear regression as follows:

Y=α+βX

where Y is the dependent variable, X is the independent variable, α is the intercept and β is the slope.
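
As a rough illustration of how α and β in the equation above can be estimated, here is a minimal sketch that assumes ordinary least squares, the usual fitting method for simple linear regression. The helper fit_line and the five data points are invented for this example and are not taken from the Diabetes dataset.

```python
from statistics import mean


def fit_line(xs, ys):
    """Estimate alpha and beta in Y = alpha + beta * X by least squares.

    Hypothetical helper for this example, using the closed-form solution:
    beta = S_xy / S_xx and alpha = mean(y) - beta * mean(x).
    """
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    beta = s_xy / s_xx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Toy data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

alpha, beta = fit_line(xs, ys)
print(f"alpha = {alpha:.2f}, beta = {beta:.2f}")  # alpha = 2.20, beta = 0.60
```

The fitted line can then be used to predict Y for a new X by computing alpha + beta * X.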

In the Diabetes dataset provided, the variables include %Obesity and %Inactivity.

Here X represents %Obesity and Y represents %Inactivity, as per the equation and the dataset provided.

Linear regression helps us predict diabetes from the variables %Obesity and %Inactivity by modelling the relationship between them as linear.

As a first step, we need to plot all the points in the dataset provided, so that we can analyze and understand the data efficiently before doing the statistical analysis.