Showing posts from May, 2017

The Journey Begins

I have started on a journey to learn more about data science. There is a lot of science and statistics behind the use of big data tools, and this blog is where I will keep a few notes along the way. One of the things I learnt about today was the t-test. There are two common variants: Student's t-test, which assumes the two samples have equal variances, and Welch's t-test, which does not. Both are hypothesis tests on two samples and can be used, for example, to determine whether the means of two sets of data are significantly different from each other. Below is a piece of code that uses Python's scipy library to run Welch's t-test. The baseball data is from the Lahman database.
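
As a minimal, self-contained sketch of the scipy call itself: the two samples here are randomly generated stand-ins (not the Lahman data), and `welch_t_test`, `group_a`, `group_b` and the 0.05 threshold are names and values I picked purely for illustration. Passing `equal_var=False` to `scipy.stats.ttest_ind` is what selects Welch's version instead of Student's.

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in samples (NOT the Lahman data): two groups with
# different sizes and different spreads, which is where Welch's test is useful.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.270, scale=0.030, size=40)
group_b = rng.normal(loc=0.255, scale=0.045, size=55)


def welch_t_test(a, b, alpha=0.05):
    """Run Welch's t-test on two samples.

    Returns the t statistic, the two-sided p-value, and whether the null
    hypothesis (equal means) is rejected at significance level `alpha`.
    """
    # equal_var=False switches ttest_ind from Student's to Welch's t-test.
    t_stat, p_value = stats.ttest_ind(a, b, equal_var=False)
    return float(t_stat), float(p_value), bool(p_value < alpha)


t_stat, p_value, reject = welch_t_test(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject null at 5%: {reject}")
```

With real data, the two arrays would simply be the two groups being compared, for example one numeric column filtered by some category.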