Thanks for visiting! My name is Chanin Nantasenamat, Ph.D. In my day job I’m an Associate Professor of Bioinformatics, and in my free time I’m a content creator running the Data Professor YouTube channel.
Below is a listing of all the articles that I have written, conveniently categorized for your selection. Please drop a comment to suggest some future topics!
I get asked quite often on my YouTube channel (Data Professor) the following questions about how to break into data science:
So I thought that it would probably be a great idea to write an article about it. And so, here it is. It should be noted that the 10 things that I wish I knew about learning data science are based on my personal journey as a self-taught data scientist. …
On Day 26 of the 66 Days of Data, I continued with coding an implementation in Python for calculating molecular fingerprints for a big chemical library.
This corresponded to 30,000 compounds * 2,000,000 compounds = 60,000,000,000 compound pairs. The former and latter sets represent the 2 compound libraries that I will use for this coding project.
The concept is actually simple.
For each of the 60,000,000,000 compound pairs, compute the Tanimoto coefficient, which is a relative measure of the molecular similarity of 2 molecules, where a value of 1 indicates that the 2 query compounds are the same…
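As a rough illustration of the idea above, here is a minimal sketch of the Tanimoto coefficient in plain Python. It assumes each fingerprint is represented as a set of "on" bit positions; the toy fingerprints and the function name `tanimoto` are hypothetical and are not taken from the actual project code. Note that a value of 1 strictly means that the two fingerprints are identical.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    shared = len(fp_a & fp_b)   # bits set in both fingerprints
    total = len(fp_a | fp_b)    # bits set in at least one fingerprint
    # Assumption: two empty fingerprints are scored 0.0 to avoid dividing by zero.
    return shared / total if total else 0.0


# Hypothetical toy fingerprints (real ones typically have hundreds or thousands of bits).
query = {1, 4, 7, 9}
candidate = {1, 4, 8, 9, 12}

print(tanimoto(query, candidate))   # 3 shared bits / 6 distinct bits
print(tanimoto(query, set(query)))  # identical fingerprints

# Size of the all-vs-all comparison mentioned in the post.
n_pairs = 30_000 * 2_000_000
print(f"{n_pairs:,} compound pairs")
```

At this scale a pure Python loop over every pair would likely be far too slow, so in practice the same calculation would probably be done with a cheminformatics toolkit or vectorized bit operations.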
On Day 25 of the 66 Days of Data, I’ve pondered some more about scikit-learn for an upcoming blog that I’m working on.
Here’s a preview of the first illustration I’ve made on the data representation of tabular datasets used for building models in
On Day 24, I’ve continued to work some more on writing a full blog post about scikit-learn for data science. Aside from statistics, probably 80% or more of the data problems that you can think of can be handled by machine learning.
Motivated to distill the fundamentals of the scikit-learn library in as beginner-friendly a way as possible, I’ve set out to write a full blog post about it. As with other
How to Master …. …
On Day 23 of the 66 Days of Data, I started the day off by listening to 12 student presentations on their Mini-Project data analysis of various Kaggle healthcare datasets, and ended the day in the live chat of the Premiere video podcast with Nate at StrataScratch.
This was indeed an exciting day, with students presenting the fruits of their hard work: a data analytics workflow that they coded for a Kaggle healthcare dataset. This is an amazing feat considering that they had no prior knowledge of Python about 3 weeks ago.
Students are given 10-15…
On Day 22 of the 66 Days of Data, I’ve spent time doing a Q and A session for the course I’m teaching as well as coding in Python to analyze a large chemical dataset.
As the course comes to a close, the day was spent giving students the opportunity to ask anything that they may have about the course. Most questions pertained to the Mini-Project assignment. In addition to answering questions, I’ve also provided a high-level overview summarizing the big concepts of the course as well as a high-level account of the data analytics workflow that students can…
On Day 21 of the 66 Days of Data, I started off the day by teaching the hands-on tutorial on scikit-learn to an undergraduate class of Medical Technology students via Zoom. This introductory Python for Health Data Science course is compressed to only 3 weeks from the typical 16-week semester and, as also mentioned in a prior blog post, it is amazing how students are able to
This also marked almost the last day of class in the sense that there will be no more lectures. …
On Day 20, I’ve spent time wrapping up the Jupyter notebook for the hands-on tutorial of scikit-learn for a Python course I’m teaching. This included polishing the code cells so that they run properly as well as adding proper documentation so that it is apparent what the code cells are doing.
I’ve also added high-level overview contents of scikit-learn to the Jupyter notebook in order to improve readability. Particularly, to help in understanding the principles behind various estimators provided in the scikit-learn library, I’ve drawn some illustrations to summarize the essence of estimators.
I will be sharing these illustrations in…
On Day 19 of the 66 Days of Data, I worked on creating a Jupyter notebook for teaching students about the use of the scikit-learn library for basic model building for an introductory Python for #datascience course.
I’ve worked on crafting the introductory text cells to get students acquainted with scikit-learn before diving into the specifics. Here’s a high-level definition of what scikit-learn is and what it can do.
Founder of Data Professor YouTube Channel | Associate Professor of Bioinformatics | Head, Center of Data Mining and Biomedical Informatics