Collection of cover art for all my Medium blog posts

All my blogs on 1 page (Updated June 6, 2021)

Thanks for visiting! My name is Chanin Nantasenamat, Ph.D. In my day job I’m an Associate Professor of Bioinformatics, and in my free time I’m a content creator running the Data Professor YouTube channel.

Table of Contents

Below is a listing of all the articles I have written, conveniently categorized for your selection. Please drop a comment to suggest future topics!

How did I become a data scientist?

New to data science?


Data Science

The Ultimate Roadmap for Starting Your Data Science Journey

Modified (with License) from the Photo by koctia on Envato Elements

I get asked quite often on my YouTube channel (Data Professor) the following questions about how to break into data science:

  • How to become a Data Scientist?
  • What is the roadmap to being a Data Scientist?
  • What courses should I take to learn Data Science?

So I thought it would be a great idea to write an article about it, and here it is. It should be noted that the 10 things that I wish I knew about learning data science are based on my personal journey as a self-taught data scientist. …


Created (with license) using the image by koctia from Envato Elements.

Getting Started

Here’s the Essential Scikit-learn you Need for Data Science

Scikit-learn is one of many scikits (short for SciPy Toolkits) and specializes in machine learning. A scikit is a package that is too specialized to be included in SciPy and is thus distributed as a separate scikit. Another popular scikit is scikit-image (a collection of algorithms for image processing).

Scikit-learn is one of the pillars of machine learning in Python: it lets you build machine learning models and provides utility functions for data preparation, post-model analysis and evaluation.
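As a rough illustration of those three roles (data preparation, model building and evaluation), here is a minimal sketch. The Iris dataset, the random forest model and the parameter values are my own choices for this example and are not taken from the article itself.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a small built-in dataset as a feature matrix X and a label vector y
X, y = load_iris(return_X_y=True)

# Data preparation: hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Model building: fit a classifier on the training split
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluation: compare predictions against the held-out labels
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

The same fit, predict and score pattern applies to most other scikit-learn estimators, so swapping in a different model usually only changes the line that creates it.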

In this article, we will be exploring the essential bare-minimum knowledge…


66 Days of Data

Documenting my Data Science Learning Journey

Photo by Thought Catalog on Unsplash

In this post, I have combined what I did for Days 27–33 into a single post, as I’ve been away from posting content on Medium for a week. I hope you like the concise nature of this post. I’ve also added links to the resources and YouTube videos that I watched during this period, in case you’re interested in checking them out.

Day 27

  • Explored the multiprocessing library in Python for handling large calculations such as the molecular fingerprint problem mentioned earlier.
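To give a feel for what this looks like, here is a minimal sketch of splitting a large calculation into chunks and handing them to a pool of worker processes. The worker function below is a hypothetical stand-in (a sum of squares), not the actual fingerprint calculation, and the chunk size and number of processes are arbitrary assumptions.

```python
from multiprocessing import Pool


def square_sum(chunk):
    """Hypothetical CPU-bound stand-in for the real per-chunk calculation."""
    return sum(x * x for x in chunk)


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


if __name__ == "__main__":
    numbers = list(range(1_000))
    chunks = split_into_chunks(numbers, 250)

    # Each chunk is processed in a separate worker process
    with Pool(processes=2) as pool:
        partial_sums = pool.map(square_sum, chunks)

    # Combine the per-chunk results into the final answer
    print(sum(partial_sums))
```

The `if __name__ == "__main__":` guard matters here because some platforms start worker processes by re-importing the script, and the guard keeps the pool from being created again inside each worker.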

Day 28

  • Caught up on the backlog of full 66 Days of Data blog posts on Medium that I’ve been behind on for…


66 Days of Data

Documenting my Data Science Learning Journey

Photo by Terry Vlisidis on Unsplash

On Day 26 of the 66 Days of Data, I continued with coding an implementation in Python for calculating molecular fingerprints for a big chemical library.

How big?

This corresponds to 30,000 compounds × 2,000,000 compounds = 60,000,000,000 compound pairs. The former and latter sets represent the two compound libraries that I will use for this coding project.

The concept is actually simple.

For each of the 60,000,000,000 compound pairs, compute the Tanimoto coefficient, which is a relative measure of the molecular similarity of 2 molecules, where a value of 1 indicates that the 2 query compounds are the same…
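The pairwise calculation can be sketched in a few lines of plain Python. This is only an illustration under my own assumptions: each fingerprint is represented as a set of "on" bit positions, the two tiny libraries and their compound names are made up, and two empty fingerprints are scored as 0.0 by choice. It is not the implementation used in the project.

```python
from itertools import product


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0  # assumption: two empty fingerprints are scored as 0
    shared = len(fp_a & fp_b)
    # shared bits divided by the number of bits set in either fingerprint
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical toy libraries standing in for the two real compound sets
library_a = {"a1": {1, 4, 7, 9}, "a2": {2, 4, 8}}
library_b = {"b1": {1, 4, 7, 9}, "b2": {4, 8, 10}, "b3": {3, 5}}

# Score every pair drawn from the two libraries (2 x 3 = 6 pairs here)
scores = {
    (name_a, name_b): tanimoto(library_a[name_a], library_b[name_b])
    for name_a, name_b in product(library_a, library_b)
}

for pair, score in scores.items():
    print(pair, round(score, 3))
```

With identical fingerprints the coefficient is 1, and with no shared bits it is 0. The number of pairs is the product of the two library sizes, which is why the real problem grows to 60,000,000,000 comparisons.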


66 Days of Data

Documenting my Data Science Learning Journey

Photo by Austin Distel on Unsplash

On Day 25 of the 66 Days of Data, I pondered some more about scikit-learn for an upcoming blog that I’m working on.

Preview of the Illustration for my upcoming scikit-learn blog

Here’s a preview of the first illustration I’ve made on the data representation of tabular datasets used for building models in scikit-learn:


66 Days of Data

Documenting my Data Science Learning Journey

Photo by Markus Winkler on Unsplash

On Day 24, I continued working on a full blog post about scikit-learn for data science. Aside from statistics, probably 80% or more of any data problem that you can think of can be handled by machine learning.

Motivated to distill the fundamentals of the scikit-learn library in a way that is as beginner-friendly as possible, I’ve set out to write a full blog post about it. As with other How to Master …. …


66 Days of Data

Documenting my Data Science Learning Journey

Photo by Teemu Paananen on Unsplash

On Day 23 of the 66 Days of Data, I started the day off by listening to 12 student presentations on their Mini-Project data analysis of various Kaggle healthcare datasets. I ended the day in the live chat of the Premiere video podcast with Nate at StrataScratch.

Student Presentation of the Mini-Projects

This was indeed an exciting day: students presented the fruits of their hard work, having coded a data analytics workflow for a Kaggle healthcare dataset. This is an amazing feat considering that they had no prior knowledge of Python about 3 weeks ago.

Students are given 10-15…


66 Days of Data

Documenting my Data Science Learning Journey

Photo by Alex Kondratiev on Unsplash

On Day 22 of the 66 Days of Data, I spent time running a Q and A session for the course I’m teaching and coding in Python to analyze a large chemical dataset.

Q and A for the Introductory Course on Python for Data Science

As the course comes to a close, the day was spent giving students the opportunity to ask any questions they had about the course. Most questions pertained to the Mini-Project assignment. In addition to answering questions, I also provided a high-level overview summarizing the big concepts of the course, as well as a high-level account of the data analytics workflow that students can…


66 Days of Data

Documenting my Data Science Learning Journey

Photo by David Travis on Unsplash

Hands-on scikit-learn tutorial

On Day 21 of the 66 Days of Data, I started off the day by teaching the hands-on tutorial on scikit-learn to an undergraduate class of Medical Technology students via Zoom. This introductory Python for Health Data Science course is compressed to only 3 weeks from the typical 16-week semester and, as mentioned in a prior blog post, it is amazing how students are able to

This also marked almost the last day of class in the sense that there will be no more lectures. …

Chanin Nantasenamat

Founder of Data Professor YouTube Channel | Associate Professor of Bioinformatics | Head, Center of Data Mining and Biomedical Informatics
