Thanks for visiting! My name is Chanin Nantasenamat, Ph.D. In my day job I’m an *Associate Professor of Bioinformatics*, and in my free time I am a *Content Creator* running the **Data Professor** YouTube channel.

Below is a list of all the articles I have written, conveniently categorized for your selection. Please drop a comment to suggest future topics!

**How a Biologist Became a Data Scientist**

How I Transitioned from a Non-Technical Background into Data Science

**101 Data Science Quotes**

Powerful Quotes To Inspire Your Data Science Journey

**Data Science Starter Kit**

A guide for getting started…

I am often asked the following questions on my YouTube channel (Data Professor) about how to break into data science:

*How to become a Data Scientist?*

*What is the roadmap to being a Data Scientist?*

*What courses should I take to learn Data Science?*

So I thought that it would be a great idea to write an article about it. And so, here it is. It should be noted that the 10 things that I wish I knew about learning data science are based on my personal journey as a self-taught data scientist. …

On Day 26 of the 66 Days of Data, I continued with coding an implementation in Python for calculating molecular fingerprints for a big chemical library.

How big?

This corresponded to 30,000 compounds × 2,000,000 compounds = 60,000,000,000 compound pairs. The former and latter sets represent the 2 compound libraries that I will use for this coding project.

The concept is actually simple.

For each of the 60,000,000,000 compound pairs, compute the Tanimoto coefficient, which is a relative measure of the molecular similarity of 2 molecules, where a value of 1 indicates that the 2 query compounds are the same…
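As a minimal sketch of the calculation, assuming each molecular fingerprint is stored as a set of on-bit indices: the Tanimoto coefficient is the number of on-bits shared by both fingerprints divided by the number of on-bits present in either one. The function name `tanimoto` and the toy fingerprints below are made up for illustration and are not the code from this project.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    shared = len(fp_a & fp_b)   # bits set in both fingerprints
    total = len(fp_a | fp_b)    # bits set in at least one fingerprint
    # Two empty fingerprints have no bits to compare; returning 0.0 here
    # is a convention assumed for this sketch.
    return shared / total if total else 0.0


# Hypothetical toy fingerprints (not real molecules).
fp_1 = {1, 4, 7, 9}
fp_2 = {1, 4, 8}

print(tanimoto(fp_1, fp_1))  # 1.0 (identical fingerprints)
print(tanimoto(fp_1, fp_2))  # 0.4 (2 shared bits out of 5 distinct bits)
```

At the scale of 60,000,000,000 pairs, a plain Python loop like this would likely be too slow, so a vectorized or library-based implementation would be the more practical choice.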

On Day 25 of the 66 Days of Data, I’ve pondered some more about `scikit-learn` for an upcoming blog that I’m working on.

Here’s a preview of the first illustration I’ve made on the data representation of tabular datasets used for building models in `scikit-learn`:

On Day 24, I’ve continued to work on writing a full blog post about `scikit-learn` for data science. Aside from statistics, probably 80% or more of the data problems that you can think of can be handled by machine learning.

Motivated to distill the fundamentals of the scikit-learn library in a way that is as beginner-friendly as possible, I’ve set out to write a full blog post about it. As with other `How to Master …. …`

On Day 23 of the 66 Days of Data, I started the day off by listening to 12 student presentations on their Mini-Project data analysis of various Kaggle healthcare datasets. I ended the day in the live chat of the Premiere video podcast with Nate at StrataScratch.

This was indeed an exciting day, with students presenting the fruits of their hard work: a data analytics workflow that they coded for a Kaggle healthcare dataset. This is an amazing feat considering that they had no prior knowledge of Python about 3 weeks ago.

Students are given 10-15…

On Day 22 of the 66 Days of Data, I spent time holding a Q&A session for the course I’m teaching, as well as coding in Python to analyze a large chemical dataset.

As the course comes to a close, the day was spent giving students the opportunity to ask any questions that they may have about the course. Most questions pertained to the Mini-Project assignment. In addition to answering questions, I also provided a high-level overview summarizing the big concepts of the course, as well as a high-level account of the data analytics workflow that students can…

On Day 21 of the 66 Days of Data, I started off the day by teaching the hands-on tutorial on `scikit-learn` to an undergraduate class of Medical Technology students via Zoom. This introductory Python for Health Data Science course is compressed to only 3 weeks from the typical 16-week semester and, as also mentioned in a prior blog post, it is amazing how students are able to…

This also marked almost the last day of class in the sense that there will be no more lectures. …

On Day 20, I spent time wrapping up the Jupyter notebook for the hands-on tutorial of `scikit-learn` for a Python course I’m teaching. This included polishing the code cells so that they run properly, as well as adding proper documentation so that it is apparent what the code cells are doing.

I’ve also added high-level overview content on `scikit-learn` to the Jupyter notebook in order to improve readability. In particular, to help in understanding the principles behind the various estimators provided in the `scikit-learn` library, I’ve drawn some illustrations to summarize the essence of estimators.

I will be sharing these illustrations in…
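A minimal sketch of the estimator pattern in code, assuming the bundled iris dataset and a `KNeighborsClassifier` (both chosen here purely for illustration and not taken from the notebook): the data is a feature matrix `X` and a target vector `y`, the estimator learns from them with `fit()`, and it makes predictions with `predict()`.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Tabular data: X holds one row per sample and one column per feature,
# y holds one target label per sample.
X, y = load_iris(return_X_y=True)
print(X.shape, y.shape)  # (150, 4) (150,)

# Hold out 20% of the samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Every scikit-learn estimator follows the same fit / predict pattern.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)        # learn from the training data
y_pred = model.predict(X_test)     # predict labels for unseen samples

accuracy = model.score(X_test, y_test)  # fraction of correct predictions
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping in a different estimator, such as a decision tree or a random forest, should only require changing the line that creates `model`, since the `fit()` and `predict()` calls stay the same.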

On Day 19 of the 66 Days of Data, I worked on creating a Jupyter notebook for teaching students about the use of the `scikit-learn` library for basic model building in an introductory Python for #datascience course.

I’ve worked on crafting the introductory text cells to get students acquainted with `scikit-learn` before diving into the specifics. Here’s a high-level definition of what `scikit-learn` is and what it can do.

Founder of Data Professor YouTube Channel | Associate Professor of Bioinformatics | Head, Center of Data Mining and Biomedical Informatics