Anyone working on data projects with tabular datasets is bound to reach a point where they need to join two or more datasets or tables into a single, more relevant one for analysis.

In this blog, I go over some of the common JOIN scenarios. Each type of JOIN is best explained with a set-logic visualization (a Venn diagram) and the corresponding SQL syntax. The syntax used here is PostgreSQL's, but the logic behind the joins is the same in other SQL dialects. With the goal of joining two tables, the…
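To make the set logic concrete, here is a minimal sketch of an inner join next to a left join. It is an illustration under stated assumptions, not the post's own example: it uses Python's built-in `sqlite3` module with an in-memory SQLite database instead of PostgreSQL, and the `customers` and `orders` tables are hypothetical. The join syntax shown works the same way in both databases.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")

# Hypothetical tables: Cy is a customer with no orders.
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1, 'book'), (11, 1, 'pen'), (12, 2, 'lamp');
""")

# INNER JOIN: only rows with a match in both tables
# (the overlap of the two circles in the Venn diagram).
inner_rows = conn.execute("""
SELECT c.name, o.item
FROM customers AS c
INNER JOIN orders AS o ON o.customer_id = c.id
ORDER BY c.id, o.id
""").fetchall()

# LEFT JOIN: every row of the left table, with NULL (None in Python)
# where the right table has no match.
left_rows = conn.execute("""
SELECT c.name, o.item
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id
ORDER BY c.id, o.id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The inner join drops Cy because there is no matching order, while the left join keeps Cy with an empty item.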

Through the Towards Data Science podcast episode “Making AI safe through debate”

I was recently listening to the Towards Data Science podcast episode “Making AI safe through debate”, in which host Jeremie Harris interviews Ethan Perez. It was a fascinating conversation on AI safety, the future of AI, and the possible ways to get there, drawing on Ethan’s expertise in language models and IDA strategies.

Oh, my background? I am a recent graduate of the Data Science Bootcamp program at the Flatiron School. Seeing data through machine learning has been somewhat new to me but very, very exciting…

Visualizing Time Series Data. A short Introduction.

A time series data set contains observations on one or more variables over time. ‘Time’ is an important dimension in time series data and is best understood using data visualizations. Here, the chronological ordering of observations conveys important information. Another feature of time series data that can require special attention is the frequency at which the data are collected, i.e., daily, weekly, monthly, quarterly, or annually. Visualizing can be an efficient way to view the data for trends and patterns. …
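As a small illustration of why collection frequency matters, here is a minimal sketch that rolls daily observations up to monthly averages using only the Python standard library. The data are made up for the example, and `monthly_mean` is a hypothetical helper name rather than anything from the post.

```python
from collections import defaultdict
from datetime import date, timedelta

# Made-up daily series: one observation per day from 2021-01-01 to 2021-02-28.
start = date(2021, 1, 1)
daily = [(start + timedelta(days=i), float(i)) for i in range(59)]


def monthly_mean(observations):
    """Group (date, value) pairs by calendar month and average each group."""
    buckets = defaultdict(list)
    for day, value in observations:
        buckets[(day.year, day.month)].append(value)
    # Sort by (year, month) so the chronological order is preserved.
    return {month: sum(vals) / len(vals) for month, vals in sorted(buckets.items())}


monthly = monthly_mean(daily)
print(monthly)
```

The same 59 daily points become two monthly points, so a plot at the monthly frequency shows the overall trend while hiding the day-to-day movement.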

Growing up, Tic-Tac-Toe was a much-loved game. The rules are simple and very easy to learn. So, I wondered whether the game could be written in Python. I looked it up and yes, it can be done! I looked at existing code and saw that a Python module called Pygame can be used. I coded along with the instructions and hope to explain in this blog what I learned about how to build Tic-Tac-Toe.

Pygame is a Python module designed for writing video games. It has a nice SDL (Simple DirectMedia Layer) library…

The different steps of a Data Science Process. This blog explores the methods involved in step 4

A critical step in the data science process is Exploratory Data Analysis, or EDA. Linus Torvalds once said, “Bad programmers worry about the code. Good programmers worry about data structures and their relationships.” Although Torvalds said this in the context of programming, it applies equally well in data science. Examining your data thoroughly to understand its underlying structure is imperative to building good models and improving on them.

Recently, when I started on the EDA step of my first data science project, I searched for the best practices for this step. I…

A beginner’s journey using simple Python


If you’ve worked in a managerial role, you have most probably spent hours every week trying to put together the weekly staff schedule. When you have different roles, shifts, availabilities, and other variables to consider, it only gets more complicated. So, I wanted to attempt to model a kitchen staff schedule. Since this is my initial take on the problem, I have made only simple assumptions here. The goal is to make a function that takes an employee’s name, available days, available time…

