I like EdX. Today, I reached the point in one of my paid courses where I can view my certificate (I have a passing grade even though I haven’t reached the end of the course). There is real satisfaction in earning a certificate, even if you aren’t sure it will help you.

EdX’s “Using Python for Research”, a MOOC offered by HarvardX, is great. I really liked how it was taught. Some parts were challenging to understand, but the exercises and homework were not as hard as I expected based on the material. It can feel anti-climactic: the material builds up, and then the exercises turn out to be, thankfully, not too hard. Being stuck in an online course is not a good experience, especially when there isn’t much support. If this were taught in person, I imagine the exams would be a lot harder than the MOOC’s homework. Some students noted that the instructor did not teach everything, but this is understandable given the scope and limitations of the course (it is only a 4-week course). For example, in the k-nearest neighbors discussion, the code used to plot the data was not explained. The course did, however, cover matplotlib in another part, so it understandably did not need to walk through this code, which is an application of the plotting topic.

Code for plotting the results of a kNN analysis shown above:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

def plot_prediction_grid(xx, yy, prediction_grid, filename):
    """Plot kNN predictions for every point on the grid."""
    # Note: predictors and outcomes are read from the enclosing (global) scope.
    background_colormap = ListedColormap(["hotpink", "lightskyblue", "yellowgreen"])
    observation_colormap = ListedColormap(["red", "blue", "green"])
    plt.figure(figsize=(10, 10))
    # Shade each grid cell by its predicted class.
    plt.pcolormesh(xx, yy, prediction_grid, cmap=background_colormap, alpha=0.5)
    # Overlay the observed points, colored by their actual class.
    plt.scatter(predictors[:, 0], predictors[:, 1], c=outcomes, cmap=observation_colormap, s=50)
    plt.xlabel('Variable 1'); plt.ylabel('Variable 2')
    plt.xticks(()); plt.yticks(())
    plt.xlim(np.min(xx), np.max(xx))
    plt.ylim(np.min(yy), np.max(yy))
    plt.savefig(filename)  # write the figure to the given file
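
For context, here is a rough sketch of how the inputs to a plotting function like this might be produced. It is not the course's code: the helper names `knn_predict` and `make_prediction_grid`, the toy data, and the grid settings are all my own assumptions, and the classifier is a plain majority vote over Euclidean distances.

```python
import numpy as np

def knn_predict(point, predictors, outcomes, k=5):
    """Predict the class of `point` by majority vote among its k nearest neighbors."""
    # Euclidean distance from the query point to every observation.
    distances = np.sqrt(np.sum((predictors - point) ** 2, axis=1))
    nearest = np.argsort(distances)[:k]
    # Count votes per class; ties go to the smallest class label.
    return int(np.bincount(outcomes[nearest]).argmax())

def make_prediction_grid(predictors, outcomes, limits, h, k):
    """Classify every point of a regular grid covering `limits`."""
    x_min, x_max, y_min, y_max = limits
    xs = np.arange(x_min, x_max, h)
    ys = np.arange(y_min, y_max, h)
    xx, yy = np.meshgrid(xs, ys)
    prediction_grid = np.zeros(xx.shape, dtype=int)
    for i, x in enumerate(xs):
        for j, y in enumerate(ys):
            # Rows index y and columns index x, matching np.meshgrid.
            prediction_grid[j, i] = knn_predict(np.array([x, y]), predictors, outcomes, k)
    return xx, yy, prediction_grid

# Toy data (made up for illustration): two well-separated classes.
predictors = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 0.0],
                       [4.0, 4.0], [4.5, 3.5], [5.0, 4.0]])
outcomes = np.array([0, 0, 0, 1, 1, 1])

xx, yy, prediction_grid = make_prediction_grid(
    predictors, outcomes, limits=(0, 5, 0, 5), h=1.0, k=3)
```

With `predictors` and `outcomes` defined at module level like this, the resulting `xx`, `yy` and `prediction_grid` are the kind of arrays that `plot_prediction_grid` expects as its first three arguments.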

I liked all the case studies in the course, and I think they are all helpful for getting started with data science. It is a good follow-up to EdX’s “Introduction to Computing Using Python”, though it might be confusing for a student without basic statistics and some introductory machine learning or multivariate data analysis background. For me, taking this course was like doing data science from scratch (just like the book “Data Science from Scratch” by J. Grus, which I had gone over but never finished; I prefer this course over the book). My only criticism is that the course could have taken advantage of Jupyter Notebook, which I used to take notes, try the examples, and do the exercises. Still, I highly recommend this course.