Articles of Debasish Bose

36 posts

Differential Flow Plot For User Navigation

After a major UX change rolled out across 3 major online mastheads, I thought it would be interesting to plot how readers' navigation behaviour changed, before versus after, as they browse through different page types. A Sankey diagram is a good visualization for this kind of problem. However, instead of absolute values, I have plotted differential numbers (in %). By default, the Sankey plugin for D3 doesn't support loops or cycles. Luckily, I found one that does at http://bl.ocks.org/soxofaan/7c96560677ead0425fe7. You

Data Journalism Case Study of Self-Employment And Entrepreneurship

For a while, I had been looking for a representative dataset to study the various factors (or predictors, in the terminology of statistics) influencing self-employment or entrepreneurship. Although Crunchbase and LinkedIn have some fantastic datasets, harvesting data from either of these sources is difficult, if not impossible. The only other public dataset with good demographic data is census data. The US has multiple "census-like" surveys to capture this kind of data - Decennial census (10 year) - census.gov ACS or American Community Survey (1

NSW Anti-Discrimination Board Statistics: Trend Analysis

An effort to visualize the trend in racial complaints reported to the Anti-Discrimination Board, NSW, Australia. These incidents were presumably serious enough to be reported, and thus form only a tiny sample compared to the whole population of unreported ones. Source: http://data.gov.au/dataset?q=discrimination. As no CSV is available, I had to manually download the HTML data and stitch things together. Here is the code (main function fetch_discrimination_data) to download the HTML data using rvest: substrRight <- function(

Trends in Residex: Housing Price Index of India

To justify an upcoming property investment in India (and to narrow it down to an appropriate city), I was looking for a government-published dataset on residential housing prices. After accidentally discovering the National Housing Bank (http://www.nhb.org.in/), I searched for a CSV. Unfortunately, there was none. The closest thing was http://www.nhb.org.in/Residex/Data&Graphs.php. Even if I have to scrape it, so be it! Let's use the rvest library for the scraping. It