There’s not much to update this week; I’ve mostly been making lifestyle adjustments in preparation for my remote internship, which starts in a week. Comments are open for the first time on this blog. It was a daunting decision, but a worthwhile one: my social media addiction post last week got a really good response, with two comments on the post itself, several people reaching out on private channels, and some good conversations started.

Trying to understand social media addiction

I’ve spent a little bit of time combing through research papers about social media, mostly via Google Scholar. I didn’t have a particular goal in mind when setting out on this exploration, but I was primarily interested in the links between social media usage and mental health, as well as analyses of social media addiction itself (how prevalent it is, the forms it takes, how its patterns differ across cultures, etc.).

Logs 2020-05-11: Sequences of Sets

Here’s a quick little math one. In a probability theory class that I was sitting in on, one of the core concepts taught was the limit inferior and limit superior of a sequence of sets. If $$A_n$$ is a sequence of sets (all subsets of some larger set $$E$$), then the two constructs are also known as $$A_n$$ almost always and $$A_n$$ infinitely often, respectively.
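Concretely, the two constructs are defined as

$$\limsup_{n \to \infty} A_n = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k, \qquad \liminf_{n \to \infty} A_n = \bigcup_{n=1}^{\infty} \bigcap_{k=n}^{\infty} A_k$$

A point lies in the limit superior exactly when it belongs to infinitely many of the $$A_n$$ (hence “infinitely often”), and in the limit inferior exactly when it belongs to all but finitely many of them (hence “almost always”).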

Quarantine Logs 2020-04-13: Circuit-Sim Progress!

Been a week! This is by far the most frequently I’ve ever posted. I’m hoping to keep it up.

I’m happy to report that I’ve made a little progress on the circuit simulator I’ve been working on. Here I’ll get into what that’s all about in a bit more detail. All the relevant code is up on GitHub, though it’s bare-bones and undocumented as of the time of writing.

Learning about circuit analysis introduced me to the concept of nodes. A node is a point in a circuit where two or more components meet. This concept is important because of Kirchhoff’s Current Law, which states that the algebraic sum of the currents leaving a node is 0.

$$\text{Current leaving node } i = \sum_{j \in N(i)} I_{i,j} = 0$$
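To make the law concrete, here’s a tiny sketch (hypothetical code, not the simulator’s actual implementation) that applies KCL to the middle node of a simple voltage divider: a 10 V source in series with two resistors. Solving the KCL equation for the node voltage and then checking that the branch currents leaving the node sum to zero:

```python
# Hypothetical nodal-analysis sketch: a v_source volt source drives r1 and r2
# in series; we solve KCL at the middle node, then verify the law holds.

def solve_middle_node(v_source, r1, r2):
    # KCL at the middle node: (v - v_source)/r1 + (v - 0)/r2 = 0
    # Solving for v gives the familiar voltage-divider formula.
    return v_source * r2 / (r1 + r2)

v = solve_middle_node(10.0, 1000.0, 2000.0)
i_toward_source = (v - 10.0) / 1000.0  # current leaving node toward the source
i_toward_ground = (v - 0.0) / 2000.0   # current leaving node toward ground

# KCL: the currents leaving the node sum to (numerically) zero
assert abs(i_toward_source + i_toward_ground) < 1e-12
```

In a full simulator, writing one such equation per node yields a linear system whose solution gives every node voltage at once.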

Here’s an example:

Moving to Hugo and Netlify

I just moved this website over from my makeshift homemade setup on my self-hosted Digital Ocean box to a more convenient stack. See the very first post here for what the old stack looked like. I’m using Hugo with the whiteplain theme, keeping some of the simplicity of the old design, and I’m currently hosting on Netlify. It takes a nontrivial amount of work to migrate a website across setups like this.
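For anyone curious what a setup like this involves, the Hugo side mostly comes down to a small config file at the site root (values here are illustrative, not my actual config):

```toml
# config.toml — minimal Hugo site configuration
baseURL = "https://example.com/"
title   = "My Blog"
theme   = "whiteplain"
```

On the Netlify side, the build command is `hugo` and the publish directory is `public`, which can be set either in the Netlify UI or in a `netlify.toml` file in the repo.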

CockroachDB internship project: Speeding up some interleaved table deletes by a factor of 10 billion

Last summer, I did an internship with Cockroach Labs, makers of CockroachDB, a SQL database built for massive distribution. I was working on the SQL language semantics in Cockroach, and I was able to work on many different facets of the project in that area.

Overall, my theme for the summer was finding ways to improve the performance of mutation statements - that's your INSERTs, UPDATEs, and DELETEs. At the tail end of the internship, I was able to contribute a major performance gain by adding a fast path to a particular kind of DELETE involving a kind of table called an interleaved table. This post is about that performance fix and how it works.

All the work described in this post actually comes from this pull request.

Trying to organize my Twitter timeline, using unsupervised learning

I'm a frequent user of Twitter, but I realize that among the major social networks it could be the hardest to get into. One of the big obstacles for me was that, as I followed more and more people representing my different interests, my timeline became an overcrowded mess with too many different types of content. For example, at one point I started following many tech-related accounts and comic book art-related accounts at the same time, and when I would go on Twitter I could never reasonably choose to consume content from only one of the groups.

Even after learning to adapt to this, I still thought it would be nice to be able to detect distinct groups among the Twitter accounts I followed. The impetus to finally start a project on this came when I began using cluster analysis algorithms in my machine learning class: they seemed like exactly the right tool for this kind of community detection. With that, I set off to collect the data from my own Twitter follow list and analyze it with clustering.
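The core idea can be sketched in a few lines. This is a toy illustration, not the project's actual code: represent each followed account as a feature vector (in practice something like follower overlap or bio keywords; here just hand-made 2-D points), then group the vectors with a tiny k-means.

```python
# Toy k-means sketch for grouping followed accounts by feature vector.
def kmeans(points, k, iters=20):
    # Seed the centroids with the first k points, for determinism.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        # Move each centroid to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:
                centroids[i] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return centroids, clusters

# Two obvious groups: "tech" accounts near (1, 0), "comic art" accounts near (0, 1).
points = [(1.0, 0.1), (0.0, 1.0), (0.9, 0.0),
          (1.1, 0.2), (0.1, 0.9), (0.2, 1.1)]
centroids, clusters = kmeans(points, k=2)
```

On real follow data the vectors are much higher-dimensional and the cluster count isn’t known in advance, which is a big part of why the results are still a work in progress.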

The work I've done since then is still in progress (mostly because the results I'm getting aren't that great yet), and as I make more progress I'll be making more posts about it!

All the code is available on Github.

More details below!

**This post reflects the technology used in an earlier version of this website.** I’ve just implemented RSS on my blog. You can find the feed at /recent.atom and /index.xml. Implementing this was relatively simple: there was a library for it! The Flask website had a tutorial that was really easy to adapt to my own database models. Anyway, I guess that means you can follow this blog in your favourite reader client.
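The feed itself is just XML in the Atom schema, so even without a dedicated library it can be built from post records in a few lines. This is a stack-agnostic sketch using only the Python standard library (the site itself used a Flask extension; the function and field names here are hypothetical):

```python
# Build a minimal Atom feed from a list of post records.
from xml.etree import ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def build_feed(title, site_url, posts):
    ET.register_namespace("", ATOM)  # serialize without a namespace prefix
    feed = ET.Element(f"{{{ATOM}}}feed")
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    ET.SubElement(feed, f"{{{ATOM}}}id").text = site_url
    for post in posts:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = post["title"]
        ET.SubElement(entry, f"{{{ATOM}}}id").text = post["url"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = post["updated"]
    return ET.tostring(feed, encoding="unicode")

xml = build_feed("My Blog", "https://example.com/",
                 [{"title": "Hello", "url": "https://example.com/hello",
                   "updated": "2020-01-01T00:00:00Z"}])
```

A web framework then only needs to serve this string at the feed URL with the `application/atom+xml` content type.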

Destroying Cockroaches and the Hackathon Experience

On the weekend of March 18th-19th, my fellow UBC students Gareth Ellis, Alexander Hoar, and Jeffrey Doyle and I worked together (and lost a ton of sleep) at the hackathon nwHacks 2017. One of the new and more prominent sponsors of the event was Cockroach Labs, creators of the distributed database CockroachDB, and we thought it'd be fun to build a project around CockroachDB and shoot for the "Best Use of CockroachDB" sub-contest they were running (with a nice cash prize!).

CockroachDB is a SQL database that sets itself apart from other relational database systems by being distributed and highly fault-tolerant. Leveraging the Raft algorithm to reach consensus across nodes in a cluster, it's able to create a CP (consistent and partition-tolerant) system while at the same time being Highly Available (source). When I installed it for the first time in the days leading up to the hackathon, I was surprised at how easy it was to set up a cluster (just start one instance and let the others join it!) and use the admin interface present on each node. We decided to do a project that would exemplify CockroachDB's strengths by stress-testing a cluster, attempting to disrupt the consistency of the system.

Our project is DESTROY ALL ROACHES!! (GitHub repo), a JavaScript-based game where you kill cockroaches on screen by remaining in close proximity to them. Each on-screen cockroach corresponds to an actual instance of CockroachDB running in the server's backend (where all of the cockroaches are part of the same cluster). Every time the player kills a cockroach, the web server executes a kill -9 command on the particular instance that the cockroach was associated with. The server spawns new cockroaches (and new instances) based on the ratio between the number of cockroaches killed and the number spawned at the beginning of the game session.
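As a rough illustration of that respawn bookkeeping, here's a hypothetical sketch (the real game was written in JavaScript, and the exact rule isn't spelled out in the post; this is just one plausible reading of "ratio of killed to initially spawned"):

```python
# Hypothetical respawn rule: top the swarm back up once the player
# has killed at least half of the roaches spawned at session start.
def respawn_count(killed, spawned_at_start):
    """Number of new cockroaches (each backed by a fresh CockroachDB
    instance) to spawn, given the current kill ratio."""
    if spawned_at_start == 0:
        return 0
    ratio = killed / spawned_at_start
    return killed if ratio >= 0.5 else 0
```

In the real game, each spawn also meant launching another CockroachDB process and having it join the existing cluster, which is what made the whole thing such a fun abuse of the database.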

Our project was quite challenging to implement and is built on quite a few hacks, which was expected given that we were deliberately misusing CockroachDB. I'm really proud of the four of us for finishing each of our parts of the project, and integrating them all together, within 24 hours.

Technical details below!