This Week I Learned: 2020 W22
A new-to-me SQL clause and a Ruby idiom
Published May 31, 2020
When getting used to a new application (or domain area), I’ve found it useful to look at its data. In particular, in my team’s region of the Rails monolith, I’ve learned so much from having SQL access to a sanitized read replica. It’s been mostly simple questions, really:
How many records are in this table? For example, if
select count(*) from treturns less than 100 records, it’s likely to be some set of domain constants.
What is the distribution of enumerated values? First, if there’s a column named
status(or similar), you’re dealing with some kind of lifecycle. What are its valid states? (
select distinct(status) from t) How many items are currently in each of those states? (
select status, count(*) from t group by status)
If there are date fields (that aren’t the typical
updated_at), how are those distributed? I can’t think of any generalizable examples for this. But in Payments Engineering, this is how you could discover that our world runs in weekly, monthly, and yearly cycles.
This week, I was exploring the lifecycle status and date distributions for a particular table. The question I wanted to answer was this: Which users started using the product today?
I had enough SQL experience to know that this would get me all the user IDs with the date they first used the product:
(I changed the domain-specific names to make the interesting part clearer.)
But only getting the users whose first usage date was today (for example), looks like it would need a nested query, because you can’t filter based on the value of an aggregate function.
Unless you know about
HAVING, since it lets you do exactly that!
Here’s the tweak that makes this work:
first_usage_date alias isn’t necessary, but I think it’s more readable that way.)
I suspect this is part of ANSI SQL, but I don’t feel like paying $185 to buy a copy of the standard from ISO. So if you want to read up on the actual mechanics, check your database’s documentation. The Postgresql docs probably apply even if you’re using something else.
extend self idiom
extend allows you to add a module’s methods to a given object (which I know as a “mixin”).
This has been written about many times elsewhere, so I’ll skip the explanation here.
What I learned about this week was the
extend self idiom:
Usually, the instance method
hello would only be useful when you use
include M in another class to add
M’s methods to that class.
This trick uses
extend (instead of
include) to expose
M’s instance methods as class methods too!
Ruby global variable tracing
I encountered my first real-world usage of a Ruby global variable this week:
$stats is a handle to the metrics API client.
I recognized it because the dollar-variable syntax is taught in most Ruby intros (shoutout to Learn X in Y minutes).
It’s usually followed by some kind of warning to avoid global state.
This is surprising to me because, so far, my struggles with Ruby have been all about everything being globally visible all the time.
In trying to figure out the differences between a “real” global variable and something defined at the top-level (technically on
Object), I actually found one way they’re better:
trace_var lets you set up a handle to run on every assignment to that variable.
(Technically you can do this by defining the getter-setter method pair for any given name, but with globals you get it for free.)
The intended use case seems to be logging the newly-set value:
Set $x to 10
The docs don’t seem to mention this, so it may not be guaranteed, but I consistently see these callbacks handled as a stack (last-in first-out):
Second handler First handler
Naturally, the next thing I tried to do was get a pair of callbacks to enter an infinite loop, but that didn’t seem to work:
$x is now 0 Surprise $y => 1 $x is now 1 $y is now 2
and exits cleanly.
Sadly, the Perl-like special variables aren’t truly globals, so this doesn’t print anything special: