When I was writing the Jekyll integration for JamComments, I started reminiscing about some of the features I really like about Ruby (it had been a minute since I wrote much of it). One of the first that came to mind was the conditional assignment operator, often used to memoize values:
def results
@results ||= calculate_results
end
If you're unfamiliar, the @results instance variable will only be assigned if it's currently falsey (nil or false). It's a nice way to ensure an expensive operation is performed only when it's needed and never more than once.
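To make that concrete, here's a minimal sketch. The Report class, its call counter, and the body of calculate_results are all made up for illustration; the counter is only there to show the expensive method runs a single time.

```ruby
# Hypothetical class; calculate_results stands in for any expensive operation.
class Report
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def results
    # Roughly equivalent to: @results || (@results = calculate_results)
    @results ||= calculate_results
  end

  private

  def calculate_results
    @calls += 1 # count how often the expensive work actually happens
    [1, 2, 3]
  end
end

report = Report.new
report.results
report.results
report.calls # => 1
```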
For one-liners like this, it's straightforward. But sometimes, a little more complexity may require multiple lines of code, like if you were to fetch results from an external service. In that case, memoization isn't as elegant. That's where another neat Ruby feature can help retain that elegance. But first, let's flesh out a scenario.
Here's a GitHubRepo class for fetching repository data from the GitHub API. It handles making the request and accessing particular data we want from the response.
require 'httparty'
class GitHubRepo
attr_reader :name
def initialize(name:)
@name = name
end
def license
repo.dig('license', 'key')
end
def stars
repo['stargazers_count']
end
private
def repo
puts "fetching repo!"
response = HTTParty.get("https://api.github.com/repos/#{name}")
JSON.parse(response.body)
end
end
Spin it up by passing in a repository name:
repo = GitHubRepo.new(name: 'alexmacarthur/typeit')
puts "License: #{repo.license}"
puts "Star Count: #{repo.stars}"
Unsurprisingly, "fetching repo!" would be output twice, since the repo method is called twice with no memoization. We could solve that by manually checking and setting a @repo instance variable:
# Inside class...
def repo
# Check if it's already set.
return @repo unless @repo.nil?
puts 'fetching repo!'
response = HTTParty.get("https://api.github.com/repos/#{name}")
# Set it.
@repo = JSON.parse(response.body)
end
But like I said, that's not as elegant. I don't love needing to check if @repo is nil myself, and then setting it in a different branch of logic.
.tap Shines
Ruby's .tap method is really helpful in moments like this. It exists on the Object class, and as the docs describe, it "yields self to the block, and then returns self." So, memoizing an HTTP response cleans up a bit better:
# Inside class...
def repo
@repo ||= {}.tap do |repo_data|
puts 'fetching repo!'
response = HTTParty.get("https://api.github.com/repos/#{name}")
repo_data.merge!(JSON.parse(response.body))
end
end
Explained: We start with an empty hash {} as a "default" value, which is then "tapped" and provided to the block as repo_data. Then, we can spend as many lines as we want in that self-contained block building repo_data as desired before it's implicitly returned. And that block is behind a conditional assignment operator ||=, so future repo calls will just return the @repo instance variable. No variable checking. One code path. Slightly more cultivated, in my opinion.
But there's a potential kicker in there that's nabbed me a few times: .tap will always return the object it was called on, no matter what you do within the block. And that means if you want a particular value to come out of it, you have to mutate that object. This would be pointless:
repo_data = repo_data.merge(JSON.parse(response.body))
It would have simply reassigned the local variable, and the original reference would still have been returned unchanged. But using the "bang" version of merge does work because it's modifying the object that repo_data refers to.
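A small side-by-side sketch of that gotcha, with no HTTP involved. The fetched hash is a made-up stand-in for a parsed API response:

```ruby
fetched = { 'stargazers_count' => 42 } # stand-in for a parsed API response

# Reassigning the block parameter only rebinds the local name;
# .tap still returns the original, untouched hash.
reassigned = {}.tap do |repo_data|
  repo_data = repo_data.merge(fetched)
end

# Mutating in place changes the very object that .tap returns.
mutated = {}.tap do |repo_data|
  repo_data.merge!(fetched)
end

reassigned.empty?              # => true
mutated['stargazers_count']    # => 42
```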
Why not just use a begin block?
Yep, something like this would definitely work, and there are some solid advantages to it, like less code and no mutations.
# Inside class...
def repo
@repo ||= begin
puts 'fetching repo!'
response = HTTParty.get("https://api.github.com/repos/#{name}")
JSON.parse(response.body)
end
end
The reason I tend to prefer .tap is that (I feel like) it gives me a little more control over the shape of the object I'm building. In cases like this, there's nothing I can do to guarantee that the response body will be modeled in a particular way. Using .tap streamlines the building of my hash exactly how I want, and makes it easy to fall back to default values if certain properties aren't found.
# Inside class...
def repo
@repo ||= {}.tap do |repo_data|
puts 'fetching repo!'
response = HTTParty.get("https://api.github.com/repos/#{name}")
data = JSON.parse(response.body)
# Shaping the hash exactly how I want it:
repo_data['license'] = data.dig('license', 'key') || "unknown"
repo_data['stargazers_count'] = data['stargazers_count']
end
end
Not to mention, by starting with that empty hash, we're guaranteeing that the request would never be performed again, even if the response resolves to something falsey.
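Here's a rough sketch of that difference. The Lookup class and its fetch_nothing method are hypothetical stand-ins for a request whose parsed result comes back as nil, and the counter just tracks how many "requests" happen:

```ruby
# Hypothetical class; fetch_nothing simulates a request that resolves to nil.
class Lookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # begin-block version: nil is falsey, so ||= retries on every call.
  def with_begin
    @with_begin ||= begin
      fetch_nothing
    end
  end

  # .tap version: the hash is truthy even if it stays empty,
  # so the block runs only once.
  def with_tap
    @with_tap ||= {}.tap do |data|
      data.merge!(fetch_nothing || {})
    end
  end

  private

  def fetch_nothing
    @calls += 1
    nil
  end
end

a = Lookup.new
2.times { a.with_begin }
a.calls # => 2

b = Lookup.new
2.times { b.with_tap }
b.calls # => 1
```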
That said, the distinction probably doesn't matter that much. Make your own choices.
You can tell a feature is valuable when other languages or frameworks adopt their own version of it, and this is one of those. I'm aware of just a couple, but I'm sure there are more.
Laravel (PHP), for example, exposes a global tap() helper function, and there's even a Tappable trait, which is used to add a tap method to several classes within the framework. Moreover, Taylor Otwell has even said its inspiration was found in Ruby. Here's a snippet robbed straight from their documentation:
$user = tap(User::first(), function (User $user) {
$user->name = 'taylor';
$user->save();
});
And here's how that helper could be used to memoize our GitHub API request. As you can see, it works nicely with PHP's null coalescing operator:
// Inside class...
private function repo()
{
// Perform request only if property is empty.
return $this->repo = $this->repo ?? tap([], function(&$repoData) {
echo 'Fetching repo!';
$client = new Client();
$response = $client->request('GET', "https://api.github.com/repos/{$this->name}");
$repoData += json_decode($response->getBody(), true);
});
}
Kotlin actually has a couple of tools similar to .tap. The .apply and .also functions permit you to mutate an object reference that's implicitly returned at the end of a lambda:
val repo = mutableMapOf<Any, Any>().also { repoData ->
print("fetching repo!")
val response = get("https://api.github.com/repos/$name")
jacksonObjectMapper().readValue<Map<String, Any>>(response.text).also { fetchedData ->
repoData.putAll(fetchedData)
}
}
But for memoization, you don't even need them. Kotlin's lazy delegate will automatically memoize the result of the self-contained block that follows it.
// Inside class...
private val repoData: Map<String, Any> by lazy {
print("fetching repo!")
val response = get("https://api.github.com/repos/$name")
jacksonObjectMapper().readValue<Map<String, Any>>(response.text)
}
Sadly, God's language, JavaScript, doesn't have a built-in .tap method, but you could borrow Lodash's implementation. Or, if you're feeling particularly dangerous, tack it onto the Object prototype yourself. Continuing with the repository fetch example:
Object.prototype.tap = async function(cb) {
await cb(this);
return this;
};
// Create an empty object for storing the repo data.
const repoData = await Object.create({}).tap(async function (o) {
const response = await fetch(
`https://api.github.com/repos/alexmacarthur/typeit`
);
const data = await response.json();
// Mutate tapped object.
Object.assign(o, data);
});
For memoization, this would pair decently with the new-ish nullish coalescing operator. Say we were in the context of a class like before:
// Inside class...
async getRepo() {
this.repo = this.repo ?? await Object.create({}).tap(async (o) => {
console.log("fetching repo!");
const response = await fetch(
`https://api.github.com/repos/alexmacarthur/typeit`
);
const data = await response.json();
Object.assign(o, data);
});
return this.repo;
}
Still not the level of elegance that Ruby offers, but it's getting there.
Like I mentioned, it's been a little while since I've dabbled in Ruby, spending most of my time as of late in Kotlin, PHP, and JavaScript. But I think that sabbatical has given me a more comprehensive, renewed perspective on the language, and helped me to appreciate the experience it offers despite the things I don't prefer so much (there are some). Hoping I continue to identify these lost gems!
Thank you to Jason, a far-above-average golfer, who taught me Ruby tricks like this.
Just use memo_wise or similar
I have no problem w/ that. But it’s not always desirable to introduce another dependency, and often simple enough to roll it yourself.
# the cleanest way is to use Memery:
include Memery
memoize def repo
response = HTTParty.get("https://api.github.com/repos/#{name}")
JSON.parse(response.body)
end
Related: https://ilya-sher.org/2022/12/31/the-new-life-of-tap/
There is a tiny typo in this statement: "@results instance variable if will only be set if it's falsey". Shouldn't it rather be "@results instance variable will only be set if it's falsey"? (The extra 'if' was removed.)
Good catch. Thanks!!
IMO instead of creating additional nesting it's much better to just do early return
def fn
return @fn if @fn
@fn = ...
end
“Much better” is pretty subjective, lol. But yes, I see the appeal of fewer indents. I personally like wrapping the expensive logic inside some sort of block and not needing to touch the instance variable as often.
You can avoid the mutation and still have total control:
@repo ||= HTTParty.get("https://api.github.com/repos/#{name}").then do |response|
data = JSON.parse(response.body)
{
license: data.dig('license', 'key') || "unknown",
stargazers_count: data['stargazers_count']
}
end
Yep, that seems to be a pretty common preference people have. Might see myself moving toward it in the future.
.tap has the benefit of keeping everything about the memoisation in the block, but at the cost of an indent.
An alternative approach I've recommended for years (and describe in https://dev.to/epigene/memoization-in-ruby-2835), is the "defined?" approach.
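For anyone who hasn't seen it, the defined? technique generally looks something like the sketch below. The linked post may present it differently, and the Settings class, feature_flag method, and counter are invented for illustration. The appeal is that it memoizes nil and false too, which ||= can't do.

```ruby
# Hypothetical class; load_flag stands in for an expensive lookup
# that legitimately returns false.
class Settings
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def feature_flag
    # Once the variable exists, return it as-is, even if it's nil or false.
    return @feature_flag if defined?(@feature_flag)

    @feature_flag = load_flag
  end

  private

  def load_flag
    @calls += 1
    false
  end
end

settings = Settings.new
3.times { settings.feature_flag }
settings.calls # => 1
```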
I totally get the indent concern and why others opt for an alternative. I admittedly don’t have a huge history of Ruby experience, so I can see myself shifting in preference as time goes on. Good overview of options in that post, btw!
Can't say I see the benefit. The difference between your approach and using a begin-block is just that you add a method invocation of #tap. As Fabio says, using #then is also generally better, but it doesn't make a difference in your case. The way I'd do it is,
@repo ||= HTTParty
.tap { puts 'fetching repo!' }
.then { _1.get("https://api.github.com/repos/#{name}") }
.then { JSON.parse(_1.body) rescue {} }
.then do |data|
# Shaping the hash exactly how I want it:
{ 'license' => data.dig('license', 'key') || "unknown",
'stargazers_count' => data['stargazers_count']
}
end
But I don't generally use HTTParty anyways since it doesn't lend itself to this kind of composability. I'd prefer RestClient::Resource.new("https://api.github.com/repos/#{name}").tap { puts "fetching repo!" }.then { _1.get } for that bit.
Me again! Here’s what I think may be the most expressive way to write this. It shows a clear progression of the data being mutated and still gives the ultimate benefit of memoization using your technique. It’s a little bit like piping the output of one command through to another in *nix shells.
def repo
@repo ||= nil
.tap { puts 'fetching repo!' }
.then { HTTParty.get("https://api.github.com/repos/#{name}") }
.then { |response| JSON.parse(response.body) }
.then { |parsed| parsed || {} }
end
Cool idea!
I wanted to mention that there is a similar method to #tap called #then. It also yields self to the block, but it returns the value of the block. So, in that case, using merge (instead of mutating it with merge!) would work. Hope that’s helpful!
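A quick sketch of that contrast, again with a made-up fetched hash standing in for a parsed response:

```ruby
fetched = { 'stargazers_count' => 42 } # stand-in for a parsed API response

# #tap ignores the block's return value and hands back the receiver.
tapped = {}.tap { |h| h.merge(fetched) }

# #then returns whatever the block evaluates to.
chained = {}.then { |h| h.merge(fetched) }

tapped.empty?                 # => true
chained['stargazers_count']   # => 42
```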