The Bike Shed
336: Million Dollar Password
Chris came up with a mnemonic device: Fn-Delete – for when he really wants to delete something and is also thinking about password complexity requirements, which leads to an exciting discussion around security theater.
Steph talks about the upcoming RailsConf and the not-in-person option for virtual attendees. She also gives a shoutout to the Ruby Weekly newsletter for being awesome.
NIST Password Standards
3 ActiveRecord Mistakes That Slow Down Rails Apps: Count, Where and Present
Difference between count, length and size in an association with ActiveRecord
Ruby Weekly
RailsConf 2022
Become a Sponsor of The Bike Shed!
Transcript:
STEPH: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Steph Viccari.
CHRIS: And I'm Chris Toomey.
STEPH: And together, we're here to share a bit of what we've learned along the way. So hey, Chris, happy Friday. You know, each time I do that, I can't resist the urge to say happy Friday, but then I realize people aren't listening on a Friday. So happy day to anyone that's listening. What's new in your world, friend?
CHRIS: I'm going to be honest; you threw me for a loop there. [laughs] I think it was the most recent episode where we talked about my very specific...[laughs] it's a lovely Friday, that's true. There's sun and clouds. Those are true things. But yeah, what's new in my world? [laughs] I can do this. I can focus. I got this.
Actually, I have one thing. So this is going to be, I'm going to say vaguely selfish, but I have this thing that I've been trying to commit into my brain for a long time, and I just can't get it to stick. So today, I came up with like a mnemonic device for it. And I'm going to share it on The Bike Shed because maybe it'll be useful for other people. And then hopefully, in quote, unquote, "teaching it," I will deeply learn it.
So the thing that happens in my world is occasionally, I want to delete a URL from Chrome's autocomplete. To be more specific, because it's easier for people to run away with that idea, it's The Weather Channel. I do not like weather.com. I try to type weather often, and I just want Google to show me the little, very quick pop-up thing there. I don't want any ads. I don't want to deal with that.
But somehow, often, weather.com ends up in my results. I somehow accidentally click on it. It just gets auto-populated, and then that's the first thing that happens whenever I type weather into the Omnibox in Chrome. And I get unhappy, and I deal with it for a while, then eventually I'm like, you know what? I'm deleting it. I'm getting it out of there. And then I try and remember whatever magical key combination it is that allows you to delete an entry from the drop-down list there. And I know it's a weird combination of like, Command-Shift-Alt-Delete, Backspace, something.
And every single time, it's the same. I'm like, I know it's weird, but let me try this one. How about that one? How about that one? I feel like I try every possible combination. It's like when you try and plug in a USB drive, and you're like, well, it's this way. No, it's the other way. Well, there are only two options, and I've already tried two things. How can I not have gotten it yet? But I got it now.
Okay, so on a Mac specifically, the key sequence is Shift-Function-Delete. So the way I'm going to remember this is Function is abbreviated on the keyboard as Fn. So that can be like I'm swearing, like, I'm very angry about this. And then Shift is the way to uppercase something like you're shouting. So I just really need to Fn-Delete this. So that's how I'm going to remember it. Now I've shared it with everyone else, and hopefully, some other folks can get utility out of that. But really, I hope that I remember it now that I've tried to boil it down to a memorable thing.
STEPH: [laughs] It's definitely memorable. I'm now going to remember just that I need to Fn-Delete this. And I'm not going to remember what it all is tied to. [laughs]
CHRIS: That is the power of a mnemonic device. Yeah.
STEPH: Like, I know this is useful in some way, but I can't remember what it is. But yeah, that's wonderful. I love it. That's something that I haven't had to do in a long time, and I hadn't thought about. I need to do that more. Because you're right, especially changing projects or things like that, there are just some URLs that I don't need cached anymore; I don't want auto-completed. So yeah, okay. I just need to Fn-Delete it. I'll remember it. Here we go. I'm speaking this into the universe, so it'll be true.
CHRIS: Just Fn-Delete it.
STEPH: Your bit about the USB and always getting it wrong, you get it 50-50 [laughs] by getting it wrong, resonates so deeply with me and my capability with directions where I am just terrible whether I have to go right or left. My inner compass is going to get it wrong. And I've even tried to trick myself where I'm like, okay, I know I'm always wrong. So what if I do the opposite of what Stephanie would do? And it's still somehow wrong. [laughs]
CHRIS: Somehow, your brain compensates and is like, oh, I know that we're going to do that. So let's...yeah, it's amazing the way these things happen.
STEPH: Yep. I don't understand it. I've tried to trick the software, but I haven't figured out the right way. I should probably just learn and get better at directions. But here we are. Here we are.
CHRIS: You just loosely referred to the software, but I think you're referring to the Steph software when you say that.
STEPH: Yes. Oh yeah, Steph software totally. You got it. [laughs]
CHRIS: Gotcha. Cool. Glad that I checked in on that because that's great. But shifting gears to something a little bit deeper in the technical space, this past week, we've been thinking about passwords within our organization at Sagewell. And we're trying to decide what we want to do. We had an initial card that came through and actually got most of the way to implemented to dial up our password strictness requirements. And as I saw that come through, I was like, oh, wait, actually, I would love to talk about this.
And so the work that was coming through in the PR that had been opened was a pretty traditional set of changes: let's introduce some complexity requirements on our passwords, so let's make them longer. I think six was the default minimum that Devise shipped with, so we're increasing that to, I think it was, eight. And then let's say that it needs a number, and a special character, and an uppercase letter or something like that.
I've recently read the NIST rules, so the National Institute of Standards and Technology, I think, is what they are. But they're the ones who define a set of rules around this or guidelines. But I think they are...I don't know if they are laws or what at this point. But they tell you, "This is what you should and shouldn't do." And I know that the password complexity stuff is on the don't do that list these days. So I was like, this is interesting, and then I wanted to follow through.
Interestingly, right now, I've got the Trello boards up for The Bike Shed right now. But as a result, I can't look at the linked Trello card that is on the workboards because they're in different accounts. And Trello really has made my life more difficult than I wanted. But I'm going to pull this up elsewhere. So let's see.
So NIST stuff, just to talk through that, we can include a link in the show notes to a nice summary. But what are the NIST password requirements? Eight character minimum, that's great. Change passwords only if there is evidence of a compromise. Screen new passwords against a list of known compromised passwords. That's a really interesting one. Skip password hints, limit the number of failed authentication attempts. These all sound great to me.
The maximum password length should be at least 64 characters, so don't constrain how much someone can put in. If they want to have a very long password, let them go for it. Don't have any sort of required rotation. Allow copy and pasting or functionality that allows for password managers.
And allow the use of all printable ASCII characters as well as all Unicode characters, including emojis. And that one really caught my attention. I was like, that sounds fun. I wish I could look at all the passwords in our database. I obviously can't because they're salted and encrypted, and hashed, and all those sorts of things where I'm like, I wonder if anybody's using emojis. I'm pretty sure we would just support it. But I'm kind of intrigued.
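The checklist Chris reads out can be sketched as a small validator. This is a minimal sketch, not Sagewell's or Devise's implementation: the method name `password_problems`, the constants, and the tiny `COMPROMISED` set are hypothetical stand-ins, and the 128-character cap is borrowed from the Devise default mentioned later in the episode.

```ruby
require "set"

# Hypothetical stand-in for a real breached-password list.
COMPROMISED = Set.new(%w[password 12345678 qwertyuiop]).freeze

MIN_LENGTH = 8    # the eight character minimum from the list
MAX_LENGTH = 128  # assumed cap; anything 64 or above fits the guidance

# Returns a list of problems; an empty list means the password is acceptable.
# Note what is absent: no uppercase, digit, or symbol rules, and no
# restriction on which characters may appear.
def password_problems(password)
  problems = []
  length = password.length # counts Unicode code points, not bytes
  problems << "is too short (minimum #{MIN_LENGTH} characters)" if length < MIN_LENGTH
  problems << "is too long (maximum #{MAX_LENGTH} characters)" if length > MAX_LENGTH
  problems << "appears in a list of known compromised passwords" if COMPROMISED.include?(password)
  problems
end
```

Under these assumptions a long lowercase passphrase or a string of eight emoji passes, while `password` is rejected only because it is on the compromised list.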
STEPH: You said something in that list that caught my attention, and I just want to see if I heard it correctly. So you said only offer change password if compromised? Does that mean I can't just change my password if I want to?
CHRIS: Sorry. Yeah, I think the phrasing here might be a little bit odd. So it's essentially a different way to phrase this requirement is don't require rotation of passwords every six or whatever months. Forgotten password that's still a reasonable thing to have in your application, probably a necessity in most applications. But don't auto-rotate passwords, so don't say, "Your password has expired after six months."
STEPH: Got it. Okay, cool. That makes sense. Then the emojis, oh no, it's like, I mean, I use a password manager now, thanks to you several years ago when you shamed me into using one. Thank you. That was great. [laughs]
CHRIS: I hope it was friendly shame, but yeah.
STEPH: Yes, it was friendly, kind shame, if that sounds like a weird sentence to say. But yes, it was a very positive change. And I can't go back now that I have a password manager in my life. Because yeah, now I'm thinking like, if I had emojis, I'd be like, oh great, now I have to think about how I was feeling at the time that I introduced a new password. Was I happy? Was I angry? Is it a poop emoji? Is it a unicorn? What is it? [laughs] So that feels complicated and novel.
You also mentioned on that list that going for more complexity in terms of you have to have uppercase; you have to have a particular symbol, things like that are not on the recommended list. And I didn't know that. I'm so accustomed to that being requirements for passwords and the idea of how we create something that is secure and less easy to guess or to essentially hack. So I'm curious about that one if you know any more details about it as to why that's not the standard anymore.
CHRIS: Yeah, I think I have some ideas around it. My understanding is mostly that introducing the password complexity requirements while intended to prevent people from using very common things like names or their user name or things like that, it's like, no, no, no, you can't because we've now constrained the system in that way. It tends in practice to lead to people having a variety of passwords that they forget all the time, and then they're using the forgotten password flow more often.
And it basically, for human and behavior reasons, increases the threat surface area because it means that they're not able to use...say someone has a password scheme in mind where it's like, well, my passwords are, you know, it's this common base, and then some number of things specific to the site. It's like, oh no, no, we require three special characters, so it's like they can't do their thing. And now they have to write it down on a Post-it Note because they're not going to remember it otherwise. Or there are a variety of ways in which those complexity requirements lead to behavior that's actually less useful.
STEPH: Okay, so it's the Post-it Note threat vector that we have to be worried about. [laughs]
CHRIS: Which is a very real threat vector.
STEPH: I believe it. [laughs] Yes, I know people that keep lists of passwords on paper near their desk. [laughs] This is a thing.
CHRIS: Yep, yep, yep. The other thing that's interesting is, as you think about it, password complexity requirements technically reduce the overall combinatoric space that the passwords can exist in. Because imagine that you're a password hacker, and you're like, I have no idea what this password is. All I have is an encrypted hashed salted value, and I'm trying to crack it. And so you know the algorithm, you know how many passes, you know potentially the salt because often that is available. I think it has to be available now that I think about that out loud.
But so you've got all these pieces, and you're like, I don't know, now it's time to guess. So what's a good guess of a password? And so if you know the minimum number of characters is eight and, the maximum is 12 because that actually happens on a lot of systems, that's actually not a huge combinatoric space. And then if you say, oh, and it has to have a number, and it has to have an uppercase letter, and it has to have a special character, you're just reducing the number of possible options in that space.
And so, although this is more like a mathematical thing, but in my mind, I'm like, yeah, wait, that actually makes things less secure because now there are fewer passwords to check because they don't meet the complexity requirements. So you don't even have to try them if you're trying to brute-force crack a password.
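The counting argument can be made concrete with a little inclusion-exclusion. This is a sketch under stated assumptions: the alphabet is the 95 printable ASCII characters (26 lowercase, 26 uppercase, 10 digits, 33 others including space), the rule is "at least one uppercase letter, one digit, and one special character", and the method names are made up for illustration.

```ruby
ALPHABET = 95 # printable ASCII, including space
REQUIRED = { upper: 26, digit: 10, special: 33 }.freeze

# Every string of the given length is a candidate.
def unconstrained(length)
  ALPHABET**length
end

# Strings containing at least one character from each required class,
# counted by inclusion-exclusion over the classes that could be missing.
def meeting_requirements(length)
  classes = REQUIRED.values
  (0..classes.size).sum do |k|
    classes.combination(k).sum do |missing|
      (-1)**k * (ALPHABET - missing.sum)**length
    end
  end
end

(8..12).each do |n|
  share = meeting_requirements(n).fdiv(unconstrained(n))
  puts format("length %2d: %.1f%% of candidates satisfy the rules", n, share * 100)
end
```

The constrained count is always a strict subset of the unconstrained one, which is the mathematical point being made; how much that matters in practice next to the human factors is a separate question.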
STEPH: Yeah, you make a really good point that I hadn't really thought about because I've definitely seen those sites that, yeah, constrain you in terms of like, has to have a minimum, has to have a maximum, and I hadn't really considered the fact that they are constraining it and then reducing the values that it could be. I am curious, though, because then it doesn't feel right to have no limit in terms of, like, you don't want people then just spamming your sign up and then putting something awful in there that has a ridiculous length. So do you have any thoughts on that and providing some sort of length requirement or length maximum?
CHRIS: Yeah, I think the idea is don't prevent someone who wants to put in a long passphrase, like, let them do that. But there is, the NIST guidelines specifically say 64 characters. Devise out of the box is 128, I believe. I don't think we tweaked that, and that's what we're at right now. So you can write an old-style tweet and that can be your password if that's what you want to do. But there is an upper limit to that. So there is a reasonable upper limit, but it should be very permissive to anyone who's like, I want to crank it up.
STEPH: Cool. Cool. Yeah, I just wanted to validate that; yeah, having an upper bound is still important.
CHRIS: Yeah, definitely. Important...it's more for implementation and our database having a reasonable size and those sorts of things. Although at the end of the day, the thing that we store is the encrypted password. So I don't know if bcrypt would run slower on a giant body of text versus a couple of characters; that might be the impact. So it would be speed as opposed to storage space because you always end up with a fixed-length hash, as far as I understand it.
But yeah, it's interesting little trade-offs like that where the complexity requirements do a good job of forcing people to not use very obvious things like password. Password does not fit nearly any complexity requirements. But we're going to try and deal with that in a different way. We don't want to try and prevent you from using password by saying you must use an uppercase letter and a special character and things that make real passwords harder as well. But it is an interesting trade-off because, technically, you're making the crackability easier. So it gets into the human and the technical and the interplay between them.
Thinking about it somewhat differently as well, there's all this stuff about you should salt your passwords, then you should hash them. You should run them through a good password hashing algorithm. So we're using bcrypt right now because I believe that's the default that Devise ships with. I've heard good things about Argon2; I think is the name of the new cool kid on the block in terms of password hashing. That whole world is very interesting to me, but at the end of the day, we can just go with Devise's defaults, and I'll feel pretty good about that and have a reasonable cost factor. Those all seem like smart things.
But then, as we start to think about the complexity requirements and especially as we start to interact with an audience like Sagewell's demographics where we're working with seniors who are perhaps less tech native, less familiar, we want to reduce the complexity there in terms of them thinking of and remembering their passwords. And so, rather than having those complexity requirements, which I think can do a good job but still make stuff harder, and how do you communicate the failure modes, et cetera, et cetera, we're switching it.
And the things that we're introducing are: we have increased the minimum length, so we're up to eight characters now, which is NIST's low-end recommendation, so it's between 8 and 128 characters. We are capturing anytime an "I forgot my password" reset attempt happens and the outcome of it. So we're storing those now in the database, and we're showing them to the admins.
So our admin team can see if password reset attempts have happened and if they were successful. That feels like good information to keep around. Technically, we could get it from the logs, but that's deeply hidden away and only really accessible to the developers. So we're now surfacing that information because it feels like a particularly pertinent thing for us.
We've introduced Rack::Attack. So we're throttling those attempts, and if someone tries to just brute force through that credential stuffing, as the terminology goes, we will lock them out so either based on IP address or the account that they're trying to log into. We also have Devise's lockable module enabled. So if someone tries to log in a bunch of times and fails, their account will go into a locked state, and then an admin can unlock it. But it gives us a little more control there. So a bunch of those are already in place.
The new one, this is the one that I'm most excited about, is we're going to introduce Have I Been Pwned? And so, they have an API. We can hit it. It's a really interesting model as to how do we ask if a password has been compromised without giving them the password? And it turns out there's this fun sort of cryptographic handshake thing that happens. K-anonymity is apparently the mechanism or the underpinning technology or idea.
Anyway, it's super cool; I'm excited to build it. It's going to be fun. But the idea there is rather than saying, "Don't use a password that might not be secure," it's, "Hey, we actually definitively know that your password has been cracked and is available in plaintext on the internet, so we're not going to let you use that one."
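The k-anonymity idea can be sketched without touching the network. With the Pwned Passwords range search, the client sends only the first five hex characters of the password's SHA-1 hash, receives every known hash suffix sharing that prefix as `SUFFIX:COUNT` lines, and does the final comparison locally. The method names below are hypothetical, and the HTTP request itself is deliberately left out; this is a minimal sketch of the local half only.

```ruby
require "digest"

# Split the uppercase SHA-1 hex digest into the 5-character prefix that is
# sent to the service and the 35-character suffix that never leaves us.
def hash_parts(password)
  digest = Digest::SHA1.hexdigest(password).upcase
  [digest[0, 5], digest[5..-1]]
end

# range_body is the plain-text response for our prefix: one "SUFFIX:COUNT"
# entry per line. Returns the breach count, or 0 if the suffix is absent.
def breach_count(password, range_body)
  _prefix, suffix = hash_parts(password)
  range_body.each_line do |line|
    candidate, count = line.strip.split(":")
    return count.to_i if candidate == suffix
  end
  0
end
```

In a real integration, `range_body` would come from a GET request to `https://api.pwnedpasswords.com/range/<prefix>`. The service only ever learns the prefix, which many unrelated passwords share, so it cannot tell which one was being checked.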
STEPH: And that's part of the signup flow as to where you would catch that?
CHRIS: So we're going to introduce on both signup and sign-in because a password can be compromised after a user signs up for our system. So we want to have it at any point. Obviously, we do not keep their plaintext password, so we can't do this retroactively. We can only do it at the point in time that they are either signing up or signing in because that's when we do have access to the password. We otherwise throw it away and keep only the hashed value. But we'll probably introduce it at both points.
And the interesting thing is communicating this failure mode is really tricky. Like, "Hey, your password is cracked, not like here, not on our site, no, we're fine. Well, you should probably change your password. So here's what it means, there's actually this database that's called Have I Been Pwned? Don't worry; it's good, though. It's P-W-N-E-D. But that's fine." That's too many words to put on a page. I can't even say it here in a podcast.
And so what we're likely to do initially is instrument it such that our admin team will get a notification and can see that a user's password has been compromised. At that point, we will reach out to them and then, using the magic of human conversation, try and actually communicate that and help them understand the ramifications, what they should do, et cetera. Longer-term, we may find a way to build up an FAQ page that describes it and then say, "Feel free to reach out if you have questions." But we want to start with the higher touch approach, so that's where we're at.
STEPH: I love it. I love that you dove into how to explain this to people as well because I was just thinking, like, this is complicated, and you're going to freak people out in panic. But you want them to take action but not panic. Well, I don't know, maybe they should panic a little bit. [laughs]
CHRIS: They should panic just the right amount.
STEPH: Right. [laughs] So I like the starting with the more manual process of reaching out to people because then you can find out more, like, how did people react to this? What kind of questions did they ask? And then collect that data and then turn that into an FAQ page. Just, well done.
CHRIS: We haven't quite done it yet. But I am very happy with the collection of ideas that we've come to here. We have a security firm that we're working with as well. And so I had my weekly meeting with them, and I was like, "Oh yeah, we also thought about passwords a bunch, and here's what we came up with." And I was very happy that they were like, "Yeah, that sounds like a good set." I was like, "Cool. All right, I feel good." I'm very happy that we're getting to do this.
And there's an interesting sort of interplay between security theater and real security. And security theater, just to explain the phrase if anyone's unfamiliar with it, is things that look like security, so, you know, big green lock up in the top-left corner of the URL bar. That actually doesn't mean anything historically or now. But it really looks like it's very secure, right? Or password complexity requirements make you think, oh, this must be a very secure site. But for reasons, that actually doesn't necessarily prove that at all.
And so we tried to find the balance of what are the things that obviously demonstrate our considerations around security to the user? At the end of the day, what are the things that actually will help protect our users? That's what I really care about. But occasionally, you got to play the security theater game. Every other financial institution on the internet kind of looks and feels a certain way in how they deal with passwords.
And so will a user look at our seemingly laxer requirements or laxer approach to passwords and judge us for that and consider us less secure despite the fact that behind the scenes look at all the fun stuff we're doing for you? But it's an interesting question and interesting trade-off that we're going to have to spend time with. We may end up with the complexity requirements despite the fact that I would really rather we didn't. But it may be the sort of thing that there is not a good way to communicate the thought and decision-making process that led us to where we're at and the other things that we're doing.
And so we're like, fine, we just got to put them in and try and do a great job and make that as usable of an experience as possible because usability is, I think, one of the things that suffers there. You didn't do one of the things on the list, or like, it's green for each of the ones that you did, but it's red for the one that you didn't. And your password and your password confirmation don't match, and you can't paste...it's very easy to make this wildly complex for users.
STEPH: Security theater is a phrase that I don't think I've used, but the way you're describing it, I really like. And I have a solution for you: underneath the password where you have "We don't partake in security theater, and we don't have all the other fancy requirements that you may have seen floating around the internet and here's why," and then just drop a link to the episode. And, you know, people can come here and listen. It'll totally be great. It won't annoy anyone at all. [laughs]
CHRIS: And it'll start, and they'll hear me yelling about Fn-Delete that weather.com URL.
[laughter]
STEPH: Okay, maybe fast forward then to the part about --
CHRIS: Drop them to the timestamp. That makes sense. Yep. Yep.
STEPH: Mm-hmm. Mm-hmm. [laughs]
CHRIS: I like it. I think that's what we should do, yeah. Most features on the app should have a link to a Bike Shed episode. That feels true.
STEPH: Excellent Easter egg. I'm into it. But yeah, I like all the thoughtfulness that y'all have put into this because I haven't had to think about passwords in this level of detail. And then also, yeah, switching over to when things start to change and start to move away, you're right; there's still that we need to help people then become comfortable with this new way and let them know that this is just as secure if not more secure. But then there's already been that standard that has been set for your expectations, and then how do you help people along that path? So yeah, seems like y'all have a lot of really great thoughtfulness going into it.
CHRIS: Well, thank you. Yeah, it's frankly been a lot of fun. I really like thinking in this space. It's a fun sort of almost hobby that happens to align very well with my profession sort of thing. Actually, oh, I have one other idea that we're not going to do, but this is something that I've had in the back of my mind for a long time.
So when we use bcrypt or Devise uses bcrypt under the hood, one of the things that it configures is the cost factor, which I believe is just the number of times that the password plus the salts and whatnot is run through the bcrypt algorithm. The idea there is you want it to be computationally difficult, and so by doing it multiple times, you increase that difficulty.
But what I'd love is instead of thinking of it in terms of an arbitrary cost factor which I think is 12, like, I don't know what 12 means. I want to know it, in terms of dollars, how much would it cost to, like dollars and cents, to crack a password. Because, in theory, you can distribute this across any number of EC2 instances that you spin up. The idea of cracking a password that's a very map-reducible type problem.
So let's assume that you can infinitely scale up compute on-demand; how much would it cost in dollars to break this password? And I feel like there's an answer. Like, I want that number to be like a million dollars. But as EC2 costs go down over time, I want to hold that line. I want to be like, a million dollars is the line that we want to have. And so, as EC2 prices go down, we need to increase our bcrypt cost factor over time to adjust for that and maintain the million dollar per password cracking sort of high bar. That's the dream.
Swapping out the cost factor is actually really difficult. I've looked into it, and you have to like double encrypt and do weird stuff. So for a bunch of reasons, I haven't done this, but I just like that idea. Let's pin this to a dollar value. And then, from there, decisions naturally flow out of it. But it's so much more of a real thing. A million dollars, I know what that means; 12, I don't know what 12 means.
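The dollar-pinned cost factor can be worked through as arithmetic. With bcrypt, the cost factor is the base-2 logarithm of the iteration count, so each increment doubles the work per guess. Everything else here is a hypothetical placeholder rather than a measurement: the attacker's throughput per dollar, the number of guesses needed, and the method names are all assumptions made for illustration.

```ruby
# All figures are hypothetical placeholders, not benchmarks.
BASELINE_COST = 12
HASHES_PER_DOLLAR_AT_BASELINE = 1_000_000.0 # assumed attacker throughput per dollar
GUESSES_TO_CRACK = 1_000_000_000            # assumed guesses needed for one password

# Each +1 to the bcrypt cost factor doubles the work per guess, so it
# halves how many hashes a dollar of compute buys.
def dollars_to_crack(cost, guesses: GUESSES_TO_CRACK, hashes_per_dollar: HASHES_PER_DOLLAR_AT_BASELINE)
  hashes_per_dollar_at_cost = hashes_per_dollar / 2.0**(cost - BASELINE_COST)
  guesses / hashes_per_dollar_at_cost
end

# Smallest cost factor (within bcrypt's 4..31 range) that keeps the
# estimated cracking bill at or above the target.
def minimum_cost_for(target_dollars, **opts)
  cost = 4
  cost += 1 while dollars_to_crack(cost, **opts) < target_dollars && cost < 31
  cost
end

puts minimum_cost_for(1_000_000)                                 # prints 22
puts minimum_cost_for(1_000_000, hashes_per_dollar: 2_000_000.0) # prints 23
```

With these made-up inputs, halving the price of compute raises the required cost factor by exactly one, which is the "hold the line as prices drop" behavior described above.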
STEPH: A million-dollar password, I like it. I feel like --
CHRIS: We named the episode.
STEPH: I was going to say that's a perfect title, A Million-Dollar Password. [laughs]
CHRIS: A Million-Dollar Password. But with that wonderful episode naming cap there, I think I'm done rambling about passwords. What's up in your world, Steph?
STEPH: One of the things that I've been chatting with folks lately is RailsConf is coming up; it's May 17 through the 19th. And it's been sort of like that casual conversation of like, "Hey, are you going? Are you going? Who's going? It's going to be great." And as people have asked like, "Are you going?" And I'm always like, "No, I'm not going." But then I popped on to the RailsConf website today because I was just curious. I wanted to see the schedule and the talks that are being given.
And I keep forgetting that there's the in-person version, but there's also the home edition. And I was like, oh, I could go, I could do this. [laughs] And I just forget that that is something that is just more common now for conferences where you can attend them virtually, and that is just really neat. So I started looking a little more closely at the talks. And I'm really excited because we have a number of thoughtboters that are giving a talk at RailsConf this year.
So there's a talk being given by Fernando Perales that's called Open the Gate a Little: Strategies to Protect and Share Data. There's also a talk being given by Joël Quenneville: Your Test Suite is Making Too Many Database Calls. I'm very excited; just that one is near and dear to my heart, given the current client experiences that I'm having. And then there's another one from someone who just joined thoughtbot, Christopher "Aji" Slater, Your TDD Treasure Map.
So we'll be sure to include a link to those for anyone that's curious. But it's a stellar lineup. I mean, I'm always impressed with RailsConf talks. But this one, in particular, has me very excited. Do you have any plans for RailsConf? Do you typically wait for them to come out later and then watch them, or what's your MO?
CHRIS: Historically, I've tended to watch the conference recordings after the fact. I went one year. I actually met Christopher "Aji" Slater at that very RailsConf that I went to, and I believe Joël Quenneville was speaking at that one. So lots of everything old is new again. But yeah, I think I'll probably catch it after the fact in this case.
I'd love to go back in person at some point because I really do like the in-person thing. I'm thrilled that there is the remote option as well. But for me personally, the hallway track and hanging out and meeting folks is a very exciting part. So that's probably the mode that I would go with in the future. But I think, for now, I'm probably just going to watch some talks as they come out.
STEPH: Yeah, that's typically what I've done in the past, too, is I kind of wait for things to come out, and then I go through and make a list of the ones that I want to watch, and then, you know, I can make popcorn at home. It's delightful. I can just get cozy and have an evening of RailsConf talks. That's what normal people do on Friday nights, right? That's totally normal. [laughs]
CHRIS: I mean, yeah, maybe not the popcorn part.
STEPH: No popcorn?
CHRIS: But not that I'm opposed to popcorn just --
STEPH: Brussels sprouts? What do you need? [laughs]
CHRIS: Yeah, Brussels sprouts, that's what it is. Just sitting there eating handfuls of Brussels sprouts watching Ruby conference talks.
STEPH: [laughs]
CHRIS: I do love Brussels sprouts, just to throw it out there. I don't want it to be out in the ether that I don't like them. I got an air fryer, and so I can air fry Brussels sprouts. And they're delicious. I mean, I like them regardless. But that is a really fantastic way to cook them at home. So I'm a big fan.
STEPH: All right, I'm moving you into the category of fancy friends, fancy friends with an air fryer.
CHRIS: I wasn't already in your category of fancy friends?
STEPH: [laughs] I didn't think you'd take it that way. I'm sorry to break it to you.
[laughter]
CHRIS: I'm actually a little hurt that I'm now in the category of fancy friends. It makes a lot of sense that I wasn't there before. So I'll just deal with...yeah, it's fine. I'm fine.
STEPH: It's a weird rubric that I'm running over here. Pivoting away quickly, so I don't have to explain the categorization for fancy friends, I saw something in the Ruby Weekly Newsletter that had just come out. And it's one of those that I see surface every so often, and I feel like it's a nice reminder because I know it's something that even I tend to forget. And so I thought it'd be fun just to resurface it here. And then, we can also provide a link to the wonderful blog post that's written by Benito Serna.
And it's the difference between count, length, and size in an association with ActiveRecord. So for folks that would love a refresher: count is a method that's always going to perform a SQL count query. So even if the collection has already been loaded, calling count is always going to execute a database query. So this is the one that's just like, watch out, avoid it. You're always going to hit your database when you use this one.
And then next is length. Length loads the whole collection into memory and then returns the number of items in that collection. If the collection has already been loaded, then it's not going to issue a database call; it's just going to delegate to that Ruby length method and let you know how many records are in that collection. So that one is a little bit better because, that way, if it's already loaded, at least you're not going to have a database call.
And then next is the size method, which is the one that's more highly recommended because it has a nice safety net built in: first, it's going to check whether the records have been loaded or not. So if the collection has not been loaded, so we haven't executed a database query and stored the result, then size is going to perform a database query. Specifically, it's using that SQL count under the hood. And if the collection has been loaded, then a database call is not issued, and it's going to use the Ruby length method to return the number of records.
So it just helps you prevent unnecessary database calls. And it's the reason that that one is recommended over using count, which is going to always issue a call. And then also to avoid length where you can because it's going to load the whole collection into memory, and we want to avoid that. So it was a nice refresher. I'll be sure to include a link in the show notes.
But yeah, I find that I myself often forget about the difference between count and size. And so if I'm just in the console and I just want to know something, I still reach for count. It is still a default of mine. But then, if I'm writing production code, I will be more considered as to which one I'm using.
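As a rough illustration of the three methods Steph describes, here is a plain-Ruby sketch. `ToyAssociation` and its query log are hypothetical stand-ins invented for this example; it is not ActiveRecord and only mimics when a query would be issued.

```ruby
# Toy stand-in for an association. It is NOT ActiveRecord; it only
# records which SQL statement each method would have triggered.
class ToyAssociation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend these rows live in the database
    @records = nil    # nil means "not loaded yet"
    @queries = []     # log of the queries we would have run
  end

  def loaded?
    !@records.nil?
  end

  # Pretend SELECT *: pulls every row into memory, but only once.
  def load
    unless loaded?
      @queries << "SELECT *"
      @records = @rows.dup
    end
    self
  end

  # count: always a COUNT query, loaded or not.
  def count
    @queries << "SELECT COUNT(*)"
    @rows.length
  end

  # length: loads the collection if needed, then uses Array#length.
  def length
    load
    @records.length
  end

  # size: COUNT query when not loaded, Array#length when loaded.
  def size
    loaded? ? @records.length : count
  end
end

posts = ToyAssociation.new(%w[a b c])
posts.size    # not loaded yet, so this falls back to a COUNT query
posts.length  # loads all rows, then counts them in Ruby
posts.size    # loaded now, so no query at all
posts.count   # always queries, even though the rows are loaded
p posts.queries # => ["SELECT COUNT(*)", "SELECT *", "SELECT COUNT(*)"]
```

Three queries are logged for four calls: the second `size` is the only call that costs nothing, which is the safety net described above.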
CHRIS: I feel like this is one of those that I've struggled to lock into my head, but as you're describing it right now, I think I've got, again, another mnemonic device that we can lock on to. So I know that SQL uses the keyword count, so count that's SQL definitely. Length I know that because I use that on other stuff. And so it's size that is different and therefore special. That all seems good. Cool, locking that in my brain along with Fn-Delete. I have two things that are now firmly locked in.
So you were just mentioning being in the console and working with this. And one of the things that I've noticed a lot with folks that are newer to ActiveRecord and the idea of relations and the fact that they're lazy, is that that concept is very hard to grasp when working in a console because at the console, they don't seem lazy.
The minute you type out User.where with some clause and hit enter in the console, Ruby is going to do its normal thing, which is like, okay, cool, I want to...I forget what it is that IRB or any of the REPLs are going to do, but it's either inspect or to_s or something like that. But it's looking for a representation that it can display in the console. And ActiveRecord relations will typically say like, "Oh, cool, you need the records now because you want to show it like an array because that's what inspect is doing under the hood."
So at the console, it looks like ActiveRecord is eager and will evaluate the query the minute you type it, but that's not true. And this is a critical thing that if you can think about it in that way and the fact that ActiveRecord relations are lazy and then take advantage of it, you can chain queries, you can build them up, you can break that apart. You can compose them together. There's really magical stuff that falls out of that.
But it's interesting because sort of like a Heisenberg where the minute you go to look at it in the REPL, it's like, oh, it is not lazy; it is eager. It evaluates it the minute I type the query. But that's not true; that's actually the REPL tricking you. I will often just throw a semicolon at the end of it because I'm like, I don't want to see all that noise. Just give me the relation. I want the relation, not the results of executing that query. So if you tack a semicolon at the end of the line, that tells Ruby not to print the thing, and then you're good to go from there.
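The laziness Chris describes can be sketched in plain Ruby. `ToyRelation`, its block-based `where`, and the shared `log` are hypothetical names made up for this illustration; the real ActiveRecord relation is more involved, and its `inspect` differs in detail. The sketch only shows that chaining runs nothing and that the console's call to `inspect` is what forces the query.

```ruby
# Toy lazy relation. NOT ActiveRecord: filters are plain Ruby blocks
# and the "database" is an in-memory array.
class ToyRelation
  attr_reader :log

  def initialize(rows, filters = [], log = [])
    @rows = rows
    @filters = filters
    @log = log        # shared by every relation in the chain
  end

  # Chaining returns a new relation and executes nothing.
  def where(&condition)
    ToyRelation.new(@rows, @filters + [condition], @log)
  end

  # Only here does the "query" actually run.
  def to_a
    @log << "query with #{@filters.size} filter(s)"
    @rows.select { |row| @filters.all? { |filter| filter.call(row) } }
  end

  # A REPL calls inspect to echo a value, which forces the query.
  def inspect
    to_a.inspect
  end
end

users = ToyRelation.new([{ name: "Ada", admin: true },
                         { name: "Bob", admin: false }])
admins = users.where { |user| user[:admin] }
named_a = admins.where { |user| user[:name].start_with?("A") }

p named_a.log.size     # => 0, two where calls and nothing has run
echoed = named_a.inspect # what the console does when it prints a value
p named_a.log.size     # => 1, the echo is what triggered the query
```

Ending a console line with a semicolon skips the echo, so `inspect` is never called and the relation stays unevaluated.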
STEPH: That's a great pro-tip. Yeah, I've forgotten about the semicolon. And I haven't been using that in my workflow as much. So I'm so glad you mentioned that. Yeah, I'm sure that's part of the thing that's added to my confusion around this, too, or something that has just taken me a while to lock it in as to which approach I want to use for when I'm querying data or for when I need to get a particular count, or length, or size. And by using all three, I'm just confusing myself more. So I should really just stick to using size.
There's also a fabulous article by Nate Berkopec that's titled Three ActiveRecord Mistakes That Slow Down Rails Apps. And he does a fabulous job of also talking about the differences of when to use size and then some of the benefits of when you might use count. The short version is that you can use count if you truly don't care about using any of those records. Like, you're not going to do anything with them. You don't need to load them; you truly just want to get a count. Then sure, because then you're issuing a database query, but you're not going to then, in a view, very soon issue another database query to collect those records again. So he has some really great examples, and I'll be sure to include a link to his article as well.
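The trade-off can be made concrete by counting round trips. This is a plain-Ruby sketch under stated assumptions: `ToyComments` and `query_count` are hypothetical, nothing here is ActiveRecord, and it is not taken from the article itself. It only tallies how many queries each usage pattern would cost.

```ruby
# Toy query counter, NOT ActiveRecord: it only tracks how many
# database round trips each usage pattern would cost.
class ToyComments
  attr_reader :query_count

  def initialize(rows)
    @rows = rows
    @records = nil
    @query_count = 0
  end

  # Always one COUNT query.
  def count
    @query_count += 1
    @rows.length
  end

  # One SELECT, only the first time it is called.
  def load
    if @records.nil?
      @query_count += 1
      @records = @rows.dup
    end
    self
  end

  # Reuses loaded records when it can, otherwise falls back to count.
  def size
    @records ? @records.length : count
  end

  def each(&block)
    load
    @records.each(&block)
  end
end

# Pattern 1: count, then render every record (as a view might).
counted = ToyComments.new(%w[x y])
counted.count
counted.each { |row| row }
p counted.query_count # => 2

# Pattern 2: load first, so size reuses the records already in memory.
sized = ToyComments.new(%w[x y])
sized.load.size
sized.each { |row| row }
p sized.query_count   # => 1

# Pattern 3: only the number is needed, so a lone count is enough.
number_only = ToyComments.new(%w[x y])
number_only.count
p number_only.query_count # => 1
```

In this toy model, `count` only costs extra when the same records are going to be loaded anyway.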
Speaking of Ruby tidbits and kind of how this particular article about count, length, and size came across my view earlier today, Ruby Weekly is a wonderful newsletter. And I feel like I don't know if I've given them a shout-out. They do a wonderful job. So if you haven't yet checked out Ruby Weekly, I highly recommend it.
There are just always really great, interesting articles either about stuff that's a little bit more like cutting edge or things that are being released with newer versions, or they might be just really helpful tips around something that someone learned, like the difference between count, length, and size, and I really enjoy it. So I'll also be sure to include a link in the show notes for anyone that wants to check that out.
They also do something that I really appreciate where when you go to their website, you have the option to subscribe, but I am terrible about subscribing to stuff. So you can still click and check out the latest issue, which I really appreciate because then, that way, I don't feel obligated to subscribe, but I can still see the content.
CHRIS: Oh yeah. Ruby Weekly is fantastic. In fact, I think Peter Cooper is the person behind it, or Cooperpress as the company goes. And there is a whole slew of newsletters that they produce. So there's JavaScript Weekly, there's Ruby Weekly, there's Node Weekly, Golang Weekly, React Status, Postgres Weekly. There's a whole bunch of them. They're all equally fantastic, the same level of curation and intentional content and all those wonderful things. So I'm a big fan. I'm subscribed to a handful of them.
And just because I can't go an episode without mentioning inbox zero, if you are the sort of person that likes to defend the pristine nature of your email inbox, I highly recommend Feedbin and their ability to set up a special email address that you can use to then turn it into an RSS feed because that's magical. Actually, these ones might already have an RSS feed under the hood. But yeah, RSS is still alive. It's still out there. I love it. It's great. And that ends my thoughts on that matter.
STEPH: I have what I feel is a developer confession. I don't think I really appreciate RSS feeds. I know they're out there in the ether, and people love them. And I just have no emotion, no opinion attached to them. So one day, I think I need to enjoy the enrichment that is RSS feeds, or maybe I'll hate it. Who knows? I'm reserving judgment. Either way, I don't think I will. [laughs] But I don't want to box future Stephanie in.
CHRIS: Gotta maintain that freedom.
STEPH: On that note, shall we wrap up?
CHRIS: Let's wrap up. The show notes for this episode can be found at bikeshed.fm.
STEPH: This show is produced and edited by Mandy Moore.
CHRIS: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review on iTunes, as it really helps other folks find the show.
STEPH: If you have any feedback for this or any of our other episodes, you can reach us at @_bikeshed or reach me on Twitter @SViccari.
CHRIS: And I'm @christoomey.
STEPH: Or you can reach us at hosts@bikeshed.fm via email.
CHRIS: Thanks so much for listening to The Bike Shed, and we'll see you next week.
ALL: Byeeeeee!!!!!!
ANNOUNCER: This podcast was brought to you by thoughtbot. thoughtbot is your expert design and development partner. Let's make your product and team a success.