I had always considered external CSS files a best practice in web development. On reflection, though, the practice seems fairly counter-intuitive. Why would we fetch the HTML for a page, wait for another round trip to fetch the styles, with all the handshakes and latency that involves, and only then render the page? Latency is always a constraint, but it is especially brutal under inconsistent network conditions. Immediacy is the killer feature of the Internet and the World Wide Web, and by imposing a few constraints on our process, we can make it happen far more often.
Performance advocates like Ilya Grigorik, an engineer at Google, have given a lot of talks advocating alternate practices like inlining critical CSS in order to drastically speed up rendering time. A lot of organizations make good use of these ideas, but it seems to me that many individual developers have not paid much attention to them. It can also be paralyzing to think that new specs like HTTP/2 will upend some of our current optimization efforts. I think, however, that new specs will be easier to adopt if we focus now on the general notion that render-blocking latency is an inhibiting factor that can be avoided. HTTP/2 features like server push are an option you may want to start working towards right now, but browser and server support may put you uncomfortably on the bleeding edge, and I think big gains can be made without committing to that path.
We use external CSS files in order to let browsers cache our assets and to simplify our workflow. I think there are a lot of cases where caching the stylesheets is beneficial, but in many legitimate cases the user never sees those benefits. Users pay a big upfront cost when they first visit your page and again any time you change your CSS files. These are in no way edge cases. If you want to ensure a great user experience every time, you should attempt to make each page render feel immediate.
Addressing the workflow issue: I take the approach of writing all the CSS in external files, then programmatically including the critical CSS in a <style> tag in the head. How this is done will depend on your server-side environment, but it ensures I do not repeat myself, and it keeps a workflow that is easy to explain to collaborators. Non-critical CSS (think of it as the styles for “below the fold” content) should be “asynchronized” so that it does not block page rendering.
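As a rough illustration of the "write external, inline at build time" workflow, here is a minimal Node sketch. The function name `inlineCriticalCss` and the file names in the usage comment are hypothetical, and it assumes the page has a single closing head tag; a real setup would depend on your server-side environment or build tool.

```javascript
// Hypothetical helper (not from the article): inserts the critical CSS
// as a <style> block just before the closing </head> tag.
function inlineCriticalCss(html, criticalCss) {
  // A literal "</style" inside the CSS would end the tag early, so refuse it.
  if (/<\/style/i.test(criticalCss)) {
    throw new Error('Critical CSS must not contain a closing style tag');
  }
  const headClose = html.search(/<\/head>/i);
  if (headClose === -1) {
    throw new Error('No </head> found; cannot inline critical CSS');
  }
  const styleTag = `<style>${criticalCss.trim()}</style>`;
  // Slice instead of String.replace so "$" sequences in the CSS stay literal.
  return html.slice(0, headClose) + styleTag + html.slice(headClose);
}

// Usage sketch (file names are assumptions):
// const fs = require('fs');
// const page = inlineCriticalCss(
//   fs.readFileSync('index.html', 'utf8'),
//   fs.readFileSync('critical.css', 'utf8')
// );
```

The CSS still lives in one external source file, so nothing is written twice; only the delivery changes.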
This asynchronization of CSS is such an “anti-pattern” that you cannot even achieve it without a blatant hack. But it is a hack that does not add substantial complexity or debt, so it is worthwhile. There are a few alternate ways to achieve the asynchronization, but my approach is to set the stylesheet link's media attribute to a made-up string, such as "bogus". You can use any string other than the functional keywords such as "all", "print", "screen", etc. Because the bogus media type does not match, the browser still downloads the file but does not hold up rendering for it; once the file has loaded, the onload handler switches the media attribute to "all" and the styles apply.
<link rel="stylesheet" media="bogus" href="styles.css" onload="if(media!='all')media='all'" />
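One caveat with this hack is that the stylesheet never applies if JavaScript is disabled, since the onload handler does the switching. A possible mitigation, which is my assumption rather than part of the approach described above, is to pair the link with a `<noscript>` fallback. The sketch below generates both tags from a server-side template helper; `asyncStylesheetTag` and `escapeAttr` are hypothetical names.

```javascript
// Hypothetical helper: escapes a value for use inside a double-quoted attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical helper: renders the non-blocking link plus a <noscript>
// fallback (the fallback is an addition, not part of the original hack).
function asyncStylesheetTag(href, placeholderMedia = 'bogus') {
  const safeHref = escapeAttr(href);
  const safeMedia = escapeAttr(placeholderMedia);
  return (
    `<link rel="stylesheet" media="${safeMedia}" href="${safeHref}" ` +
    `onload="if(media!='all')media='all'">\n` +
    `<noscript><link rel="stylesheet" href="${safeHref}"></noscript>`
  );
}

// Usage sketch:
// console.log(asyncStylesheetTag('styles.css'));
```

Generating the tag in one place also keeps the hack out of individual templates, which makes it easier to swap out later.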
For any page where time to render has a critical impact on user experience, we should eliminate the latency tax as much as we can. There are various projects and specifications working to address some of these concerns. Google’s Accelerated Mobile Pages project explicitly enforces the constraint of inline CSS, and it will be interesting to see what influence it has on the future of the web. HTTP/2 and future specs will continue to change a lot of “best practices” as well. There are also advances in "offline-first" approaches that are worth looking into. All of this can be taken advantage of when it is appropriate to do so, but don't feel like you need to learn all of it in order to make performance a priority, especially if it will complicate things. The best optimizations add little complexity. Focusing on a few basic common-sense principles around latency will help you achieve a lot of gains now and put you in the mindset to make the right decisions going forward.
Eliminating asset round-trips can be a gateway drug for thinking about performance in general. If you find this stuff interesting, there is a lot you can learn. The rabbit hole goes deep, though, and we do not want to “over-optimize” or “optimize prematurely” and introduce the kind of overhead that makes communication with other developers difficult. If you adopt a few high-level constraints in your projects, these optimizations will become an empowering part of your workflow and not a complication.
The application landscape is fracturing and the native vs web vs bots vs whatever debate will rage on. This might make you less concerned with specific web optimization strategies, but I see it as an opportunity to focus the "web" on the web's purpose, which is, as I mentioned before, immediacy. We have expectations of immediacy on the web. Most developers will build a website or two along the way no matter what their specialization, and learning to deliver on the platform's primary ambition is a fulfilling exercise for any developer.