loading...

What's the point of learning how to sort?

cooperfle profile image Cooper F Le ・1 min read

Warning, rant incoming.

I am sitting in CS 3345. Data Structures and Algorithms. This is a class designed for juniors in our Computer Science program.

Right now, my professor is yapping on about how to do a merge sort. Something that I learned my sophomore year of high school. Then again my junior year. Then my senior year. I was taught it again my freshman year of college.

Now I'm a sophomore sitting in a junior course being taught how to do a merge sort for the fifth time as if this were the first time I've ever even heard of recursion.

Last week, we also went over how to do insertion sort, bubble sort, and shell sorts. All of this topics were covered in my high school courses as well as past college courses that are prerequisites for this current course.

Why is there such a heavy emphasis on sorting?

I understand that my high school curriculum was unusual. Most of my peers didn't get the same education I got as far as computer science goes, but why repeat the concepts within your own degree plan?

Especially for something like sorting. Where there are built-in functions that not only exist, but are usually preferred. If I were to submit code to my manager containing a custom sort function, he would highlight it in red and add the comment "Why didn't you the built-in sort function. Writing your own is smelly, and I would have to have a really good reason to use my own function than the built-in method.

Posted on by:

cooperfle profile

Cooper F Le

@cooperfle

I'm currently a student who is interested in coding, writing, and photography.

Discussion

pic
Editor guide
 

The point isn't to remember the sorting algorithms so you can write them on the job. We of course have library algorithms/search engines for that as you said. They are taught because they serve as a good case studies on algorithm analysis/design. Of course this doesn't mean its the best way (but theres already a lot of material out there on it for teaching purposes so it's probably easier for professors to teach it too).

For example bubble sort, quick sort, and selection sort all take the same input and produce the same output but at different speeds. This is a nice way to show how different approaches can yield more efficient algorithms using approaches such as divide and conquer. On top of that, sorting algorithms don't require advanced data structures knowledge to understand. If you can understand arrays you can understand sorting algorithm examples.

They also serve as a good introduction to algorithms for people who aren't familiar. Although some students will be in your position and have a lot of CS experience, many college juniors will not. Sorting algorithms are pretty intuitive to understand(compared to an algorithm like B-Tree insertion or Dijkstra's algorithm). This is probably why they are used as examples so often, they're a good starting point. I'm not sure how your classes are structured but in my experience we were initially taught how to write a sorting algorithm as a way of learning how to translate algorithms to code in early classes. We were taught these algorithms again in an algorithms course but from an analysis perspective.

Basically, they usually aren't taught because they're useful industry knowledge. They're taught because they serve as good ways to introduce CS concepts in an accessible and familiar way.

 

It's interesting that you bring up B-Tree insertion. We have also covered that in my Data Structures and Algorithms class. Except we did it before the sorting algorithms, and spend about half as long on the B-Tree as we did on the merge sort.

 

I think the biggest problem in the curriculum is answering the question you're asking here. The why.

Because it really is more about the practice of understanding algorithms than the sorting itself, or so I assume. CS educators probably get so caught up in the tradition of teaching these algorithms that they forget how little context they are providing.

I dropped out of CS and definitely appreciate limited education in these topics more in hindsight. Among other struggles, the dryness of sorting algorithms helped me lose interest.

 

I can highlight few points to expand the scope of the query.

  • The sorting algorithms are one of the easiest way to distinguish performance and or complexity of different algorithms for a same problem domain. This helps the learners to understand the importance of designing better algorithms to solve a problem.
  • Different sorting algorithms use different data structures. Implementing these algorithms to solve an identical problem (sorting) open the scope to learn about data structures.
  • The built-in sort functions are written based on these sort function we have been taught. So, it is quite interesting to code it thyself to understand how built-in function works.

And as others said, you have a very unusual early education. As it repeats again and again for you, you might find it boring. But learning different ways to solve a same problem is very helpful to whom these are introduced for the first time.

 

The short answer is that you're taught this because it matters, particularly if you're considering going into a Masters/PhD program involving algorithm analysis OR looking to head into AI/ML. Also, knowing when to use one sorting algorithm versus another might end up mattering...perhaps it won't.

 

I get being taught it once, but relearning it every year is a bit excessive. You know what else is important? Learning how to read documentation. You know what is never taught once? Reading documentation. It's a skill that can help you no matter what type of computer science related field you go into, particularly if you go into any computer science related field.

But regardless, my frustration isn't with the fact that sorting algorithms are being taught. My frustration is with the fact that within a single institution, it is being taught just about once a year.

 

I guess it boils down to individual experience. For me (and perhaps other students in my class), I first learned some sorting algorithms in the first coding class I took in college (A 100-level class). It was towards the end, and it was very basic = think O(n2) complexity. The next time was in a 300-level class (like your scenario, this class is intended for Juniors), but this time around we learned the algorithms in different run-times, i.e., worst, average, and best-case scenarios.

As for learning how to read documentation, I don't share the same experience. Perhaps its the school, the difference of state, or maybe just the Professors.

Is this something you can bring up? Or do you know if other peers share your experience? While changes to a curriculum don't happen overnight, if the issue is brought up in enough force or numbers then perhaps that could lead to some change. I know at my school once a year we take about 3 days over the course of a week to hear students' concerns, gripes, etc. This has actually led to changes in coursework and what not.

In any case, it's good that you're expressing these frustrations. You never know who else is feeling the same way and along the way you might just find some answers. Best of luck to you!

 

You absolutely do have an unusual high school education, but the overlap in your college courses isn't unheard of.

This is mostly opinion but...

There really isn't a good reason except that universities are old, slow moving institutions. It's part of the reasons why software developers without degrees can perform just as well as one's with a degree.

I for one can't remember a time when a fresh graduate learned about MVC, or Cloud Architecture patterns, RESTful services, Not to mention things like CSS and responsive design.

Sorting is an important thing to cover in computer science but most people with a degree in computer science won't go on to be computer scientists, they'll go on to be web/software engineers.

I know it sucks, but focus on your own learning for what your future career will demand, and utilize college for what it still does do very well, making connections with people and organizations that will serve you for the rest of your life.

 

It's not about sorting: it's about teaching algorithms (the sorts), data structures (arrays, heaps) and Big-O notation.

Big-O Notation

They're all really important, but Big-O notation is really important. If nothing else, and this is real low hanging fruit, interviews index highly on Big-O notation. If a candidate is not very familiar with at least one O(n*log(n)) sort, they're going to get a no. Not because it's very important in day to day, but when it does come up, it matters to know that sorting is done in less than n2 time.

Here's a pretty real example, it's both an interview question and can be a real-life coding analogue:

You have two people, Alice and Bert, and you have an list of all their friends on Facebook (don't worry about how we got this data...). You want to find all the friends they have in common.

If a candidate or co-worker suggests an n2 approach, they'll get a hard no.

Showing algorithms vs intuition

The way most humans sort things, like a deck of cards, is via insertion sort (take one card, go through the sorted part of the deck, and put it where it belongs). This feels fine and works okay in real life, where scanning through the sorted deck is fast and we an intuit where the card should go in the sorted.

Beginner's Mind
Finally, the last reason you might be seeing sorting a lot is that some teachers think it's beginner material while others think it's advanced, but they all think it's important, so they all include it.

Further Reading
If you're more interested in seeing beefier algorithms, I highly recommend The Algorithm Design Manual

 

I think your experience is quite unique. I did not exposed to the sorting algorithms until I went to college, though I to eventually dropped out once I got a full time job as dev. The hard thing with many of these classses there so many students with different background and experience, they have to target some baseline.

 

I feel your pain, 30 years ago I had to learn all that stuff and it didn't seem very relevant then. I don't think I have ever needed to write a sort in the years since and if I did I would just Google the algorithm I needed.

I guess the idea is the same as learning various formulas in maths. I have never needed to know the area under a curve, but the basic techniques I learnt to do that have come in really useful.

I've always felt it would be more useful to cover database design and use. You are more likely to do performance tuning on a poorly designed db then any in memory sort.

 

This is a great discussion. And there are great points made about the sorting algorithms and the ability to study algorithm analysis/design and complexity using sorting algorithms. I agree with all of them. My background has been in industry, 4-year university and 2-year university faculty. So my response is coming from that context. What I have not seen mentioned (forgive me if I missed it) is that there are two end goals at stake for faculty in CS departments at 4-year schools. One of those end goals is to prepare students to enter industry as software developers. Through that lens, the focus is more on the actual sorting to solve the problem and so you would definitely use a built-in function to accomplish this task. And so the study of algorithms that are known and built-in may seem less important. However, the other goal is to produce researchers. These students will go on to earn PhDs and work in industry as well as academia. In this context, the importance of analyzing algorithms, comparing complexities is paramount depending on the area they intend to study. For example, I was heavily into building compilers for parallel and embedded systems. In this context, we constructed a lot of new algorithms. These new algorithms were sometimes built on the ideas of well-understood algorithms. Being able to design, analyze, and implement a new algorithm where performance of the algorithm really matters requires you to use these skills on a daily basis. You also have to evaluate new algorithms that are in research literature, and again these skills are important. So I think its important to keep in mind that faculty at 4-year schools are training two groups of students at the same time, and its challenging to have a curriculum that speaks to both end-goals simultaneously and separately.

 

So most of the other responses are good as to why you need to learn to sort... I can't comment on the fact you got a better high school CS program than most individuals who come into CS programs currently; however, at the university level there are a variety of factors:

1) Different instructors - if you have different instructors they may emphasize topics they see students struggle with later in the curriculum in other classes they teach. Alternatively, they have their own goals for a given course and don't confer with their peers on how they teach the courses before/after the one they are assigned to teach. I know when I teach CS courses as I tend to usually only teach upper-division courses I have to teach concepts I expected students to learn earlier but experience has shown the majority have forgotten or didn't get those concepts as expected.

2) Transfer students - As sorting is something usually doing in the CS1, CS2, CS3 sequence of courses and it seems like the specific course number you cite is either CS2 or CS3 at your school. It's possible that your faculty have had to adapt to students coming in mid-sequence and so they are recovering something so that the transfer students who potentially came from lacking pre-requisite courses don't lose out on what is otherwise an important topic to know.

3) Faculty time - Faculty either are teaching numerous classes + service outside of class or have lots of research to do. This means almost no time to meet regularly as a group to make sure everyone knows what each other is teaching in the curriculum. And as they're usually given freedom to teach classes how they want towards the learning outcomes for a class, this can result in what you're seeing too. But give your faculty some slack, they're working extremely long hours as pay that is significantly less than what they can get in the industry currently.

Just my 2 cents, there is a long list of other reasons... just don't have time to elaborate beyond this.

 

Just came to say that the built-in function doesn't work equally for all data and can take you to dangerous places if you don't know how the built-in function works and where use it.