Attempting to Learn Go - Listing Files By Extension

Steve Layton on February 23, 2019

Hello World Near the end of the last post, I noted we would put the static site generator project aside for the time being. I decided th...
 
Dmitry Yakimenko • Edited

Steve, why do you sort inside the loop on every iteration?

for _, file := range dir {
    if !file.IsDir() {
        ...
        sort.Strings(m[ext[len(ext)-1]]) // <-- HERE
    }
}

And here's my take on it. You can sort by a predicate. It's not very efficient, though, since the extension is recalculated on every comparison. But come on, Go can be really annoying sometimes. Look at this verbosity:

type ByExt []string

func (a ByExt) Len() int           { return len(a) }
func (a ByExt) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }
func (a ByExt) Less(i, j int) bool { return filepath.Ext(a[i]) < filepath.Ext(a[j]) }

...

func main() {
    ...
    files := []string{}
    for _, file := range dir {
        if !file.IsDir() {
            files = append(files, file.Name())
        }
    }
    sort.Sort(ByExt(files))
    ...
}

In Ruby that would be:

filenames.sort_by { |f| File.extname f }
 
Dirk Olbrich • Edited

Nah, no need for the boilerplate of a custom sort. This sorts the slices stored in the map:

var m = make(map[string][]string)
for _, file := range dir {
    if !file.IsDir() {
        fileName := file.Name()
        ext := strings.Split(fileName, ".")
        switch {
        case len(ext) > 1:
            m[ext[len(ext)-1]] = append(m[ext[len(ext)-1]], fileName)
        case len(ext) == 1:
            m["no-ext"] = append(m["no-ext"], fileName)
        }
    }
}
for ext := range m { sort.Strings(m[ext]) }
 
Dirk Olbrich • Edited

Edit: this sorts the `[]string` slices within the `map[string][]string`. The map itself can't be sorted.

Dmitry Yakimenko

I don't have a map in my version. I sort an array by a predicate.

Dirk Olbrich

Yes, I have seen it. Your approach is different in that it just sorts a list of filenames. The original intent is to sort files by extension into different buckets.

Your use of filepath.Ext() is quite clever. I hadn't thought of that.

Dirk Olbrich • Edited

This would make the example even shorter:

var m = make(map[string][]string)
for _, file := range dir {
    if !file.IsDir() {
        fileName := file.Name()
        ext := filepath.Ext(fileName)
        m[ext] = append(m[ext], fileName)
    }
}
for ext := range m { sort.Strings(m[ext]) }
 
Steve Layton

@detunized @dirkolbrich

Thanks for the replies! filepath.Ext()! Didn't occur to me to try that. It goes to show that the standard library really is pretty complete.

Dirk, I like the for ext := range m { sort.Strings(m[ext]) } solution. With it I wouldn't need a separate sort in each "print" function, and it's much clearer that way.

Steve Layton

Hah. You caught me! The sort should have been moved up into the print function(s) at the very least. It's a bad design decision - I put it there at first just for the sake of simplicity and never got around to cleaning it up. It doesn't really matter in a directory of a few files but would really impact performance in a larger directory. Something like the following might be fine, and still pretty simple to follow.

func plainList(m map[string][]string, v []string) {
    for _, value := range v {
        sort.Strings(m[value])
        for _, file := range m[value] {
            fmt.Println(file)
        }
    }
}

I think I may update the article to make sure it's called out for clarity.

Dirk Olbrich

You don't mind?

func plainlist(m map[string][]string, order string) string {
    // 1. get all keys of the map
    var keys []string
    for k := range m {
        keys = append(keys, k)
    }

    // 2. sort by order type
    switch order {
    case "desc", "Desc", "DESC":
        for ext := range m {
            sort.Sort(sort.Reverse(sort.StringSlice(m[ext])))
        }
        sort.Sort(sort.Reverse(sort.StringSlice(keys)))
    default:
        for ext := range m {
            sort.Strings(m[ext])
        }
        sort.Strings(keys)
    }

    // 3. build a concatenated string
    var list string
    for _, k := range keys {
        list = fmt.Sprintf("%v\n%v", list, m[k])
    }
    return list
}

Use it with:

fmt.Println(plainlist(m, "asc"))
fmt.Println(plainlist(m, "desc"))
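A possible variation, offered only as a sketch: strings.EqualFold can stand in for listing every capitalization of "desc", and a strings.Builder writes one file per line instead of formatting whole slices with %v. The name plainListLines and the sample map are hypothetical. Like the version above, it sorts the slices stored in the map in place.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// plainListLines (hypothetical name) returns every file name on its own
// line, grouped by extension key, in ascending or descending order.
func plainListLines(m map[string][]string, order string) string {
	desc := strings.EqualFold(order, "desc") // case-insensitive match

	// Collect the keys so they can be sorted; maps have no order.
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}

	// sortStrings sorts s in place in the requested direction.
	sortStrings := func(s []string) {
		if desc {
			sort.Sort(sort.Reverse(sort.StringSlice(s)))
			return
		}
		sort.Strings(s)
	}

	sortStrings(keys)
	var b strings.Builder
	for _, k := range keys {
		sortStrings(m[k])
		for _, file := range m[k] {
			b.WriteString(file)
			b.WriteByte('\n')
		}
	}
	return b.String()
}

func main() {
	// Sample data, not taken from the article.
	m := map[string][]string{
		"txt": {"b.txt", "a.txt"},
		"go":  {"main.go"},
	}
	fmt.Print(plainListLines(m, "asc"))
	fmt.Print(plainListLines(m, "DESC"))
}
```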
 
Benjamin Cable • Edited

Hey Steve, fantastic start.

You've come up with some great solutions in there, so I thought I'd share my own.

I restricted myself to fitting a subset of what you've solved thus far, that is to say, get all the files organised by category, and print them out as JSON. I've ignored plain / nested output, since that is somewhat trivial/not business logic.

Here's my solution:

package main

import (
    "encoding/json"
    "flag"
    "io/ioutil"
    "log"
    "os"
    "path/filepath"
    "strings"
)

var directory string

func main() {
    flag.StringVar(&directory, "dir", ".", "sorter -dir ./path/to/dir")
    flag.Parse()

    files, err := ioutil.ReadDir(directory)
    if err != nil {
        log.Fatal(err)
    }

    var categories = make(map[string][]string)
    for _, file := range files {
        // skip directories.
        if file.IsDir() {
            continue
        }

        ext := filepath.Ext(file.Name())
        name := strings.TrimSuffix(file.Name(), ext)

        // an empty name means the whole file name is the extension
        // (a dotfile such as .bashrc), so skip it.
        if name == "" {
            continue
        }

        // get the absolute path to the file, or error out
        fpath, err := filepath.Abs(filepath.Join(directory, file.Name()))
        if err != nil {
            log.Fatalf("failed building absolute path: %v", err)
        }

        // trim dots before adding to the map.
        ext = strings.TrimPrefix(ext, ".")
        categories[ext] = append(categories[ext], fpath)
    }

    if err := json.NewEncoder(os.Stdout).Encode(categories); err != nil {
        panic(err)
    }
}

As you can see, I've drastically cut down on the number of operations needed to get there, as well as corrected for a few problems you weren't looking out for yet. These are mainly:

  • You should skip dotfiles or hidden files, which start with a . character (at least by default), as these are frequently config files or important somehow.

  • You're expending a lot of effort sorting / printing your data, when really all you need is a map to handle the listing

Output from my program (against a sample directory):

usage: sorter -dir ./sample | jq

{
  "jpg": [
    "/Users/bc/code/Personal/sorter/samples/3.jpg"
  ],
  "pdf": [
    "/Users/bc/code/Personal/sorter/samples/2.pdf"
  ],
  "txt": [
    "/Users/bc/code/Personal/sorter/samples/1.txt",
    "/Users/bc/code/Personal/sorter/samples/2.txt",
    "/Users/bc/code/Personal/sorter/samples/3.txt"
  ]
}

If I wanted plain output, with a map I could do something like this:

for key, category := range categories {
    fmt.Println("kind:", key)
    for _, file := range category {
        fmt.Println("\t", file)
    }
}

which would output like so:

kind: txt
     /Users/bc/code/Personal/sorter/samples/1.txt
     /Users/bc/code/Personal/sorter/samples/2.txt
     /Users/bc/code/Personal/sorter/samples/3.txt
kind: pdf
     /Users/bc/code/Personal/sorter/samples/2.pdf
kind: jpg
     /Users/bc/code/Personal/sorter/samples/3.jpg

This is somewhat trite and gross, but you get the point: dealing with one map makes this much easier to handle!

Looking forward to seeing what you come up with next!

Steve Layton

I think the core of my issue is that I'm also not leaning on the standard library as much as I should. I didn't realize filepath.Ext() was a thing. :/ Yeah, I read "sorting files by ext" as just that: sorting alphabetically. Had I left that out, I would have been done quite a bit quicker. I suppose that made me go off the rails a bit, so to speak. The different printing methods were not needed at all, but what are you gonna do lol.