Yes, you are right: for match case to work, you need at least Python 3.10.
However, you can also download the zip file with the executable only here if you just want to use the program. In that case, it should work without issues.
Do let me know if you find any issue when using it so I can check that out.
I also had a look at your repository, and it seems like an excellent app with many more features than mine haha.
Well, I didn't feel like changing my Python version (that's a whole other topic) so I figured I'd just rewrite the match case construct to the older style.
Except, when I noticed all the logical Or symbols I was prompted to re-imagine how I would handle that sort of thing. I decided that I preferred the run-time handling to be just looking into a dictionary rather than doing a cascade of boolean comparisons.
So after some looking at other code I'd written here's what I came up with.
The prep work is to define a look-up resource - implemented as instructions for making two layers of dictionary.
from enum import Enum, unique, auto

@unique
class TypeOfFile(Enum):
    Document = auto()
    Audio = auto()
    Video = auto()
    Picture = auto()
    Executable = auto()
    Graphic2D = auto()
    Graphic3D = auto()
    Font = auto()
    Text = auto()
    Compressed = auto()
    DiskImage = auto()
    MobilePhone = auto()
    Databases = auto()

def SubFolder_For_TypeOfFile(tof):
    if tof in [TypeOfFile.Document]:
        return '/Documents/'
    elif tof in [TypeOfFile.Audio]:
        return '/Audio Files/'
    elif tof in [TypeOfFile.Video]:
        return '/Video Files/'
    elif tof in [TypeOfFile.Picture]:
        return '/Images/'
    elif tof in [TypeOfFile.Executable]:
        return '/Executable Files/'
    elif tof in [TypeOfFile.Graphic2D]:
        return '/Graphic Files/'
    elif tof in [TypeOfFile.Graphic3D]:
        return '/3D Graphics/'
    elif tof in [TypeOfFile.Font]:
        return '/Font Files/'
    elif tof in [TypeOfFile.Text]:
        return '/Text Files/'
    elif tof in [TypeOfFile.Compressed]:
        return '/Compressed Files/'
    elif tof in [TypeOfFile.DiskImage]:
        return '/Disk Images/'
    elif tof in [TypeOfFile.MobilePhone]:
        return '/Mobile Phone Related Files/'
    elif tof in [TypeOfFile.Databases]:
        return '/Databases Files/'

def Extensions_For_TypeOfFile(tof):
    if tof in [TypeOfFile.Document]:
        return ['.abw', '.aww', '.chm', '.cnt', '.dbx', '.djvu', '.doc', '.docm', '.docx', '.dot',
                '.dotm', '.dotx', '.epub', '.gp4', '.ind', '.indd', '.key', '.keynote', '.mht', '.mpp',
                '.odf', '.ods', '.odt', '.opx', '.ott', '.oxps', '.pages', '.pdf', '.pmd', '.pot',
                '.potx', '.pps', '.ppsx', '.ppt', '.pptm', '.pptx', '.prn', '.ps', '.pub', '.pwi',
                '.rtf', '.sdd', '.sdw', '.shs', '.snp', '.sxw', '.tpl', '.vsd', '.wpd', '.wps',
                '.wri', '.xps', '.numbers', '.ods', '.sdc', '.sxc', '.xls', '.xlsm', '.xlsx']
    elif tof in [TypeOfFile.Audio]:
        return ['.3ga', '.aac', '.aiff', '.amr', '.ape', '.arf', '.asf', '.asx', '.cda', '.dvf',
                '.flac', '.gp4', '.gp5', '.gpx', '.logic', '.m4a', '.m4b', '.m4p', '.midi', '.mp3',
                '.ogg', '.opus', '.pcm', '.rec', '.snd', '.sng', '.uax', '.wav', '.wma', '.wpl',
                '.zab']
    elif tof in [TypeOfFile.Video]:
        return ['.264', '.3g2', '.3gp', '.ard', '.asf', '.asx', '.avi', '.bik', '.dat', '.dvr',
                '.flv', '.h264', '.m2t', '.m2ts', '.m4v', '.mkv', '.mod', '.mov', '.mp4', '.mpeg',
                '.mpg', '.mts', '.ogv', '.prproj', '.rec', '.rmvb', '.swf', '.tod', '.tp', '.ts',
                '.vob', '.webm', '.wlmp', '.wmv']
    elif tof in [TypeOfFile.Picture]:
        return ['.bmp', '.cpt', '.dds', '.dib', '.dng', '.emf', '.gif', '.heic', '.ico', '.icon',
                '.jpeg', '.jpg', '.pcx', '.pic', '.png', '.psd', '.psdx', '.raw', '.tga', '.thm',
                '.tif', '.tiff', '.wbmp', '.wdp', '.webp']
    elif tof in [TypeOfFile.Executable]:
        return ['.air', '.app', '.application', '.appx', '.bat', '.bin', '.com', '.cpl', '.deb',
                '.dll', '.elf', '.exe', '.jar', '.js']
    elif tof in [TypeOfFile.Graphic2D]:
        return ['.abr', '.ai', '.ani', '.cdt', '.djvu', '.eps', '.fla', '.icns', '.ico', '.icon',
                '.mdi', '.odg', '.pic', '.psb', '.psd', '.pzl', '.sup', '.vsdx', '.xmp']
    elif tof in [TypeOfFile.Graphic3D]:
        return ['.3d', '.3ds', '.c4d', '.dgn', '.dwfx', '.dwg', '.dxf', '.ipt', '.lcf', '.max',
                '.obj', '.pro', '.skp', '.stl', '.u3d', '.x_t']
    elif tof in [TypeOfFile.Font]:
        return ['.eot', '.otf', '.ttc', '.ttf', '.woff']
    elif tof in [TypeOfFile.Text]:
        return ['.1st', '.alx', '.application', '.asp', '.csv', '.htm', '.html', '.log', '.lrc',
                '.lst', '.md', '.nfo', '.opml', '.plist', '.reg', '.rtf', '.srt', '.sub', '.tbl',
                '.text', '.txt', '.xml', '.xmp', '.xsd', '.xsl', '.xslt', '.ini']
    elif tof in [TypeOfFile.Compressed]:
        return ['.001', '.002', '.003', '.004', '.005', '.006', '.007', '.008', '.009', '.010',
                '.7z', '.7z.001', '.7z.002', '.7z.003', '.7z.004', '.7zip', '.a00', '.a01', '.a02',
                '.a03', '.a04', '.a05', '.ace', '.air', '.appxbundle', '.arc', '.arj', '.bar',
                '.bin', '.c00', '.c01', '.c02', '.c03', '.cab', '.cbr', '.cbz', '.cso', '.deb',
                '.dlc', '.gz', '.gzip', '.hqx', '.inv', '.isz', '.jar', '.msu', '.nbh', '.pak',
                '.part1.exe', '.part1.rar', '.part2.rar', '.pkg', '.pkg', '.r00', '.r01', '.r02',
                '.r03', '.r04', '.r05', '.r06', '.r07', '.r08', '.r09', '.r10', '.rar', '.rpm',
                '.sit', '.sitd', '.sitx', '.tar', '.tar.gz', '.tgz', '.uax', '.vsix', '.webarchive',
                '.z01', '.z02', '.z03', '.z04', '.z05', '.zab', '.zip', '.zipx']
    elif tof in [TypeOfFile.DiskImage]:
        return ['.000', '.ccd', '.cue', '.daa', '.dao', '.dmg', '.img', '.img', '.iso', '.mdf',
                '.mds', '.mdx', '.nrg', '.tao', '.tc', '.toast', '.uif', '.vcd']
    elif tof in [TypeOfFile.MobilePhone]:
        return ['.apk', '.asec', '.bbb', '.crypt', '.crypt14', '.ipa', '.ipd', '.ipsw', '.lqm',
                '.mdbackup', '.nbh', '.nomedia', '.npf', '.pkpass', '.rem', '.rsc', '.sbf', '.sis',
                '.sisx', '.spd', '.thm', '.tpk', '.vcf', '.xap', '.xapk']
    elif tof in [TypeOfFile.Databases]:
        return ['.accdb', '.accdt', '.csv', '.db', '.dbf', '.fdb', '.gdb', '.idx', '.mdb', '.mdf',
                '.sdf', '.sql', '.sqlite', '.wdb']

def make_ext_lookups():
    # make a lookup dictionary by TypeOfFile with each one's subfolder name
    dct_filetype_subfolder = {}
    for tof in TypeOfFile:
        dct_filetype_subfolder[tof] = SubFolder_For_TypeOfFile(tof)
    # make the extension list
    dct_extensions = {}
    for tof in TypeOfFile:
        for ext in Extensions_For_TypeOfFile(tof):
            if ext in dct_extensions:
                print("Ignoring multiple use of " + ext + " for " + SubFolder_For_TypeOfFile(tof)
                      + " is already in " + dct_filetype_subfolder[dct_extensions[ext]])
            else:
                dct_extensions[ext] = tof
    return dct_filetype_subfolder, dct_extensions
I used an enumeration as the link between the two - in effect this is a translation of your various case groups.
Then, at the top of def organize(directory), I added a line calling make_ext_lookups(), which constructs an instance of the nested dictionaries.
Then, instead of your match structure, I do:
# replace use of match with a two-level dictionary lookup
if ext in dct_extensions:
    move_files(directory, file, dct_filetype_subfolder[dct_extensions[ext]])
else:
    move_files(directory, file, '/Others/')
By the way, inside def make_ext_lookups(): I added a check to tell me if I'd miskeyed when I adapted the extension lists. This was done with if ext in dct_extensions: and the print that it does.
I wasn't actually expecting that to show anything, but as it happened, it did - printing the following:
Ignoring multiple use of .ods for /Documents/ is already in /Documents/
Ignoring multiple use of .gp4 for /Audio Files/ is already in /Documents/
Ignoring multiple use of .asf for /Video Files/ is already in /Audio Files/
Ignoring multiple use of .asx for /Video Files/ is already in /Audio Files/
Ignoring multiple use of .rec for /Video Files/ is already in /Audio Files/
Ignoring multiple use of .djvu for /Graphic Files/ is already in /Documents/
Ignoring multiple use of .ico for /Graphic Files/ is already in /Images/
Ignoring multiple use of .icon for /Graphic Files/ is already in /Images/
Ignoring multiple use of .pic for /Graphic Files/ is already in /Images/
Ignoring multiple use of .psd for /Graphic Files/ is already in /Images/
Ignoring multiple use of .application for /Text Files/ is already in /Executable Files/
Ignoring multiple use of .rtf for /Text Files/ is already in /Documents/
Ignoring multiple use of .xmp for /Text Files/ is already in /Graphic Files/
Ignoring multiple use of .air for /Compressed Files/ is already in /Executable Files/
Ignoring multiple use of .bin for /Compressed Files/ is already in /Executable Files/
Ignoring multiple use of .deb for /Compressed Files/ is already in /Executable Files/
Ignoring multiple use of .jar for /Compressed Files/ is already in /Executable Files/
Ignoring multiple use of .pkg for /Compressed Files/ is already in /Compressed Files/
Ignoring multiple use of .uax for /Compressed Files/ is already in /Audio Files/
Ignoring multiple use of .zab for /Compressed Files/ is already in /Audio Files/
Ignoring multiple use of .img for /Disk Images/ is already in /Disk Images/
Ignoring multiple use of .nbh for /Mobile Phone Related Files/ is already in /Compressed Files/
Ignoring multiple use of .thm for /Mobile Phone Related Files/ is already in /Images/
Ignoring multiple use of .csv for /Databases Files/ is already in /Text Files/
Ignoring multiple use of .mdf for /Databases Files/ is already in /Disk Images/
So you might want to check your source code for similar double presences in your case lines.
Anyway, now that I have a variant that runs on my older Python (which will be just whatever is installed on Xubuntu 20.04), it seems to work nicely.
I don't know that I like "Others" as a folder name for unrecognised things - I'd prefer something either alphabetically before or after all the rest.
It took me a moment to understand how your solution worked, since I'm still learning a lot of new stuff and had not used enum before, but after testing for a while with the debugger I understood how you solved the sorting using enum and dictionaries. I wanted to ask you: does this make the code run more efficiently, or was it just your workaround to not use match case? I'm not sure yet how to test whether code is more or less efficient, so I would appreciate it if you could tell me whether there is a difference in performance between the two approaches.
Regarding the duplicates, I did have a look and found that the source I got the extensions list from has some extensions listed in multiple categories, which is why your code found the same extension used in different categories. I guess I'll have to decide manually which category each of those extensions should be sorted into.
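To help with that manual decision, here is a minimal sketch that lists every extension appearing more than once, together with the categories that claim it. The category names and the shortened lists are made up for illustration, and find_shared_extensions is a hypothetical helper, not something from either program.

```python
from collections import defaultdict

# Hypothetical, shortened category data; the real lists are much longer.
EXTENSIONS_BY_CATEGORY = {
    "Documents": [".doc", ".rtf", ".ods", ".ods"],
    "Text Files": [".txt", ".rtf", ".csv"],
    "Databases": [".db", ".csv"],
}

def find_shared_extensions(extensions_by_category):
    """Return {extension: [categories...]} for extensions listed more than once."""
    seen = defaultdict(list)
    for category, extensions in extensions_by_category.items():
        for ext in extensions:
            seen[ext].append(category)
    # keep only the extensions that were claimed at least twice
    return {ext: cats for ext, cats in seen.items() if len(cats) > 1}

shared = find_shared_extensions(EXTENSIONS_BY_CATEGORY)
for ext, cats in sorted(shared.items()):
    print(ext, "appears in", cats)
```

An extension repeated inside a single category (like .ods above) shows up with the same category named twice, which matches the "/Documents/ is already in /Documents/" kind of message.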
Well, I certainly made the change to not use match case.
My main reaction though was because I don't like seeing so much data-like material being hard coded. Where I can, I like to move that kind of thing into a data structure. This often has the advantage of making the code simpler at the point of decision.
But another advantage is that it prepares the ground for maybe loading that decision data from a config file, say from a JSON file. That way, fine tuning of what the program does can be done without rewriting it.
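As a rough sketch of that idea, assuming a JSON layout of subfolder names mapped to lists of extensions (the layout, CONFIG_TEXT and load_extension_map are all invented for this example), the decision data could be loaded like this:

```python
import json

# Hypothetical config text; in practice it would be read from a file.
CONFIG_TEXT = """
{
    "/Documents/": [".doc", ".pdf"],
    "/Audio Files/": [".mp3", ".wav"]
}
"""

def load_extension_map(config_text):
    """Invert {subfolder: [extensions]} into {extension: subfolder}."""
    subfolder_to_exts = json.loads(config_text)
    ext_to_subfolder = {}
    for subfolder, extensions in subfolder_to_exts.items():
        for ext in extensions:
            # the first category to list an extension wins
            ext_to_subfolder.setdefault(ext.lower(), subfolder)
    return ext_to_subfolder

ext_map = load_extension_map(CONFIG_TEXT)
print(ext_map.get(".mp3", "/Others/"))
print(ext_map.get(".xyz", "/Others/"))
```

Adding or removing an extension then means editing the JSON text, not the Python code.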
As for which is more "efficient" that might depend on quite what meaning you want for that.
As you had a match case construct, answering that will partly depend on quite how that gets implemented under the hood, i.e. by CPython. Second-guessing (or even checking the CPython source code) seems to be a popular game in some quarters. My personal view is that if that's a worry then it's probably time to code in something other than Python. It is an interpreted scripting language after all.
Anyway, I wasn't really comparing to the match case, rather to the complex of if and elif blocks I would need and also the number of Or operators - just to replace what you had.
When I see a lot of Or usage I tend to remember that the speed of the operation becomes quite variable depending on the data. It was that thought of variance that prompted me to think that for each distinct extension, we (the programmers) already know which categorisation should be used. So how do we express that best in Python? Well in short, using a known value to get another known related value is what dictionaries are good at.
I strongly suspect that the dictionary lookup is faster than a lot of cascading conditional logic operators, but it does beg the question of how dictionaries are handled by CPython.
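One way to check a suspicion like that, instead of guessing, is the standard library's timeit module. The sketch below uses a tiny made-up mapping and two hypothetical functions; with only three branches the difference may be negligible or even reversed, and the numbers will vary by machine and Python version, so treat it as a template for measuring, not as a result.

```python
import timeit

# Hypothetical, tiny stand-ins for the real extension lists.
EXT_TO_FOLDER = {
    ".doc": "/Documents/",
    ".mp3": "/Audio Files/",
    ".zip": "/Compressed Files/",
}

def folder_by_dict(ext):
    # one hash lookup, regardless of how many extensions are known
    return EXT_TO_FOLDER.get(ext, "/Others/")

def folder_by_cascade(ext):
    # comparisons are tried in order until one matches
    if ext == ".doc":
        return "/Documents/"
    elif ext == ".mp3":
        return "/Audio Files/"
    elif ext == ".zip":
        return "/Compressed Files/"
    else:
        return "/Others/"

for func in (folder_by_dict, folder_by_cascade):
    seconds = timeit.timeit(lambda: func(".zip"), number=100_000)
    print(func.__name__, round(seconds, 4))
```

In complexity terms, an average dictionary lookup takes roughly constant time, while a cascade grows with the number of comparisons made before a match.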
For that matter, I could have constructed a dictionary that directly mapped from extension to sub-folder - rather than the double-dictionary method that I first wrote. Thus are the many, many options of tackling these things.
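A minimal sketch of that single-dictionary variant, using a cut-down hypothetical enum (Kind) and two-entry lookups in place of the real ones:

```python
from enum import Enum, auto

class Kind(Enum):  # hypothetical, cut-down stand-in for TypeOfFile
    Document = auto()
    Audio = auto()

# The two layers, as in the double-dictionary method.
kind_to_subfolder = {Kind.Document: "/Documents/", Kind.Audio: "/Audio Files/"}
ext_to_kind = {".doc": Kind.Document, ".mp3": Kind.Audio}

# Collapse the two layers into one direct extension -> subfolder mapping.
ext_to_subfolder = {ext: kind_to_subfolder[kind] for ext, kind in ext_to_kind.items()}

def subfolder_for(ext):
    """Single lookup, with '/Others/' for anything unrecognised."""
    return ext_to_subfolder.get(ext, "/Others/")

print(subfolder_for(".mp3"))
print(subfolder_for(".xyz"))
```

The trade-off is that the enum layer keeps the category as a distinct value, which is handy if something other than the folder name ever needs to hang off it.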
As for when to construct those dictionaries, that comes down to knowing the scope and lifespan of the program. As a quick thing to do, I put that construction step - that is, calling the function to do it - inside the def organize(directory). But the way your program currently works, it could have been done outside that, and so only be done once to cover all runs of organize. I was just too lazy to work out which way to do that: make it global, pass it in as a control parameter etc.
BTW another thing that the as-data approach enables is having alternate dictionaries to pass to the organise function. For example, there could be some stock but varied combination for the user to select among.
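Here is a rough sketch of both points: the rule dictionaries are built once at module level, and the one to use is passed in as a parameter. The preset names, their contents and plan_moves are hypothetical; the function only returns a plan and does not touch any files.

```python
import os.path

# Hypothetical presets, built once when the module is loaded.
DEFAULT_RULES = {".doc": "/Documents/", ".mp3": "/Audio Files/"}
MEDIA_ONLY_RULES = {".mp3": "/Media/", ".mp4": "/Media/"}

def plan_moves(filenames, rules, fallback="/Others/"):
    """Return (filename, subfolder) pairs for the given rule dictionary."""
    plan = []
    for name in filenames:
        # splitext gives ('report', '.doc'); no extension gives ''
        ext = os.path.splitext(name)[1].lower()
        plan.append((name, rules.get(ext, fallback)))
    return plan

files = ["report.doc", "song.mp3", "README"]
print(plan_moves(files, DEFAULT_RULES))
print(plan_moves(files, MEDIA_ONLY_RULES))
```

Because the rules arrive as an argument, a user-selected preset is just a different dictionary handed to the same function.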
Sorry for the delay, I was offline for a few days. Yes, I can see what you mean about loading the decision data from a file, using data structures instead of hard coding it and rewriting the code each time you want to add or remove something.
As for the efficiency topic, I was asking more because right now I'm reading about time and space complexity and was curious whether this played a role in why you decided to do it like that. If I'm not mistaken, Python uses a garbage collector, so space complexity is out of the hands of the programmer (I think), but I'm still not sure how to know when a program will be more or less time complex.
Thanks for your comments, you have shared some really interesting stuff ^^