Friday, July 4, 2014

Autophile: Automatically Sort Files into Folder by Name

I like to mark my students' reports electronically rather than mark paper copies. I type faster than I write, and I can copy and paste verbose comments when students do similar things. The major drawback is getting the work back to the students; it's always a painstaking process to manipulate each file into an email or a folder. To reduce that grunt work I wrote a program that moves files into folders with similar names. I call it Autophile.


The idea is to get your students to put their student ID in the file name so that the software can sort each student's work into their own folder on the network drive. Choose the files you want to sort, select the folders you want to sort them into, adjust settings, and click "Sort".


A confirmation window will pop up telling you where the program thinks each file should go. You can make adjustments or cancel the process. Clicking "Apply Sort" will move the files.

Confirmation Window
The program finds the best match between folder and file by computing something called the Damerau–Levenshtein distance. Basically, it's the number of letters you would have to substitute, remove, insert, or swap to make a portion of the file name look like the folder name.
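To make the idea concrete, here is a minimal sketch in Python (the program itself is written in C#). It uses the restricted "optimal string alignment" form of the Damerau–Levenshtein distance, and it assumes that "a portion of the file name" means a sliding window the same length as the folder name. Both function names are hypothetical and are not taken from the Autophile source.

```python
def osa_distance(a: str, b: str) -> int:
    """Edits (substitute, remove, insert, swap adjacent) needed to turn a into b."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # remove every letter of a
    for j in range(cols):
        d[0][j] = j  # insert every letter of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # removal
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or exact match)
            )
            # Swap of two adjacent letters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def best_window_distance(folder_name: str, file_name: str) -> int:
    """Smallest distance between the folder name and any same-length slice
    of the file name (assumed windowing scheme, case-insensitive)."""
    folder, name = folder_name.lower(), file_name.lower()
    width = len(folder)
    last_start = max(1, len(name) - width + 1)
    return min(osa_distance(folder, name[s:s + width]) for s in range(last_start))
```

For example, `best_window_distance("s12345", "report_s12354_final.docx")` finds the slice `s12354`, which is one adjacent swap away from the folder name.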

I was having problems with very short folder names getting picked up as the best match, so I modified the result a little bit to favor longer, less accurate matches over shorter, accurate matches. I weighted the D-L distance at 100% and the match length at 90% when doing the comparison. For example, I wanted to favor a match that was 5 letters long with one letter wrong over a match that was 3 letters long with no errors.
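One way to read that weighting, sketched in Python rather than the program's C#: subtract 90% of the match length from the full D-L distance and pick the lowest score. The formula and the function names are assumptions for illustration; the post does not spell out exactly how the two numbers are combined.

```python
def weighted_score(distance: int, match_length: int,
                   distance_weight: float = 1.0,
                   length_weight: float = 0.9) -> float:
    """Lower is better: errors add to the score, matched letters subtract from it."""
    return distance_weight * distance - length_weight * match_length


def pick_best(candidates):
    """candidates: iterable of (folder_name, distance, match_length) tuples.
    Returns the folder name with the lowest weighted score."""
    return min(candidates, key=lambda c: weighted_score(c[1], c[2]))[0]
```

Under these assumed weights, a 5-letter match with one error scores 1 - 4.5 = -3.5, which beats a perfect 3-letter match at 0 - 2.7 = -2.7.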

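You can download the program here (https://drive.google.com/file/d/0B-1eAQYKZo_5YmlGTF9neEdRWWM/edit?usp=sharing) or the source code here (https://drive.google.com/file/d/0B-1eAQYKZo_5bTlMWGVXX2M2Ykk/edit?usp=sharing).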

84 comments:

  1. I am a lawyer in Brazil, and this helped me a lot in organizing my files. Thank you.

  2. it is a really great help for me...thank you

  3. Hello, I am a slow learner and it's usually tough for me to learn new software. Your software is by far the best and easiest to use, and it was exactly what I was looking for. You saved me so much time and effort. Thanks a bunch!

  4. Wow... This is rather easy. Thank you for this, sir!

  5. Hello. Really good software. Are you open to taking this one step further ?

  6. This comment has been removed by the author.

  7. Hello, I recently found your program and decided to give it a try at my company, where it sorts up to 300 documents between a few thousand maps. I was wondering if there is any way to sort my files by their file name, since at the moment it seems to sort them randomly.

    Replies
    1. Hi Hazmat,

      That's what the program is supposed to do: find the best fit between folder name and file name. I'd be interested to see what the Sort Report window tells you. However, I can't really promise any help; I've moved on from this project. The source code is available in the post, so if there's anyone at your company who wants to pick it up, they are welcome to.

      Sorry I can't be more help,
      David

  8. Hi David,

    This software is exactly what I need, and it is simple too.

    I have one question: when a file with the same name is already in the folder, is it possible for the software to automatically rename the new file to name (2), as Windows does when we copy it manually?

    Replies
    1. Hi Unknown,

      That's a good point; it really should be copying without replacing. I haven't looked at this program in years, though. If I get a chance I'll make that update, but I wouldn't hold your breath ;)

      In the meantime, the source code is available. If you want to make an update, I have no objections.

      Source: https://drive.google.com/file/d/0B-1eAQYKZo_5bTlMWGVXX2M2Ykk/edit?usp=sharing
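
      For anyone attempting that update, here is a rough sketch of the "copy without replacing" idea. It is written in Python rather than the program's C#, the function names are hypothetical, and the "name (2)" numbering simply imitates what the question describes from Windows.

```python
import shutil
from pathlib import Path


def unique_destination(folder: Path, filename: str) -> Path:
    """Return a path in folder that does not exist yet, adding ' (2)', ' (3)', ...
    before the extension when the plain name is already taken."""
    candidate = folder / filename
    stem, suffix = candidate.stem, candidate.suffix
    counter = 2
    while candidate.exists():
        candidate = folder / f"{stem} ({counter}){suffix}"
        counter += 1
    return candidate


def move_without_replacing(source: Path, folder: Path) -> Path:
    """Move source into folder without overwriting an existing file."""
    target = unique_destination(folder, source.name)
    shutil.move(str(source), str(target))
    return target
```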

  9. THANK YOU!!! You saved me!!! -last minute guy

  10. Where on Earth did you learn this? That is such a handy algorithm, and your logic flow seems anything but hobby level. You're working in C# using VS pro 2015? Are you interested in custom work? And have you done any projects since? Lot of questions - you can point me to something or other.. Many thanks for turning me on to this.

    Replies
    1. You're welcome; glad it was helpful.

      The truth is, since you're curious, I learned to code by reading the manual of my TI-83 graphing calculator! I've progressed from a total amateur making all the classic mistakes to learning from them bit by bit. I still have a lot more to learn! I've been detailing some of my projects on this blog if you want to browse around.

      I did complete this project in C# using Visual Studio (2015 edition sounds about right). If you're interested in picking up a modern language for Windows programs I think that's a pretty good choice.

      I'm pretty busy with my full-time teaching job but I'm always open to talk about new opportunities. There is a group I organize called Coding for Good. There are several talented students in that group that might be willing to take on a project if it's for a good cause. (http://coding-for-good.s3-website-ap-southeast-1.amazonaws.com/). There's a contact form on that website if you want to drop us a line.

      Cheers,
      David

    2. Just replying to myself to mention that Coding for Good updated their website: codingforgood.co

    3. Just to notify that their site changed again to https://www.coding4good.net/

    4. Woah, cool. That's actually a different student group with the same idea as us.

      Sadly, we accidentally let our domain registration slip and lost control of our URL. Our own website is now at http://c4g.herokuapp.com/

  11. This is a great program! The only thing that would make it better is if you could also add a general PDF file that doesn't have their name (e.g. a general instruction sheet) into each folder as well.

    Replies
    1. That's a good idea. Thanks for your feedback!

    2. Any chance you've moved ahead on this? I love your program and would like to be able to add the same file to a range of folders.
      Thank you.

  12. Hi, it seems to be good software, but I haven't been able to get it to work. Sorting many files causes it to become non-responsive. Then there is a glitch when choosing both folders.

    Replies
    1. Hi the galian,

      Good point. I hadn't learned about Threading yet when I built this program so all processes run in only one thread. The software becomes non-responsive because if you have a lot of files or a lot of folders, it takes a long time for the computation to complete. While the computation is running the UI doesn't have access to the CPU to keep it responsive.
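
      As a rough illustration of the fix being described, the slow matching step can be handed to a worker thread while the main loop stays free to respond. This is a Python sketch, not the program's C#; `slow_match` is a hypothetical stand-in for the real matching work, and the tick counter stands in for a UI processing its events.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_match(names):
    """Hypothetical stand-in for the expensive file-to-folder matching step."""
    time.sleep(0.2)
    return sorted(names)


def run_with_responsive_loop(names):
    """Run slow_match on a worker thread while the calling thread keeps looping."""
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_match, names)
        while not future.done():
            ticks += 1        # a real UI would redraw and handle clicks here
            time.sleep(0.01)
        return future.result(), ticks
```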

      The computation time will grow linearly with the number of files you're trying to sort, but it will also grow geometrically with the number of folders you are sorting into and the length of the names of the files and folders. Unfortunately I don't think there's a way around that for this kind of application. You would have to find a faster way to identify matches between names.

      As for choosing two folders, I don't think the program supports sorting a single file into more than one folder.

      Cheers,
      David

  13. I LOOOOOOOVE YOU
    This might just have saved me about 2 weeks of moving files manually or learning how to write PowerShell commands. Combining this with “Bulk Rename Utility”, I can easily manipulate tens of thousands of files in a few hours. Thanks.

    Replies
    1. Tens of thousands of files? He must not have read my concern and the answer to it above. Not possible.

    2. the Galian, I mostly agree. But if the number of folders to sort into is small, then it might work. For example, if 1000's of files all go into just one folder, then 10,000's of files to sort might work.

  14. Thank... Thank you very much !!!~~

  15. Thank You So Much .. I do Appreciate Your Great Helps

  16. Hello, this is the exact program I'm looking for. However, when I choose nearest match and hit sort, it asks me to pick destination folders rather than having them filled out for me. The files and folders are clearly numbered, so they should be recognized as a match. Any ideas on what the issue is? I'm running Windows 10.

    Replies
    1. Hi Glenn,

      A couple possibilities:
      - in the "sort report" window, is the destination column blank, or filled in? If it's filled in that's normal behavior. The sort report is just to help you confirm before changes are made
      - if the destination column is blank, the program wasn't able to find a suitable match for a destination folder. Are the numbered folders very small numbers? If I remember correctly you need at least 3 or 4 digits before a match can be made, so numbers less than 100 aren't going to get matched. You could try generating some folders with leading zeros, like 00001, 00002, 00003 and see if they get matched to files
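
      If you'd rather not create those test folders by hand, a few lines of script can do it. This is a Python sketch; the function name and the default width of 5 digits are just illustrative choices that mirror the 00001 example.

```python
from pathlib import Path


def make_numbered_folders(parent: Path, count: int, width: int = 5) -> list:
    """Create folders named 00001, 00002, ... (zero-padded to width) under parent."""
    created = []
    for number in range(1, count + 1):
        folder = parent / f"{number:0{width}d}"
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(folder)
    return created
```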

  17. Thanks a lot.... it really worked :)))))

  18. can we just let the program to sort and auto create respective folders?

    Replies
    1. No, not quite. There's not really a way to know how to name the folders that would need to be created. In the video, there are two examples of folder creation though. At 2:40 you can see how deeper directory trees can be created and at 4:50 you can see how you can use a list (like a list of clients' names) to create a series of folders together.

      But no, there's no way that I can imagine to take the name of a file and guess what an appropriate folder name to create for it would be.

  19. That is beautiful piece of software, thanks.

    Replies
    1. You're welcome! I don't really have a need for it anymore so I'm really glad to hear that other people are getting value out of it :)

  20. I am a music producer. I have a lot of unarranged samples, and this really helped me to arrange them. If I had done it one by one it might have taken about 2-3 days, as I have millions of WAV sounds. Thanks man, it is really great. Can I share it with my friends as well?

  21. Wow! That's so awesome to hear! Thanks for the cool feedback.

    Please feel free to share :)

  22. This program is a life saver. Had 700,000 project files that needed major sorting before their next step in processing. Would've taken months to complete instead of just a few hours like with this program. Thank you for all the good you are putting out into the world!

    Replies
    1. You're welcome :)

      700,000 files is really quite a lot! Out of curiosity, how many different folders did they need to get sorted into and how long did it take for Autophile to find matches?

      I hadn't really envisioned sorting so many files when I wrote the software. It might have taken quite a long time!

    2. I did smaller batches at a time going into just 35 folders. Only takes a minute or 2 really. I tried 13000 at first but it doesn't seem to add any files if there are too many files in a folder. So I would just drag a few thousand files in a folder then sort/move and repeat. It's really great software Thanks again.

    3. Interesting. Thanks for that data.

      I never really envisioned sorting so many files when I designed this project. I think it may have worked out because while you have a huge number of files, you have only a relatively few folders to sort into. I'm glad to hear it worked out!

  23. Fantastic - I'm using it to sort photos for a Camera Club - Many thanks from across the Ditch

    Replies
    1. You're welcome! I'm really glad to hear that you and others are getting use out of it.

  24. Hello David,
    I came across your video in search of a solution. The video displays exactly what I need. I am having difficulty getting the program to work on my computer though. I have about 368 files (which contain the last names) to sort into corresponding first and last name folders. Every time I attempt to use it, it freezes and is non-responsive. I've tried on 3 computers. Any suggestions?

    Thanks in advance for suggestions,

    Ryan Gorsky

    Replies
    1. I'm not David, but freezing while handling many files is common on my end, since it has no progress bar or anything like that. Waiting for it to complete is the only suggestion I can give you; it may take a while, even for my Ryzen 3 3rd gen system, to do the job. I hope it helps. Cheers mate xD

    2. Hi Ryan,

      Mael is right. The program freezes while calculating the best fit for files to folders, so the only solution is to give it time.

      You can try sorting just one file first to see how long it takes. You can expect the whole process to take 368 times longer for the full file selection. Go have a coffee and come back :)

      By the way, how many folders are you sorting into? If it's quite a lot, that will have a big impact on sorting time.

    3. Hey,
      I attempted to do only five files for five folders. The same thing keeps happening. This time I waited two and a half hours. Still no response from the program. The folders have the first and last name, and the files contain the last name and first initial. Would that have any effect? I am not sorting through exact matches.

      Ryan

    4. Sorry to hear it's not working. Five files into five folders should take less than a second, not hours. The file and folder names sound appropriate to me, but I'm not sure why it's getting stuck. I'm sorry to have to let you down, but I don't think I have a solution for you :(

  25. Hey David, you are a ROCK STAR! Look how much you have helped people over the years through this project! I completed student teaching in fall of 2019, so when spring 2020 came I was expecting to begin officially teaching in public schools (I had taught in non public school settings for 13 years). Well, enter COVID-19 and the world flipped upside down and all plans were down the drain. So, over the pandemic I have been forced to be creative, build a new skill base and took on learning graphic design thinking I would be designing products for Teachers Pay Teachers. I quickly found that I now have NO IDEA what teachers need with the virtual learning platforms - many schools are restricting what content teachers can/should use and now that we've gone almost 100% paperless, there is a whole new set of needs. Anyway, the first graphic bundle I purchased was over 16,000 files over multiple themed collections. The creator had them well organized so I was really quite spoiled, but have since downloaded about 10,000 folders of additional graphics with a million individual image files within them, not counting the crazy amt of fonts I've acquired as well - it has all turned into an overwhelming mess, hence why I searched YouTube and found your video today! What does this have to do with anything, well here's what I've been experiencing:

    So, the process is daunting when you find this awesome bundle with 50 FREE graphics or whatever in several themes, you scan and open the zipped files all at once and they split into a million pieces because hardly any of these creators thought to put them in a sub folder first! This causes you to lose the very important "Read Me" files containing licensing info, even though they were supposed to be "free for commercial use". So my point in this whole life story scenario is that if I can get this program to work for me by auto sorting these, it would save me SOOOOOO much time and heartache managing these images, meaning I can spend more time creating and "entrepreur-ING". So, seriously, thank you sooooooo much!! I gave all the deets because I thought you'd find it interesting to see how you are helping people in other fields of work with your awesome "project". Blessings to you from Virginia, U.S.A!!!

    Replies
    1. Thank you so much for your really wonderful message! It brightened my day. As a fellow teacher, I'm sure you know the warm fuzzy feeling you get from knowing you helped make things a bit better for someone else :)

  26. You are my savior.
    I sorted about 70,000 files and over 70 GB of data with this program. It takes a while for each file, but the program used only about 10% of my CPU and a sip of my memory, so I divided the files and ran 10 copies of the exe at once. This still took me about a day and left some files over, but it was much better than doing it by hand.

    Replies
    1. Wow! That's incredible! When I made this program, I never imagined that someone would use it for such a big job. I was facing the repetitive task of sorting the work of 20-30 students into folders over and over again. Never thought that someone might have 70,000 files to sort!

      One day I will have to come back to this project and re-think it for users with big sorting needs like yourself and Galian :{

  27. THANK YOU SO MUCH! Your program saved me hours! I had to sort almost a thousand files for an upcoming art exhibition. Up until now, this has always been done manually. I am so glad I found your site! Autophile will be a trusty companion from now on :)

    Replies
    1. Wow, glad to hear it! Thanks for the kind words :)

  28. Hey David, this is awesome and saved me many hours sorting my files.
    I'm having a little problem and hopefully you can help!
    I have 2 files called A and B that I want to sort to the right folder, and for file A the folder is called "A superseded by B". With both exact match and nearest match, Autophile does not seem to work. Is there a way the program can sort by the first couple of characters (to be set by the user) of the destination folder? Many thanks in advance!

    Replies
    1. Hmm...So for example you have files:

      apple.txt
      banana.txt
      cantaloupe.txt

      and folders:

      /apple superseded by banana
      /banana superseded by cantaloupe
      /cantaloupe (current version)

      And you'd want banana.txt to sort into /banana superseded by cantaloupe but the sorting algorithm picks /apple superseded by banana?

      If that's the case, I think the best suggestion I can offer is this:

      Sort your files one folder at a time from the bottom up. Meaning, attempt to sort all the files into just /cantaloupe (current version). Files with cantaloupe in their name will move out of your unsorted directory while files with very poor matches (like banana.txt and apple.txt) won't sort at all and will remain in the unsorted directory.

      Then, do the same process for /banana superseded by cantaloupe. All the cantaloupe files will have already moved so they won't get mis-sorted into this directory, banana files will match in, and apple files won't get sorted because they are a poor match for this directory.

      Continue this process all the way back to your first directory, /apple superseded by banana

    2. Hi David, many thanks for quick response!

      Your example is close, and a good solution if I had hundreds of files with "apple" or "banana" in the file name.
      In my case it's hundreds of different file names to go in hundreds of folders, so drag-and-drop is possibly a quicker process for now.

      The Autophile program would work for me if it would sort the files by the first 10 (or fewer) characters of the folder name.
      I.e. for a file like apple.txt, a near match would be "/apple sup", the first 10 characters of "/apple superseded by banana".

      Is it something you could look into, or do you have any ideas?

      many thanks

      Nardi

    3. Sorry, this probably isn't the answer you were hoping for, but I don't think Autophile has the feature you're describing. I'll consider your suggestion for the next version, but that's not really on the horizon at the moment.

      In the meantime, if you feel like downloading the source code (https://drive.google.com/file/d/0B-1eAQYKZo_5bTlMWGVXX2M2Ykk/edit) and compiling it yourself, just a small change will hard-code the behavior you're describing.

      In Form1.cs, change line 295 from this:

      Distance = DamerauLevenshteinDistance(Piece.Text.ToLower(), Puzzle.Substring(LowerIndex, UpperIndex - LowerIndex).ToLower());

      to this:

      Distance = DamerauLevenshteinDistance(Piece.Text.ToLower().Substring(0, Math.Min(10, Piece.Text.Length)), Puzzle.Substring(LowerIndex, UpperIndex - LowerIndex).ToLower());

      Substring(0, 10) keeps the first 10 characters, and the Math.Min guard stops it from throwing an exception when the name is shorter than that.

    4. Hi David,

      again many thanks for quick reply. I will practice with this a little and see if I can get it going.

  29. THANK YOU SO MUCH! Your program saved me hours! I had to sort almost a thousand files for an upcoming art exhibition. Up until now, this has always been done manually. I am so glad I found your site! Autophile will be a trusty companion from now on

    Replies
    1. Glad to hear it helped you so much! Thank you for taking the time to share :)

  30. This comment has been removed by the author.

  31. Hi, at work I am constantly moving files to certain subfolders based on the number of the file and it takes up a lot of my time so I've been trying to automatize this recently. The problem is that the name of the file matches the name of the subfolder only partially. We create a folder for each deal we have and those deals are each identified as follows 16G02_01. Then we name all the files pertaining to that specific deal with a numerical sequence.

    For example, let's say we have a deal 16G02_01_5156156_1254_16-02-2021. We create a subfolder called 16G02_01. This sub folder is located in folder North -> region 1 – building 1 - 16G02_01 I then move those files to the corresponding folder manually. I was wondering if there was any way to automatize this with this tool considering only a part of the file name matches with the destination folder name. Thank you!

    Replies
    1. Hi Nono,

      I've got good news, and bad news.

      The good news is that the partial match shouldn't be a problem - that's what the program is designed for. Change "Exact Match" to the "nearest match" option under "3) Set sorting options"

      As for looking for matches to folders that are buried in the directory tree, first, let me see if I understand correctly:

      You have several folders for different regions:

      North
      South
      East
      Europe
      etc

      And in each of these folders you have many folders pertaining to specific deals:

      16G02_01
      16G02_02
      etc...

      If that's the case, the easiest (but imperfect) solution is to try sorting into just one region at a time. It means that you'll need to know which region to sort into.

      Unfortunately, there is not another good solution as far as I can tell. I had a look for ways to create a single mirror of the contents of each region, but I don't think Windows supports that. I also had a look back through the source code, but I don't think there is a simple fix for combining destination directories from multiple branch directories.

      It might be a bit of cold comfort, but you've got a great problem to solve here. I'd like to try to add it to the next version of the program; but I don't think it will be ready any time soon :(

  32. This is truly amazing! Even after many years, the publisher is still answering our questions.

    Thank you! You're a life saver.

  33. Hi David! Firstly, thank you for creating this program! I use this at work almost every day and it has saved me days of file sorting at this point. We'll soon be using it to digitize about 6 years worth of physical files.

    Secondly - I'd like to customize it so that the input folder defaults to my "Dropzone" folder and the output defaults to our "Clients folder". I see that you've provided the source code, but I'm not sure how to go about editing it and turning it back into an application.

    Do you know of any articles that would detail this? I tried to look it up but I'm having difficulty finding resources. I am trying to learn how to code but I'm still very much a beginner!

    Replies
    1. Hi Liz!

      Thank you so much for letting me know you're getting good use out of the program. It's really a great feeling to know :)

      That sounds like a pretty do-able edit and a great first job for a beginner programmer. I'll try not to make any assumptions about things you might already know.

      You'll probably want to download something called an IDE. It's specialized software that helps you write, test, and run your own code. The source code for this project is written in the programming language C# using the IDE Visual Studio, so I'd suggest you start with a beginner's tutorial on C# in Visual Studio. This one looks promising: https://www.w3schools.com/cs/cs_getstarted.php

      After completing the tutorial you might feel comfortable importing the source code I've provided and trying some edits of your own. If not you might instead look for a C# tutorial on file handling since that's the job you're interested in.

      I hope that helps!

    2. Thank you so much! I'll try this.

  34. Thank you very much. May you be rewarded with good and guided to Islam.

  35. Replies
    1. Please give me your contact details. I am from Pakistan and I want to learn how to make this.

    2. If you just want to use the program, it is freely available here: https://drive.google.com/file/d/0B-1eAQYKZo_5YmlGTF9neEdRWWM/edit?usp=sharing

      But if you are asking me how to code the project, or manipulate the source code, I'm sorry to say that I don't have the capacity to walk you through that process personally. Here are the general steps:

      You'll probably want to download something called an IDE. It's specialized software that helps you write, test, and run your own code. The source code for this project is written in the programming language C# using the IDE Visual Studio, so I'd suggest you start with a beginner's tutorial on C# in Visual Studio. This one looks promising: https://www.w3schools.com/cs/cs_getstarted.php

      After completing the tutorial you might feel comfortable importing the source code I've provided and trying some edits of your own.

      Hope that helps!

  36. Hi, I can't seem to open the gdrive link. it's taking quite some time to open. Help?

    Replies
    1. Hi Unknown,

      On the Google Drive page with the .exe file, there should be a little download icon in the upper right corner of the screen. Exe files don't run in Google Drive, so it won't load right there in your browser. You'll need to download it.

    2. Hi, I managed to download this. Thanks! Very useful!

  37. Has anyone tried to use this with Google Drive Desktop? I managed to use it once. Instead of moving, Autophile duplicated the files (not really an issue). But I'm not sure what happened after that. When I tried to sort the files again, Google Drive gave me a notification saying that if I want to duplicate files, I would have to use the Web App. Hope someone has experienced my situation. Thanks in advance!

  38. Is there a way to use this in Windows 11? It won't install, just keeps looking for required files.
