sanitize is a bash script that replaces special and accented characters in a filename with their closest ASCII equivalents.
See a blog post for a discussion of the necessity and merit of this task.
Save the source below as a file named sanitize, make it executable, and, in the directory whose files are to be fixed, run:
for f in *; do ./sanitize "$f"; done
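By default the script only prints each proposed renaming. If you are happy with the output, uncomment the mv line to actually rename the files. If required, extend the transliteration table with lines of the form: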
s@XXX@YYY@g
where XXX will be replaced by YYY, e.g.,
s@æ@ae@g
Octal codes are also possible, as in the \o351 entries of the table. Use "ls -b" to find out which codes a filename contains.
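Before adding a rule to the table, it can help to try it on a sample name. The sketch below is only an illustration: try_rule and octal_bytes are hypothetical helpers, not part of sanitize. The first applies one rule with sed, the second lists the bytes of a name in octal with od, which is an alternative to "ls -b" (as far as I can tell, "ls -b" only escapes characters that the current locale treats as non-graphic). The octal entries already in the table are single-byte Latin-1 codes (351 is é); in a UTF-8 name the same letter appears as two bytes, 303 and 251.

```bash
#!/bin/bash
# Hypothetical helpers for experimenting with the transliteration table.

# Apply a single sed rule to a sample name and print the result.
try_rule() {
    local rule=$1 name=$2
    printf '%s\n' "$name" | sed "$rule"
}

# Print the bytes of a name in octal, one three-digit group per byte.
octal_bytes() {
    printf '%s' "$1" | od -An -to1
}

try_rule 's@æ@ae@g' 'præsens.txt'    # prints: praesens.txt
```

Running octal_bytes on a problematic name shows which byte values it contains, which is what a new \oNNN rule would have to match.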
To go recursively through subdirectories, also create the run-sanitize file (source below), make it executable, and run instead:
find . -type d -exec sh -c "cd \"{}\" && ./run-sanitize \"*\"" \;
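The find command above runs ./run-sanitize, which in turn runs ./sanitize, relative to each directory it enters, so as written both scripts have to be present in every subdirectory. A possible way around this is sketched below. It is not the author's method: sanitize_tree is a hypothetical function, and it assumes that the script is given by its absolute path and that find supports -print0. Directories are visited deepest first (-depth), so a directory is only renamed after everything inside it has been handled.

```bash
#!/bin/bash
# Sketch: run a sanitize-like script on every entry of every directory
# below a root, without copying the script into each directory.
# Arguments: absolute path of the script, then the root (default: .).
sanitize_tree() {
    local script=$1 root=${2:-.}
    find "$root" -depth -type d -print0 |
    while IFS= read -r -d '' dir; do
        # Work inside the directory, as the original command does.
        (
            cd "$dir" || exit
            for f in *; do
                "$script" "$f" </dev/null
            done
        )
    done
}
```

A call would look like sanitize_tree "$PWD/sanitize" . when run from the directory holding the script. With the mv line still commented out, this only prints the proposed renamings.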
#!/bin/bash
#  ____              _ _   _
# / ___|  __ _ _ __ (_) |_(_)_______
# \___ \ / _` | '_ \| | __| |_  / _ \
#  ___) | (_| | | | | | |_| |/ /  __/
# |____/ \__,_|_| |_|_|\__|_/___\___|
#
# sanitize v0.1
# FP Laussy -- fabrice.laussy@gmail.com
# http://laussy.org
# Sun Jun 5 17:20:50 CEST 2011
# (building on TeX+ :)
#
# This script removes special characters in filenames
# according to a transliteration table given below.
#
# Usage:
# Caution: this is potentially harmful!
# Use only if you know what you are doing.
#
# To use in the files within the same directory:
#
# for f in *; do ./sanitize "$f"; done
#
# To go recursively through subdirectories:
#
# find . -type d -exec sh -c "cd \"{}\" && ./run-sanitize \"*\"" \;
#
# where run-sanitize is provided separately. (it's essentially the
# command above put in a script).

# "$1" is quoted so that runs of spaces and glob characters in the
# filename reach sed unchanged.
sanitized=$(printf '%s\n' "$1" | sed '
#drop input lines starting with %
/^%/d
#begin transliteration table:
s@ @_@g
s@Á@A@g
s@Æ@AE@g
s@Ê@E@g
s@É@E@g
s@Ë@E@g
s@Ì@I@g
s@Ý@Y@g
s@Ù@U@g
s@Ú@U@g
s@Ñ@N@g
s@\o323@O@g
s@à@a@g
s@æ@ae@g
s@á@a@g
s@ê@e@g
s@é@e@g
s@è@e@g
s@ë@e@g
s@ì@i@g
s@ñ@n@g
s@ó@o@g
s@ú@u@g
s@\o350@e@g
s@\o351@e@g
s@\o353@e@g
s@\o364@o@g
s@\o363@o@g
s@\o361@n@g
s@\[@(@g
s@\]@)@g
#end transliteration table
')

# Only report names that the table actually changes.
if [[ "$1" != "$sanitized" ]]
then
    echo "$1" "-->" "$sanitized"
    #mv "`pwd`/$1" "`pwd`/$sanitized"
fi
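Once the mv line is uncommented, two different names can map to the same sanitized name (for instance é.txt and e.txt both give e.txt), and a plain mv would then replace one file with the other. A minimal sketch of a more careful rename is given below. safe_rename is a hypothetical helper and not part of sanitize v0.1; it simply refuses to rename when the target already exists.

```bash
#!/bin/bash
# Sketch: rename old to new, but never overwrite an existing file.
# Returns 0 if nothing had to be done or the rename succeeded,
# and 1 if the target already exists.
safe_rename() {
    local old=$1 new=$2
    # Nothing to do when the name is unchanged.
    [[ "$old" == "$new" ]] && return 0
    if [[ -e "$new" ]]; then
        echo "skip: $new already exists" >&2
        return 1
    fi
    # -- stops option parsing, so names starting with - are safe.
    mv -- "$old" "$new"
}
```

If the function were defined inside sanitize, the mv line could be replaced by a call such as safe_rename "$1" "$sanitized".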
The run-sanitize script, to be used for propagating through subdirectories:
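#!/bin/bash
#echo `pwd`
# $* is left unquoted on purpose: the find command passes a literal "*",
# which is only expanded here, inside the target directory.
for f in $*; do
    ./sanitize "$f"
done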