
Sanitize

sanitize is a bash script that replaces special and accented characters in a filename with their closest ASCII equivalents.

See a blog post for a discussion of the necessity and merit of this task.

Usage

Files in a given directory

Create a file called sanitize with the source below, make it executable, and run the following in the directory where files are to be sanitized:

for f in *; do ./sanitize "$f"; done

If you are happy with the output, uncomment the mv line. If required, extend the transliteration table:

s@XXX@YYY@g 

where XXX will be replaced by YYY, e.g.,

s@æ@ae@g 
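A new rule can be tried on a sample name before it goes into the script. The following is a minimal sketch: the `transliterate` function and the sample file name are made up for the illustration, and only a handful of rules from the table are included.

```shell
# transliterate: hypothetical helper that applies a few rules to one name
transliterate() {
    printf '%s\n' "$1" | sed '
s@ @_@g
s@æ@ae@g
s@\[@(@g
s@\]@)@g
'
}

transliterate "bæd name [1].txt"   # prints baed_name_(1).txt
```

Rules are applied top to bottom, each one to the result of the previous one.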

Octal codes are also possible. Use "ls -b" to find out which ones apply.
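As a complement to "ls -b", the bytes of a given character can be listed directly with od. This is a small sketch rather than part of the script; `octal_bytes` is a hypothetical helper name. Note that in a UTF-8 file name an accented letter occupies more than one byte, so it shows up as more than one octal code.

```shell
# octal_bytes: print every byte of the argument as a \oNNN escape
octal_bytes() {
    printf '%s' "$1" \
        | od -An -b \
        | tr -s ' ' '\n' \
        | sed -e '/^$/d' -e 's/^/\\o/' \
        | tr -d '\n'
    echo
}

octal_bytes 'A b'   # prints \o101\o040\o142
```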

All files in all subdirectories

Also create the run-sanitize file (source below) and run this instead:

find . -type d -exec sh -c "cd \"{}\" && run-sanitize \"*\""  \;

Directories

If you want the script to rename directories as well as files, you can use the following trick (add these lines to the sanitize script, or replace the mv line with them):

mkdir -p "`dirname "$sanitized"`"
cp "$1" "$sanitized"

This should be run with something like:

find . -type f -exec sanitize {} \;

This recreates the directory tree, with names sanitized according to your transliteration table, and copies the (also sanitized) files into it. If you are happy with the result, you can then delete the original structure (the script does not do this itself, for safety).
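The same idea can be sketched as a standalone function. This is a variant under stated assumptions, not the script above: it writes the sanitized copy into a separate destination directory instead of next to the originals, `sanitize_path` and `copy_sanitized` are hypothetical names, and only three rules stand in for the full table. File names containing newlines are not handled.

```shell
# sanitize_path: hypothetical stand-in for the full transliteration table
sanitize_path() {
    printf '%s\n' "$1" | sed -e 's@ @_@g' -e 's@\[@(@g' -e 's@\]@)@g'
}

# copy_sanitized SRC DEST: mirror SRC into DEST under sanitized names,
# leaving SRC untouched
copy_sanitized() {
    src=$1
    dest=$2
    (cd "$src" && find . -type f) | while IFS= read -r f; do
        target="$dest/$(sanitize_path "$f")"
        mkdir -p "$(dirname "$target")"
        cp "$src/$f" "$target"
    done
}
```

It would be called as, for example, `copy_sanitized "my photos" photos_clean`. Empty directories are not recreated, since only regular files are visited.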

Source

sanitize

#!/bin/bash
#  ____              _ _   _
# / ___|  __ _ _ __ (_) |_(_)_______
# \___ \ / _` | '_ \| | __| |_  / _ \
#  ___) | (_| | | | | | |_| |/ /  __/
# |____/ \__,_|_| |_|_|\__|_/___\___|
#
# sanitize v0.1
# FP Laussy -- fabrice.laussy@gmail.com
# http://laussy.org
# Sun Jun 5 17:20:50 CEST 2011
# (building on TeX+ :)
#
# This script removes special characters in filenames
# according to a transliteration table given below.
#
# Usage:
#
# Caution: this is potentially harmful!
# Use only if you know what you are doing.
#
# To use in the files within the same directory:
#
# for f in *; do ./sanitize "$f"; done
#
# To go recursively through subdirectories:
#
# find . -type d -exec sh -c "cd \"{}\" && run-sanitize \"*\"" \;
#
# where run-sanitize is provided separately. (it's essentially the
# command above put in a script).

sanitized=`printf '%s\n' "$1" | sed '
# delete input lines starting with %
/^%/d
# begin transliteration table:
s@ @_@g
s@Á@A@g
s@Æ@AE@g
s@Ê@E@g
s@É@E@g
s@Ë@E@g
s@Ì@I@g
s@Ý@Y@g
s@Ù@U@g
s@Ú@U@g
s@Ñ@N@g
s@\o323@O@g
s@à@a@g
s@æ@ae@g
s@á@a@g
s@ê@e@g
s@é@e@g
s@è@e@g
s@ë@e@g
s@ì@i@g
s@ñ@n@g
s@ó@o@g
s@ú@u@g
s@\o350@e@g
s@\o351@e@g
s@\o353@e@g
s@\o364@o@g
s@\o363@o@g
s@\o361@n@g
s@\[@(@g
s@\]@)@g
# end transliteration table
'`

# report the change; uncomment the mv line to actually rename
if [[ "$1" != "$sanitized" ]]
then
    echo "$1" "-->" "$sanitized"
    # mv "`pwd`/$1" "`pwd`/$sanitized"
fi

run-sanitize

To be used for propagating through subdirectories. The script "sanitize" must then be callable from anywhere (put it in /usr/local/bin, for instance).

#!/bin/bash

# echo `pwd`

# $* is left unquoted so that the quoted "*" passed by find
# expands to the file names here
for f in $*; do
    sanitize "$f"
done

History