Revision as of 16:30, 20 September 2012

Code

This page is still largely in progress.

This is a list of code we make available with no guarantee other than that it once worked for its intended purpose.

Beware: versions below one (e.g., v0.1) are $\beta$ versions. These may be all you find here.

  1. stampit — to stamp pdf files after their name.
  2. sanitize — to remove accents & special characters from filenames.
  3. putInDir — to move files inside directories bearing their name.
  4. uniqname — to generate a timestamp which can be used as a unique name.

Pattern substitutions

See Jukka “Yucca” Korpela's cheatsheet for regexps (archived)

Replace comma-separated digits by their point-separated counterpart

E.g., 123,45 → 123.45. To apply the change in place to all .dat files:

perl -pi -w -e 's/(\d+),(\d+)/$1.$2/g;' *.dat

Sanitize CSV files

The following strips non-data text (headers, comments following the values on a line, etc.) from CSV files, writing the numeric part of each file to a new file with .dat appended to its name:

for f in *.prf; do perl -ne 'print "$1\n" if /^([ \t]*([-+]?\d*\.?\d+([eE][-+]?\d+)?,)*[ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?)/' "$f" > "$f.dat" ; done

This variation keeps only lines that begin with at least 2 values and prints the first 2 of them (replace {1} by {$n-1$} to get $n$ values per line; longer lines are truncated to their first $n$ values):

for f in *.prf; do perl -ne 'print "$1\n" if /^(([ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?,){1}[ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?)/' "$f" > "$f.dat" ; done

Be sure you understand the script so that the sanitization goes as deep as you want it to. For instance, in its present form the first script keeps a line of the CSV file that contains a number but no comma, treating it as a valid line with one value.

Pretty print

We use Paul Grinberg's Code extension for MediaWiki to pretty-print source code through GeSHi on our website.