Revision as of 16:28, 20 September 2012

Code

This page is still largely in progress.

This is a list of code we make available with no guarantee, other than that it once worked for its intended purpose.

Beware: versions below one (e.g., v°0.1) are $\beta$-versions, and they may be all you find here.

  1. stampit — to stamp pdf files after their name.
  2. sanitize — to remove accents & special characters from filenames.
  3. putInDir — to move files inside directories bearing their name.
  4. uniqname — to generate a timestamp which can be used as a unique name.

Pattern substitutions

See Jukka “Yucca” Korpela's cheatsheet for regexps (archived)

Replace comma-separated digits by their point-separated counterpart

E.g., 123,45 → 123.45. To change this in place in all .dat files of the current directory:

perl -pi -w -e 's/(\d+),(\d+)/$1.$2/g' *.dat

Sanitize CSV files

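The following strips everything that is not numeric CSV data (headers, comments on lines following the data, etc.) from CSV files. For each .prf file it keeps only the leading run of comma-separated numbers of every line that starts with one, and writes the result to a matching .prf.dat file: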

for f in *.prf; do perl -ne 'print "$1\n" if /^(([ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?,)*[ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?)/' "$f" > "$f.dat"; done

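A variation of the one-liner above keeps only lines that begin with 2 values and prints exactly those 2 (replace {1} by {$n-1$} to get $n$ values per line). The pattern is not anchored at the end of the line, so anything after the last kept value, including further values, is dropped: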

for f in *.prf; do perl -ne 'print "$1\n" if /^(([ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?,){1}[ \t]*[-+]?\d*\.?\d+([eE][-+]?\d+)?)/' "$f" > "$f.dat"; done

Make sure you understand the script, so that it strips as much as you intend and no more.

Pretty print

We use Paul Grinberg's Code extension for MediaWiki to pretty-print source code through GeSHi on our website.