bib2wiki or parsing bibTeX into mediawiki templates

In a previous post, I discussed a prospective bib2wiki script to automate the transcription of my bibTeX database to this website. That was over 10 years ago and I have forgotten what I did back then, so I had to do it again. I mentioned there that I had hacked bibtex2html to get the output I wanted but, for the life of me, I could not remember how. It does not seem easy to do with bibtex2html itself, so I turned to Perl instead.

Run on its own, the script does nothing: it needs to be given the keys of the wanted entries, separated by commas. For instance:

bib2wiki -keys=delvalle12a,delvalle13a


<u>[[Theory of Frequency-Filtered and Time-Resolved $N$-Photon Correlations]]</u>. [[E. del Valle]], [[A. González-Tudela]], [[F. P. Laussy]], [[C. Tejedor]] and [[M. J. Hartmann]] in [[Phys. Rev. Lett.]] [ '''109''':183601] ([[2012]]).

<u>[[Distilling one, two and entangled pairs of photons from a quantum dot with cavity QED effects and spectral filtering]]</u>. [[E. del Valle]] in [[New J. Phys.]] [ '''15''':025019] ([[2013]]).

It does what I want, i.e., it links everything that is linkable: the paper itself (in case it needs its own page, as special papers do), the authors, the journal, the paper through its DOI on the volume:page (this link used to be on the title, but I changed that so as to highlight papers instead) and the year. We also attach the paper itself through a little PDF icon if we are among its authors, as we have the right to do that. It also does not put a comma before the "and" that announces the last author, etc.
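To make the formatting rules concrete, here is a minimal Python sketch of how one such line could be assembled. It is not the Perl code of bib2wiki, which is not shown here; the function names `join_authors` and `format_entry` and the `url` parameter are hypothetical. The URL is left empty by default so that the result matches the listing above.

```python
def join_authors(authors):
    """Link each author and join them with no comma before the final 'and'."""
    linked = [f"[[{name}]]" for name in authors]
    if len(linked) <= 1:
        return "".join(linked)
    return ", ".join(linked[:-1]) + " and " + linked[-1]


def format_entry(title, authors, journal, volume, page, year, url=""):
    """Return one reference line in the wiki markup shown above.

    'url' is a hypothetical parameter for the DOI link target; it is
    empty by default, which reproduces the listing as displayed here.
    """
    return (
        f"<u>[[{title}]]</u>. {join_authors(authors)} in [[{journal}]] "
        f"[{url} '''{volume}''':{page}] ([[{year}]])."
    )


# Volume, page and year are passed as strings so that a page such as
# "025019" keeps its leading zero.
print(format_entry(
    "Theory of Frequency-Filtered and Time-Resolved $N$-Photon Correlations",
    ["E. del Valle", "A. González-Tudela", "F. P. Laussy",
     "C. Tejedor", "M. J. Hartmann"],
    "Phys. Rev. Lett.", "109", "183601", "2012",
))
```

With these inputs, the printed line is the first entry of the example above.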

If one wants all the entries, then:

bib2wiki -keys=all

<u>[[Multilinear formulas and skepticism of quantum computing]]</u>. [[S. Aaronson]] in [[Proceedings of the Thirty-sixth Annual ACM Symposium on Theory of Computing]] [ '''''':] ([[2004]]).

<u>[[The computational complexity of linear optics]]</u>. [[S. Aaronson]] and [[A. Arkhipov]] in [[Proceedings of the 43rd Annual ACM Symposium on Theory of Computing]] [ '''''':333] ([[2011]]).

[... 3534 entries there ...]

<u>[[Spintronics: Fundamentals and applications]]</u>. [[I. Zutic]], [[I. Fabian]] and [[S. Das Sarma]] in [[Rev. Mod. Phys.]] [ '''76''':323] ([[2004]]).

<u>[[Correlation spectroscopy of excitons and biexcitons on a single quantum dot]]</u>. [[V. Zwiller]], [[P. Jonsson]], [[H. Blom]], [[S. Jeppesen]], [[M.-E. Pistol]], [[L. Samuelson]], [[A. A. Katznelson]], [[E. Yu. Kotelnikov]], [[V. Evtikhiev]] and [[G. Björk]] in [[Phys. Rev. A]] [ '''66''':053814] ([[2002]]).

The idea is then the following: when writing a text that cites references which are not yet in the wiki (as templates), these appear as broken links in edit mode, in the list of templates used in the page (at the end). It could look something like this:

Screenshot 20230716 164012.png

Copy/paste this list into a file (say, templates) on which you run:

cat templates | perl -pe 's/Template:([a-zA-Z]+[0-9]{2}[a-z])\s\(edit\)\.*/\1/p' | sed '/Template/d' | awk -vORS=, '{ print $1 }' | sed 's/,$/\n/'

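That will output the list of needed templates on one line:

Kirkwood35a,Kirkwood39a,Kirkwood42a,Kirkwood50a,Lopezcarreno18b,Percus58a,Salsburg53a,Sells53a,Thiele63a,Wertheim63a,Zerniker37a

You can then pass this to bib2wiki: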

bib2wiki -keys=Kirkwood35a,Kirkwood39a,Kirkwood42a,Kirkwood50a,Lopezcarreno18b,Percus58a,Salsburg53a,Sells53a,Thiele63a,Wertheim63a,Zerniker37a

which will return the list you have to upload:

<u>[[Statistical Mechanics of Fluid Mixtures]]</u>. [[J. G. Kirkwood]] in [[J. Chem. Phys.]] [ '''3''':300] ([[1935]]).

<u>[[Molecular Distribution in Liquids]]</u>. [[J. G. Kirkwood]] in [[J. Chem. Phys.]] [ '''7''':919] ([[1939]]).

<u>[[The Radial Distribution Function in Liquids]]</u>. [[J. G. Kirkwood]] and [[E. Monroe]] in [[J. Chem. Phys.]] [ '''10''':394] ([[1942]]).


Then click the templates and copy/paste the reference. It's fairly fast. I might upgrade bib2wiki so that it uploads the templates directly, which was the intention ten years ago, but it's good to check beforehand as there might be some glitches (accents not yet supported, math mode in the title, etc.)
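For reference, the key-extraction step (the perl/sed/awk pipeline above) could also be sketched in Python as follows. This is a stand-in under assumptions and not the pipeline itself: the function name `extract_keys` and the sample text are hypothetical, and the exact wording of the pasted list (such as the "(view source)" suffix) depends on the wiki. It also scans the whole text instead of filtering line by line.

```python
import re

# Same pattern as in the perl one-liner: letters, two digits, one
# lowercase letter, then the " (edit)" marker that follows the name.
KEY_PATTERN = re.compile(r"Template:([a-zA-Z]+[0-9]{2}[a-z])\s\(edit\)")


def extract_keys(pasted_text):
    """Return the bibTeX-like keys found in the pasted list of templates."""
    return KEY_PATTERN.findall(pasted_text)


# Hypothetical excerpt of what gets pasted from the edit page.
sample = """Template:Kirkwood35a (edit)
Template:Anchor (view source)
Template:Percus58a (edit)
"""

# Build the command line to run next.
print("bib2wiki -keys=" + ",".join(extract_keys(sample)))
```

Templates whose names do not look like bibTeX keys (here, Anchor) are skipped, as in the original pipeline.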

To upgrade to the new style the references that were already present in the old one, one can open the page that lists all templates and copy its content into the $text variable of the script below:

#!/usr/bin/perl

use strict;
use warnings;

# Paste the content of the page listing all templates between the END markers.
my $text = <<'END';
Template:!	Template:!/doc	Template:!xt
Template:*	Template:15th	Template:16th
Template:17th	Template:18th	Template:19th
Template:;	Template:=	Template:?
Template:AWB standard installation	Template:According to whom	Template:Alcohol
Template:Alternative links	Template:Amo09a	Template:Anchor
Template:Salsburg53a	Template:Sanchezmunoz14a	Template:Sanchezmunoz14b
Template:Season needed
Template:See	Template:Self-published inline	Template:Self-published source
END

# Keep only the templates named like bibTeX keys (e.g., Amo09a).
my @arr = ($text =~ /Template:([a-zA-Z]+[0-9]{2}[a-z]*)/g);

# Print one transclusion per reference.
foreach (@arr) {
    print("{{Template:" . $_ . "}}\n");
}

That will generate a list which can be put in a sandbox to give access to all the works cited. They can then be updated (painfully but, hopefully, only once).

For me, on that occasion (and you see that I don't even need the sandbox), that was:


which upgraded this reference:

Screenshot 20230716 213634.png

to this one:

Screenshot 20230716 213757.png