Not Wikilambda:Setup


This page describes the setup of Not Wikilambda. It’s only interesting if you want to know more about running a MediaWiki install on Toolforge, or investigate problems with this particular wiki. It’s probably not a useful resource on how to run the WikiLambda extension.

The entire wiki runs under the notwikilambda tool on Wikimedia Toolforge. If you have a Toolforge account, you can see most of the files belonging to the tool in its home directory, ~tools.notwikilambda. (Paths below will be relative to that base directory.) Changes to the setup are logged in the tool’s server admin log.

MediaWiki core is cloned from Git / Gerrit into public_html/w/, and extensions and skins are likewise cloned into subdirectories of public_html/w/extensions/ and public_html/w/skins/, respectively. I considered symlinking them in from /data/project/shared/, which has Git checkouts of MediaWiki-related code, but ultimately decided against it for two reasons. First, one of the extensions used on the wiki, WSOAuth, turned out to be broken in the Gerrit version and needed to be cloned from GitHub instead (T263955, now resolved). Second, I was uneasy that Git updates would then happen at random times outside my control, rather than at the same time as composer updates and the update.php maintenance script.

Dependencies are installed via Composer (I considered symlinking /data/project/shared/mediawiki/vendor/ but again decided against it). The public_html/w/composer.local.json file configures Composer to install dependencies not just of MediaWiki, but also of all extensions and skins.

Short URLs are configured in the .lighttpd.conf config file.

The wiki uses Toolforge user databases to store its data, by copying the user and password from replica.my.conf and prefixing the database name with s54524__. You can see the $wgDBserver, $wgDBname and $wgDBuser in the public_html/w/LocalSettings.php, but the $wgDBpassword is in public_html/w/PrivateSettings.php, which is not readable to others.
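The derivation of the database settings described above can be sketched in Python. This is only an illustration, not the code the wiki uses: it assumes replica.my.conf is an INI-style file with a [client] section, that the credential user is s54524 (inferred from the s54524__ prefix), and the suffix "wiki" and the sample password are hypothetical placeholders.

```python
import configparser


def db_settings(replica_cnf_text: str, suffix: str) -> dict:
    """Derive MediaWiki database settings from replica.my.conf-style text."""
    # Disable interpolation so a '%' in the password is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(replica_cnf_text)
    client = parser["client"]
    # Values may or may not be quoted in the file; strip quotes if present.
    user = client["user"].strip("'\"")
    password = client["password"].strip("'\"")
    return {
        "wgDBuser": user,
        # User database names are the credential user, two underscores,
        # then a name of your choosing.
        "wgDBname": f"{user}__{suffix}",
        "wgDBpassword": password,
    }


# Hypothetical file content; the real password is of course not public.
SAMPLE = """\
[client]
user = s54524
password = not-a-real-password
"""

settings = db_settings(SAMPLE, "wiki")
print(settings["wgDBname"])
```

In the real setup only the first two values would go into LocalSettings.php; the password belongs in PrivateSettings.php.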

MediaWiki is configured to use the CACHE_DB type, since no other type seems suitable. (Toolforge does not offer a memcached service. There is a Redis service, but it is shared between all tools, and the MediaWiki RedisBagOStuff class does not support configuring a cache key prefix, which would be needed to make using the shared instance feasible.) Image uploads, anonymous editing and local account creation are disabled.
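To illustrate why a key prefix matters on a shared cache, here is a toy Python sketch. A plain dictionary stands in for the shared Redis instance; the key names, the prefixes and the "othertool" tool are made up, and none of this reflects how RedisBagOStuff works internally.

```python
from typing import Dict, Optional

# Stand-in for a cache instance shared by many tools.
shared_cache: Dict[str, str] = {}


def cache_set(tool_prefix: str, key: str, value: str) -> None:
    """Store a value under the tool's prefix (empty prefix = no isolation)."""
    shared_cache[f"{tool_prefix}{key}"] = value


def cache_get(tool_prefix: str, key: str) -> Optional[str]:
    """Look up a value under the tool's prefix."""
    return shared_cache.get(f"{tool_prefix}{key}")


# Without a prefix, two tools using the same key overwrite each other.
cache_set("", "session:42", "data from notwikilambda")
cache_set("", "session:42", "data from some other tool")
collided = cache_get("", "session:42")

# With a per-tool prefix, both values survive side by side.
cache_set("notwikilambda:", "session:42", "data from notwikilambda")
cache_set("othertool:", "session:42", "data from some other tool")
isolated = cache_get("notwikilambda:", "session:42")

print(collided)
print(isolated)
```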

The PluggableAuth extension and the WSOAuth extension are used to allow login via Wikimedia Meta accounts. An identity-only consumer is used on the Wikimedia end, and configured for WSOAuth. Currently, login sessions on Not Wikilambda are rather short-lived for some reason; I originally thought it was due to having no cache configured, but changing the $wgMainCacheType from CACHE_NONE to CACHE_DB doesn’t seem to have helped.

The “toolforge” and “wikitech” interwiki prefixes were defined by running the following SQL in the mysql.php maintenance script:

INSERT INTO interwiki (iw_prefix, iw_url) VALUES ('toolforge', 'https://iw.toolforge.org/$1'), ('wikitech', 'https://wikitech.wikimedia.org/wiki/$1');

This caused three SQL warnings, because iw_api, iw_wikiid and iw_local have no default values; they should probably be explicitly set to '', '' and 0, respectively. Consequently, I added the “phabricator” prefix with the following SQL:

INSERT INTO interwiki (iw_prefix, iw_url, iw_api, iw_wikiid, iw_local) VALUES ('phabricator', 'https://phabricator.wikimedia.org/$1', '', '', 0);

This time, there were no warnings.

The wiki is kept up to date by two shell scripts, update and cron, both in the tool home directory. The update script updates the wiki once: it pulls all Git repositories, runs composer update, then the update.php maintenance script, and finally the runJobs.php maintenance script, just in case job execution on page requests isn’t sufficient to clear the queue. The cron script runs in a cycle of roughly ten minutes: it launches the update script, then runs runJobs.php nine times, always sleeping for a minute after each script, and then repeats. It runs as a Kubernetes continuous job, configured using the k8s/update/deployment.yaml file. The k8s/ directory is also available on GitHub.
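The shape of one cron cycle can be sketched as follows. The real scripts are shell scripts; this Python version only models the ordering and the pauses. The command strings are assumptions about how the scripts might be invoked, and the runner and sleeper are injected so the sketch can be exercised without actually running or waiting for anything.

```python
from typing import Callable, List


def run_cycle(
    run: Callable[[str], None],
    sleep: Callable[[float], None],
    job_runs: int = 9,
    pause_seconds: float = 60.0,
) -> None:
    """Run one cycle: one update, then job_runs job runs, pausing after each."""
    run("./update")  # assumed invocation of the update script
    sleep(pause_seconds)
    for _ in range(job_runs):
        # Assumed path; runJobs.php lives in MediaWiki's maintenance/ directory.
        run("php public_html/w/maintenance/runJobs.php")
        sleep(pause_seconds)


# Dry run: record what would happen instead of executing it.
log: List[str] = []
run_cycle(run=log.append, sleep=lambda s: log.append(f"sleep {s:g}"))
print("\n".join(log))
```

Ten scripts with a one-minute pause after each give at least ten minutes per cycle, plus however long the scripts themselves take, which matches the "roughly ten minutes" above. The real cron script would wrap this in an endless loop.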

Scribunto enables scripting on the wiki (the WikiLambda extension has no Lua interface yet as far as I’m aware, but we can load the raw page content and decode it as JSON), and WikiEditor and CodeEditor make editing modules and other pages easier.

SyntaxHighlight enables syntax highlighting. Since the PHP container image in which we run MediaWiki does not have Python 3 installed, we use pygments-server: a small Flask app wrapping Pygments runs in a separate Kubernetes deployment, and pygmentize is implemented as a wrapper shell script that sends a request to the corresponding Kubernetes service. The Kubernetes objects are set up in k8s/pygments-server/deployment.yaml and k8s/pygments-server/service.yaml, the Flask app is in www/python/src/app.py and the wrapper in www/python/src/pygmentize, with configuration in www/python/src/pygmentize.env.

A function-orchestrator runs as another Kubernetes service+deployment, with the configuration files in k8s/function-orchestrator/ and its source code in www/js/function-orchestrator/. The WikiLambda extension is configured to talk to that service at function-orchestrator.tool-notwikilambda:6254 (see $wgWikiLambdaOrchestratorLocation in public_html/w/LocalSettings.php).

A function-evaluator runs as another Kubernetes service+deployment, with the configuration files in k8s/function-evaluator/ and its source code in www/js/function-evaluator/. (Since this evaluates arbitrary user-provided code, we don’t mount the whole tool directory for this container – then users could steal the database password, $wgSecret, etc. – but only www/js/function-evaluator/.) The WikiLambda extension is configured to tell the orchestrator to talk to that service at http://function-evaluator.tool-notwikilambda:6927/1/v1/evaluate/ (see $wgWikiLambdaEvaluatorLocation in public_html/w/LocalSettings.php).

A Kubernetes CronJob restarts the function-orchestrator and function-evaluator every 12 hours (at 9:30 and 21:30, presumably UTC). Both deployments are configured with a readinessProbe, so that the restart should result in no service interruption: Kubernetes will only send traffic to the new pod, and delete the old pod, once the new pod is ready (that is, it accepts connections and responds to an HTTP request to /_info/name).
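The readiness check can be modelled roughly like this. In reality Kubernetes itself performs the probe; this Python sketch only mirrors the rule that an HTTP probe counts as successful for status codes from 200 up to but not including 400. The fetcher is injected, and the two fake fetchers below are hypothetical stand-ins for a pod that is up and one that is not yet listening.

```python
from typing import Callable


def is_ready(base_url: str, fetch_status: Callable[[str], int]) -> bool:
    """Return True if the service answers /_info/name with a success status."""
    try:
        status = fetch_status(base_url.rstrip("/") + "/_info/name")
    except OSError:
        # Connection refused, timeout, DNS failure: the pod is not ready.
        return False
    return 200 <= status < 400


def fake_up(url: str) -> int:
    """Pretend the pod answered successfully."""
    return 200


def fake_down(url: str) -> int:
    """Pretend nothing is listening on the port yet."""
    raise ConnectionRefusedError("pod not listening yet")


orchestrator = "http://function-orchestrator.tool-notwikilambda:6254"
print(is_ready(orchestrator, fake_up))
print(is_ready(orchestrator, fake_down))
```

Only once a check like this passes for the new pod does traffic move over to it, which is why the twice-daily restart should not be visible to users.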