User:KovachevBot/Bulgarian Audio

General

This is a basic script that traverses my contributions on Wikimedia Commons, picks out the relevant filenames, and adds the audio files I uploaded via LinguaLibre to the corresponding Wiktionary entries. It does this only when the layout of the page is simple enough that there can be no ambiguity, e.g. if there's only a single L3 pronunciation header, or if there's only one etymology and no pronunciation section yet (so one can be created from scratch).
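The "simple enough" check described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the repository: the function names `bulgarian_section` and `classify`, the return labels, and the regex-based header parsing are all assumptions, and the real script may decide differently (for instance, this sketch sends pages with no etymology header at all to manual processing).

```python
import re

# Hypothetical helper: isolate the ==Bulgarian== language section of an entry.
def bulgarian_section(wikitext):
    start = re.search(r"^==[ \t]*Bulgarian[ \t]*==[ \t]*$", wikitext, flags=re.MULTILINE)
    if start is None:
        return None
    rest = wikitext[start.end():]
    # The section ends at the next L2 header (exactly two '=' signs).
    nxt = re.search(r"^==[^=].*?==[ \t]*$", rest, flags=re.MULTILINE)
    return rest[:nxt.start()] if nxt else rest

# Hypothetical classifier: returns "append", "create", "manual" or "skip".
def classify(wikitext):
    section = bulgarian_section(wikitext)
    if section is None:
        return "skip"  # no Bulgarian entry on this page
    # Collect (equals-signs, title) pairs for every header in the section.
    headers = re.findall(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$", section, flags=re.MULTILINE)
    pron_levels = [len(eq) for eq, title in headers if title == "Pronunciation"]
    etymologies = [title for eq, title in headers if title.startswith("Etymology")]
    if pron_levels == [3]:
        return "append"  # a single L3 pronunciation header: add the audio under it
    if not pron_levels and len(etymologies) == 1:
        return "create"  # one etymology, no pronunciation yet: create the section
    return "manual"      # anything else is treated as ambiguous


simple = "==Bulgarian==\n\n===Etymology===\nText.\n\n===Pronunciation===\n* IPA\n\n===Noun===\n"
print(classify(simple))
```

Entries with several numbered etymologies, each carrying its own L4 pronunciation header, fall through to "manual" in this sketch, which matches the homograph case that still has to be handled by hand.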

The purpose is to reduce the effort I need to put in to get these recordings onto entries: until I automated it, the process was entirely manual, and I had made several hundred repetitive edits of this kind by hand. A couple hundred manual edits still remain, but now only for pages whose layout is genuinely ambiguous, so the effort is no longer wasted. For example, words with multiple homographs that are pronounced differently still have to be processed manually. Fortunately, I have a separate script that goes through the failure cases from this task, which makes that part somewhat efficient as well.

The ultimate goal is to record the entirety of Category:Bulgarian lemmas and beyond, and my target is a few hundred new recordings on a good day. Of these, probably 80% or more can be transferred automatically, leaving little manual audio work for me to do.

Code is on GitHub.