Wikipedia
is one of the most-visited web sites on the internet
– in the top 100, according to Alexa.
The total size and traffic of all Wikipedias double
every four to six months. According to a report
released by Hitwise
(requires free registration) in May, the weekly
market share of U.S. visits to Wikipedia has grown
over 600 percent since the beginning of 2004, making
it the second most visited reference web site. In
other words, Wikipedia now is more popular than Microsoft's
Encarta or The New York Times Company's About.com.
The report also describes how Wikipedia is
becoming a high-powered magnet for internet searches,
with nearly 600,000 "living" articles to date. A
ranking of all web sites based on the total volume
of traffic received directly from search engines
placed Wikipedia at 146 in June 2004. By
September 2004, it had jumped to 93 and to 71 by
December. In March of this year, it was the 33rd
most popular site in terms of visits received from
search engines.
All Wikipedia projects have the same purpose:
to create a free encyclopedia with a neutral point
of view in every language on the internet. Originally,
Wikipedia was envisioned as an incubator
for articles that were to be published in Nupedia,
a peer-reviewed encyclopedia. However, this did
not happen because Wikipedia proved to
be so popular, and the quality so good, that there
was no need for the peer review step.
Wikipedia grew, and eventually versions
in other languages were added. When a need arose
for other projects, they were created. For example,
Wiktionary contains lexicological
information and Commons contains
digital imagery and sound.
As you can imagine, it is a wonderful experiment
where issues like globalization, internationalization,
localization and translation are part of what we
deal with every day. Contrary to what the typical
LISA Member has available, we do not have an organizational
structure that decides what to do next. We do not
have policies that determine what content is to
be available in all Wikipedias. We do not
translate content as a rule. Therefore, it must
seem from the outside that what we have is total
anarchy.
Each Wikipedia has its own community of
people that contributes content. And there are several
ways in which articles are conceived. There are
articles where the bulk is written by the first
author; there are articles that start from a small
beginning and evolve over time; and there are translations.
There is also a community of bot operators who link
articles on the same subject.
Editor’s Note: Bot is short for robot. It is
a piece of software that has been created and designed
to complete certain minor but repetitive tasks automatically
and on command; bots are often used to search large amounts
of data for certain patterns and return a list of
results. (Source: http://en.wiktionary.org/wiki/Bot)
These links allow people to compare the information
among the separate Wikipedias, at the same
time that they encourage cross-fertilization. This
is one of the important mechanisms to ensure that
the global aims of the Wikimedia Foundation are
supported and maintained, i.e., making information
freely available that is based on a neutral point
of view.
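The linking work done by the bots can be pictured with a minimal sketch. It assumes that a bot starts from pairs of articles already known to cover the same subject and then completes the links so that every article points to all of its counterparts. The sample titles, the `KNOWN_LINKS` list and the `complete_links` function are hypothetical illustrations, not the code of any actual Wikipedia bot.

```python
from collections import defaultdict

# Hypothetical input: pairs of (language code, title) known to share a subject.
KNOWN_LINKS = [
    (("en", "House"), ("nl", "Huis")),
    (("nl", "Huis"), ("de", "Haus")),
]


def complete_links(pairs):
    """Group articles on the same subject and list the links each one needs."""
    parent = {}

    def find(article):
        # Union-find lookup with path halving.
        parent.setdefault(article, article)
        while parent[article] != article:
            parent[article] = parent[parent[article]]
            article = parent[article]
        return article

    # Merge the groups of every pair of linked articles.
    for first, second in pairs:
        parent[find(first)] = find(second)

    groups = defaultdict(list)
    for article in list(parent):
        groups[find(article)].append(article)

    # Each article gets a link to every other article in its group.
    links = {}
    for members in groups.values():
        for article in members:
            links[article] = sorted(
                f"[[{lang}:{title}]]"
                for lang, title in members
                if (lang, title) != article
            )
    return links


if __name__ == "__main__":
    for article, targets in sorted(complete_links(KNOWN_LINKS).items()):
        print(article, targets)
```

In this sketch the English article ends up linked to the German one even though no direct pair was given, which is the kind of cross-fertilization the links are meant to support.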
NOTE: Under the terms of our license, the copyright
remains with the person who originally created the
content. Who the author is can be found through
the history of the articles. Possible relations
between articles and translations can be found through
interwiki links.
The question of who owns the rights to the
translated version of a Wikimedia article is, to
a large extent, a non-issue. When an article is
published in one of the Wikimedia projects, it is
published either under the GNU Free Documentation License
(GFDL) or into the public domain. Therefore, an article does not have a monetary
value.
Typically, Wikimedia articles are not literal
translations. Instead, they are usually rewritten,
and the link to the original article is maintained
through interwiki links. When a contributor provides
a more literal translation, it is often mentioned
that it was translated from a particular Wikipedia.
This does not mean, however, that only one person
can claim this text. Typically, articles are the
result of the cooperation of many authors.
In summary, nobody really owns any Wikimedia
content. We are, however, proud of our contributors,
and we honor them in our article history and in
our statistics.
As the projects grow, we find that they have different
values and a different view of “the Truth.” These
are the issues where culture comes into play. Personally,
I have never handled a gun, and I am proud of it.
However, many Americans do not appreciate that carrying
a gun is not a God-given right but simply an attitude
based on a particular cultural
view. Due to these differences in outlook, an article
on guns will be different in the Dutch and the English
versions of Wikipedia. Another controversial
subject is religion. However, as I write this article,
I have discovered that the Dutch Wikipedia has
substantially more articles on Roman Catholicism
than does the Italian Wikipedia, contrary
to what one might expect.
The Wikimedia Foundation is run by a “benevolent dictator.”
An organization to coordinate all of the various projects
does exist in the form of the Wikimedia
Foundation (WMF), which is the owner of
the servers that host the Wikimedia projects. The
WMF has a board, headed by the founder of both Wikipedia
and the WMF, Jimmy Wales. Wales is a great believer
in “dolce far niente,” leaving as much of the decision-making
process as possible to the individual communities.
Even in his role as WMF Chairman, he leaves decisions
very much up to the other board members. The result
is that he has the moral authority to do good.
Localization in a Free Content Community: Wiktionary
Wiktionary is the lexicological
sister project of Wikipedia. When lexicological
content was added to Wikipedia, there were
many people who believed that it did not fit well
within the encyclopedic format. This resulted
in a new project: Wiktionary. For a long
time, Wiktionary was an English language
project only; however, its aim was to include all
words of all languages. This has resulted in a wealth
of Chinese content, along with parallel efforts
in many other languages.
All of these Wiktionaries had the same
goal, i.e., all words of all languages – an ambitious
project which soon became overambitious. It would
have meant thousands of projects all aiming at the
same Holy Grail.
Editor’s Note: The Holy Grail is an artifact in
Christian mythology, being either the cup used at
the Last Supper or a cup that caught some of Christ's
blood during the Crucifixion. It is used in this
instance to mean that the goal is unattainable.
(Source: http://en.wiktionary.org/wiki/Holy_grail)
There was a need for a solution, and it was partially
resolved by implementing templates. This way {{-noun-}}
could be understood as noun in English
and Zelfstandig naamwoord in Dutch. This
was an improvement, as it enabled users to easily
copy content from one Wiktionary to another.
However, it was only a partial solution in that
several projects did not adopt the same templates,
thus preventing updates.
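The template idea can be illustrated with a minimal sketch. It assumes a lookup table that maps a template marker to a heading per language. Only the English and Dutch labels for {{-noun-}} come from the text above; the table layout and the `render` function are hypothetical and do not reproduce how the wiki software actually expands templates.

```python
import re

# Hypothetical label table: template marker -> {language code: heading}.
TEMPLATE_LABELS = {
    "-noun-": {"en": "noun", "nl": "Zelfstandig naamwoord"},
}

# Matches markers such as {{-noun-}} and captures the part between the braces.
TEMPLATE_PATTERN = re.compile(r"\{\{(-[a-z]+-)\}\}")


def render(wikitext: str, lang: str) -> str:
    """Replace each known template marker with its heading in `lang`.

    Markers that are unknown, or that have no label for `lang`, are left
    untouched. This mirrors the problem of projects that did not adopt
    the same templates: the marker simply does not resolve.
    """

    def substitute(match: re.Match) -> str:
        labels = TEMPLATE_LABELS.get(match.group(1), {})
        return labels.get(lang, match.group(0))

    return TEMPLATE_PATTERN.sub(substitute, wikitext)


if __name__ == "__main__":
    entry = "{{-noun-}}\nhuis"
    print(render(entry, "en"))
    print(render(entry, "nl"))
```

Because the entry itself stores only the marker, the same text can be copied between Wiktionaries and still display in the local language, as long as both sides know the marker.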
Using this imperfect system of templates has taught
us that 80% of the lexicological content can be
expressed using templates. The next step will be
for us to combine all the language-independent content
in a database. Our challenge will be to translate
the user interface into as many languages as possible.
This is the first hurdle in making the Ultimate
Wiktionary accessible in any language. The
next step is to encourage people with language knowledge
to contribute to the Wiktionary by providing
descriptions and etymological information for various
terms.
The Ultimate Wiktionary will become extremely
relevant, based on the special content that it will
contain. For example, we plan to include the GEMET
thesaurus, the ecological resource of
the European Community (EC). It will also be possible
for users to add content in other languages, making
the original thesaurus even more accessible and
more valuable to more people. We hope to be able
to cooperate with organizations such as the EC in
order to host other glossaries and thesauri. As
everyone is invited to contribute content, we envision
this content being translated into many more languages
and thus resulting in increased trade opportunities
for the EC.
The current Wiktionaries will be converted
to the Ultimate Wiktionary. This means
that people will have access to the Dutch Wiktionary
with many words in Papiamento, the Italian
Wiktionary with many Neapolitan words,
and the Kurdish Wiktionary with words in
many different Kurdish dialects. The goal with the
Ultimate Wiktionary is to overcome the
fractured nature of individual Wiktionaries.
By combining them into one central repository, people
will be able to access a much greater variety of
content, thus enabling the Ultimate Wiktionary
to be greater than the sum of its separate Wiktionary
parts.
One novelty (for us) is that we want to allow for
the automatic upload and download of data. Obviously,
in this day and age, XML is the method to explore.
As TBX (TermBase eXchange)
is one of the important XML standards related
to terminology, we would really like to see it implemented
as one of the methods to expand our content.
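To give an idea of what an automatic upload could involve, here is a minimal sketch that reads terms from a TBX-style file. It assumes a simplified layout of `termEntry`, `langSet`, `tig` and `term` elements; the sample data and the `read_terms` function are invented for illustration, and a real TBX file carries considerably more structure than is handled here.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented sample in a simplified TBX-style layout (an assumption).
SAMPLE_TBX = """<martif type="TBX">
  <text><body>
    <termEntry id="e1">
      <langSet xml:lang="en"><tig><term>house</term></tig></langSet>
      <langSet xml:lang="nl"><tig><term>huis</term></tig></langSet>
    </termEntry>
  </body></text>
</martif>"""


def read_terms(tbx_text: str) -> dict:
    """Return {entry id: {language code: [terms]}} for a TBX-style document."""
    root = ET.fromstring(tbx_text)
    entries = {}
    for entry in root.iter("termEntry"):
        by_lang = {}
        for lang_set in entry.iter("langSet"):
            # "und" (undetermined) is used when no language is declared.
            lang = lang_set.get(XML_LANG, "und")
            terms = [term.text for term in lang_set.iter("term") if term.text]
            by_lang.setdefault(lang, []).extend(terms)
        entries[entry.get("id", "")] = by_lang
    return entries


if __name__ == "__main__":
    print(read_terms(SAMPLE_TBX))
```

Each concept groups its terms by language, so a contributor could add a `langSet` in another language without touching the existing ones.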
Localization in a Free Content Community: Commons
Commons
is a recent Wikimedia project started in October
2004. It is a central repository for digital imagery
and sound, with more than 87,000 free images and
3,000 sound files for all Wikimedia projects. It
is obvious that as the content grows, it will become
increasingly important to have a system that enables
people to quickly locate the images they need. Currently,
most of the pictures are categorized in English,
or can be located in English content that contains
pictures. To make Commons accessible, the Wikimedia
Foundation must localize the content.
How Work Gets Done in an Organization Like the Wikimedia Foundation
For some, this is the most interesting and surprising
aspect of the WMF – how the work actually gets done.
Even though most work is done by volunteers who
do what they want when they feel
like it, the quality and quantity of what is accomplished
are astounding.
And it is important to note that content is not
the only area in which we depend on volunteers –
they also do the programming and administration.
In the true spirit of free software, we have started
to pay for some work, though. When commercial organizations
want to host our content, they must now pay for
the RSS feed. This contribution supports one programmer.
A grant in the sum of EUR 5,000 will pay for a programmer
to develop the Ultimate Wiktionary. Obviously,
we are extremely thrifty when it comes to money.