[Anubad] [OT] How do people's names differ around the world, and what are the implications of those differences on the design of forms, ontologies, etc. for the Web?

sankarshan.mukhopadhyay at gmail.com sankarshan.mukhopadhyay at gmail.com
Tue Jul 26 03:00:33 PDT 2011


<http://www.w3.org/International/questions/qa-personal-names>

People who create web forms, databases, or ontologies are often unaware
how different people’s names can be in other countries. They build their
forms or databases in a way that assumes too much on the part of foreign
users. This article will first introduce you to some of the different
styles used for personal names, and then some of the possible
implications for handling those on the Web.

This article doesn't provide all the answers - indeed in some cases it
may not be clear what the best answer is. It mostly attempts to
sensitize you to some of the key issues by way of an introduction.

There are a couple of key scenarios to consider.

1. You are designing a form in a single language (let’s assume English)
   that people from around the world will be filling in.

2. You are designing a form in one language but the form will be adapted
   to suit the cultural differences of a given locale when the site is
   translated.

In reality, you will probably not be able to localise for every
different culture, so even if you rely on the second approach, some
people will still use a form that is not intended specifically for their
culture.

To get started, let’s look at some examples of how people’s names can be
different around the world.

In the name Björk Guðmundsdóttir, Björk is the given name. The second
part of the name indicates the father’s (or sometimes the mother’s)
name, followed by -sson for a male and -sdóttir for a female, and is
more of a description than a family name in the Western sense. Björk’s
father, Guðmundur, was the son of Gunnar, so is known as Guðmundur
Gunnarsson.

Icelanders prefer to be called by their given name (Björk), or by their
full name (Björk Guðmundsdóttir). Björk wouldn’t normally expect to be
called Ms. Guðmundsdóttir. Telephone directories in Iceland are sorted
by given name.

Other cultures where a person has one given name followed by a
patronymic include parts of Southern India, Malaysia and Indonesia.

In the Malay name Isa bin Osman the word 'bin' means 'son of' ('binti'
is used for women). If you refer to this person you might say Mr. Isa,
or if you know him personally, Encik Isa (Encik is a Malay word
rather like Mr.).

In the name 毛泽东 (mao ze dong) the family name is Mao, ie. the first
name when reading (left to right). The given name is Dong. The middle
character, Ze, is a generational name, and is common to all his siblings
(such as his brothers and sister, 毛泽民 (mao ze min), 毛泽覃 (mao ze
tan), and 毛澤紅 (mao ze hong)).

Among people who are not on familiar terms, Mao may be referred to as
毛泽东先生 (mao ze dong xiān shēng) or 毛先生 (mao xiān shēng) (xiān
shēng being the equivalent of Mr.). Although not everyone has a generational
name these days, especially in Mainland China, those that do have one
expect it to be used together with their given name. Thus, if you are on
familiar terms with someone called 毛泽东, you would normally refer to
them using 泽东 (ze dong), not just 东 (dong).

Note also that the names are not separated by spaces.

The order family name followed by given name(s) is common in other
countries, such as Japan, Korea and Hungary.

Chinese people who deal with Westerners will often adopt an additional
given name that is easier for Westerners to use. For example, Yao Ming
(family name Yao, given name Ming) may write his name for foreigners as
Fred Yao Ming or Fred Ming Yao.

Spanish-speaking people will commonly have two family names. For
example, Maria-Jose Carreño Quiñones may be the daughter of Antonio
Carreño Rodríguez and María Quiñones Marqués.

You would refer to her as Señorita Carreño, not Señorita Quiñones.

We already saw that the patronymic in Iceland ends in -son or -dóttir,
depending on whether the child is male or female. Russians use
patronymics as their middle name but also use family names, in the order
given-patronymic-family. The endings of the patronymic and family names
will indicate whether the person in question is male or female. For
example, the wife of Борис Никола́евич Ельцин (Boris Nikolayevich
Yeltsin) is Наина Иосифовна Ельцина (Naina Iosifovna Yeltsina) – note
how the husband’s names end in consonants, while the wife’s names (even
the patronymic from her father) end in a.

Americans often write their name with a middle initial, for example,
John Q. Public. Often forms designed in the USA assume that this is
common practice, whereas even in the UK, where people may indeed have
(one or more) middle names, this is often seen as a very American
approach. People in Korea, who typically do have 3 names but who don't
usually initialise them, may be confused about how to deal with such
forms. Bear in mind, also, that many people who do use an initial in
their name may use it at the beginning.

It would be wrong to assume that members of the same family share the
same family name. There is a growing trend in the West for wives to keep
their own name after marriage, but there are other cultures, such as
China, where this is the normal approach. In some countries the wife may
or may not take the husband's name. If the Malay girl Zaiton married
Isa, mentioned above, she may remain Mrs. Zaiton, or she may choose to
become Zaiton Isa, in which case you might refer to her as Mrs. Isa.

Spanish names approach this slightly differently. In 1996 Manuel A.
Pérez-Quiñones described the names in his family. As mentioned above,
his family names, known as apellidos, became Pérez Quiñones because his
father's apellidos were Pérez Rodríguez and his mother's apellidos were
Quiñones Alamo. In time, he courted a girl with the apellidos Padilla
Falto. When they married, her apellidos became Padilla de Pérez. Their
children were called Pérez Padilla, and so on. The point here is that
only the children in the family have the same apellidos.

You should also not simply assume that name adoption goes from husband
to wife. Sometimes men take their wife's name on marriage. It may be
better, in these cases, for a form to say 'Previous name' than 'Maiden
name' or 'née'.

Many cultures mix and match these differences from Western personal
names, and add their own novelties.

For example, Velikkakathu Sankaran Achuthanandan is a Kerala name from
Southern India, usually written V. S. Achuthanandan, which follows the
order familyName-fathersName-givenName.

In many parts of the world, parts of names are derived from titles,
locations, genealogical information, caste, religious references, and so
on. Here are a few examples:

the Tamil name Kogaddu Birappa Timappa Nair follows the order
villageName-fathersName-givenName-lastName.

the Rajasthani name Aditya Pratap Singh Chauhan is composed of
givenName-fathersName-surname-caste.

in another part of India the name Madurai Mani Iyer represents
townName-givenName-casteName.

the Arabic Abu Karim Muhammad al-Jamil ibn Nidal ibn Abdulaziz
al-Filistini translates as "Father-of-Karim, Muhammad (given name), The
beautiful, Son of Nidal, Son of Abdulaziz, the Palestinian". Karim is
Muhammad's first-born son. (For more details about this rich naming
tradition, see Wikipedia.)

In Thailand people have a nickname that is usually not related to their
actual name, and will generally use this name to address each other in
non-formal situations. (They will also typically introduce themselves to
Westerners with this name, since it is usually only one or two syllables
and therefore easier to pronounce.) Former prime minister Thaksin
Shinawatra has the nickname Maew (แม้ว). Often they will have different
nicknames for family and friends.

In Vietnam, names such as Nguyễn Tấn Dũng follow the order
family-middle-given name. Although this seems similar to the Chinese
example above, even in a formal situation this Prime Minister of Vietnam
is referred to using his given name, ie. Mr. Dung, not Mr. Nguyen.

The information above uses only simple cases to describe a number of
significant divergences in the way people construct names. The reality,
even within a single culture, is typically even more complicated.
Wikipedia sports a large number of fascinating articles about how
people’s names look in various cultures around the world. I strongly
recommend perusing them.


As mentioned above, one possible approach is to localize forms for a
particular culture. In theory this should allow you to tailor your forms
exactly to the needs of the audience. Unfortunately, there may still be
a number of possible disadvantages to this approach:

If you need to centralise data from several locales in a single
database, using localized forms may only postpone the difficulties of
synthesizing the information across cultures until the time you need to
store the data.

Even within a single country people will typically have different ways
of forming personal names: there may be foreigners living in
the country, there may be different cultural elements within the country
(eg. Singaporeans have names of Chinese, Malay and South Indian origin),
or there may just be more than one way of using names. Therefore your
forms will often need to allow for some flexibility.

In what follows we propose some general guidelines that may help.
Unfortunately, this is a complex topic and the suggestions here are for
the very general case, and don't address all the issues.

If designing a form or database that will accept names from people with
a variety of backgrounds, you should ask yourself whether you really
need to have separate fields for given name and family name.

This will depend on what you need to do with the data, but obviously it
will be simpler, where it is possible, to just use the full name as the
user provides it.

Your profile

Full name

Bear in mind that names in some cultures can be quite a lot longer than
your own. Make input fields long enough to enter long names, and ensure
that if the name is displayed on a web page later there is enough space
for it. Also avoid limiting the field size for names in your database.

If you do still feel you need to ask for constituent parts of a name
separately, try to avoid using the labels ‘first name’ and ‘last name’
in non-localized forms, since these can be confusing for people who
normally write their family name followed by given names.

Your profile

Family name
Other/given names

For some cultures this is still problematic (for example Icelanders, who
don't actually have family names), but, short of very localized
customization, this is probably the best we can do for a generic form.

In some cases you want to identify parts of a name so that you can sort
a list of names alphabetically, contact them, etc. Consider whether it
would make sense to have one or more extra fields, in addition to the
full name field, where you ask the user to enter the part(s) of their
name that you need to use for a specific purpose.

Sometimes you may opt for separate fields because you want to be able to
use part of the name to address the person directly, or refer to them,
as when Google+ refers to "Richard's contacts". Or perhaps it's because
you want to send them emails with their name at the top.
Note that you may not only have problems due to name syntax here, but
you also have to account for varying expectations around the world with
regards to formality. It may be better to ask separately, when setting
up a profile for example, how that person would like you to address them.


Your profile

Full name

What should we call you? (for example, when we send you mail?)

This extra field would also be useful for finding the appropriate name
from a long list, and for handling Thai nicknames.
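As a minimal sketch of the extra field described above, assuming a profile that stores the full name untouched plus an optional answer to "What should we call you?". The names `Profile`, `call_me` and `greeting`, and the English greeting text, are illustrative assumptions and not part of the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    # The name exactly as the user typed it; never split or re-ordered.
    full_name: str
    # Answer to "What should we call you?" (hypothetical optional field).
    call_me: Optional[str] = None


def greeting(profile: Profile) -> str:
    """Use the name the user asked for, else fall back to the full name."""
    name = profile.call_me or profile.full_name
    return f"Hello, {name}"


print(greeting(Profile("Thaksin Shinawatra", call_me="Maew")))
print(greeting(Profile("Björk Guðmundsdóttir")))
```

The point of the design is that the software never has to guess which part of the full name is suitable for addressing someone.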

By the way, for sorting Japanese names you will need an additional field
for people to type how their name is pronounced, since you can't always
tell how to pronounce it from the ideographic characters, and Japanese
names are sorted by pronunciation.
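The following sketch illustrates the idea of a separate pronunciation field. The record layout and the sample entries 田中 and 青木 are assumptions made for illustration, as is the hiragana reading given for 小林康宏. Comparing hiragana by code point only approximates proper Japanese ordering; a real system would probably use a collation library.

```python
from dataclasses import dataclass


@dataclass
class JapaneseName:
    written: str  # name as written, usually in kanji
    reading: str  # pronunciation typed by the user, here in hiragana


# Hypothetical sample entries; readings are supplied by the users.
people = [
    JapaneseName("田中", "たなか"),
    JapaneseName("小林康宏", "こばやし やすひろ"),
    JapaneseName("青木", "あおき"),
]

# Sorting on the reading gives an order users can predict.
by_reading = sorted(people, key=lambda p: p.reading)
# Sorting on the kanji gives an order based only on character codes.
by_written = sorted(people, key=lambda p: p.written)

print([p.written for p in by_reading])
print([p.written for p in by_written])
```

The two lists come out in different orders, which is why the written form alone is not enough for sorting.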

Also, if you have separate fields for parts of a person's name, ensure
that you label clearly which parts you want where. For example, don't
assume that the order they will provide names in will be given followed
by family.

Be careful, also, about assumptions built into algorithms that pull out
the parts of a name automatically. For example, the vCard and hCard
approach of implied “n” optimization could have difficulties with, say,
Chinese names. You should be as clear as possible about telling people
how to specify their name so that you capture the data you think you need.
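To make the risk concrete, here is a deliberately naive splitter. It is a simplified stand-in written for this illustration, not the hCard algorithm as specified, and the function name `guess_parts` is hypothetical. It assumes that a name of exactly two words is "given name, then family name".

```python
from typing import Optional, Tuple


def guess_parts(full_name: str) -> Optional[Tuple[str, str]]:
    """Naive guess: exactly two words are read as (given, family)."""
    words = full_name.split()
    if len(words) != 2:
        return None  # the heuristic has nothing to say
    return words[0], words[1]


# Family name is Yao, but the guess labels it as the given name.
print(guess_parts("Yao Ming"))
# No spaces at all, so no guess is possible.
print(guess_parts("毛泽东"))
# The patronymic is wrongly labelled as a family name.
print(guess_parts("Björk Guðmundsdóttir"))
```

Each of the three sample names from earlier in the article defeats the heuristic in a different way.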

Don't assume that a single letter name is an initial. People do have
names that are one letter long. These people can have problems if the
form validation refuses to accept their name and demands that they
supply their name in full. If you want to encourage people not to use
initials, perhaps you should make that a warning message, rather than
block the form submission.

Don't forget to allow people to use punctuation such as hyphens,
apostrophes, etc. in names. Don't require names to be entered all in
upper case - this can be difficult on a mobile device. Allow the user to
enter a name with spaces, eg. to support prefixes and suffixes such as
de in French, von in German, and Jnr/Jr in American names.
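A possible shape for such lenient validation is sketched below. The function name `check_name` and the message wording are assumptions; the only input it rejects outright is an empty one, and a possible initial produces a warning rather than blocking submission.

```python
from typing import List, Tuple


def check_name(name: str) -> Tuple[bool, List[str]]:
    """Return (accepted, messages). Only an empty name is rejected."""
    cleaned = name.strip()
    if not cleaned:
        return False, ["Please enter a name."]

    warnings = []
    for part in cleaned.split():
        # A lone letter, with or without a full stop, might be an initial,
        # but it might also be someone's real name, so only warn.
        if len(part.rstrip(".")) == 1 and part[0].isalpha():
            warnings.append(f"'{part}' looks like an initial; is that intended?")
    # Hyphens, apostrophes, spaces and mixed case all pass untouched.
    return True, warnings


print(check_name("John Q. Public"))
print(check_name("Maria-Jose Carreño Quiñones"))
```

Because nothing is filtered by character class, punctuation, spaces and lower case are accepted without any special handling.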

Don't assume that members of the same family will share the same family
name.

As mentioned earlier, because it is not only women who change their
family names, it may be better for a form to ask for 'Previous name'
rather than 'Maiden name' or 'née' .

If you are designing forms that will be localized on a per culture
basis, don’t forget that atomized name parts may still need to be stored
in a central database, which therefore needs to be able to represent all
the various complexities that you dealt with by relegating the form
design to the localization effort.

The first thing that English speakers must remember about other people’s
names is that a large majority of them don’t use the Latin alphabet, and
of those that do, a majority use accents and characters that don’t occur
in English. It seems obvious, once it is said, but it has some important
consequences for designers that are often overlooked.

If you are designing an English form you need to decide whether you are
expecting people to enter names in their own script (eg. 小林康宏) or in
a Latin-only transcription (such as Yasuhiro Kobayashi), or both.

Remember that even names in English may involve non-ASCII characters
(eg. Zoë).

On the other hand, there are some situations, such as a log-in name on
an ASCII-only system, where you can't permit non-ASCII characters.

What people will type into the form will often depend on whether the
form and its page is in their language or not. If the page is in their
language, don’t be surprised to get back non-Latin or accented Latin
characters.

In terms of letters, ASCII-only means using the basic letters of the
English alphabet, ie. ABCDEFGHIJKLMNOPQRSTUVWXYZ (upper- and lowercase).
If you hope to get Latin- or ASCII-only, you need to tell the user.

Don't forget to tell the people who are translating your page to explain
this to users.

Your profile

Full name   (Use only letters A to Z with no accents.)
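If a field really must be ASCII-only, a check along the following lines could back up the hint given to the user. This is a sketch: the function name is hypothetical, and allowing spaces, hyphens, apostrophes and full stops alongside the letters is an assumption in keeping with the earlier advice on punctuation.

```python
import string

# A-Z and a-z, plus a few separators (assumed to be acceptable).
ALLOWED = set(string.ascii_letters + " -'.")


def is_ascii_name(name: str) -> bool:
    """True if the name is non-empty and uses only the allowed characters."""
    return bool(name) and all(ch in ALLOWED for ch in name)


print(is_ascii_name("Yasuhiro Kobayashi"))
print(is_ascii_name("Zo\u00eb"))  # Zoë, with e-diaeresis
print(is_ascii_name("小林康宏"))
```

A failed check is a good moment to repeat the instruction to the user, rather than silently stripping accents from their name.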

The decision about which is most appropriate will depend to some extent
on what you are collecting people’s names for, and how you intend to use
them.

Are you collecting the person’s name just to have an identifier in your
system? If so, it may not matter whether the name is stored in
ASCII-only or native script.

Or do you plan to call them by name on a welcome page or in
correspondence? If you will correspond using their name on pages written
in their language, it would seem sensible to have the name in the native
script.

Is it important for people in your organization who handle queries to be
able to recognise and use the person’s name? If so, you may want to ask
for a Latin transcription.

Will their name be displayed or searchable (for example Flickr
optionally shows people’s names as well as their user name on their
profile page)? Or will you want to send them correspondence in their own
language, but track them in your back-office in a language such as
English? If so, you may want to store the name in both Latin and native
scripts, in which case you probably need to ask the user to submit their
name in both native script and Latin-only form, using separate fields.

Your profile

Name (in your alphabet)

Name (Latin alphabet)

If you do accept non-ASCII names, you should use a Unicode character
encoding (eg. UTF-8) in your pages, your back end databases and in all
the software code in between. This will significantly simplify your life.
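A small sketch of what this means in practice: UTF-8 can represent every name in this article without loss, whereas ASCII cannot, and the number of bytes stored can exceed the number of characters typed. The helper name `utf8_size` is an assumption for illustration.

```python
from typing import Tuple


def utf8_size(name: str) -> Tuple[int, int]:
    """Return (number of code points, number of UTF-8 bytes) for a name."""
    encoded = name.encode("utf-8")
    # UTF-8 round-trips any Unicode string without loss.
    assert encoded.decode("utf-8") == name
    return len(name), len(encoded)


print(utf8_size("Zo\u00eb"))  # \u00eb is the single character ë
print(utf8_size("小林康宏"))

try:
    "小林康宏".encode("ascii")
except UnicodeEncodeError:
    print("not representable in ASCII")
```

The difference between character count and byte count is worth keeping in mind if a database column is sized in bytes.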

Lists of names are not always sorted by family name around the world.
For example, Thai and Icelandic people expect lists to be sorted by
given name instead.

Sort orders can also differ between parts of the Spanish-speaking
world. For instance, Maria-Jose Carreño Quiñones
from Peru would expect to find her name in a list by looking up Carreño
Quiñones. Maria-Jose Carreño Quiñones from Mexico, however, would expect
her name to be sorted by Quiñones.
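One way to cope with these differences, sketched here under the assumption that each person supplies the part of their name they expect to be listed under, is to sort on that stored value instead of deriving it. The `Entry` record and its `sort_as` field are hypothetical, and comparing case-folded strings is only a rough substitute for locale-aware collation.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    full_name: str
    sort_as: str  # the part the person expects to be listed under


entries = [
    Entry("Maria-Jose Carreño Quiñones", sort_as="Quiñones"),          # Mexico
    Entry("Björk Guðmundsdóttir", sort_as="Björk"),                    # Iceland
    Entry("Maria-Jose Carreño Quiñones", sort_as="Carreño Quiñones"),  # Peru
]

# Sort on what each person asked for, not on a guessed "last name".
ordered = sorted(entries, key=lambda e: e.sort_as.casefold())
for e in ordered:
    print(f"{e.sort_as}: {e.full_name}")
```

The same full name ends up in two different places in the list, which matches what each of the two people would expect.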

Different levels of formality apply in different cultures. When
addressing someone you need to take this into account. Whereas given
names are becoming a popular form of address in Western and technology
circles, it is by no means universally appropriate. Contacting someone
for the first time in the UK using their given name can sometimes imply
that you have previously met them.

On the other hand, addressing someone using a title and given name (eg.
"Mr. Richard") or just by their family name (eg. "Ishida!") is
acceptable in some parts of the world, but not in others (such as the UK).

In Germany, titles are important, and you may need to refer to someone
as not just Mr. Schmidt, but Herr Professor Doktor Schmidt.

In a culture such as that in Japan, it is normal to add an honorific or
job title to the name of someone you contact. For example, it would be
expected to refer to someone as Tanaka-san or Tanaka-sama (depending on
your relationship to them). A departmental manager named Tanaka would
expect to be referred to as Tanaka-bucho (Department-head Tanaka) by the
people who report to him. Although you can attach -san to given names,
it would be very unusual to do so with people in the work environment.

-- 
sankarshan mukhopadhyay
<http://sankarshan.randomink.org/blog/>





