Title case a string containing one or more last names while handling names with apostrophes

I want to standardize a user-supplied string. I'd like the first letter to be capitalized for the name and if they have entered two last names, then capitalize the first and second names. For example, if someone enters:

marriedname maidenname

It would convert this to Marriedname Maidenname, and so on if there are more than two names.

The other scenario is when someone has an apostrophe in their name. If someone enters:

o'connell

This would need to convert to O'Connell.

I was using:

ucfirst(strtolower($last_name));

However, as you can tell, that wouldn't work for all the scenarios.

There are 12 best solutions below

2
On BEST ANSWER

This will capitalize the first letter of every word and any letter immediately after an apostrophe, and it will make all other letters lowercase. It should work for you:

str_replace('\' ', '\'', ucwords(str_replace('\'', '\' ', strtolower($last_name))));
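
As a minimal sketch, the one-liner can be wrapped in a function and checked against the inputs from the question. The function name titleCaseLastName is hypothetical and not part of the original answer. Note that strtolower() and ucwords() are byte-based, so this assumes ASCII names.

```php
<?php
// Hypothetical wrapper around the one-liner above (name chosen for illustration).
function titleCaseLastName(string $lastName): string
{
    // 1. Lowercase everything.
    // 2. Put a space after each apostrophe so ucwords() sees a new word.
    $spaced = str_replace("'", "' ", strtolower($lastName));
    // 3. Capitalize each word, then remove the temporary spaces again.
    return str_replace("' ", "'", ucwords($spaced));
}

echo titleCaseLastName("o'connell"), "\n";              // O'Connell
echo titleCaseLastName("marriedname maidenname"), "\n"; // Marriedname Maidenname
```

One side effect of this design: an input that already contains an apostrophe followed by a space loses that space.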
1
On

This works well for most English names.
It does not handle Roman numeral suffixes.
It does not work for multibyte input such as sÃO JoÃO dos SaNTOS III, because strtolower() is not multibyte-aware.

You have names with spaces. big john
You have names with apostrophes. O'dell
You have names with hyphens. Smith-jones
You have misplaced caps. sMith-joNes
You have names in all caps. JOHN SMITH
You have every combination.

Example: big JohN o'dell-sMIth

One simple line of code to handle them all.

$name = ucwords(strtolower($name), " -'");

Big John O'Dell-Smith
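
For the multibyte case mentioned above, one possible variant is sketched here. It is not the answer's method: it swaps in mb_strtolower() plus a Unicode-aware regex, assumes the mbstring extension and UTF-8 input, and uses a hypothetical function name, mbTitleCaseName. Roman numeral suffixes are still not handled.

```php
<?php
// Hypothetical helper (not from the answer above): multibyte-aware title casing.
function mbTitleCaseName(string $name): string
{
    // Lowercase with a multibyte-aware function so letters like "Ã" are handled.
    $lower = mb_strtolower($name, 'UTF-8');

    // Uppercase any letter at the start of the string or right after
    // whitespace, a hyphen, or an apostrophe.
    return preg_replace_callback(
        '/(^|[\s\'-])(\p{L})/u',
        function (array $m): string {
            return $m[1] . mb_strtoupper($m[2], 'UTF-8');
        },
        $lower
    );
}

echo mbTitleCaseName("big JohN o'dell-sMIth"), "\n";   // Big John O'Dell-Smith
echo mbTitleCaseName("sÃO JoÃO dos SaNTOS III"), "\n"; // São João Dos Santos Iii
```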

0
On

None of these are UTF-8 friendly, so here's one that works flawlessly (so far):

function titleCase($string, $delimiters = array(" ", "-", ".", "'", "O'", "Mc"), $exceptions = array("and", "to", "of", "das", "dos", "I", "II", "III", "IV", "V", "VI"))
{
    /*
     * Exceptions in lower case are words you don't want converted
     * Exceptions all in upper case are any words you don't want converted to title case
     *   but should be converted to upper case, e.g.:
     *   king henry viii or king henry Viii should be King Henry VIII
     */
    $string = mb_convert_case($string, MB_CASE_TITLE, "UTF-8");
    foreach ($delimiters as $dlnr => $delimiter) {
        $words = explode($delimiter, $string);
        $newwords = array();
        foreach ($words as $wordnr => $word) {
            if (in_array(mb_strtoupper($word, "UTF-8"), $exceptions)) {
                // check exceptions list for any words that should be in upper case
                $word = mb_strtoupper($word, "UTF-8");
            } elseif (in_array(mb_strtolower($word, "UTF-8"), $exceptions)) {
                // check exceptions list for any words that should be in lower case
                $word = mb_strtolower($word, "UTF-8");
            } elseif (!in_array($word, $exceptions)) {
                // capitalize the first character (ucfirst() is byte-based, so only an ASCII first letter changes)
                $word = ucfirst($word);
            }
            array_push($newwords, $word);
        }
        $string = join($delimiter, $newwords);
    } // foreach
    return $string;
}

Usage:

$s = 'SÃO JOÃO DOS SANTOS';
$v = titleCase($s); // 'São João dos Santos' 
0
On

If you're using WordPress then use:

function archive_title() {
    // single_tag_title( '', false ) returns the tag title instead of echoing it
    $title = '<h1>' . ucwords( single_tag_title( '', false ) ) . '</h1>';
    return $title;
}
2
On

You can try this for words:

<?php echo ucwords(strtolower('Dhaka, JAMALPUR, sarishabari')) ?>

result is: Dhaka, Jamalpur, Sarishabari

0
On

Here's my highly over-engineered, but pretty all-encompassing solution to capitalisation of Latin names in PHP. It will solve all your capitalisation problems. All of them.

/**
 * Over-engineered solution to most capitalisation issues.
 * 
 * @author https://stackoverflow.com/users/429071/dearsina
 * @version 1.0
 */ 
class str {
    /**
     * Words or abbreviations that should always be all uppercase
     */
    const ALL_UPPERCASE = [
        "UK",
        "VAT",
    ];

    /**
     * Words or abbreviations that should always be all lowercase
     */
    const ALL_LOWERCASE = [
        "and",
        "as",
        "by",
        "in",
        "of",
        "or",
        "to",
    ];

    /**
     * Honorifics that only contain consonants.
     *
     */
    const CONSONANT_ONLY_HONORIFICS = [
        # English
        "Mr",
        "Mrs",
        "Ms",
        "Dr",
        "Br",
        "Sr",
        "Fr",
        "Pr",
        "St",

        # Afrikaans
        "Mnr",
    ];

    /**
     * Surname prefixes that should be lowercase,
     * unless not following another word (firstname).
     */
    const SURNAME_PREFIXES = [
        "de la",
        "de las",
        "van de",
        "van der",
        "vit de",
        "von",
        "van",
        "del",
        "der",
    ];

    /**
     * Capitalises every (appropriate) word in a given string.
     *
     * @param string|null $string
     *
     * @return string|null
     */
    public static function capitalise(?string $string): ?string
    {
        if(!$string){
            return $string;
        }

        # Strip away multi-spaces
        $string = preg_replace("/\s{2,}/", " ", $string);

        # Ensure there is always a space after a comma
        $string = preg_replace("/,([^\s])/", ", $1", $string);

        # A word is anything separated by spaces, dashes, or full stops
        $string = preg_replace_callback("/([^\s\-\.]+)/", function($matches){
            # Make the word lowercase
            $word = mb_strtolower($matches[1]);

            # If the word needs to be all lowercase
            if(in_array($word, self::ALL_LOWERCASE)){
                return strtolower($word);
            }

            # If the word needs to be all uppercase
            if(in_array(mb_strtoupper($word), self::ALL_UPPERCASE)){
                return strtoupper($word);
            }

            # Create a version without diacritics
            $transliterator = \Transliterator::createFromRules(':: Any-Latin; :: Latin-ASCII; :: NFD; :: [:Nonspacing Mark:] Remove; :: Lower(); :: NFC;', \Transliterator::FORWARD);
            $ascii_word = $transliterator->transliterate($word);


            # If the word contains non-alpha characters (numbers, &, etc), with exceptions (comma, '), assume it's an abbreviation
            if(preg_match("/[^a-z,']/i", $ascii_word)){
                return strtoupper($word);
            }

            # If the word doesn't contain any vowels, assume it's an abbreviation
            if(!preg_match("/[aeiouy]/i", $ascii_word)){
                # Unless the word is an honorific
                if(!in_array(ucfirst($word), self::CONSONANT_ONLY_HONORIFICS)){
                    return strtoupper($word);
                }
            }

            # If the word contains two of the same vowel and is 3 characters or fewer, assume it's an abbreviation
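            # (single quotes keep \1 as a regex backreference; in double quotes PHP reads it as an octal escape)
            if(strlen($word) <= 3 && preg_match('/([aeiouy])\1/', $word)){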
                return strtoupper($word);
            }

            # Ensure O'Connor, L'Oreal, etc, are double capitalised, with exceptions (d')
            if(preg_match("/\b([a-z]')(\w+)\b/i", $word, $match)){
                # Some prefixes (like d') are not capitalised
                if(in_array($match[1], ["d'"])){
                    return $match[1] . ucfirst($match[2]);
                }

                # Otherwise, everything is capitalised
                return strtoupper($match[1]) . ucfirst($match[2]);
            }

            # Otherwise, return the word with the first letter (only) capitalised
            return ucfirst($word);
            //The most common outcome
        }, $string);

        # Cater for the Mc prefix
        $pattern = "/(Mc)([b-df-hj-np-tv-z])/";
        //Mc followed by a consonant
        $string = preg_replace_callback($pattern, function($matches){
            return "Mc" . ucfirst($matches[2]);
        }, $string);

        # Cater for Roman numerals (need to be in all caps)
        $pattern = "/\b((?<![MDCLXVI])(?=[MDCLXVI])M{0,3}(?:C[MD]|D?C{0,3})(?:X[CL]|L?X{0,3})(?:I[XV]|V?I{0,3}))\b/i";
        $string = preg_replace_callback($pattern, function($matches){
            return strtoupper($matches[1]);
        }, $string);

        # Cater for surname prefixes (must be after the Roman numerals)
        $pattern = "/\b (".implode("|", self::SURNAME_PREFIXES).") \b/i";
        //A surname prefix, bookended by words
        $string = preg_replace_callback($pattern, function($matches){
            return strtolower(" {$matches[1]} ");
        }, $string);

        # Cater for ordinal numbers
        $pattern = "/\b(\d+(?:st|nd|rd|th))\b/i";
        //A number suffixed with an ordinal
        $string = preg_replace_callback($pattern, function($matches){
            return strtolower($matches[1]);
        }, $string);

        # And we're done
        return $string;
    }
}

Have a play.

1
On

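You can use preg_replace with the e flag (execute a PHP function). Note that the e modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so this only runs on older versions: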

function processReplacement($one, $two)
{
  return $one . strtoupper($two);
}

$name = "bob o'conner";
$name = preg_replace("/(^|[^a-zA-Z])([a-z])/e","processReplacement('$1', '$2')", $name);

var_dump($name); // output "Bob O'Conner"

Perhaps the regex pattern could be improved, but what I've done is:

  • $1 is either the beginning of line or any non-alphabetic character.
  • $2 is any lowercase alphabetic character

We then replace both of those with the result of the simple processReplacement() function.

If you've got PHP 5.3 it's probably worth making processReplacement() an anonymous function.
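
A minimal sketch of that anonymous-function variant, using preg_replace_callback() with the same pattern so it also runs on current PHP versions. The wrapper name capitalizeNameParts is hypothetical and only there for illustration.

```php
<?php
// Hypothetical wrapper name; same regex as above, but with a callback
// instead of the removed /e modifier.
function capitalizeNameParts(string $name): string
{
    return preg_replace_callback(
        '/(^|[^a-zA-Z])([a-z])/',
        function (array $m): string {
            // $m[1]: start of string or a non-letter; $m[2]: the lowercase letter after it
            return $m[1] . strtoupper($m[2]);
        },
        $name
    );
}

echo capitalizeNameParts("bob o'conner"), "\n"; // Bob O'Conner
```

Like the original, this only uppercases ASCII letters and does not lowercase anything, so input such as "BOB" is left as-is.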

0
On

First convert to title case, then find the first apostrophe and uppercase the next character. You will need to add more checks, for example to ensure that there is a character after the apostrophe, and this code only handles one apostrophe, so it will not fully fix a string such as "Mary O'Callahan O'connell".

$str = mb_convert_case($str, MB_CASE_TITLE, "UTF-8");
$pos = strpos($str, "'");
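// strpos() returns false when nothing is found; also check that a character follows the apostrophe
if ($pos !== false && isset($str[$pos + 1]))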
{
     $str[$pos+1] = strtoupper($str[$pos+1]);
}
0
On

I don't believe there will be one good answer that covers all scenarios for you. The PHP.net user notes for ucwords contain a fair amount of discussion, but none seem to have an answer for every case. I would recommend that you standardize either by using uppercase or by leaving the user's input alone.

0
On

I use this:

    <?php
// Let's create a function, so we can reuse the logic
    function sentence_case($str){
        // Let's split our string into an array of words
        $words = explode(' ', $str);
        foreach($words as &$word){
            // Let's check if the word is uppercase; if so, ignore it
            if($word == mb_convert_case($word, MB_CASE_UPPER, "UTF-8")){
                continue;
            }
            // Otherwise, let's make the first character uppercase
           $word = mb_convert_case($word, MB_CASE_TITLE , "UTF-8");
        }
        // Join the individual words back into a string
        return implode(' ', $words);
    }
// echo sentence_case("tuyển nhân o'canel XV-YZ xp-hg iphone-plus viên bán hàng trên sàn MTĐT");
// "Tuyển Nhân O'Canel XV-YZ Xp-Hg Iphone-Plus Viên Bán Hàng Trên Sàn MTĐT"
1
On

This is a simpler and more direct answer to the main question. The function below follows the naming of PHP's built-in functions. In case PHP adds a function with this name in the future, a function_exists() check is run first. I use this reliably for any language in my WordPress installs.

$str = mb_ucfirst($str, 'UTF-8', true);

This makes the first letter uppercase and all other letters lowercase, as the question asked. If the third argument is set to false (the default), the rest of the string is not changed.

// Extends PHP
if (!function_exists('mb_ucfirst')) {

function mb_ucfirst($str, $encoding = "UTF-8", $lower_str_end = false) {
    $first_letter = mb_strtoupper(mb_substr($str, 0, 1, $encoding), $encoding);
    $str_end = "";
    if ($lower_str_end) {
        $str_end = mb_strtolower(mb_substr($str, 1, mb_strlen($str, $encoding), $encoding), $encoding);
    } else {
        $str_end = mb_substr($str, 1, mb_strlen($str, $encoding), $encoding);
    }
    $str = $first_letter . $str_end;
    return $str;
}

}
3
On

Use this built-in function:

ucwords('string');
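
As a quick check against the inputs from the question, here is what ucwords() does on its own, with no extra arguments and no prior strtolower() call:

```php
<?php
// ucwords() uppercases the first character of each whitespace-separated word
// and leaves every other character untouched.
echo ucwords("marriedname maidenname"), "\n"; // Marriedname Maidenname
echo ucwords("o'connell"), "\n";              // O'connell
echo ucwords("JOHN SMITH"), "\n";             // JOHN SMITH
```

So it covers the two-name case, but not the apostrophe case or input that is already in capitals.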