14 Dec 2007

How to pluralize in PHP (and please help me check the code)

An improved version of the code has been posted here. Please use that version instead of this version.

At some point, I should post about what I have been doing over the past few months. Suffice it to say that I am tinkering around with a few ideas and getting reacquainted with things like PHP and AJAX.

In the meantime, here is a small code snippet for how to pluralize in PHP. The original code came from two sources:

http://www.eval.ca/articles/php-pluralize (no license specified)
http://solarphp.com (BSD license)

The first source derives from the Rails source, which is covered under the MIT license. Since the BSD license is a tiny tiny bit more restrictive than the MIT license, I think that means that this code is covered under BSD.

I made a few changes from the two original versions:

  1. Started with the www.eval.ca version
  2. Changed the nested arrays to associative arrays and wrapped them in a class
  3. Added unpluralization rules from solarphp.com
  4. Changed suspicious pluralization rules
    1. Removed the unorthodox virus -> viri in favor of a general rule for *us -> *uses (e.g., viruses, cactuses, caucuses)
    2. I noticed rules for buffalo -> buffaloes and tomato -> tomatoes but none for potato -> potatoes, so I added one.

One quibble. It kind of bothers me that these algorithms have such specific rules for pluralizations that are unlikely to come up in computer software (ox -> oxen? octopus -> octopi?) and yet the rules are obviously not complete, because I found at least three problems just by inspection. Whether you agree or disagree with how I pluralized virus, the plural of cactus is not cactus, the plural of caucus is not caucus, and the plural of potato is not potatos.

Did I miss any pluralizations? Did I overstep my bounds by adding a rule that says *us -> *uses?
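One detail matters for a general *us -> *uses rule: a replacement such as "$1es" only keeps the "us" if the pattern actually captures it. The snippet below is a minimal sketch of that behavior using plain preg_replace calls; the variable names are mine and are not part of the class.

```php
<?php
// Sketch only: compares a general "us" rule with and without a capture group.

// With a group, "$1" carries the matched "us" into the result.
$withGroup = preg_replace('/(us)$/i', '$1es', 'cactus');

// Without a group, "$1" refers to nothing and expands to an empty string,
// so the "us" is silently dropped.
$withoutGroup = preg_replace('/us$/i', '$1es', 'cactus');

echo $withGroup, "\n";     // cactuses
echo $withoutGroup, "\n";  // cactes

// The replacement can also spell the ending out and skip backreferences.
echo preg_replace('/us$/i', 'uses', 'caucus'), "\n"; // caucuses
```

Either working form is fine as long as more specific rules (such as the one for octopus) are checked before it.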

// Thanks to http://www.eval.ca/articles/php-pluralize (derived from Rails, MIT license)
// As well as http://solarphp.com/trac/changeset/2214?format=diff&new=2214 (BSD license)

// Changes:
//   Removed rule for virus -> viri
//   Added rule for potato -> potatoes
//   Added rule for *us -> *uses

class Inflect
{
    static $plural = array(
        '/(quiz)$/i'               => "$1zes",
        '/^(ox)$/i'                => "$1en",
        '/([ml])ouse$/i'           => "$1ice",
        '/(matr|vert|ind)(?:ix|ex)$/i' => "$1ices",
        '/(x|ch|ss|sh)$/i'         => "$1es",
        '/([^aeiouy]|qu)y$/i'      => "$1ies",
        '/(hive)$/i'               => "$1s",
        '/(?:([^f])fe|([lr])f)$/i' => "$1$2ves",
        '/sis$/i'                  => "ses",
        '/([ti])um$/i'             => "$1a",
        '/(buffal|tomat|potat)o$/i'=> "$1oes",
        '/(bu)s$/i'                => "$1ses",
        '/(alias|status)$/i'       => "$1es",
        '/(octop)us$/i'            => "$1i",
        '/(ax|test)is$/i'          => "$1es",
        '/(us)$/i'                 => "$1es",
        '/s$/i'                    => "s",
        '/$/'                      => "s"
    );
    
    static $singular = array(
        '/(n)ews$/i'                => "$1ews",
        '/([ti])a$/i'               => "$1um",
        '/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i'  => "$1sis",
        '/(^analy)ses$/i'           => "$1sis",
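        // NOTE: this rule is checked before the hive, tive and [lr]ves rules
        // below, so it shadows them (hives -> hife, shelves -> shelfe).
        '/([^f])ves$/i'             => "$1fe",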
        '/(hive)s$/i'               => "$1",
        '/(tive)s$/i'               => "$1",
        '/([lr])ves$/i'             => "$1f",
        '/(s)eries$/i'              => "$1eries",
        '/(m)ovies$/i'              => "$1ovie",
        '/([^aeiouy]|qu)ies$/i'     => "$1y",
        '/(x|ch|ss|sh)es$/i'        => "$1",
        '/([ml])ice$/i'             => "$1ouse",
        '/(bus)es$/i'               => "$1",
        '/(shoe)s$/i'               => "$1",
        '/(o)es$/i'                 => "$1",
        '/(cris|ax|test)es$/i'      => "$1is",
        '/(octop|vir)i$/i'          => "$1us",
        '/(alias|status)es$/i'      => "$1",
        '/^(ox)en$/i'               => "$1",
        '/(vert|ind)ices$/i'        => "$1ex",
        '/(matr)ices$/i'            => "$1ix",
        '/(quiz)zes$/i'             => "$1",
        '/(us)es$/i'                => "$1",
        '/s$/i'                     => ""
    );
    
    static $irregular = array(
        array( 'move',   'moves'    ),
        array( 'sex',    'sexes'    ),
        array( 'child',  'children' ),
        array( 'man',    'men'      ),
        array( 'person', 'people'   )
    );
    
    static $uncountable = array( 
        'sheep', 
        'fish',
        'series',
        'species',
        'money',
        'rice',
        'information',
        'equipment'
    );
    
    public static function pluralize( $string ) 
    {
        // save some time in the case that singular and plural are the same
        if ( in_array( strtolower( $string ), self::$uncountable ) )
            return $string;

        // check for irregular singular forms
        foreach ( self::$irregular as $noun )
        {
            if ( strtolower( $string ) == $noun[0] )
                return $noun[1];
        }
        
        // check for matches using regular expressions
        foreach ( self::$plural as $pattern => $result )
        {
            if ( preg_match( $pattern, $string ) )
                return preg_replace( $pattern, $result, $string );
        }
        
        return $string;
    }
    
    public static function singularize( $string )
    {
        // save some time in the case that singular and plural are the same
        if ( in_array( strtolower( $string ), self::$uncountable ) )
            return $string;

        // check for irregular plural forms
        foreach ( self::$irregular as $noun )
        {
            if ( strtolower( $string ) == $noun[1] )
                return $noun[0];
        }
        
        // check for matches using regular expressions
        foreach ( self::$singular as $pattern => $result )
        {
            if ( preg_match( $pattern, $string ) )
                return preg_replace( $pattern, $result, $string );
        }
        
        return $string;
    }
    
    public static function pluralize_if($count, $string)
    {
        if ($count == 1)
            return "1 $string";
        else
            return $count . " " . self::pluralize($string);
    }
}
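
The lookup order used by pluralize() (uncountable words, then irregular nouns, then the first matching pattern) can be seen in miniature below. This is a rough sketch rather than a replacement for the class: the function names pluralize_sketch and count_phrase_sketch are hypothetical, the three tables are tiny subsets chosen for illustration, and the irregular nouns are stored as a keyed array purely to keep the example short.

```php
<?php
// Miniature of the three-step lookup: uncountable, irregular, then regex rules.
function pluralize_sketch($word)
{
    // Hypothetical, deliberately tiny tables.
    $uncountable = array('sheep', 'fish');
    $irregular   = array('child' => 'children', 'person' => 'people');
    $rules       = array(
        '/(x|ch|ss|sh)$/i'    => '$1es',
        '/([^aeiouy]|qu)y$/i' => '$1ies',
        '/(tomat|potat)o$/i'  => '$1oes',
        '/s$/i'               => 's',
        '/$/'                 => 's',
    );

    $lower = strtolower($word);

    // 1. Words whose singular and plural are identical.
    if (in_array($lower, $uncountable, true)) {
        return $word;
    }

    // 2. Irregular nouns, looked up by their lowercase singular.
    if (isset($irregular[$lower])) {
        return $irregular[$lower];
    }

    // 3. The first matching pattern wins, so specific rules come first.
    foreach ($rules as $pattern => $replacement) {
        if (preg_match($pattern, $word)) {
            return preg_replace($pattern, $replacement, $word);
        }
    }

    return $word;
}

// Counterpart of pluralize_if(): only a count of exactly 1 stays singular.
function count_phrase_sketch($count, $noun)
{
    return $count == 1 ? "1 $noun" : $count . ' ' . pluralize_sketch($noun);
}

echo pluralize_sketch('sheep'), "\n";        // sheep
echo pluralize_sketch('child'), "\n";        // children
echo pluralize_sketch('city'), "\n";         // cities
echo pluralize_sketch('day'), "\n";          // days
echo count_phrase_sketch(1, 'potato'), "\n"; // 1 potato
echo count_phrase_sketch(3, 'potato'), "\n"; // 3 potatoes
echo count_phrase_sketch(0, 'box'), "\n";    // 0 boxes
```

Because the loop returns on the first match, the order of the patterns is part of the rule set: moving a general pattern above a specific one changes the results.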

Enjoy!

3 Responses to “How to pluralize in PHP (and please help me check the code)”

  1. sho

    Dang. Spotted another problem. Doesn’t this rule:

    '/([^f])ves$/i' => "$1fe",

    cause all sorts of problems when singularizing? (shelves -> shelfe, doves -> dofe, elves -> elfe, sheaves -> sheafe, leaves -> leafe, etc.)

  2. How to Pluralize in PHP | David Bisset: Web Designer, Coder, Wordpress Guru

    […] One of those useful but sometimes hard to find (good) PHP scripts. How to pluralize in PHP. It’s a work in progress with tweaking, but looks mostly solid. Tags: PHP […]

  3. Arvind Kumar

    Good work.

    You might want to look at CakePHP's Inflector class (http://api.cakephp.org/class/inflector). I think it is more thoroughly worked out and a good one.
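
On the -ves problem raised in the first response: because the first matching pattern wins, one possible approach is to list the specific rules ahead of the general one. The snippet below is a sketch under that assumption. The word list in the first pattern is a hypothetical, incomplete sample, and singularize_sketch is an illustrative name, not part of the class.

```php
<?php
// Sketch: singularizing "-ves" words with specific rules ahead of the general one.
$rules = array(
    '/(shea|lea|loa|thie)ves$/i' => '$1f',  // hypothetical sample of -f words
    '/(hive)s$/i'                => '$1',   // hives, archives
    '/(tive)s$/i'                => '$1',   // natives, detectives
    '/([lr])ves$/i'              => '$1f',  // shelves, elves
    '/([^f])ves$/i'              => '$1fe', // wives, knives
    '/s$/i'                      => '',
);

function singularize_sketch($word, array $rules)
{
    // The first matching pattern wins.
    foreach ($rules as $pattern => $replacement) {
        if (preg_match($pattern, $word)) {
            return preg_replace($pattern, $replacement, $word);
        }
    }
    return $word;
}

echo singularize_sketch('shelves', $rules), "\n"; // shelf
echo singularize_sketch('elves', $rules), "\n";   // elf
echo singularize_sketch('leaves', $rules), "\n";  // leaf
echo singularize_sketch('hives', $rules), "\n";   // hive
echo singularize_sketch('wives', $rules), "\n";   // wife
echo singularize_sketch('doves', $rules), "\n";   // dofe (still wrong)
```

Reordering fixes the words that a more specific rule already covers, but a pattern alone cannot tell doves from wives. Words like dove would still need an explicit exception list.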
