How to pluralize in PHP (and please help me check the code)
At some point, I should post about what I have been doing over the past few months. Suffice it to say that I am tinkering around with a few ideas and getting reacquainted with things like PHP and AJAX.
In the meantime, here is a small code snippet for how to pluralize in PHP. The original code came from two sources:
http://www.eval.ca/articles/php-pluralize (no license specified)
http://solarphp.com (BSD license)
The first source derives from the Rails source, which is covered under the MIT license. Since the BSD license is a tiny tiny bit more restrictive than the MIT license, I think that means that this code is covered under BSD.
I made a few changes from the two original versions:
- Started with www.eval.ca version
- Changed nested arrays to associative array and wrapped it in a class
- Added unpluralization rules from solarphp.com
- Changed suspicious pluralization rules:
  - Removed unorthodox virus -> viri in favor of general rule for *us -> *uses (e.g., viruses, cactuses, caucuses)
  - I noticed a rule for buffalo -> buffaloes and tomato -> tomatoes but not one for potato -> potatoes. I added it.
One quibble. It kind of bothers me that these algorithms have such specific rules for pluralizations that are unlikely to come up in computer software (ox -> oxen? octopus -> octopi?) and yet the rules are obviously not complete, because I found at least three problems just by inspection. Whether you agree or disagree with how I pluralized virus, the plural of cactus is not cactus, the plural of caucus is not caucus, and the plural of potato is not potatos.
Did I miss any pluralizations? Did I overstep my bounds by adding a rule that says *us -> *uses?
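To make the *us -> *uses question concrete, here is a minimal sketch of that rule on its own. The helper name `pluralize_us()` is hypothetical and not part of the class; it keeps the octopus exception ahead of the general rule, and it assumes first-match-wins ordering. Note that the general pattern needs a capture group, because a `$1` with no matching group expands to an empty string.

```php
<?php
// Hypothetical helper, for illustration only: specific rule first,
// then the general *us -> *uses rule. The first matching rule wins.
function pluralize_us(string $word): string
{
    $rules = array(
        '/(octop)us$/i' => '$1i',  // specific exception
        '/(us)$/i'      => '$1es', // general rule; the group keeps the "us"
    );
    foreach ($rules as $pattern => $replacement) {
        if (preg_match($pattern, $word)) {
            return preg_replace($pattern, $replacement, $word);
        }
    }
    // Fallback for words that do not end in "us"
    return $word . 's';
}

echo pluralize_us('cactus'), "\n";  // cactuses
echo pluralize_us('octopus'), "\n"; // octopi
```

Whether the octopus exception is worth keeping is a judgment call; dropping it would simply let the general rule produce "octopuses".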
// Thanks to http://www.eval.ca/articles/php-pluralize (MIT license)
// As well as http://solarphp.com/trac/changeset/2214?format=diff&new=2214 (BSD license)
// Changes:
//   Removed rule for virus -> viri
//   Added rule for potato -> potatoes
//   Added rule for *us -> *uses
class Inflect
{
    // Rules are tried top to bottom and the first match wins,
    // so specific patterns must come before general ones.
    static $plural = array(
        '/(quiz)$/i'                   => "$1zes",
        '/^(ox)$/i'                    => "$1en",
        '/([ml])ouse$/i'               => "$1ice",
        '/(matr|vert|ind)(?:ix|ex)$/i' => "$1ices",
        '/(x|ch|ss|sh)$/i'             => "$1es",
        '/([^aeiouy]|qu)y$/i'          => "$1ies",
        '/(hive)$/i'                   => "$1s",
        '/(?:([^f])fe|([lr])f)$/i'     => "$1$2ves",
        '/sis$/i'                      => "ses",
        '/([ti])um$/i'                 => "$1a",
        '/(buffal|tomat|potat)o$/i'    => "$1oes",
        '/(bu)s$/i'                    => "$1ses",
        '/(alias|status)$/i'           => "$1es",
        '/(octop)us$/i'                => "$1i",
        '/(ax|test)is$/i'              => "$1es",
        '/(us)$/i'                     => "$1es", // group needed so $1 is not empty
        '/s$/i'                        => "s",    // already plural: leave unchanged
        '/$/'                          => "s"
    );

    static $singular = array(
        '/(n)ews$/i'                   => "$1ews",
        '/([ti])a$/i'                  => "$1um",
        '/(analy|ba|diagno|parenthe|progno|synop|the)ses$/i' => "$1sis",
        // Known to be too broad; see the follow-up note after the code.
        '/([^f])ves$/i'                => "$1fe",
        '/(hive)s$/i'                  => "$1",
        '/(tive)s$/i'                  => "$1",
        '/([lr])ves$/i'                => "$1f",
        '/(s)eries$/i'                 => "$1eries",
        '/(m)ovies$/i'                 => "$1ovie", // must precede the general *ies rule
        '/([^aeiouy]|qu)ies$/i'        => "$1y",
        '/(x|ch|ss|sh)es$/i'           => "$1",
        '/([ml])ice$/i'                => "$1ouse",
        '/(bus)es$/i'                  => "$1",
        '/(shoe)s$/i'                  => "$1",     // must precede the general *oes rule
        '/(o)es$/i'                    => "$1",
        '/(cris|ax|test)es$/i'         => "$1is",
        '/(octop|vir)i$/i'             => "$1us",
        '/(alias|status)es$/i'         => "$1",
        '/^(ox)en$/i'                  => "$1",
        '/(vert|ind)ices$/i'           => "$1ex",
        '/(matr)ices$/i'               => "$1ix",
        '/(quiz)zes$/i'                => "$1",
        '/(us)es$/i'                   => "$1",
        '/s$/i'                        => ""
    );

    // Each entry is array( singular, plural )
    static $irregular = array(
        array( 'move',   'moves'    ),
        array( 'sex',    'sexes'    ),
        array( 'child',  'children' ),
        array( 'man',    'men'      ),
        array( 'person', 'people'   )
    );

    static $uncountable = array(
        'sheep', 'fish', 'series', 'species',
        'money', 'rice', 'information', 'equipment'
    );

    public static function pluralize( $string )
    {
        // save some time in the case that singular and plural are the same
        if ( in_array( strtolower( $string ), self::$uncountable ) )
            return $string;

        // check for irregular singular forms
        foreach ( self::$irregular as $noun )
        {
            if ( strtolower( $string ) == $noun[0] )
                return $noun[1];
        }

        // check for matches using regular expressions
        foreach ( self::$plural as $pattern => $result )
        {
            if ( preg_match( $pattern, $string ) )
                return preg_replace( $pattern, $result, $string );
        }

        return $string;
    }

    public static function singularize( $string )
    {
        // save some time in the case that singular and plural are the same
        if ( in_array( strtolower( $string ), self::$uncountable ) )
            return $string;

        // check for irregular plural forms
        foreach ( self::$irregular as $noun )
        {
            if ( strtolower( $string ) == $noun[1] )
                return $noun[0];
        }

        // check for matches using regular expressions
        foreach ( self::$singular as $pattern => $result )
        {
            if ( preg_match( $pattern, $string ) )
                return preg_replace( $pattern, $result, $string );
        }

        return $string;
    }

    // Returns e.g. "1 apple" or "3 apples"
    public static function pluralize_if( $count, $string )
    {
        if ( $count == 1 )
            return "1 $string";
        else
            return $count . " " . self::pluralize( $string );
    }
}
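Since I asked for help checking the code, one way to do that is a table of expected pairs run through the inflector. This is only a sketch: `find_mismatches()` is a hypothetical helper, and the `$naive` closure is a stand-in that just appends "s" so the snippet runs on its own. In practice you would pass `array('Inflect', 'pluralize')` instead.

```php
<?php
// Hypothetical checker: returns every case where $inflect(input)
// differs from the expected form, keyed by input word.
function find_mismatches(callable $inflect, array $expected): array
{
    $mismatches = array();
    foreach ($expected as $input => $want) {
        $got = $inflect($input);
        if ($got !== $want) {
            $mismatches[$input] = $got;
        }
    }
    return $mismatches;
}

// Stand-in inflector (assumption): only appends "s".
$naive = function (string $word): string {
    return $word . 's';
};

$cases = array(
    'dog'    => 'dogs',
    'potato' => 'potatoes',
    'cactus' => 'cactuses',
);

// Prints the two words the stand-in gets wrong: potato and cactus.
print_r(find_mismatches($naive, $cases));
```

Growing the `$cases` table one word at a time is a cheap way to catch rule-ordering mistakes, since each rule list is first-match-wins.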
Enjoy!
Dang. Spotted another problem. Doesn’t this rule:
'/([^f])ves$/i' => "$1fe",
cause all sorts of problems when singularizing? (shelves -> shelfe, doves -> dofe, elves -> elfe, sheaves -> sheafe, leaves -> leafe, etc.)
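The effect is easy to reproduce in isolation. The sketch below runs the rule as posted, then tries one possible alternative: list the words that really end in -fe or -f, and let everything else just drop the final "s". The function `singularize_ves()` and its word lists are hypothetical and deliberately incomplete; they are an assumption for illustration, not a replacement for the full rule set.

```php
<?php
// The rule as posted: any "ves" not preceded by "f" becomes "fe".
function singularize_ves_as_posted(string $word): string
{
    return preg_replace('/([^f])ves$/i', '$1fe', $word);
}

// Hypothetical alternative: most specific rule first, first match wins.
function singularize_ves(string $word): string
{
    $rules = array(
        '/^(kni|wi|li)ves$/i'                       => '$1fe', // knife, wife, life
        '/(el|wol|hal|cal|lea|shea|loa|thie)ves$/i' => '$1f',  // shelf, elf, leaf, ...
        '/s$/i'                                     => '',     // doves -> dove
    );
    foreach ($rules as $pattern => $replacement) {
        if (preg_match($pattern, $word)) {
            return preg_replace($pattern, $replacement, $word);
        }
    }
    return $word;
}

echo singularize_ves_as_posted('shelves'), "\n"; // shelfe
echo singularize_ves('shelves'), "\n";           // shelf
echo singularize_ves('doves'), "\n";             // dove
```

The trade-off is that an explicit word list never guesses wrong on words like "doves", but it has to be extended by hand for every -f or -fe noun you care about.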
Good work.
You might want to look at CakePHP’s Inflector class (http://api.cakephp.org/class/inflector). I think it is more thoroughly worked out and a good one.