Commit 5256323

feature #36929 Added a FrenchInflector for the String component (Alexandre-T)
This PR was merged into the 5.2-dev branch.

Discussion
----------

Added a FrenchInflector for the String component

I read this sentence in [this blog post](https://symfony.com/blog/new-in-symfony-5-1-deprecated-the-inflector-component):

> Symfony Inflector component converts words between their singular and plural forms (**for now, only in English**)

So I created a `FrenchInflector` class implementing the `InflectorInterface` from the String component. The inflector uses regular expressions and is tested in `FrenchInflectorTest` against many of the French exceptions.

| Q             | A
| ------------- | ---
| Branch        | master
| Bug fix       | no
| New feature   | yes
| Deprecations  | no
| License       | MIT
| Doc PR        | Not yet

The changelog has been updated, but I'm not sure I put the entry in the right section.

I don't know whether I should update symfony/symfony-docs, but I have written an example and could open a PR with it if you want.

```php
<?php

use Symfony\Component\String\Inflector\FrenchInflector;

$inflector = new FrenchInflector();

$result = $inflector->singularize('dents'); // ['dent']
$result = $inflector->singularize('souris'); // ['souris']
$result = $inflector->singularize('messieurs'); // ['monsieur']
$result = $inflector->pluralize('cinquante'); // ['cinquante']
$result = $inflector->pluralize('pou'); // ['poux']
$result = $inflector->pluralize('cheval'); // ['chevaux']
```

**fabbot.io** reports a typo, but it is a false positive: the suggested patch replaces the French word 'embarras' with 'embarrass'. I will not remove or replace it, because "embarras" is an invariant word.

Commits
-------

d903d9a Added a FrenchInflector for the String component

French inflector implements InflectorInterface, it uses regexp and it is tested in the FrenchInflectorTest
2 parents 281a752 + d903d9a commit 5256323

3 files changed, +309 -0 lines changed

src/Symfony/Component/String/CHANGELOG.md

Lines changed: 5 additions & 0 deletions
@@ -1,6 +1,11 @@
 CHANGELOG
 =========
 
+5.2.0
+-----
+
+ * added a `FrenchInflector` class
+
 5.1.0
 -----
 

Lines changed: 156 additions & 0 deletions
@@ -0,0 +1,156 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+namespace Symfony\Component\String\Inflector;
+
+/**
+ * French inflector.
+ *
+ * This class only inflects nouns; it does not handle adjectives or compound words like "soixante-dix".
+ */
+final class FrenchInflector implements InflectorInterface
+{
+    /**
+     * A list of all rules for pluralization.
+     * @see https://la-conjugaison.nouvelobs.com/regles/grammaire/le-pluriel-des-noms-121.php
+     */
+    private static $pluralizeRegexp = [
+        // First entry: regexp
+        // Second entry: replacement
+
+        // Words ending in "s", "x" or "z" are invariable
+        // Les mots finissant par "s", "x" ou "z" sont invariables
+        ['/(s|x|z)$/i', '\1'],
+
+        // Words ending in "eau" take an "x" in the plural
+        // Les mots finissant par "eau" prennent tous un "x" au pluriel
+        ['/(eau)$/i', '\1x'],
+
+        // Words ending in "au" take an "x" in the plural, except "landau"
+        // Les mots finissant par "au" prennent un "x" au pluriel sauf "landau"
+        ['/^(landau)$/i', '\1s'],
+        ['/(au)$/i', '\1x'],
+
+        // Words ending in "eu" take an "x" in the plural, except "pneu", "bleu" and "émeu"
+        // Les mots finissant en "eu" prennent un "x" au pluriel sauf "pneu", "bleu", "émeu"
+        ['/^(pneu|bleu|émeu)$/i', '\1s'],
+        ['/(eu)$/i', '\1x'],
+
+        // Words ending in "al" take "aux" in the plural, except the words matched by the first rule below
+        // Les mots finissant en "al" se terminent en "aux" sauf
+        ['/^(bal|carnaval|caracal|chacal|choral|corral|étal|festival|récital|val)$/i', '\1s'],
+        ['/al$/i', 'aux'],
+
+        // "aspirail", "bail", "corail", "émail", "fermail", "soupirail", "travail", "vantail" and "vitrail" form their plural in "-aux"
+        ['/^(aspir|b|cor|ém|ferm|soupir|trav|vant|vitr)ail$/i', '\1aux'],
+
+        // "bijou", "caillou", "chou", "genou", "hibou", "joujou" and "pou" take an "x" in the plural
+        ['/^(bij|caill|ch|gen|hib|jouj|p)ou$/i', '\1oux'],
+
+        // Invariable words
+        ['/^(cinquante|soixante|mille)$/i', '\1'],
+
+        // French titles
+        ['/^(mon|ma)(sieur|dame|demoiselle|seigneur)$/', 'mes\2s'],
+        ['/^(Mon|Ma)(sieur|dame|demoiselle|seigneur)$/', 'Mes\2s'],
+    ];
+
+    /**
+     * A list of all rules for singularization.
+     */
+    private static $singularizeRegexp = [
+        // First entry: regexp
+        // Second entry: replacement
+
+        // "aspirail", "bail", "corail", "émail", "fermail", "soupirail", "travail", "vantail" and "vitrail" form their plural in "-aux"
+        ['/(aspir|b|cor|ém|ferm|soupir|trav|vant|vitr)aux$/i', '\1ail'],
+
+        // Words ending in "eau" take an "x" in the plural
+        // Les mots finissant par "eau" prennent tous un "x" au pluriel
+        ['/(eau)x$/i', '\1'],
+
+        // Words ending in "al" take "aux" in the plural; only the stems listed below are turned back into "al"
+        // Les mots finissant en "al" se terminent en "aux" sauf
+        ['/(amir|anim|arsen|boc|can|capit|capor|chev|crist|génér|hopit|hôpit|idé|journ|littor|loc|m|mét|minér|princip|radic|termin)aux$/i', '\1al'],
+
+        // Words ending in "au" take an "x" in the plural, except "landau"
+        // Les mots finissant par "au" prennent un "x" au pluriel sauf "landau"
+        ['/(au)x$/i', '\1'],
+
+        // Words ending in "eu" take an "x" in the plural, except "pneu", "bleu" and "émeu"
+        // Les mots finissant en "eu" prennent un "x" au pluriel sauf "pneu", "bleu", "émeu"
+        ['/(eu)x$/i', '\1'],
+
+        // Words ending in "ou" take an "s" in the plural, except bijou, caillou, chou, genou, hibou, joujou and pou
+        // Les mots finissant par "ou" prennent un "s" sauf bijou, caillou, chou, genou, hibou, joujou, pou
+        ['/(bij|caill|ch|gen|hib|jouj|p)oux$/i', '\1ou'],
+
+        // French titles
+        ['/^mes(dame|demoiselle)s$/', 'ma\1'],
+        ['/^Mes(dame|demoiselle)s$/', 'Ma\1'],
+        ['/^mes(sieur|seigneur)s$/', 'mon\1'],
+        ['/^Mes(sieur|seigneur)s$/', 'Mon\1'],
+
+        // Default rule
+        ['/s$/i', ''],
+    ];
+
+    /**
+     * A list of words which should not be inflected.
+     * Both singularize() and pluralize() check this list before applying any rule.
+     */
+    private static $uninflected = '/^(abcès|accès|abus|albatros|anchois|anglais|autobus|bois|brebis|carquois|cas|chas|colis|concours|corps|cours|cyprès|décès|devis|discours|dos|embarras|engrais|entrelacs|excès|fils|fois|gâchis|gars|glas|héros|intrus|jars|jus|kermès|lacis|legs|lilas|marais|mars|matelas|mépris|mets|mois|mors|obus|os|palais|paradis|parcours|pardessus|pays|plusieurs|poids|pois|pouls|printemps|processus|progrès|puits|pus|rabais|radis|recors|recours|refus|relais|remords|remous|rictus|rhinocéros|repas|rubis|sas|secours|sens|souris|succès|talus|tapis|tas|taudis|temps|tiers|univers|velours|verglas|vernis|virus)$/i';
+
+    /**
+     * {@inheritdoc}
+     */
+    public function singularize(string $plural): array
+    {
+        if ($this->isInflectedWord($plural)) {
+            return [$plural];
+        }
+
+        foreach (self::$singularizeRegexp as $rule) {
+            [$regexp, $replace] = $rule;
+
+            if (1 === preg_match($regexp, $plural)) {
+                return [preg_replace($regexp, $replace, $plural)];
+            }
+        }
+
+        return [$plural];
+    }
+
+    /**
+     * {@inheritdoc}
+     */
+    public function pluralize(string $singular): array
+    {
+        if ($this->isInflectedWord($singular)) {
+            return [$singular];
+        }
+
+        foreach (self::$pluralizeRegexp as $rule) {
+            [$regexp, $replace] = $rule;
+
+            if (1 === preg_match($regexp, $singular)) {
+                return [preg_replace($regexp, $replace, $singular)];
+            }
+        }
+
+        return [$singular.'s'];
+    }
+
+    private function isInflectedWord(string $word): bool
+    {
+        return 1 === preg_match(self::$uninflected, $word);
+    }
+}
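
Both rule tables in the class above are scanned from top to bottom and the first matching pattern wins, which is why an exception such as "landau" has to be listed before the general "au" rule. The following is a minimal standalone sketch of that lookup, not the class itself: the three-rule table is a trimmed assumption and `pluralizeWithRules()` is a hypothetical helper that does not exist in the commit.

```php
<?php

// Hypothetical, trimmed-down rule table: [regexp, replacement], most specific first.
$rules = [
    ['/(s|x|z)$/i', '\1'],    // already ends in s, x or z: left unchanged
    ['/^(landau)$/i', '\1s'], // exception, listed before the general "au" rule
    ['/(au)$/i', '\1x'],      // general rule for words ending in "au"
];

/**
 * Applies the first rule whose pattern matches; appends "s" when none does.
 */
function pluralizeWithRules(array $rules, string $singular): string
{
    foreach ($rules as [$regexp, $replacement]) {
        if (1 === preg_match($regexp, $singular)) {
            return preg_replace($regexp, $replacement, $singular);
        }
    }

    // Fallback: the default French plural.
    return $singular.'s';
}

echo pluralizeWithRules($rules, 'landau'), "\n";  // landaus
echo pluralizeWithRules($rules, 'noyau'), "\n";   // noyaux
echo pluralizeWithRules($rules, 'voiture'), "\n"; // voitures
echo pluralizeWithRules($rules, 'nez'), "\n";     // nez
```

If the two "au" rules were swapped in this sketch, "landau" would match the general rule first and come out as "landaux", so the order of the table is part of the logic.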
Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+<?php
+
+/*
+ * This file is part of the Symfony package.
+ *
+ * (c) Fabien Potencier <fabien@symfony.com>
+ *
+ * For the full copyright and license information, please view the LICENSE
+ * file that was distributed with this source code.
+ */
+
+namespace Symfony\Component\String\Tests;
+
+use PHPUnit\Framework\TestCase;
+use Symfony\Component\String\Inflector\FrenchInflector;
+
+class FrenchInflectorTest extends TestCase
+{
+    public function pluralizeProvider()
+    {
+        return [
+            // Default plural
+            ['voiture', 'voitures'],
+            // Special characters
+            ['œuf', 'œufs'],
+            ['oeuf', 'oeufs'],
+
+            // Words ending in s, x or z are invariable in number
+            ['bois', 'bois'],
+            ['fils', 'fils'],
+            ['héros', 'héros'],
+            ['nez', 'nez'],
+            ['rictus', 'rictus'],
+            ['souris', 'souris'],
+            ['tas', 'tas'],
+            ['toux', 'toux'],
+
+            // Words ending in eau all take an x in the plural
+            ['eau', 'eaux'],
+            ['sceau', 'sceaux'],
+
+            // Words ending in au all take an x in the plural, except landau
+            ['noyau', 'noyaux'],
+            ['landau', 'landaus'],
+
+            // Words ending in eu take an x in the plural, except pneu, bleu and émeu
+            ['pneu', 'pneus'],
+            ['bleu', 'bleus'],
+            ['émeu', 'émeus'],
+            ['cheveu', 'cheveux'],
+
+            // Words ending in al end in aux in the plural
+            ['amiral', 'amiraux'],
+            ['animal', 'animaux'],
+            ['arsenal', 'arsenaux'],
+            ['bocal', 'bocaux'],
+            ['canal', 'canaux'],
+            ['capital', 'capitaux'],
+            ['caporal', 'caporaux'],
+            ['cheval', 'chevaux'],
+            ['cristal', 'cristaux'],
+            ['général', 'généraux'],
+            ['hopital', 'hopitaux'],
+            ['hôpital', 'hôpitaux'],
+            ['idéal', 'idéaux'],
+            ['journal', 'journaux'],
+            ['littoral', 'littoraux'],
+            ['local', 'locaux'],
+            ['mal', 'maux'],
+            ['métal', 'métaux'],
+            ['minéral', 'minéraux'],
+            ['principal', 'principaux'],
+            ['radical', 'radicaux'],
+            ['terminal', 'terminaux'],
+
+            // except bal, carnaval, caracal, chacal, choral, corral, étal, festival, récital and val
+            ['bal', 'bals'],
+            ['carnaval', 'carnavals'],
+            ['caracal', 'caracals'],
+            ['chacal', 'chacals'],
+            ['choral', 'chorals'],
+            ['corral', 'corrals'],
+            ['étal', 'étals'],
+            ['festival', 'festivals'],
+            ['récital', 'récitals'],
+            ['val', 'vals'],
+
+            // Nouns ending in -ail take an s in the plural.
+            ['portail', 'portails'],
+            ['rail', 'rails'],
+
+            // EXCEPT aspirail, bail, corail, émail, fermail, soupirail, travail, vantail and vitrail, whose plural ends in -aux
+            ['aspirail', 'aspiraux'],
+            ['bail', 'baux'],
+            ['corail', 'coraux'],
+            ['émail', 'émaux'],
+            ['fermail', 'fermaux'],
+            ['soupirail', 'soupiraux'],
+            ['travail', 'travaux'],
+            ['vantail', 'vantaux'],
+            ['vitrail', 'vitraux'],
+
+            // Nouns ending in -ou take an s in the plural.
+            ['trou', 'trous'],
+            ['fou', 'fous'],
+
+            // EXCEPT bijou, caillou, chou, genou, hibou, joujou and pou, which take an x in the plural
+            ['bijou', 'bijoux'],
+            ['caillou', 'cailloux'],
+            ['chou', 'choux'],
+            ['genou', 'genoux'],
+            ['hibou', 'hiboux'],
+            ['joujou', 'joujoux'],
+            ['pou', 'poux'],
+
+            // Invariable words
+            ['cinquante', 'cinquante'],
+            ['soixante', 'soixante'],
+            ['mille', 'mille'],
+
+            // Titles
+            ['monsieur', 'messieurs'],
+            ['madame', 'mesdames'],
+            ['mademoiselle', 'mesdemoiselles'],
+            ['monseigneur', 'messeigneurs'],
+        ];
+    }
+
+    /**
+     * @dataProvider pluralizeProvider
+     */
+    public function testSingularize(string $singular, string $plural)
+    {
+        $this->assertSame([$singular], (new FrenchInflector())->singularize($plural));
+        // test casing: if the first letter was uppercase, it should remain so
+        $this->assertSame([ucfirst($singular)], (new FrenchInflector())->singularize(ucfirst($plural)));
+    }
+
+    /**
+     * @dataProvider pluralizeProvider
+     */
+    public function testPluralize(string $singular, string $plural)
+    {
+        $this->assertSame([$plural], (new FrenchInflector())->pluralize($singular));
+        // test casing: if the first letter was uppercase, it should remain so
+        $this->assertSame([ucfirst($plural)], (new FrenchInflector())->pluralize(ucfirst($singular)));
+    }
+}
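
The provider above includes invariant words such as "souris" and "fils". They come back from singularization unchanged only because the uninflected check runs before the default rule that strips a trailing "s". Here is a rough sketch of that guard-then-default ordering, assuming a shortened word list and a hypothetical `singularizeSketch()` function that is not part of the commit:

```php
<?php

/**
 * Hypothetical reduction of the singularize flow to two steps:
 * an invariant-word guard followed by the default "strip s" rule.
 */
function singularizeSketch(string $plural): string
{
    // Assumed short list for illustration; the class keeps a much longer pattern.
    $uninflected = '/^(bois|fils|souris|tas)$/i';

    // Guard first: invariant words must never reach the default rule.
    if (1 === preg_match($uninflected, $plural)) {
        return $plural;
    }

    // Default rule: drop a trailing "s" if there is one.
    return preg_replace('/s$/i', '', $plural);
}

echo singularizeSketch('souris'), "\n";   // souris
echo singularizeSketch('voitures'), "\n"; // voiture
echo singularizeSketch('Fils'), "\n";     // Fils
```

Without the guard, the default rule in this sketch would turn "souris" into "souri", which is the kind of regression the invariant entries in the provider are there to catch.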
