I'm writing a Romanian hyphenation script. My previous question (solved) is "regex if capture group matches string", in case you want to take a look. This is the regex that deals with adjacent vowels that are not diphthongs or triphthongs:
(?:[aeiou])(?=[aeiou][bcdfghjklmnprstvwxyz]{0,})
I can't figure out how to add two exceptions to this:
"ii" in final position stays together.
The "ii" group is usually preceded by a consonant, except in the case of "copiii", which is hyphenated -pi-ii.
Demo: https://regex101.com/r/ew4JUh/1
The expected result, except for the word "copiii", is always one or more consonants followed by the "ii" group in the same syllable:
muschii = mus-chii
pomii = po-mii
EDIT:
Just in case someone ever needs to do the same, you can find the script so far here:
It works - mostly.
It implements the rules as I understand them. The only issue is that probably half the words stand in exception to the rules. So while the script does what it should, it cannot deal with exceptions that cannot be anticipated.
Ex:
avion = a-vi-on
iodat = io-dat
piatra = pia-tra
diamant = di-a-mant
And so on ad infinitum. I don't believe there's any rule to establish when the vowels are grouped as diphthongs or triphthongs and when they belong to different syllables.
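One possible workaround, which is an assumption on my part and not something the script linked above is known to do, is to check a table of known hyphenations before falling back to the rules. The sketch below uses hypothetical names (KNOWN_HYPHENATIONS, hyphenate), fills the table only with the four examples listed here, and uses the vowel-vowel pattern quoted at the end of this post as a stand-in for the full rule set.

```python
import re

# Hypothetical lookup table; the entries are the four examples from this post.
KNOWN_HYPHENATIONS = {
    "avion": "a-vi-on",
    "iodat": "io-dat",
    "piatra": "pia-tra",
    "diamant": "di-a-mant",
}

# Stand-in for the real rules: split after a vowel that is followed by
# another vowel, unless a word-final "ii" starts at that position.
HIATUS = re.compile(r"(?!ii\b)[aeiou](?=[aeiou])")


def hyphenate(word: str) -> str:
    """Return the listed hyphenation if there is one, else apply the rule."""
    key = word.lower()
    if key in KNOWN_HYPHENATIONS:
        return KNOWN_HYPHENATIONS[key]
    # Fallback covers only the vowel-vowel break, not the consonant rules.
    return HIATUS.sub(lambda m: m.group(0) + "-", word)
```

This doesn't solve the underlying problem, since the table still has to be filled by hand, but it keeps the exceptions out of the regexes themselves.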
On the plus side I know more grammar and more regex than ever :)
Many thanks to Wiktor who helped immensely.
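(?!ii\b)[aeiou](?=[aeiou]), see demo. The (?!ii\b) negative lookahead makes the match fail at any position where a word-final "ii" begins, so that pair is never split. In "copiii" the first "i" still matches, because the "ii" starting there is followed by a third "i" rather than a word boundary, which gives -pi-ii. The [bcdfghjklmnprstvwxyz]{0,} part of the original lookahead can match zero characters, so dropping it does not change what matches.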
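To try the pattern outside regex101, here is a minimal Python sketch. The function name split_adjacent_vowels and the choice to insert a hyphen after each match are my assumptions; the post doesn't say which language or substitution the script uses. It only handles the vowel-vowel break, so the consonant splits (mus-chii, po-mii) are left to the other rules.

```python
import re

# Pattern quoted above: a vowel followed by another vowel, skipped when a
# word-final "ii" starts at the current position.
PATTERN = re.compile(r"(?!ii\b)[aeiou](?=[aeiou])")


def split_adjacent_vowels(word: str) -> str:
    """Insert a hyphen after every vowel matched by PATTERN."""
    return PATTERN.sub(r"\g<0>-", word)


for word in ("muschii", "pomii", "copiii", "avion"):
    print(word, "->", split_adjacent_vowels(word))
```

With these inputs, "muschii" and "pomii" come back unchanged, "copiii" becomes "copi-ii" and "avion" becomes "avi-on".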