I'm writing a Romanian hyphenation script. My previous (solved) question is "regex if capture group matches string", in case you want to take a look. This is a regex that deals with vowels that are not part of diphthongs or triphthongs:

(?:[aeiou])(?=[aeiou][bcdfghjklmnprstvwxyz]{0,})

I can't figure out how to add two exceptions to this: a final "ii" must remain together. The "ii" group is usually preceded by a consonant, except in "copiii", which is hyphenated -pi-ii: https://regex101.com/r/ew4JUh/1

The expected result, except for the word "copiii", is always one or more consonants followed by the "ii" group in the same syllable:

muschii = mus-chii

pomii = po-mii

EDIT:

Just in case someone ever needs to do the same, you can find the script so far here:

https://playcode.io/156923

It works - mostly.

It implements the rules as I understand them. The only issue is that probably half the words stand in exception to the rules. So while the script does what it should, it cannot deal with exceptions that cannot be anticipated.

Ex:

avion = a-vi-on

iodat = io-dat

piatra = pia-tra

diamant = di-a-mant

And so on ad infinitum. I don't believe there's any rule to establish when the vowels are grouped as diphthongs or triphthongs and when they belong to different syllables.

On the plus side I know more grammar and more regex than ever :)

Many thanks to Wiktor who helped immensely.

  • I think you are looking for (?!ii\b)[aeiou](?=[aeiou]), see demo. Commented Nov 19, 2018 at 12:44
  • Hello and thank you again. Add it as the answer. I liked the second better: regex101.com/r/ew4JUh/2 Commented Nov 19, 2018 at 12:49

1 Answer

You may use

(?!ii\b)[aeiou](?=[aeiou])

See the regex demo.

Note that [bcdfghjklmnprstvwxyz]{0,} at the end of the positive lookahead is redundant: a pattern that may match zero characters always succeeds, so requiring it makes no difference.
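A quick way to check the redundancy is to compare where the two patterns match. This is a minimal sketch, assuming the script is JavaScript (the playcode.io link suggests so); `matchPositions` and `sampleWords` are names made up for the illustration.

```javascript
// Original lookahead, with the optional consonant run at the end.
const withTail = /(?:[aeiou])(?=[aeiou][bcdfghjklmnprstvwxyz]{0,})/g;
// Same pattern with the optional tail (and the non-capturing group) removed.
const withoutTail = /[aeiou](?=[aeiou])/g;

// Hypothetical helper: start index of every match of `re` in `word`.
function matchPositions(re, word) {
  return [...word.matchAll(re)].map((m) => m.index);
}

// Illustrative words taken from the question.
const sampleWords = ["copiii", "muschii", "pomii", "avion", "piatra", "diamant"];

for (const word of sampleWords) {
  const a = matchPositions(withTail, word);
  const b = matchPositions(withoutTail, word);
  // The last column is true when both patterns match at the same places.
  console.log(word, a, b, JSON.stringify(a) === JSON.stringify(b));
}
```

Both patterns match at the same indices for every word in the list, because `{0,}` is allowed to match nothing.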

Details

  • (?!ii\b) - a negative lookahead that fails the match if, immediately to the right of the current location, there is ii followed by a word boundary
  • [aeiou] - a vowel
  • (?=[aeiou]) - that must be followed by another vowel.
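To see the pattern in action, here is a minimal sketch, assuming JavaScript and assuming a hyphen is inserted after each matched vowel; `hyphenateVowels` is a hypothetical helper, not part of the asker's script. It covers only the vowel-vowel boundaries; splits between consonants (such as mus-chii) come from other rules.

```javascript
// Vowel-splitting pattern from the answer; a final "ii" is left intact.
const splitVowels = /(?!ii\b)[aeiou](?=[aeiou])/g;

// Hypothetical helper: insert a hyphen after every vowel the pattern matches.
function hyphenateVowels(word) {
  return word.replace(splitVowels, "$&-");
}

for (const word of ["copiii", "muschii", "pomii", "avion"]) {
  console.log(word, "=>", hyphenateVowels(word));
}
// copiii => copi-ii
// muschii => muschii
// pomii => pomii
// avion => avi-on
```

One caveat: in JavaScript, \b treats only ASCII letters, digits, and the underscore as word characters, so an "ii" followed by a letter with a diacritic (ă, for example) would also count as being at a word boundary.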

6 Comments

No, [bcdfghjklmnprstvwxyz]{0,} is not the right way to do it. What I wanted to say was that this rule applies only if these 2 vowels are not followed by another vowel. It should have been [^aeiou], right?
@flish That restriction would prevent matching the i in piii. Are you sure you want that restriction?
It's complicated. What I actually want is to combine this: (?!ii\b)(?:[aeiou])(?=[aeiou]) with the rest of the vowel rules: regex101.com/r/KPnEgy/1 (?:[aeo])(?=[iu][aeo])|(?:[aeiou])(?=([i][a][ui]|[e][ao]|[i][aeo]|[o][a]|[u][a]))
@flish Sorry, it is not clear how you want to combine these regexps. BTW, the regex you posted above can be written as [aeo](?=[iu][aeo])|[aeiou](?=(?:ia[ui]|e[ao]|i[aeo]|[ou]a))
(?!ii\b)(?:[aeiou])(?=[aeiou]) splits two adjacent vowels into different syllables (except for ii in final position). The other one, [aeo](?=[iu][aeo])|[aeiou](?=(?:ia[ui]|e[ao]|i[aeo]|[ou]a)), deals with diphthongs and triphthongs where vowels are followed by other vowels. Ideally they would be merged into a single regex that covers both cases. The problem is that (?!ii\b)[aeiou](?=[aeiou]) would split into different syllables the vowels that need to remain together. I don't know how to put them together.
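For anyone who wants to confirm that the compact spelling suggested above behaves like the original, here is a minimal sketch, assuming JavaScript. The word list and the `positions` helper are made up for the illustration, and the check does not attempt to merge the two rules.

```javascript
// Diphthong/triphthong pattern as first posted in the comments.
const verbose =
  /(?:[aeo])(?=[iu][aeo])|(?:[aeiou])(?=([i][a][ui]|[e][ao]|[i][aeo]|[o][a]|[u][a]))/g;
// Compact spelling suggested in the reply.
const compact = /[aeo](?=[iu][aeo])|[aeiou](?=(?:ia[ui]|e[ao]|i[aeo]|[ou]a))/g;

// Hypothetical helper: start index of every match of `re` in `word`.
function positions(re, word) {
  return [...word.matchAll(re)].map((m) => m.index);
}

// Illustrative words only; not an exhaustive test set.
const words = ["ploaie", "creioane", "aeroport", "piatra", "copiii"];

for (const word of words) {
  const a = JSON.stringify(positions(verbose, word));
  const b = JSON.stringify(positions(compact, word));
  // The last column is true when both spellings match at the same indices.
  console.log(word, a, b, a === b);
}
```

The only difference between the two spellings is that the original keeps a capturing group inside the lookahead, which does not affect where the pattern matches.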