Count the words in a string using Postgres

I just wanted to post this little bit of code. I needed to produce a list of phrases from a database but only show phrases with more than four words. Here’s the SQL:

select phrasetext from phrases
where array_upper(regexp_split_to_array(phrasetext,E'\\s+'),1) > 4;

How this works is a regular expression is used to split the string by spaces. This spits out a text array. The array_upper function retrieves the largest index in the array which, being 1-based is the same as the number of items in the array – which of course is the same as the number of words in the phrase.

UPDATE: I found a simpler way to do this, just use the spaces & wildcards:

select phrasetext from phrases
where phrasetext like '% % % %';

This entry was posted on Monday, October 17th, 2011 at 9:37 pm and is filed under Technology. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

Leave a Reply