mpdeglau
Oct 9th, 2008, 10:50 AM
I have an app that is passed one or more paragraphs of text, and I need to figure out how many sentences the text contains.
Right now, this is how I'm doing it:
private long getSentenceCount(String text) {
    // Placeholder tokens: delim marks a sentence break, delim2 a bare line break.
    String delim = "@@@@";
    String delim2 = "####";
    String tempText = text.replace(". ", delim);
    tempText = tempText.replace(".\r\n", delim);
    tempText = tempText.replace("! ", delim);
    tempText = tempText.replace("!\r\n", delim);
    tempText = tempText.replace("? ", delim);
    tempText = tempText.replace("?\r\n", delim);
    tempText = tempText.replace("\r\n", delim2);
    String[] sentences = tempText.split(delim);
    long sCnt = 0;
    for (String s : sentences) {
        if (s.contains(delim2)) {
            String[] temp = s.split(delim2);
            for (String t : temp) {
                // textIsSentence checks that the string is not empty, that there are
                // more than 4 words (arbitrary number for now) and that the first
                // letter is uppercase.
                if (textIsSentence(t)) {
                    sCnt++;
                }
            }
        } else {
            if (textIsSentence(s)) {
                sCnt++;
            }
        }
    }
    return sCnt;
}
I'm wondering if there is a better way to do this, perhaps with a regex, but I'm having trouble figuring out how to write the pattern.
What it needs to find is:
a period, question mark, or exclamation point followed by either a space or a new line, or just a new line on its own.
Thanks
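For what it's worth, the rule described above (a period, question mark, or exclamation point followed by a space or a new line, or a new line by itself) can probably be written as a single pattern and handed to Pattern.split. The following is only a rough sketch under a few assumptions: the class name SentenceCounter is made up for illustration, the pattern also accepts a bare "\n" in addition to "\r\n", and the textIsSentence shown here is a stand-in written from the description in the comment (not empty, more than 4 words, first letter uppercase), not the real helper.

```java
import java.util.regex.Pattern;

// Hypothetical class name, used only for this sketch.
class SentenceCounter {

    // A sentence break is '.', '!' or '?' followed by a space or a line break,
    // or a line break on its own. "\\r?\\n" matches both "\r\n" and "\n"
    // (accepting a bare "\n" is an assumption beyond the original code).
    private static final Pattern SENTENCE_BREAK =
            Pattern.compile("[.!?](?: |\\r?\\n)|\\r?\\n");

    static long getSentenceCount(String text) {
        long count = 0;
        // One split replaces the chain of replace() calls and the nested loop.
        for (String candidate : SENTENCE_BREAK.split(text)) {
            if (textIsSentence(candidate)) {
                count++;
            }
        }
        return count;
    }

    // Stand-in for the real textIsSentence (assumed behavior): not empty,
    // first letter uppercase, and more than 4 whitespace-separated words.
    static boolean textIsSentence(String s) {
        String trimmed = s.trim();
        if (trimmed.isEmpty()) {
            return false;
        }
        if (!Character.isUpperCase(trimmed.charAt(0))) {
            return false;
        }
        return trimmed.split("\\s+").length > 4;
    }
}
```

Calling SentenceCounter.getSentenceCount(text) would then take the place of the method above. Like the original, this still splits on things such as "Mr. Smith", so the textIsSentence filter keeps doing the real work of rejecting fragments.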