724

I am parsing a string in C++ using the following:

using namespace std;

string parsed,input="text to be parsed";
stringstream input_stringstream(input);

if (getline(input_stringstream,parsed,' '))
{
     // do some processing.
}

Parsing with a single char delimiter is fine. But what if I want to use a string as the delimiter?

Example: I want to split:

scott>=tiger

with >= as delimiter so that I can get scott and tiger.

35 Answers

0

Some answers miss a special case. If you have a CSV file and want to read the same number of columns from every row, the code fails for input like this:

Row1: a,b,c,d
Row2: g,e,,

For Row2 only 3 items are read.

A special treatment at end of loop adds an empty string:

if (startIndex != str.size())
    result.emplace_back(str.begin() + startIndex, str.end());  
else if (result.size())     // min 1 separator found before. 
    result.emplace_back();

However, it will not add an empty string when there is only one column and no delimiter, i.e. a single column that is filled with data in some rows and empty in others.
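The fragment above shows only the end-of-loop step. As a minimal sketch of how it might sit inside a complete function, the following splits on a string delimiter (as the question asks) and applies the same trailing-field treatment. The name split_keep_trailing and the empty-delimiter guard are assumptions made for illustration; only startIndex, str and result come from the fragment.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits `str` on the (possibly multi-character) `delim`
// and keeps an empty trailing field, e.g. "g,e,," yields 4 items.
std::vector<std::string> split_keep_trailing(const std::string& str,
                                             const std::string& delim) {
    std::vector<std::string> result;
    if (delim.empty()) {            // assumption: an empty delimiter means "no split"
        if (!str.empty()) result.push_back(str);
        return result;
    }

    std::size_t startIndex = 0;
    std::size_t found;
    while ((found = str.find(delim, startIndex)) != std::string::npos) {
        // Field between the previous delimiter and this one (may be empty).
        result.emplace_back(str.begin() + startIndex, str.begin() + found);
        startIndex = found + delim.size();
    }

    // Special treatment at the end of the loop, as in the fragment above.
    if (startIndex != str.size())
        result.emplace_back(str.begin() + startIndex, str.end());
    else if (result.size())         // at least one separator was found before
        result.emplace_back();
    return result;
}
```

With this sketch, both "a,b,c,d" and "g,e,," split on "," give four items, and "scott>=tiger" split on ">=" gives scott and tiger. The limitation described above still applies: an empty input string produces no items at all.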


0

Yet another.... This one should be easy to add features to over time without changing the function signature since I used "flags" rather than separate bool options.

utils.h

#include <string>
#include <vector>

namespace utils
{
    void ltrim( std::string &s );
    void rtrim( std::string &s );
    void trim(  std::string &s );
    
    enum SplitFlags
    {
        SPLIT_TRIMMED  = 0x01
    ,   SPLIT_NO_EMPTY = 0x02
    };
    std::vector<std::string> split(
        const std::string &s, const char delimiter, const int flags=0 );
}

utils.cpp

#include <sstream>
#include <algorithm>
#include <cctype>
#include <locale>

#include "utils.h"

void utils::ltrim( std::string &s )
{
    s.erase( s.begin(), std::find_if( s.begin(), s.end(),
        []( unsigned char ch ) { return !std::isspace( ch ); } ) );
}

void utils::rtrim( std::string &s )
{
    s.erase( std::find_if( s.rbegin(), s.rend(),
        []( unsigned char ch ) { return !std::isspace( ch ); } ).base(), s.end() );
}

void utils::trim( std::string &s )
{
    rtrim( s );
    ltrim( s );
}
    
std::vector<std::string> utils::split(
    const std::string &s, const char delimiter, const int flags )
{
    const bool trimmed( flags & SPLIT_TRIMMED  )
             , noEmpty( flags & SPLIT_NO_EMPTY )
    ;
    std::vector<std::string> tokens;
    std::stringstream ss( s );
    for( std::string t; getline( ss, t, delimiter ); )
    {
        if( trimmed ) trim( t );
        if( noEmpty && t.empty() ) continue;
        tokens.push_back( t );
    }
    return tokens;
}

Example use:

const auto parts( utils::split( 
    " , a g , b, c, ", ',', utils::SPLIT_TRIMMED | utils::SPLIT_NO_EMPTY ) );
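Since the question asks about a string delimiter, here is a hedged sketch of how the same flags-based design might be carried over to a std::string delimiter. The namespace sketch and this overload are hypothetical and not part of the utils code above; it uses find/substr instead of getline, which only accepts a single char.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch  // hypothetical namespace, separate from the answer's utils
{
    enum SplitFlags { SPLIT_TRIMMED = 0x01, SPLIT_NO_EMPTY = 0x02 };

    // Removes leading and trailing whitespace in place.
    inline void trim(std::string& s)
    {
        auto notSpace = [](unsigned char ch) { return !std::isspace(ch); };
        s.erase(std::find_if(s.rbegin(), s.rend(), notSpace).base(), s.end());
        s.erase(s.begin(), std::find_if(s.begin(), s.end(), notSpace));
    }

    // Splits `s` on every occurrence of the string `delimiter`.
    inline std::vector<std::string> split(
        const std::string& s, const std::string& delimiter, const int flags = 0)
    {
        std::vector<std::string> tokens;
        if (delimiter.empty()) {       // assumption: nothing to split on
            tokens.push_back(s);
            return tokens;
        }
        std::size_t start = 0;
        while (true) {
            const std::size_t pos = s.find(delimiter, start);
            std::string t = s.substr(
                start, pos == std::string::npos ? std::string::npos : pos - start);
            if (flags & SPLIT_TRIMMED) trim(t);
            if (!((flags & SPLIT_NO_EMPTY) && t.empty())) tokens.push_back(t);
            if (pos == std::string::npos) break;   // last token handled
            start = pos + delimiter.size();
        }
        return tokens;
    }
}
```

Calling sketch::split("scott >= tiger", ">=", sketch::SPLIT_TRIMMED) yields scott and tiger. One difference from the getline version: when the input ends with the delimiter, this sketch also emits a final empty token unless SPLIT_NO_EMPTY is set.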

0

It would give both despair and hope to the functional programming world to see the very imperative answers above. 🙂
In APL one can write:

text ← 'Hello world from APL' 
mask ← text ≠ ' ' ⍝ Create boolean mask (1 where not space) 
⍝ Result: 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 1 0 1 1 1 

⍝ Then use partitioned enclose to group 
words ← (mask ⊆ text) 
⍝ Result: 'Hello' 'world' 'from' 'APL'

(If we don't have Ranges...) Let's create a boolean mask that marks the non-space characters,
and then apply std::adjacent_difference with XOR to find the word boundaries.
[Assumption: the text starts with a word character and ends with a non-word character.]

#include <algorithm>
#include <iostream>
#include <numeric>
#include <string>
#include <vector>

int main() {
    std::string text = "Hello  world   from C++ ";
    
    // Create mask: 1 for non-space, 0 for space
    std::vector<int> mask(text.size());
    std::transform(text.begin(),text.end(),mask.begin(),
                            [](char c) { return c != ' ' ? 1 : 0; });
    
    // Apply adjacent_difference with XOR
    std::vector<int> xor_vec(text.size());
    std::adjacent_difference(mask.begin(), mask.end(), xor_vec.begin(),
                            [](int curr, int prev) { return curr ^ prev; });

We get :

Text:     "Hello  world   from C++ "
Mask:      111110011111000111101110
XOR diff:  100001010000100100011001

This part's still kinda mechanical ... 😕

    std::vector<std::string> word_vector;
    std::string current_word;
    int max_length = text.size();
    int cursor = 0;
  
    do {
        current_word += text[cursor];
        cursor++;
        if (cursor < max_length && xor_vec[cursor] == 1) 
        {
                word_vector.push_back(current_word);
                current_word.clear();
                do   ++cursor;
                while (cursor < max_length && xor_vec[cursor] == 0);
        }
    } while (cursor < max_length);
    
    for (auto& w : word_vector)  std::cout << w << "\n";
}   
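The mechanical loop can arguably be avoided by reading the word boundaries straight out of the XOR vector: every 1 marks a transition, so the positions alternate between "word starts here" and "word ended just before here". The sketch below pairs them up; the name split_by_mask is hypothetical. It also drops the two assumptions, because a leading space simply delays the first boundary and a missing trailing space is handled by appending text.size() as the last boundary.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` into runs of characters other than `sep`.
std::vector<std::string> split_by_mask(const std::string& text, char sep = ' ') {
    // Mask: 1 for non-separator characters, 0 for separators.
    std::vector<int> mask(text.size());
    std::transform(text.begin(), text.end(), mask.begin(),
                   [sep](char c) { return c != sep ? 1 : 0; });

    // XOR of neighbours: 1 wherever the mask changes (first element is copied).
    std::vector<int> edges(text.size());
    std::adjacent_difference(mask.begin(), mask.end(), edges.begin(),
                             [](int curr, int prev) { return curr ^ prev; });

    // Collect the boundary positions: start, end, start, end, ...
    std::vector<std::size_t> bounds;
    for (std::size_t i = 0; i < edges.size(); ++i)
        if (edges[i] == 1) bounds.push_back(i);
    if (bounds.size() % 2 != 0)          // text ended inside a word
        bounds.push_back(text.size());

    // Each (start, end) pair delimits one word.
    std::vector<std::string> words;
    for (std::size_t k = 0; k + 1 < bounds.size(); k += 2)
        words.push_back(text.substr(bounds[k], bounds[k + 1] - bounds[k]));
    return words;
}
```

For the text used above, split_by_mask("Hello  world   from C++ ") gives Hello, world, from and C++.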

-1

As a bonus, here is a split function and a wrapper macro that are easy to use and let you choose the container type:

#include <iostream>
#include <vector>
#include <string>

#define split(str, delim, type) (split_fn<type<std::string>>(str, delim))
 
template <typename Container>
Container split_fn(const std::string& str, char delim = ' ') {
    Container cont{};
    std::size_t current, previous = 0;
    current = str.find(delim);
    while (current != std::string::npos) {
        cont.push_back(str.substr(previous, current - previous));
        previous = current + 1;
        current = str.find(delim, previous);
    }
    cont.push_back(str.substr(previous, current - previous));
    
    return cont;
}

int main() {
    
    auto test = std::string{"This is a great test"};
    auto res = split(test, ' ', std::vector);
    
    for(auto &i : res) {
        std::cout << i << ", "; // prints: This, is, a, great, test, 
    }
    
    
    return 0;
}

-5
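#include <string>
#include <vector>

std::vector<std::string> split(const std::string& s, char c) {
  std::vector<std::string> v;
  std::string::size_type i = 0;          // start of the current token
  std::string::size_type j = s.find(c);  // position of the next delimiter
  while (j != std::string::npos) {
    v.push_back(s.substr(i, j - i));
    i = ++j;
    j = s.find(c, j);
  }
  // Remaining text after the last delimiter (the whole string if none was found).
  v.push_back(s.substr(i));
  return v;
}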

1 Comment

Please be more accurate. Your code will not compile. See declaration of "i" and the comma instead of a dot.