0

I have the following code written in C++ to extract a given range of text in a Piece Table data structure. Here is the function of class PieceTable that stores the given range of text in the character array buffer :

void PieceTable::getTextInRange(unsigned __int64 startPos, unsigned __int64 endPos, char buffer[]){

    char* totalBuffer = new char[getSize() + 2];

    getBuffer(totalBuffer);

    if(endPos >= getSize())
        endPos = getSize() - 1; 

    cout<<"startPos : "<<startPos<<endl;
    cout<<"endPos : "<<endPos<<endl;

    memcpy(buffer, &totalBuffer[startPos], endPos - startPos + 1);

    buffer[endPos - startPos + 2] = '\0';

    if(totalBuffer != 0)
        delete[] totalBuffer;
    totalBuffer = 0;
}

Here is the piece of code in the main method which i use to test this code :

temp2 = new char[end - start + 2];  //changing 2 to 3 solves the problem
pieceTable.getTextInRange(Start, end, temp2);
for(int i = 0; i< end - start + 1; i++)
   cout<<temp2[i];
cout<<endl;

if( temp2 != 0)
{
  delete[] temp2;   //this line causes the heap corruption error
  temp2 = 0;
}

Declaration of temp2 : char* temp2;

Whenever the program encounters the delete[] temp2 statement, there is a heap corruption error. The problem does not occur if I allocate memory for temp2 as:
temp2 = new char[end - start + 3] So, basically changing the length solves the problem. I know that I am messing up with the lengths somewhere, but I can't figure out where.

EDIT : getSize() :

__int64 PieceTable::getSize()
{
    return dList.getLength(dList.getBack());
}

I am using a piece table data structure. Here it is, inside this paper:http://www.cs.unm.edu/~crowley/papers/sds.pdf

I may be wrong, but I don't think that there is any problem with getSize(), since the function I use to retrieve the length of the entire buffer getBuffer, works as shown in the code.

7
  • 1
    Print the return value of getSize() + 2 and also the value of endPos - startPos + 2 to make sure that the former is larger than the latter. Commented Dec 9, 2011 at 5:29
  • actually, getSize() + 2 is used for the variable totalBuffer, which is a local variable in the function getTextInRange. Commented Dec 9, 2011 at 5:31
  • 1
    What was your initial allocation? Commented Dec 9, 2011 at 5:32
  • temp2 = 0 is the initial allocation for temp2 Commented Dec 9, 2011 at 5:34
  • @devjeetroy: You have an off-by-one error, show us the definition of getSize(). Commented Dec 9, 2011 at 5:38

3 Answers 3

7

In PieceTable::getTextInRange, you have this line:

buffer[endPos - startPos + 2] = '\0';

and when you allocate the thing that you pass in as buffer you allocate like this:

temp2 = new char[end - start + 2];

Lets put in some real numbers...

buffer[5 - 2 + 2] = '\0';

temp2 = new char[5 - 2 + 2];

which is equivalent to:

buffer[5] = '\0';

temp2 = new char[5];

Well, there's your problem. If you do new char [5] you get an array that has valid indexes from 0 through 4. 5 is not a valid index into this array.

Might I suggest that you make it a rule that you only break in the most of extenuating of circumstances that you always specify ranges in terms of [begin, end) like the STL does. This means you specify one past the last desired index for end. This makes range calculation math much less error prone. Also, the consistency of the interface with the way STL works makes it easier to work with. For example, calculating the size of the range is always end - begin with this scheme.

There is an old (circa 1982) paper by E.W. Dijkstra that gives some good reasons why this scheme for expressing ranges is the best one.

Sign up to request clarification or add additional context in comments.

5 Comments

oh yeah! thanks! I knew I was missing something very trivial! thanks! yeah, I will keep your advice in mind
@devjeetroy: There's actually a rather well known and rather old paper that details all the possible choices for representing ranges and gives some fantastic reasons why [begin, end) is the right answer. I don't have a link handy, but the standard the STL chose is based on a solid foundation.
1) Remark: Although not part of STL, std::string does work with index + size rather than ranges. 2) Could you find that paper?
@ybungalobill: I've been hunting it. I think it may have been written by Dijkstra. This appears to be a reference to it: lambda-the-ultimate.org/node/1950
@ybungalobill: And the original source (or some facsimile thereof): cs.utexas.edu/users/EWD/transcriptions/EWD08xx/EWD831.html
2

The reason changing the 2 to a 3 in the code:

temp2 = new char[end - start + 2];

works is because otherwise you'll write past the end of the buffer in getTextInRange (you're off by one).


You're end and start above correspond to the arguments endPos and startPos in getTextInRange, and in getTextInRange you have:

buffer[endPos - startPos + 2] = '\0';


The range of your array is [0, endPos - startPos + 2); therefore the element at position endPos - startPos + 2 is 1 past the end of your array. Overwriting this value is causing the heap to become corrupted.

Comments

1

It is clear from your code that the last index which you're using in getTextInRange is this:

endPos-startPos+2 //last index

which pretty much explains why you need to allocate memory minimum of size this:

endPos-startPos+3 //number of objects : memory allocation

That is, if you allocate memory for N objects, the last object in the array can be accessed with the index N-1 which is also the maximum index for the array. The index N falls out of the range. Recall that the index stars with 0, so it has to end at N-1, not at N.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.