
The question is: The character L repeats 20 times in a text file, i.e. somewhere in the file we have LLLLLLLLLLLLLLLLLLLL. It takes 20 bytes to store this ‘run’ of L. However, if we write 20L in the file it takes much less. But 20 is not a character. It is a number and we don’t want to write numbers in a text file. There is another way out. Let us use the capital letters to represent the runs, i.e. if L occurs once we write AL, if twice, we write BL and so on. So we write TL for 20 occurrences of L. This method can code only up to 26 occurrences. If a character occurs more than 26 times in a row, we write one more code for it. Thus, in the coded file, for saving space, a string of DfFAB-ZsAsD AA stands for ffffAAAAAA--sssssssssssssssssssssssssss    A. Write a program that reads from a text file and compresses it using this method.

My attempt:

#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("Enter input(max. 99 characters): ");
    char szInput[100];
    char chInput;
    int iii = 0;
    do
    {
        chInput = getchar();
        szInput[iii] = chInput;
        iii++;
    } while (chInput != '\n');
    szInput[iii--] = '\0';
    char *szOutput = malloc(2 * (iii + 1) * sizeof(char));
    iii = 0;
    int jjj = 0;
    while (szInput[iii] != '\0')
    {
        int nCount = 1;
        while (szInput[iii + nCount] == szInput[iii] && nCount < 26)
        {
            nCount++;
        }
        szOutput[jjj] = nCount + 64;
        szOutput[++jjj] = szInput[iii];
        iii += nCount;
        jjj++;
    }
    szOutput[jjj] = '\0';
    printf("%s", szOutput);
    return 0;
}

When I give the input "eee" or "eeeee", the output is CeA and EeA respectively. It prints an extra A at the end. I can't find the error in my code.

  • Err.. how do we know whether L is a count or a letter, such as your very first example? Commented Apr 23, 2015 at 12:09
  • @WeatherVane I think the OP is saying that it's always a fixed-length two-character encoding. The first character is the count. The second character is the letter. Commented Apr 23, 2015 at 12:15
  • I see now, but it's a very inefficient way of RLE for text, where it is rare to have more than two consecutive characters. It doubles the requirement of every singleton. Commented Apr 23, 2015 at 12:19
  • @WeatherVane I think it's more a kind of an assignment to practice rather than something that would be useful as it is. Commented Apr 23, 2015 at 12:41

1 Answer


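Your problem is this:

szInput[iii--] = '\0';

This does not overwrite the \n. The post-decrement uses the old value of iii as the index, so the terminator is written one position after the newline. The newline stays in the string and is encoded as a run of length one, which is the extra A (followed by the newline itself).

You should write:

szInput[--iii] = '\0';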
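
If you want to avoid the index arithmetic altogether, here is a minimal sketch of another way to structure it. This is not the one-character fix above but a swapped-in approach: the newline is cut off with strcspn, and the encoding is moved into its own function. The names strip_newline and rle_encode are hypothetical helpers made up for this example, and the caller is assumed to provide an output buffer of at least 2 * strlen(in) + 1 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n', if there is one. */
void strip_newline(char *line)
{
    assert(line != NULL);
    line[strcspn(line, "\n")] = '\0';
}

/*
 * Hypothetical helper: encode `in` as (count letter, character) pairs,
 * where 'A' means a run of 1 and 'Z' a run of 26. Longer runs are split
 * into several pairs. Assumes `out` has room for 2 * strlen(in) + 1 bytes.
 * Returns the number of bytes written, not counting the terminator.
 */
size_t rle_encode(const char *in, char *out)
{
    size_t i = 0; /* read position in `in` */
    size_t j = 0; /* write position in `out` */

    assert(in != NULL && out != NULL);

    while (in[i] != '\0') {
        size_t run = 1;

        /* Extend the run while the next character matches, up to 26. */
        while (run < 26 && in[i + run] == in[i])
            run++;

        out[j++] = (char)('A' + run - 1); /* count letter */
        out[j++] = in[i];                 /* the repeated character */
        i += run;
    }
    out[j] = '\0';
    return j;
}
```

In main you could read a line with fgets, pass it to strip_newline, and then hand it to rle_encode. With the input eee that should give Ce, without the trailing A.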

2 Comments

I was just about to answer with the same thing. Additionally, having a clear name for the pointer variables might help.
Comments wouldn't hurt either.
