I have a program that inputs 2 txt files.

deaths.txt

29.0
122.0
453.0

years.txt

1995
1996
1997

I make lists from the data

deaths = open("deaths.txt").read().splitlines()
years = open("years.txt").read().splitlines()

then I convert the lists to int and floats

for x in years[:-1]:
    x = int(x)

for x in deaths[:-1]:
    x = float(x)

and then the part where it gives the error: ValueError: could not convert string to float

plt.plot(years, deaths)

So it says it cannot convert strings to floats, but I thought I had already done that. What could be the reason?

  • Can you provide the content of deaths and years before you convert the lists? I don't get an error with these arrays: deaths = ["29.0", "122.0", "453.0"] years = ["1995", "1996", "1997"] Commented Jun 24, 2016 at 8:28
  • Why are you not converting the last element, what is it, and are you sure you want to plot it (because that's what you are doing)? Commented Jun 24, 2016 at 8:38
  • Also, you can use a list comprehension to convert, or even better, map. Commented Jun 24, 2016 at 8:40
  • You get the error because you skip the last element during conversion, and after conversion you do not save x back into the years and deaths lists. Try deaths_float = list(map(float, deaths)), and plot deaths_float instead. Commented Jun 24, 2016 at 9:28
  • Thanks @StanleyR. I did indeed skip the last element of the list. The last element would be " ", so it couldn't be converted. But I did tell matplotlib to plot it, and that is when the error came. Commented Jun 24, 2016 at 10:26
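
The point made in the comments above can be illustrated with a minimal sketch. It assumes the list ends with a blank entry such as " ", as described; the name deaths_converted is only illustrative and does not come from the question.

```python
# Strings as they would come back from read().splitlines(),
# including a trailing blank entry like the one described above.
deaths = ["29.0", "122.0", "453.0", " "]

# Rebinding the loop variable does not change the list itself.
for x in deaths[:-1]:
    x = float(x)
print(deaths)  # the list still holds strings

# Building a new list keeps the converted values.
# Blank entries are skipped instead of being passed to float().
deaths_converted = [float(x) for x in deaths if x.strip()]
print(deaths_converted)
```

Plotting deaths_converted rather than the original deaths list should avoid the ValueError, since every element is then a float.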

1 Answer

The following should get you going. Rather than reading the whole file with read().splitlines(), a better approach is to convert each row as it is read in.

As your two data files have a different number of elements, the code makes use of zip_longest to fill in any missing death data with 0.0:

from itertools import zip_longest
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

with open('deaths.txt') as f_deaths:
    deaths = [float(row) for row in f_deaths]

with open('years.txt') as f_years:
    years = [int(row) for row in f_years]

# Add these to deal with missing data in your files, (see Q before edit)    
years_deaths = list(zip_longest(years, deaths, fillvalue=0.0))
years = [y for y, d in years_deaths]
deaths = [d for y, d in years_deaths]

print(deaths)
print(years)

plt.xlabel('Year')
plt.ylabel('Deaths')

ax = plt.gca()
ax.xaxis.set_major_formatter(ticker.FormatStrFormatter('%d'))
ax.set_xticks(years)

plt.plot(years, deaths)
plt.show()

This will display the following on the screen, showing that the conversions to ints and floats were correct:

[29.0, 122.0, 453.0, 0.0]
[1995, 1996, 1997, 1998]    

And the following graph will then be displayed:

[Figure: matplotlib line graph of Deaths against Year]


2 Comments

I figured it out. I like your approach. The thing was that I didn't convert the last element because it was "", the last line of the file. But I did ask matplotlib to plot it, and that is when the error came. I gave it an upvote because it was very helpful, but it did not solve my problem.
If you are trying to deal with missing data, one approach would be to use zip_longest to pad out missing entries with a fill value, for example 0.0.
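
For the blank last line mentioned in the comments, one possible approach is to skip blank lines while converting. The sketch below is only an assumption about how that could look: read_values is a hypothetical helper, not something from the question or the answer, and in-memory lists stand in for the two files.

```python
def read_values(lines, convert):
    """Convert each non-blank line with `convert`, skipping blank lines."""
    return [convert(line) for line in lines if line.strip()]

# In-memory lines standing in for years.txt and deaths.txt,
# each ending with a blank line like the one that caused the error.
years = read_values(["1995\n", "1996\n", "1997\n", " \n"], int)
deaths = read_values(["29.0\n", "122.0\n", "453.0\n", "\n"], float)

print(years)
print(deaths)
```

An open file object can be passed in place of the lists, since iterating over a file yields its lines one at a time.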
