
I'm running a script to upload 20k+ XML files to an API. About 18k in, I get a memory error. I was looking into it and found the memory is just continually climbing until it reaches the limit and errors out (seemingly on the post call). Anyone know why this is happening or a fix? Thanks. I have tried the streaming uploads found here. The empty strings are due to sensitive data.

    def upload(self, oauth_token, full_file_path):
        file_name = os.path.basename(full_file_path)
        upload_endpoint = {'':''}
        params = {'': '','': ''}
        headers = {'': '', '': ''}
        handler = None
        try:
            handler = open(full_file_path, 'rb')
            # Passing the open file object as data streams the upload.
            response = requests.post(url=upload_endpoint[''], params=params, data=handler, headers=headers, auth=oauth_token, verify=False, allow_redirects=False, timeout=600)
            status_code = response.status_code
            # status checking
            return status_code
        finally:
            if handler:
                handler.close()

    def push_data(self):
        oauth_token = self.get_oauth_token()
        files = os.listdir(f_dir)
        for file in files:
            # os.listdir returns bare names, so build the full path (f_dir is defined elsewhere)
            file_to_upload = os.path.join(f_dir, file)
            status = self.upload(oauth_token, file_to_upload)
  • I don't know if it's related to the issue, but use context managers to handle files. What is the try doing here? Why is there a self parameter; are these methods part of a class? Commented Jan 16, 2020 at 21:13
  • @AMC There is error handling in the status checking area that uses it. I just removed it to shorten the code block. Yes, this is part of a class. Commented Jan 16, 2020 at 21:16
  • What is the class for? Have you benchmarked/profiled the code? Commented Jan 16, 2020 at 21:20
  • @AMC Just to upload these files. It was passed along to me by another dev team. Profiling it is a little difficult because of where the computer resides. It is remote. Commented Jan 16, 2020 at 21:24
  • "Just to upload these files." I hope it's a good use case for a class. "Profiling it is a little difficult because of where the computer resides. It is remote." Unfortunate, I guess. Can you run it with the change I suggested on a smaller input and see if it behaves differently? Commented Jan 16, 2020 at 21:32
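
The context-manager suggestion from the comments could look roughly like the sketch below. It is not the asker's code: the HTTP call is passed in as `post` (assumed to behave like `requests.post`), and `FakeResponse` and `fake_post` are hypothetical stand-ins so the example runs without a network.

```python
import os
import tempfile


def upload(post, url, full_file_path, **kwargs):
    """Stream one file to `url` and return the HTTP status code.

    `post` is any callable shaped like requests.post (an assumption,
    made so this sketch can run without the real API).
    """
    # The with-block closes the handle even if post() raises,
    # replacing the manual try/finally bookkeeping.
    with open(full_file_path, "rb") as handler:
        response = post(url, data=handler, **kwargs)
        return response.status_code


class FakeResponse:
    """Hypothetical stand-in for a requests.Response."""
    status_code = 200


def fake_post(url, data=None, **kwargs):
    """Hypothetical stand-in for requests.post: consumes the body."""
    data.read()
    return FakeResponse()


if __name__ == "__main__":
    # Write a small temporary XML file and "upload" it.
    with tempfile.NamedTemporaryFile(suffix=".xml", delete=False) as tmp:
        tmp.write(b"<doc/>")
    try:
        print(upload(fake_post, "http://example.invalid/upload", tmp.name))
    finally:
        os.remove(tmp.name)
```

In the real script, `requests.post` would be passed as `post`, with `params`, `headers`, `auth`, and `timeout` forwarded through `**kwargs`. This tidies file handling but is not expected to fix a leak on its own.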

1 Answer

What version of Python are you using? It looks like there is a bug in Python 3.4 causing memory leaks related to network requests. See here for a similar issue: https://github.com/psf/requests/issues/5215

It may help to update Python.
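
Before upgrading, it may be worth confirming which interpreter the script actually runs under and whether memory grows per call. The following is a minimal sketch using only the standard library; `report_python_version` and `measure_growth` are hypothetical helper names, and `work` stands for whatever single-upload call you want to test. Note that `tracemalloc` only tracks allocations made through Python's allocator, so a leak inside a C library might not show up here.

```python
import sys
import tracemalloc


def report_python_version():
    """Return the running interpreter version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def measure_growth(work, iterations):
    """Call work() `iterations` times and return the net change in
    traced bytes. A value that scales with `iterations` suggests
    something is being retained between calls."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        work()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    print("Python", report_python_version())

    # Deliberately leaky stand-in for an upload call.
    retained = []
    grown = measure_growth(lambda: retained.append(bytearray(10000)), 100)
    print("net traced bytes:", grown)
```

Running `measure_growth` against a small batch of real uploads on the remote machine, once per Python version, would show whether the growth disappears after the upgrade.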


1 Comment

Looks like that was the issue. Thanks!
