python - Paginated API requests - Stack Overflow

Updated: 2025-03-14

Description
I am trying to make paginated requests to a public API.
The request limit is 100 records, so looped paginated requests are required to pull all records.
The API contains some erroneous records that, if included in a paginated request, cause the request to fail.
I have attempted to write a loop that identifies and skips the erroneous records efficiently while keeping the batch size as large as possible. However, I feel my approach of halving the batch limit is a bit simplistic. Is there a more efficient approach?

Current Approach

Set the initial parameters for the API request, including limit and offset.
Create a loop that continues until all records are fetched.
In each iteration, make a request to the API with the current limit and offset.
If the request is successful (status code 200), process the data and extend a results list.
If a 500 error occurs, the batch contains an erroneous record: halve the current batch limit until a request succeeds.
If the limit has been reduced to 1 and a 500 error is still received, the erroneous record has been identified. Increment the offset by 1 to skip it and reset the batch limit to the maximum.
Continue until all records are fetched.
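
The steps above can be sketched without touching the network. This is a minimal sketch under stated assumptions, not the code from the post: `fetch_page` is a hypothetical callable standing in for the HTTP request, `BadBatch` is a hypothetical exception playing the role of the 500 response, and `make_fake_api` simulates an API in which certain record indices are erroneous.

```python
class BadBatch(Exception):
    """Hypothetical stand-in for an HTTP 500 caused by a bad record in the batch."""


def fetch_all(fetch_page, total, max_limit=100):
    """Collect every readable record, halving the batch size on failure."""
    records, skipped = [], []
    offset, limit = 0, max_limit
    while offset < total:
        try:
            batch = fetch_page(offset, limit)
        except BadBatch:
            if limit > 1:
                limit //= 2              # narrow down the failing range
            else:
                skipped.append(offset)   # a single record failed: skip it
                offset += 1
                limit = max_limit        # return to the largest batch size
            continue
        records.extend(batch)
        offset += limit                  # move to the next set of records
    return records, skipped


def make_fake_api(total, bad):
    """Simulated API (assumption): fails if the requested range holds a bad index."""
    def fetch_page(offset, limit):
        window = range(offset, min(offset + limit, total))
        if any(i in bad for i in window):
            raise BadBatch(f"bad record between {offset} and {offset + limit - 1}")
        return list(window)
    return fetch_page


records, skipped = fetch_all(make_fake_api(250, {7, 130}), total=250)
print(len(records), skipped)  # 248 [7, 130]
```

Passing the request in as a callable keeps the halving logic separate from the HTTP details, so the skip behaviour can be checked against known bad indices before pointing it at the real API.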

Code

import requests

# set initial parameters
url = ""  # endpoint URL is missing from the source; set it before running
headers = {'accept': 'application/json'}
params = {
    'limit': 100,
    'offset': 0,
    'timescales': 'LONG_TERM',
    'statuses': '',
    'productTypes': 'LT_EXPLICIT_ANNUAL, LT_EXPLICIT_SEASONAL, LT_EXPLICIT_QUARTERLY, LT_EXPLICIT_MONTHLY',
    'sortBy': 'BIDDING_PERIOD_START_ASC'
}

# empty list to store data
all_data = []
# initial request to get the total record count (assumes this first request succeeds)
total_records = requests.get(url=url, headers=headers, params=params).json()['totalCount']

# loop to paginate and get all records
while params['offset'] < total_records:
    
    response = requests.get(url=url, headers=headers, params=params)
    
    # successful request
    if response.status_code == 200:
        print(f"Success: {params['offset']} to {params['offset'] + params['limit']}")
        data = response.json()
        all_data.extend(data['entries'])
        params['offset'] += params['limit']  # Move to the next set of records
        
    # failed request
    elif response.status_code == 500:
        print(f"Fail: {params['offset']} to {params['offset'] + params['limit']}")
        if params['limit'] > 1:
            params['limit'] = params['limit'] // 2  # Halve the limit to narrow down the search
        else:
            # If limit is 1 and we get a 500 error, skip the problematic record
            params['offset'] += 1  # Increment offset to skip the problematic record
            params['limit'] = 100  # Reset limit back to 100 for the next batch

    # any other status: stop instead of repeating the same request forever
    else:
        response.raise_for_status()
        break
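
One possible variation on the loop above, offered as a sketch under assumptions and not taken from the post: instead of shrinking the batch and walking forward from the same offset, split each failing range into two halves and handle each half separately, so that a healthy half is fetched in a single request. As before, `fetch_page`, `BadBatch`, and `make_fake_api` are hypothetical stand-ins for the real request, the 500 response, and the API. The sketch also assumes that a failure always means the range contains a bad record and that the record order is stable between requests.

```python
class BadBatch(Exception):
    """Hypothetical stand-in for an HTTP 500 caused by a bad record in the batch."""


def fetch_all_bisect(fetch_page, total, max_limit=100):
    """Fetch all readable records, bisecting any range that fails."""
    records, skipped = [], []
    # Stack of (offset, limit) ranges still to fetch; the earliest range is on top.
    pending = [(start, min(max_limit, total - start))
               for start in range(0, total, max_limit)][::-1]
    while pending:
        offset, limit = pending.pop()
        try:
            records.extend(fetch_page(offset, limit))
        except BadBatch:
            if limit == 1:
                skipped.append(offset)   # isolated the erroneous record
            else:
                half = limit // 2
                # Push the right half first so the left half is fetched next,
                # which keeps the records in their original order.
                pending.append((offset + half, limit - half))
                pending.append((offset, half))
    return records, skipped


def make_fake_api(total, bad):
    """Simulated API (assumption): fails if the requested range holds a bad index."""
    def fetch_page(offset, limit):
        window = range(offset, min(offset + limit, total))
        if any(i in bad for i in window):
            raise BadBatch(f"bad record between {offset} and {offset + limit - 1}")
        return list(window)
    return fetch_page


records, skipped = fetch_all_bisect(make_fake_api(250, {7, 130}), total=250)
print(len(records), skipped)
```

Whether this saves requests in practice depends on how the erroneous records are distributed, so it is worth counting requests against a simulation like `make_fake_api` before adopting either strategy.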
